Comma separated values or delimited list in database – the Bad the Good

Saving comma separated values or delimited list in database looks convenient at first but never use them because most and most of the time that’s only a bad idea. I have provided many reasons to avoid them and only one when you can take favour of.

The Bad

1. Blocking the Power of Query
You will no more be able to take advantage of SQL queries as ‘where filter’, order by, group by, join etc won’t work on comma separated values.

2. Violation of First Normal Form
A relation is in first normal form if every column in every row can contain only one single (atomic) value in database. Here Comma separated strings are not atomic values.

3. Mismatch Data Type
You can’t ensure that each item in delimited list is the right data type. There is no way to prevent 1, 11, 21, Mango, 41.

4. Destroys Uniqueness
You can’t prevent duplicate values, Your list may content 1, 2, 2, 2, 3, 5.

5. String Column Size
You can’t store a list longer than what fits in the string column.

6. Hard to Count Elements
It’s hard to count number of elements in list or running aggregate queries.

7. Hard to Join
Comma separated values or delimited list are never easier to  join the values to the lookup table they reference.

8. Space Issue
Did you ever calculate the space taken by the comma characters or delimiters. Moreever storing integers as strings takes about twice as much space as storing binary integers.

9. No Way to Enforce Referential Integrity
You won’t be able to use foreign key constraints to link values to a lookup table.

10. Lose of Deletion in One Attempt
You can’t delete a value from the list directly. You must have to fetch whole list first.

11. Searching is Far
It’s hard to search for all entities with a given value in the list. You can’t perform well with an inefficient table-scan.

12. What about sorting?
Wanna sort the list in ascending/descend order? Be ready to write a long application code to reinvent functionality that the RDBMS already provides efficiently.

The Good

There is exceptional case when you are 100% sure that your application will never need to process the contents of your comma separated list (or serialised data). It would be preferable to store the comma-separated values if the list isn’t parsed (hopefully in future too) and is simply displayed as-is.

Popular CMS like WordPress, Drupal etc store basic information in comma separated strings and it works fine up to a point.

Only fingers burn when the requirements for the application change with time and sadly processing multi-value fields will need more memory and more CPU cycles due to the full table scans required to do anything other than simply read the contents of the field.

So we can find countable reasons to avoid saving data as comma separated values or delimited list in database and it’s my choice too. Up and do it right the first time, don’t waste time in re-engineering! I would like to hear your idea regarding this.

You Might Interested In

Leave a Reply

Enclose a code block like: <pre><code>Your Code Snippet</code></pre>.