You are currently browsing the Database Evolution BLOG weblog archives for October, 2008.
Searching & Compression
27/10/2008 by lilian.
If, like me, you have worked with relational databases for many years, you probably wonder what’s so different about a column database. In the column-store world, all the values for a given column are stored together. So if your table comprises 6 columns and 100,000 rows, a relational database stores 100,000 records, whereas a column store holds just 6 groups of data, one for each column.
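To make the two layouts concrete, here is a minimal sketch in Python. The table, the column names and the helper functions are all made up for illustration; a real engine stores columns on disk in its own format, not as Python lists.

```python
# Hypothetical three-column table, held row by row (the relational layout).
rows = [
    (1, "15/10/2008", "widget"),
    (2, "15/10/2008", "gadget"),
    (3, "16/10/2008", "widget"),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column (the column-store layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


columns = to_columns(rows, ["id", "sale_date", "product"])


def count_in_row_store(rows, col_index, value):
    # Every whole record is visited, even though only one field matters.
    return sum(1 for row in rows if row[col_index] == value)


def count_in_column_store(columns, name, value):
    # Only the values of the column being searched are visited.
    return sum(1 for v in columns[name] if v == value)
```

Both functions give the same answer; the difference is how much data each one has to touch to get there.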
Now imagine that you need to search one of those columns for certain values. Since the values are stored together, you don’t waste time looking at all the data, just the values for that column, so that must be quicker. But there is more you can do. What if one of the columns in the 100,000 records is a date? If all the records fall within a limited range of dates, then instead of storing every date individually, you can store each distinct date once along with a count of how many times it occurs. Products like Vertica offer various compression techniques on the columns, and in this example run-length encoding would be the obvious choice: it would store 15/10/2008 together with 16982, the number of occurrences. In the relational world we would have to read every record.
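Run-length encoding itself is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how Vertica implements it: the function names are my own, and the second date and its count are invented so that the column adds up to 100,000 values.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive runs of equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_count(encoded, target):
    """Count occurrences of target by reading the runs, not the raw values."""
    return sum(length for value, length in encoded if value == target)


# A date column of 100,000 values; the split between the two dates is made up.
dates = ["15/10/2008"] * 16982 + ["16/10/2008"] * 83018
encoded = rle_encode(dates)
```

Here `encoded` holds just two pairs instead of 100,000 values. Note that this only pays off when equal values sit next to each other, so the encoding works best on a column that is sorted or nearly sorted.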
Do you now start to understand why it’s so much faster in the column-store world? Not only is the data highly compressed, so there is little of it to read, but you also read only the data of interest.
But, you cry, who reads all the data? We build objects like materialized views so that we search less of the data, hence it’s not so much of a problem. In databases like Oracle, materialized views do result in huge performance gains, but imagine what happens when you take that technology and apply it to a column store. Vertica creates projections, which are the equivalent of materialized views without the aggregation. So just when you have sped up your relational database, the column store steams ahead once again using projections.
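As a rough mental model, you can think of a projection as a pre-sorted copy of just the columns a query needs. The sketch below is a toy under that assumption, not Vertica's actual implementation; `build_projection`, `range_lookup` and the sample table are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_projection(columns, keep, sort_by):
    """Copy the chosen columns, reordered by the sort column (no aggregation).

    sort_by must be one of the columns listed in keep.
    """
    order = sorted(range(len(columns[sort_by])), key=lambda i: columns[sort_by][i])
    return {name: [columns[name][i] for i in order] for name in keep}


def range_lookup(projection, sort_by, low, high):
    """Binary-search the sorted column and slice out the matching rows."""
    keys = projection[sort_by]
    start = bisect_left(keys, low)
    stop = bisect_right(keys, high)
    return {name: values[start:stop] for name, values in projection.items()}


# Made-up four-column table; ISO dates are used so that strings sort by date.
table = {
    "id": [1, 2, 3, 4],
    "sale_date": ["2008-10-16", "2008-10-15", "2008-10-17", "2008-10-15"],
    "amount": [30, 10, 40, 20],
    "notes": ["c", "a", "d", "b"],
}
proj = build_projection(table, keep=["sale_date", "amount"], sort_by="sale_date")
hits = range_lookup(proj, "sale_date", "2008-10-15", "2008-10-15")
```

The lookup never touches `id` or `notes`, and because the kept columns are sorted by date it can jump straight to the matching slice rather than scanning.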
If you are now curious, take a look at some of the Vertica benchmarks and you will see what I mean.
Posted in Column Store Db