Column-Stores vs. Row-Stores: How different are they really?

Abadi, Daniel J.; Madden, Samuel R.; Hachem, Nabil

There has been a significant amount of excitement and recent work
on column-oriented database systems (“column-stores”). These
database systems have been shown to perform more than an order
of magnitude better than traditional row-oriented database systems
(“row-stores”) on analytical workloads such as those found in
data warehouses, decision support, and business intelligence
applications. The elevator pitch behind this performance difference is
straightforward: column-stores are more I/O efficient for read-only
queries since they only have to read from disk (or from memory)
those attributes accessed by a query.
This simplistic view leads to the assumption that one can obtain
the performance benefits of a column-store using a row-store:
either by vertically partitioning the schema, or by indexing every
column so that columns can be accessed independently. In this paper,
we demonstrate that this assumption is false. We compare the
performance of a commercial row-store under a variety of different
configurations with a column-store and show that the row-store
performance is significantly slower on a recently proposed data
warehouse benchmark. We then analyze the performance difference
and show that there are some important differences between
the two systems at the query executor level (in addition to the
obvious differences at the storage layer level). Using the column-store,
we then tease apart these differences, demonstrating the impact on
performance of a variety of column-oriented query execution
techniques, including vectorized query processing, compression, and a
new join algorithm we introduce in this paper. We conclude that
while it is not impossible for a row-store to achieve some of the
performance advantages of a column-store, changes must be made
to both the storage layer and the query executor to fully obtain the
benefits of a column-oriented approach.
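
The I/O argument in the abstract's "elevator pitch" can be illustrated with a toy model. The sketch below is not taken from the paper: the table, the column names (`orderkey`, `quantity`, `price`, `discount`), and the byte accounting are all assumptions chosen for illustration. It assumes a row-store scan must read whole tuples, and it ignores tuple headers, compression, and indexes. It covers only the storage-layer effect; the abstract's point is that this effect alone does not explain the measured gap.

```python
from array import array

# Hypothetical toy table: four 64-bit integer attributes (names are assumptions).
COLUMNS = ["orderkey", "quantity", "price", "discount"]
ITEM_SIZE = array("q").itemsize  # bytes per stored value


def build_row_store(num_rows):
    """Row layout: all attributes of a tuple are stored next to each other."""
    data = array("q")
    for r in range(num_rows):
        data.extend([r, r % 50, r * 3, r % 10])
    return data


def build_column_store(num_rows):
    """Column layout: one contiguous array per attribute."""
    return {
        "orderkey": array("q", range(num_rows)),
        "quantity": array("q", (r % 50 for r in range(num_rows))),
        "price": array("q", (r * 3 for r in range(num_rows))),
        "discount": array("q", (r % 10 for r in range(num_rows))),
    }


def sum_price_row_store(rows):
    """Sum one attribute; the scan is modeled as reading every whole tuple."""
    width = len(COLUMNS)
    offset = COLUMNS.index("price")
    total = 0
    for start in range(0, len(rows), width):
        total += rows[start + offset]
    bytes_read = len(rows) * ITEM_SIZE  # every attribute of every row
    return total, bytes_read


def sum_price_column_store(cols):
    """Sum one attribute; only that attribute's array is read."""
    col = cols["price"]
    bytes_read = len(col) * ITEM_SIZE
    return sum(col), bytes_read


if __name__ == "__main__":
    n = 1000
    row_total, row_bytes = sum_price_row_store(build_row_store(n))
    col_total, col_bytes = sum_price_column_store(build_column_store(n))
    print("same answer:", row_total == col_total)
    print("bytes read (row layout):   ", row_bytes)
    print("bytes read (column layout):", col_bytes)
```

In this model the ratio of bytes read equals the number of attributes in the table divided by the number the query touches, so it grows with table width. Emulating the column layout inside a row-store, for instance through vertical partitioning, adds costs that this sketch deliberately leaves out.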
