Trojan Index

Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing)

Dittrich, J; Quiane-Ruiz, J; Jindal, A; Kargin, Y; Setty, V; Schad, J

MapReduce is a computing paradigm that has gained a lot of at-
tention in recent years from industry and research. Unlike paral-
lel DBMSs, MapReduce allows non-expert users to run complex
analytical tasks over very large data sets on very large clusters
and clouds. However, this comes at a price: MapReduce pro-
cesses tasks in a scan-oriented fashion. Hence, the performance of
Hadoop — an open-source implementation of MapReduce — often
does not match the one of a well-configured parallel DBMS. In this
paper we propose a new type of system named Hadoop++: it boosts

Syndicate content