Apache Hadoop Goes Realtime at Facebook

Borthakur, Dhruba; Sarma, Joydeep Sen; Gray, Jonathan; Muthukkaruppan, Kannan; Spiegelberg, Nicolas; Kuang, Hairong; Ranganathan, Karthik; Molkov, Dmytro; Menon, Aravind; Rash, Samuel; Schmidt, Rodrigo; Aiyer, Amitanand

Facebook recently deployed Facebook Messages, its first ever
user-facing application built on the Apache Hadoop platform.
Apache HBase is a database-like layer built on Hadoop designed
to support billions of messages per day. This paper describes the
reasons why Facebook chose Hadoop and HBase over other
systems such as Apache Cassandra and Voldemort and discusses
the application’s requirements for consistency, availability,
partition tolerance, data model and scalability. We explore the
enhancements made to Hadoop to make it a more effective


Hadoop: The Definitive Guide - MapReduce for the Cloud

White, Tom; Gray, Jonathan; Stack, Michael

Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters.

Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:


HBase-0.20.0 Performance Evaluation

Rao, Anty; Zhang, Schubert

We have been using HBase for around a year in our development and projects, from 0.17.x to
0.19.x. We and all in the community know the critical performance and reliability issues of these
Now, the great news is that HBase-0.20.0 will be released soon. Jonathan Gray from Streamy,
Ryan Rawson from StumbleUpon, Michael Stack from Powerset/Microsoft, Jean-Daniel Cryans
from OpenPlaces, and other contributors had done a great job to redesign and rewrite many
