Title/Author Year Citations added
Kolb, L; Thor, A; Rahm, E
Dedoop: efficient deduplication with Hadoop
2012 Nov12
Afrati, Foto N.; Sarma, Anish Das; Menestrina, David; Parameswaran, Aditya; Ullman, Jeffrey D.
Fuzzy Joins Using MapReduce
2012 Sep12
Kwon, YongChul; Balazinska, Magdalena; Howe, Bill; Rolia, Jerome
SkewTune: Mitigating Skew in MapReduce Applications
2012 May12
Binnig, Carsten; Kossmann, Donald; Kraska, Tim; Loesing, Simon
How is the Weather tomorrow? Towards a Benchmark for the Cloud
2009 Feb12
Kolb, L; Thor, A; Rahm, E
Load Balancing for MapReduce-based Entity Resolution
2012 Nov11
Borthakur, Dhruba; Sarma, Joydeep Sen; Gray, Jonathan; Muthukkaruppan, Kannan; Spiegelberg, Nicolas; Kuang, Hairong; Ranganathan, Karthik; Molkov, Dmytro; Menon, Aravind; Rash, Samuel; Schmidt, Rodrigo; Aiyer, Amitanand
Apache Hadoop Goes Realtime at Facebook
2011 Oct11
Olston, Christopher; Chiou, Greg; Chitnis, Laukik; Liu, Francis; Han, Yiping; Larsson, Mattias; Neumann, Andreas; Rao, Vellanki B. N.; Sankarasubramanian, Vijayanand; Seth, Siddharth; Tian, Chao; ZiCornell, Topher; Wang, Xiaodan
Nova: Continuous Pig/Hadoop Workflows
2011 Aug11
Afrati, Foto N.; Ullman, Jeffrey D.
Optimizing Joins in a Map-Reduce Environment
2010 Aug11
Okcan, Alper; Riedewald, Mirek
Processing theta-joins using MapReduce
2011 Aug11
Kolb, L; Köpcke, H; Thor, A; Rahm, E
Learning-based Entity Resolution with MapReduce
2011 Aug11
Kolb, L; Thor, A; Rahm, E
Block-based Load Balancing for Entity Resolution with MapReduce
2011 Aug11
Kolb, L; Thor, A; Rahm, E
Multi-pass sorted neighborhood blocking with MapReduce
2011 Aug11
Eltabakh, MY; Tian, Y; Özcan, F; Gemulla, R; Krettek, A; McPherson, J
CoHadoop: flexible data placement and its exploitation in Hadoop
2011 Aug11
Wang, Chaokun; Wang, Jianmin; Lin, Xuemin; Wang, Wei; Wang, Haixun; Li, Hongsong; Tian, Wanpeng; Xu, Jun; Li, Rui
MapDupReducer: Detecting Near Duplicates over Massive Datasets
2010 Mar11
Jahani, Eaman; Cafarella, Michael J.; Ré, Christopher
Automatic Optimization for MapReduce Programs
2011 Mar11
Kolb, L; Thor, A; Rahm, E
Parallel Sorted Neighborhood Blocking with MapReduce
2011 Dec10
Nykiel, T; Potamias, M; Mishra, C; Kollios, G; Koudas, N
MRShare: Sharing Across Multiple Queries in MapReduce
2010 Dec10
Jiang, D; Ooi, BC; Shi, L; Wu, S
The Performance of MapReduce: An in-depth Study
2010 Dec10
Chen, Songting
Cheetah: A High Performance, Custom Data Warehouse on Top of MapReduce
2010 Dec10
Dittrich, J; Quiané-Ruiz, JA; Jindal, A; Kargin, Y; Setty, V; Schad, J
Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing)
2010 Dec10
Schad, J; Dittrich, J; Quiané-Ruiz, JA
Runtime Measurements in the Cloud: Observing, Analyzing, and Reducing Variance
2010 Oct10
Schad, J
Flying Yellow Elephant: Predictable and Efficient MapReduce in the Cloud
2010 Oct10
Pike, R; Dorward, S; Griesemer, R; Quinlan, S
Interpreting the data: Parallel analysis with Sawzall
2005 Oct10
Thomson, Alexander; Abadi, Daniel J.
The Case for Determinism in Database Systems
2010 Sep10
Kossmann, Donald; Kraska, Tim; Loesing, Simon
An evaluation of alternative architectures for transaction processing in the cloud
2010 Aug10