Search: mapreduce, 2011, MapReduce

8 results

Results

Automatic Optimization for MapReduce Programs

... Ré, C The MapReduce distributed programming framework has become popular, despite ... relational databases to complete similar tasks. MapReduce jobs are amenable to many traditional database query optimizations ...

Publication - admin - 11/10/2023 - 00:05 - 1 attachment

Processing theta-joins using MapReduce

... analysis tasks, but are not supported directly by the MapReduce paradigm. While there has been progress on equi-joins, implementation of join algorithms in MapReduce in general is not sufficiently understood. We study the problem ... simplifies creation of and reasoning about joins in MapReduce. Using this model, we derive a surprisingly simple randomized ...

Publication - kolb - 11/09/2023 - 23:16 - 1 attachment

Multi-pass sorted neighborhood blocking with MapReduce

... challenges and possible solutions of using the MapReduce programming model for parallel entity resolution using Sorting ... blocking (SN). We propose and evaluate two efficient MapReduce-based implementations for single- and multi-pass SN that either ...

Publication - kolb - 11/09/2023 - 21:05 - 1 attachment

Parallel Sorted Neighborhood Blocking with MapReduce

... challenges and possible solutions of using the MapReduce programming model for parallel entity resolution. In particular, we propose and evaluate two MapReduce-based implementations for Sorted Neighborhood blocking that either ...

Publication - kolb - 11/16/2023 - 15:49 - 1 attachment

Learning-based Entity Resolution with MapReduce

... can be realized in a cloud infrastructure using MapReduce. We propose and evaluate two efficient MapReduce-based strategies for pair-wise similarity computation and ...

Publication - kolb - 11/09/2023 - 21:05 - 1 attachment

Block-based Load Balancing for Entity Resolution with MapReduce

... The effectiveness and scalability of MapReduce-based implementations of complex data-intensive tasks depend on ... (Data Integration, Entity Resolution, load-balancing, MapReduce, Object Matching, Parallel Data Processing) ...

Publication - kolb - 11/10/2023 - 02:16 - 1 attachment

Nova: Continuous Pig/Hadoop Workflows

... sigmod11.pdf 773.13 KB (MapReduce, Parallel Data Processing, Pig, Workflow Engine) ...

Publication - kolb - 11/10/2023 - 00:27 - 1 attachment

CoHadoop: flexible data placement and its exploitation in Hadoop

... (Cloud Infrastructure, Colocation, Data locality, Hadoop, MapReduce, Parallel Data Processing) ...

Publication - kolb - 11/10/2023 - 00:27 - 1 attachment