KnpCode
Java, Spring, BigData, Web development tutorials with examples

Category: MapReduce

Counters in Hadoop MapReduce

UshaK | May 14, 2020 | November 18, 2020 | MapReduce | hadoop, mapreduce | 0

Counters in Hadoop MapReduce provide statistics about a MapReduce job. With counters you can get general information about the executed job, such as the number of launched map and reduce tasks and the number of map input records, use that information to diagnose problems with the data, use information provided by…

Continue reading

How MapReduce Works in Hadoop

UshaK | June 24, 2019 | September 14, 2021 | MapReduce | hadoop, mapreduce | 0

In the post WordCount MapReduce program we saw how to write a MapReduce program in Java, create a jar and run it. There are a lot of things that you do to create a MapReduce job, and the Hadoop framework also does a lot of processing internally. In this post…

Continue reading

Hadoop MapReduce Word Count Program

admin | June 23, 2019 | September 14, 2021 | MapReduce | hadoop, mapreduce | 0

Once you have installed Hadoop on your system and completed the initial verification, you will want to write your first MapReduce program. Before digging deeper into the intricacies of MapReduce programming, the first step is the word count MapReduce program in Hadoop, which is also known as the “Hello World”…

Continue reading

How to Improve Map-Reduce Performance

UshaK | June 15, 2019 | August 27, 2021 | MapReduce | hadoop, mapreduce | 0

In this post we’ll see some of the ways to improve the performance of a Map-Reduce job in Hadoop. The tips given here focus on MapReduce code and configuration rather than on the cluster and hardware. 1. Enabling uber mode: Like Hadoop…

Continue reading

OutputCommitter in Hadoop MapReduce

UshaK | June 14, 2019 | August 27, 2021 | MapReduce | hadoop, mapreduce | 0

In the Hadoop framework, processing is distributed: map and reduce tasks are spawned on different nodes and each processes part of the data. In this type of distributed processing it is important to ensure that the framework knows when a particular task finishes or when there is a need to abort the task…

Continue reading

How to Chain MapReduce Job in Hadoop

UshaK | June 12, 2019 | August 12, 2021 | MapReduce | hadoop, mapreduce | 0

In many scenarios you would like to create a sequence of MapReduce jobs to completely transform and process the data. This is better than putting everything in a single MapReduce job and making it very complex. In fact you can get your data through various sources and use…

Continue reading

Predefined Mapper and Reducer Classes in Hadoop

UshaK | June 12, 2019 | August 26, 2021 | MapReduce | hadoop, mapreduce | 0

Within the Hadoop framework there are some predefined Mapper and Reducer classes which can be used as is in the scenarios they fit. That way you are not required to write a mapper or reducer for those scenarios; you can use the ready-made classes instead. Let’s see some of the predefined Mapper…

Continue reading

Distributed Cache in Hadoop

UshaK | June 11, 2019 | August 12, 2021 | MapReduce | hadoop, mapreduce | 0

In this post we’ll see what Distributed cache in Hadoop is. Table of contents: What is a distributed cache; Methods for adding the files in Distributed Cache; How to use distributed cache; Distributed cache example MapReduce code. What is a distributed cache: As the name suggests, distributed cache in Hadoop…

Continue reading

Combiner in Hadoop MapReduce

UshaK | June 11, 2019 | August 12, 2021 | MapReduce | hadoop, mapreduce | 0

This post shows what a combiner is in Hadoop MapReduce and how a combiner function can be used to reduce the memory, I/O and network requirements of the overall MapReduce execution. Table of contents: Why is combiner needed in MapReduce; Combiner function in MapReduce; How to specify a combiner in MapReduce…

Continue reading

Mapper Only Job in Hadoop MapReduce

UshaK | June 10, 2019 | August 12, 2021 | MapReduce | hadoop, mapreduce | 0

Generally when we think of a MapReduce job in Hadoop, we think of both mappers and reducers doing their share of processing. That is true in most cases, but there are scenarios where you want a mapper-only job in Hadoop. When do you need map…

Continue reading
