MapReduce Example
Let’s use the MovieLens dataset as an example and find out how many movies each user rated.

The MAPPER converts raw source data into key/value pairs

Each line of the MovieLens u.data file holds a user ID, a movie ID, a rating, and a timestamp, separated by tabs.

- Map users to the movies they watched: the mapper emits one (user ID, movie ID) pair per input line.

- Extract and organize data we care about.
- The less data we put on the cluster, the better.
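
A mapper for this step might look like the minimal sketch below. It is plain Python rather than a real MapReduce framework, and the rows in `SAMPLE_LINES` are made-up values that only follow the u.data layout (user ID, movie ID, rating, timestamp, tab-separated). The names `SAMPLE_LINES` and `mapper` are illustrative.

```python
# Made-up rows in the u.data layout: user ID, movie ID, rating, timestamp.
# These values are for illustration only and are not taken from the dataset.
SAMPLE_LINES = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "196\t377\t1\t878887116",
    "243\t51\t2\t880606923",
    "186\t346\t1\t886397596",
]


def mapper(line):
    """Turn one raw line into a (user_id, movie_id) key/value pair."""
    user_id, movie_id, _rating, _timestamp = line.split("\t")
    # Rating and timestamp are dropped: we only keep the data we care about.
    return user_id, movie_id


mapped = [mapper(line) for line in SAMPLE_LINES]
for pair in mapped:
    print(pair)
```

Dropping the rating and timestamp here is what keeps the amount of data passed on to the rest of the job small.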

- MapReduce sorts and groups the mapped data (“Shuffle and Sort”)
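
On a real cluster the framework performs this step for you. The sketch below only imitates it in plain Python so the effect is visible: pairs are sorted by key, then all values that share a key are collected into one list. The input pairs and the function name `shuffle_and_sort` are illustrative assumptions.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical mapper output: (user_id, movie_id) pairs in arrival order.
mapped = [
    ("196", "242"),
    ("186", "302"),
    ("196", "377"),
    ("243", "51"),
    ("186", "346"),
]


def shuffle_and_sort(pairs):
    """Sort pairs by key, then collect the values of each key into a list."""
    ordered = sorted(pairs, key=itemgetter(0))
    return {
        key: [value for _, value in group]
        for key, group in groupby(ordered, key=itemgetter(0))
    }


grouped = shuffle_and_sort(mapped)
for user_id, movie_ids in grouped.items():
    print(user_id, movie_ids)
```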

- The REDUCER processes each key’s values
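
For the movie-count question, the reducer only has to count the values attached to each key. A minimal sketch, assuming the grouped input has the shape produced by a shuffle-and-sort step (the sample values and the name `reducer` are illustrative):

```python
# Hypothetical shuffle-and-sort output: each user ID with its list of movie IDs.
grouped = {
    "186": ["302", "346"],
    "196": ["242", "377"],
    "243": ["51"],
}


def reducer(user_id, movie_ids):
    """Reduce one key's values to a single result: the number of movies."""
    return user_id, len(movie_ids)


counts = [reducer(user_id, movie_ids) for user_id, movie_ids in grouped.items()]
for user_id, count in counts:
    print(f"user {user_id} rated {count} movie(s)")
```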

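- To summarize: the mapper emits (user ID, movie ID) pairs, shuffle and sort groups them by user ID, and the reducer counts the movies in each group.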
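
The whole flow can be sketched as one small driver function. This is a single-process imitation for illustration, not how a framework such as Hadoop is implemented; `run_mapreduce` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Made-up rows in the u.data layout (user ID, movie ID, rating, timestamp).
LINES = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "196\t377\t1\t878887116",
    "243\t51\t2\t880606923",
    "186\t346\t1\t886397596",
]


def mapper(line):
    user_id, movie_id, _rating, _timestamp = line.split("\t")
    return user_id, movie_id


def reducer(user_id, movie_ids):
    return user_id, len(movie_ids)


def run_mapreduce(lines, mapper, reducer):
    """Map every line, group values by key, then reduce each key in sorted order."""
    groups = defaultdict(list)
    for line in lines:
        key, value = mapper(line)
        groups[key].append(value)
    return [reducer(key, groups[key]) for key in sorted(groups)]


movies_per_user = run_mapreduce(LINES, mapper, reducer)
print(movies_per_user)
```

Only `mapper` and `reducer` are specific to the problem; the grouping in the middle stays the same for any job.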

- Example on a cluster:
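
To give a feel for how the work is spread out, here is a single-process simulation under stated assumptions: the input is cut into two splits (one per mapper node), and each key is routed to one of two reducer nodes by a simple modulo rule. The split contents, `NUM_REDUCERS`, and the `partition` rule are all hypothetical; a real cluster chooses splits and partitions itself.

```python
from collections import defaultdict

# Hypothetical input splits, one per mapper node (made-up rows, u.data layout).
SPLITS = [
    ["196\t242\t3\t881250949", "186\t302\t3\t891717742"],
    ["196\t377\t1\t878887116", "243\t51\t2\t880606923", "186\t346\t1\t886397596"],
]
NUM_REDUCERS = 2


def mapper(line):
    user_id, movie_id, _rating, _timestamp = line.split("\t")
    return user_id, movie_id


def partition(user_id):
    """Pick the reducer node for a key; every pair with the same key lands together."""
    return int(user_id) % NUM_REDUCERS


# Map phase: each split is processed independently, and its output is
# routed to the reducer node that owns the key.
reducer_inputs = [defaultdict(list) for _ in range(NUM_REDUCERS)]
for split in SPLITS:
    for line in split:
        user_id, movie_id = mapper(line)
        reducer_inputs[partition(user_id)][user_id].append(movie_id)

# Reduce phase: each reducer node counts the movies for the keys it received.
results = {}
for node, node_input in enumerate(reducer_inputs):
    for user_id in sorted(node_input):
        results[user_id] = len(node_input[user_id])
        print(f"reducer {node}: user {user_id} rated {results[user_id]} movie(s)")
```

Because user 196 appears in both splits, its pairs come from two different mapper nodes but still meet on the same reducer, which is the point of the shuffle.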
