Overview

MapReduce is a programming model and a core component of the Apache Hadoop software framework. It processes massive datasets in parallel by splitting the work across a cluster of computers.

The Two Phases

  1. Map: Filters and sorts the input data (e.g., sorting students by first name into queues, one queue per name).
  2. Reduce: Performs a summary operation on the grouped data (e.g., counting the number of students in each queue).
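
The student example above can be sketched in a few lines of plain Python. This is only a single-process illustration of the idea, not Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, count_names) are hypothetical, and the grouping step that sits between the two phases is written out by hand, whereas on a real cluster the framework would handle it.

```python
from collections import defaultdict


def map_phase(students):
    """Map: emit one (first_name, 1) pair per student."""
    return [(name, 1) for name in students]


def shuffle(pairs):
    """Group emitted values by key (the 'queues' in the student example)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: summarize each queue by counting its entries."""
    return {key: sum(values) for key, values in groups.items()}


def count_names(students):
    """Run the three steps in order on an in-memory list (illustrative only)."""
    return reduce_phase(shuffle(map_phase(students)))


if __name__ == "__main__":
    print(count_names(["Ana", "Ben", "Ana", "Cy"]))
```

In a distributed setting, the map and reduce functions would run on many machines at once, each over its own slice of the data; the sketch keeps everything in one list so the data flow is easy to follow.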

Related Terms