Tasks
In this part, you will implement the remainder of the basic MapReduce system: distributing map and reduce tasks to workers, and executing those tasks.
This task must be completed by the checkpoint 1 deadline to receive full credit (more information can be found on Ed).
Receiving tasks
Implement the GetTask RPC to request a task from the coordinator. The protocol buffers have already been provided for you in proto/coordinator.proto:
message GetTaskReply {
    uint32 job_id = 1;
    string output_dir = 2;
    string app = 3;
    uint32 task = 4;
    string file = 5;
    uint32 n_reduce = 6;
    uint32 n_map = 7;
    bool reduce = 8;
    bool wait = 9;
    repeated MapTaskAssignment map_task_assignments = 10;
    bytes args = 11;
}
Most of the fields should be fairly straightforward to match with the fields provided by the SubmitJob RPC. Additionally, the task field should denote the map or reduce task number that this worker is executing. The wait field should only be true if the worker should become idle and wait before requesting a new task. The reduce field tells the worker whether it should execute a map task or a reduce task.
For map tasks, file corresponds to the input file that the worker should operate on. For reduce tasks, map_task_assignments provides a list of which workers have which map tasks so that the reduce worker can contact the necessary workers for data.
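The field semantics above can be illustrated with a small, language-agnostic sketch of how a worker might interpret a reply. This is not the assignment's starter code: the Python dataclasses below only mirror the message shown earlier, the describe helper is hypothetical, the fields of MapTaskAssignment (task, worker_id) are assumed for illustration, and reduce = true is assumed to mean a reduce task. The summary string it returns is also a convenient shape for a logging statement.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MapTaskAssignment:
    # Hypothetical fields; the real message is defined in proto/coordinator.proto.
    task: int
    worker_id: int


@dataclass
class GetTaskReply:
    # Mirrors the fields of the GetTaskReply message shown above.
    job_id: int = 0
    output_dir: str = ""
    app: str = ""
    task: int = 0
    file: str = ""
    n_reduce: int = 0
    n_map: int = 0
    reduce: bool = False
    wait: bool = False
    map_task_assignments: List[MapTaskAssignment] = field(default_factory=list)
    args: bytes = b""


def describe(reply: GetTaskReply) -> str:
    """Return a one-line summary of what the worker should do next."""
    if reply.wait:
        # No task is available right now: idle, then ask again.
        return "wait"
    if reply.reduce:
        # Assumption: reduce = true marks a reduce task. The worker must fetch
        # intermediate data from the workers that ran the map tasks.
        sources = sorted((a.task, a.worker_id) for a in reply.map_task_assignments)
        return f"reduce job={reply.job_id} task={reply.task} sources={sources}"
    # Map task: operate on the single input file named in the reply.
    return f"map job={reply.job_id} task={reply.task} file={reply.file}"
```

Checking wait before reduce matters in this sketch: when wait is set, the remaining fields carry no meaningful task and should be ignored.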
Once you are done, sanity-check that tasks are being assigned correctly by inserting logging statements.
Finishing tasks
If a task completes successfully, the worker will notify the coordinator using the FinishTask RPC. Implement this RPC.
Once the coordinator learns that a task is complete, it should update its data structures. Once all map tasks for a job are complete, the coordinator should begin assigning reduce tasks. Once all map and reduce tasks for a job are complete, the coordinator should mark the job complete, and subsequent calls to the PollJob RPC should return done = true.
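One possible shape for the coordinator's per-job bookkeeping is sketched below. It is a simplified illustration under several assumptions rather than the required design: the JobState class and its method names are hypothetical, it tracks a single job, it ignores concurrency and worker failures, and it is written in Python purely to show the state transitions (maps first, then reduces, then done).

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class JobState:
    """Hypothetical per-job state kept by the coordinator."""
    n_map: int
    n_reduce: int
    assigned_map: Set[int] = field(default_factory=set)
    assigned_reduce: Set[int] = field(default_factory=set)
    done_map: Set[int] = field(default_factory=set)
    done_reduce: Set[int] = field(default_factory=set)

    def next_task(self) -> Optional[Tuple[bool, int]]:
        """Return (is_reduce, task_number), or None if the worker should wait."""
        # Hand out unassigned map tasks first.
        for t in range(self.n_map):
            if t not in self.assigned_map:
                self.assigned_map.add(t)
                return (False, t)
        # Reduce tasks may only start once every map task has finished.
        if len(self.done_map) < self.n_map:
            return None
        for t in range(self.n_reduce):
            if t not in self.assigned_reduce:
                self.assigned_reduce.add(t)
                return (True, t)
        # Everything is assigned; nothing left to hand out for this job.
        return None

    def finish_task(self, is_reduce: bool, task: int) -> None:
        """Record a FinishTask notification from a worker."""
        (self.done_reduce if is_reduce else self.done_map).add(task)

    @property
    def done(self) -> bool:
        """True once all map and reduce tasks are complete (what PollJob reports)."""
        return (len(self.done_map) == self.n_map
                and len(self.done_reduce) == self.n_reduce)
```

In this sketch a None result from next_task corresponds to replying with wait = true, and the done property is what a PollJob handler would report. Using sets keyed by task number makes a repeated FinishTask for the same task harmless.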
Autograder
Once you complete this portion of the assignment, you should be passing the autograder tests up to and including mr-no-duplicates.