Making Yin and Yang of YARN and Mesos

YARN has drawn considerable attention as the resource scheduler that allows Hadoop 2.x to finally transcend its MapReduce roots. YARN’s own MapReduce heritage is both its strength and its weakness: it provides backward compatibility for managing the MapReduce workloads that have dominated Hadoop, but its job-oriented batch origins also limit how well it runs ongoing workloads. By contrast, Apache Mesos has existed for some time as an open source project that provides resource management for scale-out clusters of all kinds, not just Hadoop. It is well suited to dynamic management of continuous (ongoing) workloads.

While a bit dated, this 2011 Quora posting provides a good point-by-point comparison of YARN’s and Mesos’ strengths and shortcomings. Although the two are not directly comparable, until now they have been considered rival approaches.

A new project, Myriad, proposes to bring them together. Pending Apache incubation status, it would superimpose Mesos as the top-level dynamic juggler of resources, while YARN sticks to its knitting and schedules them. In essence, it would make YARN elastic. MapR, which is staking out new ground as a participant in, rather than a consumer of, Apache projects, is joining with Mesosphere and eBay to drive the project, with plans to submit it to Apache for incubation.

Myriad is not the only game in town. Slider, a project led by Hortonworks, is taking the reverse approach. Instead of Mesos dynamically allocating containers (resources) to YARN, Slider works as a helper to YARN, dynamically requesting new resources when a YARN container fails.

Myriad vs. Slider typifies the emerging reality for Hadoop: when issues arise in the Hadoop platform, chances are there will be competing remedies vying for adoption.