Preemptive, Multi-tenant Spark on Mesos

Posted on October 8, 2015

Author: David Greenberg

Presented at: MesosCon Europe, Dublin, Ireland

Abstract: Spark is a popular new platform for interactive, high-performance analytics, machine learning, and data processing. The trouble is that Spark tends to monopolize whatever Mesos cluster it runs on, so you either create a completely separate Spark cluster for each user or otherwise limit the resources each user can use. Cook is an advanced fair-sharing, preemptive scheduling backend for Spark. You can run a single instance of Cook on your Mesos cluster, and it will automatically adapt the capacity available to every user and team so that interactive jobs run immediately while utilization remains high. Cook also has a REST API and a Java client, and it is written in Clojure with Datomic.
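To make the idea of fair-sharing with preemption concrete, here is a minimal Python sketch. It is not Cook's actual algorithm or API; the `Task` type, the `pick_preemptions` function, and the equal-share policy are all assumptions made for illustration. It shows one simple way a scheduler might decide which running tasks to evict so that a newly arrived job from an under-served user can start right away.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical unit of work: who owns it and how many CPUs it holds."""
    user: str
    cpus: float


def usage_by_user(running):
    """Sum the CPUs currently held by each user."""
    usage = {}
    for task in running:
        usage[task.user] = usage.get(task.user, 0.0) + task.cpus
    return usage


def pick_preemptions(running, pending, capacity):
    """Return the running tasks to preempt so that `pending` can start.

    Assumed policy (illustrative only): every active user is entitled to an
    equal share of `capacity`. Tasks are taken only from users who are above
    that share, and only if the pending user would stay within their own.
    Returns [] if nothing must be preempted, or None if the job has to wait.
    """
    usage = usage_by_user(running)
    active_users = set(usage) | {pending.user}
    share = capacity / len(active_users)

    free = capacity - sum(usage.values())
    if pending.cpus <= free:
        return []  # fits without preempting anyone

    if usage.get(pending.user, 0.0) + pending.cpus > share:
        return None  # pending user would exceed their share

    victims = []
    # Look at tasks owned by the heaviest users first.
    candidates = sorted(running, key=lambda t: usage[t.user], reverse=True)
    for task in candidates:
        if free >= pending.cpus:
            break
        if task.user != pending.user and usage[task.user] > share:
            victims.append(task)
            usage[task.user] -= task.cpus
            free += task.cpus
    return victims if free >= pending.cpus else None
```

For example, if one user holds all 16 CPUs of a cluster in four 4-CPU tasks and a second user submits a 4-CPU job, this sketch selects a single task of the first user for preemption. A production scheduler would also need to weigh priorities, memory, and the cost of lost work, none of which is modeled here.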


The views expressed above are not necessarily the views of Two Sigma Investments, LP or any of its affiliates (collectively, “Two Sigma”).  The information presented above is only for informational and educational purposes and is not an offer to sell or the solicitation of an offer to buy any securities or other instruments. Additionally, the above information is not intended to provide, and should not be relied upon for investment, accounting, legal or tax advice. Two Sigma makes no representations, express or implied, regarding the accuracy or completeness of this information, and the reader accepts all risks in relying on the above information for any purpose whatsoever. Click here for other important disclaimers and disclosures.