Apache Livy is a service for interacting with Apache Spark through a REST interface. It supports the submission of both complete Spark jobs and snippets of Spark code. The following features are supported:
- Jobs can be submitted as precompiled JARs, as snippets of code, or via the Java/Scala client API.
- Interactive Scala, Python, and R shells.
- Support for Spark 2.x and Spark 1.x, and for Scala 2.10 and 2.11.
- No changes to your Spark code are required.
- Long-running Spark contexts that can be reused for multiple Spark jobs, by multiple clients.
- Multiple Spark contexts can be managed simultaneously; they run on the cluster rather than in the Livy server, which provides good fault tolerance and concurrency.
- Cached RDDs or DataFrames can be shared across multiple jobs and clients.
- Secure authenticated communication.
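To make the REST workflow concrete, here is a minimal sketch of how a client talks to Livy. It assumes a Livy server at the default port 8998; the interactive flow is to `POST /sessions` to start a session and `POST /sessions/{id}/statements` to run code in it. The actual HTTP calls (via the `requests` library) are shown as comments so the snippet stays self-contained without a running server:

```python
import json

# Assumed Livy endpoint: 8998 is Livy's default port; adjust for your deployment.
LIVY_URL = "http://localhost:8998"
HEADERS = {"Content-Type": "application/json"}

def session_payload(kind="pyspark"):
    """Body for POST /sessions: starts an interactive session of the given kind
    (e.g. "spark", "pyspark", or "sparkr")."""
    return {"kind": kind}

def statement_payload(code):
    """Body for POST /sessions/{id}/statements: runs a code snippet in the session."""
    return {"code": code}

# With a Livy server running, the calls would look like:
#   import requests
#   r = requests.post(LIVY_URL + "/sessions",
#                     data=json.dumps(session_payload("pyspark")), headers=HEADERS)
#   session_id = r.json()["id"]
#   requests.post(f"{LIVY_URL}/sessions/{session_id}/statements",
#                 data=json.dumps(statement_payload("1 + 1")), headers=HEADERS)
# The statement result is then polled via GET /sessions/{id}/statements/{stmt_id}.
print(json.dumps(session_payload("pyspark")))
```

Because the contexts live on the cluster, several clients can post statements to the same session and share its cached data, which is exactly the sharing described in the feature list above.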
In the second part of this series I will cover the details of starting a Livy server and submitting PySpark code.