In the first part of this series we became familiar with ScalaTest. When it comes to unit testing Scala Spark applications, however, ScalaTest alone isn't enough: you also need to add spark-testing-base to the roster. It is an Open Source framework that provides base classes for the main Spark abstractions, such as SparkContext, RDD, DataFrame, Dataset and Streaming. Let's start exploring the facilities provided by this framework, and how it works along with ScalaTest, through some simple examples. Consider the following Scala word count example, found on the web:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkWordCount {
  def main(args: Array[String]) {
    val inputFile = args(0)
    val outputFile = args(1)
    val conf = new SparkConf().setAppName("SparkWordCount")
    // Create a Scala Spark Context.
    val sc = new SparkContext(conf)
    // Load our input data...
```
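The listing above is cut off right after the point where the input data is loaded. As a reference for the rest of this post, here is a minimal sketch of how such a word count application typically continues; the original continuation isn't shown, so the transformation and output steps below assume the standard split / count / save pattern:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkWordCount {
  def main(args: Array[String]) {
    val inputFile = args(0)
    val outputFile = args(1)
    val conf = new SparkConf().setAppName("SparkWordCount")
    // Create a Scala Spark Context.
    val sc = new SparkContext(conf)
    // Load our input data.
    val input = sc.textFile(inputFile)
    // From here on the code is an assumed completion following the
    // usual word count pattern. Split each line into words.
    val words = input.flatMap(line => line.split(" "))
    // Build (word, 1) pairs and sum the counts per word.
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)
    // Save the counts back out to a text file, triggering evaluation.
    counts.saveAsTextFile(outputFile)
    sc.stop()
  }
}
```

Code written this way is awkward to unit test on its own: everything lives in main and depends on a real SparkContext and files on disk. This is exactly where spark-testing-base helps: its SharedSparkContext trait can be mixed into an ordinary ScalaTest suite and exposes a ready-to-use SparkContext named sc, created once and shared across the tests of the suite instead of being spun up for each test. As a preview of what we will explore, here is a minimal sketch of such a test; the suite name, the sample data and the expected counts are illustrative assumptions (and with ScalaTest 3.1+ the base class would be AnyFunSuite from org.scalatest.funsuite rather than FunSuite):

```scala
import com.holdenkarau.spark.testing.SharedSparkContext
import org.scalatest.FunSuite

// Hypothetical suite: the SharedSparkContext trait provides the sc used below.
class WordCountSpec extends FunSuite with SharedSparkContext {

  test("counts each word across all lines") {
    // Use a small in-memory RDD instead of reading a file from disk.
    val lines = sc.parallelize(Seq("hello spark", "hello scala"))

    val counts = lines
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .collectAsMap()

    assert(counts("hello") === 2)
    assert(counts("spark") === 1)
    assert(counts("scala") === 1)
  }
}
```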