I have uploaded to GitHub some sample code covering the most common Apache Spark transformations and actions on RDDs. It also includes examples of reading and writing files (text and sequence files), a word count function, and a simple PageRank implementation.
Download Repository
The code is written in Java and built with Maven. I have been following the book "Learning Spark: Lightning-Fast Big Data Analysis" by Holden Karau and co-authors, which gives very useful advice about the Spark engine.
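To give a feel for what a word count involves, here is a minimal sketch in plain Java using `java.util.stream`. It is not the code from the repository and it does not use Spark at all. It only mirrors the usual shape of the job (split lines into words, then count per word) so that it runs without a cluster. The class name `WordCountSketch` and the helper `countWords` are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class WordCountSketch {

    // Hypothetical helper: counts how often each word occurs across all lines.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                // Split every line on whitespace and flatten into one stream of words
                // (comparable in spirit to a flatMap over an RDD of lines).
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                // Blank lines produce a single empty token; drop it.
                .filter(word -> !word.isEmpty())
                // Group identical words and count each group
                // (comparable in spirit to pairing each word with 1 and reducing by key).
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("spark makes rdds", "rdds are lazy", "spark is fast");
        // Print each word with its count, in alphabetical order (TreeMap keeps keys sorted).
        countWords(lines).forEach((word, count) -> System.out.println(word + ": " + count));
    }
}
```

In a Spark version the same two steps would typically run as distributed operations on an RDD of lines read from a file, with the per-word counts merged across partitions.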