A simple and easy Spark standalone program for beginners.

Courtesy: Hariprasad Bhaskaran


Aim:
To demonstrate writing a simple Spark Scala program,
building it with sbt,
declaring dependencies for spark-core, spark-sql, and spark-hive in build.sbt,
and producing a final jar that can be run with spark-submit.


Program code below:


MyAdd2.scala
=================

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext


object MyAdd2
{
  def main(args: Array[String]): Unit =
  {
    // Run locally, with "t1" as the application name
    val conf = new SparkConf().setAppName("t1").setMaster("local")
    val sc2 = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc2)

    import sqlContext.implicits._

    // Build an RDD of ten integers, pair each value with 1,
    // and convert it to a DataFrame with columns "vals" and "ones"
    val mylist = sc2.parallelize(List(1, 2, 3, 4, 5, 6, 7, 8, 9, 100))
    val mylistDF = mylist.map(x => (x, 1)).toDF("vals", "ones")

    // Register the DataFrame as a temporary table and query it with SQL
    mylistDF.registerTempTable("t1")
    val qry1 = sqlContext.sql("select * from t1")

    // Collect the results back to the driver and print them
    val r2 = qry1.collect()

    println("results are below...")

    for (i <- r2)
    {
      println(i)
    }

    sc2.stop()
  }
}
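As a quick sanity check that needs no Spark installation, the core transformation in MyAdd2 can be sketched in plain Scala. The object name PairSketch is just for illustration; the pairing logic is the same `map(x => (x, 1))` used above, whose tuples become the "vals" and "ones" columns of the DataFrame.

```scala
// Plain-Scala sketch (no Spark needed) of the pairing step in MyAdd2:
// each value in the list is paired with the literal 1.
object PairSketch {
  def main(args: Array[String]): Unit = {
    val mylist = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
    val rows = mylist.map(x => (x, 1))   // a Seq of (value, 1) tuples
    rows.foreach(println)                // prints (1,1), (2,1), ..., (100,1)
  }
}
```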





build.sbt
==========

name := "Sparksqltest"
version := "1.0"
scalaVersion := "2.11.7"
 
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.0"
libraryDependencies += "org.apache.spark" %% "spark-hive" % "1.6.0"
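A side note on the build file: when the jar is meant to be run on a cluster with spark-submit, the Spark dependencies are commonly marked "provided" so sbt does not package them (the cluster already supplies Spark at runtime). This is a common convention, not a requirement for the local example above; the scoped form of one line would look like:

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0" % "provided"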



How to compile
=============

# First create the following directory structure ==> sampleprj1/src/main/scala
mkdir -p ~/sampleprj1/src/main/scala

# Put the Scala code inside the sampleprj1/src/main/scala subfolder
cp MyAdd2.scala ~/sampleprj1/src/main/scala

# Put build.sbt in ~/sampleprj1
cp build.sbt ~/sampleprj1

# Then, from inside the project folder, run:
cd ~/sampleprj1
sbt package

# Once it finishes, the compiled classes are in
# ~/sampleprj1/target/scala-2.11/classes
# and the packaged jar is in ~/sampleprj1/target/scala-2.11/
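The jar produced by `sbt package` can then be run with spark-submit, as stated in the aim. The jar file name below is an assumption based on the `name`, `version`, and `scalaVersion` in build.sbt (sbt normalizes the project name to lowercase); check the actual file name under target/scala-2.11/ before running:

spark-submit --class MyAdd2 --master local ~/sampleprj1/target/scala-2.11/sparksqltest_2.11-1.0.jar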



