
Apache Spark: create an RDD from an array in Java | convert an array into an RDD

  • The RDD is Spark's core abstraction.
  • RDD stands for resilient distributed dataset.
  • Each RDD is split into multiple partitions, which may be computed on different machines in the cluster.
  • We can create an Apache Spark RDD in two ways:
    1. Parallelizing a collection
    2. Loading an external dataset
     
  • Creating an RDD from an array falls under parallelizing a collection.
  • Let us look at an Apache Spark example program that converts an array into an RDD.



Program #1: Write an Apache Spark Java example program that creates a simple RDD using the parallelize method of JavaSparkContext, converting an array into an RDD.


    package com.instanceofjava.sparkInterview;

    import java.util.Arrays;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    /**
     * Apache Spark examples: RDD in Spark example program,
     * converting an array to an RDD.
     * @author www.instanceofjava.com
     */
    public class SparkTest {

        public static void main(String[] args) {
            // Run locally with two worker threads.
            SparkConf conf = new SparkConf()
                    .setMaster("local[2]").setAppName("InstanceofjavaAPP");
            JavaSparkContext sc = new JavaSparkContext(conf);

            String[] arrayStr = {"convert array to rdd", "convert array into rdd"};
            // parallelize takes a List, so wrap the array with Arrays.asList.
            JavaRDD<String> strRdd = sc.parallelize(Arrays.asList(arrayStr));
            System.out.println("apache spark rdd created: " + strRdd);

            // Return the first element in this RDD.
            System.out.println(strRdd.first());

            sc.close(); // release the local Spark context
        }
    }

Output:

    apache spark rdd created: ParallelCollectionRDD[0] at parallelize at SparkTest.java:24
    convert array to rdd
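
The points at the top of this post say that an RDD is split into multiple partitions. To make that idea concrete, here is a small plain-Java sketch that cuts a list into contiguous slices, roughly the way a parallelized collection could be divided. It does not use Spark at all: the class PartitionSketch and its slice method are hypothetical names made up for illustration, and the splitting rule is an assumption, not a copy of Spark's own code. Two slices are used only to mirror the two threads requested by local[2].

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark: shows one way a local list
// can be cut into contiguous, near-equal partitions.
class PartitionSketch {

    // Split items into numSlices contiguous sublists.
    static <T> List<List<T>> slice(List<T> items, int numSlices) {
        if (numSlices < 1) {
            throw new IllegalArgumentException("numSlices must be at least 1");
        }
        List<List<T>> partitions = new ArrayList<>();
        int n = items.size();
        for (int i = 0; i < numSlices; i++) {
            // Integer arithmetic spreads any remainder across the slices.
            int start = (int) ((long) i * n / numSlices);
            int end = (int) ((long) (i + 1) * n / numSlices);
            partitions.add(new ArrayList<>(items.subList(start, end)));
        }
        return partitions;
    }

    public static void main(String[] args) {
        String[] arrayStr = {"convert array to rdd", "convert array into rdd"};
        // Assumption: two slices, chosen to mirror local[2] in the program above.
        List<List<String>> parts = slice(Arrays.asList(arrayStr), 2);
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("partition " + i + ": " + parts.get(i));
        }
    }
}
```

With the two-element array from Program #1 and two slices, each slice holds one string. In a real Spark job the partitions are created and scheduled by Spark itself, so this sketch only illustrates the concept of splitting a collection.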





