RDDs vs DataFrames in Apache Spark


Apache Spark:

Apache Spark is a fast, general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, along with an engine for running Spark applications. The Spark project has claimed that it can run some workloads up to 100 times faster than Hadoop MapReduce when the data is held in memory, and up to 10 times faster when the data is read from disk.

Necessity of Apache Spark:

The industry needed a general-purpose cluster computing tool, because each of the existing tools was built for a single kind of workload:

  • MapReduce (limited to batch processing)
  • Storm (limited to stream processing)
  • Impala (limited to interactive processing)
  • Neo4j (limited to graph processing)

Each of these tools handles only one type of processing. Apache Spark, by contrast, supports real-time stream processing, interactive processing, graph processing, in-memory processing and batch processing, and it does so with high speed, ease of use and a standard interface.

Components of Apache Spark:

  • Spark Core
  • Spark SQL
  • Spark Streaming
  • MLlib
  • GraphX

RDDs – Resilient Distributed Datasets:

An RDD is the fundamental unit of data in Spark. It is a collection of elements distributed across the nodes of a cluster, on which operations can be performed in parallel.

RDDs are immutable, but a new RDD can be generated by applying a transformation to an existing RDD.
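
To make the idea of immutability concrete, here is a minimal sketch in plain Python. It is a toy model only, not the Spark API: the class name ToyRDD and its layout are assumptions made for illustration. The method names map and collect mirror methods that real RDDs also offer.

```python
# Toy stand-in for an RDD (NOT the Spark API): an immutable, partitioned collection.
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ToyRDD:
    # Each inner tuple plays the role of one partition held by one cluster node.
    partitions: Tuple[tuple, ...]

    def map(self, fn: Callable) -> "ToyRDD":
        # A transformation never edits self; it builds a brand-new ToyRDD.
        return ToyRDD(tuple(tuple(fn(x) for x in part) for part in self.partitions))

    def collect(self) -> list:
        # Gather the elements of every partition into one local list.
        return [x for part in self.partitions for x in part]


numbers = ToyRDD(((1, 2), (3, 4)))
doubled = numbers.map(lambda x: x * 2)

print(numbers.collect())  # [1, 2, 3, 4]  (the original is unchanged)
print(doubled.collect())  # [2, 4, 6, 8]
```

The original object still holds its old values after the transformation, which is the property the paragraph above describes.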

There are two ways to create RDDs:

Parallelized Collections:

A parallelized collection is created by invoking the parallelize method on an existing collection in the driver program.
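
As a rough illustration of what parallelizing a collection means, the plain-Python sketch below splits a list into slices and processes each slice independently. It is a toy model, not Spark: the helper toy_parallelize is hypothetical, the exact way Spark divides data may differ, and threads stand in for cluster nodes. The closing comment names the comparable PySpark call, assuming a SparkContext called sc already exists.

```python
# Toy illustration (NOT the Spark API) of what "parallelizing" a collection means.
from concurrent.futures import ThreadPoolExecutor


def toy_parallelize(data, num_slices):
    """Split a local list into num_slices contiguous chunks (hypothetical helper)."""
    items = list(data)
    bounds = [len(items) * i // num_slices for i in range(num_slices + 1)]
    return [items[bounds[i]:bounds[i + 1]] for i in range(num_slices)]


slices = toy_parallelize([1, 2, 3, 4, 5, 6], num_slices=3)
print(slices)  # [[1, 2], [3, 4], [5, 6]]

# Each slice can be processed on its own; threads stand in for cluster nodes here.
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_sums = list(pool.map(sum, slices))

print(partial_sums)       # [3, 7, 11]
print(sum(partial_sums))  # 21

# With a real SparkContext named sc (an assumption), the comparable call would be:
#   rdd = sc.parallelize([1, 2, 3, 4, 5, 6], 3)
```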

External Datasets:

An RDD can also be created from an external dataset by calling the textFile method. This method takes the URI of a file and reads it as a collection of lines.
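
The "file as a collection of lines" idea can be sketched in plain Python as follows. This is a toy model, not Spark: the helper toy_text_file and the sample file contents are made up for illustration, and a real RDD would spread the lines across cluster nodes rather than hold them in one local list. The closing comment names the comparable PySpark call, assuming a SparkContext called sc already exists.

```python
# Toy illustration (NOT the Spark API): treat a text file as a collection of lines.
import os
import tempfile


def toy_text_file(path):
    """Return the lines of a local file without line terminators (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().splitlines()


# Write a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("spark core\nspark sql\nspark streaming\n")
    sample_path = tmp.name

lines = toy_text_file(sample_path)
os.remove(sample_path)  # clean up the temporary file

print(lines)       # ['spark core', 'spark sql', 'spark streaming']
print(len(lines))  # 3

# With a real SparkContext named sc (an assumption), the comparable call would be:
#   rdd = sc.textFile(sample_path)
```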


Article source: Geoinsyssoft
