
Handling DataFrame/Dataset with Spark Evaluator

We need to perform complex, ETL-style transformations in the Spark Evaluator. Our origin is the Kafka Multitopic Consumer (each topic is one table from Oracle). Doing this with the JavaRDD API alone seems impractical, so we would like to use Spark SQL, but we have been unable to transform a JavaRDD<Record> into a DataFrame/Dataset and back into a JavaRDD<Record>.

Challenges:

  1. Handling the multi-topic Kafka input for joins/transformations: the Spark Evaluator receives each batch as a single RDD, which has to be split into separate DataFrames/Datasets per topic (see the first sketch below).
  2. Converting the JavaRDD<Record> to a Dataset/DataFrame and back to a JavaRDD<Record> (see the second sketch below).
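To make this concrete, here is the rough skeleton we have been attempting for the first challenge. It is only a sketch under our own assumptions: that the evaluator hands the transformer the whole batch as one JavaRDD<Record>, and that the Multitopic Consumer tags each record with a "topic" header attribute; the schemas and field names below are invented for illustration.

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

import com.streamsets.pipeline.api.Record;

public class TopicToDataFrame {

  // Build one DataFrame from the records of a single Kafka topic.
  public static Dataset<Row> forTopic(SparkSession spark,
                                      JavaRDD<Record> batch,
                                      String topic,
                                      StructType schema) {
    // Keep only the records tagged with this topic (assumption: the
    // Multitopic Consumer carries the topic name in a "topic" header attribute).
    JavaRDD<Record> records = batch.filter(
        r -> topic.equals(r.getHeader().getAttribute("topic")));

    // Flatten each Record into a Row, reading fields in schema order.
    // No null/missing-field handling here, for brevity.
    JavaRDD<Row> rows = records.map(r -> {
      Object[] values = new Object[schema.fields().length];
      for (int i = 0; i < schema.fields().length; i++) {
        StructField f = schema.fields()[i];
        values[i] = r.get("/" + f.name()).getValue();
      }
      return RowFactory.create(values);
    });

    return spark.createDataFrame(rows, schema);
  }
}
```

With something like that in place, we would register one temp view per topic and express the joins in SQL. The topic names, schemas, and columns here are all placeholders:

```java
StructType customersSchema = DataTypes.createStructType(new StructField[] {
    DataTypes.createStructField("cust_id", DataTypes.LongType, false),
    DataTypes.createStructField("name", DataTypes.StringType, true)
});
Dataset<Row> customers = TopicToDataFrame.forTopic(spark, batch, "ORA.CUSTOMERS", customersSchema);
customers.createOrReplaceTempView("customers");
// ...same for the orders topic...
Dataset<Row> joined = spark.sql(
    "SELECT c.cust_id, c.name, o.order_id " +
    "FROM customers c JOIN orders o ON c.cust_id = o.cust_id");
```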

Any sample code or skeleton code would be appreciated in this regard.
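In particular, the way back from Rows to Records is where we are stuck, since the transformer is never handed a factory for creating new Records. Our best guess is to key the joined rows back to the original records and copy the joined columns onto them, roughly as below; the "cust_id" key and the coercion of every value to a string are placeholders, and we do not know whether mutating the incoming records like this is the intended pattern.

```java
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import scala.Tuple2;

import com.streamsets.pipeline.api.Field;
import com.streamsets.pipeline.api.Record;

public class RowsBackToRecords {

  // Attach the joined columns to the original Records, matched on a shared key.
  public static JavaRDD<Record> mergeBack(Dataset<Row> joined, JavaRDD<Record> originals) {
    // Key the SQL result by the join key.
    JavaPairRDD<Long, Row> rowsById = joined.toJavaRDD().mapToPair(
        row -> new Tuple2<>(row.getLong(row.fieldIndex("cust_id")), row));

    // Key the original records by the same id.
    JavaPairRDD<Long, Record> recordsById = originals.mapToPair(
        rec -> new Tuple2<>(rec.get("/cust_id").getValueAsLong(), rec));

    // Copy every column of the matching Row onto the Record as a new field.
    // Everything is coerced to string for brevity; real code would switch on
    // the Spark data type and use the matching Field.create overload.
    return recordsById.join(rowsById).map(pair -> {
      Record rec = pair._2()._1();
      Row row = pair._2()._2();
      for (String col : row.schema().fieldNames()) {
        Object value = row.getAs(col);
        if (value != null) {
          rec.set("/" + col, Field.create(value.toString()));
        }
      }
      return rec;
    });
  }
}
```

If there is a supported way to create fresh Records inside the transformer, so the output could be built from the rows directly instead of by mutating the inputs, that would be even better.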
