Convert RDBMS data to Parquet [closed]

asked 2019-06-03 13:36:20 -0500

KeerthiS

updated 2019-06-03 13:42:44 -0500

metadaddy

How can I convert data (records) retrieved from an RDBMS to Parquet?

Closed for the following reason: the question is answered, and the right answer was accepted by KeerthiS
close date 2019-06-04 08:32:53.174294

1 Answer

answered 2019-06-03 13:42:35 -0500

metadaddy

Write the data in Avro format via the Hadoop FS destination, then use the MapReduce executor to convert it from Avro to Parquet.

The Parquet Conversion case study describes this in detail.


Thank you for the quick reply. I am using the flow JDBC Producer -> Schema Generator -> Data Parser -> Hadoop FS destination, but I am unable to write from the Data Parser step, because the data arrives as records, and I am getting this error:

KeerthiS ( 2019-06-03 14:59:16 -0500 )

HADOOPFS_14 - Cannot write record: java.lang.IllegalArgumentException: Record does not contain the mandatory fields /fileRef,/fileInfo,/fileInfo/size for Whole File Format.

KeerthiS ( 2019-06-03 14:59:22 -0500 )
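
For readers hitting the same HADOOPFS_14 message: it says the record was checked against the Whole File format, which requires the fields /fileRef, /fileInfo and /fileInfo/size. Rows read over JDBC would not normally carry those fields. The sketch below only illustrates that kind of check in plain Python. It is not StreamSets code; the dict-based record model, the sample records, and the helper names `has_field` and `missing_fields` are all hypothetical.

```python
# Illustrative only: this is not StreamSets code. A record is modelled as a
# nested dict, and field paths use the same /a/b notation as the error message.
MANDATORY_WHOLE_FILE_FIELDS = ["/fileRef", "/fileInfo", "/fileInfo/size"]


def has_field(record, path):
    """Return True if the slash-separated path exists in the nested dict."""
    node = record
    for key in path.strip("/").split("/"):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def missing_fields(record, required=MANDATORY_WHOLE_FILE_FIELDS):
    """List the required field paths that the record does not contain."""
    return [path for path in required if not has_field(record, path)]


# Hypothetical row read from a database table: column fields only.
jdbc_record = {"id": 1, "name": "example"}
# Hypothetical whole-file record: a file reference plus file metadata.
whole_file_record = {"fileRef": "<reference>", "fileInfo": {"size": 1024}}

print(missing_fields(jdbc_record))
print(missing_fields(whole_file_record))
```

If that reading of the error is right, it fits the answer's suggestion to write Avro from the Hadoop FS destination rather than whole files.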

Resolved the issue. I used JDBC Consumer -> Schema Generator -> Hadoop FS destination -> MapReduce executor, and I am now able to write a Parquet file successfully. Thanks!

KeerthiS ( 2019-06-04 08:32:35 -0500 )
