
Cannot write Avro to HDFS

asked 2019-02-07 15:26:13 -0500

hxue

updated 2019-02-08 10:41:42 -0500

Here is my simple data flow: Kafka Consumer -> Hadoop FS -- event -> MapReduce

I am sure that the data coming into Kafka is in Avro format, and I can see the records in preview mode.

In Hadoop FS, on the Data Format tab, I set the Data Format to Avro. On the Output Files tab, I set the File Type to Sequence File.

I did see some files written to HDFS, but they were not in the correct format.


2 Answers


answered 2019-02-11 18:23:30 -0500

metadaddy

updated 2019-02-11 18:23:46 -0500

You should set the File Type to 'Text files' rather than 'Sequence files' when writing Avro data format to Hadoop FS.
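
To check which container format actually landed in HDFS, it can help to look at the first bytes of an output file. Avro object container files start with the four bytes `Obj` followed by `0x01`, while Hadoop SequenceFiles start with `SEQ` followed by a version byte. The following is a minimal sketch, assuming the file has first been copied to the local disk (for example with `hdfs dfs -get`); the function names `detect_container` and `detect_file` are made up for this illustration.

```python
import sys

# Magic bytes at the start of each container format.
AVRO_MAGIC = b"Obj\x01"  # Avro object container file
SEQ_MAGIC = b"SEQ"       # Hadoop SequenceFile (a version byte follows)


def detect_container(header: bytes) -> str:
    """Classify a file by its leading bytes (hypothetical helper)."""
    if header.startswith(AVRO_MAGIC):
        return "avro"
    if header.startswith(SEQ_MAGIC):
        return "sequencefile"
    return "unknown"


def detect_file(path: str) -> str:
    """Read the first four bytes of a local file and classify it."""
    with open(path, "rb") as f:
        return detect_container(f.read(4))


if __name__ == "__main__":
    # Usage: python detect.py <local copy of an output file> ...
    for p in sys.argv[1:]:
        print(p, detect_file(p))
```

If the result is `sequencefile`, the Avro records were wrapped in a SequenceFile rather than written as an Avro container file, which would be consistent with the File Type setting described in the question.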



Thanks for your reply, but it does not seem to support complex data types, so I switched to Spark.

hxue (2019-06-10 14:22:07 -0500)

answered 2019-02-10 14:00:07 -0500

supahcraig

I haven't consumed Avro from Kafka into StreamSets (yet), but are you deserializing the Avro when you consume it? And have you specified the Avro schema in the Hadoop FS output stage?
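
For reference, an Avro schema is a JSON document, and complex types such as arrays, nested records, and nullable unions are declared inline. The sketch below builds one in Python and prints the JSON that could be supplied wherever the output stage asks for the schema. The record and field names (`Order`, `tags`, `customer`, and so on) are hypothetical and not taken from the original pipeline.

```python
import json

# Hypothetical schema: every name here is invented for illustration.
schema = {
    "type": "record",
    "name": "Order",
    "namespace": "example.hypothetical",
    "fields": [
        {"name": "id", "type": "long"},
        # Nullable field: a union of null and string, defaulting to null.
        {"name": "note", "type": ["null", "string"], "default": None},
        # Complex type: array of strings.
        {"name": "tags", "type": {"type": "array", "items": "string"}},
        # Complex type: nested record.
        {
            "name": "customer",
            "type": {
                "type": "record",
                "name": "Customer",
                "fields": [{"name": "name", "type": "string"}],
            },
        },
    ],
}

# Serialize to the JSON text form that Avro tooling expects.
schema_json = json.dumps(schema, indent=2)

if __name__ == "__main__":
    print(schema_json)
```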

