
Cannot write Avro into HDFS

asked 2019-02-07 15:26:13 -0600 by hxue (updated 2019-02-08 10:41:42 -0600)

Here is my simple data flow: Kafka Consumer -> Hadoop FS -- event -> MapReduce

I am sure the incoming data in Kafka is in Avro format, and I can see it in preview mode.

In the Hadoop FS destination, on the Data Format tab I set Data Format to Avro, and on the Output Files tab I set File Type to Sequence File.

Some files were written to HDFS, but not in the correct format.


2 Answers


answered 2019-02-11 18:23:30 -0600 by metadaddy (updated 2019-02-11 18:23:46 -0600)

You should set the File Type to 'Text files' rather than 'Sequence files' when writing the Avro data format to Hadoop FS.



Thanks for your reply, but it seems this does not support complex data types, so I turned to Spark instead.

hxue (2019-06-10 14:22:07 -0600)

answered 2019-02-10 14:00:07 -0600 by supahcraig

I haven't consumed Avro from Kafka into StreamSets (yet), but are you deserializing the Avro when you consume it? And have you specified the Avro schema in the Hadoop FS output stage?
