Modifying the schema while writing to Hadoop FS from a Kafka Consumer

asked 2019-08-26 09:23:05 -0600

Ankit


I am using a Kafka Consumer as the origin. The Kafka topic is populated by an Informatica CDC component, which provides the data (Avro) in one file and the associated Avro schema in another file. The data is in Avro format, and in the pipeline configuration we pass the Avro schema to deserialize it.

When loading to the Hadoop FS target, I need to rearrange the fields and drop some of the fields that come from the source (Kafka topic). I am using Avro as the Data Format and, in the pipeline configuration, passing the new schema that I want in HDFS. The output file type is set to Text File.

The file is created with the given schema (the HDFS schema), but all attributes are null, and running hadoop fs -text <filename> reports that it is not a data file.

I am looking for help getting the HDFS data written as Avro according to the given schema (the modified one for the target).
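
To illustrate the transformation I am after, here is a minimal Python sketch. It is not StreamSets configuration: the target schema, the field names (id, name, op_type), and the project helper are all hypothetical and only stand in for my real data. It keeps just the fields named in the target schema, in the target schema's order.

```python
import json

# Hypothetical target (HDFS) schema: a subset of the source fields, reordered.
TARGET_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "CustomerTarget",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "id", "type": "long"}
  ]
}
""")


def project(record, schema):
    """Return a new dict holding only the schema's fields, in schema order."""
    projected = {}
    for field in schema["fields"]:
        name = field["name"]
        if name not in record:
            # Fail loudly instead of writing a null for a field that is absent.
            raise KeyError(f"source record has no field {name!r}")
        projected[name] = record[name]
    return projected


# Hypothetical record, as deserialized from the Kafka topic with the source schema.
source_record = {"id": 42, "name": "Ankit", "op_type": "INSERT"}

target_record = project(source_record, TARGET_SCHEMA)
print(target_record)
```

The sketch raises an error when a target field is missing from the source record rather than filling it with null, so a name mismatch between the two schemas would show up immediately.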
