
Metadata Drift Solution for File based Ingestion

asked 2018-08-24 14:29:04 -0500

jay1988

Is it possible to implement a metadata drift solution for file-based ingestion? We already do this for database-based ingestion, but not for file-based ingestion. Can we do that?


1 Answer


answered 2018-08-24 20:21:46 -0500

metadaddy

updated 2018-08-26 20:38:48 -0500

You can use the Hive Metadata processor and Hive Metastore destination with any origin. The processor just examines the incoming record's structure and metadata, such as field names.
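To make the idea concrete, here is a rough Python sketch of the kind of comparison involved: checking an incoming record's field names against the columns already known for the table. This is only an illustration, not Data Collector code; `detect_drift`, the sample columns, and the sample record are all made up for the example.

```python
def detect_drift(known_columns, record):
    """Return the record's field names that are missing from the known columns.

    Hypothetical helper for illustration only; it mimics the idea of
    comparing a record's structure against an existing table schema.
    """
    return [name for name in record if name not in known_columns]


# Columns assumed to exist already in the target table (made-up example).
known = {"id", "name"}

# An incoming record that carries one extra field.
record = {"id": 1, "name": "Ann", "email": "ann@example.com"}

new_columns = detect_drift(known, record)
print(new_columns)
```

Because the check only looks at the record itself, it does not matter whether the record came from a database or a file.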

If you're reading CSV data from a file, that should work just fine - you can use the headers from the file, or assign them in the pipeline using the Field Renamer processor. If you're reading something with more structure, such as Avro or JSON, you'll need to flatten the record before it hits the processor.
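As a sketch of what "flatten" means here, the Python below collapses a nested record into a single level of fields. It is an illustration under assumptions, not pipeline code: the `flatten` function, the underscore separator, and the sample record are hypothetical, and lists are left untouched for brevity. In a pipeline you would normally do this with a processor (Data Collector has a Field Flattener processor, as far as I know) rather than with custom code.

```python
def flatten(record, prefix="", sep="_"):
    """Collapse nested dicts into one level, joining key names with `sep`.

    Hypothetical helper for illustration; list values are kept as-is.
    """
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested maps, carrying the joined name as the prefix.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# A made-up nested record, such as one parsed from JSON.
nested = {"id": 1, "address": {"city": "Austin", "geo": {"lat": 30.3}}}
flat_record = flatten(nested)
print(flat_record)
```

After flattening, every field sits at the top level of the record, which is the shape the processor expects.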


Comments

In that case, does the file need to contain header information? Also, if header information is present, can the file be processed as a whole file, or does it have to be record-based processing?

jay1988 ( 2018-08-24 22:28:14 -0500 )

Edited my answer to include info on headers. You will need to use record-based processing.

metadaddy ( 2018-08-26 20:39:21 -0500 )
