Multiple sources to hive database

asked 2020-03-17 07:16:46 -0500

Venkat

updated 2020-03-18 10:55:25 -0500

metadaddy

I receive data from multiple sources, as in the example below. How can I load it into a Hive table using a StreamSets pipeline?

Example:

  • 1st day - 10 flat files (.csv format)
  • 2nd day - 10 flat files and 10 pdf files
  • 3rd day - 10 oracle tables and 10 flat files

How do I process the data into Hive with dynamic sources?

1 Answer

answered 2020-03-18 12:58:08 -0500

metadaddy

You will need a separate pipeline for each source. Also, there is no off-the-shelf origin for PDF files; to read them you would need to use a scripting origin or write a custom Java origin.
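
Since each source needs its own pipeline, one way to cope with a mix that changes from day to day is to sort the incoming files by type before any pipeline runs. The following is only a minimal sketch in plain Python, not StreamSets code: the pipeline names and the `group_by_pipeline` helper are hypothetical, and Oracle tables are left out because they are not files and would be read by their own pipeline.

```python
from collections import defaultdict
from pathlib import PurePath

# Hypothetical mapping from file extension to the pipeline meant to handle it.
# These names are illustrative only; they are not StreamSets identifiers.
PIPELINE_BY_EXTENSION = {
    ".csv": "csv_to_hive",
    ".pdf": "pdf_to_hive",
}


def group_by_pipeline(filenames):
    """Group file names by the (hypothetical) pipeline that should process them."""
    groups = defaultdict(list)
    for name in filenames:
        # Compare extensions case-insensitively so "B.PDF" matches ".pdf".
        extension = PurePath(name).suffix.lower()
        pipeline = PIPELINE_BY_EXTENSION.get(extension, "unhandled")
        groups[pipeline].append(name)
    return dict(groups)


if __name__ == "__main__":
    todays_files = ["orders.csv", "invoice.pdf", "notes.txt"]
    for pipeline, names in group_by_pipeline(todays_files).items():
        print(pipeline, names)
```

Files that land in the "unhandled" group are the ones for which no pipeline exists yet, which makes a new source type visible instead of silently skipped.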
