Extract ZIP File Directory Origin

asked 2019-10-11 08:37:12 -0500

Peter Delaney

I am new to StreamSets and want to build a simple pipeline that reads ZIP files with the Directory origin, unzips them, and uses the contents of each archive as input to a processor or destination. I would like to accomplish this without writing any code, and I am looking for examples I can study.

I've been reading the documentation and many blog posts, but I cannot determine the best way to do this. I have seen FTP Client examples, but they do not fit my case.


A lot depends on the actual compression algorithm used. PKZip? GZip? Something else?

metadaddy ( 2019-10-11 11:31:56 -0500 )

As of now I am only trying to handle PKZip files (.zip archive format). I will eventually need to work with GZip (.tar.gz format) as well. There could be other formats I am not aware of yet, but for now I am trying to understand how to configure my pipeline.

Peter Delaney ( 2019-10-11 12:52:01 -0500 )

Should I create two data pipelines for my use case: (1) one that unzips the contents to a new directory, and (2) another that reads this new directory and processes the contents?

Peter Delaney ( 2019-10-14 14:22:01 -0500 )
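
For reference, the "unzip to a new directory" step of the two-step idea above could also be done outside StreamSets as a small pre-processing script. The following is only a minimal Python sketch of that step, not a StreamSets feature or API: `stage_archives`, `inbox`, and `staging` are hypothetical names, and it assumes the archives come from a trusted source and are either .zip or .tar.gz files.

```python
import tarfile
import zipfile
from pathlib import Path


def stage_archives(inbox: Path, staging: Path) -> list[Path]:
    """Extract every .zip and .tar.gz/.tgz archive in `inbox` into `staging`.

    Hypothetical helper: returns the extracted file paths, sorted.
    Files in `inbox` with any other extension are ignored.
    """
    staging.mkdir(parents=True, exist_ok=True)
    for archive in sorted(inbox.iterdir()):
        # One subdirectory per archive so same-named members do not collide.
        target = staging / archive.name.replace(".", "_")
        if archive.name.endswith(".zip"):
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(target)
        elif archive.name.endswith((".tar.gz", ".tgz")):
            with tarfile.open(archive, "r:gz") as tf:
                # Use the safer "data" extraction filter where this Python
                # version provides it; older versions extract without it.
                kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
                tf.extractall(target, **kwargs)
    return sorted(p for p in staging.rglob("*") if p.is_file())
```

Calling `stage_archives(Path("inbox"), Path("staging"))` would leave the extracted files under `staging`, and a second pipeline could then read that directory. Keeping each archive in its own subdirectory is a design choice made here to avoid overwriting files that share a name across archives.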