I have a zip file containing XML files in S3. I want to pick the file up from S3, unzip it, and save the extracted XML files to HDFS. I am unable to find a processor that can do that. Can someone please advise whether there is a solution or workaround for this problem?

You should be able to use the Amazon S3 origin as is, choosing XML as the Data Format and specifying Compressed File as the Compression Format. Do you have errors with that configuration?
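
If the origin configuration does not work for a particular archive, one possible workaround is to unpack it outside the pipeline. Below is a minimal sketch using only Python's standard `zipfile` module. It assumes the archive has already been downloaded into memory as bytes. The function name `extract_xml_members` is hypothetical, and the S3 download and HDFS upload steps are deliberately left out because they depend on the client libraries you use.

```python
import io
import zipfile
from typing import Dict


def extract_xml_members(zip_bytes: bytes) -> Dict[str, bytes]:
    """Return {member name: content} for every .xml entry in a zip archive.

    Hypothetical helper for illustration; not part of StreamSets.
    """
    extracted = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Skip directory entries and anything that is not an XML file.
            if info.is_dir() or not info.filename.lower().endswith(".xml"):
                continue
            extracted[info.filename] = archive.read(info)
    return extracted
```

Each returned entry can then be written to HDFS as its own file, keyed by the member name.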

Hi Jeff, I tried reading a compressed XML file from an SFTP origin and writing it to Hadoop FS. StreamSets ran without errors, but the file never arrived in Hadoop FS.

