How to use Apache POI in custom processors to manipulate Excel files from directory origin?

asked 2018-08-31 03:33:56 -0600 by Ruchita

updated 2018-08-31 09:30:39 -0600 by metadaddy

Hi, I want to read Excel files from a directory, manipulate them using Apache POI, and convert them into records. How can I achieve this?

Thanks, Ruchita

1 Answer

answered 2018-08-31 09:30:26 -0600 by metadaddy

Starting with StreamSets Data Collector 3.4.0, the Directory origin can read Excel files. The origin parses the data into records, which can then be manipulated in the pipeline and written to a destination in any supported data format, such as Avro, Delimited, or JSON. Note that, at present, Excel is not supported as an output data format.


I tried that option with a file containing only one sheet, but it threw a NumberFormatException because the data included different cell types, such as Currency and Percentage. I also don't see any option to select a particular sheet from the Excel file.

Ruchita (2018-09-03 01:30:09 -0600)
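
The thread does not include the stack trace, so the exact cause of the NumberFormatException is unknown. As a minimal sketch, assuming the Currency and Percentage cells arrive as formatted text such as "$1,234.50" or "12.5%" (hypothetical sample values, US formatting), the plain-Java helper below shows why Double.parseDouble rejects such strings and how java.text.NumberFormat could parse them instead. The class name FormattedNumberParser is illustrative only and is not part of StreamSets or Apache POI.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper: converts formatted cell text to a double.
// Assumption: values use US conventions ("$" prefix, "%" suffix, "," grouping).
final class FormattedNumberParser {

    static double parse(String text) {
        String s = text.trim();
        if (s.endsWith("%")) {
            // Percent format divides by 100, so "12.5%" becomes 0.125.
            return strictParse(NumberFormat.getPercentInstance(Locale.US), s);
        }
        if (s.startsWith("$")) {
            return strictParse(NumberFormat.getCurrencyInstance(Locale.US), s);
        }
        return strictParse(NumberFormat.getNumberInstance(Locale.US), s);
    }

    // NumberFormat.parse(String) silently ignores trailing characters,
    // so check that the whole string was consumed.
    private static double strictParse(NumberFormat format, String s) {
        ParsePosition pos = new ParsePosition(0);
        Number n = format.parse(s, pos);
        if (n == null || pos.getIndex() != s.length()) {
            throw new IllegalArgumentException("Unparseable number: " + s);
        }
        return n.doubleValue();
    }

    public static void main(String[] args) {
        try {
            Double.parseDouble("$1,234.50"); // fails: not a plain decimal literal
        } catch (NumberFormatException e) {
            System.out.println("Double.parseDouble failed: " + e.getMessage());
        }
        System.out.println(parse("$1,234.50"));
        System.out.println(parse("12.5%"));
    }
}
```

If the cells are instead read directly with Apache POI, the numeric value is available from the cell itself and this kind of text parsing may not be needed; the sketch only covers the case where formatted strings reach the conversion step.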

Can we use fileRef to read the Excel file in a custom processor and then process it with POI there?

Ruchita (2018-09-03 04:52:11 -0600)