Using the FTP Client operator to load Excel files: the number of records is not correct

asked 2020-06-15 11:20:18 -0500

sihao

If I load one Excel file in a simple workflow, FTP Client works fine. But if I load five Excel files (identical copies in five different locations) in five workflows (FTP Client -> Trash), some workflows load the wrong number of records and log the error below. Data size: each Excel file has around 120,000 records.

ERROR Error while attempting to parse file: /Fund Report repayments 20200119.xlsx
com.streamsets.pipeline.lib.parser.DataParserException: DATA_PARSER_02 - Parser error: 'java.lang.ArrayIndexOutOfBoundsException: 17'
    at com.streamsets.pipeline.lib.parser.WrapperDataParserFactory$WrapperDataParser.normalizeException(WrapperDataParserFactory.java:147)
    at com.streamsets.pipeline.lib.parser.WrapperDataParserFactory$WrapperDataParser.parse(WrapperDataParserFactory.java:107)
    at com.streamsets.pipeline.stage.origin.remote.RemoteDownloadSource.addRecordsToBatch(RemoteDownloadSource.java:328)
    at com.streamsets.pipeline.stage.origin.remote.RemoteDownloadSource.produce(RemoteDownloadSource.java:308)
    at com.streamsets.pipeline.api.base.configurablestage.DSource.produce(DSource.java:38)
    at com.streamsets.datacollector.runner.StageRuntime.lambda$execute$2(StageRuntime.java:283)
    at com.streamsets.pipeline.api.impl.CreateByRef.call(CreateByRef.java:40)
    at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:235)
    at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:298)
    at com.streamsets.datacollector.runner.StagePipe.process(StagePipe.java:219)
    at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.processPipe(ProductionPipelineRunner.java:810)
    at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.runPollSource(ProductionPipelineRunner.java:554)
    at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.run(ProductionPipelineRunner.java:383)
    at com.streamsets.datacollector.runner.Pipeline.run(Pipeline.java:527)
    at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:109)
    at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:75)
    at com.streamsets.datacollector.execution.runner.standalone.StandaloneRunner.start(StandaloneRunner.java:703)
    at com.streamsets.datacollector.execution.runner.common.AsyncRunner.lambda$start$3(AsyncRunner.java:151)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
    at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
    at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at com.streamsets.datacollector.metrics.MetricSafeScheduledExecutorService$MetricsTask.run(MetricSafeScheduledExecutorService.java:100)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 17
    at sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:453)
    at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2397)
    at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2312)
    at java.util.Calendar.setTimeInMillis(Calendar.java:1804)
    at java.util.Calendar.setTime(Calendar.java:1770)
    at java.text.SimpleDateFormat.format(SimpleDateFormat.java:943)
    at java.text.SimpleDateFormat.format(SimpleDateFormat.java:936)
    at org.apache.poi ...
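A note on the "Caused by" part of the trace: the exception comes out of java.text.SimpleDateFormat.format, and the JDK documents that class as not synchronized. An ArrayIndexOutOfBoundsException inside the calendar code can be a symptom of one formatter instance being used by several threads at once, which would fit the fact that it only shows up when five workflows run. Whether a formatter is actually shared inside the Excel parser is an assumption here, not something the trace proves. As a minimal sketch of the usual remedy (one formatter per thread), with the class and method names below being hypothetical and not part of StreamSets or POI:

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TimeZone;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: not StreamSets or POI code.
class PerThreadDateFormat {

    // Each thread lazily gets its own SimpleDateFormat, so no instance is shared.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");
                f.setTimeZone(TimeZone.getTimeZone("UTC"));
                return f;
            });

    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    // Formats the same date from several threads and collects every distinct result.
    static Set<String> formatConcurrently(Date date, int threads, int callsPerThread)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Set<String>>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Set<String>> task = () -> {
                    Set<String> seen = new HashSet<>();
                    for (int i = 0; i < callsPerThread; i++) {
                        seen.add(format(date));
                    }
                    return seen;
                };
                futures.add(pool.submit(task));
            }
            Set<String> all = new HashSet<>();
            for (Future<Set<String>> f : futures) {
                all.addAll(f.get()); // rethrows any exception raised in a worker
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Date date = new Date(1579392000000L); // 2020-01-19T00:00:00Z
        System.out.println(formatConcurrently(date, 5, 10000));
    }
}
```

With a single shared SimpleDateFormat in place of the ThreadLocal, the same loop may intermittently return wrong strings or throw; with one instance per thread, every call should yield the same value.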
