Pipeline Status: RUNNING_ERROR: S3_21 - Unable to write object to Amazon S3

asked 2020-03-18 11:05:07 -0500

anonymous user

Anonymous

Hi there,

I'm doing a capability testing POC with the Streamsets for our strategic data pipeline solution. When I attempt to copy the files from SFTP to S3 bucket, getting the subjected error. I’m using whole file data format.

I’m using default value for multipart upload threshold and partitioning threshold. Please let me know if I need any additional configuration to support S3 copy or any limitation with my S3 bucket permission.

Please note, same functionality is working with any error if I use NiFi.

com.streamsets.pipeline.api.StageException: S3_21 - Unable to write object to Amazon S3, reason : com.amazonaws.SdkClientException: Data read has a different length than the expected: dataLength=19465638; expectedLength=20776378; includeSkipped=false; in.getClass()=class com.amazonaws.internal.ReleasableInputStream; markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; resetCount=0 at com.streamsets.pipeline.stage.destination.s3.AmazonS3Target.write(AmazonS3Target.java:182) at com.streamsets.pipeline.api.base.configurablestage.DTarget.write(DTarget.java:34) at com.streamsets.datacollector.runner.StageRuntime.lambda$execute$2(StageRuntime.java:303) at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:244) at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:311) at com.streamsets.datacollector.runner.StagePipe.process(StagePipe.java:220) at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.processPipe(ProductionPipelineRunner.java:850) at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.lambda$executeRunner$3(ProductionPipelineRunner.java:894) at com.streamsets.datacollector.runner.PipeRunner.acceptConsumer(PipeRunner.java:221) at com.streamsets.datacollector.runner.PipeRunner.executeBatch(PipeRunner.java:142) at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.executeRunner(ProductionPipelineRunner.java:893) at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.runSourceLessBatch(ProductionPipelineRunner.java:871) at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.runPollSource(ProductionPipelineRunner.java:599) at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.run(ProductionPipelineRunner.java:390) at com.streamsets.datacollector.runner.Pipeline.run(Pipeline.java:516) at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:112) at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:75) at com.streamsets.datacollector.execution.runner.standalone.StandaloneRunner.start(StandaloneRunner.java:720) at com.streamsets.datacollector.execution.AbstractRunner.lambda$scheduleForRetries$0(AbstractRunner.java:363) at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226) at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33) at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at com.streamsets.datacollector.metrics.MetricSafeScheduledExecutorService$MetricsTask.run(MetricSafeScheduledExecutorService.java:100) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: com.amazonaws.SdkClientException: Data read has a different length than the expected: dataLength=19465638; expectedLength=20776378; includeSkipped=false; in.getClass()=class com.amazonaws.internal.ReleasableInputStream; markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; resetCount=0 at com.amazonaws.util ... (more)

edit retag flag offensive close merge delete