ADLS FileNotFound & FileAlreadyExists (after Restart)

asked 2019-08-27 08:35:53 -0600

CaptinCCat

updated 2019-08-27 11:30:21 -0600

Hey guys, I'm getting this error: Failed to write to Azure Data Lake Store: Operation APPEND failed with HTTP404

After running this pipeline all weekend, I came back today and found 1.6k records that had errored out 20 minutes before I got into work.

I have checked, and all of the files for the path are there and valid. If anyone can help me with this, I'm curious why it's giving file-not-found errors now, after running fine all weekend.

EDIT: After stopping the pipeline for a while and restarting it, the FileNotFoundException goes away. However, another ADLS destination that I have in the pipeline now gives the FileAlreadyExists error. So restarting the pipeline fixes one error on one destination, but creates a new error on a different destination.

com.streamsets.pipeline.api.base.OnRecordErrorException: ADLS_03 - Failed to write to Azure Data Lake Store: 'com.microsoft.azure.datalake.store.ADLException: Error appending to file /MeterReads/ElecReads/READDATE=2019-08-27/_tmp_sdc-a1e0d80a-3aa6-11e9-b99c-d7821cd854f2-a1e0d80a-3aa6-11e9-b99c-d7821cd854f2-TEST2FNConsumptionReadingtoHDFScopy2e70fad3-c2c6-49fd-83b2-6fc63e7dcafa-0
Operation APPEND failed with HTTP404 : FileNotFoundException
Last encountered exception thrown after 1 tries. [HTTP404(FileNotFoundException)]
 [ServerRequestId:9daec71e-8979-47a1-908d-fde27057c260]'
    at com.streamsets.pipeline.stage.destination.datalake.writer.DataLakeWriterThread.call(DataLakeWriterThread.java:99)
    at com.streamsets.pipeline.stage.destination.datalake.writer.DataLakeWriterThread.call(DataLakeWriterThread.java:32)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
    at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
    at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Here is the FileAlreadyExists error I get after the restart:

com.streamsets.pipeline.api.base.OnRecordErrorException: ADLS_03 - Failed to write to Azure Data Lake Store: 'com.microsoft.azure.datalake.store.ADLException: Error creating file /MeterReads/ElecReads/READDATE=2019-08-27/_tmp_sdc-a1e0d80a-3aa6-11e9-b99c-d7821cd854f2-a1e0d80a-3aa6-11e9-b99c-d7821cd854f2-TEST2FNConsumptionReadingtoHDFScopy2e70fad3-c2c6-49fd-83b2-6fc63e7dcafa-0
Operation CREATE failed with HTTP403 : FileAlreadyExistsException
Last encountered exception thrown after 1 tries. [HTTP403(FileAlreadyExistsException)]
 [ServerRequestId:f4098fc1-c08a-4fc6-893f-e92971b53474]'
    at com.streamsets.pipeline.stage.destination.datalake.writer.DataLakeWriterThread.call(DataLakeWriterThread.java:99)
    at com.streamsets.pipeline.stage.destination.datalake.writer.DataLakeWriterThread.call(DataLakeWriterThread.java:32)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
    at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
    at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java ...
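
One thing that may help with triage: the two traces quoted above name the same _tmp_ file path, once for APPEND (HTTP404) and once for CREATE (HTTP403). Below is a minimal sketch in Python for checking whether that holds across all of the error records. It assumes every error message follows the same two-line layout as the ones pasted here ("Error ... file <path>" followed by "Operation <OP> failed with HTTP<code> : <Exception>"). The helper names parse_adls_error and group_by_path are made up for illustration and are not part of StreamSets or the Azure SDK.

```python
import re

# Regexes for the message layout seen in the two traces above.
# Assumption: other error records use the same wording.
PATH_RE = re.compile(r"Error \w+(?: to)? file (?P<path>\S+)")
OP_RE = re.compile(
    r"Operation (?P<op>\w+) failed with HTTP(?P<status>\d{3}) : (?P<exc>\w+)"
)


def parse_adls_error(message):
    """Return path, operation, HTTP status and exception name, or None."""
    path = PATH_RE.search(message)
    op = OP_RE.search(message)
    if path is None or op is None:
        return None
    return {
        "path": path.group("path"),
        "op": op.group("op"),
        "status": int(op.group("status")),
        "exception": op.group("exc"),
    }


def group_by_path(messages):
    """Map each file path to the sorted list of distinct failures seen on it."""
    grouped = {}
    for message in messages:
        parsed = parse_adls_error(message)
        if parsed is None:
            continue  # skip lines that do not match the assumed layout
        failure = (parsed["op"], parsed["status"], parsed["exception"])
        grouped.setdefault(parsed["path"], set()).add(failure)
    return {path: sorted(failures) for path, failures in grouped.items()}


# The temp file path exactly as it appears in both traces in this post.
TMP_FILE = (
    "/MeterReads/ElecReads/READDATE=2019-08-27/"
    "_tmp_sdc-a1e0d80a-3aa6-11e9-b99c-d7821cd854f2"
    "-a1e0d80a-3aa6-11e9-b99c-d7821cd854f2"
    "-TEST2FNConsumptionReadingtoHDFScopy2e70fad3-c2c6-49fd-83b2-6fc63e7dcafa-0"
)

SAMPLE_MESSAGES = [
    "Error appending to file " + TMP_FILE + "\n"
    "Operation APPEND failed with HTTP404 : FileNotFoundException",
    "Error creating file " + TMP_FILE + "\n"
    "Operation CREATE failed with HTTP403 : FileAlreadyExistsException",
]

for path, failures in group_by_path(SAMPLE_MESSAGES).items():
    print(path)
    for op, status, exc in failures:
        print("   ", op, status, exc)
```

If the full set of error records also collapses onto a small number of _tmp_ paths, that would point at the state of those specific temp files rather than at the target directory as a whole. That is only a guess until the grouping is run against the real records.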