Why are my new files not being read by my SFTP origin and sent through the pipeline?

Pipeline details - Origin: SFTP/FTP/FTPS Client, Destination: Amazon S3

Background details - I built a pipeline that transfers files from an SFTP source to an S3 folder. The origin is configured to move whole files based on a filename pattern. The pipeline is also configured to track the data that has already been processed by saving the offset.

Issue - The first file uploaded to the SFTP server is picked up by the origin and processed through the pipeline. A second file uploaded later overwrites the first because it has the same name, although its last-modified timestamp is different. This new file is not picked up or processed.

My questions are: is this behaviour expected? Does the origin rescan the directory and look for new files? Or does it assume that a file with that name has already been processed and ignore it because of offset tracking?
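To make the last question concrete, here is a toy model of what I suspect might be happening. This is only my guess and not the actual origin implementation: `RemoteFile`, `pending_by_name`, and `pending_by_name_and_mtime` are hypothetical names I made up for illustration. It compares an offset check keyed on the filename alone with one keyed on the filename plus the last-modified timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteFile:
    """Hypothetical stand-in for one entry in an SFTP directory listing."""
    name: str
    mtime: int  # last-modified time, seconds since epoch


def pending_by_name(listing, processed_names):
    """Assumed behaviour: a file counts as done if its name was seen before."""
    return [f for f in listing if f.name not in processed_names]


def pending_by_name_and_mtime(listing, processed):
    """Alternative: a file counts as done only if name AND mtime both match."""
    return [f for f in listing if (f.name, f.mtime) not in processed]


# The first upload, then an overwrite with the same name but a newer timestamp.
first = RemoteFile("report.csv", 1000)
second = RemoteFile("report.csv", 2000)

# Keyed on name only, the overwritten file looks already processed.
skipped = pending_by_name([second], {first.name})

# Keyed on name and mtime, the overwritten file looks new.
picked = pending_by_name_and_mtime([second], {(first.name, first.mtime)})
```

If the origin behaves like the first function, that would explain what I am seeing, but I have not been able to confirm it.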

After some testing, I have found that giving the new files different names allows them to be picked up as you would normally expect.
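For reference, this is roughly how the uploading side could produce a different name for every upload. It is a minimal sketch: `unique_remote_name` is a hypothetical helper of my own, and it assumes the origin's filename pattern is loose enough to match the timestamped names (for example something like `report_*.csv`).

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def unique_remote_name(filename, now=None):
    """Insert a UTC timestamp before the extension so each upload gets a new name."""
    now = now or datetime.now(timezone.utc)
    path = PurePosixPath(filename)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return f"{path.stem}_{stamp}{path.suffix}"


# Example with a fixed time so the result is reproducible.
example = unique_remote_name(
    "report.csv", datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
)
```

With this approach nothing is overwritten on the SFTP server, so old files would need to be cleaned up separately.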