Multiple data files with the same timestamp

Hi,

I am getting data from Kafka and loading it to a local file. I have set the max records per file option to 1000. I have not set the idle timeout option, because the file was not closing for a very long time, as records were coming one by one from Kafka.

Now the issue is this: since I chose a record limit per file, whenever I get multiple records from Kafka at a time, multiple data files are created in the local file system with the same timestamp.

From the local file system, I load the data into a table through a script, driven by the event files, and the files are now getting picked up in random order. Out of 10 files with the same timestamp, the oldest one should normally be processed first, but instead the 4th or 5th file is picked at random.

We have a SQL statement that only processes the latest records into the table, so when the files are picked up out of order as described above, we lose data.
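
The expected "oldest first" order could be sketched as follows. This is only a minimal illustration under assumptions, not the actual load script: the function name `files_oldest_first` is hypothetical, and using the file name as a tie-breaker assumes that names sort in creation order, which may not hold in this setup. It relies on the nanosecond modification time (`st_mtime_ns`), which can tell apart files that share the same timestamp at second resolution, if the file system records it.

```python
from pathlib import Path


def files_oldest_first(directory):
    """Return the regular files in `directory`, oldest first.

    Sort key: modification time in nanoseconds, then file name.
    The name is only a tie-breaker (an assumption, see above) so that
    files with identical timestamps always come back in the same order.
    """
    entries = [p for p in Path(directory).iterdir() if p.is_file()]
    return sorted(entries, key=lambda p: (p.stat().st_mtime_ns, p.name))
```

A load script could then iterate over `files_oldest_first(...)` instead of taking the files in whatever order the directory listing returns them.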

I hope I have explained the issue clearly.

Thanks,
