Can SDC handle big XML files?

I cannot find a way to make SDC stream a huge XML file, whether it comes from an HTTP request or from the file system. I don't know how to tell it to stream the file rather than load everything into memory. When everything is loaded, the XML parser associated with the HTTPClient or the FSClient crashes because the content is too big.

2 Answers

Your assumptions are correct. It contains roughly only 'action' nodes. Using the delimiter fixes my problem.

Doing so, I end up with many Map records.

These records have no root element, and the Kafka Producer that comes next complains about it: XML_GENERATOR_04 - Record root field must have 1 element

I tried to transform the data somehow, but had no luck.

Any idea how I should proceed?

OK, this is a different problem now. The XML data generator requires a single root element in the map in order to generate an XML document (as the error message suggests). There is not enough space in this comment to explain how in full. Please open a new question for this.

jeff (2018-03-01 12:15:35 -0600)
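To make the "single root element" requirement concrete, here is a minimal Python sketch. It is not SDC code and does not use any SDC API. The helper names `wrap_in_root` and `map_to_xml` are hypothetical, and the root name "action" is only assumed from the element mentioned in this thread. It shows why a flat map is rejected and how nesting it under one key gives the generator a root to write.

```python
import xml.etree.ElementTree as ET


def wrap_in_root(record, root_name="action"):
    """Nest a flat map under a single key so it has exactly one root field."""
    return {root_name: record}


def map_to_xml(record):
    """Serialize a map with exactly one top-level key into an XML string.

    Illustrative stand-in for an XML generator: it refuses maps whose
    top level has zero or several keys, like XML_GENERATOR_04 does.
    """
    if len(record) != 1:
        raise ValueError("Record root field must have 1 element")
    (tag, body), = record.items()
    root = ET.Element(tag)
    for key, value in body.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    flat = {"id": "1", "type": "click"}  # hypothetical record fields
    print(map_to_xml(wrap_in_root(flat)))
```

Inside a pipeline, the same reshaping would have to happen in a stage placed before the Kafka Producer. Which stage is best suited is not covered in this thread.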

What is the exact error you're getting? In theory, the parser should be able to handle arbitrarily large documents, as long as each individual record within the document is under the configured max record size, though some complications can still arise. I assume there is a list of elements that repeats over and over and represents your records? If so, you should configure the Delimiter Element so that each of these elements is handled individually, rather than the document as a whole.
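The idea behind a delimiter element can be illustrated outside SDC with a short Python sketch. This is not how SDC is implemented; it only shows the general technique of emitting one record per repeating element while discarding what has already been processed. The function name `stream_records` and the sample element names are assumptions, and the sketch assumes the delimiter elements are direct children of the document root.

```python
import io
import xml.etree.ElementTree as ET


def stream_records(source, delimiter="action"):
    """Yield one dict per <delimiter> element without keeping the whole tree."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == delimiter:
            # Build the record from the element's direct children.
            yield {child.tag: child.text for child in elem}
            # Drop processed children so memory use stays roughly constant.
            root.clear()


if __name__ == "__main__":
    sample = (
        b"<actions>"
        b"<action><id>1</id><type>click</type></action>"
        b"<action><id>2</id><type>view</type></action>"
        b"</actions>"
    )
    for record in stream_records(io.BytesIO(sample)):
        print(record)
```

With this approach, memory use depends on the size of a single record rather than the size of the document, which is why the per-record size limit matters more than the total file size.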

