Limiting resources in Cluster Batch mode

Hi All,

I developed a StreamSets batch job that loads files from Hadoop (in JSON format), applies some transformations, and finally writes the transformed fields back to Hadoop (in Avro format). I usually run jobs in Cluster YARN Streaming mode, and to limit the resources (memory and cores) I adjust the number of workers, which can be set directly in StreamSets.

However, when a job runs in Cluster Batch mode, it is not possible to set the number of workers, so I cannot limit the resources it consumes. When I run it, memory and core usage explodes.

How can I limit the amount of resources consumed?

Thanks, Alessandro