StreamSets capabilities

I am exploring StreamSets and have a few questions:

  1. First and foremost, how should a StreamSets data pipeline be deployed in a production environment, and what is the best practice? Does the open source StreamSets Data Collector have an option to deploy on Kubernetes? I have a 5-node Greenplum cluster; can I deploy StreamSets in cluster mode on that same cluster?
  2. My main use case is to get data from ActiveMQ and MySQL and ingest it into Greenplum. Does StreamSets support this?
  3. What is the best way to generate summary table data through StreamSets Data Collector?
  4. How can I monitor StreamSets jobs, and how do I set up a retry mechanism, failover, etc.?
