Revision history

initial version

Multitable CDC from Oracle to Kafka as registered Avro - possible?

Hi,

I need to CDC 50 tables from an Oracle DB to 50 Kafka topics in a CSR-registered Avro format. Since it does not seem to be a good idea to create and run 50 StreamSets pipelines simultaneously, I would like to use as few of them as possible (>= 1) and do it in a generic way.

I managed to use the Oracle CDC stage to handle multiple tables. I am also able to derive the topic name for the Kafka Producer for each table at runtime by accessing a record/record-header variable containing the name of the table.

However, I have not yet managed to do the same with the schema ID or subject name: I am not able to derive it from each record. It seems that I can only set a constant or a fixed parameter, which does not solve my problem.

Is there any way to set the subject or schema ID depending on the record (-> topic name) I want to send to Kafka?

Thanks in advance and Regards,

Peter

Multitable CDC from Oracle to Kafka as registered Avro - possible?

Hi,

no answer so far, hence I simplify my question:

  • I capture records of 100 tables in one CDC stage (possible with the CDC stages)
  • I need to write the records to 100 corresponding Kafka topics (possible since I can derive the Kafka topic from each record header)
  • I need to use the AVRO record type for the Kafka topics (doesn't seem to be possible?)

It's a common task in CDC to capture schematized data of many tables, and it's also common to use AVRO in Kafka as the destination. How can I do that with SDC?

Thanks in advance and Regards, Peter

Multitable CDC from Oracle to Kafka as registered Avro - possible?

Hi,

no answer so far, hence I simplify my question:

  • I capture records of 100 tables in one CDC stage (possible with the CDC stages)
  • I need to write the records to 100 corresponding Kafka topics (possible since I can derive the Kafka topic from each record header)
  • I need to use the AVRO record type for the Kafka topics (doesn't seem to be possible?)

It's a common task in CDC to capture schematized data of many tables, and it's also common to use AVRO in Kafka as the destination. How can I do that with SDC?

Thanks in advance and Regards, Peter

Multitable CDC to Kafka in AVRO format not possible?

Hi,

no answer so far, hence I simplify my question:

  • I capture records of 100 tables in one CDC stage (possible with the CDC stages)
  • I need to write the records to 100 corresponding Kafka topics (possible since I can derive the Kafka topic from each record header)
  • I need to use the AVRO record type for the Kafka topics (doesn't seem to be possible?)

It's a common task in CDC to capture schematized data of many tables, and it's also common to use AVRO in Kafka as the destination. How can I do that with SDC?

Thanks in advance and Regards, Peter

Multitable CDC to Kafka in AVRO format not possible?

Hi,

no answer so far, hence I simplify my question:

  • I capture records of 100 tables in one CDC stage (possible with the CDC stages)
  • I need to write the records to 100 corresponding Kafka topics (possible since I can derive the Kafka topic from each record header)
  • I need to use the AVRO record type for the Kafka topics (doesn't seem to be possible?)

It's a common task in CDC to capture schematized data of many tables, and it's also common to use AVRO in Kafka as the destination. How can I do that with SDC?

If it is not possible, I'd like to suggest implementing something like this:

  • Allow the schema ID / subject to be derived or defined dynamically per record
  • Load the Avro schema from the schema registry when a record of a given schema is accessed for the first time
  • Cache this schema within the pipeline for a configurable time (e.g. 60 minutes)
  • Use it with the corresponding record
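
To make the suggestion more concrete, here is a rough Python sketch of the per-record lookup with a time-limited cache. It is not existing SDC functionality. The names `SchemaCache`, `subject_for` and the `fetch` callable are made up for illustration; `fetch` stands in for whatever call actually loads a schema from the schema registry. The "<topic>-value" subject naming is an assumption (topic named after the table, registry subjects following the common topic-based convention).

```python
import time
from typing import Callable, Dict, Tuple


def subject_for(table_name: str) -> str:
    """Derive the registry subject from the table name carried by a record.

    Assumption: the topic is named after the table and the subject
    follows the "<topic>-value" convention.
    """
    return f"{table_name}-value"


class SchemaCache:
    """Keeps one schema per subject and reloads it after ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # hypothetical loader: subject -> schema JSON
        self._ttl = ttl_seconds      # e.g. 60 minutes, configurable
        self._clock = clock          # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, subject: str) -> str:
        now = self._clock()
        entry = self._entries.get(subject)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # still fresh: no registry call
        schema = self._fetch(subject)  # first access or expired: load it
        self._entries[subject] = (now, schema)
        return schema
```

For each record, the pipeline would call something like `cache.get(subject_for(table_name))` and serialize the record with the returned schema, so the registry is contacted only once per table per cache period.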

Thanks in advance and Regards, Peter