Automatic registration of Avro Schema in Confluent Schema Registry not working

asked 2018-02-01 11:00:42 -0500

pwel

I'd like to connect an Oracle CDC origin directly to a Kafka Producer that writes Avro registered with the Confluent Schema Registry. Specifically, with these two components I'd like the same functionality as the Kafka JDBC Source Connector, which automatically:

  1. creates a Kafka Topic
  2. detects the Schema of the source Table
  3. converts it to an Avro schema
  4. registers the Avro Schema in the Confluent Schema Registry
  5. reads and transfers the data
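
For readers unfamiliar with steps 2 and 3, here is a minimal Python sketch of what turning a table definition into an Avro record schema could look like. The type mapping, the function name `table_to_avro_schema`, and the example table are all assumptions made for illustration. This is not how the JDBC Source Connector or StreamSets implement it, and a real mapping would need to handle precision, scale, and logical types.

```python
import json

# Hypothetical mapping from Oracle column types to Avro primitive types.
# Assumption: unknown types fall back to "string".
ORACLE_TO_AVRO = {
    "NUMBER": "double",
    "VARCHAR2": "string",
    "CHAR": "string",
    "DATE": "string",
    "TIMESTAMP": "string",
}


def table_to_avro_schema(table_name, columns):
    """Build an Avro record schema from (name, oracle_type, nullable) tuples."""
    fields = []
    for name, oracle_type, nullable in columns:
        avro_type = ORACLE_TO_AVRO.get(oracle_type.upper(), "string")
        if nullable:
            # Nullable columns become a union with null and default to null.
            fields.append({"name": name, "type": ["null", avro_type], "default": None})
        else:
            fields.append({"name": name, "type": avro_type})
    return {"type": "record", "name": table_name, "fields": fields}


# Example table with one required and one nullable column (made-up names).
schema = table_to_avro_schema(
    "CUSTOMERS",
    [("ID", "NUMBER", False), ("NAME", "VARCHAR2", True)],
)
print(json.dumps(schema, indent=2))
```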

Steps #3 and #4 in particular don't seem to work automatically. Is it not possible for the Kafka Producer to register the Avro schema automatically on first run, based only on the data flowing in?

How should this be done with StreamSets without any manual steps?


3 Answers

answered 2018-02-01 11:41:42 -0500

adam

Support for registering dynamic schemas was added in Data Collector 3.1. That should resolve your issue.

ref: https://issues.streamsets.com/browse/...
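
For context, registering a schema comes down to a single POST against the Schema Registry REST API. The Python sketch below only builds that request. The registry URL, the subject name `customers-value`, and the example schema are assumptions for illustration, and this is not a description of what Data Collector does internally.

```python
import json
import urllib.request


def build_register_request(registry_url, subject, avro_schema):
    """Build the POST request that registers a schema under a subject."""
    # The registry expects the Avro schema as a JSON-encoded string
    # inside a JSON object, hence the nested json.dumps.
    body = json.dumps({"schema": json.dumps(avro_schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{registry_url.rstrip('/')}/subjects/{subject}/versions",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


# Hypothetical schema and subject, for illustration only.
example_schema = {
    "type": "record",
    "name": "CUSTOMERS",
    "fields": [{"name": "ID", "type": "double"}],
}
request = build_register_request(
    "http://localhost:8081", "customers-value", example_schema
)
print(request.get_method(), request.full_url)

# Sending the request needs a running registry, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The subject name follows the common `<topic>-value` convention. Whether your pipeline uses that convention depends on how the Kafka Producer stage is configured.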


Comments

Yes, thanks, I'll try this out

pwel ( 2018-02-08 06:03:22 -0500 )

answered 2018-02-01 11:46:57 -0500

metadaddy

SDC-6086 fixes the schema creation issue in SDC 3.1.0.0, which is due for release in mid-February. In the meantime you could try a nightly build: grab the streamsets-datacollector-core-x.y.0.0-SNAPSHOT.tgz tarball, install and run it, then add stage libraries via the package manager.

Regarding #1, the Kafka Producer cannot create a Kafka topic itself, though you can configure Kafka to auto-create topics.
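
As a rough illustration, topic auto-creation is controlled on the broker side. The fragment below shows the relevant settings in a broker's `server.properties`. The partition and replication values are placeholders, not recommendations.

```properties
# server.properties on each Kafka broker
# Let the broker create a topic the first time a client references it.
auto.create.topics.enable=true

# Defaults applied to auto-created topics (placeholder values).
num.partitions=3
default.replication.factor=1
```

Auto-created topics pick up these broker defaults, so create a topic explicitly beforehand if it needs different settings.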


Comments

Thanks, I'll try this out!

pwel ( 2018-02-08 06:03:08 -0500 )

answered 2018-02-08 06:10:31 -0500

pwel

An obvious follow-up question, in terms of getting functionality comparable to the Kafka Connect JDBC Source, is this:

My Oracle CDC connector will replicate up to 380 tables in one database. Since I do not want to create 380 pipelines (and poll LogMiner in 380 simultaneous sessions), I want to list multiple tables in the Oracle CDC origin, which works fine!

But what should I do to have the Kafka Producer create 380 Avro schemas in the registry? Is it possible to use the table name in the "Schema-Subject" property?

Thanks and Regards, Peter

Stats

Asked: 2018-02-01 06:40:36 -0500

Seen: 641 times

Last updated: Feb 08