What about introducing conflict detection methods for JDBC producer?

asked 2018-12-10 05:45:40 -0600

Mathias Zarick

Whenever we implement database replication with CDC-enabled sources and the JDBC Producer as the destination, conflicts are currently handled as follows:
- Insert or uniqueness conflict: a row with the same PK already exists at the target. The SDC JDBC Producer throws an error, which is fine.
- Update and delete conflict: a row with the same PK does not exist at the target. The SDC JDBC Producer simply carries on without any warning, so there is no chance of noticing the problem.

I would recommend adding some optional parameters to the SDC JDBC Producer that make it possible to detect out-of-sync data. If an update from a CDC-enabled source does not match exactly one row at the target when using the PK, we have a conflict. How could this be implemented? In the source code, the classes com/streamsets/pipeline/lib/jdbc/JdbcMultiRowRecordWriter.java and com/streamsets/pipeline/lib/jdbc/JdbcGenericRecordWriter.java call "statement.executeUpdate();"

This is actually a function that returns the number of changed rows. If that number is not 1 in the scenario described above, I would like to have an option to generate events or errors.
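To illustrate the idea, here is a minimal sketch of such a check in plain JDBC. It is not taken from the Data Collector code base: the class `UpdateConflictCheck`, the `Outcome` enum and the `failOnConflict` flag are hypothetical names chosen for this example. Only `PreparedStatement.executeUpdate()` and `SQLException` are real JDBC API. It also assumes the driver reports the number of matched rows; some drivers may be configured to report only rows whose values actually changed.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical helper for illustration; not part of Data Collector. */
public class UpdateConflictCheck {

    /** Result of comparing the affected-row count with the expected single row. */
    public enum Outcome { OK, NO_ROW_MATCHED, MULTIPLE_ROWS_MATCHED }

    /**
     * Classifies the value returned by executeUpdate() for a PK-based
     * UPDATE or DELETE, which is expected to touch exactly one row.
     */
    public static Outcome classify(int affectedRows) {
        if (affectedRows == 1) {
            return Outcome.OK;
        }
        // For DML statements executeUpdate() returns 0 or a positive count.
        return affectedRows > 1 ? Outcome.MULTIPLE_ROWS_MATCHED : Outcome.NO_ROW_MATCHED;
    }

    /**
     * Runs the statement and either returns the outcome or, if the
     * (hypothetical) failOnConflict option is set, raises an error
     * when the row count is not 1.
     */
    public static Outcome executeAndCheck(PreparedStatement statement, boolean failOnConflict)
            throws SQLException {
        int affectedRows = statement.executeUpdate();
        Outcome outcome = classify(affectedRows);
        if (outcome != Outcome.OK && failOnConflict) {
            throw new SQLException(
                "Conflict detected: expected 1 affected row, got " + affectedRows);
        }
        return outcome;
    }
}
```

With `failOnConflict` set to false, the caller could use the returned `Outcome` to generate an event instead of an error, which would correspond to the optional behaviour suggested above.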

Thanks for your comments,
Mathias Zarick


1 Answer


answered 2018-12-11 11:58:14 -0600

metadaddy

Sounds like a reasonable feature - please file an issue and we can take a look, or even implement it and file a pull request on the Data Collector GitHub project.


Comments

Hi Pat, thank you for the answer. The Jira task is https://issues.streamsets.com/browse/SDC-10690. Cheers, Mathias

Mathias Zarick (2018-12-17 04:38:14 -0600)