
PostgreSQL CDC - UTF-8 Error

asked 2020-04-24 04:39:41 -0500

Teleassist

updated 2020-04-28 00:45:55 -0500

Hi! I am trying to use the PostgreSQL CDC Client origin and followed the instructions in the documentation. I see the following error when running the pipeline:

DataCollector Version : 3.13.0 (Host Docker)
PostgreSQL Version : 10.12 (Host Windows 10 - wal2json compiled and installed)

org.postgresql.util.PSQLException: Database connection failed when writing to copy
at org.postgresql.core.v3.QueryExecutorImpl.flushCopy(QueryExecutorImpl.java:1013)
at org.postgresql.core.v3.CopyDualImpl.flushCopy(CopyDualImpl.java:23)
at org.postgresql.core.v3.replication.V3PGReplicationStream.updateStatusInternal(V3PGReplicationStream.java:190)
at org.postgresql.core.v3.replication.V3PGReplicationStream.forceUpdateStatus(V3PGReplicationStream.java:109)
at com.streamsets.pipeline.stage.origin.jdbc.cdc.postgres.PostgresCDCWalReceiver.createReplicationStream(PostgresCDCWalReceiver.java:174)
at com.streamsets.pipeline.stage.origin.jdbc.cdc.postgres.PostgresCDCSource.init(PostgresCDCSource.java:156)
at com.streamsets.pipeline.api.base.BaseStage.init(BaseStage.java:48)
at com.streamsets.pipeline.api.base.configurablestage.DStage.init(DStage.java:36)
at com.streamsets.datacollector.runner.StageRuntime.lambda$init$0(StageRuntime.java:220)
at com.streamsets.datacollector.util.LambdaUtil.withClassLoaderInternal(LambdaUtil.java:148)
at com.streamsets.datacollector.util.LambdaUtil.withClassLoader(LambdaUtil.java:44)
at com.streamsets.datacollector.runner.StageRuntime.init(StageRuntime.java:218)
at com.streamsets.datacollector.runner.StagePipe.init(StagePipe.java:107)
at com.streamsets.datacollector.runner.StagePipe.init(StagePipe.java:44)
at com.streamsets.datacollector.runner.Pipeline.initPipe(Pipeline.java:392)
at com.streamsets.datacollector.runner.Pipeline.init(Pipeline.java:297)
at com.streamsets.datacollector.runner.preview.PreviewPipeline.run(PreviewPipeline.java:49)
at com.streamsets.datacollector.execution.preview.sync.SyncPreviewer.start(SyncPreviewer.java:230)
at com.streamsets.datacollector.execution.preview.async.AsyncPreviewer.lambda$start$1(AsyncPreviewer.java:98)
at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at com.streamsets.datacollector.metrics.MetricSafeScheduledExecutorService$MetricsTask.run(MetricSafeScheduledExecutorService.java:100)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Illegal UTF-8 sequence: byte 2 of 3 byte sequence is not 10xxxxxx: 98
    at org.postgresql.core.UTF8Encoding.checkByte(UTF8Encoding.java:28)
    at org.postgresql.core.UTF8Encoding.decode(UTF8Encoding.java:113)
    at org.postgresql.core.PGStream.receiveString(PGStream.java:341)
    at org.postgresql.core.v3.QueryExecutorImpl.receiveNoticeResponse(QueryExecutorImpl.java:2444)
    at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1091)
    at org.postgresql.core.v3.QueryExecutorImpl.flushCopy(QueryExecutorImpl.java:1011)
    ... 31 more
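
For readers decoding the "Caused by" line: the trace shows the driver reading a server notice (`receiveNoticeResponse`) as UTF-8 and rejecting a byte that cannot follow a 3-byte lead byte. The value 98 is decimal for 0x62, which is ASCII `b`. Below is a minimal Python sketch of that kind of check. It is not the pgjdbc implementation; the function name `check_utf8` and the Latin-1 sample are assumptions for illustration, and the bytes the server actually sent cannot be recovered from the trace.

```python
def check_utf8(data: bytes) -> str:
    """Return 'ok', or describe the first malformed UTF-8 sequence.

    Simplified sketch: it validates lead and continuation bytes only and
    does not reject overlong encodings or surrogate code points.
    """
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:            # 0xxxxxxx: single-byte (ASCII)
            size = 1
        elif 0xC0 <= lead < 0xE0:  # 110xxxxx: 2-byte sequence
            size = 2
        elif 0xE0 <= lead < 0xF0:  # 1110xxxx: 3-byte sequence
            size = 3
        elif 0xF0 <= lead < 0xF8:  # 11110xxx: 4-byte sequence
            size = 4
        else:
            return f"illegal lead byte: {lead}"
        for k in range(1, size):
            if i + k >= len(data):
                return f"truncated {size} byte sequence"
            cont = data[i + k]
            if cont & 0xC0 != 0x80:  # continuation bytes must be 10xxxxxx
                return f"byte {k + 1} of {size} byte sequence is not 10xxxxxx: {cont}"
        i += size
    return "ok"


# Hypothetical sample: "\u00e9b" encoded as Latin-1 is the bytes E9 62.
# E9 looks like a 3-byte UTF-8 lead byte, and 62 (decimal 98) is not a
# continuation byte, so the same kind of message appears.
print(check_utf8("\u00e9b".encode("latin-1")))
print(check_utf8("\u00e9b".encode("utf-8")))
```

A message like this suggests the notice text was not UTF-8 encoded when the driver read it, which is consistent with, but does not prove, a server-side message encoding mismatch.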

EDIT (2020-04-26)

PostgreSQL Version : 9.6.17 (Host Windows 10 - wal2json compiled and installed)
DataCollector Version : 3.13 ...

Comments

Hi - could you edit your question to include the versions of PostgreSQL and Data Collector? Thanks!

metadaddy ( 2020-04-24 10:15:01 -0500 )

3 Answers


answered 2020-04-26 06:28:42 -0500

J3

updated 2020-04-28 17:40:14 -0500

metadaddy

Hi Teleassist,

This problem doesn't sound like a StreamSets/Postgres compatibility issue. For instance, the installation I am using at one of my clients' shops is Postgres 11.7 with SDC 3.13.0. Moreover, after googling for a few minutes (admittedly not long), it sounds like you are experiencing a common issue with Postgres CDC in general. The problem may be connection latency between the Postgres server and the CDC client (i.e., the StreamSets server). My advice would be to look at the Postgres server logs for some insight into the root cause.

Sorry I couldn't be of further assistance.

Be safe, J3


Comments

Hi J3, I was aware of the connection latency issue, but my tests run on my local machine. The PostgreSQL server is installed on the host (Windows 10) and I'm running SDC in Docker. When I use the JDBC Query Consumer as the origin, I can query PostgreSQL without any errors.

Teleassist ( 2020-04-26 06:40:16 -0500 )

Hi, OK, that's good to know. So, you have separate Docker containers for SDC and Postgres, respectively? Maybe there is a problem with your wal2json build. Also, have you been able to peek at the logical slot changes? If so, did you see any change activity in the slot?

J3 ( 2020-04-27 21:06:52 -0500 )

For instance, run this query: SELECT * FROM pg_logical_slot_peek_changes('<logical replication slot name>', NULL, NULL, 'include-timestamp', 'on'); (Sorry, I am unable to format this query properly in a comment; look up pg_logical_slot_peek_changes if you need more info.)

J3 ( 2020-04-27 21:07:43 -0500 )
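
The peek query in the comment above can also be run from a script. This is only a sketch under stated assumptions: `conn` is a DB-API connection that uses `%s` placeholders (psycopg2 does), the helper name `peek_slot_changes` and its `max_changes` argument are made up for illustration, and the slot must already exist with the wal2json output plugin. Peeking does not consume the changes, so it should not interfere with the pipeline.

```python
# SQL for peeking at pending changes in a logical replication slot.
# The arguments are: slot name, upto_lsn (NULL = no limit),
# upto_nchanges, then plugin option name/value pairs.
PEEK_SQL = (
    "SELECT lsn, xid, data "
    "FROM pg_logical_slot_peek_changes(%s, NULL, %s, 'include-timestamp', 'on')"
)


def peek_slot_changes(conn, slot_name, max_changes=10):
    """Return up to max_changes pending rows (lsn, xid, data) without consuming them.

    Hypothetical helper: `conn` is assumed to be a DB-API connection whose
    driver accepts %s-style parameters.
    """
    cur = conn.cursor()
    try:
        cur.execute(PEEK_SQL, (slot_name, max_changes))
        return cur.fetchall()
    finally:
        cur.close()
```

If rows come back with JSON in the `data` column, the slot and wal2json are producing changes, which points the investigation toward the client side instead.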

By the way, your wal2json build was compiled for the version of Postgres you are using, correct?

J3 ( 2020-04-27 21:11:13 -0500 )

Hi, you can check my question (edit from 2020-04-28) about wal2json testing. It works perfectly.

Teleassist ( 2020-04-28 00:47:15 -0500 )

answered 2020-04-28 09:20:08 -0500

Teleassist

PostgreSQL Version : 11.7 (Host Windows 10 - wal2json compiled and installed)
DataCollector Version : 3.13.0 (Host Docker)

This combination seems to work perfectly. I don't know why we got this error with the other versions (9.6.17, 10.12).


Comments

OK, good. I am glad I could help out. So, it would seem that there is some incompatibility with those older versions of Postgres. Hopefully, you can use Postgres 11.7 for your project. All the best, J3

J3 ( 2020-04-28 16:48:54 -0500 )

answered 2020-04-24 16:55:36 -0500

metadaddy

It's possible that there is some incompatibility here. I don't think we've tested with PG 10.12. From the docs:

[image from the documentation, not preserved]
