How do I base S3 directory path on Oracle CDC schema and table?

asked 2018-07-30 15:16:52 -0500

Mehul

updated 2018-07-31 18:56:01 -0500

metadaddy

Hello Everyone,

I compared AWS DMS with StreamSets Oracle CDC. I noticed that StreamSets Oracle full load and CDC create .csv files with sdc- as the prefix, but there is no way for us to identify the table from the file name. With AWS DMS, we get a schema bucket/<Table Bucket>/LOAD0001.csv file, and LOAD0001.csv contains the records. Is there a way in StreamSets to get the same directory structure?

Thanks, Mehul


What destination are you using?

metadaddy ( 2018-07-30 16:42:40 -0500 )

AWS S3 bucket.

Mehul ( 2018-07-31 11:30:53 -0500 )

1 Answer

answered 2018-07-31 18:55:26 -0500

metadaddy

You can configure the Amazon S3 destination's Partition Prefix with an expression. The Oracle CDC origin puts the table name in the oracle.cdc.table record header attribute and the schema name in oracle.cdc.schema, so, to get a directory structure similar to AWS DMS, you would set the Partition Prefix to:
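A minimal sketch of such an expression, assuming the `record:attribute` function from the StreamSets expression language is used to read the two header attributes named above (the exact expression is not reproduced in this post, so treat this as an illustration rather than the original answer):

```
${record:attribute('oracle.cdc.schema')}/${record:attribute('oracle.cdc.table')}
```

With a prefix of this shape, objects for each table should land under a schema/table path, roughly mirroring the AWS DMS layout.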


Set the Object Name Suffix to csv.


Where do I get a list of these attributes for all origins, like Salesforce, SQL Server, etc.?

Mehul ( 2018-08-01 19:10:03 -0500 )

I found it myself. All I have to do is take a snapshot, and the preview window then shows the record header. I managed to get the details of a Salesforce object as well, and the pipeline automatically created a bucket in S3 for the object.

Mehul ( 2018-08-03 11:48:06 -0500 )