StreamSets Community Forum For Data Professionals | StreamSets Community

1,868 Topics
2,665 Replies
1,673 Members

Recently active
Meghana Chinnu (Fan)

Community Articles and Got a Question?

HTTP Client missing output records

The HTTP Client in my pipeline is not processing all of the input records that it receives. For example, the input to the HTTP Client is 1,430 records, but the stage outputs only 1,360 records, with 0 error records. I am not sure whether I am missing some configuration that would make the input and output record counts balance and send the failed records to the error stage.

Karan Bhatia (Fan)

Community Articles and Got a Question?

API request using multiple values for a query parameter in one job

Hello there, I am trying to solve a very specific use case. I am trying to query a DB through an API, and this API takes multiple query parameters. One of the query parameters is going to be ids, of which there are more than 100. The catch is that I can’t pass all 100 ids as an array in the API call, because the API is not designed to accept an array for that parameter. So it is going to be a kind of loop over those 100 ids, one by one, calling the API with a new id as the parameter value in each iteration. Also, these IDs need to be fetched from a Snowflake table and then passed as the parameter to the API call, so I am thinking of having a Snowflake or JDBC Query Consumer as an origin. The number of IDs will increase over time, so I want to make it as dynamic as possible, but that is not a priority for now. Having multiple jobs to solve this would lead to 100+ jobs, and that number would keep increasing, which is not good practice at all. Could someone please suggest the best possi
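
For readers following this thread, the "one call per id" pattern described in the question can be sketched outside StreamSets in plain Python. This is only an illustration of the looping idea, not a pipeline configuration: the endpoint `https://api.example.com/v1/items`, the parameter names `id` and `status`, and the helper `build_request_urls` are all hypothetical and do not come from the original post.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only (not from the original post).
BASE_URL = "https://api.example.com/v1/items"


def build_request_urls(ids, fixed_params):
    """Return one request URL per id, because the API accepts a single id per call."""
    urls = []
    for item_id in ids:
        params = dict(fixed_params)  # copy the parameters shared by every call
        params["id"] = item_id       # only the id changes between calls
        urls.append(f"{BASE_URL}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    # In the scenario from the post, these ids would be read from a Snowflake table.
    ids_from_table = [101, 102, 103]
    for url in build_request_urls(ids_from_table, {"status": "active"}):
        print(url)
```

Because the list of ids is an input rather than something hard-coded, the same logic keeps working as the table grows, which is the "dynamic" behaviour the question asks for. How this maps onto a single job in a pipeline is left open here.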

Dolphin (Discovered Fame)

Show us your Pipelines

How to configure the pipeline "Stop Event" not to run when the pipeline fails

Hi team, we have a pipeline configured with a stop event that runs a SQL statement once the pipeline finishes processing data. However, we found that when the pipeline fails for some reason, this “Stop Event” still runs, which is not what we expect. Can you please let me know whether there is a setting that prevents the pipeline "Stop Event" from running when the pipeline fails while processing data? I found that if the pipeline “Start Event” fails, the pipeline does not run, which is expected; however, the “Stop Event” always runs, even when the pipeline has failed.

Atousa (Roadie)

StreamSets Academy

Parameters in stream selector stage

Hi, in my pipeline I have a Stream Selector stage. I want to parameterize it and use the following expression for the condition: ${record:value('/rating_text') == '${pipeline_rating_text}'}. Here, pipeline_rating_text is a parameter that I have defined for my pipeline. The problem is that when I run the pipeline, it does not work. If I use the expression ${record:value('/rating_text') == 'Excellent'}, everything works fine. Can somebody help me, please?

Atousa (Roadie)

StreamSets Academy

[JDBC Table 1] Cannot connect to specified database: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. (JDBC_00)

Hi, I have a JDBC connection for a database that runs in a Docker container on my local machine. This connection works perfectly when I build a Data Collector pipeline. I installed a Transformer engine in a Docker container on my local machine (I have installed the external JDBC libraries). Then I made a very simple pipeline to read from my database using this JDBC connection. I constantly get this error message: “[JDBC Table 1] Cannot connect to specified database: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. (JDBC_00)”. Can anyone help me solve the problem?

Community Articles and Got a Question?

Kafka Connection error Http failure response for https TUNNELING_INSTANCE_ID=tunneling-1: 500 OK

Dears, I have configured local Docker images for StreamSets and Kafka to try a simple Kafka connection, but I'm receiving the following: Http failure response for https://na01.hub.streamsets.com/tunneling/rest/660c2f92-c396-4322-a9ea-cd73758897a1/rest/v1/pipeline/dynamicPreview?TUNNELING_INSTANCE_ID=tunneling-1: 500 OK

The Kafka containers are:

bash-3.2$ docker-compose ps
NAME       IMAGE                   COMMAND                 SERVICE    CREATED         STATUS         PORTS
kafka      wurstmeister/kafka      "start-kafka.sh"        kafka      27 minutes ago  Up 27 minutes  0.0.0.0:9092->9092/tcp
zookeeper  wurstmeister/zookeeper  "/bin/sh -c '/usr/sb…"  zookeeper  27 minutes ago  Up 27 minutes  22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp

I try Test Connection with kafka:9092 or localhost:9092 and receive the Http failure response for https://na01.hub.streamsets.com/tunneling

jangala.karthik (Fan)

Community Articles and Got a Question?

Change Runtime Parameter value

I have passed a parameter to my pipeline, which has a default value when I start the pipeline. I need to change its value from one of the processors. Is it possible to change a runtime parameter? If yes, can you please provide the steps to do that?

Community Articles and Got a Question?

SNOWFLAKE_74 - Using EL requires auto-creation error

I’m getting this error when I try to use the Snowflake uploader. I manually created the stage and I know I can put files there. I’m on Data Collector 5.8. How do I overcome this error?

Melanie (Discovered Fame)

StreamSets Academy

JDBC Connection not working (Zomato)

Hi, I am trying to establish a connection to jdbc:mysql://mysqldb:3306/zomato, but it is not working. I tried two different approaches. At first, I did not want to use Strigo, so that I was not limited to the 8 hours, so I run my engines in containers on my local machine. The engines are running fine on Control Hub and I can create and run some pipelines. However, I would like to use the connection cited in one of the labs, with mysqldb and zomato. I am not sure how to “install” the zomato db and reviews table locally. I have a container running MySQL, but from there I don't know what to do. If I try to create the connection using my local engine, I get the error below. JDBC_00 - Cannot connect to specified database: com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. I could see a post with the same error message but it does

vishalsdc2024 (New Member)

Show us your Pipelines

Stream set pipeline is not starting

Hi, I am facing an issue while starting my SDC service: SDC is not coming up. It is production and needs to be up. Can someone please help with this at the earliest? Caused by: com.streamsets.datacollector.store.PipelineStoreException: CONTAINER_0206 - Cannot load details for pipeline 'SupplyCha__eed7a3d5-486c-499b-baa1-6eac92aa198b__averydennison.com': java.io.IOException: File '/apps/sdc/data/pipelines/SupplyCha__eed7a3d5-486c-499b-baa1-6eac92aa198b__averydennison.com/pipeline.json-tmp' exists, '/apps/sdc/data/pipelines/SupplyCha__eed7a3d5-486c-499b-baa1-6eac92aa198b__averydennison.com/pipeline.json-old' should exists. We have referred to a link for a solution, but are not sure what can be done, as the files are not there for those pipelines. Thanks & Regards, Vishal Verma

StreamSets Academy

How to fetch actual error message in a Streamsets pipeline?

We are logging pipeline errors and orchestrated task errors in a database. We often see runtime error messages like “for actual error, open the logs”. With our solution, this message gets logged into the database. We don’t want that. Instead, we want the actual errors to be logged. How can we do that?

Colin (New Member)

Show us your Pipelines

Inquiry Regarding StreamSets Product Versions and Feature Differences

As per my understanding, the Community Edition is available for free. Could you please provide insights into the advantages of the Enterprise Edition over the Community Edition? Additionally, what are the limitations of the Community Edition?

Community Articles and Got a Question?

Running Stored procedure and using the data returned by the SP

The case is simple: I have a Kafka topic to which data gets pushed, and I use the data to execute a stored procedure using the JDBC query node. This is working fine, but I need to get the return value from the SP so that I can log whether the execution succeeded or whether there was an error in the input, which is returned by the SP itself. The SP has some complex logic that involves multiple tables, so I don't think it's a good idea to eliminate the SP. Since I don't see any output in the JDBC query executor, how do I redo the pipeline so that I can store/process the SP outputs?

Dolphin (Discovered Fame)

Show us your Pipelines

Oracle CDC client pipeline keeps running without processing any records, but in DBeaver the same account can get data from V$LOGMNR_CONTENTS

Hi, I built a pipeline using the Oracle CDC Client. It is a very simple pipeline; I have attached the exported pipeline. I am currently using sysdba access configured in StreamSets, and with this account I can get records from V$LOGMNR_CONTENTS when I run the query below in DBeaver; please refer to the attached screenshot "DBeaver_logmnr_screeashot.png". However, the StreamSets CDC pipeline keeps running without any input or output. I have also attached the sdc.log from the server. From the log I can see that the pipeline has obtained the timestamp of the starting SCN operation; however, it cannot get records from LOGMNR and insert them into the destination. Can you let me know whether anything is wrong here?
