Hello,

I'm currently using the HTTP Client processor in StreamSets Data Collector and I've encountered an issue with the retry mechanism. Despite configuring the processor to retry on receiving certain HTTP status codes, it doesn't seem to be doing so.

Specifically, I've set the processor to retry immediately when it receives a 409 status code, with a maximum of 2 retries. However, when the processor receives a 409 status code, it doesn't retry the request and instead gives the following error: "HTTP_101: Applying passthrough and error policy on status configuration".

I've checked the pipeline configuration, and I'm not sure why the processor isn't retrying as expected. Has anyone else encountered this issue? Any insights or suggestions would be greatly appreciated.

Thank you in advance for your help.
Hi Team,

While installing the Data Collector tarball engine using the install script from the DataOps platform, the script asks for the download and install directories at run time. Is there a way to avoid entering them during execution, for example by passing them to the install script up front or by defaulting to the current directory?
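If the installer has no documented non-interactive flags, one generic workaround is to pipe the answers into the script's stdin in the order it prompts for them. Whether the real SDC install script reads its answers from stdin this way is an assumption worth verifying; the sketch below uses a stand-in `install` script that prompts for the same two directories:

```shell
# Stand-in for the SDC tarball install script: it reads two directories
# from stdin, much like the real installer prompts at run time.
cat > install <<'EOF'
#!/bin/sh
printf 'Download directory: ' >&2
read dl
printf 'Install directory: ' >&2
read inst
echo "download=$dl install=$inst"
EOF
chmod +x install

# Feed the answers in prompt order instead of typing them interactively:
printf '/tmp/sdc-download\n/tmp/sdc\n' | ./install
# -> download=/tmp/sdc-download install=/tmp/sdc
```

The same `printf ... | ./install` pattern can be dropped into any wrapper script or automation job so that no manual input is needed.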
We are logging pipeline errors and orchestrated task errors to a database. Now we often see runtime error messages like "for actual error, open the logs", and per our solution this placeholder message is what gets logged into the database. We don't want that; instead, we want the actual errors to be logged. How can we do that?
Hello,

I'm currently working on a simple pipeline to ingest Kafka messages into a log file. I'm trying to consume all the data from the beginning of a topic, but I'm only getting the newer data added to the topic; once consumed, the previous topic messages are not accessible anymore. I've already tested all the different "Auto Offset Reset" properties, for both the single-topic and multi-topic consumers.

The official documentation (docs.streamsets.com) lists these Kafka properties: auto.commit.interval.ms, bootstrap.servers, enable.auto.commit, group.id, max.poll.records. If I understand correctly, all those parameters are locked, so I can't disable the offset management and process all the data from the beginning of a topic. Is there an additional Kafka configuration property to use, or do I need to configure the topic directly via the Kafka CLI?

StreamSets Data Collector version: 3.14.0
Kafka Consumer version: 2.0.0

Regards.
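Since the consumer group and offset-commit properties are locked in that version, one common workaround is to reset the consumer group's committed offsets directly with the Kafka CLI while the pipeline is stopped. This is a command sketch, not something runnable without a live broker; the broker address, group name, and topic below are placeholders for your own values (you can find the group the pipeline uses with `kafka-consumer-groups.sh --list`):

```
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group <pipeline-consumer-group> \
  --topic my-topic \
  --reset-offsets --to-earliest --execute
```

On the next start, the consumer resumes from the reset offsets, so the pipeline rereads the topic from the beginning.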
Error: Source: Hello everyone. I want to pull data from the source into the warehouse. In the source there is a "customer_id" column, and when the StreamSets pipeline runs I get an invalid customer error. In StreamSets I also flag (offset) based on the "id" column. Please help, thank you.
Hi Team,

We have some pipelines with configuration like the one below. How can I get the exact plain-text value for these parameter property values? Also, how and where can I set these kinds of values?
I read files from an SFTP server, and when writing to S3 my files get renamed. How can I retain the original file names?
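If the files are transferred in whole file format, the S3 destination's object name can usually be built from the record's file metadata with an expression rather than a fixed name. The sketch below is an assumption to verify against your own records: the exact field path (e.g. filename vs fileName under /fileInfo) depends on the origin, so preview a record first to confirm what metadata it carries:

```
${record:value('/fileInfo/filename')}
```

Using an expression like this as the object name keeps the name the origin recorded instead of a generated one.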
I have a requirement to update job parameters via the REST API. I get the job details via a GET call and modify the content via code, but I am not sure how to pass the modified JSON content from a variable or file in the REST API POST call. I need to run the API call as an automated batch with no manual intervention. I tried the two approaches below, but neither worked, and I don't see any other viable option to pass the modified content in the curl command.

var1=`cat /home/script/modified.json`

curl -X POST https://XXX.hub.streamsets.com/jobrunner/rest/v1/job/72b8200c-0e3b-426e-b564-bceb17220b1e:8c2c652f-e3d9-11eb-9fb3-b974ac4c3f67 -d '{ @var1}' -H "Content-Type:application/json" -H "X-Requested-By:curl" -H "X-SS-REST-CALL:true" -H "X-SS-App-Component-Id: $CRED_ID" -H "X-SS-App-Auth-Token: $CRED_TOKEN" -i

curl -X POST https://XXX.hub.streamsets.com/jobrunner/rest/v1/job/72b8200c-0e3b-426e-b564-bceb17220b1e:8c2c652f-e3d9-11eb-9fb3-b974ac4c3f67 -d '{ /home/script/modified.json}' -H "Content-Type:application/json" -H "X-R
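The likely problem is shell quoting: inside single quotes the shell never expands variables, so the literal text '{ @var1}' is sent as the request body. Two standard fixes are shown below; the job URL and auth headers are left as comments since they are specific to your Control Hub account:

```shell
# Option 1: let curl read the body directly from the file.
#   curl -X POST "$JOB_URL" --data-binary @/home/script/modified.json \
#        -H "Content-Type:application/json" -H "X-Requested-By:curl" ...
# (curl's @file syntax reads the request body from that file;
#  --data-binary sends it byte-for-byte, unlike -d, which strips newlines.)

# Option 2: expand the variable inside double quotes.
printf '{"demo":true}' > modified.json   # stand-in for the real payload
var1=$(cat modified.json)
echo "$var1"   # with double quotes, $var1 expands to the file's JSON
#   curl -X POST "$JOB_URL" -d "$var1" \
#        -H "Content-Type:application/json" -H "X-Requested-By:curl" ...
```

Either option works unattended, so it fits an automated batch with no manual intervention.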
Hello Team,

I need to update one job config, "globalMaxRetries", using a REST call. I referred to the RESTful section in Control Hub, but the REST call for "updateJob" also requires many other configs to be included in the request body. Could you please assist me with this?
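A common pattern for "update requires the full object" REST APIs is read-modify-write: GET the current job JSON, change only the one field, and send the whole document back. The sketch below is not runnable as-is; the URL, headers, and the assumption that jq is available and that the field is named globalMaxRetries in the GET response all need to be checked against your environment:

```
# 1. Fetch the current job definition (auth headers omitted)
curl -s "$SCH_URL/jobrunner/rest/v1/job/$JOB_ID" -H ... > job.json

# 2. Rewrite just the one field; jq keeps everything else unchanged
jq '.globalMaxRetries = 5' job.json > job_updated.json

# 3. Send the complete, modified document back
curl -X POST "$SCH_URL/jobrunner/rest/v1/job/$JOB_ID" \
     --data-binary @job_updated.json \
     -H "Content-Type:application/json" -H "X-Requested-By:curl" ...
```

This way the other required configs in the request body are simply whatever the GET returned, so only the retry setting changes.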
I'm getting this error when I'm trying to use the Snowflake uploader. I manually created the stage, and I know I can put files there. I'm on Data Collector 5.8. How do I overcome this error?
Hi,

We are using the Microsoft SQL Server CDC Client to extract changes from one of our databases. We noticed that the target (a Snowflake destination) complains about unknown operations, so we added the following step to the data flow (in a script evaluator iterating over the batch):

    if (records[i].attributes['sdc.operation.type'] == 5) continue;
    sdc.output.write(records[i]);

Is it normal that we have to filter out operation type 5 manually so that the pipeline works? Is there an example pipeline (SQL Server CDC → Snowflake) that we could use to apply best practices?

Thanks,
Sebastian
I have developed a pipeline that uses an HTTP Server origin. In the origin's configuration, I designated port 8000 for writing data to the pipeline. However, the pipeline does not release this port. I want to use the same port for multiple pipelines, distinguishing them by their respective Application-Id. After executing the initial pipeline, the port remains occupied and is not released, and when I attempt to execute a second pipeline using the same origin, I get an error stating "Port is not free." If anyone has a solution, please help me out.