Hello! I'm currently trying out the free (open source) version of SDC. When I created an empty pipeline, added a Cron Scheduler and a JavaScript Evaluator, and tried to run it, I encountered the following error:

Pipeline Status: STARTING_ERROR: java.lang.NoClassDefFoundError: Could not initialize class com.streamsets.pipeline.stage.processor.javascript.Java8JavaScriptObjectFactory

I was under the impression that using the JavaScript Evaluator did not require any additional package installation. Is there an additional step that needs to be taken? I need some help. Thank you.
Hello, I'm currently using the HTTP Client processor in StreamSets Data Collector and I've encountered an issue with the retry mechanism. Despite configuring the processor to retry on receiving certain HTTP status codes, it doesn't seem to be doing so.

Specifically, I've set the processor to retry immediately when it receives a 409 status code, with a maximum of 2 retries. However, when the processor receives a 409, it doesn't retry the request and instead gives the following error: "HTTP_101: Applying passthrough and error policy on status configuration".

I've checked the pipeline configuration, and I'm not sure why the processor isn't retrying as expected. Has anyone else encountered this issue? Any insights or suggestions would be greatly appreciated.

Thank you in advance for your help.
Hi Team, while installing the Data Collector tarball engine using the install script from the DataOps platform, the script asks for the download and install directories at run time. Is there a way to avoid passing them during execution, and instead pass them in the install script or use the current directory?
We are logging pipeline errors and orchestrated task errors in a database. Often we see error messages like "for actual error, open the logs" as runtime errors, and per our solution this message is what gets logged into the database. We don't want that; instead we want the actual errors to be logged. How can we do that?
Hello, I'm currently working on a simple pipeline to ingest Kafka messages into a log file. I'm trying to consume all the data from the beginning of a topic, but I'm only getting newer data added to the topic; once consumed, the previous topic messages are not accessible anymore. I've already tested all the different "Auto Offset Reset" properties, and the same goes for both the single-topic and multi-topic consumers.

According to the official documentation at docs.streamsets.com, the following properties are locked:

auto.commit.interval.ms
bootstrap.servers
enable.auto.commit
group.id
max.poll.records

If I understand correctly, because all those parameters are locked I can't disable the offset management and process all the data from the beginning of a topic. Is there an additional Kafka configuration property to use, or do I need to configure the topic directly via the Kafka CLI?

StreamSets Data Collector version: 3.14.0
Kafka Consumer version: 2.0.0

Regards.
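If the locked consumer properties rule out disabling offset management from inside SDC, one workaround worth knowing about is resetting the consumer group's committed offsets with the Kafka CLI while the pipeline is stopped. This is a sketch, not verified against your cluster; the broker address, group id, and topic name below are placeholders, and the `--reset-offsets` option is available in the Kafka 2.0 tools:

```shell
# Sketch: reset the SDC consumer group's committed offsets to the
# beginning of the topic. Run while the pipeline is stopped.
# All three values below are placeholders for your environment.
BOOTSTRAP="broker-host:9092"
GROUP="sdc-consumer-group"   # the group.id the SDC Kafka Consumer uses
TOPIC="my-topic"

bin/kafka-consumer-groups.sh \
  --bootstrap-server "$BOOTSTRAP" \
  --group "$GROUP" \
  --topic "$TOPIC" \
  --reset-offsets --to-earliest --execute
```

Dropping `--execute` performs a dry run that only prints the offsets that would be set, which is a safe way to check the group name first.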
Hello everyone, I want to pull data from a source into the warehouse. The source has a "customer_id" column, and in StreamSets I also flag records based on the "id" column. When the pipeline runs, why is there an "invalid customer" error on the source? Please help, thank you.
Hi Team, we have some pipelines with configuration like the one below. How can I get the plain text value for these parameter properties? Also, how and where can I set these kinds of values?
I read from an SFTP server and when writing to S3 my files get renamed. How can I retain the original file name?
I have a requirement to update a job parameter via the REST API. I get the job details via a GET call and modify the content via code, but I am not sure how to pass the modified JSON content in a variable or file in the REST API POST call. I need to run the API as an automated batch with no manual intervention. I tried the two ways below, but neither worked, and I don't see any other viable option to pass the modified content in the curl command.

var1=`cat /home/script/modified.json`

curl -X POST https://XXX.hub.streamsets.com/jobrunner/rest/v1/job/72b8200c-0e3b-426e-b564-bceb17220b1e:8c2c652f-e3d9-11eb-9fb3-b974ac4c3f67 -d '{ @var1}' -H "Content-Type:application/json" -H "X-Requested-By:curl" -H "X-SS-REST-CALL:true" -H "X-SS-App-Component-Id: $CRED_ID" -H "X-SS-App-Auth-Token: $CRED_TOKEN" -i

curl -X POST https://XXX.hub.streamsets.com/jobrunner/rest/v1/job/72b8200c-0e3b-426e-b564-bceb17220b1e:8c2c652f-e3d9-11eb-9fb3-b974ac4c3f67 -d '{ /home/script/modified.json}' -H "Content-Type:application/json" -H "X-R
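For reference, curl's standard mechanisms for sending a file as the request body are the `@file` form of `-d`/`--data-binary` (the `@` goes directly before the path, with no surrounding braces) and variable expansion inside double quotes; single quotes around `'{ @var1}'` send that text literally. A sketch, with the jobrunner URL from the post abbreviated into a placeholder variable:

```shell
# Sketch: two ways to send the contents of modified.json as the POST body.
# JOB_URL stands in for the full jobrunner endpoint from the post.
JOB_URL="https://XXX.hub.streamsets.com/jobrunner/rest/v1/job/<job-id>"

# 1) Let curl read the file itself. --data-binary sends the file as-is;
#    plain -d @file would strip newlines.
curl -X POST "$JOB_URL" \
  --data-binary @/home/script/modified.json \
  -H "Content-Type: application/json" -H "X-Requested-By: curl"

# 2) Expand a shell variable. Double quotes make the shell substitute
#    $var1; the original single-quoted '{ @var1}' was sent verbatim.
var1=$(cat /home/script/modified.json)
curl -X POST "$JOB_URL" \
  -d "$var1" \
  -H "Content-Type: application/json" -H "X-Requested-By: curl"
```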
Hello Team, I need to update one job config, "globalMaxRetries", using a REST call. I referred to the RESTful section in Control Hub, but the REST call for "updateJob" also requires many other configs to be mentioned in the request body. Could you please assist me with this?
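When an update endpoint expects the full job document, one common pattern is to fetch the current definition, patch just the one field, and POST the whole document back. The sketch below assumes `jq` is available; the endpoint URL, header set, and the exact JSON path of `globalMaxRetries` in the response are assumptions to verify against a GET response from your Control Hub version:

```shell
# Sketch: GET the full job JSON, patch one field with jq, POST it back.
# JOB_URL is a placeholder; the field path .globalMaxRetries is an
# assumption -- inspect job.json to find where it actually lives.
JOB_URL="https://<your-org>.hub.streamsets.com/jobrunner/rest/v1/job/<job-id>"

curl -s "$JOB_URL" -H "X-SS-REST-CALL: true" > job.json
jq '.globalMaxRetries = 5' job.json > job.patched.json
curl -X POST "$JOB_URL" \
  --data-binary @job.patched.json \
  -H "Content-Type: application/json" -H "X-Requested-By: curl"
```

This way the "many other configs" in the request body come straight from the GET response, and only the one field you care about changes.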
I’m getting this error when I’m trying to use the Snowflake uploader. I manually created the stage, and I know I can put files there. I’m on Data Collector 5.8. How do I overcome this error?
Hi, we are using the Microsoft SQL Server CDC Client to extract changes on one of our databases. We noticed that the target (a Snowflake destination) is complaining about unknown operations, so we added the following step to the data flow:

if (records[i].attributes['sdc.operation.type'] == 5) continue;
sdc.output.write(records[i]);

Is it normal that we have to filter out operation type 5 manually so that the pipeline works? Is there an example pipeline (SQL CDC → Snowflake) so that we could apply best practices?

Thanks
Sebastian