How do I call an API using StreamSets Data Collector? Please provide a use case with detailed screenshots

asked 2019-09-10 04:58:37 -0600

mspatil

updated 2019-09-10 10:03:33 -0600

metadaddy

I am trying to build a pipeline that calls an API using StreamSets Data Collector. I can log in successfully using the StreamSets HTTP Client, but when I try to call another API from the same pipeline I get an "Unauthorized" error, and I cannot call any other API after login. My assumption is that Data Collector is not capturing and saving the cookie sent by the API server, so the session is getting lost. Correct me if my understanding is wrong. Please suggest how to proceed with this use case and, if possible, attach an example where StreamSets interacts with an API.

Should I use the HTTP Client origin or the HTTP Client processor to call the API?

REST APIs for standalone SDC do not use/require auth cookies. Is your SDC registered with StreamSets Control Hub?

iamontheinet (2019-09-10 09:47:44 -0600)

What API are you trying to use? It's very unusual for API clients to have to deal with cookies. Usually, the authentication API returns a token for use in future calls. You should examine the API documentation to confirm the authentication mechanism.

metadaddy (2019-09-10 09:53:39 -0600)

@iamontheinet I am using a standalone SDC, and it is not registered with Control Hub.

mspatil (2019-09-11 07:30:52 -0600)

Got it. See Pat's answer below.

iamontheinet (2019-09-11 16:50:10 -0600)

1 Answer

answered 2019-09-10 10:06:11 -0600

metadaddy

APIs vary widely, but, as I mentioned in my comment above, it's very rare for them to use cookies to carry session data. Typically, the authorization call returns a token in the HTTP response body for the client to use in subsequent calls. Again, the mechanism can vary, but StreamSets has built-in support for OAuth 2.0, one of the most common mechanisms. The article "Extract Data from Google Analytics using StreamSets Data Collector" gives an example of how to call the Google Analytics API from a pipeline.
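
To make the token flow concrete, here is a minimal sketch in plain Python, outside Data Collector. Everything in it is an assumption for illustration: the mock server, the `/login` and `/data` paths, the `access_token` field name, and the `Bearer` header scheme are hypothetical, and your API's documentation decides the real ones. It only shows the general pattern: log in, read the token from the response body, then send it on each later request.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

TOKEN = "demo-token-123"  # hypothetical token issued by the mock server


class MockApi(BaseHTTPRequestHandler):
    """Stand-in API (not a real service): POST /login issues a token,
    GET on any path requires that token in the Authorization header."""

    def _send(self, status, payload):
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Read and discard the request body (credentials are not checked here).
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        if self.path == "/login":
            self._send(200, {"access_token": TOKEN})
        else:
            self._send(404, {"error": "not found"})

    def do_GET(self):
        if self.headers.get("Authorization") == f"Bearer {TOKEN}":
            self._send(200, {"records": [1, 2, 3]})
        else:
            self._send(401, {"error": "Unauthorized"})

    def log_message(self, *args):
        pass  # keep the demo quiet


# Opener that ignores proxy environment variables, so localhost is reached directly.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def call(url, data=None, token=None):
    """Send a JSON request (POST if data is given, else GET); return (status, body)."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps(data).encode() if data is not None else None
    req = urllib.request.Request(url, data=body, headers=headers)
    try:
        with OPENER.open(req) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        return err.code, json.load(err)


def demo():
    server = HTTPServer(("127.0.0.1", 0), MockApi)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        # Step 1: log in and read the token from the response body.
        _, login_body = call(f"{base}/login", data={"user": "u", "password": "p"})
        token = login_body["access_token"]
        # Step 2: the same call fails without the token and succeeds with it.
        without_token = call(f"{base}/data")
        with_token = call(f"{base}/data", token=token)
        return without_token, with_token
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The request without the token gets the same "Unauthorized" response described in the question, which is what you would expect if the second call in the pipeline is not carrying the credential returned by the login call.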
