Yes. You can run pipelines on Windows natively using StreamSets Data Collector Edge. Edge can execute pipelines, but it doesn't provide a UI, so I assume you want the full Data Collector. You can run the full StreamSets Data Collector on Windows using the Windows Subsystem for Linux (WSL). I have used WSL to run data collectors on my laptop. If you have minimal Linux skills (you can use a package manager and get around in the Linux shell), you should be able to get it working.

Basic steps are:

  1. Install Windows Subsystem for Linux (in the Microsoft Store, search for “Ubuntu” and install the generic Ubuntu). You should be able to use any Linux distribution; I just prefer Ubuntu.
  2. Use the package manager to update your machine and install the required packages (see my suggested list below).
  3. Install SDC following the installation documentation.

I believe the following packages are required, though I copied this list from an old Ansible script that installs Control Hub, so your mileage may vary.

sudo apt install \
   openjdk-8-jdk-headless \
   mlocate \
   ntp
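
As a rough sanity check before and after installing the packages above, something like the following could be run inside the Ubuntu shell. This is only a sketch: the helper name is_wsl is made up for illustration, and detecting WSL by looking for "microsoft" in /proc/version is a common convention that I am assuming here, not something the SDC documentation prescribes.

```shell
#!/bin/sh
# Sketch only: check that we are inside WSL and that Java is on the PATH.

# is_wsl (hypothetical helper): succeeds if the given kernel version
# string contains "microsoft" in any letter case. Assumption: WSL
# kernels report this in /proc/version.
is_wsl() {
    printf '%s\n' "$1" | grep -qi 'microsoft'
}

if is_wsl "$(cat /proc/version 2>/dev/null)"; then
    echo "Running inside WSL"
else
    echo "Not running inside WSL (or /proc/version is unavailable)"
fi

# Step 2 (update the machine) would normally come first; left commented
# out here because it needs sudo and network access:
# sudo apt update && sudo apt upgrade

# Report the Java version if the JDK package is already installed.
if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | head -n 1
else
    echo "java not found; install openjdk-8-jdk-headless first"
fi
```

If the last check reports that java is missing, run the apt install command above and try again.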

Notes:

  1. If you find you need to install other packages, please make a note on this page. In fact, if you have any other tips, I am sure others in the community would appreciate your additions here!
  2. You could just use a virtual machine but what fun is that?
  3. Auto-starting SDC is a little more involved. I leave it to your Google skills. :-)
  4. This is not a supported configuration, i.e., you can get functional support, but if a problem exists at the OS level then you are unlikely to get assistance in figuring out what is wrong. Bottom line: don't run this in production.