sha1sum fails on CentOS 7 & Ubuntu 16.04

asked 2018-05-12



We are building Docker images with the MapR software installed and configured (Dockerfile below). When we reach the step that installs the MapR stagelib, we get this error:

    Step 16/22 : RUN cd ${SDC_DIST} && ${SDC_DIST}/bin/streamsets stagelibs -install=streamsets-datacollector-mapr_6_0-mep4-lib
     ---> Running in f5b96186b0f8

    sha1sum: invalid option -- 's'
    Try 'sha1sum --help' for more information.
    Failed! running sha1sum -s -c in /tmp/sdc-setup-7

    The command '/bin/sh -c cd ${SDC_DIST} && ${SDC_DIST}/bin/streamsets stagelibs -install=streamsets-datacollector-mapr_6_0-mep4-lib' returned a non-zero code: 1

Has this been seen with these operating systems?

Dockerfile (WIP):

    FROM maprtech/pacc:6.0.1_5.0.0_centos7
    MAINTAINER Paul Curtis <>

    # The paths below should generally be attached to a VOLUME for persistence.
    # SDC_CONF is where configuration files are stored. This can be shared.
    # SDC_DATA is a volume for storing collector state. Do not share this between containers.
    # SDC_LOG is an optional volume for file based logs.
    # SDC_RESOURCES is where resource files such as runtime:conf resources and Hadoop configuration can be placed.
    # STREAMSETS_LIBRARIES_EXTRA_DIR is where extra libraries such as JDBC drivers should go.
    ENV SDC_CONF=/etc/sdc \
        SDC_DATA=/data \
        SDC_DIST="/opt/streamsets-datacollector" \
        SDC_LOG=/logs \

    RUN groupadd --system ${SDC_USER} && \
        adduser --system -g ${SDC_USER} ${SDC_USER}

    RUN cd /tmp && \
        curl -o /tmp/sdc.tgz -L "${SDC_URL}" && \
        mkdir /opt/streamsets-datacollector && \
        tar xzf /tmp/sdc.tgz --strip-components 1 -C /opt/streamsets-datacollector && \
        rm -rf /tmp/sdc.tgz

    # Add logging to stdout to make logs visible through `docker logs`.
    RUN sed -i 's|INFO, streamsets|INFO, streamsets,stdout|' "${SDC_DIST}/etc/"

    # Create necessary directories.
    RUN mkdir -p /mnt \
        "${SDC_DATA}" \
        "${SDC_LOG}" \

    # Move configuration to /etc/sdc
    RUN mv "${SDC_DIST}/etc" "${SDC_CONF}"

    # Use short option -s as long option --status is not supported on alpine linux.
    RUN sed -i 's|--status|-s|' "${SDC_DIST}/libexec/_stagelibs"

    # Setup filesystem permissions.
    RUN chown -R "${SDC_USER}:${SDC_USER}" "${SDC_DIST}/streamsets-libs" \
        "${SDC_CONF}" \
        "${SDC_DATA}" \
        "${SDC_LOG}" \
        "${SDC_RESOURCES}" \

    # Set the required MapR locations
    ENV MAPR_HOME=/opt/mapr \
        MAPR_VERSION=6.0 \

    # Install the MapR library, and run the mapr-setup
    RUN cd ${SDC_DIST} && ${SDC_DIST}/bin/streamsets stagelibs -install=streamsets-datacollector-mapr_6_0-mep4-lib

    RUN ${SDC_DIST}/bin/streamsets setup-mapr

    EXPOSE 18630
    CMD ["dc", "-exec"]

1 Answer

answered 2018-05-14



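It looks like you copied our instructions for running Data Collector on Alpine Linux. Unfortunately, the quiet option for sha1sum on Alpine (BusyBox) is -s, while GNU sha1sum, which CentOS and Ubuntu use, only accepts --status. That is why the install step fails with "invalid option -- 's'" on your CentOS 7 base image.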

The clue is in the Dockerfile:

    # Use short option -s as long option --status is not supported on alpine linux.
    RUN sed -i 's|--status|-s|' "${SDC_DIST}/libexec/_stagelibs"

Just remove that line and you should be good to go.
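
If the same Dockerfile ever has to build on both Alpine and CentOS/Ubuntu bases, one option is to probe sha1sum at build time instead of patching unconditionally. The following is only a minimal sketch, not something taken from the Data Collector scripts: the function names `sha1sum_has_status` and `patch_stagelibs_if_needed` are made up for illustration, and it assumes `mktemp -d` is available in the image and that `SDC_DIST` is set as in the Dockerfile above.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 when the local sha1sum accepts --status
# (as GNU coreutils does) and non-zero when the flag is rejected.
sha1sum_has_status() {
    probe_dir=$(mktemp -d) || return 2
    (
        cd "$probe_dir" || exit 2
        printf 'probe\n' > probe.txt
        sha1sum probe.txt > probe.sha1
        # The checksum file is valid, so this should only fail
        # when sha1sum does not understand --status.
        sha1sum --status -c probe.sha1 2>/dev/null
    )
    rc=$?
    rm -rf "$probe_dir"
    return "$rc"
}

# Hypothetical guard: apply the Alpine workaround only where it is needed.
patch_stagelibs_if_needed() {
    if sha1sum_has_status; then
        echo "sha1sum supports --status; leaving _stagelibs unchanged"
    else
        sed -i 's|--status|-s|' "${SDC_DIST}/libexec/_stagelibs"
    fi
}
```

Saved as a script and sourced from a RUN step, calling `patch_stagelibs_if_needed` would take the place of the unconditional sed. On the CentOS 7 image in the question it should leave `_stagelibs` alone, which has the same effect as simply deleting the line.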


