Error using MapReduce executor in StreamSets 3.7.0

Hello Everyone,

I have set up StreamSets 3.7.0 with the MapR v6.0.0-mep4 stage libraries. The MapR Streams consumer, producer, and other components work correctly.

However, when I try to attach the MapReduce executor to an event produced by the MapR-FS destination, I get the following exception when previewing the pipeline:

com.streamsets.datacollector.util.PipelineException: PREVIEW_0003 - Encountered error while previewing : java.lang.NoClassDefFoundError: org.apache.commons.configuration.Configuration
        at com.streamsets.datacollector.execution.preview.sync.SyncPreviewer.validateConfigs(SyncPreviewer.java:166)
        at com.streamsets.datacollector.execution.preview.async.AsyncPreviewer$1.call(AsyncPreviewer.java:70)
        at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
        at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable$$Lambda$113.000000001C00BCD0.call(Unknown Source)
        at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
        at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
        at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
        at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable$$Lambda$113.000000001C00BCD0.call(Unknown Source)
        at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
        at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at com.streamsets.datacollector.metrics.MetricSafeScheduledExecutorService$MetricsTask.run(MetricSafeScheduledExecutorService.java:100)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:813)
Caused by: java.lang.NoClassDefFoundError: org.apache.commons.configuration.Configuration
        at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
        at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
        at org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:145)
        at org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:159)
        at com.streamsets.datacollector.security.MapRLoginUgiProvider.getLoginUgi(MapRLoginUgiProvider.java:48)
        at com.streamsets.datacollector.security.HadoopSecurityUtil.getLoginUser(HadoopSecurityUtil.java:35)
        at com.streamsets.pipeline.stage.destination.mapreduce.config.MapReduceConfig.init(MapReduceConfig.java:132)
        at com.streamsets.pipeline.stage.destination.mapreduce.MapReduceExecutor.init(MapReduceExecutor.java:70)
        at com.streamsets.pipeline.api.base.BaseStage.init(BaseStage.java:48)
        at com.streamsets.pipeline.api.base.configurablestage.DStage.init(DStage.java:36)
        at com.streamsets.datacollector.runner.StageRuntime.lambda$init$0(StageRuntime.java:211)
        at com.streamsets.datacollector.runner.StageRuntime$$Lambda$143.0000000044743230.get(Unknown Source)
        at com.streamsets.datacollector.util.LambdaUtil.withClassLoaderInternal(LambdaUtil.java:148)
        at com.streamsets.datacollector.util.LambdaUtil.withClassLoader(LambdaUtil.java:44)
        at com.streamsets.datacollector.runner.StageRuntime.init(StageRuntime.java:209)
        at com.streamsets.datacollector.runner.StagePipe.init(StagePipe.java:123)
        at com.streamsets.datacollector.runner.StagePipe.init(StagePipe.java:47)
        at com.streamsets.datacollector.runner.Pipeline.initPipe(Pipeline.java:408)
        at com.streamsets.datacollector.runner.Pipeline.lambda$init$0(Pipeline.java:397)
        at com.streamsets.datacollector.runner.Pipeline$$Lambda$148.0000000044BF34F0.accept(Unknown Source)
        at com.streamsets.datacollector.runner.PipeRunner.forEach(PipeRunner.java:170)
        at com.streamsets.datacollector.runner.Pipeline.init(Pipeline.java:394)
        at com.streamsets.datacollector.runner.Pipeline.validateConfigs(Pipeline.java:220)
        at com.streamsets.datacollector.runner.preview.PreviewPipeline.validateConfigs(PreviewPipeline.java:60)
        at com.streamsets.datacollector.execution.preview.sync.SyncPreviewer.validateConfigs(SyncPreviewer.java:144)
        ... 16 more
Caused by: java.lang.ClassNotFoundException: org.apache.commons.configuration.Configuration
        at java.net.URLClassLoader.findClass(URLClassLoader.java:592)
        at java.lang.ClassLoader.loadClassHelper(ClassLoader.java:934)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:879)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:862)
        at com.streamsets.pipeline.SDCClassLoader.loadClass(SDCClassLoader.java:353)
        at com.streamsets.pipeline.SDCClassLoader.loadClass(SDCClassLoader.java:316)
        at com.streamsets.pipeline.SDCClassLoader.loadClass(SDCClassLoader.java:353)
        at com.streamsets.pipeline.SDCClassLoader.loadClass(SDCClassLoader.java:316)
        ... 41 more

When I try to run the pipeline, I get the following stack trace:

ERROR ProductionPipelineRunnable - An exception occurred while running the pipeline, java.lang.NullPointerException
java.lang.NullPointerException
        at com.streamsets.datacollector.runner.StagePipe.finishBatchAndCalculateMetrics(StagePipe.java:279)
        at com.streamsets.datacollector.runner.StagePipe.process(StagePipe.java:258)
        at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.lambda$destroy$2(ProductionPipelineRunner.java:706)
        at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner$$Lambda$217.00000000AC08CB60.accept(Unknown Source)
        at com.streamsets.datacollector.runner.PipeRunner.executeBatch(PipeRunner.java:138)
        at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.destroy(ProductionPipelineRunner.java:697)
        at com.streamsets.datacollector.runner.Pipeline.destroy(Pipeline.java:436)
        at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:152)
        at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:75)
        at com.streamsets.datacollector.execution.runner.standalone.StandaloneRunner.start(StandaloneRunner.java:724)
        at com.streamsets.datacollector.execution.AbstractRunner.lambda$scheduleForRetries$0(AbstractRunner.java:349)
        at com.streamsets.datacollector.execution.AbstractRunner$$Lambda$218.00000000AC093940.call(Unknown Source)
        at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:226)
        at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable$$Lambda$113.00000000EC00BCD0.call(Unknown Source)
        at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
        at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:222)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at com.streamsets.datacollector.metrics.MetricSafeScheduledExecutorService$MetricsTask.run(MetricSafeScheduledExecutorService.java:100)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:813)

I understand that the class in question resides in the hadoop-common-2.7.0 jar file, and I can see this file under the streamsets-datacollector-mapr_6_0-mep4-lib/lib path, so I don't understand why I keep getting this exception.
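The assumption about which jar holds the missing class can be checked directly. The following is a minimal sketch, not part of StreamSets or MapR: the class name FindClassInJars and the method jarsContaining are hypothetical helpers introduced here for illustration. It lists every jar in a directory that contains a given class entry. Note that the package org.apache.commons.configuration is normally shipped in an Apache Commons Configuration 1.x jar rather than in hadoop-common, so an empty result for the stage library's lib directory would be consistent with the ClassNotFoundException above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper (not part of StreamSets): finds which jars hold a class.
public class FindClassInJars {

    /** Returns the jars directly under libDir that contain the given class. */
    public static List<Path> jarsContaining(Path libDir, String className) throws IOException {
        // A class a.b.C is stored inside a jar as the entry a/b/C.class
        String entryName = className.replace('.', '/') + ".class";
        List<Path> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jarFile = new JarFile(jar.toFile())) {
                    if (jarFile.getEntry(entryName) != null) {
                        hits.add(jar);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FindClassInJars <lib-dir> <fully.qualified.ClassName>");
            return;
        }
        List<Path> hits = jarsContaining(Paths.get(args[0]), args[1]);
        if (hits.isEmpty()) {
            System.out.println("No jar in " + args[0] + " contains " + args[1]);
        }
        for (Path jar : hits) {
            System.out.println(jar);
        }
    }
}
```

It could be run against the stage library directory mentioned above, passing the lib path as the first argument and org.apache.commons.configuration.Configuration as the second. The scan only covers jars directly in that directory, not nested folders.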

PS: I used the setup-mapr utility to set up the MapR stage libraries.

Thanks in advance.