Exception when use HttpClient (P) with SDC in "Cluster Yarn Streaming" mode

I found that when I use HttpClient (P) with SDC in "Cluster Yarn Streaming" mode, the YARN job fails with the following exception. Can't it be used in cluster mode? If it can, how do I use HttpClient in cluster mode?

User class threw exception: java.lang.IllegalStateException: Error trying to invoke BootstrapClusterStreaming.main: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, ambari-agent1.xxx.com, executor 1): java.lang.RuntimeException: com.streamsets.pipeline.cluster.ConsumerRuntimeException: Consumer encountered error: java.lang.LinkageError: ClassCastException: attempting to cast
jar:file:/hadoop/yarn/local/filecache/10/spark2-hdp-yarn-archive.tar.gz/javax.ws.rs-api-2.0.1.jar!/javax/ws/rs/ext/RuntimeDelegate.class
to
jar:file:/hadoop/hadoop/yarn/local/usercache/root/appcache/application_1530870180619_0022/container_e01_1530870180619_0022_02_000002/libs.tar.gz/streamsets-libs/streamsets-datacollector-basic-lib/lib/javax.ws.rs-api-2.0.1.jar!/javax/ws/rs/ext/RuntimeDelegate.class
    at com.streamsets.pipeline.cluster.Producer.waitForCommit(Producer.java:107)
    at com.streamsets.pipeline.stage.origin.kafka.cluster.ClusterKafkaSource.completeBatch(ClusterKafkaSource.java:117)
    at com.streamsets.pipeline.EmbeddedSDCPool.checkInAfterReadingBatch(EmbeddedSDCPool.java:188)
    at com.streamsets.pipeline.cluster.ClusterFunctionImpl.startBatch(ClusterFunctionImpl.java:113)
    at com.streamsets.pipeline.spark.Driver$$anonfun$1.apply(Driver.scala:113)
    at com.streamsets.pipeline.spark.Driver$$anonfun$1.apply(Driver.scala:102)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
    at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1092)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1083)
    at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1018)
    at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1083)
    at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:809)
    at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)