环境

spark:3.0.0
scala:2.12.10
kafka:2.12-2.2.2

描述

使用spark-streaming连接Kafka并读取数据,出现如下错误:

报错信息

Exception in thread "streaming-start" java.lang.NoClassDefFoundError: org/apache/spark/kafka010/KafkaConfigUpdater
	at org.apache.spark.streaming.kafka010.ConsumerStrategy.setAuthenticationConfigIfNeeded(ConsumerStrategy.scala:64)
	at org.apache.spark.streaming.kafka010.Subscribe.onStart(ConsumerStrategy.scala:91)
	at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.consumer(DirectKafkaInputDStream.scala:73)
	at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.start(DirectKafkaInputDStream.scala:258)
	at org.apache.spark.streaming.DStreamGraph.$anonfun$start$7(DStreamGraph.scala:55)
	at org.apache.spark.streaming.DStreamGraph.$anonfun$start$7$adapted(DStreamGraph.scala:55)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
	at scala.collection.parallel.ParIterableLike$Foreach.leaf(ParIterableLike.scala:974)
	at scala.collection.parallel.Task.$anonfun$tryLeaf$1(Tasks.scala:53)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at scala.util.control.Breaks$$anon$1.catchBreak(Breaks.scala:67)
	at scala.collection.parallel.Task.tryLeaf(Tasks.scala:56)
	at scala.collection.parallel.Task.tryLeaf$(Tasks.scala:50)
	at scala.collection.parallel.ParIterableLike$Foreach.tryLeaf(ParIterableLike.scala:971)
	at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute(Tasks.scala:153)
	at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute$(Tasks.scala:149)
	at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:440)
	at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.kafka010.KafkaConfigUpdater
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 24 more

解决方法

到Maven仓库下载:
1.org.apache.spark:spark-streaming-kafka-0-10_2.12-3.0.0.jar
2.spark-token-provider-kafka-0-10_2.12-3.0.0.jar
将这两个jar包放到$SPARK_HOME/jars目录下;
3.创建目录$SPARK_HOME/jars将kafka安装目录中libs/* 复制到 $SPARK_HOME/jars/kafka中.
4.重新提交任务:

$SPARK_HOME/bin/spark-submit --driver-class-path \ 
 $SPARK_HOME/jars*:$SPARK_HOME/jars/kafka/* \ 
 --class youClass \
 
Logo

Kafka开源项目指南提供详尽教程,助开发者掌握其架构、配置和使用,实现高效数据流管理和实时处理。它高性能、可扩展,适合日志收集和实时数据处理,通过持久化保障数据安全,是企业大数据生态系统的核心。

更多推荐