Spark Streaming: Querying Redis Inside an RDD

月月._. · Published 2022-01-05 15:25:27

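Problem:

While consuming data from Kafka, each incoming record has to be combined with the previous record for the same key, which is stored in Redis, so the job needs to query Redis during the computation.

Solution:

1. Add the dependency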

<!-- https://mvnrepository.com/artifact/com.redislabs/spark-redis -->
<dependency>
    <groupId>com.redislabs</groupId>
    <artifactId>spark-redis</artifactId>
    <version>2.4.0</version>
</dependency>

2. Configure SparkConf

// Redis connection settings
SparkConf conf = new SparkConf();
conf.set("spark.redis.host", "192.168.144.153"); // Redis host
conf.set("spark.redis.port", "6379");            // port; defaults to 6379 if omitted
// conf.set("spark.redis.auth", "null");         // password, only needed if Redis requires authentication
conf.set("spark.redis.db", "2");                 // Redis database index

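3. Obtain a Redis connection and query the data

import com.redislabs.provider.redis.ConnectionPool;
import com.redislabs.provider.redis.RedisEndpoint;
import redis.clients.jedis.Jedis;
Jedis jedis = ConnectionPool.connect(new RedisEndpoint(conf));
// Look up the previous record for this VIN in the Redis hash (hget returns null if the field is absent)
String realDataLastStr = jedis.hget(RedisKey.HASH_VEHICLE_REAL_DATA, vin);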

4. The value returned is a String and has to be converted into an object. JSONUtil here is the utility class from hutool.

RealtimeDataHB realtimeDataHBLast = JSONUtil.toBean(realDataLastStr, RealtimeDataHB.class);

Done!

The following warning may appear at runtime:
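Steps 3 and 4 amount to: look up the previous record by VIN, handle the case where none exists yet, and compute against the current record. The sketch below shows that flow in plain Java, with a `HashMap` standing in for the Redis hash so that it runs without Spark or Redis. The class name, the `mileage` value, and the step that stores the current record for the next lookup are illustrative assumptions, not part of the original job. In the real job the lookup would be `jedis.hget`, which returns `null` when the field is absent, so the null check matters before parsing the JSON.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a HashMap stands in for the Redis hash (field = VIN).
class PreviousRecordLookup {
    // Stand-in for the Redis hash; a real job would call jedis.hget / jedis.hset instead.
    private final Map<String, Long> lastMileageByVin = new HashMap<>();

    // Returns (current - previous) for the VIN, or 0 when there is no previous
    // record yet, then stores the current value for the next lookup.
    long deltaAndStore(String vin, long currentMileage) {
        Long previous = lastMileageByVin.get(vin); // like hget: null if absent
        long delta = (previous == null) ? 0L : currentMileage - previous;
        lastMileageByVin.put(vin, currentMileage); // like hset (assumed write-back)
        return delta;
    }

    public static void main(String[] args) {
        PreviousRecordLookup lookup = new PreviousRecordLookup();
        System.out.println(lookup.deltaAndStore("VIN001", 1000L)); // prints 0
        System.out.println(lookup.deltaAndStore("VIN001", 1025L)); // prints 25
    }
}
```

Treating a missing previous record as a delta of 0 is only one possible choice; skipping the record or emitting a marker value may suit other jobs better.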

[ WARN ] 2022-01-05 13:48:21 [ driver-heartbeater:23640 ] [org.apache.spark.internal.Logging$class.logWarning(Logging.scala:69)] Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped

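If this happens, close the Redis connection at an appropriate point, or declare the connection in the resource clause of a try statement (try-with-resources) so that it is closed automatically.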
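A minimal sketch of the "declare the connection in try" option follows. It uses a hypothetical `FakeConnection` class in place of Jedis so that it runs on its own. Recent Jedis versions implement `Closeable`, so the same try-with-resources shape should apply to the `Jedis` object from step 3, but that is an assumption to verify against the Jedis version your build pulls in.

```java
// Hypothetical sketch: FakeConnection stands in for a pooled Redis connection.
class TryWithResourcesSketch {
    static class FakeConnection implements AutoCloseable {
        boolean closed = false;

        String hget(String key, String field) {
            if (closed) {
                throw new IllegalStateException("connection already closed");
            }
            return key + ":" + field; // dummy value instead of a real Redis lookup
        }

        @Override
        public void close() {
            closed = true; // a real pooled connection would be returned to the pool here
        }
    }

    // Kept only so the example can show that close() was called.
    static FakeConnection lastOpened;

    static String query(String key, String field) {
        // The connection is closed automatically when the block exits,
        // whether it returns normally or throws.
        try (FakeConnection conn = new FakeConnection()) {
            lastOpened = conn;
            return conn.hget(key, field);
        }
    }

    public static void main(String[] args) {
        System.out.println(query("hash", "VIN001")); // prints hash:VIN001
        System.out.println(lastOpened.closed);       // prints true
    }
}
```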
