Reposted from: http://blog.csdn.net/desilting/article/details/23194039

Configure Flume:

       http://blog.csdn.net/desilting/article/details/22811593

The conf/flume-conf.properties file:

    producer.sources = s
    producer.channels = c
    producer.sinks = r

    producer.sources.s.channels = c
    producer.sources.s.type = netcat
    producer.sources.s.bind = 192.168.40.134
    producer.sources.s.port = 44444

    producer.sinks.r.type = org.apache.flume.plugins.KafkaSink
    producer.sinks.r.metadata.broker.list=192.168.40.134:9092
    producer.sinks.r.serializer.class=kafka.serializer.StringEncoder
    producer.sinks.r.request.required.acks=1
    producer.sinks.r.max.message.size=1000000
    producer.sinks.r.custom.topic.name=mykafka
    producer.sinks.r.channel = c

    producer.channels.c.type = memory
    producer.channels.c.capacity = 1000


Configure Kafka:

       http://blog.csdn.net/desilting/article/details/22872839

Start ZooKeeper, Kafka and Storm.

Create the topic:

        bin/kafka-topics.sh --create --zookeeper 192.168.40.132:2181 --replication-factor 3 --partitions 1 --topic  mykafka

Describe the topic:

       bin/kafka-topics.sh --describe --zookeeper 192.168.40.132:2181

       Topic: mykafka    PartitionCount: 1    ReplicationFactor: 3    Configs:

       Topic: mykafka    Partition: 0    Leader: 134    Replicas: 133,134,132    Isr: 134,133,132

partition: a topic can have multiple partitions, and its messages are spread across them; the point is to increase parallelism (the producer sketch below shows how a message key selects a partition).
leader: the broker that handles all reads and writes for a given partition; any broker may become the leader for some partition.
replicas: the brokers on which the partition is replicated, regardless of whether those brokers are alive.
isr: the replicas that are currently alive (in sync).
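
To make the key-to-partition relationship concrete, here is a minimal standalone sketch (not part of the original post) using the same Kafka 0.8 producer API that the KafkaSink below uses. The broker address and topic come from this article; the key "user-42" is just an illustrative value:

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class PartitionDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "192.168.40.134:9092");
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("request.required.acks", "1");

            Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
            // When the topic has multiple partitions, the key ("user-42" here) is hashed to pick one,
            // so messages sharing a key stay in order within a partition while different keys
            // can be consumed in parallel. With a single partition (as created above) everything
            // lands in partition 0.
            producer.send(new KeyedMessage<String, String>("mykafka", "user-42", "hello partition"));
            producer.close();
        }
    }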
Start Flume:

       bin/flume-ng agent --conf conf --conf-file conf/flume-conf.properties --name producer -Dflume.root.logger=INFO,console


The KafkaSink class:

    package org.apache.flume.plugins;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    import java.util.Map;
    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    import org.apache.flume.Context;
    import org.apache.flume.Channel;
    import org.apache.flume.Event;
    import org.apache.flume.Transaction;
    import org.apache.flume.conf.Configurable;
    import org.apache.flume.sink.AbstractSink;
    import com.google.common.base.Preconditions;
    import com.google.common.collect.ImmutableMap;

    public class KafkaSink extends AbstractSink implements Configurable {

        private Context context;
        private Properties parameters;
        private Producer<String, String> producer;

        private static final String PARTITION_KEY_NAME = "custom.partition.key";
        private static final String CUSTOME_TOPIC_KEY_NAME = "custom.topic.name";
        private static final String DEFAULT_ENCODING = "UTF-8";
        private static final Logger LOGGER = LoggerFactory.getLogger(KafkaSink.class);

        // Copy all sink properties from the Flume configuration into the Kafka producer properties.
        public void configure(Context context) {
            this.context = context;
            ImmutableMap<String, String> props = context.getParameters();
            this.parameters = new Properties();
            for (Map.Entry<String, String> entry : props.entrySet()) {
                this.parameters.put(entry.getKey(), entry.getValue());
            }
        }

        @Override
        public synchronized void start() {
            super.start();
            ProducerConfig config = new ProducerConfig(this.parameters);
            this.producer = new Producer<String, String>(config);
        }

        public Status process() {
            Status status = null;
            Channel channel = getChannel();
            Transaction transaction = channel.getTransaction();

            try {
                transaction.begin();
                Event event = channel.take();
                if (event != null) {
                    String partitionKey = (String) parameters.get(PARTITION_KEY_NAME);
                    String topic = Preconditions.checkNotNull((String) this.parameters.get(CUSTOME_TOPIC_KEY_NAME),
                            "topic name is required");
                    String eventData = new String(event.getBody(), DEFAULT_ENCODING);
                    // Send a keyed message when a partition key is configured, otherwise let Kafka pick the partition.
                    KeyedMessage<String, String> data = (partitionKey == null || partitionKey.isEmpty())
                            ? new KeyedMessage<String, String>(topic, eventData)
                            : new KeyedMessage<String, String>(topic, partitionKey, eventData);
                    LOGGER.info("Sending Message to Kafka : [" + topic + ":" + eventData + "]");
                    producer.send(data);
                    transaction.commit();
                    LOGGER.info("Send message success");
                    status = Status.READY;
                } else {
                    // Nothing in the channel: roll back and back off before polling again.
                    transaction.rollback();
                    status = Status.BACKOFF;
                }
            } catch (Exception e) {
                LOGGER.error("Send message failed!", e);
                transaction.rollback();
                status = Status.BACKOFF;
            } finally {
                transaction.close();
            }
            return status;
        }

        @Override
        public void stop() {
            producer.close();
            super.stop();
        }
    }


For the KafkaSpout I referred to someone else's code:

https://github.com/HolmesNL/kafka-spout

Some modifications were needed.
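
For reference, a rough sketch of a spout with the constructor signature used in the test topology below (topic, consumer group, ZooKeeper connect string), built on Kafka 0.8's high-level consumer, might look like this. It is only an illustration of the idea; the actual HolmesNL spout additionally buffers messages and handles ack/fail and offset management, which this sketch omits:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.ConsumerTimeoutException;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    import backtype.storm.spout.SpoutOutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichSpout;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Values;

    public class KafkaSpout extends BaseRichSpout {

        private final String topic;
        private final String groupId;
        private final String zkConnect;
        private transient ConsumerConnector consumer;
        private transient ConsumerIterator<byte[], byte[]> iterator;
        private transient SpoutOutputCollector collector;

        public KafkaSpout(String topic, String groupId, String zkConnect) {
            this.topic = topic;
            this.groupId = groupId;
            this.zkConnect = zkConnect;
        }

        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
            Properties props = new Properties();
            props.put("zookeeper.connect", zkConnect);
            props.put("group.id", groupId);
            // Make hasNext() return instead of blocking forever, so nextTuple() stays responsive.
            props.put("consumer.timeout.ms", "100");
            consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
            topicCountMap.put(topic, 1);
            Map<String, List<KafkaStream<byte[], byte[]>>> streams = consumer.createMessageStreams(topicCountMap);
            iterator = streams.get(topic).get(0).iterator();
        }

        public void nextTuple() {
            try {
                if (iterator.hasNext()) {
                    // Emit each Kafka message body as a single-field tuple named "word".
                    collector.emit(new Values(new String(iterator.next().message(), "UTF-8")));
                }
            } catch (ConsumerTimeoutException e) {
                // No message available right now; Storm will call nextTuple() again.
            } catch (java.io.UnsupportedEncodingException e) {
                throw new RuntimeException(e);
            }
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }

        @Override
        public void close() {
            if (consumer != null) {
                consumer.shutdown();
            }
        }
    }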


The Storm test topology:

    import java.util.Map;

    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.StormSubmitter;
    import backtype.storm.metric.LoggingMetricsConsumer;
    import backtype.storm.metric.api.CountMetric;
    import backtype.storm.metric.api.MeanReducer;
    import backtype.storm.metric.api.MultiCountMetric;
    import backtype.storm.metric.api.ReducedMetric;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;
    import backtype.storm.utils.Utils;
    // KafkaSpout is the spout adapted from https://github.com/HolmesNL/kafka-spout (see above).

    public class ExclamationTopology {

      public static class ExclamationBolt extends BaseRichBolt {
        OutputCollector _collector;
        transient CountMetric _countMetric;
        transient MultiCountMetric _wordCountMetric;
        transient ReducedMetric _wordLengthMeanMetric;

        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
          _collector = collector;
          initMetrics(context);
        }

        public void execute(Tuple tuple) {
          // Append "!!!" to the incoming word, anchor the new tuple to the input, and ack it.
          _collector.emit(tuple, new Values(tuple.getString(0) + "!!!"));
          _collector.ack(tuple);
          updateMetrics(tuple.getString(0));
        }

        void updateMetrics(String word) {
          _countMetric.incr();
          _wordCountMetric.scope(word).incr();
          _wordLengthMeanMetric.update(word.length());
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
          declarer.declare(new Fields("word"));
        }

        void initMetrics(TopologyContext context) {
          _countMetric = new CountMetric();
          _wordCountMetric = new MultiCountMetric();
          _wordLengthMeanMetric = new ReducedMetric(new MeanReducer());

          // Report execute_count every 5 seconds, the other two metrics every 60 seconds.
          context.registerMetric("execute_count", _countMetric, 5);
          context.registerMetric("word_count", _wordCountMetric, 60);
          context.registerMetric("word_length", _wordLengthMeanMetric, 60);
        }
      }

      public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        String topic = args.length == 2 ? args[1] : args[0];
        KafkaSpout kafkaSpout = new KafkaSpout(topic, "testKafkaGroup", "192.168.40.132:2181");

        builder.setSpout("word", kafkaSpout, 10);
        builder.setBolt("exclaim1", new ExclamationBolt(), 3).shuffleGrouping("word");
        builder.setBolt("exclaim2", new ExclamationBolt(), 2).shuffleGrouping("exclaim1");

        Config conf = new Config();
        conf.setDebug(true);
        // Write the registered metrics to logs/metrics.log via Storm's built-in consumer (parallelism 2).
        conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 2);

        if (args != null && args.length == 2) {
          conf.setNumWorkers(3);
          StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
          LocalCluster cluster = new LocalCluster();
          cluster.submitTopology("test", conf, builder.createTopology());
          Utils.sleep(10000);
          cluster.killTopology("test");
          cluster.shutdown();
        }
      }
    }

Test results:

telnet 192.168.40.134 44444

Type some arbitrary characters (e.g. ASDF) at the telnet prompt:

    ASDF
    OK
    ASD
    OK
    F
    OK
    ASDF
    OK

Output on the Flume side:

    2014-04-08 01:30:44,379 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.plugins.KafkaSink.process(KafkaSink.java:77)] Sending Message to Kafka : [mykafka:ASDF
    2014-04-08 01:30:44,387 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.plugins.KafkaSink.process(KafkaSink.java:80)] Send message success
    2014-04-08 01:30:44,604 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.plugins.KafkaSink.process(KafkaSink.java:77)] Sending Message to Kafka : [mykafka:ASD
    2014-04-08 01:30:44,611 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.plugins.KafkaSink.process(KafkaSink.java:80)] Send message success
    2014-04-08 01:30:44,794 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.plugins.KafkaSink.process(KafkaSink.java:77)] Sending Message to Kafka : [mykafka:F
    2014-04-08 01:30:44,799 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.plugins.KafkaSink.process(KafkaSink.java:80)] Send message success
    2014-04-08 01:30:45,038 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.plugins.KafkaSink.process(KafkaSink.java:77)] Sending Message to Kafka : [mykafka:ASDF

Finally, in Storm's logs/metrics.log file you will find records like this:

    2014-04-08 01:30:28,446 495106 1396945828 ubun
