1. Requirement: listen on a TCP/UDP port 41414 and print the received data to the console.

    # example.conf: A single-node Flume configuration

    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = netcat
    a1.sources.r1.bind = 0.0.0.0
    a1.sources.r1.port = 41414

    # Describe the sink
    a1.sinks.k1.type = logger

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity =
    a1.channels.c1.transactionCapacity =

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

Startup command:

    bin/flume-ng agent --conf conf/ --conf-file conf/one.conf --name a1 -Dflume.root.logger=INFO,console &

Telnet:

    root@Ubuntu-:~# telnet 0.0.0.0 41414
    Trying 0.0.0.0...
    Connected to 0.0.0.0.
    Escape character is '^]'.
    huxing
    OK

Result:

2. Requirement: ship the log file access.log from machine A to machine B and print it to the console on B.

Here I assume machine A is 131 and machine B is 132: the configuration file needs to be written on 132, which is started as a normal Flume agent, while 131 only needs to run avro-client, which pushes the file to 132 over Avro serialization.

Configuration file on 132:

    # example.conf: A single-node Flume configuration

    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = avro
    a1.sources.r1.bind = 0.0.0.0
    a1.sources.r1.port =

    # Describe the sink
    a1.sinks.k1.type = logger

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity =
    a1.channels.c1.transactionCapacity =

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

Start Flume on 132:

    bin/flume-ng agent --conf conf/ --conf-file conf/two.conf --name a1 -Dflume.root.logger=INFO,console &

Start avro-client on 131:

    bin/flume-ng avro-client --host 192.168.22.132 --port --filename logs/avro.log

Check the console on 132:

Success.

3. Requirement: monitor a log file access.log and print newly appended lines to the console as they arrive. What if the file is very large? The heap?

Configuration:

    # example.conf: A single-node Flume configuration

    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /opt/logs/access.log

    # Describe the sink
    a1.sinks.k1.type = logger

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity =
    a1.channels.c1.transactionCapacity =

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

Startup command:

    bin/flume-ng agent --conf conf/ --conf-file conf/three.conf --name a1 -Dflume.root.logger=INFO,console &

Append data to the monitored file:

    root@Ubuntu-:/usr/local/apache-flume/logs# cat hu.log >> avro.log

Success.

----------------------------------------------------------------------------------

What if the file is very large?

-- Uncomment the relevant lines in the configuration file (see the sketch below).
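The post does not say which lines are meant; given the "heap?" in the question, presumably this refers to the commented-out JAVA_OPTS setting in conf/flume-env.sh, which controls the agent's JVM heap. A minimal sketch, with values that are illustrative assumptions rather than taken from the original:

    # conf/flume-env.sh -- uncomment and enlarge the agent JVM heap so it can
    # keep up with a large, fast-growing file (values below are assumptions)
    export JAVA_OPTS="-Xms512m -Xmx1024m"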

4. Requirement: aggregate access.log from machines A and B onto machine C, then collect it into HDFS, stored in per-day directories.

Write the file four_avro_sink.conf on 132 and 135:

    # example.conf: A single-node Flume configuration

    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /usr/local/apache-flume/logs/avro.log

    # Describe the sink
    a1.sinks.k1.type = avro
    a1.sinks.k1.hostname = 192.168.22.131
    a1.sinks.k1.port =

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity =
    a1.channels.c1.transactionCapacity =

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

In other words, the exec source keeps tailing the newest data into the channel, and the avro sink serializes the events and sends them to the avro source on 131.

Start Flume:

    root@Ubuntu-:/usr/local/apache-flume# bin/flume-ng agent --conf conf/ --conf-file conf/four_avro_sink.conf --name a1 -Dflume.root.logger=INFO,console &

Write the file four.conf on 131:

    # Define the agent name and the names of the source, channel and sink
    access.sources = r1
    access.channels = c1
    access.sinks = k1

    # Define the source
    access.sources.r1.type = avro
    access.sources.r1.bind = 0.0.0.0
    access.sources.r1.port =

    # Define the channel
    access.channels.c1.type = memory
    access.channels.c1.capacity =
    access.channels.c1.transactionCapacity =

    # Define an interceptor that adds a timestamp header to every event
    access.sources.r1.interceptors = i1
    access.sources.r1.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

    # Define the sink
    access.sinks.k1.type = hdfs
    access.sinks.k1.hdfs.path = hdfs://Ubuntu-1:9000/%Y%m%d
    access.sinks.k1.hdfs.filePrefix = events-
    access.sinks.k1.hdfs.fileType = DataStream
    #access.sinks.k1.hdfs.fileType = CompressedStream
    #access.sinks.k1.hdfs.codeC = gzip
    # Do not roll files by event count
    access.sinks.k1.hdfs.rollCount =
    # Roll a new file on HDFS when the current one reaches 64 MB
    access.sinks.k1.hdfs.rollSize =
    access.sinks.k1.hdfs.rollInterval =

    # Wire the source, channel and sink together
    access.sources.r1.channels = c1
    access.sinks.k1.channel = c1

Start Hadoop:

    root@Ubuntu-:/usr/local/hadoop-2.6.# sbin/start-dfs.sh

Start Flume:

    root@Ubuntu-:/usr/local/apache-flume# bin/flume-ng agent --conf conf/ --conf-file conf/four.conf --name access -Dflume.root.logger=INFO,console &
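To check that the per-day directories are being created, the HDFS root can be listed (a usage sketch, assuming the NameNode at hdfs://Ubuntu-1:9000 from the config above is running; the date shown is only an example):

    root@Ubuntu-:/usr/local/hadoop-2.6.# bin/hdfs dfs -ls /
    # expect one directory per day, e.g. /20170820, containing files prefixed with events-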

5. Requirement: aggregate access.log, ugcheader.log and ugctail.log from machines A and B onto machine C, then collect them into different directories on HDFS.

In four.conf on 131, change the HDFS path to:

    access.sinks.k1.hdfs.path = hdfs://Ubuntu-1:9000/%{type}/%Y%m%d

And the configuration file on 132:

    # example.conf: A single-node Flume configuration

    # Name the components on this agent
    a1.sources = r1 r2 r3
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /usr/local/apache-flume/logs/avro.log
    a1.sources.r1.interceptors = i1
    a1.sources.r1.interceptors.i1.type = static
    a1.sources.r1.interceptors.i1.key = type
    a1.sources.r1.interceptors.i1.value = access

    a1.sources.r2.type = exec
    a1.sources.r2.command = tail -F /usr/local/apache-flume/logs/flume.log
    a1.sources.r2.interceptors = i2
    a1.sources.r2.interceptors.i2.type = static
    a1.sources.r2.interceptors.i2.key = type
    a1.sources.r2.interceptors.i2.value = ugchead

    a1.sources.r3.type = exec
    a1.sources.r3.command = tail -F /usr/local/apache-flume/logs/hu.log
    a1.sources.r3.interceptors = i3
    a1.sources.r3.interceptors.i3.type = static
    a1.sources.r3.interceptors.i3.key = type
    a1.sources.r3.interceptors.i3.value = ugctail

    # Describe the sink
    a1.sinks.k1.type = avro
    a1.sinks.k1.hostname = 192.168.22.131
    a1.sinks.k1.port =

    #a1.sinks.k1.type = logger

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity =
    a1.channels.c1.transactionCapacity =

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sources.r2.channels = c1
    a1.sources.r3.channels = c1
    a1.sinks.k1.channel = c1

6. Requirement: after collecting access.log, send it to multiple destinations at once (print to the console and write to HDFS).

On 131:

    # Define the agent name and the names of the sources, channels and sinks
    access.sources = r1
    access.channels = c1 c2
    access.sinks = k1 k2

    # Define the source
    access.sources.r1.type = avro
    access.sources.r1.bind = 0.0.0.0
    access.sources.r1.port =

    # Define the channels
    access.channels.c1.type = memory
    access.channels.c1.capacity =
    access.channels.c1.transactionCapacity =

    access.channels.c2.type = memory
    access.channels.c2.capacity =
    access.channels.c2.transactionCapacity =

    # The key point is this second sink k2, which prints to the console
    access.sinks.k2.type = logger

    # Define an interceptor that adds a timestamp header to every event
    access.sources.r1.interceptors = i1
    access.sources.r1.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

    # Define the HDFS sink
    access.sinks.k1.type = hdfs
    access.sinks.k1.hdfs.path = hdfs://Ubuntu-1:9000/source/%{type}/%Y%m%d
    access.sinks.k1.hdfs.filePrefix = events-
    access.sinks.k1.hdfs.fileType = DataStream
    #access.sinks.k1.hdfs.fileType = CompressedStream
    #access.sinks.k1.hdfs.codeC = gzip
    # Do not roll files by event count
    access.sinks.k1.hdfs.rollCount =
    # Roll a new file on HDFS when the current one reaches 64 MB
    access.sinks.k1.hdfs.rollSize =
    access.sinks.k1.hdfs.rollInterval =

    # Wire the sources, channels and sinks together
    access.sources.r1.channels = c1 c2
    access.sinks.k1.channel = c1
    access.sinks.k2.channel = c2

132 still uses the configuration from question 5 above.
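Because r1 lists both c1 and c2, Flume's default replicating channel selector copies every event into both channels, which is what lets k1 and k2 each receive the full stream. Written out explicitly (this line is not in the original config; it only restates the default behaviour):

    access.sources.r1.selector.type = replicating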

7. Requirement: log from a program into Flume, routing to different destinations (console, avro) depending on the business logger, and inspect the log4j headers of the resulting events.

pom file:

    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>

        <groupId>cn.hx</groupId>
        <artifactId>FlumeSource</artifactId>
        <version>1.0-SNAPSHOT</version>
        <packaging>jar</packaging>

        <name>FlumeSource</name>
        <url>http://maven.apache.org</url>

        <properties>
            <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
            <maven.compiler.source>1.8</maven.compiler.source>
            <maven.compiler.target>1.8</maven.compiler.target>
        </properties>

        <build>
            <pluginManagement>
                <plugins>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-jar-plugin</artifactId>
                        <configuration>
                            <archive>
                                <manifest>
                                    <mainClass>cn.hx.test</mainClass>
                                    <addClasspath>true</addClasspath>
                                    <classpathPrefix>lib/</classpathPrefix>
                                </manifest>
                            </archive>
                            <classesDirectory>
                            </classesDirectory>
                        </configuration>
                    </plugin>
                </plugins>
            </pluginManagement>
        </build>

        <dependencies>
            <dependency>
                <groupId>junit</groupId>
                <artifactId>junit</artifactId>
                <version>3.8.</version>
                <scope>test</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-common</artifactId>
                <version>2.6.</version>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-client</artifactId>
                <version>2.6.</version>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-hdfs</artifactId>
                <version>2.6.</version>
            </dependency>
            <dependency>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
                <version>1.2.</version>
            </dependency>
        </dependencies>
    </project>
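Note that the log4j.properties below references org.apache.flume.clients.log4jappender.Log4jAppender, which none of the dependencies above provide. A hedged sketch of the missing dependency (the version is an assumption and should be aligned with the installed Flume):

    <dependency>
        <groupId>org.apache.flume.flume-ng-clients</groupId>
        <artifactId>flume-ng-log4jappender</artifactId>
        <!-- version assumed; match it to the Flume installation -->
        <version>1.6.0</version>
    </dependency>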

log4j.properties file:

    ##<!-- ============== Notes on the custom output pattern ============== -->
    ##<!-- %p  priority of the log event: DEBUG, INFO, WARN, ERROR, FATAL -->
    ##<!-- %r  milliseconds elapsed since the application started -->
    ##<!-- %c  category of the event, usually the fully qualified class name -->
    ##<!-- %t  name of the thread that produced the event -->
    ##<!-- %n  line separator: "\r\n" on Windows, "\n" on Unix -->
    ##<!-- %d  date/time of the event, ISO8601 by default; a format can be given, e.g. %d{yyyy MMM dd HH:mm:ss,SSS} -->
    ##<!-- %l  location of the event: class, thread and line number, e.g. Testlog4.main(TestLog4.java:) -->
    ##<!-- ================================================================= -->

    ### set log levels ###

    # default logger
    # INFO means only events at INFO level or above go to the stdout appender: ERROR, WARN, INFO
    log4j.rootLogger=INFO,stdout1

    # custom loggers

    #log4j.logger.accessLogger=INFO,flume
    #log4j.logger.ugcLogger=INFO,flume

    log4j.logger.std1Logger=INFO,stdout1
    log4j.logger.std2Logger=INFO,stdout2

    log4j.logger.access=INFO,flume

    log4j.logger.ugchead=INFO,flume
    log4j.logger.ugctail=INFO,flume

    # appender for a specific package
    #log4j.logger.com.zenith.flume = INFO,flume

    ### flume ###
    log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
    log4j.appender.flume.layout=org.apache.log4j.PatternLayout
    log4j.appender.flume.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %c{} [%p] %m%n
    log4j.appender.flume.Hostname=192.168.22.131
    log4j.appender.flume.Port=
    log4j.appender.flume.UnsafeMode = true

    ### stdout ###
    log4j.appender.stdout1=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout1.Threshold=DEBUG
    log4j.appender.stdout1.Target=System.out
    log4j.appender.stdout1.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout1.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %c{} [%p] %m%n

    ### stdout ###
    log4j.appender.stdout2=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout2.Threshold=DEBUG
    log4j.appender.stdout2.Target=System.out
    log4j.appender.stdout2.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout2.layout.ConversionPattern=%d{yyyy-MM-dd hh:mm:ss} %c{} [%p] %m%n

    ### access ###
    log4j.appender.access=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.access.Threshold=INFO
    log4j.appender.access.File=/usr/local/apache-flume/logs/avro.log
    log4j.appender.access.Append=true
    log4j.appender.access.DatePattern='.'yyyy-MM-dd
    log4j.appender.access.layout=org.apache.log4j.PatternLayout
    log4j.appender.access.layout.ConversionPattern=%m%n

    ### ugchead ###
    log4j.appender.ugchead=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.ugchead.Threshold=INFO
    log4j.appender.ugchead.File=/usr/local/apache-flume/logs/flume.log
    log4j.appender.ugchead.Append=true
    log4j.appender.ugchead.DatePattern='.'yyyy-MM-dd
    log4j.appender.ugchead.layout=org.apache.log4j.PatternLayout
    log4j.appender.ugchead.layout.ConversionPattern=%m%n

    ### ugctail ###
    log4j.appender.ugctail=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.ugctail.Threshold=INFO
    log4j.appender.ugctail.File=/usr/local/apache-flume/logs/hu.log
    log4j.appender.ugctail.Append=true
    log4j.appender.ugctail.DatePattern='.'yyyy-MM-dd
    log4j.appender.ugctail.layout=org.apache.log4j.PatternLayout
    log4j.appender.ugctail.layout.ConversionPattern=%m%n

Program:

    package cn.hx;

    import org.apache.log4j.BasicConfigurator;
    import org.apache.log4j.Logger;

    /**
     * Created by hushiwei on 2017/8/20.
     */
    public class test {
        // loggers named "access" and "ugchead" match the custom loggers in log4j.properties
        protected static final Logger loggeraccess = Logger.getLogger("access");

        protected static final Logger loggerugc = Logger.getLogger("ugchead");

        public static void main(String[] args) throws Exception {
            BasicConfigurator.configure();

            while (true) {
                loggeraccess.info("this is acccess log");
                loggerugc.info("ugc");
                //KafkaUtil util=new KafkaUtil();
                //util.initProducer();
                //util.produceData("crxy","time",String.valueOf(new Date().getTime()));
                Thread.sleep(1000); // interval lost in the original; 1000 ms assumed
            }
        }
    }

Run on 131:

    root@Ubuntu-:/usr/local/apache-flume# bin/flume-ng agent --conf conf/ --conf-file conf/avro_source.conf --name agent1 -Dflume.root.logger=INFO,console &

The avro_source.conf file is the one used in an earlier question above.

After building the jar, run it on 131.

But it fails with an error that I have not resolved:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Logger

Caused by: java.lang.ClassNotFoundException: org.apache.log4j.Logger
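A likely cause (not verified in the original post) is that log4j is missing from the runtime classpath when the jar is launched directly, since the manifest only points to a lib/ prefix. A sketch of launching the main class with the Flume libraries on the classpath instead; the jar location is an assumption:

    root@Ubuntu-:/usr/local/apache-flume# java -cp "FlumeSource-1.0-SNAPSHOT.jar:lib/*" cn.hx.test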

8. Requirement: collect access.log on machine A and send it to B and C with load balancing, printing to the console (load_balance).

On 132 and 135:

Use avro_source.conf as the configuration file.

Start:

    root@Ubuntu-:/usr/local/apache-flume# bin/flume-ng agent --conf conf/ --conf-file conf/avro_source.conf --name agent1 -Dflume.root.logger=INFO,console &

On 131:

    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1 k2
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = exec
    a1.sources.r1.channels=c1
    a1.sources.r1.command=tail -F /usr/local/apache-flume/logs/xing.log

    #define sinkgroups
    a1.sinkgroups=g1
    a1.sinkgroups.g1.sinks=k1 k2
    a1.sinkgroups.g1.processor.type=load_balance
    a1.sinkgroups.g1.processor.backoff=true
    a1.sinkgroups.g1.processor.selector=round_robin

    #define the sink 1
    a1.sinks.k1.type=avro
    a1.sinks.k1.hostname=192.168.22.132
    a1.sinks.k1.port=

    #define the sink 2
    a1.sinks.k2.type=avro
    a1.sinks.k2.hostname=192.168.22.135
    a1.sinks.k2.port=

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity =
    a1.channels.c1.transactionCapacity =

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1
    a1.sinks.k2.channel=c1

Start:

    root@Ubuntu-:/usr/local/apache-flume# bin/flume-ng agent --conf conf/ --conf-file conf/eight.conf --name a1 -Dflume.root.logger=INFO,console &
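To generate traffic for the round-robin test, lines can be appended to the tailed file on 131 (a small usage sketch based on the path in the config above):

    root@Ubuntu-:/usr/local/apache-flume# echo "load-balance test line" >> logs/xing.log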

On 131:

On 132:

On 135:

9. Requirement: collect access.log on machine A and send it to B and C with failover, printing to the console (failover).

On 132 and 135, start Flume with the avro_source.conf configuration.

On 131, use:

    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1 k2
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = exec
    a1.sources.r1.channels=c1
    a1.sources.r1.command=tail -F /usr/local/apache-flume/logs/xing.log

    #define sinkgroups
    a1.sinkgroups=g1
    a1.sinkgroups.g1.sinks=k1 k2
    a1.sinkgroups.g1.processor.type=failover
    a1.sinkgroups.g1.processor.priority.k1=
    a1.sinkgroups.g1.processor.priority.k2=
    a1.sinkgroups.g1.processor.maxpenalty=

    #define the sink 1
    a1.sinks.k1.type=avro
    a1.sinks.k1.hostname=192.168.22.132
    a1.sinks.k1.port=

    #define the sink 2
    a1.sinks.k2.type=avro
    a1.sinks.k2.hostname=192.168.22.135
    a1.sinks.k2.port=

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity =
    a1.channels.c1.transactionCapacity =

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1
    a1.sinks.k2.channel=c1
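The priority and maxpenalty values were lost above. With the failover sink processor, the sink with the higher priority number receives all events while it is healthy, and the other sink takes over when it fails. A sketch with assumed values (not from the original):

    a1.sinkgroups.g1.processor.priority.k1 = 10
    a1.sinkgroups.g1.processor.priority.k2 = 5
    a1.sinkgroups.g1.processor.maxpenalty = 10000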

Start 131:

    root@Ubuntu-:/usr/local/apache-flume# bin/flume-ng agent --conf conf/ --conf-file conf/nine.conf --name a1 -Dflume.root.logger=INFO,console &

Check:

After shutting down Flume on 132:

Once 132 goes down, the data fails over directly to 135:
