Installing Flink and running WordCount

1. Download

http://mirror.bit.edu.cn/apache/flink/

2. Install

Make sure Java 8 or later is already installed.
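The requirement above can also be checked from Java itself. The following is a minimal sketch; the class name `JavaVersionCheck` and the helper `majorVersion` are hypothetical names chosen for illustration. It only mirrors the "Java 8 or later" condition stated above and does not check anything Flink-specific.

```java
public class JavaVersionCheck {

    /**
     * Extracts the major version from a java.specification.version string.
     * Java 8 and earlier report "1.8", "1.7", ...; Java 9 and later report "9", "11", ...
     */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Old scheme: "1.x" means major version x.
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        int major = majorVersion(spec);
        // The threshold 8 comes from the requirement stated in this post.
        System.out.println("Java " + major + (major >= 8 ? " is new enough" : " is too old"));
    }
}
```

Running `java -version` on the command line gives the same information; the sketch is only convenient when the check has to live inside a Java program.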

Unpack Flink:

tar zxvf flink-1.8.0-bin-scala_2.11.tgz

Start Flink in local mode:

$ ./bin/start-cluster.sh  # Start Flink

[hadoop@bigdata-senior01 flink-1.8.0]$ ./bin/start-cluster.sh

Starting cluster.

Starting standalonesession daemon on host bigdata-senior01.home.com.

Starting taskexecutor daemon on host bigdata-senior01.home.com.

[hadoop@bigdata-senior01 flink-1.8.0]$ jps

1995 StandaloneSessionClusterEntrypoint

2443 TaskManagerRunner

2526 Jps

3. Open the Flink web UI

http://localhost:8081
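Besides opening the page in a browser, the same port can be probed from code. The sketch below assumes the cluster's REST API is served on the same address as the web UI and exposes a `/overview` endpoint that returns JSON; the class name `FlinkOverviewCheck` and the method `fetch` are hypothetical names for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class FlinkOverviewCheck {

    /** Performs an HTTP GET and returns the response body; fails on any status other than 200. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try {
            int status = conn.getResponseCode();
            if (status != 200) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            StringBuilder body = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    body.append(line);
                }
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the base address is the one used in this post.
        String base = args.length > 0 ? args[0] : "http://localhost:8081";
        System.out.println(fetch(base + "/overview"));
    }
}
```

If the cluster is not running, the call fails with a connection error, which is a quick way to tell a startup problem from a browser problem.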

4. The first program: WordCount. It reads strings from a socket stream and counts how often each word occurs within 10-second windows.

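4.1 Add the dependencies

Note: the artifacts below carry the Scala 2.12 suffix, while the distribution unpacked above is the Scala 2.11 build. It is safer to keep the two consistent.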

    <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_2.12</artifactId>
            <version>1.8.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_2.12</artifactId>
            <version>1.8.0</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>

4.2 Code

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class SocketWindowWordCount {

    public static void main(String[] args) throws Exception {
        // the host and the port to connect to
        final String hostname;
        final int port;
        try {
            final ParameterTool params = ParameterTool.fromArgs(args);
            hostname = params.has("hostname") ? params.get("hostname") : "localhost";
            // getInt throws if --port is missing, which lands in the catch block below
            port = params.getInt("port");
        } catch (Exception e) {
            e.printStackTrace();
            System.err.println(e.getMessage());
            System.err.println("No port specified. Please run 'SocketWindowWordCount " +
                    "--hostname <hostname> --port <port>', where hostname (localhost by default) " +
                    "and port is the address of the text server");
            System.err.println("To start a simple text server, run 'netcat -l <port>' and " +
                    "type the input text into the command line");
            return;
        }

        // get the execution environment
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // get input data by connecting to the socket; records are delimited by newlines
        DataStream<String> text = env.socketTextStream(hostname, port, "\n");

        // parse the data, group it, window it, and aggregate the counts
        DataStream<WordWithCount> windowCounts = text
                // split each line on whitespace and emit (word, 1) for every token
                .flatMap(new FlatMapFunction<String, WordWithCount>() {
                    @Override
                    public void flatMap(String value, Collector<WordWithCount> out) throws Exception {
                        for (String word : value.split("\\s")) {
                            out.collect(new WordWithCount(word, 1L));
                        }
                    }
                })
                // group by the "word" field of the POJO
                .keyBy("word")
                // collect each key's records into 10-second windows
                .timeWindow(Time.seconds(10))
                // sum the counts of records that share a word within a window
                .reduce(new ReduceFunction<WordWithCount>() {
                    @Override
                    public WordWithCount reduce(WordWithCount value1, WordWithCount value2) throws Exception {
                        return new WordWithCount(value1.word, value1.count + value2.count);
                    }
                });

        // print the results with a single thread, rather than in parallel
        windowCounts.print().setParallelism(1);

        env.execute("Socket Window WordCount");
    }

    /**
     * Data type for words with count.
     * Public fields and a no-argument constructor let Flink treat it as a POJO,
     * which keyBy("word") relies on.
     */
    public static class WordWithCount {
        public String word;
        public long count;

        public WordWithCount() {
        }

        public WordWithCount(String word, long count) {
            this.word = word;
            this.count = count;
        }

        @Override
        public String toString() {
            return word + " : " + count;
        }
    }
}

4.3 Build the jar and upload it

First start nc so that it listens on the port and accepts connections:

nc -lk 9000
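Where nc is not available, a small Java program can play the same role. The following is a minimal sketch, not a replacement for nc: the class name `SimpleTextServer` and the method `serve` are hypothetical, the default port 9000 is taken from the command above, and unlike `nc -lk` it serves a single client and then exits.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class SimpleTextServer {

    /** Accepts one client and forwards every line read from input to it, terminated by "\n". */
    static void serve(ServerSocket server, BufferedReader input) throws IOException {
        try (Socket client = server.accept();
             Writer out = new BufferedWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = input.readLine()) != null) {
                out.write(line);
                // Always send "\n", the delimiter the WordCount job is configured with.
                out.write('\n');
                out.flush();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 9000;
        try (ServerSocket server = new ServerSocket(port);
             BufferedReader stdin = new BufferedReader(
                     new InputStreamReader(System.in, StandardCharsets.UTF_8))) {
            System.out.println("Listening on port " + port + "; type text and press Enter");
            serve(server, stdin);
        }
    }
}
```

Start it before submitting the job, then type lines into its console exactly as with nc.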

Start SocketWindowWordCount:

[hadoop@bigdata-senior01 bin]$ ./flink run /home/hadoop/SocketWindowWordCount.jar --port 9000

Check the output:

[root@bigdata-senior01 log]# tail -f flink-hadoop-taskexecutor-0-bigdata-senior01.home.com.out

Type strings into the nc session. The tailed log then shows the aggregated counts once per 10-second window.
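To make the per-window aggregation concrete, here is a plain-Java sketch of the counting logic without any Flink dependency. It is an illustration under stated assumptions, not Flink's implementation: the class `TumblingWindowSketch` and its methods are hypothetical, timestamps are non-negative milliseconds, and windows are assumed to be aligned to multiples of the window size.

```java
import java.util.Map;
import java.util.TreeMap;

public class TumblingWindowSketch {

    /** Start of the fixed-size window that contains the given timestamp (both in milliseconds). */
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    /**
     * Counts words per window; timestampsMs[i] is the arrival time of lines[i].
     * Returns window start -> (word -> count), sorted for stable printing.
     */
    static Map<Long, Map<String, Long>> countPerWindow(
            long[] timestampsMs, String[] lines, long windowSizeMs) {
        Map<Long, Map<String, Long>> result = new TreeMap<>();
        for (int i = 0; i < lines.length; i++) {
            long start = windowStart(timestampsMs[i], windowSizeMs);
            Map<String, Long> counts = result.computeIfAbsent(start, k -> new TreeMap<>());
            for (String word : lines[i].split("\\s+")) {
                if (!word.isEmpty()) {
                    // Same role as the reduce step: add 1 to the running count of this word.
                    counts.merge(word, 1L, Long::sum);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long[] times = {1_000L, 4_000L, 12_000L};
        String[] lines = {"hello flink", "hello world", "hello again"};
        // prints {0={flink=1, hello=2, world=1}, 10000={again=1, hello=1}}
        System.out.println(countPerWindow(times, lines, 10_000L));
    }
}
```

The first two lines arrive within the same 10-second window, so "hello" is counted twice there; the third line falls into the next window and starts a fresh count.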
