Using Hue

Hue is a web-based tool for analyzing data on Hadoop, open-sourced by Cloudera.

I. Build

1. Install the prerequisites on Ubuntu (the GitHub README is authoritative)

# JDK

# maven

# Other dependencies

$ sudo apt-get install git ant gcc g++ libffi-dev libkrb5-dev libmysqlclient-dev libsasl2-dev libsasl2-modules-gssapi-mit libsqlite3-dev libssl-dev libxml2-dev libxslt-dev make maven libldap2-dev python-dev python-setuptools libgmp3-dev
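
Before running the build it can help to confirm that the command-line tools from the package list are actually on PATH. The following is a minimal sketch, not part of Hue; the list of executable names (for example `mvn` for the maven package) is an assumption about what the packages above install.

```python
import shutil

# Executable names assumed to be provided by the packages installed above.
REQUIRED_TOOLS = ["git", "ant", "gcc", "g++", "make", "mvn"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the executables exist; it does not check the development libraries (the `-dev` packages).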

2. Build

$ make apps

II. Configuration

1. Basic configuration (section 3.1 of the official documentation)

secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn<qW5o

http_host=cen-ubuntu

http_port=8888

time_zone=Asia/Shanghai
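
The secret_key above is a sample value, so each installation should use its own random string. A minimal sketch for generating one is shown below; the length of 60 and the restriction to letters and digits are choices made here for illustration (they avoid characters such as `#` that an INI parser might treat specially), not requirements stated in this article.

```python
import secrets
import string

# Letters and digits only: an assumption made here to stay safe inside an INI file.
ALPHABET = string.ascii_letters + string.digits


def generate_secret_key(length=60):
    """Return a random string suitable for pasting after `secret_key=`."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print("secret_key=" + generate_secret_key())
```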

2. WebHDFS configuration

# hdfs-site.xml (the default is already true)

<property>

    <name>dfs.webhdfs.enabled</name>

    <value>true</value>

</property>

# core-site.xml: allow the hue user to act as a proxy user

<property>

    <name>hadoop.proxyuser.hue.hosts</name>

    <value>*</value>

</property>

<property>

    <name>hadoop.proxyuser.hue.groups</name>

    <value>*</value>

</property>

# hue.ini: three values to set (fs_defaultfs, webhdfs_url, hadoop_conf_dir); with HA, logical_name must be set as well

[hadoop]

  # Configuration for HDFS NameNode

  # ------------------------------------------------------------------------

  [[hdfs_clusters]]

    # HA support by using HttpFs

    [[[default]]]

      # Enter the filesystem uri

      fs_defaultfs=hdfs://cen-ubuntu:8020

      # NameNode logical name.

      ## logical_name=

      # Use WebHdfs/HttpFs as the communication mechanism.

      # Domain should be the NameNode or HttpFs host.

      # Default port is 14000 for HttpFs.

      webhdfs_url=http://cen-ubuntu:50070/webhdfs/v1

      # Change this if your HDFS cluster is Kerberos-secured

      ## security_enabled=false

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs

      # have to be verified against certificate authority

      ## ssl_cert_ca_verify=True

      # Directory of the Hadoop configuration

      hadoop_conf_dir=/opt/cdh5.3.6/hadoop-2.6.0-cdh5.12.0/etc/hadoop
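
The proxy-user settings in core-site.xml and the webhdfs_url value come together in the WebHDFS REST calls: a request is authenticated as one user (`user.name`) and can be executed on behalf of another (`doas`). The sketch below only builds such a URL so it can be pasted into a browser or curl; the helper name and the example path `/user/cen` are illustrative assumptions, and this is not Hue's own client code.

```python
from urllib.parse import quote, urlencode


def webhdfs_request_url(base_url, path, op, user, doas=None):
    """Build a WebHDFS REST URL for `op` on `path`.

    `user` is sent as user.name; `doas`, if given, is the user to impersonate,
    which only works when hadoop.proxyuser.<user>.* is configured.
    """
    params = {"op": op, "user.name": user}
    if doas:
        params["doas"] = doas
    return f"{base_url.rstrip('/')}{quote(path)}?{urlencode(params)}"


if __name__ == "__main__":
    print(webhdfs_request_url(
        "http://cen-ubuntu:50070/webhdfs/v1", "/user/cen", "LISTSTATUS",
        user="hue", doas="cen"))
```

If the URL returns a JSON directory listing, WebHDFS and the proxy-user configuration are both working; an authorization error that mentions impersonation usually points at the core-site.xml entries.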

3. YARN configuration

# hue.ini (inside the [hadoop] section)

[[yarn_clusters]]

  [[[default]]]

    # Enter the host on which you are running the ResourceManager

    resourcemanager_host=cen-ubuntu

    # The port where the ResourceManager IPC listens on

    resourcemanager_port=8032

    # Whether to submit jobs to this cluster

    submit_to=True

    # Resource Manager logical name (required for HA)

    ## logical_name=

    # Change this if your YARN cluster is Kerberos-secured

    ## security_enabled=false

    # URL of the ResourceManager API

    resourcemanager_api_url=http://cen-ubuntu:8088

    # URL of the ProxyServer API

    proxy_api_url=http://cen-ubuntu:8088

    # URL of the HistoryServer API

    history_server_api_url=http://cen-ubuntu:19888

    # URL of the Spark History Server

    ## spark_history_server_url=http://localhost:18088

    # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs

    # have to be verified against certificate authority

    ## ssl_cert_ca_verify=True
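
The two API URLs configured above can be checked independently of Hue, because the ResourceManager and the MapReduce JobHistory server each expose an info endpoint over REST. The following sketch derives those endpoints from the configured values; the function names are hypothetical, and `fetch_json` needs the cluster to be reachable.

```python
import json
from urllib.request import urlopen


def yarn_check_urls(resourcemanager_api_url, history_server_api_url):
    """Return the REST info endpoints for the configured YARN services."""
    return {
        "resourcemanager": resourcemanager_api_url.rstrip("/") + "/ws/v1/cluster/info",
        "history_server": history_server_api_url.rstrip("/") + "/ws/v1/history/info",
    }


def fetch_json(url, timeout=5.0):
    """Fetch `url` and decode the JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    urls = yarn_check_urls("http://cen-ubuntu:8088", "http://cen-ubuntu:19888")
    for name, url in urls.items():
        print(name, url)
```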

4. Temporary file directory

[filebrowser]

  # Location on local filesystem where the uploaded archives are temporary stored.

  archive_upload_tempdir=/tmp

5. Hive configuration (HiveServer2 must be running for Hue to use Hive)

# hive-site.xml

<!-- Host and port of HiveServer2 -->

<property>

  <name>hive.server2.thrift.port</name>

  <value>10000</value>

  <description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>

</property>

<property>

  <name>hive.server2.thrift.bind.host</name>

  <value>cen-ubuntu</value>

  <description>Bind host on which to run the HiveServer2 Thrift service.</description>

</property>

# Start HiveServer2

$ bin/hiveserver2 

# hive-site.xml

<!-- URI of the remote metastore; see the official Hive documentation -->

<property>

  <name>hive.metastore.uris</name>

  <value>thrift://cen-ubuntu:9083</value>

</property>

# Start the metastore server

$ hive --service metastore

# hue.ini

[beeswax]

  # Host where HiveServer2 is running.

  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).

  hive_server_host=cen-ubuntu

  # Port where HiveServer2 Thrift server runs on.

  hive_server_port=10000

  # Hive configuration directory, where hive-site.xml is located

  hive_conf_dir=/opt/cdh5.3.6/hive-1.1.0-cdh5.12.0/conf

  # Timeout in seconds for thrift calls to Hive service

  server_conn_timeout=120
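
Hue can only reach Hive if HiveServer2 (port 10000 above) and the metastore (port 9083 above) are actually listening. A small, generic TCP check such as the one below can confirm that before starting Hue. It is a sketch that only tests whether the port accepts connections; it does not speak the Thrift protocol.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in (("HiveServer2", 10000), ("metastore", 9083)):
        state = "open" if port_open("cen-ubuntu", port) else "closed"
        print(f"{name} port {port}: {state}")
```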

6. Database: connect to and manage relational databases (SQLite3 is the database bundled with Hue) (note: the comment marker in front of [[[xxx]]] must be removed)

###########################################################################

# Settings for the RDBMS application

###########################################################################

[librdbms]

  # The RDBMS app can have any number of databases configured in the databases

  # section. A database is known by its section name

  # (IE sqlite, mysql, psql, and oracle in the list below).

  [[databases]]

    # sqlite configuration.

    [[[sqlite]]]

      # Name to show in the UI.

      nice_name=SQLite

      # For SQLite, name defines the path to the database.

      name=/opt/cdh5.3.6/hue-3.9.0-cdh5.12.0/desktop/desktop.db

      # Database backend to use.

      engine=sqlite

      # Database options to send to the server when connecting.

      # https://docs.djangoproject.com/en/1.4/ref/databases/

      ## options={}

    # mysql, oracle, or postgresql configuration.

    [[[mysql]]]

      # Name to show in the UI.

      nice_name="My SQL DB"

      # For MySQL and PostgreSQL, name is the name of the database.

      # For Oracle, Name is instance of the Oracle server. For express edition

      # this is 'xe' by default.

      name=mysqldb

      # Database backend to use. This can be:

      # 1. mysql

      # 2. postgresql

      # 3. oracle

      engine=mysql

      # IP or hostname of the database to connect to.

      host=cen-ubuntu

      # Port the database server is listening to. Defaults are:

      # 1. MySQL: 3306

      # 2. PostgreSQL: 5432

      # 3. Oracle Express Edition: 1521

      port=3306

      # Username to authenticate with when connecting to the database.

      user=root

      # Password matching the username to authenticate with when

      # connecting to the database.

      password=ubuntu

      # Database options to send to the server when connecting.

      # https://docs.djangoproject.com/en/1.4/ref/databases/

      ## options={}
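
The note in the heading of this step, about removing the comment marker in front of [[[xxx]]], matters because hue.ini uses nested sections: a key belongs to the most recent section header that is not commented out. The toy parser below illustrates that rule. It is a simplified stand-in written for this explanation, not the parser Hue uses, and it ignores quoting and inline comments.

```python
import re

SECTION_RE = re.compile(r"^(\[+)\s*([^\[\]]+?)\s*(\]+)$")


def parse_nested_ini(text):
    """Parse [a] / [[b]] / [[[c]]] style sections into nested dicts."""
    root = {}
    stack = [root]  # stack[d] is the dict of the open section at depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments, including "## [[[name]]]"
        match = SECTION_RE.match(line)
        if match:
            depth = len(match.group(1))
            if depth != len(match.group(3)) or depth > len(stack):
                raise ValueError(f"bad section header: {line}")
            del stack[depth:]  # close deeper or sibling sections
            stack.append(stack[depth - 1].setdefault(match.group(2), {}))
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line}")
        stack[-1][key.strip()] = value.strip()
    return root


if __name__ == "__main__":
    commented = "[librdbms]\n[[databases]]\n## [[[sqlite]]]\nengine=sqlite\n"
    print(parse_nested_ini(commented))
    print(parse_nested_ini(commented.replace("## ", "")))
```

With the header commented out, `engine` lands directly under `databases`; once the marker is removed it lands under `databases -> sqlite`, which is where a per-database setting is expected.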

7. Oozie configuration

[liboozie]

  # The URL where the Oozie service runs on. This is required in order for

  # users to submit jobs. Empty value disables the config check.

  oozie_url=http://cen-ubuntu:11000/oozie

  # Requires FQDN in oozie_url if enabled

  ## security_enabled=false

  # Location on HDFS where the workflows/coordinator are deployed when submitted.

  remote_deployement_dir=/user/cen/examples/apps

  [oozie]

    # Location on local FS where the examples are stored.

    local_data_dir=/opt/cdh5.3.6/oozie-4.1.0-cdh5.12.0/examples

    # Location on local FS where the data for the examples is stored.

    sample_data_dir=/opt/cdh5.3.6/oozie-4.1.0-cdh5.12.0/examples/input-data

    # Location on HDFS where the oozie examples and workflows are stored.

    # Parameters are $TIME and $USER, e.g. /user/$USER/hue/workspaces/workflow-$TIME

    remote_data_dir=/user/cen/examples/apps/
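
Whether oozie_url points at a healthy server can be checked through the status endpoint of the Oozie web services API, which reports the system mode as JSON. The sketch below builds that URL from the configured value and interprets a response body; the helper names are hypothetical and the HTTP request itself is left to the reader (for example with curl).

```python
import json


def oozie_status_url(oozie_url):
    """Return the admin status endpoint for the configured oozie_url."""
    return oozie_url.rstrip("/") + "/v1/admin/status"


def is_oozie_normal(response_body):
    """Return True if the status response reports systemMode NORMAL."""
    return json.loads(response_body).get("systemMode") == "NORMAL"


if __name__ == "__main__":
    print(oozie_status_url("http://cen-ubuntu:11000/oozie"))
    print(is_oozie_normal('{"systemMode": "NORMAL"}'))
```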

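III. Run

# 0.0.0.0 means the server accepts connections from any IP. This is normally set in hue.ini, but the setting did not take effect, so it is passed on the command line here.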

$ build/env/bin/hue runserver 0.0.0.0:8000
