ClickHouse Installation and Usage Guide
Introduction to ClickHouse
What is ClickHouse
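1. An open-source, column-oriented database management system
2. Linearly scalable
3. Simple and convenient to use
4. Highly reliable
5. Fault tolerant (supports asynchronous multi-master replication and can be deployed across multiple data centers; downtime of a single node or of an entire data center does not affect the system's read and write availability)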
ClickHouse architecture and storage
The ClickHouse source code is open, but its internal architecture is not described in this document.
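ClickHouse characteristics
ClickHouse is intended for analytics over clean, well-structured, immutable events or logs. It is recommended to put each such stream into a single wide fact table with pre-joined dimensions.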
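ClickHouse use cases
Some examples of suitable applications:
- Web and app analytics
- Ad networks and RTB (real-time bidding)
- Telecommunications
- E-commerce and finance
- Information security
- Monitoring and telemetry
- Time series
- Business intelligence
- Online games
- Internet of Things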
Unsuitable scenarios
- Transactional workloads (OLTP)
- Key-value access with a high request rate
- Blob or document storage
- Over-normalized data
Installing ClickHouse
Single-node installation
Check whether the system can run ClickHouse
Run the command:
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
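If the output is "SSE 4.2 supported", you can continue with the installation.
If it is "SSE 4.2 not supported", the CPU lacks the SSE 4.2 instruction set that these packages require, and you will need to find another machine or approach.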
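The same check can be scripted, for instance when validating many hosts before installation. The following is a minimal sketch, not part of the original procedure; the function name has_cpu_flag is hypothetical. Unlike the grep above, which matches the substring anywhere in the file, it only accepts the flag as a whole token on a "flags" line.

```python
def has_cpu_flag(cpuinfo_text: str, flag: str = "sse4_2") -> bool:
    """Return True if any 'flags' line of /proc/cpuinfo text lists the flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is unavailable
    print("SSE 4.2 supported" if has_cpu_flag(text) else "SSE 4.2 not supported")
```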
Fetch the repo definition
curl -s https://packagecloud.io/install/repositories/altinity/clickhouse/script.rpm.sh | sudo bash
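Alternatively, create the file altinity_clickhouse.repo by hand (yum reads repo files from /etc/yum.repos.d/) and paste in the content for your release.
CentOS 6 version: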
[altinity_clickhouse]
name=altinity_clickhouse
baseurl=https://packagecloud.io/altinity/clickhouse/el/6/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
[altinity_clickhouse-source]
name=altinity_clickhouse-source
baseurl=https://packagecloud.io/altinity/clickhouse/el/6/SRPMS
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
CentOS 7 version:
[altinity_clickhouse]
name=altinity_clickhouse
baseurl=https://packagecloud.io/altinity/clickhouse/el/7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
[altinity_clickhouse-source]
name=altinity_clickhouse-source
baseurl=https://packagecloud.io/altinity/clickhouse/el/7/SRPMS
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
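The CentOS 6 and CentOS 7 files above differ only in the release number inside baseurl. As an illustration, the sketch below renders the file from one template; REPO_TEMPLATE and render_repo are hypothetical names introduced here, and the field values are copied from the listings above.

```python
REPO_TEMPLATE = """[{name}]
name={name}
baseurl=https://packagecloud.io/altinity/clickhouse/el/{release}/{path}
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
"""


def render_repo(release: int) -> str:
    """Render both stanzas of altinity_clickhouse.repo for a CentOS major release."""
    binary = REPO_TEMPLATE.format(
        name="altinity_clickhouse", release=release, path="$basearch"
    )
    source = REPO_TEMPLATE.format(
        name="altinity_clickhouse-source", release=release, path="SRPMS"
    )
    return binary + source


if __name__ == "__main__":
    # Print the CentOS 7 variant; redirect to a file to save it.
    print(render_repo(7), end="")
```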
List the available packages, then install them:
yum list 'clickhouse*'
yum -y install 'clickhouse*'
Multi-node installation
Install ClickHouse on every machine, then make the following changes on each of them.
Edit the hosts file (/etc/hosts)
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.3.251 host1
192.168.3.252 host2
192.168.3.247 host3
Create metrika.xml
Create the file under /etc:
cd /etc
vi metrika.xml
Adapt the following content to your environment and paste it into metrika.xml:
<yandex>
<clickhouse_remote_servers>
<perftest_3shards_1replicas>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>192.168.3.247</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>192.168.3.252</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>192.168.3.251</host>
<port>9000</port>
</replica>
</shard>
</perftest_3shards_1replicas>
</clickhouse_remote_servers>
<zookeeper-servers>
<node index="1">
<host>192.168.3.251</host>
<port>2181</port>
</node>
</zookeeper-servers>
<macros>
<replica>192.168.3.252</replica>
</macros>
<networks>
<ip>::/0</ip>
</networks>
<clickhouse_compression>
<case>
<min_part_size>10000000000</min_part_size>
<min_part_size_ratio>0.01</min_part_size_ratio>
<method>lz4</method>
</case>
</clickhouse_compression>
</yandex>
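Hand-editing the shard list is error-prone, in particular the position of internal_replication, which is a per-shard setting and belongs directly under each shard element. The sketch below generates the cluster section from a list of hosts using Python's standard library. The function name build_remote_servers is hypothetical, and the output covers only the clickhouse_remote_servers section, not the rest of metrika.xml.

```python
import xml.etree.ElementTree as ET


def build_remote_servers(cluster: str, hosts: list[str], port: int = 9000) -> ET.Element:
    """Build <clickhouse_remote_servers> with one single-replica shard per host."""
    root = ET.Element("clickhouse_remote_servers")
    cluster_el = ET.SubElement(root, cluster)
    for host in hosts:
        shard = ET.SubElement(cluster_el, "shard")
        # internal_replication is set per shard, as a sibling of <replica>
        ET.SubElement(shard, "internal_replication").text = "true"
        replica = ET.SubElement(shard, "replica")
        ET.SubElement(replica, "host").text = host
        ET.SubElement(replica, "port").text = str(port)
    return root


if __name__ == "__main__":
    element = build_remote_servers(
        "perftest_3shards_1replicas",
        ["192.168.3.247", "192.168.3.252", "192.168.3.251"],
    )
    ET.indent(element)  # pretty-print; available in Python 3.9+
    print(ET.tostring(element, encoding="unicode"))
```

The printed fragment can be pasted inside the yandex element in place of a hand-written cluster definition.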
Edit config.xml under /etc/clickhouse-server so that the server listens on the node's own address (192.168.3.252 in this example):
<!-- Listen specified host. use :: (wildcard IPv6 address), if you want to accept connections both with IPv4 and IPv6 from everywhere. -->
<!-- <listen_host>::</listen_host> -->
<listen_host>::1</listen_host>
<listen_host>192.168.3.252</listen_host>
Using ClickHouse
Basic usage
Start the server
/etc/init.d/clickhouse-server start
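Command line: clickhouse-client -h <host> -u <user> --password <password>
The defaults are usually enough: run clickhouse-client with no arguments to enter the client.
DML (data manipulation language)
These statements assume the funtest table from the DDL section below already exists.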
insert into funtest values(3,'xiaoming',22,'2017-11-09')
insert into funtest values(32,'xiaolan',33,'2017-11-08')
insert into funtest values(35,'xiaotong',33,'2017-11-07')
insert into funtest values(4,'xiaohuang',33,'2017-11-08')
insert into funtest values(44,'xiaolvas',34,'2017-11-05')
insert into funtest values(6,'xiaohuanasg',32,'2017-11-28')
select * from funtest
select * from funtest order by id
select * from funtest order by id desc
select avg(age) from funtest
select count(name) from funtest
select age from funtest group by age
select round(age/3) FROM funtest
select cast('2015-12-22' as date) from funtest
select cast('2015-12-22' as date)+30 from funtest
select stddev_samp(age) FROM funtest
select upper('hhh') from funtest
select upper(name) from funtest
select abs(-1) from funtest
select * FROM funtest where times =cast('2015-12-22' as date)
select max(age) from funtest
select case when name ='xiaoming' then concat(name,'dddd') else 'ddddfdfdfdf' end from funtest
select substring(name,1,3) from funtest
select rand() from funtest
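The inserts above send one row per statement. ClickHouse accepts several rows in a single VALUES clause and generally handles fewer, larger inserts better than many single-row ones. The sketch below builds such a statement from Python tuples. The helpers sql_literal and build_insert are hypothetical, and the sketch handles only integers, strings and dates. It produces the SQL text only; sending it (for example by passing it to clickhouse-client --query) is left out.

```python
import datetime


def sql_literal(value) -> str:
    """Format a Python value as a literal for a VALUES clause."""
    if isinstance(value, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, int):
        return str(value)
    if isinstance(value, datetime.date):
        return "'" + value.isoformat() + "'"
    if isinstance(value, str):
        # Escape backslashes first, then single quotes
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    raise TypeError(f"unsupported type: {type(value).__name__}")


def build_insert(table: str, rows) -> str:
    """Build one multi-row INSERT statement from an iterable of tuples."""
    groups = ["(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows]
    if not groups:
        raise ValueError("no rows to insert")
    return f"INSERT INTO {table} VALUES " + ", ".join(groups)


if __name__ == "__main__":
    rows = [
        (3, "xiaoming", 22, datetime.date(2017, 11, 9)),
        (32, "xiaolan", 33, datetime.date(2017, 11, 8)),
    ]
    print(build_insert("funtest", rows))
```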
DDL(data definition language)
create table funtest(id UInt32, name String ,age UInt32,times Date)ENGINE=Log
drop table funtest
alter table ontime_all add COLUMN name String;
Performance test
The performance test code is as follows.
Get the data
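# Write one download URL per year and month into a.lst
for s in `seq 1987 2017`
do
for m in `seq 1 12`
do
echo "http://transtats.bts.gov/PREZIP/On_Time_On_Time_Performance_${s}_${m}.zip" >> a.lst
done
done
# The listed files can then be downloaded, for example with: wget -i a.lst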
Unzip and load into ClickHouse
for i in *.zip; do echo $i; unzip -cq $i '*.csv' | sed 's/\.00//g' | clickhouse-client --query="INSERT INTO ontimetest FORMAT CSVWithNames"; done
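The sed step in the pipeline deletes every literal ".00", presumably so that values written like 15.00 in the CSV can be parsed into the integer columns of the target table. The sketch below reproduces that substitution in Python to show its effect on a made-up sample line. The function name is hypothetical. The second call shows that the substitution is blunt: it also rewrites values that merely contain ".00".

```python
import re


def strip_dot_zero_zero(line: str) -> str:
    """Delete every literal '.00', like the sed expression in the pipeline."""
    return re.sub(r"\.00", "", line)


if __name__ == "__main__":
    # Made-up sample line, not taken from the real data set
    print(strip_dot_zero_zero('2017,1,"AA",15.00,-3.00,1.50'))
    # A value that only contains ".00" is changed as well
    print(strip_dot_zero_zero("1.005"))
```

If the data could contain such values, a stricter pattern anchored to the end of a field would be safer.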
Create the Hive table
CREATE TABLE ontime
(
Year int,
Quarter int,
Month int,
DayofMonth int,
DayOfWeek int,
FlightDate Date,
UniqueCarrier String,
AirlineID int,
Carrier String,
TailNum String,
FlightNum String,
OriginAirportID int,
OriginAirportSeqID int,
OriginCityMarketID int,
Origin String,
OriginCityName String,
OriginState String,
OriginStateFips String,
OriginStateName String,
OriginWac int,
DestAirportID int,
DestAirportSeqID int,
DestCityMarketID int,
Dest String,
DestCityName String,
DestState String,
DestStateFips String,
DestStateName String,
DestWac int,
CRSDepTime int,
DepTime int,
DepDelay int,
DepDelayMinutes int,
DepDel15 int,
DepartureDelayGroups String,
DepTimeBlk String,
TaxiOut int,
WheelsOff int,
WheelsOn int,
TaxiIn int,
CRSArrTime int,
ArrTime int,
ArrDelay int,
ArrDelayMinutes int,
ArrDel15 int,
ArrivalDelayGroups int,
ArrTimeBlk String,
Cancelled int,
CancellationCode String,
Diverted int,
CRSElapsedTime int,
ActualElapsedTime int,
AirTime int,
Flights int,
Distance int,
DistanceGroup int,
CarrierDelay int,
WeatherDelay int,
NASDelay int,
SecurityDelay int,
LateAircraftDelay int,
FirstDepTime String,
TotalAddGTime String,
LongestAddGTime String,
DivAirportLandings String,
DivReachedDest String,
DivActualElapsedTime String,
DivArrDelay String,
DivDistance String,
Div1Airport String,
Div1AirportID int,
Div1AirportSeqID int,
Div1WheelsOn String,
Div1TotalGTime String,
Div1LongestGTime String,
Div1WheelsOff String,
Div1TailNum String,
Div2Airport String,
Div2AirportID int,
Div2AirportSeqID int,
Div2WheelsOn String,
Div2TotalGTime String,
Div2LongestGTime String,
Div2WheelsOff String,
Div2TailNum String,
Div3Airport String,
Div3AirportID int,
Div3AirportSeqID int,
Div3WheelsOn String,
Div3TotalGTime String,
Div3LongestGTime String,
Div3WheelsOff String,
Div3TailNum String,
Div4Airport String,
Div4AirportID int,
Div4AirportSeqID int,
Div4WheelsOn String,
Div4TotalGTime String,
Div4LongestGTime String,
Div4WheelsOff String,
Div4TailNum String,
Div5Airport String,
Div5AirportID int,
Div5AirportSeqID int,
Div5WheelsOn String,
Div5TotalGTime String,
Div5LongestGTime String,
Div5WheelsOff String,
Div5TailNum String
)row format delimited
fields terminated by ','
stored as textfile;
load data inpath '/data' into table ontime;
Change the Hive storage format to ORC
Comparison test against Spark
Create the ClickHouse local table
CREATE TABLE ontime
(
Year UInt16,
Quarter UInt8,
Month UInt8,
DayofMonth UInt8,
DayOfWeek UInt8,
FlightDate Date,
UniqueCarrier FixedString(7),
AirlineID Int32,
Carrier FixedString(2),
TailNum String,
FlightNum String,
OriginAirportID Int32,
OriginAirportSeqID Int32,
OriginCityMarketID Int32,
Origin FixedString(5),
OriginCityName String,
OriginState FixedString(2),
OriginStateFips String,
OriginStateName String,
OriginWac Int32,
DestAirportID Int32,
DestAirportSeqID Int32,
DestCityMarketID Int32,
Dest FixedString(5),
DestCityName String,
DestState FixedString(2),
DestStateFips String,
DestStateName String,
DestWac Int32,
CRSDepTime Int32,
DepTime Int32,
DepDelay Int32,
DepDelayMinutes Int32,
DepDel15 Int32,
DepartureDelayGroups String,
DepTimeBlk String,
TaxiOut Int32,
WheelsOff Int32,
WheelsOn Int32,
TaxiIn Int32,
CRSArrTime Int32,
ArrTime Int32,
ArrDelay Int32,
ArrDelayMinutes Int32,
ArrDel15 Int32,
ArrivalDelayGroups Int32,
ArrTimeBlk String,
Cancelled UInt8,
CancellationCode FixedString(1),
Diverted UInt8,
CRSElapsedTime Int32,
ActualElapsedTime Int32,
AirTime Int32,
Flights Int32,
Distance Int32,
DistanceGroup UInt8,
CarrierDelay Int32,
WeatherDelay Int32,
NASDelay Int32,
SecurityDelay Int32,
LateAircraftDelay Int32,
FirstDepTime String,
TotalAddGTime String,
LongestAddGTime String,
DivAirportLandings String,
DivReachedDest String,
DivActualElapsedTime String,
DivArrDelay String,
DivDistance String,
Div1Airport String,
Div1AirportID Int32,
Div1AirportSeqID Int32,
Div1WheelsOn String,
Div1TotalGTime String,
Div1LongestGTime String,
Div1WheelsOff String,
Div1TailNum String,
Div2Airport String,
Div2AirportID Int32,
Div2AirportSeqID Int32,
Div2WheelsOn String,
Div2TotalGTime String,
Div2LongestGTime String,
Div2WheelsOff String,
Div2TailNum String,
Div3Airport String,
Div3AirportID Int32,
Div3AirportSeqID Int32,
Div3WheelsOn String,
Div3TotalGTime String,
Div3LongestGTime String,
Div3WheelsOff String,
Div3TailNum String,
Div4Airport String,
Div4AirportID Int32,
Div4AirportSeqID Int32,
Div4WheelsOn String,
Div4TotalGTime String,
Div4LongestGTime String,
Div4WheelsOff String,
Div4TailNum String,
Div5Airport String,
Div5AirportID Int32,
Div5AirportSeqID Int32,
Div5WheelsOn String,
Div5TotalGTime String,
Div5LongestGTime String,
Div5WheelsOff String,
Div5TailNum String
) ENGINE = MergeTree(FlightDate, (Year, FlightDate), 8192)
Create the distributed table
CREATE TABLE ontimetest AS ontime ENGINE = Distributed(perftest_3shards_1replicas, default, ontime, rand())
Note:
Create both the local table and the distributed table on every node.
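The last argument of Distributed(...) above, rand(), is the sharding key. For each inserted row, ClickHouse takes the key modulo the total shard weight and sends the row to the shard whose weight range contains the remainder. The simulation below illustrates this rule, assuming the default weight of 1 for each of the three shards. The names pick_shard and simulate are hypothetical, and Python's random generator only stands in for rand().

```python
import random
from collections import Counter


def pick_shard(key: int, weights: list[int]) -> int:
    """Return the index of the shard whose weight range contains key % total."""
    remainder = key % sum(weights)
    upper = 0
    for index, weight in enumerate(weights):
        upper += weight
        if remainder < upper:
            return index
    raise AssertionError("unreachable for positive weights")


def simulate(rows: int, weights: list[int], seed: int = 0) -> Counter:
    """Count how many of `rows` random keys land on each shard."""
    rng = random.Random(seed)
    # getrandbits(32) stands in for ClickHouse's rand(), which returns a UInt32
    return Counter(pick_shard(rng.getrandbits(32), weights) for _ in range(rows))


if __name__ == "__main__":
    counts = simulate(30000, [1, 1, 1])
    for shard in sorted(counts):
        print(f"shard {shard}: {counts[shard]} rows")
```

With a random key the rows spread roughly evenly, which balances load but means rows that belong together are not kept on the same shard.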