[Original] Big Data Fundamentals: Logstash (4) High Availability
High availability for Logstash means not losing data (assuming the machine is only briefly unavailable and can be recovered, e.g. by restarting the server or the process). There are two failure scenarios:
- process restart (or server restart)
- failure to process an event
The corresponding solutions in Logstash are:
- Persistent Queues
- Dead Letter Queues
Neither is enabled by default.
In addition, automatic process restart can be handled by Docker, Marathon, or systemd.
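As a sketch of the systemd option, a minimal unit file that restarts Logstash after an abnormal exit might look like the following (the paths and user are illustrative assumptions, not taken from this article):

```ini
# /etc/systemd/system/logstash.service -- minimal sketch; adjust paths/user to your installation
[Unit]
Description=Logstash
After=network.target

[Service]
User=logstash
ExecStart=/usr/share/logstash/bin/logstash --path.settings /etc/logstash
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

With Docker the equivalent is a restart policy such as `--restart=unless-stopped`; Marathon relaunches failed tasks on its own.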
As data flows through the event processing pipeline, Logstash may encounter situations that prevent it from delivering events to the configured output. For example, the data might contain unexpected data types, or Logstash might terminate abnormally.
To guard against data loss and ensure that events flow through the pipeline without interruption, Logstash provides the following data resiliency features.
- Persistent Queues protect against data loss by storing events in an internal queue on disk.
- Dead Letter Queues provide on-disk storage for events that Logstash is unable to process. You can easily reprocess events in the dead letter queue by using the dead_letter_queue input plugin.
These resiliency features are disabled by default.
1 Persistent Queues
By default, Logstash uses in-memory bounded queues between pipeline stages (inputs → pipeline workers) to buffer events. The size of these in-memory queues is fixed and not configurable. If Logstash experiences a temporary machine failure, the contents of the in-memory queue will be lost. Temporary machine failures are scenarios where Logstash or its host machine are terminated abnormally but are capable of being restarted.
In order to protect against data loss during abnormal termination, Logstash has a persistent queue feature which will store the message queue on disk. Persistent queues provide durability of data within Logstash.
By default, Logstash buffers events between pipeline stages in an in-memory queue; if the process restarts, everything in that queue is lost.
Benefits
- Absorbs bursts of events without needing an external buffering mechanism like Redis or Apache Kafka.
- Provides an at-least-once delivery guarantee against message loss during a normal shutdown as well as when Logstash is terminated abnormally.
How it works
The queue sits between the input and filter stages in the same process:
input → queue → filter + output
When an input has events ready to process, it writes them to the queue. When the write to the queue is successful, the input can send an acknowledgement to its data source.
When processing events from the queue, Logstash acknowledges events as completed, within the queue, only after filters and outputs have completed. The queue keeps a record of events that have been processed by the pipeline. An event is recorded as processed (in this document, called "acknowledged" or "ACKed") if, and only if, the event has been processed completely by the Logstash pipeline.
Configuration
queue.type: persisted
path.queue: "path/to/data/persistent_queue"
Other settings (a combined sketch with illustrative values follows below):
- queue.page_capacity
- queue.drain
- queue.max_events
- queue.max_bytes
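A combined logstash.yml sketch showing these settings together; the values are illustrative (check the documentation for your version's actual defaults):

```yaml
# logstash.yml -- persistent queue sketch; values are illustrative
queue.type: persisted
path.queue: "path/to/data/persistent_queue"  # defaults to <path.data>/queue when unset
queue.page_capacity: 64mb    # size of each on-disk page file
queue.max_events: 0          # 0 = no limit on the number of unread events
queue.max_bytes: 1024mb      # total on-disk capacity of the queue
queue.drain: true            # drain the queue completely before shutting down
```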
Going deeper
First, the queue itself is a set of pages. There are two kinds of pages: head pages and tail pages. The head page is where new events are written. There is only one head page. When the head page is of a certain size (see queue.page_capacity), it becomes a tail page, and a new head page is created. Tail pages are immutable, and the head page is append-only. Second, the queue records details about itself (pages, acknowledgements, etc) in a separate file called a checkpoint file.
When recording a checkpoint, Logstash will:
- Call fsync on the head page.
- Atomically write to disk the current state of the queue.
The process of checkpointing is atomic, which means any update to the file is saved if successful.
If Logstash is terminated, or if there is a hardware-level failure, any data that is buffered in the persistent queue, but not yet checkpointed, is lost.
You can force Logstash to checkpoint more frequently by setting queue.checkpoint.writes. This setting specifies the maximum number of events that may be written to disk before forcing a checkpoint. The default is 1024. To ensure maximum durability and avoid losing data in the persistent queue, you can set queue.checkpoint.writes: 1 to force a checkpoint after each event is written. Keep in mind that disk writes have a resource cost. Setting this value to 1 can severely impact performance.
Even with the persistent queue enabled, data can still be lost; the deciding factor is the checkpoint (flush) interval. By default a checkpoint is forced every 1024 events; setting the value to 1 forces a checkpoint after every event, which avoids losing messages but has a significant performance cost:
queue.checkpoint.writes: 1
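Beyond queue.checkpoint.writes, the sketch below also shows two related checkpoint settings (queue.checkpoint.acks and queue.checkpoint.interval); they are not discussed above, so treat them as assumptions to verify against the documentation for your version:

```yaml
# logstash.yml -- checkpoint tuning sketch
queue.checkpoint.writes: 1       # force a checkpoint after every written event (max durability, slow)
queue.checkpoint.acks: 1024      # max acknowledged events before forcing a checkpoint
queue.checkpoint.interval: 1000  # periodic checkpoint interval on the head page, in milliseconds
```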
2 Dead Letter Queues
By default, when Logstash encounters an event that it cannot process because the data contains a mapping error or some other issue, the Logstash pipeline either hangs or drops the unsuccessful event. In order to protect against data loss in this situation, you can configure Logstash to write unsuccessful events to a dead letter queue instead of dropping them.
Each event written to the dead letter queue includes the original event, along with metadata that describes the reason the event could not be processed, information about the plugin that wrote the event, and the timestamp for when the event entered the dead letter queue.
To process events in the dead letter queue, you simply create a Logstash pipeline configuration that uses the dead_letter_queue input plugin to read from the queue.
When Logstash encounters data it cannot process (a mapping error, etc.), the pipeline either hangs or drops the failed event; to avoid losing data in this situation, Logstash can be configured to write failed events to a dead letter queue instead of dropping them.
Limitations
The dead letter queue feature is currently supported for the elasticsearch output only. Additionally, the dead letter queue is only used where the response code is either 400 or 404, both of which indicate an event that cannot be retried. Support for additional outputs will be available in future releases of the Logstash plugins. Before configuring Logstash to use this feature, refer to the output plugin documentation to verify that the plugin supports the dead letter queue feature.
Currently the dead letter queue is supported only for the elasticsearch output; other outputs will be supported in future releases.
Configuration
dead_letter_queue.enable: true
path.dead_letter_queue: "path/to/data/dead_letter_queue"
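To reprocess events from the dead letter queue, a minimal pipeline built around the dead_letter_queue input plugin might look like this sketch (the path, metadata handling, and output target are illustrative assumptions):

```
# reprocess-dlq.conf -- minimal sketch for reading the dead letter queue
input {
  dead_letter_queue {
    path => "path/to/data/dead_letter_queue"  # should match path.dead_letter_queue
    commit_offsets => true                    # remember the read position between runs
  }
}

filter {
  # the plugin records why the event failed under [@metadata][dead_letter_queue];
  # copying the reason into a field makes it searchable after reindexing
  mutate {
    add_field => { "dlq_reason" => "%{[@metadata][dead_letter_queue][reason]}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]  # illustrative target
    index => "dlq-reprocessed"   # illustrative index name
  }
}
```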
References:
https://www.elastic.co/guide/en/logstash/current/resiliency.html
https://www.elastic.co/guide/en/logstash/current/persistent-queues.html
https://www.elastic.co/guide/en/logstash/current/dead-letter-queues.html