Splunk Enterprise architecture and processes

This topic discusses the internal architecture and processes of Splunk Enterprise at a high level. If you're looking for information about third-party components used in Splunk Enterprise, see the credits section in the Release notes.

Splunk Enterprise Processes

A Splunk Enterprise server installs a process on your host, splunkd.

splunkd is a distributed C/C++ server that accesses, processes and indexes streaming IT data. It also handles search requests. splunkd processes and indexes your data by streaming it through a series of pipelines, each made up of a series of processors.

  • Pipelines are single threads inside the splunkd process, each configured with a single snippet of XML.
  • Processors are individual, reusable C or C++ functions that act on the stream of IT data that passes through a pipeline. Pipelines can pass data to one another through queues.

Architecture diagram

注意:负载均衡,副本!

Splunk Architecture

A Bit About Architecture

Splunk is a high performance, scalable software server written in C/C++ and Python. It indexes and searches logs and other IT data in real time. Splunk works with data generated by any application, server or device. The Splunk Developer API is accessible via REST, SOAP or the command line. After downloading, installing and starting Splunk, you'll find two Splunk Server processes running on your host, splunkd and splunkweb.

    • splunkd is a distributed C/C++ server that accesses, processes and indexes streaming IT data and also handles search requests. splunkd processes and indexes your data by streaming it through a series of pipelines, each made up of a series of processors. Pipelines are single threads inside the splunkd process, each configured with a single snippet of XML. Processors are individual, reusable C/C++ or Python functions that act on the stream of IT data passing through a pipeline. Pipelines can pass data to one another via queues. splunkd supports a command line interface for searching and viewing results.
  • splunkweb is a Python-based application server providing the Splunk Web user interface. It allows users to search and navigate IT data stored by Splunk servers and to manage your Splunk deployment through the browser interface. splunkweb communicates with your web browser via REST and communicates with splunkd via SOAP.

    • Splunk's Data Store manages the original raw data in compressed format as well as the indexes into the data. Data can be deleted or archived based on retention period or maximum data store size.
    • Splunk Servers can communicate with one another via Splunk-2-Splunk, a TCP-based protocol, to forward data from one server to another and to distribute searches across multiple servers.
    • Bundles are files that contain configuration settings including, user accounts, Splunks, Live Splunks, Data Inputs and Processing Properties to easily create specific Splunk environments.
  • Modules are files that add new functionality to Splunk by adding to or modifying existing processors and pipelines.

About forwarding and receiving

You can forward data from one Splunk instance to another Splunk server or even to a non-Splunk system. The Splunk instance that performs theforwarding is typically a smaller footprint version of Splunk, called a forwarder.

A Splunk instance that receives data from one or more forwarders is called a receiver. The receiver is usually a Splunk indexer, but can also be another forwarder, as described here.

This diagram shows three forwarders sending data to a single Splunk receiver (an indexer), which then indexes the data and makes it available for searching:

Splunk Enterprise architecture——转发器本质上是日志收集client附加负载均衡,indexer是分布式索引,外加一个集中式管理协调的中心节点的更多相关文章

  1. Jexus是一款Linux平台上的高性能WEB服务器和负载均衡网关

    什么是Jexus Jexus是一款Linux平台上的高性能WEB服务器和负载均衡网关,以支持ASP.NET.ASP.NET CORE.PHP为特色,同时具备反向代理.入侵检测等重要功能.可以这样说,J ...

  2. 动态DNS——本质上是IP变化,将任意变换的IP地址绑定给一个固定的二级域名。不管这个线路的IP地址怎样变化,因特网用户还是可以使用这个固定的域名 这样看的话,p2p可以用哇

    动态域名是因应网络远程访问的需要而产生的一项应用技术.因为没有固定IP,只能运用二级域名来应对经常变化的IP,动态域名的由来因此而产生. 它当前主要应用在:路由器.网络摄像机.带网络监控的硬盘录像机. ...

  3. Linux下Rsyslog日志远程集中式管理

    Rsyslog简介 Rsyslog的全称是 rocket-fast system for log,它提供了高性能,高安全功能和模块化设计.rsyslog能够接受从各种各样的来源,将其输入,输出的结果到 ...

  4. Dubbo入门到精通学习笔记(二十):MyCat在MySQL主从复制的基础上实现读写分离、MyCat 集群部署(HAProxy + MyCat)、MyCat 高可用负载均衡集群Keepalived

    文章目录 MyCat在MySQL主从复制的基础上实现读写分离 一.环境 二.依赖课程 三.MyCat 介绍 ( MyCat 官网:http://mycat.org.cn/ ) 四.MyCat 的安装 ...

  5. Centos7搭建集中式日志系统

    在CentOS7中,Rsyslong是一个集中式的日志收集系统,可以运行在TCP或者UDP的514端口上.   目录 开始之前 配置接收日志的主机 配置发送日志的主机 日志回滚 附件:创建日志接收模板 ...

  6. flume集群日志收集

    一.Flume简介 Flume是一个分布式的.高可用的海量日志收集.聚合和传输日志收集系统,支持在日志系统中定制各类数据发送方(如:Kafka,HDFS等),便于收集数据.其核心为agent,agen ...

  7. ELK日志收集平台部署

    需求背景 由于公司的后台服务有三台,每当后台服务运行异常,需要看日志排查错误的时候,都必须开启3个ssh窗口进行查看,研发们觉得很不方便,于是便有了统一日志收集与查看的需求. 这里,我用ELK集群,通 ...

  8. 理解OpenShift(6):集中式日志处理

    理解OpenShift(1):网络之 Router 和 Route 理解OpenShift(2):网络之 DNS(域名服务) 理解OpenShift(3):网络之 SDN 理解OpenShift(4) ...

  9. SpringBoot+kafka+ELK分布式日志收集

    一.背景 随着业务复杂度的提升以及微服务的兴起,传统单一项目会被按照业务规则进行垂直拆分,另外为了防止单点故障我们也会将重要的服务模块进行集群部署,通过负载均衡进行服务的调用.那么随着节点的增多,各个 ...

随机推荐

  1. [省选模拟]Rhyme

    考的时候脑子各种短路,用个SAM瞎搞了半天没搞出来,最后中午火急火燎的打了个SPFA才混了点分. 其实这个可以把每个模式串长度为$K-1$的字符串看作一个状态,这个用字符串Hash实现,然后我们发现这 ...

  2. 2018-2019-1 1723《程序设计与数据结构》第5&6&7周作业 总结

    作业地址 第五周作业: 提交情况如图: 第六周作业: 提交情况如图: 第七周作业: 提交情况如图: 作业问题 很多看上写的比较详细并且语言组织的也不错,我就这么随手一百度,搜出来了很多篇博客.(无奈) ...

  3. tf.reduce_sum tensorflow维度上的操作

    tensorflow中有很多在维度上的操作,本例以常用的tf.reduce_sum进行说明.官方给的api reduce_sum( input_tensor, axis=None, keep_dims ...

  4. keil_4/MDK各种数据类型占用的字节数

    笔者正在学习uCOS-II,移植到ARM时考虑到数据类型的定义,但对于Keil MDK编译器的数据类型定义还是很模糊,主要就是区分不了short int.int.long 和long int占用多少字 ...

  5. c++类定义代码的分离

    类文件 实际工程中,对一个类的说明.架构.描述方法是:    往往分成头文件和实现的源文件,来实现代码的分离 然后,源文件中包含类的头文件... 头文件的包含问题: 类对应的实现文件cpp.main主 ...

  6. SQL NULL

    表 select CHARACTER_MAXIMUM_LENGTH from information_schema.columns where table_name= 'Alliance' selec ...

  7. The equation (扩展欧几里得)题解

    There is an equation ax + by + c = 0. Given a,b,c,x1,x2,y1,y2 you must determine, how many integer r ...

  8. 基于Java的三种对象持久化方式

    1:序列化技术: 序列化的过程就是将对象写入字节流和从字节流中读取对象.将对象状态转换成字节流之后,可以用java.io包中的各种字节流类将其保存到文件中,可以通过管道或线程读取,或通过网络连接将对象 ...

  9. hdu 5668 Circle 中国剩余定理

    Circle Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/65536 K (Java/Others) Problem D ...

  10. c++ 指定长度容器元素的拷贝移动(copy_backward)

    #include <iostream> // cout #include <algorithm> // copy_backward #include <vector> ...