How To Install and Configure Elasticsearch on Ubuntu 14.04
Reference: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-elasticsearch-on-ubuntu-14-04
Introduction
Elasticsearch is a platform for distributed search and analysis of data in real time. Its popularity is due to its ease of use, powerful features, and scalability.
Elasticsearch supports RESTful operations. This means that you can use HTTP methods (GET, POST, PUT, DELETE, etc.) in combination with an HTTP URI (/collection/entry) to manipulate your data. The intuitive RESTful approach is both developer and user friendly, which is one of the reasons for Elasticsearch's popularity.
Elasticsearch is free and open source software backed by a company, Elastic. This combination makes it suitable for anything from personal testing to corporate integration.
This article will introduce you to Elasticsearch and show you how to install, configure, and start using it.
Prerequisites
Before following this tutorial, please make sure you complete the following prerequisites:
- An Ubuntu 14.04 Droplet
- A non-root sudo user. Check out Initial Server Setup with Ubuntu 14.04 for details.
Unless otherwise noted, all of the commands that require root privileges in this tutorial should be run as a non-root user with sudo privileges.
Assumptions
This tutorial assumes that your servers are using a VPN like the one described here: How To Use Ansible and Tinc VPN to Secure Your Server Infrastructure. This will provide private network functionality regardless of the physical network that your servers are using.
If you are using a shared private network, such as DigitalOcean Private Networking, you must use a VPN to protect Elasticsearch from unauthorized access. Each server must be on the same private network because Elasticsearch doesn't have security built into its HTTP interface. The private network must not be shared with any computers you don't trust.
Step 1 — Installing Java
First, you will need a Java Runtime Environment (JRE) on your Droplet because Elasticsearch is written in the Java programming language. Elasticsearch requires Java 7 or higher. Elasticsearch recommends Oracle JDK version 1.8.0_73, but the native Ubuntu OpenJDK package for the JRE works as well.
This step shows you how to install both versions so you can decide which is best for you.
Installing OpenJDK
The native Ubuntu OpenJDK package for the JRE is free, well-supported, and automatically managed through the Ubuntu APT package manager.
Before installing OpenJDK with APT, update the list of available packages for installation on your Ubuntu Droplet by running the command:
- sudo apt-get update
After that, you can install OpenJDK with the command:
- sudo apt-get install openjdk-7-jre
To verify your JRE is installed and can be used, run the command:
- java -version
The result should look like this:
java version "1.7.0_79"
OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1)
OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)
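The version check above can also be scripted. The following is a minimal sketch, assuming the legacy "1.x.y_z" version format shown in the output above; the helper names java_major and java_ok are hypothetical and not part of any Java or Ubuntu tooling.

```shell
# Print the major Java version for a version string such as "1.7.0_79".
# Legacy "1.x" strings carry the major version in the second dot-separated field.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Succeed when the version string meets the Java 7 minimum stated in this step.
java_ok() {
    [ "$(java_major "$1")" -ge 7 ]
}
```

Because java -version writes to standard error, the version string can be captured with something like: java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' and then passed to java_ok.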
Installing Java 8
As you gain experience with Elasticsearch and start looking for better Java performance and compatibility, you may opt to install Oracle's proprietary Java (Oracle JDK 8).
Add the Oracle Java PPA to apt:
- sudo add-apt-repository -y ppa:webupd8team/java
Update your apt package database:
- sudo apt-get update
Install the latest stable version of Oracle Java 8 with this command (and accept the license agreement that pops up):
- sudo apt-get -y install oracle-java8-installer
Lastly, verify it is installed:
- java -version
Step 2 — Downloading and Installing Elasticsearch
Elasticsearch can be downloaded directly from elastic.co in zip, tar.gz, deb, or rpm packages. For Ubuntu, it's best to use the deb (Debian) package which will install everything you need to run Elasticsearch.
At the time of this writing, the latest Elasticsearch version is 1.7.2. Download it to a directory of your choosing with the command:
- wget https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.7.2.deb
Then install it in the usual Ubuntu way with the dpkg command like this:
- sudo dpkg -i elasticsearch-1.7.2.deb
Tip: If you want the latest released version of Elasticsearch, go to elastic.co to find the link, and then use wget to download it to your Droplet. Be sure to download the deb package.
This results in Elasticsearch being installed in /usr/share/elasticsearch/ with its configuration files placed in /etc/elasticsearch and its init script added in /etc/init.d/elasticsearch.
To make sure Elasticsearch starts and stops automatically with the Droplet, add its init script to the default runlevels with the command:
- sudo update-rc.d elasticsearch defaults
Step 3 — Configuring Elastic
Now that Elasticsearch and its Java dependencies have been installed, it is time to configure Elasticsearch.
The Elasticsearch configuration files are in the /etc/elasticsearch directory. There are two files:
- elasticsearch.yml: Configures the Elasticsearch server settings. This is where all options, except those for logging, are stored, which is why we are mostly interested in this file.
- logging.yml: Provides configuration for logging. In the beginning, you don't have to edit this file. You can leave all default logging options. You can find the resulting logs in /var/log/elasticsearch by default.
The first variables to customize on any Elasticsearch server are node.name and cluster.name in elasticsearch.yml. As their names suggest, node.name specifies the name of the server (node), and cluster.name specifies the name of the cluster to which that node belongs.
If you don't customize these variables, a node.name will be assigned automatically at startup (in Elasticsearch 1.x this is a randomly chosen name rather than the Droplet hostname), and the cluster.name will be set to the default cluster name, elasticsearch.
The cluster.name value is used by the auto-discovery feature of Elasticsearch to automatically discover and associate Elasticsearch nodes to a cluster. Thus, if you don't change the default value, you might end up with unwanted nodes from the same network in your cluster.
To start editing the main elasticsearch.yml configuration file:
- sudo nano /etc/elasticsearch/elasticsearch.yml
Remove the # character at the beginning of the lines for node.name and cluster.name to uncomment them, and then change their values. Your first configuration changes in the /etc/elasticsearch/elasticsearch.yml file should look like this:
...
node.name: "My First Node"
cluster.name: mycluster1
...
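If you prefer to script this edit instead of using nano, the uncomment-and-set step can be sketched as a small shell helper. This is a minimal sketch under assumptions: set_yaml_key is a hypothetical name, it only handles simple top-level "key: value" lines, and the value must not contain the characters |, & or \ because it is passed straight into a sed replacement.

```shell
# set_yaml_key FILE KEY VALUE
# Replace a commented or uncommented "KEY: ..." line with "KEY: VALUE".
# If no such line exists, append "KEY: VALUE" to the end of FILE.
set_yaml_key() {
    file=$1 key=$2 value=$3
    # Escape dots so the key is matched literally in the regular expression.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^#\{0,1\}[[:space:]]*${pattern}:" "$file"; then
        # Write to a temporary file, then copy back to keep the original file's permissions.
        sed "s|^#\{0,1\}[[:space:]]*${pattern}:.*|${key}: ${value}|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        printf '%s: %s\n' "$key" "$value" >> "$file"
    fi
}
```

It is safest to try this on a copy of elasticsearch.yml first, for example: set_yaml_key ./elasticsearch.yml cluster.name mycluster1 and then compare the copy against the original with diff before replacing the real file.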
Another important setting is the role of the server, which could be either "master" or "slave". "Masters" are responsible for the cluster health and stability. In large deployments with a lot of cluster nodes, it's recommended to have more than one dedicated "master." Typically, a dedicated "master" will not store data or create indexes. Thus, there should be no chance of being overloaded, by which the cluster health could be endangered.
"Slaves" are used as "workhorses" which can be loaded with data tasks. Even if a "slave" node is overloaded, the cluster health shouldn't be affected seriously, provided there are other nodes to take additional load.
The setting which determines the role of the server is called node.master. If you have only one Elasticsearch node, you should leave this option commented out so that it keeps its default value of true (that is, the sole node should also be a master). Alternatively, if you wish to configure the node as a slave, remove the # character at the beginning of the node.master line, and change the value to false:
...
node.master: false
...
Another important configuration option is node.data, which determines whether a node will store data or not. In most cases this option should be left at its default value (true), but there are two cases in which you might wish not to store data on a node. One is when the node is a dedicated "master," as we have already mentioned. The other is when a node is used only for fetching data from other nodes and aggregating results. In the latter case the node will act as a "search load balancer".
Again, if you have only one Elasticsearch node, you should leave this setting commented out so that it keeps the default value of true. Otherwise, to disable storing data locally, uncomment the following line and change the value to false:
...
node.data: false
...
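The combinations of node.master and node.data described in this step can be summarized as a small lookup. This is only an illustrative sketch: node_role is a hypothetical helper, and the role labels are shorthand for the descriptions in this tutorial rather than official Elasticsearch terminology.

```shell
# node_role MASTER DATA
# Print the role implied by the node.master and node.data values ("true" or "false").
node_role() {
    case "$1/$2" in
        true/true)   echo "master-eligible data node (default)" ;;
        true/false)  echo "dedicated master" ;;
        false/true)  echo "data node" ;;
        false/false) echo "search load balancer" ;;
        *)           echo "unknown"; return 1 ;;
    esac
}
```

For example, node_role true false prints "dedicated master", which matches a node that coordinates the cluster but stores no data.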
Two other important options are index.number_of_shards and index.number_of_replicas. The first determines how many pieces (shards) the index will be split into. The second defines the number of replicas which will be distributed across the cluster. Having more shards can improve indexing performance, while having more replicas can make searching faster.
Assuming that you are still exploring and testing Elasticsearch on a single node, it's better to start with only one shard and no replicas. Thus, their values should be set to the following (make sure to remove the # at the beginning of the lines):
...
index.number_of_shards: 1
index.number_of_replicas: 0
...
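To see how the two settings interact, it can help to count the shard copies an index ends up with. The sketch below assumes that the replica count applies to each primary shard, so an index holds shards × (1 + replicas) copies in total; total_shard_copies is a hypothetical helper name.

```shell
# total_shard_copies SHARDS REPLICAS
# Each primary shard gets REPLICAS extra copies, so the index holds
# SHARDS * (1 + REPLICAS) shard copies in total.
total_shard_copies() {
    echo $(( $1 * (1 + $2) ))
}
```

With the single-node values above, total_shard_copies 1 0 prints 1. An index with five shards and one replica would instead hold ten shard copies spread across the cluster.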
One final setting which you might be interested in changing is path.data, which determines the path where data is stored. The default path is /var/lib/elasticsearch. In a production environment it's recommended that you use a dedicated partition and mount point for storing Elasticsearch data. In the best case, this dedicated partition will be on a separate storage medium, which will provide better performance and data isolation. You can specify a different path by uncommenting the path.data line and changing its value:
...
path.data: /media/different_media
...
Once you make all the changes, please save and exit the file. Now you can start Elasticsearch for the first time with the command:
- sudo service elasticsearch start
Please allow at least 10 seconds for Elasticsearch to fully start before you are able to use it. Otherwise, you may get errors about not being able to connect.
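Rather than waiting a fixed amount of time, you can poll until Elasticsearch answers. The following is a minimal sketch of a generic retry helper; wait_until is a hypothetical name, and the one-second interval is an arbitrary choice.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-run COMMAND about once per second until it succeeds.
# Returns 0 on success and 1 once the attempts are used up.
wait_until() {
    remaining=$1
    shift
    while ! "$@" >/dev/null 2>&1; do
        remaining=$((remaining - 1))
        [ "$remaining" -le 0 ] && return 1
        sleep 1
    done
    return 0
}
```

For example, wait_until 30 curl -s -o /dev/null http://localhost:9200 keeps trying for roughly 30 seconds, because curl exits with a non-zero status while the connection is refused.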
Step 4 — Securing Elastic
Elasticsearch has no built-in security and can be controlled by anyone who can access the HTTP API. This section is not a comprehensive guide to securing Elasticsearch. Take whatever measures are necessary to prevent unauthorized access to it and the server/virtual machine on which it is running. Consider using iptables to further secure your system.
The first security tweak is to prevent public access. To remove public access, edit the file elasticsearch.yml:
- sudo nano /etc/elasticsearch/elasticsearch.yml
Find the line that contains network.bind_host, uncomment it by removing the # character at the beginning of the line, and change the value to localhost so it looks like this:
...
network.bind_host: localhost
...
Warning: Because Elasticsearch doesn't have any built-in security, it is very important that you do not set this to any IP address that is accessible to any servers that you do not control or trust. Do not bind Elasticsearch to a public or shared private network IP address!
Also, for additional security you can disable dynamic scripts which are used to evaluate custom expressions. By crafting a custom malicious expression, an attacker might be able to compromise your environment.
To disable dynamic scripts, add the following line at the end of the /etc/elasticsearch/elasticsearch.yml file:
...
script.disable_dynamic: true
...
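Since both security settings are easy to leave commented out by accident, a quick check of the file can be useful. This is a minimal sketch; has_setting is a hypothetical helper that only matches an exact, uncommented "key: value" line with no leading or trailing whitespace.

```shell
# has_setting FILE KEY VALUE
# Succeed only if FILE contains the exact, uncommented line "KEY: VALUE".
has_setting() {
    grep -q -x -F "$2: $3" "$1"
}
```

For example, has_setting /etc/elasticsearch/elasticsearch.yml network.bind_host localhost succeeds only when that line is active. Note that configuration changes made after Elasticsearch has started take effect only after a restart, for example with sudo service elasticsearch restart.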
Step 5 — Testing
By now, Elasticsearch should be running on port 9200. You can test it with curl, the command-line tool for client-side URL transfers, and a simple GET request like this:
- curl -X GET 'http://localhost:9200'
You should see the following response:
{
  "status" : 200,
  "name" : "Harry Leland",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.7.2",
    "build_hash" : "e43676b1385b8125d647f593f7202acbd816e8ec",
    "build_timestamp" : "2015-09-14T09:49:53Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
If you see a response similar to the one above, Elasticsearch is working properly. If not, make sure that you have followed the installation instructions correctly and have allowed some time for Elasticsearch to fully start.
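To check the response from a script, the version number can be pulled out of the JSON. The sketch below is a plain text match rather than a real JSON parser, and it assumes the "number" field appears on its own line as in the response above; es_version is a hypothetical helper name.

```shell
# Read the root endpoint's JSON response on standard input
# and print the value of the "number" field.
es_version() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

For example, curl -s 'http://localhost:9200' | es_version prints 1.7.2 for the response shown above.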
Step 6 — Using Elasticsearch
To start using Elasticsearch, let's add some data first. As already mentioned, Elasticsearch uses a RESTful API, which responds to the usual CRUD commands: Create, Read, Update, and Delete. To work with it, we'll use curl again.
You can add your first entry with the command:
- curl -X POST 'http://localhost:9200/tutorial/helloworld/1' -d '{ "message": "Hello World!" }'
You should see the following response:
{"_index":"tutorial","_type":"helloworld","_id":"1","_version":1,"created":true}
With curl, we have sent an HTTP POST request to the Elasticsearch server. The URI of the request was /tutorial/helloworld/1. It's important to understand the parameters here:
- tutorial is the index of the data in Elasticsearch.
- helloworld is the type.
- 1 is the id of our entry under the above index and type.
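The index/type/id structure can be made explicit by building the URL from its parts. This is a small illustrative sketch; doc_url is a hypothetical helper, and the default base URL assumes Elasticsearch is listening on localhost port 9200 as in this tutorial.

```shell
# doc_url INDEX TYPE ID [BASE_URL]
# Print the document URL in the form BASE_URL/INDEX/TYPE/ID.
doc_url() {
    printf '%s/%s/%s/%s\n' "${4:-http://localhost:9200}" "$1" "$2" "$3"
}
```

For example, curl -X GET "$(doc_url tutorial helloworld 1)" requests the entry with id 1 of type helloworld in the tutorial index.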
You can retrieve this first entry with an HTTP GET request like this:
- curl -X GET 'http://localhost:9200/tutorial/helloworld/1'
The result should look like:
{"_index":"tutorial","_type":"helloworld","_id":"1","_version":1,"found":true,"_source":{ "message": "Hello World!" }}
To modify an existing entry you can use an HTTP PUT request like this:
- curl -X PUT 'localhost:9200/tutorial/helloworld/1?pretty' -d '
- {
- "message": "Hello People!"
- }'
Elasticsearch should acknowledge successful modification like this:
{
  "_index" : "tutorial",
  "_type" : "helloworld",
  "_id" : "1",
  "_version" : 2,
  "created" : false
}
In the above example we have modified the message of the first entry to "Hello People!". With that, the version number has been automatically increased to 2.
You may have noticed the extra argument pretty in the above request. It enables a human-readable format in which each data field is printed on its own line. You can also "prettify" your results when retrieving data and get much nicer output like this:
- curl -X GET 'http://localhost:9200/tutorial/helloworld/1?pretty'
Now the response will be in a much better format:
{
  "_index" : "tutorial",
  "_type" : "helloworld",
  "_id" : "1",
  "_version" : 2,
  "found" : true,
  "_source":{ "message": "Hello People!" }
}
So far we have added, retrieved, and modified data in Elasticsearch. To learn about the other operations, please check the API documentation.
Conclusion
That's how easy it is to install, configure, and begin using Elasticsearch. Once you have played enough with manual queries, your next task will be to start using it from your applications.