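Kubernetes Troubleshooting Roundup
Getting fluentd-elasticsearch logging to work surfaced a long list of problems, and working through them taught me a good deal along the way.
1. How to delete an rc, deployment, or service stuck in an inconsistent state
Sometimes the kubectl process hangs, and a subsequent get shows that only some of the resources were deleted while the rest cannot be removed.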
[root@k8s-master ~]# kubectl get -f fluentd-elasticsearch/
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
The following commands delete these deployments, services, or rcs:
kubectl delete deployment kibana-logging -n kube-system --cascade=false
kubectl delete deployment kibana-logging -n kube-system --ignore-not-found
kubectl delete rc elasticsearch-logging-v1 -n kube-system --force --grace-period=0
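The forced delete above follows a fixed pattern, so a small wrapper can make it easier to repeat. The sketch below is only an illustration: `force_delete` and the `DRY_RUN` variable are hypothetical names, not part of kubectl, and the flags are the ones already used above.

```shell
# Hypothetical helper: force-delete one resource, skipping the grace period.
# Usage: force_delete KIND NAME NAMESPACE
# With DRY_RUN set to a non-empty value the command is printed, not executed.
force_delete() {
    kind="$1"
    name="$2"
    namespace="$3"
    ${DRY_RUN:+echo} kubectl delete "$kind" "$name" -n "$namespace" \
        --force --grace-period=0
}
```

Running `DRY_RUN=1 force_delete rc elasticsearch-logging-v1 kube-system` prints the command for review before it is run for real. Keep in mind that a forced delete removes the object from the API without waiting for confirmation that its containers have stopped.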
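2. How to reset etcd when deletion still fails
rm -rf /var/lib/etcd/*
This wipes all cluster state stored in etcd. Reboot the master node afterwards.
After resetting etcd, the network configuration has to be written again:
etcdctl mk /atomic.io/network/config '{ "Network": "192.168.0.0/16" }'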
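Since `rm -rf /var/lib/etcd/*` cannot be undone, it may be worth archiving the data directory first. The following is a minimal sketch under stated assumptions: `backup_and_wipe` is a hypothetical helper, the backup directory is whatever path you pass in (it must not live inside the data directory), and etcd is assumed to have been stopped beforehand.

```shell
# Hypothetical helper: archive a data directory, then empty it.
# Usage: backup_and_wipe DATA_DIR BACKUP_DIR
# Prints the path of the archive that was written.
backup_and_wipe() {
    data_dir="$1"
    backup_dir="$2"
    [ -d "$data_dir" ] || { echo "no such directory: $data_dir" >&2; return 1; }
    mkdir -p "$backup_dir" || return 1
    archive="$backup_dir/etcd-$(date +%Y%m%d-%H%M%S).tar.gz"
    # Archive the directory contents; stop here if tar fails.
    tar -czf "$archive" -C "$data_dir" . || return 1
    # ${data_dir:?} aborts instead of expanding to "/*" if the variable is empty.
    rm -rf "${data_dir:?}"/*
    echo "$archive"
}
```

A possible invocation would be `systemctl stop etcd` followed by `backup_and_wipe /var/lib/etcd /root/etcd-backup`, where `/root/etcd-backup` is just an example location.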
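3. The apiserver fails to start
Every start attempt reports
start request repeated too quickly for kube-apiserver.service
This is not actually a start-frequency problem; the real cause is in /var/log/messages. In my case, enabling ServiceAccount meant the apiserver looked for files such as ca.crt, could not find them, and exited: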
May 21 07:56:41 k8s-master kube-apiserver: Flag --port has been deprecated, see --insecure-port instead.
May 21 07:56:41 k8s-master kube-apiserver: F0521 07:56:41.692480 4299 universal_validation.go:104] Validate server run options failed: unable to load client CA file: open /var/run/kubernetes/ca.crt: no such file or directory
May 21 07:56:41 k8s-master systemd: kube-apiserver.service: main process exited, code=exited, status=255/n/a
May 21 07:56:41 k8s-master systemd: Failed to start Kubernetes API Server.
May 21 07:56:41 k8s-master systemd: Unit kube-apiserver.service entered failed state.
May 21 07:56:41 k8s-master systemd: kube-apiserver.service failed.
May 21 07:56:41 k8s-master systemd: kube-apiserver.service holdoff time over, scheduling restart.
May 21 07:56:41 k8s-master systemd: start request repeated too quickly for kube-apiserver.service
May 21 07:56:41 k8s-master systemd: Failed to start Kubernetes API Server.
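When deploying logging components such as fluentd, many of the problems come from having to enable the ServiceAccount option, which in turn requires security configuration. In the end, ServiceAccount simply has to be set up properly.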
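In the log above, the useful line is the one whose message starts with `F` followed by the date (`F0521 ...`), which is how the apiserver marks a fatal error; the systemd lines around it are only consequences. As a rough sketch, a filter like the following pulls out just those lines. The function name `apiserver_fatal_lines` is hypothetical, and the pattern assumes the syslog layout shown above.

```shell
# Hypothetical helper: print only the fatal kube-apiserver lines from a log file.
# Fatal messages are tagged "F" plus a four-digit month/day, e.g. "F0521".
# Usage: apiserver_fatal_lines LOG_FILE
apiserver_fatal_lines() {
    grep 'kube-apiserver: F[0-9][0-9][0-9][0-9] ' "$1"
}
```

For example, `apiserver_fatal_lines /var/log/messages` would show the `unable to load client CA file` line without the restart noise.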
4. Permission denied errors
While configuring fluentd, the error cannot create /var/log/fluentd.log: Permission denied appeared. This happens because SELinux has not been turned off.
In /etc/selinux/config, change SELINUX=enforcing to SELINUX=disabled, then reboot.
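The edit to /etc/selinux/config can also be scripted. This is a minimal sketch, not a recommended hardening practice: `set_selinux_mode` is a hypothetical helper, and it assumes the file contains a line beginning with `SELINUX=` as in the stock CentOS configuration.

```shell
# Hypothetical helper: rewrite the SELINUX= line of an SELinux config file.
# Usage: set_selinux_mode MODE [CONFIG_FILE]
# CONFIG_FILE defaults to /etc/selinux/config.
set_selinux_mode() {
    mode="$1"
    config="${2:-/etc/selinux/config}"
    tmp="$(mktemp)" || return 1
    # Match only the SELINUX= line; SELINUXTYPE= is left untouched.
    if sed "s/^SELINUX=.*/SELINUX=$mode/" "$config" > "$tmp"; then
        # Overwrite in place so the file keeps its owner and permissions.
        cat "$tmp" > "$config"
    fi
    rm -f "$tmp"
}
```

Calling `set_selinux_mode disabled` as root makes the change that takes effect on the next reboot. To test whether SELinux is really the cause without rebooting, `setenforce 0` switches to permissive mode until the next boot, and `getenforce` shows the current mode.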
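5. ServiceAccount-based configuration
First generate the required keys, replacing k8s-master with the master's hostname.
openssl genrsa -out ca.key 2048
# Self-signed CA certificate; the x509 step below needs ca.crt
openssl req -x509 -new -nodes -key ca.key -subj "/CN=k8s-master" -days 10000 -out ca.crt
# Server key; the CSR step below needs server.key
openssl genrsa -out server.key 2048
echo subjectAltName=IP:10.254.0.1 > extfile.cnf
# The IP is the one printed by the following command:
# kubectl get services --all-namespaces |grep 'default'|grep 'kubernetes'|grep '443'|awk '{print $3}'
openssl req -new -key server.key -subj "/CN=k8s-master" -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -extfile extfile.cnf -out server.crt -days 10000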
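The server certificate is only useful if its subjectAltName really contains the service IP written to extfile.cnf, so it can be worth checking before restarting anything. The sketch below is one possible check: `san_has_ip` is a hypothetical helper that reads the text dump produced by `openssl x509 -noout -text` on standard input and assumes the usual `IP Address:<ip>` rendering of IP entries.

```shell
# Hypothetical helper: succeed if a certificate text dump lists the given IP
# as a subjectAltName entry. Reads `openssl x509 -noout -text` output on stdin.
# Usage: openssl x509 -in CERT -noout -text | san_has_ip IP
san_has_ip() {
    # Put each comma-separated SAN entry on its own line, trim whitespace,
    # then require an exact match so 10.254.0.1 does not match 10.254.0.10.
    tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -Fxq "IP Address:$1"
}
```

With the files generated above, `openssl x509 -in server.crt -noout -text | san_has_ip 10.254.0.1 && echo ok` prints `ok` only when the IP is present.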
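If the parameters are set in the /etc/kubernetes/apiserver configuration file instead, starting with systemctl start kube-apiserver fails with:
Validate server run options failed: unable to load client CA file: open /root/keys/ca.crt: permission denied
The likely cause is that the service does not run as root and so cannot read files under /root/keys. Starting the API server from the command line as root does work: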
/usr/bin/kube-apiserver --logtostderr=true --v=0 \
    --etcd-servers=http://k8s-master:2379 \
    --address=0.0.0.0 --port=8080 --kubelet-port=10250 \
    --allow-privileged=true \
    --service-cluster-ip-range=10.254.0.0/16 \
    --admission-control=ServiceAccount \
    --insecure-bind-address=0.0.0.0 \
    --client-ca-file=/root/keys/ca.crt \
    --tls-cert-file=/root/keys/server.crt \
    --tls-private-key-file=/root/keys/server.key \
    --basic-auth-file=/root/keys/basic_auth.csv \
    --secure-port=443 &>> /var/log/kubernetes/kube-apiserver.log &
Starting the controller-manager from the command line:
/usr/bin/kube-controller-manager --logtostderr=true --v=0 \
    --master=http://k8s-master:8080 \
    --root-ca-file=/root/keys/ca.crt \
    --service-account-private-key-file=/root/keys/server.key &>> /var/log/kubernetes/kube-controller-manage.log &
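6. etcd fails to start
etcd is to a Kubernetes cluster what ZooKeeper is to other distributed systems: almost every service depends on it being up, including flanneld, the apiserver, and docker.
Starting etcd produced the following error log: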
May :: k8s-master systemd: Stopped Flanneld overlay address etcd agent.
May :: k8s-master systemd: Starting Etcd Server...
May :: k8s-master etcd: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://etcd:2379,http://etcd:4001
May :: k8s-master etcd: recognized environment variable ETCD_NAME, but unused: shadowed by corresponding flag
May :: k8s-master etcd: recognized environment variable ETCD_DATA_DIR, but unused: shadowed by corresponding flag
May :: k8s-master etcd: recognized environment variable ETCD_LISTEN_CLIENT_URLS, but unused: shadowed by corresponding flag
May :: k8s-master etcd: etcd Version: 3.1.
May :: k8s-master etcd: Git SHA: 21fdcc6
May :: k8s-master etcd: Go Version: go1.7.4
May :: k8s-master etcd: Go OS/Arch: linux/amd64
May :: k8s-master etcd: setting maximum number of CPUs to , total number of available CPUs is
May :: k8s-master etcd: the server is already initialized as member before, starting as etcd member...
May :: k8s-master etcd: listening for peers on http://localhost:2380
May :: k8s-master etcd: listening for client requests on 0.0.0.0:
May :: k8s-master etcd: listening for client requests on 0.0.0.0:
May :: k8s-master etcd: recovered store from snapshot at index
May :: k8s-master etcd: name = master
May :: k8s-master etcd: data dir = /var/lib/etcd/default.etcd
May :: k8s-master etcd: member dir = /var/lib/etcd/default.etcd/member
May :: k8s-master etcd: heartbeat = 100ms
May :: k8s-master etcd: election = 1000ms
May :: k8s-master etcd: snapshot count =
May :: k8s-master etcd: advertise client URLs = http://etcd:2379,http://etcd:4001
May :: k8s-master etcd: ignored file -.wal.broken in wal
May :: k8s-master etcd: restarting member 8e9e05c52164694d in cluster cdf818194e3a8c32 at commit index
May :: k8s-master etcd: 8e9e05c52164694d became follower at term
May :: k8s-master etcd: newRaft 8e9e05c52164694d [peers: [8e9e05c52164694d], term: , commit: , applied: , lastindex: , lastterm: ]
May :: k8s-master etcd: enabled capabilities for version 3.1
May :: k8s-master etcd: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32 from store
May :: k8s-master etcd: set the cluster version to 3.1 from store
May :: k8s-master etcd: starting server... [version: 3.1., cluster version: 3.1]
May :: k8s-master etcd: raft save state and entries error: open /var/lib/etcd/default.etcd/member/wal/.tmp: is a directory
May :: k8s-master systemd: etcd.service: main process exited, code=exited, status=/FAILURE
May :: k8s-master systemd: Failed to start Etcd Server.
May :: k8s-master systemd: Unit etcd.service entered failed state.
May :: k8s-master systemd: etcd.service failed.
May :: k8s-master systemd: etcd.service holdoff time over, scheduling restart.
The key line is
raft save state and entries error: open /var/lib/etcd/default.etcd/member/wal/0.tmp: is a directory
Go into that directory and delete 0.tmp, after which etcd starts normally.
7. Setting up SSH trust between hosts on CentOS
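- On every server, as the user that needs the trust relationship, run the following command to generate a public/private key pair, pressing Enter to accept the defaults
ssh-keygen -t rsa
This generates a public key file.
- Exchange public keys; a password is required the first time and not afterwards
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.199.132 (-p 2222)
-p sets the port. Omit it for the default port; if the SSH port has been changed, -p is required.
On the target server this creates an authorized_keys file under .ssh/, which records the public keys of the other servers allowed to log in.
- Test that login works
ssh 192.168.199.132 (-p 2222)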
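With more than a couple of nodes, the ssh-copy-id step gets repetitive. A small loop is one way to handle it; in this sketch `copy_key_to_hosts` and `DRY_RUN` are hypothetical names, and the key path and root user are taken from the example above.

```shell
# Hypothetical helper: push root's public key to several hosts on one port.
# Usage: copy_key_to_hosts PORT HOST [HOST...]
# With DRY_RUN set to a non-empty value the commands are printed, not executed.
copy_key_to_hosts() {
    port="$1"
    shift
    for host in "$@"; do
        ${DRY_RUN:+echo} ssh-copy-id -i /root/.ssh/id_rsa.pub -p "$port" "root@$host"
    done
}
```

For instance, `DRY_RUN=1 copy_key_to_hosts 2222 192.168.199.132` prints the command that would run; dropping `DRY_RUN=1` executes it and prompts for each host's password once.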
8. Changing the hostname on CentOS
hostnamectl set-hostname k8s-master1
9. Enabling copy and paste for CentOS in VirtualBox
If a package is not installed, or a command produces no output, change update to install and run it again.
yum update
yum update kernel
yum update kernel-devel
yum install kernel-headers
yum install gcc make
Once these finish, run sh VBoxLinuxAdditions.run
10. A deleted Pod stays in the Terminating state
It can be force-deleted with:
kubectl delete pod NAME --grace-period=0 --force
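Before force-deleting, it helps to see which pods are actually stuck. The filter below is a rough sketch: `terminating_pods` is a hypothetical helper, and it assumes the column layout printed by `kubectl get pods --all-namespaces` (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: list "namespace name" for every pod whose STATUS column
# is Terminating. Reads `kubectl get pods --all-namespaces` output on stdin.
terminating_pods() {
    # NR > 1 skips the header row; column 4 is STATUS in this layout.
    awk 'NR > 1 && $4 == "Terminating" { print $1, $2 }'
}
```

Used as `kubectl get pods --all-namespaces | terminating_pods`, it prints one stuck pod per line, which can then be passed to the delete command above together with `-n NAMESPACE`.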