1. Resource view isolation
Commands such as `top` and `free` executed inside a container read their CPU and memory figures from files under `/proc`. Containers, however, do not isolate the `/proc` and `/sys` filesystems, so the CPU and memory information read inside a container is that of the host, which differs from the resources actually allocated to and limited for the container. The files involved include:
```
/proc/cpuinfo
/proc/diskstats
/proc/meminfo
/proc/stat
/proc/swaps
/proc/uptime
```
To give the container a resource view closer to that of a virtual machine, so that applications see the CPU and memory actually granted to them, the container's real resource information from cgroups has to be mounted over the corresponding files under `/proc` inside the container. `top`, `free`, and similar commands then report the container's real CPU and memory figures.
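The problem is easy to observe before any of the setup below; a minimal sketch, assuming Docker is installed and the host has more than 256 MB of RAM:

```bash
# Limit the container to 256 MB, then look at what free reports inside it.
docker run --rm -m 256m busybox free -m
# Without lxcfs, the "total" column shows the host's memory,
# not the 256 MB cgroup limit applied to the container.
```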
2. Lxcfs overview
lxcfs is a FUSE filesystem that makes a Linux container's filesystem view look more like a virtual machine's. It runs as a resident daemon on the host and automatically maintains the mapping between the real per-container resource information in the host's cgroups and the files under `/proc` inside each container.
The lxcfs command-line usage is as follows:
```
Usage:
lxcfs [-f|-d] -u -l -n [-p pidfile] mountpoint
  -f running foreground by default; -d enable debug output
  -l use loadavg
  -u no swap
  Default pidfile is /run/lxcfs.pid

lxcfs -h
```
lxcfs source code: https://github.com/lxc/lxcfs
3. How lxcfs works
The basic idea behind lxcfs is to read the container-related information out of cgroups and serve it through files under the lxcfs mountpoint, which are then bind-mounted over the container's `/proc`. When a process inside the container reads one of these files, lxcfs inspects the cgroup of the calling process and returns values derived from that cgroup's limits, so `top`, `free`, and similar commands see the CPU and memory that the cgroups actually grant to the container.
Principle diagram
Mapped directories
| Category | Path inside the container | lxcfs path on the host |
| --- | --- | --- |
| CPU | /proc/cpuinfo | /var/lib/lxcfs/proc/cpuinfo |
| Memory | /proc/meminfo | /var/lib/lxcfs/proc/meminfo |
| | /proc/diskstats | /var/lib/lxcfs/proc/diskstats |
| | /proc/stat | /var/lib/lxcfs/proc/stat |
| | /proc/swaps | /var/lib/lxcfs/proc/swaps |
| | /proc/uptime | /var/lib/lxcfs/proc/uptime |
| | /proc/loadavg | /var/lib/lxcfs/proc/loadavg |
| | /sys/devices/system/cpu/online | /var/lib/lxcfs/sys/devices/system/cpu/online |
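The cgroup side of this mapping can be inspected directly on the host; a sketch assuming cgroup v1 and Docker's default cgroup layout, with the container ID picked up into a shell variable:

```bash
# Pick any running container and read the memory limit Docker wrote into its cgroup (bytes).
CID=$(docker ps -q | head -n 1)
cat /sys/fs/cgroup/memory/docker/${CID}*/memory.limit_in_bytes
# lxcfs turns this limit into the MemTotal value that the container reads
# from its bind-mounted /proc/meminfo.
```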
4. Usage
4.1. Installing lxcfs
Prepare the environment:
```bash
yum install -y fuse fuse-lib fuse-devel
```
Build and install from source:
```bash
git clone https://github.com/lxc/lxcfs
cd lxcfs
./bootstrap.sh
./configure
make
make install
```
Or install from an RPM package:
```bash
wget https://copr-be.cloud.fedoraproject.org/results/ganto/lxc3/epel-7-x86_64/01041891-lxcfs/lxcfs-3.1.2-0.2.el7.x86_64.rpm
rpm -ivh lxcfs-3.1.2-0.2.el7.x86_64.rpm --force --nodeps
```
Check that the installation succeeded.
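A quick way to do so, using the `-h` flag from the usage output in section 2:

```bash
which lxcfs   # should print the install path, e.g. /usr/bin/lxcfs or /usr/local/bin/lxcfs
lxcfs -h      # should print the usage text shown earlier
```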
4.2. Running lxcfs
Running lxcfs takes two commands:
```bash
sudo mkdir -p /var/lib/lxcfs
sudo lxcfs /var/lib/lxcfs
```
It can also be run under systemd. The lxcfs.service unit file:
```bash
cat > /usr/lib/systemd/system/lxcfs.service <<EOF
[Unit]
Description=lxcfs

[Service]
ExecStart=/usr/bin/lxcfs -f /var/lib/lxcfs
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
```
Start the service:
```bash
systemctl daemon-reload && systemctl enable lxcfs && systemctl start lxcfs && systemctl status lxcfs
```
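Once the daemon is up, the FUSE mount and the files it serves should be visible on the host; a quick check, assuming the `/var/lib/lxcfs` mountpoint used above:

```bash
mount | grep lxcfs      # expect: lxcfs on /var/lib/lxcfs type fuse.lxcfs (...)
ls /var/lib/lxcfs/proc  # cpuinfo diskstats meminfo stat swaps uptime (loadavg if started with -l)
```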
4.3. Mounting over the /proc files inside the container
```bash
docker run -it --rm -m 256m --cpus 2 \
    -v /var/lib/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw \
    -v /var/lib/lxcfs/proc/diskstats:/proc/diskstats:rw \
    -v /var/lib/lxcfs/proc/meminfo:/proc/meminfo:rw \
    -v /var/lib/lxcfs/proc/stat:/proc/stat:rw \
    -v /var/lib/lxcfs/proc/swaps:/proc/swaps:rw \
    -v /var/lib/lxcfs/proc/uptime:/proc/uptime:rw \
    nginx:latest /bin/sh
```
4.4. Verifying CPU and memory inside the container
```bash
# CPU view
grep -c processor /proc/cpuinfo
cat /proc/cpuinfo

# memory view
free -g
cat /proc/meminfo
```
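With the limits from the `docker run` above (`-m 256m --cpus 2`), the figures should now reflect the cgroup settings rather than the host; a rough sketch of what to expect (exact values vary):

```bash
grep -c processor /proc/cpuinfo   # 2, with lxcfs versions that derive the CPU count from the CFS quota
grep MemTotal /proc/meminfo       # MemTotal:  262144 kB  (i.e. 256 MB)
free -m                           # "total" column around 256, not the host's RAM
```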
5. Deploying on a Kubernetes cluster
Deploying on a Kubernetes cluster follows the same idea as the systemd deployment; two problems need to be solved:
- Run the lxcfs daemon on every node. Since lxcfs is delivered as a container image here, a DaemonSet is the natural way to deploy it.
- Automatically mount the lxcfs-maintained files over the `/proc` files inside each pod.
For a complete implementation, see: https://github.com/denverdino/lxcfs-admission-webhook
5.1. lxcfs-image
Dockerfile
```dockerfile
FROM centos:7 as build
RUN yum -y update
RUN yum -y install fuse-devel pam-devel wget gcc automake autoconf libtool make
ENV LXCFS_VERSION 3.1.2
RUN wget https://linuxcontainers.org/downloads/lxcfs/lxcfs-$LXCFS_VERSION.tar.gz && \
    mkdir /lxcfs && tar xzvf lxcfs-$LXCFS_VERSION.tar.gz -C /lxcfs --strip-components=1 && \
    cd /lxcfs && ./configure && make

FROM centos:7
STOPSIGNAL SIGINT
COPY --from=build /lxcfs/lxcfs /usr/local/bin/lxcfs
COPY --from=build /lxcfs/.libs/liblxcfs.so /usr/local/lib/lxcfs/liblxcfs.so
COPY --from=build /lxcfs/lxcfs /lxcfs/lxcfs
COPY --from=build /lxcfs/.libs/liblxcfs.so /lxcfs/liblxcfs.so
COPY --from=build /usr/lib64/libfuse.so.2.9.2 /usr/lib64/libfuse.so.2.9.2
COPY --from=build /usr/lib64/libulockmgr.so.1.0.1 /usr/lib64/libulockmgr.so.1.0.1
RUN ln -s /usr/lib64/libfuse.so.2.9.2 /usr/lib64/libfuse.so.2 && \
    ln -s /usr/lib64/libulockmgr.so.1.0.1 /usr/lib64/libulockmgr.so.1
COPY start.sh /
CMD ["/start.sh"]
```
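Building and pushing the image is an ordinary `docker build`; a sketch in which the registry and tag are placeholders:

```bash
# Build the two-stage image from the Dockerfile above (start.sh must sit next to it).
docker build -t <your-registry>/lxcfs:3.1.2 .
docker push <your-registry>/lxcfs:3.1.2
```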
start.sh
```bash
#!/bin/bash

# Clean up any stale lxcfs mount and mtab entry left in the host mount namespace.
nsenter -m/proc/1/ns/mnt fusermount -u /var/lib/lxcfs 2> /dev/null || true
nsenter -m/proc/1/ns/mnt [ -L /etc/mtab ] || \
    sed -i "/^lxcfs \/var\/lib\/lxcfs fuse.lxcfs/d" /etc/mtab

# Prepare the directories and copy the freshly built binary and library onto the host
# (both paths are hostPath mounts in the DaemonSet below).
mkdir -p /usr/local/lib/lxcfs /var/lib/lxcfs
cp -f /lxcfs/lxcfs /usr/local/bin/lxcfs
cp -f /lxcfs/liblxcfs.so /usr/local/lib/lxcfs/liblxcfs.so

# Run lxcfs in the host mount namespace so the mount is visible to other containers.
exec nsenter -m/proc/1/ns/mnt /usr/local/bin/lxcfs /var/lib/lxcfs/
```
5.2. DaemonSet
lxcfs-daemonset.yaml
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: lxcfs
  labels:
    app: lxcfs
spec:
  selector:
    matchLabels:
      app: lxcfs
  template:
    metadata:
      labels:
        app: lxcfs
    spec:
      hostPID: true
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: lxcfs
        image: registry.cn-hangzhou.aliyuncs.com/denverdino/lxcfs:3.1.2
        imagePullPolicy: Always
        securityContext:
          privileged: true
        volumeMounts:
        - name: cgroup
          mountPath: /sys/fs/cgroup
        - name: lxcfs
          mountPath: /var/lib/lxcfs
          mountPropagation: Bidirectional
        - name: usr-local
          mountPath: /usr/local
      volumes:
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: usr-local
        hostPath:
          path: /usr/local
      - name: lxcfs
        hostPath:
          path: /var/lib/lxcfs
          type: DirectoryOrCreate
```
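Apply the manifest and check that one lxcfs pod is running per node; the commands below assume the default namespace:

```bash
kubectl apply -f lxcfs-daemonset.yaml
kubectl get pods -l app=lxcfs -o wide   # expect one Running pod per node
```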
lxcfs-admission-webhook
This project implements a dynamic admission webhook, more precisely a mutating webhook: it intercepts pod creation and patches the pod spec, injecting the lxcfs-to-container directory mappings into the pod definition so that the mounts are added automatically.
deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lxcfs-admission-webhook-deployment
  labels:
    app: lxcfs-admission-webhook
spec:
  replicas: 1
  selector:
    matchLabels:
      app: lxcfs-admission-webhook
  template:
    metadata:
      labels:
        app: lxcfs-admission-webhook
    spec:
      containers:
      - name: lxcfs-admission-webhook
        image: registry.cn-hangzhou.aliyuncs.com/denverdino/lxcfs-admission-webhook:v1
        imagePullPolicy: IfNotPresent
        args:
        - -tlsCertFile=/etc/webhook/certs/cert.pem
        - -tlsKeyFile=/etc/webhook/certs/key.pem
        - -alsologtostderr
        - -v=4
        - 2>&1
        volumeMounts:
        - name: webhook-certs
          mountPath: /etc/webhook/certs
          readOnly: true
      volumes:
      - name: webhook-certs
        secret:
          secretName: lxcfs-admission-webhook-certs
```
For the full deployment steps, see install.sh:
```bash
#!/bin/bash

./deployment/webhook-create-signed-cert.sh
kubectl get secret lxcfs-admission-webhook-certs

kubectl create -f deployment/deployment.yaml
kubectl create -f deployment/service.yaml

cat ./deployment/mutatingwebhook.yaml | ./deployment/webhook-patch-ca-bundle.sh > ./deployment/mutatingwebhook-ca-bundle.yaml
kubectl create -f deployment/mutatingwebhook-ca-bundle.yaml
```
Run the script, then confirm that the webhook components came up; a sketch of the checks follows.
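Assuming everything was created in the default namespace:

```bash
./install.sh
kubectl get pods -l app=lxcfs-admission-webhook   # the webhook pod should be Running
kubectl get mutatingwebhookconfigurations         # should list the lxcfs webhook configuration
```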