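Preface
This is a loosely organized post written while I worked, so it has some reference value but is not a tutorial. Please do not repeat my steps one by one; read the whole post first and then decide what to keep. The error output can be skimmed; pick out the parts that are useful to you.
Installing the monitoring server (dashboard)
At first the installation kept failing, reporting that there was no TencentOS branch: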
Status code: 404 for http://mirrors.tencentyun.com/tlinux/3/TencentOS/x86_64/repodata/repomd.xml (IP: *.*.*.*)
Error: Failed to download metadata for repo 'TencentOS': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
After adding a Docker mirror repository myself, I got this error:
Errors during downloading metadata for repository 'docker-ce-stable':
- Status code: 404 for https://mirrors.cloud.tencent.com/docker-ce/linux/centos/3.1/x86_64/stable/repodata/repomd.xml (IP: *.*.*.*)
Error: Failed to download metadata for repo 'docker-ce-stable': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
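The `3.1` in the failing URL looks like the OS release version substituted into the repository's base URL; that is my reading of the log, not something the error message states. Below is a minimal sketch for checking whether a mirror serves metadata for a given release before adding the repository. `repomd_url` and `repo_http_status` are hypothetical helper names, and the URL layout is copied from the error above.

```shell
#!/usr/bin/env bash
# Build the repomd.xml URL that yum would request for a repository.
# Arguments: base URL, release version, architecture, channel.
repomd_url() {
    local base="$1" releasever="$2" arch="$3" channel="$4"
    printf '%s/%s/%s/%s/repodata/repomd.xml\n' "$base" "$releasever" "$arch" "$channel"
}

# Print only the HTTP status code for a URL (needs curl and network access).
repo_http_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$1"
}
```

For example, `repo_http_status "$(repomd_url https://mirrors.cloud.tencent.com/docker-ce/linux/centos 3.1 x86_64 stable)"` prints the status code; a 404, as in the log above, means the mirror has no directory for that release.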
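Then I found an issue in the TencentOS repository that saved me: "docker 安装问题" (Docker installation problem).
I followed it and got more errors, and only then realized that I had broken my yum repositories. Go to the yum repository configuration directory: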
cd /etc/yum.repos.d
ls
Sure enough, there was a repo file for docker-ce-stable. I deleted it with rm and then ran:
yum update
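A minimal sketch for locating the offending file rather than spotting it by eye in the `ls` output. `repo_files_defining` is a hypothetical helper; it assumes the repository id appears as a section header on a line of its own, such as `[docker-ce-stable]`.

```shell
#!/usr/bin/env bash
# List the .repo files in a directory that define the given repository id.
# Matches the exact section header line, e.g. "[docker-ce-stable]".
repo_files_defining() {
    local dir="$1" repo_id="$2"
    grep -l -F -x "[$repo_id]" "$dir"/*.repo 2>/dev/null
}
```

Running `repo_files_defining /etc/yum.repos.d docker-ce-stable` prints the matching file paths. Renaming a file so that it no longer ends in `.repo` should disable it just as well as deleting it, and is easier to undo.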
After that I followed the GitHub issue mentioned above:
yum -y install tencentos-release-docker-ce
yum -y install docker-ce
Then I ran the script again:
sudo ./nezha.sh
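It no longer complained that it could not connect to Docker.
Installing the dashboard
I followed the instructions and had no problems up to this point, which was a pleasant surprise.
Add the current user to the docker group, then switch users so the change takes effect: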
sudo gpasswd -a ${USER} docker
sudo su
su ${USER}
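Group membership is only picked up by new login sessions, which is why the commands above switch users. A minimal sketch for confirming that the current shell actually has the group; `has_group` is a hypothetical helper that checks a space-separated group list.

```shell
#!/usr/bin/env bash
# Succeed if the space-separated group list contains the given group name.
has_group() {
    local groups="$1" wanted="$2" g
    for g in $groups; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}
```

Usage: `has_group "$(id -nG)" docker && echo "docker group is active in this shell"`. Calling `id -nG` without a user name reports the groups of the running shell, which is what matters here.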
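After installing docker-compose I started the dashboard and got an error (the first line of the output, "启动面板", means "starting the dashboard"; the last line, "启动失败,请稍后查看日志信息", means "startup failed, please check the logs later"):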
> 启动面板
Traceback (most recent call last):
File "urllib3/connectionpool.py", line 677, in urlopen
File "urllib3/connectionpool.py", line 392, in _make_request
File "http/client.py", line 1277, in request
File "http/client.py", line 1323, in _send_request
File "http/client.py", line 1272, in endheaders
File "http/client.py", line 1032, in _send_output
File "http/client.py", line 972, in send
File "docker/transport/unixconn.py", line 43, in connect
FileNotFoundError: [Errno 2] No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "requests/adapters.py", line 449, in send
File "urllib3/connectionpool.py", line 727, in urlopen
File "urllib3/util/retry.py", line 410, in increment
File "urllib3/packages/six.py", line 734, in reraise
File "urllib3/connectionpool.py", line 677, in urlopen
File "urllib3/connectionpool.py", line 392, in _make_request
File "http/client.py", line 1277, in request
File "http/client.py", line 1323, in _send_request
File "http/client.py", line 1272, in endheaders
File "http/client.py", line 1032, in _send_output
File "http/client.py", line 972, in send
File "docker/transport/unixconn.py", line 43, in connect
urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "docker/api/client.py", line 214, in _retrieve_server_version
File "docker/api/daemon.py", line 181, in version
File "docker/utils/decorators.py", line 46, in inner
File "docker/api/client.py", line 237, in _get
File "requests/sessions.py", line 543, in get
File "requests/sessions.py", line 530, in request
File "requests/sessions.py", line 643, in send
File "requests/adapters.py", line 498, in send
requests.exceptions.ConnectionError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "docker-compose", line 3, in <module>
File "compose/cli/main.py", line 81, in main
File "compose/cli/main.py", line 200, in perform_command
File "compose/cli/command.py", line 70, in project_from_options
File "compose/cli/command.py", line 153, in get_project
File "compose/cli/docker_client.py", line 43, in get_client
File "compose/cli/docker_client.py", line 170, in docker_client
File "docker/api/client.py", line 197, in __init__
File "docker/api/client.py", line 222, in _retrieve_server_version
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
[2064673] Failed to execute script docker-compose
启动失败,请稍后查看日志信息
I rebooted, started it again, and got another error:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
I had forgotten to enable Docker at boot:
sudo systemctl enable docker.service
sudo systemctl enable containerd.service
Then start Docker:
service docker start
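Finally, it started successfully.
Adding a reverse proxy
WebSocket traffic has to be proxied as well; otherwise real-time monitoring does not work.
Create a new site in the BaoTa panel with the following reverse proxy configuration.
I worked this configuration out on my own; there is very likely a better one, but I am not familiar with this area.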
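Both errors above, the `FileNotFoundError` raised from `docker/transport/unixconn.py` and the "Cannot connect to the Docker daemon" message, are consistent with the daemon not running, in which case its Unix socket does not exist. A minimal sketch for checking that first; `docker_socket_present` is a hypothetical helper, and the default path is the one from the error message.

```shell
#!/usr/bin/env bash
# Succeed if a Unix socket exists at the given path
# (defaults to the path shown in the Docker error message).
docker_socket_present() {
    local sock="${1:-/var/run/docker.sock}"
    [ -S "$sock" ]
}
```

Usage: `docker_socket_present || sudo systemctl enable --now docker.service`, which enables the service at boot and starts it in one step.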
location /
{
    proxy_pass http://127.0.0.1:8008;
    proxy_set_header Host $host;
}
location /ws
{
    proxy_pass http://127.0.0.1:8008;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
    proxy_set_header Host $host;
}
location /terminal
{
    proxy_pass http://127.0.0.1:8008;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
    proxy_set_header Host $host;
}
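To see whether the proxy really passes the WebSocket upgrade through, one option is to send a handshake request by hand and look for a `101` status. This is a sketch under assumptions: `is_ws_upgrade` and `ws_handshake_headers` are hypothetical helpers, the key is the sample nonce from RFC 6455, and the dashboard may answer differently if the endpoint needs a session.

```shell
#!/usr/bin/env bash
# Read HTTP response headers on stdin; succeed if the status is 101.
is_ws_upgrade() {
    local status_line
    IFS= read -r status_line || [ -n "$status_line" ]
    case "$status_line" in
        HTTP/*" 101"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Send a WebSocket handshake to the URL and print the response headers.
# Gives up after 5 seconds, since an upgraded connection stays open.
ws_handshake_headers() {
    curl -s -o /dev/null -D - --max-time 5 --http1.1 \
        -H 'Connection: Upgrade' -H 'Upgrade: websocket' \
        -H 'Sec-WebSocket-Version: 13' \
        -H 'Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==' \
        "$1"
}
```

Usage, with a placeholder domain: `ws_handshake_headers http://example.com/ws | is_ws_upgrade && echo "upgrade passed through"`.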
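Installing the agent
The first agent installed quickly, since in that case the monitoring server and the agent run on the same machine.
When I installed the agent on a second server, it never came online no matter what I tried.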
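An nmap port scan showed that the port was actually closed.
Checking in BaoTa showed that the port was not in use, meaning no program was listening on it.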
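The same check can be made from the command line. A minimal sketch that reads `ss -ltn` output and reports whether anything listens on a port; `listening_on` is a hypothetical helper and assumes the usual column layout, with the local address in the fourth field.

```shell
#!/usr/bin/env bash
# Read `ss -ltn` output on stdin; succeed if some socket listens on the port.
listening_on() {
    awk -v suffix=":$1" '
        # Compare the end of the local address field with ":<port>".
        $1 == "LISTEN" && substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
        END { exit !found }
    '
}
```

Usage: `ss -ltn | listening_on 5555 && echo "something is listening on 5555"`. A closed port on the agent's own machine is not necessarily a fault, because the agent connects out to the address given with `-s` rather than listening itself.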
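The logs, however, said the agent had started successfully.
When in doubt, reboot. That did not fix it.
I then suspected SELinux, but Ubuntu does not have it. Never mind: I installed it and then disabled it to see whether that would help. Started the agent again; still no luck.
So I ran it by hand:
/opt/nezha/agent/nezha-agent -s SERVER2_IP:5555 -p AGENT_SECRET -d
Still not working, but at least there was a clear error now ("上报系统信息失败" means "failed to report system information"):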
➜ admin /opt/nezha/agent/nezha-agent -s SERVER2_IP:5555 -p AGENT_SECRET -d
NEZHA@2021-11-22 19:39:29>> 检查更新: 0.11.6
NEZHA@2021-11-22 19:39:29>> 上报系统信息失败: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp SERVER2_IP:5555: connect: connection refused"
NEZHA@2021-11-22 19:39:29>> Error to close connection ...
NEZHA@2021-11-22 19:39:39>> Try to reconnect ...
NEZHA@2021-11-22 19:39:39>> 上报系统信息失败: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp SERVER2_IP:5555: connect: connection refused"
NEZHA@2021-11-22 19:39:39>> Error to close connection ...
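A Baidu search turned up similar problems that were caused by a firewall and fixed by turning it off. So, on the server that was working correctly, I ran: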
systemctl status firewalld.service
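The firewall there was running normally, which puzzled me. The failing server runs Ubuntu, so I ran sudo ufw status verbose and found that port 5555 was open.
The next day I had a sudden idea: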
➜ admin docker
zsh: command not found: docker
Docker was not installed. (In fact, the agent does not need Docker.) I installed it and ran the agent again, with the same error:
➜ ~ /opt/nezha/agent/nezha-agent -s SERVER2_IP:5555 -p AGENT_SECRET -d
NEZHA@2021-11-23 18:53:14>> 检查更新: 0.11.6
NEZHA@2021-11-23 18:53:14>> 上报系统信息失败: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp SERVER2_IP:5555: connect: connection refused"
NEZHA@2021-11-23 18:53:14>> Error to close connection ...
Eventually it dawned on me: the address should point at the dashboard!
So the command should not be /opt/nezha/agent/nezha-agent -s SERVER2_IP:5555 -p AGENT_SECRET -d
but /opt/nezha/agent/nezha-agent -s DASHBOARD_IP:5555 -p AGENT_SECRET -d
Sure enough, it ran normally! What a fool I was, wasting so much time in a place where nothing should have gone wrong.
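A "connection refused" error means that nothing accepted the connection at the address the agent dialed, so it helps to test that address from the agent's machine before starting the agent. A minimal sketch, assuming bash and the coreutils `timeout` command are available; `port_open` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Succeed if a TCP connection to host:port can be opened within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so it is run through bash explicitly.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

Usage, with placeholders for the real address and secret: `port_open DASHBOARD_IP 5555 && /opt/nezha/agent/nezha-agent -s DASHBOARD_IP:5555 -p AGENT_SECRET -d`.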
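Summary
Installing the monitoring server (dashboard)
- Install Docker beforehand (if your system is not a mainstream distribution).
- After that, just follow the prompts; it is quick and easy.
Installing the agent
- Follow the prompts; it is simple.
- The domain/IP must be that of the server hosting the dashboard, and it must not be behind a CDN.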