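Prometheus in Practice: Building a Highly Available Monitoring Platform from Scratch (Part 1)
Today's internet applications are increasingly complex and involve a growing number of components and services, all of which need efficient, reliable monitoring to keep the system stable and performant. Prometheus is a powerful open-source monitoring system that collects multi-dimensional metrics in real time and offers a powerful query language and alerting mechanism. It is one of the most widely used monitoring systems for cloud-native applications.
This series, "Prometheus in Practice: Building a Highly Available Monitoring Platform from Scratch", walks you step by step through building a highly available Prometheus monitoring platform. It covers:
1. Prometheus cluster setup: a highly available and scalable monitoring system
2. Dynamic monitoring targets: automatic discovery and registration of the targets to monitor
3. Alerting configuration: flexible alert rules, grouping, filtering, and inhibition, with real-time notification of abnormal conditions
4. Grafana visualization: an at-a-glance view of system status and trends
The series is aimed at system administrators and DevOps engineers with some knowledge of Linux and networking. After working through it, you will understand the core concepts and practical techniques of Prometheus and be able to quickly build an efficient, reliable monitoring platform for managing and maintaining complex internet applications.
This article covers installing a highly available, scalable Prometheus cluster based on Thanos.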
Environment
The later articles in this series are all produced in this environment.
Hosts, operating system, and software (services)
// Host information
10.2.0.6  Q-gz-common-prod-thanos-001 node-1
10.2.0.10 Q-gz-common-prod-thanos-002 node-2
10.2.0.41 Q-gz-common-prod-thanos-003 node-3
// OS version
CentOS Linux release 7.9.2009 (Core)
// Software and services
Consul
ConsulManager
Docker
Prometheus
Alertmanager
PrometheusAlert
MySQL
Thanos
Haproxy
Keepalived
Lsyncd
Grafana
COS
Operations scripts
Deployment plan
// Services per node
Services on node-1: Consul, Grafana, Lsyncd, Keepalived, Haproxy, Thanos, ConsulManager, Docker
Services on node-2: Consul, Prometheus, Alertmanager, PrometheusAlert, Lsyncd, Keepalived, Haproxy, Thanos
Services on node-3: Consul, Prometheus, Alertmanager, PrometheusAlert, MySQL, Thanos
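The Consul configuration later in this article joins peers by the short names node-1, node-2, and node-3, so every host has to resolve those names. The following is a minimal sketch, assuming the names are not already served by DNS, of adding the host lines above to a hosts file without creating duplicates. The function name add_host_entry is hypothetical, and the demonstration writes to a temporary file rather than to /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME...
# Appends "IP NAME..." to FILE unless FILE already has a line whose first
# field is exactly IP (so 10.2.0.4 is not confused with 10.2.0.41).
add_host_entry() (
    file=$1; ip=$2; shift 2
    if ! awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$*" >> "$file"
    fi
)

# Demonstration on a scratch file; point it at /etc/hosts only after review.
hosts_file=$(mktemp)
add_host_entry "$hosts_file" 10.2.0.6  Q-gz-common-prod-thanos-001 node-1
add_host_entry "$hosts_file" 10.2.0.10 Q-gz-common-prod-thanos-002 node-2
add_host_entry "$hosts_file" 10.2.0.41 Q-gz-common-prod-thanos-003 node-3
add_host_entry "$hosts_file" 10.2.0.6  Q-gz-common-prod-thanos-001 node-1   # second call is a no-op
cat "$hosts_file"
rm -f "$hosts_file"
```

Running the same call twice leaves the file unchanged, which makes the step safe to repeat on all three nodes.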
Call topology
Deploying the Consul cluster
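Consul is a distributed service discovery and configuration management tool for registering, discovering, and configuring services. It offers a simple yet powerful way to find and manage services and helps keep service instances available. Consul also provides health checks, failover, and distributed consistency, which let applications manage their own service instances automatically.
On node-1, node-2, and node-3:
yum install -y yum-utils
yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
yum install consul consul-template -y
mkdir -p /data/consul/{config,data,log} && chown -R consul:consul /data/consul
Some of the parameters used in the configuration files below:
log_rotate_max_files: the maximum number of old log file archives to keep. Defaults to 0 (no files are ever deleted). Set it to -1 to discard old log files when a new one is created.
log_rotate_duration: the maximum time a log is written to before it must be rotated. It must be a duration value such as 30s. Defaults to 24h.
log_file: writes all Consul agent log messages to a file. The value is used as a prefix for the log file name, and the current timestamp is appended to it. If the value ends with a path separator, consul- is appended to it. If the file name has no extension, .log is appended. For example, setting log_file to /var/log/ results in the log file path /var/log/consul-{timestamp}.log. log_file can be combined with -log-rotate-bytes and -log-rotate-duration for fine-grained log rotation.
retry_join: the list of addresses of the nodes to join. If joining fails, the agent retries automatically until it succeeds.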
node-1
[root@Q-gz-common-prod-thanos-001 ~]# consul keygen
xxxxx=
Use the generated key as the encrypt value in the configuration on all three nodes.
cat > /usr/lib/systemd/system/consul.service << 'EOF'
# Consul systemd service unit file
[Unit]
Description=Consul Service Discovery Agent
Documentation=https://www.consul.io/
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=consul
Group=consul
ExecStart=/usr/bin/consul agent \
    -node=node-1 \
    -config-dir=/data/consul/config
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
LimitNOFILE=10240
LimitNPROC=10240
[Install]
WantedBy=multi-user.target
EOF
cat > /data/consul/config/config.json <<EOF
{
    "advertise_addr": "10.2.0.6",
    "bind_addr": "10.2.0.6",
    "bootstrap_expect": 3,
    "client_addr": "0.0.0.0",
    "datacenter": "guangzhou",
    "data_dir": "/data/consul/data",
    "domain": "consul",
    "enable_script_checks": true,
    "dns_config": {
        "enable_truncate": true,
        "only_passing": true
    },
    "log_json": true,
    "log_level": "error",
    "log_rotate_max_files": 10,
    "log_rotate_duration": "24h",
    "log_file": "/data/consul/log/",
    "encrypt": "xxxxx=",
    "leave_on_terminate": true,
    "rejoin_after_leave": true,
    "retry_join": [
        "node-1",
        "node-2",
        "node-3"
    ],
    "server": true,
    "start_join": [
        "node-1",
        "node-2",
        "node-3"
    ],
    "acl": {
        "enabled": true,
        "default_policy": "deny",
        "enable_token_persistence": true
    },
    "ui_config": {
        "enabled": true
    }
}
EOF
systemctl enable consul
systemctl start consul
node-2
cat > /usr/lib/systemd/system/consul.service << 'EOF'
# Consul systemd service unit file
[Unit]
Description=Consul Service Discovery Agent
Documentation=https://www.consul.io/
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=consul
Group=consul
ExecStart=/usr/bin/consul agent \
    -node=node-2 \
    -config-dir=/data/consul/config
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
LimitNOFILE=10240
LimitNPROC=10240
[Install]
WantedBy=multi-user.target
EOF
cat > /data/consul/config/config.json <<EOF
{
    "advertise_addr": "10.2.0.10",
    "bind_addr": "10.2.0.10",
    "bootstrap_expect": 3,
    "client_addr": "0.0.0.0",
    "datacenter": "guangzhou",
    "data_dir": "/data/consul/data",
    "domain": "consul",
    "enable_script_checks": true,
    "dns_config": {
        "enable_truncate": true,
        "only_passing": true
    },
    "log_json": true,
    "log_level": "error",
    "log_rotate_max_files": 10,
    "log_rotate_duration": "24h",
    "log_file": "/data/consul/log/",
    "encrypt": "xxxxx=",
    "leave_on_terminate": true,
    "rejoin_after_leave": true,
    "retry_join": [
        "node-1",
        "node-2",
        "node-3"
    ],
    "server": true,
    "start_join": [
        "node-1",
        "node-2",
        "node-3"
    ],
    "acl": {
        "enabled": true,
        "default_policy": "deny",
        "enable_token_persistence": true
    },
    "ui_config": {
        "enabled": true
    }
}
EOF
systemctl enable consul
systemctl start consul
Getting the login token
Prometheus needs this token later for dynamic service discovery.
[root@Q-gz-common-prod-thanos-001 ~]# consul acl bootstrap
AccessorID:       6708400c-3a06-62af-6bd7-fae9b1fd9235
SecretID:         63961456-9521-xxxx-bfd8-8a386162bb81
Description:      Bootstrap Token (Global Management)
Local:            false
Create Time:      2022-08-14 17:00:52.323737575 +0800 CST
Policies:
   00000000-0000-0000-0000-000000000001 - global-management
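Because consul acl bootstrap succeeds only once per cluster, it is worth capturing the SecretID the first time the command runs. The sketch below shows one way to pull that field out of output shaped like the listing above. The helper name extract_secret_id is hypothetical, and the sample text is the masked output from this article, not a live call.

```shell
#!/bin/sh
# extract_secret_id: read `consul acl bootstrap` output on stdin and
# print the value of the first "SecretID:" line.
extract_secret_id() {
    awk '$1 == "SecretID:" { print $2; exit }'
}

# Sample input copied from the (masked) output shown in this article.
sample_output='AccessorID:       6708400c-3a06-62af-6bd7-fae9b1fd9235
SecretID:         63961456-9521-xxxx-bfd8-8a386162bb81
Description:      Bootstrap Token (Global Management)'

printf '%s\n' "$sample_output" | extract_secret_id
# prints: 63961456-9521-xxxx-bfd8-8a386162bb81
```

On a real cluster the same function could sit at the end of a pipe, for example consul acl bootstrap | extract_secret_id, with the result stored in a file readable only by root.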
ConsulManager
ConsulManager is a tool for managing the configuration and state of a Consul cluster. It makes it easy to manage every aspect of the cluster, including service discovery, health checks, KV storage, DNS, and ACLs.
On node-1:
mkdir /usr/local/consulmanager/
cd /usr/local/consulmanager/
wget https://starsl.cn/static/img/docker-compose.yml
Edit docker-compose.yml and change three environment variables:
consul_token: the Consul login token (the SecretID returned by consul acl bootstrap above)
consul_url: the Consul URL (it must start with http, and the trailing /v1 must be kept)
admin_passwd: the admin password for logging in to the ConsulManager web UI
[root@Q-gz-common-prod-thanos-001 consulmanager]# cat docker-compose.yml
version: "3.2"
services:
  flask-consul:
    image: registry.cn-shenzhen.aliyuncs.com/starsl/flask-consul:latest
    container_name: flask-consul
    hostname: flask-consul
    restart: always
    volumes:
      - /usr/share/zoneinfo/PRC:/etc/localtime
    environment:
      consul_token: 63961456-9521-xxxx-bfd8-8a386162bb81
      consul_url: http://10.2.0.6:8500/v1
      admin_passwd: xxxx
  nginx-consul:
    image: registry.cn-shenzhen.aliyuncs.com/starsl/nginx-consul:latest
    container_name: nginx-consul
    hostname: nginx-consul
    restart: always
    ports:
      - "1026:1026"
    volumes:
      - /usr/share/zoneinfo/PRC:/etc/localtime
Start: docker-compose pull && docker-compose up -d
Access: open http://{IP}:1026 and log in with the ConsulManager admin password you configured.
Deploying Thanos
Thanos simplifies deploying and managing distributed Prometheus and adds several advanced features: a global query view, long-term storage, and high availability. Its components communicate with each other over gRPC. The sidecar component connects to Prometheus, exposes its data to Thanos Query, and/or uploads it to object storage for long-term retention.
Each cluster corresponds to a unique tenant ID, and the tenant label distinguishes the metrics of different clusters. If a tenant creates a new cluster, it only needs to deploy Prometheus in the new cluster and configure remote write.
Because of improvements to Prometheus remote read, Prometheus v2.13+ is strongly recommended.
// Run the following on all three nodes
cd /usr/local/src/
wget https://github.com/thanos-io/thanos/releases/download/v0.27.0/thanos-0.27.0.linux-amd64.tar.gz
tar xf thanos-0.27.0.linux-amd64.tar.gz -C /usr/local/
ln -sv /usr/local/thanos-0.27.0.linux-amd64 /usr/local/thanos
echo 'export PATH=$PATH:/usr/local/thanos' > /etc/profile.d/thanos.sh
source /etc/profile
groupadd -r thanos
useradd -r -g thanos -s /sbin/nologin -c "thanos Daemons" thanos
mkdir -p /usr/local/thanos/{store,query,receive,compact}
mkdir -p /data/thanos/{store,query,receive,compact}
chown -R thanos:thanos /data/thanos
chown -R thanos:thanos /usr/local/thanos-0.27.0.linux-amd64
Components and how to manage them
COS bucket configuration
Thanos can use an object storage service (such as a COS bucket) to store Prometheus data and the metadata generated by Thanos components, which provides high availability and long-term storage. Using a COS bucket as the data store offers the following:
Long-term storage: Prometheus data kept in a COS bucket remains available for later querying and analysis.
Scalability: a COS bucket can hold large volumes of data, and its capacity grows on demand as the data grows.
Data backup: backing up Prometheus data to a COS bucket protects it against loss or failure and the business impact that would follow.
High availability: using multiple COS buckets as Thanos storage backends allows multi-replica, redundant storage, which improves data availability and fault tolerance.
cat > /usr/local/thanos/cos_bucket.yaml <<EOF
type: COS
config:
  bucket: "prod-thanos-1001060"
  region: "ap-guangzhou"
  app_id: ""
  endpoint: "https://prod-thanos-1001060.cos.ap-guangzhou.myqcloud.com"
  secret_key: "xxxxxkey"
  secret_id: "xxxxxscr"
  http_config:
    idle_conn_timeout: 1m30s
    response_header_timeout: 2m
    insecure_skip_verify: false
    tls_handshake_timeout: 10s
    expect_continue_timeout: 1s
    max_idle_conns: 100
    max_idle_conns_per_host: 100
    max_conns_per_host: 0
    tls_config:
      ca_file: ""
      cert_file: ""
      key_file: ""
      server_name: ""
      insecure_skip_verify: false
    disable_compression: false
prefix: ""
EOF
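A placeholder left empty in the bucket file tends to surface only later, when a Thanos component fails to start. As a rough pre-flight check, the sketch below greps the file for the four keys this article fills in. The function name check_objstore_config and the choice of keys are assumptions made for illustration. This is a text match, not a YAML parser, and it is not Thanos's own validation.

```shell
#!/bin/sh
# check_objstore_config FILE
# Prints "missing or empty: KEY" for each expected key that is absent or set
# to an empty string, and exits non-zero if any were reported.
check_objstore_config() (
    file=$1; missing=0
    for key in bucket region secret_id secret_key; do
        # indented key, colon, optional quote, then at least one value character
        if ! grep -Eq "^[[:space:]]+${key}:[[:space:]]*\"?[^\"[:space:]]" "$file"; then
            echo "missing or empty: ${key}"
            missing=1
        fi
    done
    exit "$missing"
)

# Demonstration on a scratch copy that uses the same placeholders as the article.
sample=$(mktemp)
cat > "$sample" <<'EOF'
type: COS
config:
  bucket: "prod-thanos-1001060"
  region: "ap-guangzhou"
  secret_key: "xxxxxkey"
  secret_id: "xxxxxscr"
EOF
if check_objstore_config "$sample"; then
    echo "objstore config looks complete"
fi
rm -f "$sample"
```

Since the file holds credentials, it is also reasonable to give it to the thanos user (chown thanos:thanos) and restrict it with chmod 600, provided every Thanos unit that reads it runs as that user or as root.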
node-3 (Consul agent; same steps as for node-1 and node-2 in the Consul cluster deployment above)
cat > /usr/lib/systemd/system/consul.service << 'EOF'
# Consul systemd service unit file
[Unit]
Description=Consul Service Discovery Agent
Documentation=https://www.consul.io/
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=consul
Group=consul
ExecStart=/usr/bin/consul agent \
    -node=node-3 \
    -config-dir=/data/consul/config
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
LimitNOFILE=10240
LimitNPROC=10240
[Install]
WantedBy=multi-user.target
EOF
cat > /data/consul/config/config.json <<EOF
{
    "advertise_addr": "10.2.0.41",
    "bind_addr": "10.2.0.41",
    "bootstrap_expect": 3,
    "client_addr": "0.0.0.0",
    "datacenter": "guangzhou",
    "data_dir": "/data/consul/data",
    "domain": "consul",
    "enable_script_checks": true,
    "dns_config": {
        "enable_truncate": true,
        "only_passing": true
    },
    "log_json": true,
    "log_level": "error",
    "log_rotate_max_files": 10,
    "log_rotate_duration": "24h",
    "log_file": "/data/consul/log/",
    "encrypt": "xxxxx=",
    "leave_on_terminate": true,
    "rejoin_after_leave": true,
    "retry_join": [
        "node-1",
        "node-2",
        "node-3"
    ],
    "server": true,
    "start_join": [
        "node-1",
        "node-2",
        "node-3"
    ],
    "acl": {
        "enabled": true,
        "default_policy": "deny",
        "enable_token_persistence": true
    },
    "ui_config": {
        "enabled": true
    }
}
EOF
systemctl enable consul
systemctl start consul
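All three nodes run Consul in server mode with bootstrap_expect set to 3, so the servers wait for each other before electing a leader. Consul servers use Raft consensus, which needs a majority of servers (a quorum) to stay available. The arithmetic below is a small sketch of what that means for cluster sizes such as the three-node one in this article. The function names are hypothetical.

```shell
#!/bin/sh
# quorum N: smallest majority of N servers, floor(N/2) + 1
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: how many of N servers may fail while a quorum remains
fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 5; do
    echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
# prints:
# servers=1 quorum=1 tolerated_failures=0
# servers=3 quorum=2 tolerated_failures=1
# servers=5 quorum=3 tolerated_failures=2
```

For this deployment the cluster keeps working with one of the three servers down. On a live cluster, consul members and consul operator raft list-peers show the current members and Raft peers. With ACLs enabled and default_policy set to deny, both need a valid token.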
Store configuration
Store: the Thanos store gateway. It serves the blocks kept in the object storage bucket to Thanos Query over gRPC, which is what makes long-term data available to global queries.
node-1 and node-2
cat > /etc/systemd/system/thanos-store.service <<'EOF'
[Unit]
Description=thanos-store
Documentation=https://thanos.io/
After=network.target
[Service]
Type=simple
User=thanos
ExecStart=/usr/local/thanos/thanos store \
          --data-dir=/data/thanos/store \
          --objstore.config-file=/usr/local/thanos/cos_bucket.yaml \
          --http-address 0.0.0.0:10906 \
          --grpc-address 0.0.0.0:10905
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
Restart=on-failure
LimitNOFILE=10240
LimitNPROC=10240
LimitCORE=infinity
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl start thanos-store
systemctl enable thanos-store
Receive configuration
Receive: accepts data written by Prometheus servers via remote write and stores it in long-term storage.
node-1, node-2, node-3
mkdir -p /usr/local/thanos/receive/fb
cat > /usr/local/thanos/receive/fb/hashring.json <<EOF
[
  {
    "hashring": "fanbook",
    "endpoints": [
      "10.2.0.6:10907",
      "10.2.0.10:10907",
      "10.2.0.41:10907"
    ],
    "tenants": ["fb"]
  }
]
EOF
The unit below is the one for node-1. On node-2 and node-3, set --receive.local-endpoint to that node's own entry in the hashring (10.2.0.10:10907 and 10.2.0.41:10907).
cat > /etc/systemd/system/thanos-receive.service <<'EOF'
[Unit]
Description=thanos-receive
Documentation=https://thanos.io/
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/thanos/thanos receive \
          --tsdb.path=/data/thanos/receive \
          --tsdb.retention=7d \
          --grpc-address 0.0.0.0:10907 \
          --remote-write.address 0.0.0.0:10908 \
          --http-address 0.0.0.0:10909 \
          --receive.local-endpoint 10.2.0.6:10907 \
          --receive.replication-factor 3 \
          --receive.hashrings-file=/usr/local/thanos/receive/fb/hashring.json \
          --objstore.config-file=/usr/local/thanos/cos_bucket.yaml \
          --label=replica="fb"
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
Restart=always
KillMode=process
KillSignal=SIGTERM
LimitNOFILE=655350
LimitNPROC=655350
LimitCORE=infinity
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl start thanos-receive
systemctl enable thanos-receive
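Each receiver has to identify itself with an endpoint that appears in the hashring file, so a mistyped --receive.local-endpoint is an easy mistake when the unit is copied between nodes. The sketch below checks that an endpoint string is listed in a hashring file. The function name endpoint_in_hashring is hypothetical, and the match is a plain quoted-string search rather than real JSON parsing.

```shell
#!/bin/sh
# endpoint_in_hashring FILE ENDPOINT
# Succeeds if ENDPOINT appears as a quoted string in FILE.
# The quotes keep 10.2.0.4:10907 from matching 10.2.0.41:10907.
endpoint_in_hashring() {
    grep -Fq "\"$2\"" "$1"
}

# Demonstration with a scratch copy of the hashring used in this article.
hashring=$(mktemp)
cat > "$hashring" <<'EOF'
[
  {
    "hashring": "fanbook",
    "endpoints": ["10.2.0.6:10907", "10.2.0.10:10907", "10.2.0.41:10907"],
    "tenants": ["fb"]
  }
]
EOF

for ip in 10.2.0.6 10.2.0.10 10.2.0.41; do
    if endpoint_in_hashring "$hashring" "$ip:10907"; then
        echo "$ip: --receive.local-endpoint $ip:10907"
    else
        echo "$ip: endpoint not listed in hashring" >&2
    fi
done
rm -f "$hashring"
```

The loop prints the flag value each node would use, which can be compared against the unit file on that node before starting the service.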
Query configuration
Query: provides a highly available query interface that aggregates multiple Prometheus data sources. A single query can span several clusters or instances using standard PromQL, with Thanos additions such as deduplication across replicas.
node-1
To make it easy to add Query nodes dynamically, we register them as a Consul service and use a template file to generate the service discovery file, as follows.
[root@Q-gz-common-prod-thanos-001 thanos]# pwd
/usr/local/thanos
[root@Q-gz-common-prod-thanos-001 thanos]# ls
compact  consul_token.yaml  cos_bucket.yaml  query  receive  sd.sh  store  thanos  thanos-sd-file.tpl  thanos-sd-file.yaml
[root@Q-gz-common-prod-thanos-001 thanos]# cat thanos-sd-file.tpl
- targets:
  {{ range service "thanos-query" -}}
  - {{ .Address }}:{{ .Port }}
  {{ end -}}
Then use consul-template to read the service from Consul and generate the file.
consul-template -consul-token-file=consul_token.yaml -consul-addr 127.0.0.1:8500 -template "/usr/local/thanos/thanos-sd-file.tpl:/usr/local/thanos/thanos-sd-file.yaml:echo ok" -once
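To make the shape of the generated file concrete, the sketch below prints a file-SD document for a list of address:port pairs, which is the structure the template above produces for each instance of the thanos-query service. It only imitates the layout and does not call Consul. The function name render_sd_file and the two sample targets are assumptions, since the article does not show which addresses are registered.

```shell
#!/bin/sh
# render_sd_file ADDR:PORT...
# Prints one "- targets:" group with each argument as a list item.
render_sd_file() {
    echo "- targets:"
    for target in "$@"; do
        echo "  - $target"
    done
}

# Hypothetical targets, used only to show the layout.
render_sd_file 10.2.0.6:10905 10.2.0.10:10905
# prints:
# - targets:
#   - 10.2.0.6:10905
#   - 10.2.0.10:10905
```

Comparing this layout with the real thanos-sd-file.yaml after a consul-template run is a quick way to confirm that the service lookup returned the instances you expect.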
cat > /etc/systemd/system/thanos-query.service <<'EOF'
[Unit]
Description=thanos-query
Documentation=https://thanos.io/
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/thanos/thanos query \
          --http-address 0.0.0.0:10903 \
          --grpc-address 0.0.0.0:10904 \
          --store.sd-files=/usr/local/thanos/thanos-sd-file.yaml \
          --query.replica-label=replica \
          --log.level=info \
          --query.timeout=15m
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
Restart=always
LimitNOFILE=10240
LimitNPROC=10240
LimitCORE=infinity
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable thanos-query
systemctl start thanos-query
Compact configuration
Compact: performs compaction and downsampling on the data in long-term storage to reduce storage space and query time.
Note: the compactor must run as a singleton, and it must not run while the data in the bucket is being modified manually.
--wait keeps Compact running and polling for new data to compact and downsample.
--retention.resolution-raw sets how long raw data is kept, --retention.resolution-5m sets how long data downsampled to one point per 5 minutes is kept, and --retention.resolution-1h sets how long data downsampled to one point per hour is kept. The three levels decrease in detail and in the storage they occupy, so the usual recommendation is to give them increasing retention periods. Typically only recent data is examined in detail, while older data is queried over large time ranges for a rough overview, so the coarser data is worth keeping longer.
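The recommendation to give coarser resolutions longer retention can be checked mechanically before the flags go into a unit file. The sketch below compares three retention values expressed in days. The function name is hypothetical, and it deliberately does not handle 0d, which Thanos treats as "keep forever". The values 90, 180, and 360 are the ones used in this article's compact unit.

```shell
#!/bin/sh
# retention_is_non_decreasing RAW_DAYS FIVE_MIN_DAYS ONE_HOUR_DAYS
# Succeeds when raw <= 5m <= 1h, i.e. coarser data is kept at least as long.
# Caveat: a value of 0 (keep forever in Thanos) is not treated specially here.
retention_is_non_decreasing() {
    [ "$1" -le "$2" ] && [ "$2" -le "$3" ]
}

if retention_is_non_decreasing 90 180 360; then
    echo "retention order ok: raw=90d 5m=180d 1h=360d"
else
    echo "retention order unexpected: coarser data expires first" >&2
fi
# prints: retention order ok: raw=90d 5m=180d 1h=360d
```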
node-3
cat > /etc/systemd/system/thanos-compact.service <<'EOF'
[Unit]
Description=thanos-compact
Documentation=https://thanos.io/
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/thanos/thanos compact \
          --data-dir=/data/thanos/compact \
          --objstore.config-file=/usr/local/thanos/cos_bucket.yaml \
          --http-address 0.0.0.0:19193 \
          --debug.accept-malformed-index \
          --retention.resolution-raw=90d \
          --retention.resolution-5m=180d \
          --retention.resolution-1h=360d \
          --log.level=info \
          --wait
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
Restart=always
LimitNOFILE=10240
LimitNPROC=10240
LimitCORE=infinity
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl start thanos-compact
systemctl enable thanos-compact
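Once the units are running, each Thanos component answers readiness probes at /-/ready on its HTTP address. The sketch below builds those URLs from the hosts and --http-address ports used in this article and defines a probe helper. The function names are hypothetical, check_ready assumes curl is installed, and the loop only prints the URLs so it can run anywhere.

```shell
#!/bin/sh
# ready_url HOST PORT: readiness URL of a Thanos component's HTTP endpoint
ready_url() {
    echo "http://$1:$2/-/ready"
}

# check_ready HOST PORT: succeeds if the endpoint answers with a 2xx status.
# Requires curl; not called below, so the listing itself needs no network.
check_ready() {
    curl -fsS --max-time 5 -o /dev/null "$(ready_url "$1" "$2")"
}

# component, host, HTTP port as configured in this article
while read -r name host port; do
    echo "$name $(ready_url "$host" "$port")"
done <<'EOF'
store 10.2.0.6 10906
store 10.2.0.10 10906
receive 10.2.0.6 10909
receive 10.2.0.10 10909
receive 10.2.0.41 10909
query 10.2.0.6 10903
compact 10.2.0.41 19193
EOF
```

Replacing the echo in the loop with a call to check_ready "$host" "$port" turns the listing into a simple smoke test for the whole deployment.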