Deploying a Production Cluster with Loki in Microservices Mode
We previously covered Loki's monolithic mode and the simple scalable (read/write separated) mode. Once your daily log volume goes beyond the terabyte range, you will likely need the microservices mode to deploy Loki.
The microservices deployment mode runs Loki's components as separate processes. Each process is invoked with its target component specified, and each component starts a gRPC server for internal requests and an HTTP server for external API requests:
ingester distributor query-frontend query-scheduler querier index-gateway ruler compactor

Running the components as separate microservices lets you scale by increasing the number of instances of each microservice, and a cluster tailored this way gives better observability into the individual components. Microservices mode is the most performant way to run Loki, but it is also the most complex to set up and maintain.
For very large Loki clusters, or clusters that need more control over scaling and cluster operations, microservices mode is the recommended choice.
Microservices mode is best suited for deployment in a Kubernetes cluster, and both Jsonnet and Helm Chart installation methods are available.
Helm Chart
Here we will again use a Helm Chart to install Loki, this time in microservices mode. Before installing, remember to remove the Loki services installed in the previous chapters.
First fetch the chart for microservices mode:
$ helm repo add grafana https://grafana.github.io/helm-charts
$ helm pull grafana/loki-distributed --untar --version 0.48.4
$ cd loki-distributed
The chart supports the components shown in the table below. The ingester, distributor, querier and query-frontend components are always installed; the other components are optional.
| Component | Optional | Enabled by default |
|---|---|---|
| gateway | ? | ? |
| ingester | ? | n/a |
| distributor | ? | n/a |
| querier | ? | n/a |
| query-frontend | ? | n/a |
| table-manager | ? | ? |
| compactor | ? | ? |
| ruler | ? | ? |
| index-gateway | ? | ? |
| memcached-chunks | ? | ? |
| memcached-frontend | ? | ? |
| memcached-index-queries | ? | ? |
| memcached-index-writes | ? | ? |
The chart configures Loki in microservices mode. It has been tested with boltdb-shipper and memberlist; other storage and discovery options will also work, but the chart does not set up Consul or Etcd for discovery, so those would have to be configured separately. Instead you can use memberlist, which does not need a separate key/value store. By default the chart creates a headless Service for memberlist, and the ingester, distributor, querier and ruler are part of it.
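If you want to check which optional components are switched on in your chart version, you can inspect the chart's default values before installing; a quick check, assuming Helm 3 and the repo added above:
$ helm show values grafana/loki-distributed --version 0.48.4 | grep -B 2 "enabled:"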
Install MinIO
Here we will use memberlist, boltdb-shipper and MinIO for storage. Since the chart does not bundle MinIO, we first need to install it separately:
$ helm repo add minio https://helm.min.io/
$ helm pull minio/minio --untar --version 8.0.10
$ cd minio
Create a values file as shown below:
# ci/minio-values.yaml
accessKey: "myaccessKey"
secretKey: "mysecretKey"
persistence:
  enabled: true
  storageClass: "local-path"
  accessMode: ReadWriteOnce
  size: 5Gi
service:
  type: NodePort
  port: 9000
  nodePort: 32000
resources:
  requests:
    memory: 1Gi
Install MinIO directly with the values file above:
$ helm upgrade --install minio -n logging -f ci/minio-values.yaml .
Release "minio" does not exist. Installing it now.
NAME: minio
LAST DEPLOYED: Sun Jun 19 16:56:28 2022
NAMESPACE: logging
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Minio can be accessed via port 9000 on the following DNS name from within your cluster:
minio.logging.svc.cluster.local
To access Minio from localhost, run the below commands:
1. export POD_NAME=$(kubectl get pods --namespace logging -l "release=minio" -o jsonpath="{.items[0].metadata.name}")
2. kubectl port-forward $POD_NAME 9000 --namespace logging
Read more about port forwarding here: http://kubernetes.io/docs/user-guide/kubectl/kubectl_port-forward/
You can now access Minio server on http://localhost:9000. Follow the below steps to connect to Minio server with mc client:
1. Download the Minio mc client - https://docs.minio.io/docs/minio-client-quickstart-guide
2. Get the ACCESS_KEY=$(kubectl get secret minio -o jsonpath="{.data.accesskey}" | base64 --decode) and the SECRET_KEY=$(kubectl get secret minio -o jsonpath="{.data.secretkey}" | base64 --decode)
3. mc alias set minio-local http://localhost:9000 "$ACCESS_KEY" "$SECRET_KEY" --api s3v4
4. mc ls minio-local
Alternately, you can use your browser or the Minio SDK to access the server - https://docs.minio.io/categories/17
After the installation finishes, check the Pod status:
$ kubectl get pods -n logging
NAME READY STATUS RESTARTS AGE
minio-548656f786-gctk9 1/1 Running 0 2m45s
$ kubectl get svc -n logging
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
minio NodePort 10.111.58.196 <none> 9000:32000/TCP 3h16m
MinIO can now be accessed through port 32000, the NodePort we specified:

Then remember to create a bucket named loki-data.
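If you prefer the command line over the web console, the bucket can also be created with the mc client; a minimal sketch, assuming mc is installed locally and using the access/secret keys from the values file above (the alias name minio-local is arbitrary):
$ kubectl -n logging port-forward svc/minio 9000:9000 &
$ mc alias set minio-local http://localhost:9000 myaccessKey mysecretKey --api s3v4
$ mc mb minio-local/loki-data
$ mc ls minio-local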
Install Loki
Now that the object storage is ready, let's install Loki in microservices mode. First create a values file as shown below:
# ci/loki-values.yaml
loki:
  structuredConfig:
    ingester:
      max_transfer_retries: 0
      chunk_idle_period: 1h
      chunk_target_size: 1536000
      max_chunk_age: 1h
    storage_config:
      aws:
        endpoint: minio.logging.svc.cluster.local:9000
        insecure: true
        bucketnames: loki-data
        access_key_id: myaccessKey
        secret_access_key: mysecretKey
        s3forcepathstyle: true
      boltdb_shipper:
        shared_store: s3
    schema_config:
      configs:
        - from: 2022-06-21
          store: boltdb-shipper
          object_store: s3
          schema: v12
          index:
            prefix: loki_index_
            period: 24h
distributor:
  replicas: 2
ingester:
  replicas: 2
  persistence:
    enabled: true
    size: 1Gi
    storageClass: local-path
querier:
  replicas: 2
  persistence:
    enabled: true
    size: 1Gi
    storageClass: local-path
queryFrontend:
  replicas: 2
gateway:
  nginxConfig:
    httpSnippet: |-
      client_max_body_size 100M;
    serverSnippet: |-
      client_max_body_size 100M;
The configuration above selectively overrides the defaults in the loki.config template. Most configuration parameters can be set from the outside via loki.structuredConfig; loki.config, loki.schemaConfig and loki.storageConfig can also be combined with loki.structuredConfig, and values in loki.structuredConfig take precedence.
Here loki.structuredConfig.storage_config.aws points Loki at the MinIO instance used to store the data. For high availability we run two replicas of the core components, and the ingester and querier are configured with persistent storage.
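Before installing, it can be useful to render the chart locally and confirm that these overrides actually end up in the generated Loki configuration; a quick sanity check, run from the loki-distributed chart directory:
$ helm template loki . -f ci/loki-values.yaml | grep -A 10 "storage_config:"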
Now install everything in one go with the values file above:
$ helm upgrade --install loki -n logging -f ci/loki-values.yaml .
Release "loki" does not exist. Installing it now.
NAME: loki
LAST DEPLOYED: Tue Jun 21 16:20:10 2022
NAMESPACE: logging
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
***********************************************************************
Welcome to Grafana Loki
Chart version: 0.48.4
Loki version: 2.5.0
***********************************************************************
Installed components:
* gateway
* ingester
* distributor
* querier
* query-frontend
This installs the following components: gateway, ingester, distributor, querier and query-frontend. The corresponding Pods look like this:
$ kubectl get pods -n logging
NAME READY STATUS RESTARTS AGE
loki-loki-distributed-distributor-5dfdd5bd78-nxdq8 1/1 Running 0 2m40s
loki-loki-distributed-distributor-5dfdd5bd78-rh4gz 1/1 Running 0 116s
loki-loki-distributed-gateway-6f4cfd898c-hpszv 1/1 Running 0 21m
loki-loki-distributed-ingester-0 1/1 Running 0 96s
loki-loki-distributed-ingester-1 1/1 Running 0 2m38s
loki-loki-distributed-querier-0 1/1 Running 0 2m2s
loki-loki-distributed-querier-1 1/1 Running 0 2m33s
loki-loki-distributed-query-frontend-6d9845cb5b-p4vns 1/1 Running 0 4s
loki-loki-distributed-query-frontend-6d9845cb5b-sq5hr 1/1 Running 0 2m40s
minio-548656f786-gctk9 1/1 Running 1 (123m ago) 47h
$ kubectl get svc -n logging
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
loki-loki-distributed-distributor ClusterIP 10.102.156.127 <none> 3100/TCP,9095/TCP 22m
loki-loki-distributed-gateway ClusterIP 10.111.73.138 <none> 80/TCP 22m
loki-loki-distributed-ingester ClusterIP 10.98.238.236 <none> 3100/TCP,9095/TCP 22m
loki-loki-distributed-ingester-headless ClusterIP None <none> 3100/TCP,9095/TCP 22m
loki-loki-distributed-memberlist ClusterIP None <none> 7946/TCP 22m
loki-loki-distributed-querier ClusterIP 10.101.117.137 <none> 3100/TCP,9095/TCP 22m
loki-loki-distributed-querier-headless ClusterIP None <none> 3100/TCP,9095/TCP 22m
loki-loki-distributed-query-frontend ClusterIP None <none> 3100/TCP,9095/TCP,9096/TCP 22m
minio NodePort 10.111.58.196 <none> 9000:32000/TCP 47h
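Before looking at the full configuration, we can also confirm that the ingesters have joined the ring over memberlist; a minimal check, assuming the components expose Loki's standard /ring page on their HTTP port:
$ kubectl -n logging port-forward svc/loki-loki-distributed-distributor 3100:3100
# in another terminal: both ingesters should be listed as ACTIVE
$ curl -s http://localhost:3100/ring | grep -c ACTIVE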
The generated Loki configuration file looks like this:
$ kubectl get cm -n logging loki-loki-distributed -o yaml
apiVersion: v1
data:
  config.yaml: |
    auth_enabled: false
    chunk_store_config:
      max_look_back_period: 0s
    compactor:
      shared_store: filesystem
    distributor:
      ring:
        kvstore:
          store: memberlist
    frontend:
      compress_responses: true
      log_queries_longer_than: 5s
      tail_proxy_url: http://loki-loki-distributed-querier:3100
    frontend_worker:
      frontend_address: loki-loki-distributed-query-frontend:9095
    ingester:
      chunk_block_size: 262144
      chunk_encoding: snappy
      chunk_idle_period: 1h
      chunk_retain_period: 1m
      chunk_target_size: 1536000
      lifecycler:
        ring:
          kvstore:
            store: memberlist
          replication_factor: 1
      max_chunk_age: 1h
      max_transfer_retries: 0
      wal:
        dir: /var/loki/wal
    limits_config:
      enforce_metric_name: false
      max_cache_freshness_per_query: 10m
      reject_old_samples: true
      reject_old_samples_max_age: 168h
      split_queries_by_interval: 15m
    memberlist:
      join_members:
        - loki-loki-distributed-memberlist
    query_range:
      align_queries_with_step: true
      cache_results: true
      max_retries: 5
      results_cache:
        cache:
          enable_fifocache: true
          fifocache:
            max_size_items: 1024
            validity: 24h
    ruler:
      alertmanager_url: https://alertmanager.xx
      external_url: https://alertmanager.xx
      ring:
        kvstore:
          store: memberlist
      rule_path: /tmp/loki/scratch
      storage:
        local:
          directory: /etc/loki/rules
        type: local
    schema_config:
      configs:
        - from: "2022-06-21"
          index:
            period: 24h
            prefix: loki_index_
          object_store: s3
          schema: v12
          store: boltdb-shipper
    server:
      http_listen_port: 3100
    storage_config:
      aws:
        access_key_id: myaccessKey
        bucketnames: loki-data
        endpoint: minio.logging.svc.cluster.local:9000
        insecure: true
        s3forcepathstyle: true
        secret_access_key: mysecretKey
      boltdb_shipper:
        active_index_directory: /var/loki/index
        cache_location: /var/loki/cache
        cache_ttl: 168h
        shared_store: s3
      filesystem:
        directory: /var/loki/chunks
    table_manager:
      retention_deletes_enabled: false
      retention_period: 0s
kind: ConfigMap
# ......
There is also a gateway component that routes incoming requests to the right downstream component; it is simply an nginx server with the following configuration:
$ kubectl -n logging exec -it loki-loki-distributed-gateway-6f4cfd898c-hpszv -- cat /etc/nginx/nginx.conf
worker_processes 5; ## Default: 1
error_log /dev/stderr;
pid /tmp/nginx.pid;
worker_rlimit_nofile 8192;
events {
  worker_connections 4096; ## Default: 1024
}
http {
  client_body_temp_path /tmp/client_temp;
  proxy_temp_path /tmp/proxy_temp_path;
  fastcgi_temp_path /tmp/fastcgi_temp;
  uwsgi_temp_path /tmp/uwsgi_temp;
  scgi_temp_path /tmp/scgi_temp;
  default_type application/octet-stream;
  log_format main '$remote_addr - $remote_user [$time_local] $status '
      '"$request" $body_bytes_sent "$http_referer" '
      '"$http_user_agent" "$http_x_forwarded_for"';
  access_log /dev/stderr main;
  sendfile on;
  tcp_nopush on;
  resolver kube-dns.kube-system.svc.cluster.local;
  client_max_body_size 100M;
  server {
    listen 8080;
    location = / {
      return 200 'OK';
      auth_basic off;
    }
    location = /api/prom/push {
      proxy_pass http://loki-loki-distributed-distributor.logging.svc.cluster.local:3100$request_uri;
    }
    location = /api/prom/tail {
      proxy_pass http://loki-loki-distributed-querier.logging.svc.cluster.local:3100$request_uri;
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection "upgrade";
    }
    # Ruler
    location ~ /prometheus/api/v1/alerts.* {
      proxy_pass http://loki-loki-distributed-ruler.logging.svc.cluster.local:3100$request_uri;
    }
    location ~ /prometheus/api/v1/rules.* {
      proxy_pass http://loki-loki-distributed-ruler.logging.svc.cluster.local:3100$request_uri;
    }
    location ~ /api/prom/rules.* {
      proxy_pass http://loki-loki-distributed-ruler.logging.svc.cluster.local:3100$request_uri;
    }
    location ~ /api/prom/alerts.* {
      proxy_pass http://loki-loki-distributed-ruler.logging.svc.cluster.local:3100$request_uri;
    }
    location ~ /api/prom/.* {
      proxy_pass http://loki-loki-distributed-query-frontend.logging.svc.cluster.local:3100$request_uri;
    }
    location = /loki/api/v1/push {
      proxy_pass http://loki-loki-distributed-distributor.logging.svc.cluster.local:3100$request_uri;
    }
    location = /loki/api/v1/tail {
      proxy_pass http://loki-loki-distributed-querier.logging.svc.cluster.local:3100$request_uri;
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection "upgrade";
    }
    location ~ /loki/api/.* {
      proxy_pass http://loki-loki-distributed-query-frontend.logging.svc.cluster.local:3100$request_uri;
    }
    client_max_body_size 100M;
  }
}
From the configuration above we can see that the push endpoints /api/prom/push and /loki/api/v1/push are both proxied to http://loki-loki-distributed-distributor.logging.svc.cluster.local:3100$request_uri, i.e. the distributor service:
$ kubectl get pods -n logging -l app.kubernetes.io/component=distributor,app.kubernetes.io/instance=loki,app.kubernetes.io/name=loki-distributed
NAME READY STATUS RESTARTS AGE
loki-loki-distributed-distributor-5dfdd5bd78-nxdq8 1/1 Running 0 8m20s
loki-loki-distributed-distributor-5dfdd5bd78-rh4gz 1/1 Running 0 7m36s
So to write log data we simply push it to the gateway's push endpoint. To verify that everything works, let's now install Promtail and Grafana to exercise the write and read paths.
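Before wiring up Promtail we can already exercise the write path by pushing a single test line straight to the gateway with the standard Loki push API; a minimal sketch (the job label and message are arbitrary):
$ kubectl -n logging port-forward svc/loki-loki-distributed-gateway 8080:80
# in another terminal: a 204 response means the distributor accepted the entry
$ curl -s -o /dev/null -w "%{http_code}\n" -H "Content-Type: application/json" -XPOST \
    http://localhost:8080/loki/api/v1/push \
    --data-raw "{\"streams\":[{\"stream\":{\"job\":\"manual-test\"},\"values\":[[\"$(date +%s)000000000\",\"hello loki\"]]}]}"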
Install Promtail
Fetch the promtail chart and unpack it:
$ helm pull grafana/promtail --untar
$ cd promtail
Create a values file as shown below:
# ci/promtail-values.yaml
rbac:
  pspEnabled: false
config:
  clients:
    - url: http://loki-loki-distributed-gateway/loki/api/v1/push
Note that the Loki address configured in Promtail is http://loki-loki-distributed-gateway/loki/api/v1/push, so Promtail first sends the log data to the gateway, and the gateway forwards it to the distributor based on the routing rules we saw above. Install Promtail with the values file above:
$ helm upgrade --install promtail -n logging -f ci/promtail-values.yaml .
Release "promtail" does not exist. Installing it now.
NAME: promtail
LAST DEPLOYED: Tue Jun 21 16:31:34 2022
NAMESPACE: logging
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
***********************************************************************
Welcome to Grafana Promtail
Chart version: 5.1.0
Promtail version: 2.5.0
***********************************************************************
Verify the application is working by running these commands:
* kubectl --namespace logging port-forward daemonset/promtail 3101
* curl http://127.0.0.1:3101/metrics
Once the installation completes, one promtail runs on every node:
$ kubectl get pods -n logging -l app.kubernetes.io/name=promtail
NAME READY STATUS RESTARTS AGE
promtail-gbjzs 1/1 Running 0 38s
promtail-gjn5p 1/1 Running 0 38s
promtail-z6vhd 1/1 Running 0 38s
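To confirm that Promtail is actually shipping data, we can also look at its own metrics endpoint; a quick check, assuming the standard Promtail client metrics:
$ kubectl --namespace logging port-forward daemonset/promtail 3101
# in another terminal: this counter should keep increasing while pushes succeed
$ curl -s http://127.0.0.1:3101/metrics | grep promtail_sent_entries_total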
At this point promtail is already collecting all container logs on its node and pushing them to the gateway, which forwards them to the distributor. We can also check the gateway logs on the receiving side:
$ kubectl logs -f loki-loki-distributed-gateway-6f4cfd898c-hpszv -n logging
10.244.2.26 - - [21/Jun/2022:08:41:24 +0000] 204 "POST /loki/api/v1/push HTTP/1.1" 0 "-" "promtail/2.5.0" "-"
10.244.2.1 - - [21/Jun/2022:08:41:24 +0000] 200 "GET / HTTP/1.1" 2 "-" "kube-probe/1.22" "-"
10.244.2.26 - - [21/Jun/2022:08:41:25 +0000] 204 "POST /loki/api/v1/push HTTP/1.1" 0 "-" "promtail/2.5.0" "-"
10.244.1.28 - - [21/Jun/2022:08:41:26 +0000] 204 "POST /loki/api/v1/push HTTP/1.1" 0 "-" "promtail/2.5.0" "-"
......
We can see the gateway is continuously receiving /loki/api/v1/push requests, i.e. the data Promtail is sending. By now the log data should have gone through the distributor and ingesters and been stored in MinIO, which we can verify in the MinIO console; during installation we exposed the MinIO service on NodePort 32000:

So the write path is working.
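The read path can also be verified from the command line before Grafana is set up, by querying back through the gateway with the standard query_range API; a rough sketch, assuming the default Promtail labels (which include namespace):
$ kubectl -n logging port-forward svc/loki-loki-distributed-gateway 8080:80
# in another terminal: should return recent log lines from the logging namespace
$ curl -s -G http://localhost:8080/loki/api/v1/query_range \
    --data-urlencode 'query={namespace="logging"}' --data-urlencode 'limit=5'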
Install Grafana
Now let's verify the read path in the UI by installing Grafana and pointing it at Loki:
$ helm pull grafana/grafana --untar
$ cd grafana
Create a values file as shown below:
# ci/grafana-values.yaml
service:
  type: NodePort
  nodePort: 32001
rbac:
  pspEnabled: false
persistence:
  enabled: true
  storageClassName: local-path
  accessModes:
    - ReadWriteOnce
  size: 1Gi
Install Grafana directly with the values file above:
$ helm upgrade --install grafana -n logging -f ci/grafana-values.yaml .
Release "grafana" does not exist. Installing it now.
NAME: grafana
LAST DEPLOYED: Tue Jun 21 16:47:54 2022
NAMESPACE: logging
STATUS: deployed
REVISION: 1
NOTES:
1. Get your 'admin' user password by running:
kubectl get secret --namespace logging grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:
grafana.logging.svc.cluster.local
Get the Grafana URL to visit by running these commands in the same shell:
export NODE_PORT=$(kubectl get --namespace logging -o jsonpath="{.spec.ports[0].nodePort}" services grafana)
export NODE_IP=$(kubectl get nodes --namespace logging -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
3. Login with the password from step 1 and the username: admin
You can get the login password with the command from the notes above:
$ kubectl get secret --namespace logging grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
Then log in to Grafana with the username admin and the password above:

After logging in, add a Loki data source in Grafana. Note that the URL must be the gateway address, http://loki-loki-distributed-gateway:
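Alternatively, the data source can be provisioned declaratively through the Grafana chart instead of being added in the UI; a hedged sketch of the extra values (the datasources block follows the upstream grafana chart's provisioning format, so double-check it against your chart version):
$ cat >> ci/grafana-values.yaml <<EOF
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Loki
        type: loki
        url: http://loki-loki-distributed-gateway
        access: proxy
EOF
$ helm upgrade --install grafana -n logging -f ci/grafana-values.yaml .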

After saving the data source, go to the Explore page to filter logs, for example to tail the logs of the gateway application in real time, as shown below:

If you can see the latest log data, the microservices-mode Loki deployment is working. This mode is very flexible, since each component can be scaled in or out independently as needed, but it also adds considerable operational overhead.
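For example, if the queriers become a bottleneck, scaling them out is just a matter of raising the replica count and upgrading the release; a simple sketch, run from the loki-distributed chart directory:
$ helm upgrade --install loki -n logging -f ci/loki-values.yaml --set querier.replicas=3 .
$ kubectl get pods -n logging -l app.kubernetes.io/component=querier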
In addition, we can add caching for queries and writes; the Helm Chart we are using supports memcached, and it can also be swapped out for redis.
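For reference, the caches are enabled through per-component toggles in the chart values; a hedged sketch of what that might look like (the key names memcachedChunks and memcachedFrontend should be verified against the values.yaml of your chart version, and the rendered config should be checked to confirm the caches are actually wired in):
$ cat >> ci/loki-values.yaml <<EOF
memcachedChunks:
  enabled: true
memcachedFrontend:
  enabled: true
EOF
$ helm upgrade --install loki -n logging -f ci/loki-values.yaml .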
