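How to Access the KubeSphere Prometheus Console
The KubeSphere monitoring engine is powered by Prometheus. For debugging purposes, you may want to access the built-in Prometheus service through a NodePort. Run the following command to change the service type to NodePort: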
kubectl edit svc -n kubesphere-monitoring-system prometheus-k8s
ports:
- name: web
  nodePort: 30066   # add this line
  port: 9090
  protocol: TCP
  targetPort: web
selector:
  app: prometheus
  prometheus: k8s
sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 10800
type: NodePort   # change the type to NodePort
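For a scripted setup, the same change can probably be made without opening an editor. The sketch below is not part of the original procedure. It assumes the web port is the first entry under spec.ports, as in the listing above, and nodeport_patch is a hypothetical helper name. The function only builds a JSON Patch document; the kubectl patch call is left commented out so nothing touches a cluster until you run it yourself.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON Patch that switches a Service to
# NodePort and pins the nodePort of the first port entry.
# Assumption: the port to expose is spec.ports[0], as in the listing above.
nodeport_patch() {
  node_port="$1"
  printf '[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"add","path":"/spec/ports/0/nodePort","value":%s}]' "$node_port"
}

# Usage (run manually against your own cluster):
# kubectl patch svc prometheus-k8s -n kubesphere-monitoring-system \
#   --type=json -p "$(nodeport_patch 30066)"
```

Keeping the patch in a function makes it easy to inspect the exact JSON before it is applied.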
- job_name: kubesphere-monitoring-system/kube-scheduler/0
  honor_timestamps: true
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    insecure_skip_verify: true
  follow_redirects: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: kube-scheduler
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https-metrics
    action: replace
  metric_relabel_configs:
  - source_labels: [__name__]
    separator: ;
    regex: scheduler_(e2e_scheduling_latency_microseconds|scheduling_algorithm_predicate_evaluation|scheduling_algorithm_priority_evaluation|scheduling_algorithm_preemption_evaluation|scheduling_algorithm_latency_microseconds|binding_latency_microseconds|scheduling_latency_seconds)
    replacement: $1
    action: drop
  kubernetes_sd_configs:
  - role: endpoints
    follow_redirects: true
    namespaces:
      names:
      - kube-system
[root@ks-master ~]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-scheduler-svc ClusterIP None <none> 10259/TCP 43h
[root@master manifests]# vim kube-scheduler.yaml
[root@master manifests]# pwd
/etc/kubernetes/manifests
[root@master manifests]# ls
kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml
[root@master ~]# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-scheduler-master 1/1 Running 7 16d 192.168.100.5 master <none> <none>
[root@master ~]# kubectl get pod kube-scheduler-master -n kube-system --show-labels
NAME READY STATUS RESTARTS AGE LABELS
kube-scheduler-master 1/1 Running 7 16d component=kube-scheduler,tier=control-plane
What did you expect to see?
I expected the recommended services to find the endpoints for the kube-scheduler and kube-controller-manager. From those docs, here's the kube-scheduler discovery service:
spec:
  clusterIP: None
  clusterIPs:
  - None
  ports:
  - name: https-metrics
    port: 10259
    protocol: TCP
    targetPort: 10259
  selector:
    component: kube-scheduler
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler-prometheus-discovery
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  ports:
  - name: http-metrics
    port: 10259
    targetPort: 10259
    protocol: TCP
[root@master ~]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-scheduler-prometheus-discovery ClusterIP 10.233.31.176 <none> 10259/TCP 4s
[root@master ~]# kubectl get ep -n kube-system
NAME ENDPOINTS AGE
kube-scheduler-prometheus-discovery 192.168.100.5:10259 31s
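The endpoint appears because the Service selector (component=kube-scheduler) matches a label on the scheduler pod, while the k8s-app label sits on the Service itself and is what the relabel rule keys on. The sketch below illustrates that matching rule offline. selector_matches is a hypothetical helper written for this note, not a kubectl feature, and it handles only simple key=value equality selectors.

```shell
#!/bin/sh
# Hypothetical helper: succeed if every key=value pair in the selector ($2)
# also appears in the comma-separated pod label list ($1).
# Only equality selectors are handled; this is an illustration, not kubectl logic.
selector_matches() {
  labels=",$1,"
  old_ifs="$IFS"
  IFS=','
  for pair in $2; do
    case "$labels" in
      *",$pair,"*) ;;                       # this pair is present, keep checking
      *) IFS="$old_ifs"; return 1 ;;        # one missing pair means no match
    esac
  done
  IFS="$old_ifs"
  return 0
}

# Labels as printed by "kubectl get pod ... --show-labels" earlier in this note.
if selector_matches "component=kube-scheduler,tier=control-plane" "component=kube-scheduler"; then
  echo "selector matches the scheduler pod"
fi
```

A selector of k8s-app=kube-scheduler would not match this pod, since the pod carries only the component and tier labels.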
Related: "After deploy prometheus, it shows x509: certificate is valid for apiserver, not kubernetes.default.svc", Issue #2088, prometheus/prometheus, GitHub: https://github.com/prometheus/prometheus/issues/2088
[root@master ~]# cat kube-scheduler-svc.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler-prometheus-discovery
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: https-metrics
    port: 10259
    targetPort: 10259
    protocol: TCP

- job_name: 'kubernetes-scheduler'
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name]
    action: keep
    regex: kube-system;kube-scheduler-prometheus-discovery
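The keep rule above joins the two source label values with ";" (the default separator) and keeps a target only when the whole joined string matches the regex. The sketch below mimics that behaviour so the rule can be reasoned about offline. relabel_keep is a hypothetical helper, and grep -E is only an approximation of the RE2 syntax Prometheus uses, which is good enough for simple patterns like this one.

```shell
#!/bin/sh
# Hypothetical helper mimicking a Prometheus "keep" relabel rule.
# $1 is the regex; the remaining arguments (at least one) are the source
# label values, joined with ";" before matching.
relabel_keep() {
  regex="$1"; shift
  joined="$1"; shift
  for value in "$@"; do
    joined="$joined;$value"
  done
  # -x forces a whole-line match, like Prometheus's fully anchored regexes.
  printf '%s\n' "$joined" | grep -Eqx -- "$regex"
}

# An endpoint of the discovery service in kube-system is kept:
if relabel_keep 'kube-system;kube-scheduler-prometheus-discovery' \
     kube-system kube-scheduler-prometheus-discovery; then
  echo "target kept"
fi
```

Because the match is anchored, a service with the same name in another namespace, or a longer service name in kube-system, would be dropped.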
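scheduler: expose port 10251
This applies to Kubernetes 1.19 and later. On a 1.19 cluster, the output below shows the connections to these components being refused, because their ports are not exposed, so monitoring them fails. How can these ports be exposed?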
[root@master ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Unhealthy Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler Unhealthy Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0 Healthy {"health":"true"}
[root@master ~]# cd /etc/kubernetes/manifests/
[root@master manifests]# ls
kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml
Modify kube-scheduler.yaml as follows:
- --port=0 # remove this line
- --bind-address=192.168.100.5 # change 127.0.0.1 to this host's IP address
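The two edits above can be previewed before the manifest is touched. The sketch below is a hedged illustration, not the author's method. expose_metrics_port is a hypothetical helper that reads a manifest on stdin and prints the edited version. It assumes each flag sits on its own line in the form "- --flag=value" and that the current bind address is 127.0.0.1.

```shell
#!/bin/sh
# Hypothetical helper: read a static pod manifest on stdin and print it with
# the "--port=0" flag removed and the loopback bind address replaced by $1.
# Assumption: flags appear one per line as "- --flag=value".
expose_metrics_port() {
  node_ip="$1"
  sed -e '/^[[:space:]]*-[[:space:]]*--port=0[[:space:]]*$/d' \
      -e "s/--bind-address=127\.0\.0\.1/--bind-address=${node_ip}/"
}

# Usage (prints the result only; review it before writing anything back):
# expose_metrics_port 192.168.100.5 < /etc/kubernetes/manifests/kube-scheduler.yaml
```

Printing to stdout instead of editing in place keeps the original manifest intact until the diff has been checked.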
[root@master ~]# netstat -tpln | grep 10251
tcp6 0 0 :::10251 :::* LISTEN 28249/kube-schedule
[root@master ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Unhealthy Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
controller-manager: expose port 10252
[root@master manifests]# vim kube-controller-manager.yaml
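Modify as follows:
- --port=0 # remove this line
- --bind-address=192.168.100.5 # change 127.0.0.1 to this host's IP address
Restart kubelet so that all of the changes above take effect: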
systemctl restart kubelet
kube-proxy
kube-proxy is still bound to 127.0.0.1 on the local lo interface. How can it be bound to the host's network interface instead?
[root@master ~]# netstat -tpln | grep 10249
tcp 0 0 127.0.0.1:10249 0.0.0.0:* LISTEN 3122/kube-proxy
[root@node1 ~]# netstat -tpln | grep 10249
tcp 0 0 127.0.0.1:10249 0.0.0.0:* LISTEN 3296/kube-proxy
[root@node2 ~]# netstat -tpln | grep 10249
tcp 0 0 127.0.0.1:10249 0.0.0.0:* LISTEN 3897/kube-proxy
A specific IP cannot be used here: kube-proxy runs on every node, so the address cannot be tied to a single node's IP.
[root@master prometheus]# kubectl edit configmap kube-proxy -n kube-system
metricsBindAddress: "0.0.0.0:10249"
Save the change, then delete the kube-proxy pods so they are recreated with the new setting:
[root@master manifests]# kubectl delete pod kube-proxy-5zxsr -n kube-system
pod "kube-proxy-5zxsr" deleted
[root@master manifests]# kubectl delete pod kube-proxy-bwm6f -n kube-system
pod "kube-proxy-bwm6f" deleted
[root@master manifests]# kubectl delete pod kube-proxy-x7mb4 -n kube-system
pod "kube-proxy-x7mb4" deleted
Alternatively, delete all kube-proxy pods with a single command:
kubectl get pods -n kube-system | grep kube-proxy | awk '{print $1}' | xargs kubectl delete pods -n kube-system
[root@master ~]# netstat -tpln | grep 10249
tcp6 0 0 :::10249 :::* LISTEN 740/kube-proxy
[root@node1 ~]# netstat -tpln | grep 10249
tcp6 0 0 :::10249 :::* LISTEN 7910/kube-proxy
[root@node2 ~]# netstat -tpln | grep 10249
tcp6 0 0 :::10249 :::* LISTEN 17547/kube-proxy
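Checking each node by eye gets tedious on larger clusters. The sketch below classifies a port from netstat -tpln output so the check can be scripted. bind_scope is a hypothetical helper name, and it assumes the local address is in the fourth column, as in the transcripts above.

```shell
#!/bin/sh
# Hypothetical helper: read "netstat -tpln" lines on stdin and print
#   loopback       if the port listens only on 127.x.x.x or ::1
#   non-loopback   if it listens on any other address (for example :::PORT)
#   not-listening  if no line for the port is found
# Assumption: the local address is the fourth whitespace-separated field.
bind_scope() {
  port="$1"
  awk -v port="$port" '
    $4 ~ (":" port "$") {
      found = 1
      if ($4 ~ /^127\./ || $4 ~ /^::1:/) print "loopback"
      else print "non-loopback"
      exit                      # report the first matching socket only
    }
    END { if (!found) print "not-listening" }
  '
}

# Usage on a node:
# netstat -tpln | bind_scope 10249
```

After the ConfigMap change and pod restart, every node should report non-loopback for port 10249.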
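So after installing Kubernetes, if you want to monitor the scheduler, controller-manager, and kube-proxy, it is best to make the changes described above, then restart kubelet and delete the kube-proxy pods. (Do not do this while production workloads are already running!)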