Common PromQL Techniques

Getting the owning controller name from a Pod name (Deployment/StatefulSet/DaemonSet)

Core idea

In Kubernetes, a Pod relates to its controller (Deployment, StatefulSet, or DaemonSet) in different ways:

  1. Deployment: Pod → ReplicaSet → Deployment (two lookups are needed)
  2. StatefulSet: Pod → StatefulSet (direct ownership)
  3. DaemonSet: Pod → DaemonSet (direct ownership)

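Base metrics

The kube_pod_owner metric

  • Carries the owner information of every Pod
  • owner_name: name of the owner (a ReplicaSet, StatefulSet, or DaemonSet name, among others)
  • owner_kind: kind of the owner ("ReplicaSet", "StatefulSet", "DaemonSet", and so on)

The kube_replicaset_owner metric

  • Carries the owner information of every ReplicaSet
  • owner_name: name of the owner, which is the Deployment name for ReplicaSets created by a Deployment
  • replicaset: name of the ReplicaSet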
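
Before building any joins, it can help to check which owner kinds actually occur in a namespace. The following is a minimal sketch, assuming kube-state-metrics is being scraped; your-namespace is a placeholder:

```promql
# Number of kube_pod_owner series per owner kind in one namespace
count by (owner_kind) (
  kube_pod_owner{namespace="your-namespace"}
)
```

Pods whose owner kind is not ReplicaSet, StatefulSet, or DaemonSet (for example Job) are not resolved by the queries on this page.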

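Getting the StatefulSet name

A StatefulSet's Pods are owned directly by the StatefulSet, with no intermediate layer:

# Step 1: get the Pod's owner information
kube_pod_owner{owner_kind="StatefulSet", pod="your-pod-name", namespace="your-namespace"}

# Step 2: copy the StatefulSet name into a new label and add a controller-type label
label_replace(
  label_replace(
    kube_pod_owner{owner_kind="StatefulSet", pod="your-pod-name", namespace="your-namespace"},
    "workload",    # name of the new label
    "$1",          # replacement value (first capture group)
    "owner_name",  # source label
    "(.*)"         # regex that captures the whole value
  ),
  "workload_type", # controller-type label
  "statefulset",   # fixed value: statefulset
  "",              # empty source label, so the input value is the empty string
  ".*"             # matches anything, including the empty string
)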

Result

  • The workload label holds the StatefulSet name
  • The workload_type label is set to statefulset

Getting the DaemonSet name

A DaemonSet's Pods are also owned directly by the DaemonSet:

label_replace(
  label_replace(
    kube_pod_owner{owner_kind="DaemonSet", pod="your-pod-name", namespace="your-namespace"},
    "workload",
    "$1",
    "owner_name",
    "(.*)"
  ),
  "workload_type", # controller-type label
  "daemonset",     # fixed value: daemonset
  "",
  ".*"
)

Result

  • The workload label holds the DaemonSet name
  • The workload_type label is set to daemonset

Getting the Deployment name

A Deployment's Pods have to be resolved through their ReplicaSet:

Step 1: get the Pod's ReplicaSet name

kube_pod_owner{owner_kind="ReplicaSet", pod="your-pod-name", namespace="your-namespace"}

At this point owner_name holds the ReplicaSet name (for example, my-deployment-abc123).

Step 2: copy the ReplicaSet name into a replicaset label so it can be used as a join key

label_replace(
  kube_pod_owner{owner_kind="ReplicaSet", pod="your-pod-name", namespace="your-namespace"},
  "replicaset",  # name of the new label
  "$1",          # replacement value
  "owner_name",  # source label (holds the ReplicaSet name)
  "(.*)"         # regex that captures the whole value
)

Step 3: look up the Deployment by ReplicaSet name

# Get the owner (Deployment) information of the ReplicaSet
kube_replicaset_owner{replicaset="my-deployment-abc123", namespace="your-namespace"}

At this point owner_name holds the Deployment name (for example, my-deployment).
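
A ReplicaSet is not necessarily owned by a Deployment. As a hedged variant of the lookup above, the owner_kind label of kube_replicaset_owner can be used to keep only Deployment owners (the ReplicaSet name is the same placeholder as before):

```promql
# Only return the owner if it is a Deployment
kube_replicaset_owner{owner_kind="Deployment", replicaset="my-deployment-abc123", namespace="your-namespace"}
```

If this returns nothing while the unfiltered query returns a series, the ReplicaSet is owned by something other than a Deployment.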

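Step 4: avoid many-to-many matching errors

In a group_left join, the right-hand side must contain at most one series per join key, here (replicaset, namespace). A Deployment owning several ReplicaSets during a rolling update is not a problem in itself, because each ReplicaSet is a separate key. The error appears when kube_replicaset_owner returns more than one series for the same ReplicaSet, for example when the metric is scraped from more than one kube-state-metrics instance or a ReplicaSet has more than one owner reference. Deduplicate the right-hand side with max and topk:

topk by(replicaset, namespace) (
  1,
  max by (replicaset, namespace, owner_name) (
    kube_replicaset_owner{namespace="your-namespace"}
  )
)

Notes

  • max by(replicaset, namespace, owner_name) drops every other label (such as instance) and collapses duplicate series for the same ReplicaSet and owner
  • topk by(replicaset, namespace) (1, ...) keeps exactly one series per ReplicaSet, even if more than one owner_name remains
  • With a unique right-hand side, the join no longer fails with a many-to-many matching error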
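
To see whether the raw metric actually contains duplicates, a quick check along these lines can be run first. This is a sketch with a placeholder namespace, not part of the original recipe:

```promql
# ReplicaSets that have more than one kube_replicaset_owner series
count by (replicaset, namespace) (
  kube_replicaset_owner{namespace="your-namespace"}
) > 1
```

An empty result suggests the metric is already unique per ReplicaSet in that namespace; keeping topk and max in the join is still an inexpensive safeguard.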

Step 5: the complete join

label_replace(
  label_replace(
    label_replace(
      kube_pod_owner{owner_kind="ReplicaSet", pod="your-pod-name", namespace="your-namespace"},
      "replicaset", "$1", "owner_name", "(.*)"
    ) * on(replicaset, namespace) group_left(owner_name) topk by(replicaset, namespace) (
      1, max by (replicaset, namespace, owner_name) (
        kube_replicaset_owner{namespace="your-namespace"}
      )
    ),
    "workload", "$1", "owner_name", "(.*)"
  ),
  "workload_type", # controller-type label
  "deployment",    # fixed value: deployment
  "",
  ".*"
)

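Notes

  • First: copy the ReplicaSet name from owner_name into a replicaset label
  • Second: join with kube_replicaset_owner using * on(replicaset, namespace)
  • Third: topk ensures each ReplicaSet contributes at most one series to the join
  • Fourth: group_left(owner_name) takes owner_name from the right-hand side, so it now holds the Deployment name
  • Fifth: copy the Deployment name into the workload label
  • Sixth: add the workload_type label with the value deployment

Result

  • The workload label holds the Deployment name
  • The workload_type label is set to deployment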
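
The same join can be turned around to list the Pods of a known Deployment. The sketch below assumes a Deployment named my-deployment in your-namespace (both placeholders) and filters the right-hand side instead of the Pod:

```promql
# Pods whose ReplicaSet is owned by the Deployment "my-deployment"
label_replace(
  kube_pod_owner{owner_kind="ReplicaSet", namespace="your-namespace"},
  "replicaset", "$1", "owner_name", "(.*)"
)
* on(replicaset, namespace) group_left()
max by (replicaset, namespace) (
  kube_replicaset_owner{owner_kind="Deployment", owner_name="my-deployment", namespace="your-namespace"}
)
```

Each returned series carries a pod label; Pods that belong to other Deployments have no match on the right-hand side and are dropped.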

Combining all types: the complete query

To handle Deployments, StatefulSets, and DaemonSets in a single query:

(
  # Deployment: resolved through the ReplicaSet
  label_replace(
    label_replace(
      label_replace(
        kube_pod_owner{owner_kind="ReplicaSet", namespace="your-namespace"},
        "replicaset", "$1", "owner_name", "(.*)"
      ) * on(replicaset, namespace) group_left(owner_name) topk by(replicaset, namespace) (
        1, max by (replicaset, namespace, owner_name) (
          kube_replicaset_owner{namespace="your-namespace"}
        )
      ),
      "workload", "$1", "owner_name", "(.*)"
    ),
    "workload_type", "deployment", "", ".*"
  )
  or
  # StatefulSet: direct ownership
  label_replace(
    label_replace(
      kube_pod_owner{owner_kind="StatefulSet", namespace="your-namespace"},
      "workload", "$1", "owner_name", "(.*)"
    ),
    "workload_type", "statefulset", "", ".*"
  )
  or
  # DaemonSet: direct ownership
  label_replace(
    label_replace(
      kube_pod_owner{owner_kind="DaemonSet", namespace="your-namespace"},
      "workload", "$1", "owner_name", "(.*)"
    ),
    "workload_type", "daemonset", "", ".*"
  )
)

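Notes

  • The or operator merges the three branches
  • In the final result, the workload label holds the controller name (Deployment, StatefulSet, or DaemonSet)
  • The workload_type label identifies the controller type: deployment, statefulset, or daemonset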

Complete example: total CPU requests (cores) per controller

Putting the pieces together gives the full query below. $cluster and $namespace are dashboard template variables (Grafana syntax). Because the selectors can match more than one cluster, cluster is included in every matching and grouping clause:

sum(
  sum(kube_pod_container_resource_requests{resource="cpu", container!="POD", container!="", cluster=~"$cluster", namespace=~"$namespace"}) by (namespace, cluster, pod)
  * on(namespace, cluster, pod) group_left(workload, workload_type)
  (
    label_replace(
      label_replace(
        label_replace(
          kube_pod_owner{owner_kind="ReplicaSet", cluster=~"$cluster", namespace=~"$namespace"},
          "replicaset", "$1", "owner_name", "(.*)"
        ) * on(replicaset, namespace, cluster) group_left(owner_name) topk by(replicaset, namespace, cluster) (
          1, max by (replicaset, namespace, cluster, owner_name) (
            kube_replicaset_owner{cluster=~"$cluster", namespace=~"$namespace"}
          )
        ),
        "workload", "$1", "owner_name", "(.*)"
      ),
      "workload_type", "deployment", "", ".*"
    )
    or
    label_replace(
      label_replace(
        kube_pod_owner{owner_kind="StatefulSet", cluster=~"$cluster", namespace=~"$namespace"},
        "workload", "$1", "owner_name", "(.*)"
      ),
      "workload_type", "statefulset", "", ".*"
    )
    or
    label_replace(
      label_replace(
        kube_pod_owner{owner_kind="DaemonSet", cluster=~"$cluster", namespace=~"$namespace"},
        "workload", "$1", "owner_name", "(.*)"
      ),
      "workload_type", "daemonset", "", ".*"
    )
  )
) by (namespace, cluster, workload, workload_type)

Execution flow

  1. Sum the CPU requests per Pod
  2. Join each Pod to its controller (Deployment/StatefulSet/DaemonSet) to obtain the workload and workload_type labels
  3. Aggregate by workload (controller name) and workload_type (controller type)
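
The * join in step 2 keeps only Pods that match on both sides, so Pods owned by other kinds (for example Jobs) or without an owner do not appear in the totals. A rough way to list them, sketched here with the same assumed cluster label and dashboard variables:

```promql
# Pods with CPU requests that have no ReplicaSet, StatefulSet, or DaemonSet owner
sum by (cluster, namespace, pod) (
  kube_pod_container_resource_requests{resource="cpu", container!="POD", container!="", cluster=~"$cluster", namespace=~"$namespace"}
)
unless on(cluster, namespace, pod)
kube_pod_owner{owner_kind=~"ReplicaSet|StatefulSet|DaemonSet", cluster=~"$cluster", namespace=~"$namespace"}
```

The value of each returned series is that Pod's total CPU request, which indicates how much is missing from the per-controller sums.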

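Key takeaways

  1. StatefulSet and DaemonSet: the Pod is owned directly by the controller, so the owner_name label of kube_pod_owner is enough
  2. Deployment: Pod → ReplicaSet → Deployment, so two lookups are needed
  3. Avoiding many-to-many errors: use topk by(replicaset, namespace) (1, max by(...)) so that each ReplicaSet contributes only one series to the join
  4. Consistent label names: use label_replace to expose every controller name as workload, which simplifies later queries
  5. Controller type: the workload_type label identifies the controller type (deployment, statefulset, or daemonset), which makes filtering and aggregating by type straightforward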

Reference: standard Prometheus recording rules

The approach above follows the standard kube-prometheus recording rules and is based on the following rule:

- expr: |
    max by (cluster, namespace, workload, pod) (
      label_replace(
        label_replace(
          kube_pod_owner{job="kube-state-metrics", owner_kind="ReplicaSet"},
          "replicaset", "$1", "owner_name", "(.*)"
        ) * on(replicaset, namespace) group_left(owner_name) topk by(replicaset, namespace) (
          1, max by (replicaset, namespace, owner_name) (
            kube_replicaset_owner{job="kube-state-metrics"}
          )
        ),
        "workload", "$1", "owner_name", "(.*)"
      )
    )
  labels:
    workload_type: deployment
  record: namespace_workload_pod:kube_pod_owner:relabel
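
If these recording rules are already loaded in your Prometheus, the per-controller CPU query can be written against the recorded series instead of repeating the joins. The sketch below assumes that equivalent rules also exist for workload_type statefulset and daemonset, and that each Pod has exactly one recorded series:

```promql
# CPU requests per controller, using the recorded Pod-to-workload mapping
sum by (cluster, namespace, workload, workload_type) (
  sum by (cluster, namespace, pod) (
    kube_pod_container_resource_requests{resource="cpu", container!="POD", container!=""}
  )
  * on(cluster, namespace, pod) group_left(workload, workload_type)
  namespace_workload_pod:kube_pod_owner:relabel
)
```

This moves the ReplicaSet join to rule evaluation time, which tends to make dashboard queries shorter and cheaper to run.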