DevOpsDays Taipei 2026

An Observability Service Handling Over 500 Million Metrics per Minute

Taboola ingests 500M+ unique metrics per minute across seven data centers, with a hybrid fleet of physical servers and Kubernetes. 

This session explains how we keep ingestion stable, query latency predictable, and costs under control.

We’ll walk through the real architecture decisions and tradeoffs:

  • Physical layer (Puppet-managed Prometheus): per-DC scraping and long retention.
  • Thanos integration: sidecars for per-DC exposure, and rulers for local and cross-DC rules.
  • Kubernetes layer (Helm): per-DC query tier, cross-DC query in IL.
  • High-cardinality strategy: label hygiene, shard boundaries, short-term vs. long-term query paths, and when to offload to compactor/store gateway.
  • Operational lessons: where the architecture breaks first, what we changed, and how we keep it predictable at scale.
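
To make the label hygiene point concrete, here is a minimal sketch of one way to spot labels that drive series cardinality: count the distinct values each label takes across a set of series. The function names, the sample series, and the threshold are all hypothetical and chosen for illustration; this is not Taboola's tooling.

```python
from collections import defaultdict


def label_cardinality(series):
    """Return the number of distinct values seen for each label name."""
    values = defaultdict(set)
    for labels in series:
        for name, value in labels.items():
            values[name].add(value)
    return {name: len(vals) for name, vals in values.items()}


def high_cardinality_labels(series, threshold):
    """Return label names with more than `threshold` distinct values, sorted."""
    counts = label_cardinality(series)
    return sorted(name for name, n in counts.items() if n > threshold)


# Hypothetical sample: each dict is the label set of one time series.
series = [
    {"__name__": "http_requests_total", "dc": "dc1", "status": "200", "request_id": "a1"},
    {"__name__": "http_requests_total", "dc": "dc1", "status": "500", "request_id": "b2"},
    {"__name__": "http_requests_total", "dc": "dc2", "status": "200", "request_id": "c3"},
    {"__name__": "http_requests_total", "dc": "dc2", "status": "200", "request_id": "d4"},
]

print(high_cardinality_labels(series, threshold=2))  # ['request_id']
```

A label such as the hypothetical `request_id` creates one series per value, so it is the kind of label that is usually dropped or moved out of the regular metrics path.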

The goal is to share concrete practices and a reusable mental model for anyone trying to scale Prometheus + Thanos across multiple DCs.


Audience takeaways:

  • How to split high-cardinality metrics from “regular” metrics without losing visibility
  • How to combine per-DC isolation with cross-DC global views using Thanos
  • Practical sharding and retention strategies for large-scale Prometheus deployments
  • A reference architecture for hybrid (physical + K8s) observability stacks
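
As a rough illustration of the sharding takeaway, the sketch below assigns scrape targets to a fixed number of shards with a stable hash, loosely modeled on the idea behind Prometheus's `hashmod` relabel action. The target names, shard count, and function names are assumptions for illustration, and the code does not claim to reproduce Prometheus's exact hashing.

```python
import hashlib


def shard_for(target: str, num_shards: int) -> int:
    """Map a target to a shard index using a stable (non-randomized) hash."""
    digest = hashlib.md5(target.encode("utf-8")).digest()
    # Interpret the last 8 bytes as an unsigned integer, then take the modulus.
    return int.from_bytes(digest[8:], "big") % num_shards


def assign_targets(targets, num_shards):
    """Group targets by shard index; every target lands in exactly one shard."""
    shards = {i: [] for i in range(num_shards)}
    for target in targets:
        shards[shard_for(target, num_shards)].append(target)
    return shards


# Hypothetical scrape targets in one DC, split across 4 Prometheus shards.
targets = [f"host-{i}:9100" for i in range(50)]
shards = assign_targets(targets, num_shards=4)

for index, members in shards.items():
    print(index, len(members))
```

Because the hash depends only on the target name, the same target always lands on the same shard as long as the shard count stays fixed. Changing the shard count moves most targets, which is one reason shard boundaries are worth planning up front.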

Speaker

熊崇緯 (Chungwei)

Taboola
R&D Team Lead, Infrastructure
LEVEL

Advanced

LANGUAGE

Chinese

TAGS
Observability
Target audience
DevOps Veteran
IT / DEV
IT / I have to do everything