TL;DR

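  • At its core this is a Rack app: a request to its /metrics endpoint uses sidekiq/api to read the statistics stored in Redis, then renders them into the Prometheus format with ERB
  • All metrics are taken from the state snapshot that Sidekiq keeps in Redis
  • Ships with monitoring for sidekiq-scheduler and sidekiq-cron
  • Extension is poorly supported. The most convenient approach is to add a layer of Rack middleware that appends to the response body returned by this gem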

Official site

Strech/sidekiq-prometheus-exporter: Basic metrics of Sidekiq with pluggable contribs

Environment setup

Starting from the setup in [[sidekiq环境]] (the Sidekiq environment note), add sidekiq-prometheus-exporter

File: Gemfile

source 'https://mirrors.tuna.tsinghua.edu.cn/rubygems/'

gem 'sidekiq', '~> 6.5.7'
gem 'sidekiq-prometheus-exporter', '~> 0.1.17'

Start the Sidekiq client

bundle exec irb -r ./por.rb
irb(main):001:0> PlainOldRuby.perform_async "like a dog", 3
=> "a070cbe003929cd8aa0921a1"

Start the Sidekiq server

bundle exec sidekiq -r ./por.rb

Start sidekiq-prometheus-exporter with bundle exec rackup -p9292 -o0.0.0.0, which loads the following config.ru

require './por.rb'

require "rack"
require "sidekiq/prometheus/exporter"

run Sidekiq::Prometheus::Exporter.to_app

Then visit http://localhost:9292/metrics
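
To check the endpoint from a script rather than a browser, something like the following could be used. It is only a sketch: fetch_metrics and parse_metrics are hypothetical helper names, not part of the gem, and the parser assumes sample lines carry no trailing timestamp (which matches the template shown later in this note). The demo parses an inline sample so it runs without a live exporter.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: downloads the exposition text from the exporter.
# Only useful once rackup is running; it is not called in this sketch.
def fetch_metrics(url = 'http://localhost:9292/metrics')
  Net::HTTP.get(URI(url))
end

# Hypothetical helper: turns sample lines into { "name{labels}" => Float },
# skipping blank lines and "# HELP" / "# TYPE" comment lines.
def parse_metrics(text)
  text.each_line.with_object({}) do |line, samples|
    line = line.strip
    next if line.empty? || line.start_with?('#')

    # The value is whatever follows the last space on the line.
    name, _, value = line.rpartition(' ')
    samples[name] = Float(value)
  end
end

sample = <<~'TEXT'
  # HELP sidekiq_processed_jobs_total The total number of processed jobs.
  # TYPE sidekiq_processed_jobs_total counter
  sidekiq_processed_jobs_total 1

  sidekiq_queue_latency_seconds{name="default"} 0.000
TEXT

parsed = parse_metrics(sample)
puts parsed
```

Against a running exporter, parse_metrics(fetch_metrics) would give the same kind of hash.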

Source code

The server that provides the /metrics endpoint

# sidekiq-prometheus-exporter-0.1.17/lib/sidekiq/prometheus/exporter.rb
module Sidekiq
  module Prometheus
    # Expose Prometheus metrics via Rack application or Sidekiq::Web application
    module Exporter
      # ...
      MOUNT_PATH = '/metrics'.freeze
      EXPORTERS = Exporters.new

      class << self
        def to_app
          Rack::Builder.app do
            map(MOUNT_PATH) do
              run Sidekiq::Prometheus::Exporter
            end
          end
        end

        def call(env)
          # ...
          [200, HEADERS, [EXPORTERS.to_s]]
        end
      end
    end
  end
end

A request to /metrics returns the to_s output of every loaded Sidekiq::Prometheus::Exporter::XXX exporter, joined with newlines

# sidekiq-prometheus-exporter-0.1.17/lib/sidekiq/prometheus/exporter/exporters.rb
module Sidekiq
  module Prometheus
    module Exporter
      class Exporters
        AVAILABLE_EXPORTERS = {
          standard: Sidekiq::Prometheus::Exporter::Standard,
          cron: Sidekiq::Prometheus::Exporter::Cron,
          scheduler: Sidekiq::Prometheus::Exporter::Scheduler
        }.freeze

        attr_reader :enabled

        def initialize
          @enabled = AVAILABLE_EXPORTERS.values.select(&:available?)
        end

        def to_s
          @enabled.map { |exporter| exporter.new.to_s }.join("\n".freeze)
        end
      end
    end
  end
end
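
The registry pattern above can be tried in isolation. The sketch below uses two made-up exporter classes in place of Standard, Cron, and Scheduler. It assumes an optional exporter decides its availability by checking whether a companion constant is loaded; SomeOptionalGem is invented for the illustration and is not how the gem is confirmed to do it.

```ruby
# Hypothetical exporters, standing in for Standard / Cron / Scheduler.
class AlwaysOnExporter
  def self.available?
    true
  end

  def to_s
    'demo_always_on 1'
  end
end

class OptionalExporter
  # Assumption: available only when a companion library is loaded.
  # SomeOptionalGem is a made-up constant that is never defined here.
  def self.available?
    defined?(::SomeOptionalGem) ? true : false
  end

  def to_s
    'demo_optional 1'
  end
end

AVAILABLE = { always_on: AlwaysOnExporter, optional: OptionalExporter }.freeze

# Same shape as Exporters#initialize: filter the classes once.
enabled = AVAILABLE.values.select(&:available?)

# Same shape as Exporters#to_s: build a fresh instance per render and join.
output = enabled.map { |exporter| exporter.new.to_s }.join("\n")
puts output
```

Note the split visible in the gem's source: because EXPORTERS is a constant, the available? filter runs once at load time, while exporter.new runs on every to_s, so the statistics are re-read per request.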

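Taking Sidekiq::Prometheus::Exporter::Standard as an example: it reads queues, sets, and statistics directly through [[sidekiq的api]] (the Sidekiq API note), then renders them through a template into the format Prometheus accepts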

# sidekiq-prometheus-exporter-0.1.17/lib/sidekiq/prometheus/exporter/standard.rb
require 'erb'
require 'sidekiq/api'

module Sidekiq
  module Prometheus
    module Exporter
      class Standard
        TEMPLATE = ERB.new(File.read(File.expand_path('templates/standard.erb', __dir__)))

        QueueStats = Struct.new(:name, :size, :latency)
        QueueWorkersStats = Struct.new(:total_workers, :busy_workers, :processes)
        WorkersStats = Struct.new(:total_workers, :by_queue)

        def self.available?
          true
        end

        def initialize
          @overview_stats = Sidekiq::Stats.new
          @queues_stats = queues_stats
          @workers_stats = workers_stats
          @max_processing_times = max_processing_times
        end

        def to_s
          TEMPLATE.result(binding).chomp!
        end

        private

        def queues_stats
          Sidekiq::Queue.all.map do |queue|
            QueueStats.new(queue.name, queue.size, queue.latency)
          end
        end

        def workers_stats
          workers_stats = WorkersStats.new(0, {})

          Sidekiq::ProcessSet.new.each_with_object(workers_stats) do |process, stats|
            stats.total_workers += process['concurrency'].to_i

            process['queues'].each do |queue|
              stats.by_queue[queue] ||= QueueWorkersStats.new(0, 0, 0)
              stats.by_queue[queue].processes += 1
              stats.by_queue[queue].busy_workers += process['busy'].to_i
              stats.by_queue[queue].total_workers += process['concurrency'].to_i
            end
          end
        end

        def max_processing_times
          now = Time.now.to_i

          Sidekiq::Workers.new
            .map { |_, _, execution| execution }
            .group_by { |execution| execution['queue'] }
            .each_with_object({}) do |(queue, executions), memo|
              oldest_execution = executions.min_by { |execution| execution['run_at'] }
              memo[queue] = now - oldest_execution['run_at']
            end
        end
      end
    end
  end
end
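
To see what max_processing_times computes, the same map / group_by / min_by chain can be run on hand-written data. In this sketch the workers array is a stand-in for Sidekiq::Workers.new, assumed to yield (process id, thread id, work hash) triples. The ids, timestamps, and the extra now parameter are invented for the illustration.

```ruby
# Stand-in for Sidekiq::Workers.new: (process_id, thread_id, work) triples.
workers = [
  ['host:1', 'tid-a', { 'queue' => 'default', 'run_at' => 1_000 }],
  ['host:1', 'tid-b', { 'queue' => 'default', 'run_at' => 1_040 }],
  ['host:2', 'tid-c', { 'queue' => 'mailers', 'run_at' => 1_055 }]
]

# Same logic as the private method above, with its inputs passed in
# explicitly so it can run without Redis.
def max_processing_times(workers, now)
  workers
    .map { |_, _, execution| execution }
    .group_by { |execution| execution['queue'] }
    .each_with_object({}) do |(queue, executions), memo|
      # The job that started earliest has been running the longest.
      oldest_execution = executions.min_by { |execution| execution['run_at'] }
      memo[queue] = now - oldest_execution['run_at']
    end
end

times = max_processing_times(workers, 1_060)
puts times
# default => 60 (1060 - 1000), mailers => 5 (1060 - 1055)
```

A queue with no job currently executing does not appear in the result at all, so the corresponding sidekiq_queue_max_processing_time_seconds series is simply absent for it.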

The template sidekiq-prometheus-exporter-0.1.17/lib/sidekiq/prometheus/exporter/templates/standard.erb has the following content

# HELP sidekiq_processed_jobs_total The total number of processed jobs.
# TYPE sidekiq_processed_jobs_total counter
sidekiq_processed_jobs_total <%= format('%d', @overview_stats.processed) %>

# HELP sidekiq_failed_jobs_total The total number of failed jobs.
# TYPE sidekiq_failed_jobs_total counter
sidekiq_failed_jobs_total <%= format('%d', @overview_stats.failed) %>

# HELP sidekiq_workers The number of workers across all the processes.
# TYPE sidekiq_workers gauge
sidekiq_workers <%= format('%d', @workers_stats.total_workers) %>

# HELP sidekiq_processes The number of processes.
# TYPE sidekiq_processes gauge
sidekiq_processes <%= format('%d', @overview_stats.processes_size) %>

# HELP sidekiq_busy_workers The number of workers performing the job.
# TYPE sidekiq_busy_workers gauge
sidekiq_busy_workers <%= format('%d', @overview_stats.workers_size) %>

# HELP sidekiq_enqueued_jobs The number of enqueued jobs.
# TYPE sidekiq_enqueued_jobs gauge
sidekiq_enqueued_jobs <%= format('%d', @overview_stats.enqueued) %>

# HELP sidekiq_scheduled_jobs The number of jobs scheduled for a future execution.
# TYPE sidekiq_scheduled_jobs gauge
sidekiq_scheduled_jobs <%= format('%d', @overview_stats.scheduled_size) %>

# HELP sidekiq_retry_jobs The number of jobs scheduled for the next try.
# TYPE sidekiq_retry_jobs gauge
sidekiq_retry_jobs <%= format('%d', @overview_stats.retry_size) %>

# HELP sidekiq_dead_jobs The number of jobs being dead.
# TYPE sidekiq_dead_jobs gauge
sidekiq_dead_jobs <%= format('%d', @overview_stats.dead_size) %>

# HELP sidekiq_queue_latency_seconds The number of seconds between oldest job being pushed to the queue and current time.
# TYPE sidekiq_queue_latency_seconds gauge
<% @queues_stats.each do |queue| %>sidekiq_queue_latency_seconds{name="<%= queue.name %>"} <%= format('%.3f', queue.latency) %>
<% end %>
# HELP sidekiq_queue_enqueued_jobs The number of enqueued jobs in the queue.
# TYPE sidekiq_queue_enqueued_jobs gauge
<% @queues_stats.each do |queue| %>sidekiq_queue_enqueued_jobs{name="<%= queue.name %>"} <%= format('%d', queue.size) %>
<% end %>
# HELP sidekiq_queue_max_processing_time_seconds The number of seconds between oldest job of the queue being executed and current time.
# TYPE sidekiq_queue_max_processing_time_seconds gauge
<% @max_processing_times.each do |queue, max_processing_time| %>sidekiq_queue_max_processing_time_seconds{name="<%= queue %>"} <%= format('%i', max_processing_time) %>
<% end %>
# HELP sidekiq_queue_workers The number of workers serving the queue.
# TYPE sidekiq_queue_workers gauge
<% @workers_stats.by_queue.each do |queue, stats| %>sidekiq_queue_workers{name="<%= queue %>"} <%= format('%i', stats.total_workers) %>
<% end %>
# HELP sidekiq_queue_processes The number of processes serving the queue.
# TYPE sidekiq_queue_processes gauge
<% @workers_stats.by_queue.each do |queue, stats| %>sidekiq_queue_processes{name="<%= queue %>"} <%= format('%i', stats.processes) %>
<% end %>
# HELP sidekiq_queue_busy_workers The number of workers performing the job for the queue.
# TYPE sidekiq_queue_busy_workers gauge
<% @workers_stats.by_queue.each do |queue, stats| %>sidekiq_queue_busy_workers{name="<%= queue %>"} <%= format('%i', stats.busy_workers) %>
<% end %>
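
The rendering mechanism (an ERB template evaluated against the exporter's binding, so it can see the instance variables) can be reproduced with a much smaller template. This is a sketch only: MiniExporter, QueueStat, and the demo_* metric names are invented and are not part of the gem.

```ruby
require 'erb'

# Hypothetical per-queue record, loosely modelled on QueueStats.
QueueStat = Struct.new(:name, :enqueued)

class MiniExporter
  # A cut-down template in the same style as standard.erb.
  TEMPLATE = ERB.new(<<~'TPL')
    # TYPE demo_enqueued_jobs gauge
    demo_enqueued_jobs <%= format('%d', @total) %>

    # TYPE demo_queue_enqueued_jobs gauge
    <% @queues.each do |queue| %>demo_queue_enqueued_jobs{name="<%= queue.name %>"} <%= format('%d', queue.enqueued) %>
    <% end %>
  TPL

  def initialize(queues)
    @queues = queues
    @total = queues.sum(&:enqueued)
  end

  def to_s
    # `binding` exposes @queues and @total to the template. The newline
    # after the final `<% end %>` is emitted too, hence the chomp.
    TEMPLATE.result(binding).chomp
  end
end

text = MiniExporter.new([QueueStat.new('default', 3), QueueStat.new('mailers', 0)]).to_s
puts text
```

Placing the sample text on the same line as the opening `<% ... do %>` tag, as the real template does, keeps each labelled sample on its own line without needing an ERB trim mode.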

Adding or removing metrics

Metrics cannot be added. The only supported change is removing the sidekiq-scheduler and sidekiq-cron monitoring, as follows

Sidekiq::Prometheus::Exporter.configure do |config|
  config.exporters = # ..
end
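
As a workaround for the missing extension point, the Rack middleware idea from the TL;DR might look roughly like this. It is a sketch under assumptions: ExtraMetrics and my_custom_gauge are hypothetical names, and a lambda stands in for Sidekiq::Prometheus::Exporter.to_app so the example runs without the gem or Redis.

```ruby
# Hypothetical middleware (not part of the gem): appends extra exposition
# text to any 200 response produced by the wrapped Rack app.
class ExtraMetrics
  def initialize(app, &extra)
    @app = app
    @extra = extra # block returning extra Prometheus-format lines
  end

  def call(env)
    status, headers, body = @app.call(env)
    return [status, headers, body] unless status == 200

    # Collect the original body, then append the extra lines.
    text = +''
    body.each { |chunk| text << chunk }
    body.close if body.respond_to?(:close)
    text << "\n" unless text.end_with?("\n")
    text << @extra.call

    # Drop any stale Content-Length, since the body has grown.
    headers = headers.reject { |key, _| key.to_s.casecmp?('content-length') }
    [status, headers, [text]]
  end
end

# Stand-in for Sidekiq::Prometheus::Exporter.to_app.
fake_exporter = lambda do |_env|
  [200, { 'Content-Type' => 'text/plain' }, ['sidekiq_processes 1']]
end

app = ExtraMetrics.new(fake_exporter) { "my_custom_gauge 42\n" }
status, _headers, body = app.call({})
puts body.join
```

In the config.ru from the setup section, this would presumably be wired in with a `use ExtraMetrics do ... end` line placed before `run Sidekiq::Prometheus::Exporter.to_app`. Responses with any status other than 200 pass through untouched.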