A Brief Look at elastic-apm
Environment setup
Then add the APM agent to the project and point it at the APM Server address.
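As a rough illustration of the agent setup, the sketch below starts the agent by hand in a non-Rails app. The address `http://localhost:8200` and the service name `my-app` are placeholder assumptions, and `APM_SETTINGS` and `start_apm` are hypothetical names. In a Rails app the gem normally starts itself and reads `config/elastic_apm.yml`, so this helper would not be needed there.

```ruby
# Placeholder settings (assumptions): replace with your own APM Server
# address and service name.
APM_SETTINGS = {
  server_url: ENV.fetch('ELASTIC_APM_SERVER_URL', 'http://localhost:8200'),
  service_name: ENV.fetch('ELASTIC_APM_SERVICE_NAME', 'my-app')
}.freeze

# Hypothetical helper: starts the agent and returns true, or returns false
# when the elastic-apm gem is not installed.
def start_apm(settings = APM_SETTINGS)
  require 'elastic-apm'
  ElasticAPM.start(settings)
  true
rescue LoadError
  warn 'elastic-apm gem is not installed; add it to the Gemfile first'
  false
end
```

Calling `start_apm` once at boot, and `ElasticAPM.stop` at shutdown, should be enough for a plain Rack or script-style process.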
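Structure analysis
span: collected by (a) subscribing to Rails events, (b) patching underlying gems, or (c) user calls to with_span. A span represents an operation of particular interest that may be costly for performance. Each span carries a transaction id.
transaction: collected in the middleware; it represents the whole request-response cycle. It may carry a trace id.
trace id: when calling other services, the trace id is added to the HTTP headers to enable distributed tracing.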
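The trace id propagation described above can be sketched in plain Ruby. This sketch assumes the W3C Trace Context `traceparent` header format (`version-traceid-parentid-flags`); the agent builds and reads such headers itself, so `build_traceparent` and `parse_traceparent` are hypothetical helpers for illustration only.

```ruby
require 'securerandom'

# W3C Trace Context layout: 2 hex version, 32 hex trace id,
# 16 hex parent (span) id, 2 hex flags.
TRACEPARENT_PATTERN = /\A(\h{2})-(\h{32})-(\h{16})-(\h{2})\z/

# Builds the header value an outgoing HTTP call would carry.
def build_traceparent(trace_id: SecureRandom.hex(16),
                      span_id: SecureRandom.hex(8),
                      sampled: true)
  format('00-%s-%s-%s', trace_id, span_id, sampled ? '01' : '00')
end

# Parses an incoming header value; returns nil when it is malformed.
def parse_traceparent(header)
  match = TRACEPARENT_PATTERN.match(header.to_s)
  return nil unless match

  {
    version: match[1],
    trace_id: match[2],
    parent_id: match[3],
    sampled: (match[4].hex & 1) == 1
  }
end
```

The downstream service reuses the parsed `trace_id` for its own transaction, which is what lets the two sides be stitched into one trace.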


Source code analysis of the elastic-apm Ruby agent
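To make collection mechanism (b), patching an underlying gem, more concrete, here is a minimal sketch using `Module#prepend`. It is not the agent's actual code: `FakeRedis`, `SpanLog`, and `RedisInstrumentation` are hypothetical stand-ins, and the in-memory array only imitates the queue from which an agent would ship spans to the APM Server.

```ruby
# Hypothetical stand-in for a low-level client gem.
class FakeRedis
  def call(command)
    "OK #{command}"
  end
end

# Hypothetical in-memory collector standing in for the agent's span queue.
module SpanLog
  def self.spans
    @spans ||= []
  end

  # Times the block and records a span-like hash, even if the block raises.
  def self.with_span(name, subtype)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC, :microsecond)
    yield
  ensure
    finished = Process.clock_gettime(Process::CLOCK_MONOTONIC, :microsecond)
    spans << { name: name, subtype: subtype, duration_us: finished - started }
  end
end

# Wraps the original method; `super` calls the unpatched FakeRedis#call.
module RedisInstrumentation
  def call(command)
    SpanLog.with_span("redis #{command}", 'redis') { super }
  end
end

FakeRedis.prepend(RedisInstrumentation)
```

After the `prepend`, every `FakeRedis#call` records one span without the calling code changing, which is the appeal of patching at the gem level.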

Adding extra tracing information

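Aggregation queries
Transactions and spans are stored in different indices, which makes it hard to answer questions such as "which URLs access Redis frequently" or "which URLs spend a small share of their time on I/O yet have a long total response time (RT)". The following script joins the two: for each transaction it aggregates span durations by subtype and writes the result to a separate index, txn_stats_001.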
require 'elasticsearch'

# Loads transactions from the APM transaction indices.
class Txns
  attr_reader :client

  def initialize(client)
    @client = client
  end

  # Returns up to 200 transactions wrapped as Txn objects.
  def load
    res = client.search(
      index: 'apm-*-transaction-*',
      body: {
        size: 200,
        query: { match_all: {} },
        _source: { includes: [
          'transaction.duration.us',
          'transaction.id',
          'url.path'
        ] }
      }
    )
    res['hits']['hits'].map { |hit| Txn.new(client, hit['_source']) }
  end
end

# One transaction plus per-subtype statistics of its spans.
class Txn
  attr_reader :conn, :src

  def initialize(conn, src)
    @conn = conn
    @src = src
  end

  def id
    @id ||= src.dig('transaction', 'id')
  end

  # Total transaction duration in microseconds.
  def duration
    @duration ||= src.dig('transaction', 'duration', 'us')
  end

  def path
    @path ||= src.dig('url', 'path')
  end

  # Aggregates this transaction's spans by span.subtype and adds each
  # subtype's share of the total transaction duration.
  def spans
    res = conn.search(
      index: 'apm-*-span-*',
      body: {
        size: 0,
        query: { terms: { 'transaction.id' => [id] } },
        aggs: {
          by_subtype: {
            terms: { field: 'span.subtype' },
            aggs: {
              duration_sum: {
                stats: { field: 'span.duration.us' }
              }
            }
          }
        }
      }
    )
    buckets = res.dig('aggregations', 'by_subtype', 'buckets')
    buckets.map do |b|
      stats = b['duration_sum']
      # 'percentage' is a fraction (0.25 means 25%); nil when the
      # transaction duration is missing or zero.
      share = duration.to_f.positive? ? stats['sum'].to_f / duration : nil
      stats.merge('percentage' => share, 'subtype' => b['key'])
    end
  end

  def to_doc
    { txn_id: id, path: path, duration: duration, spans: spans }
  end

  # Writes the joined document to the statistics index.
  def index
    conn.index(index: 'txn_stats_001', id: id, body: to_doc)
  end
end

client = Elasticsearch::Client.new(log: true)
Txns.new(client).load.each(&:index)
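
Once the joined documents exist, the two motivating questions can be answered with simple filters. The sketch below works on in-memory hashes shaped like the output of `Txn#to_doc` after a round trip through Elasticsearch (string keys). The sample values are made up for illustration, the function names and thresholds are hypothetical, and it assumes that every span subtype counts as I/O.

```ruby
# Made-up sample documents shaped like Txn#to_doc output.
SAMPLE_DOCS = [
  { 'path' => '/orders', 'duration' => 900_000, 'spans' => [
    { 'subtype' => 'redis', 'count' => 40, 'sum' => 90_000.0, 'percentage' => 0.1 }
  ] },
  { 'path' => '/search', 'duration' => 800_000, 'spans' => [
    { 'subtype' => 'elasticsearch', 'count' => 2, 'sum' => 720_000.0, 'percentage' => 0.9 }
  ] },
  { 'path' => '/health', 'duration' => 5_000, 'spans' => [] }
].freeze

# Total Redis span count per URL path, highest first.
def redis_calls_by_path(docs)
  totals = Hash.new(0)
  docs.each do |doc|
    doc['spans'].each do |span|
      totals[doc['path']] += span['count'] if span['subtype'] == 'redis'
    end
  end
  totals.sort_by { |_path, count| -count }.to_h
end

# Paths whose transactions are slow although spans (assumed to be I/O)
# account for only a small share of the total duration.
def slow_with_low_io(docs, min_duration_us:, max_io_share:)
  docs.select do |doc|
    io_share = doc['spans'].sum { |span| span['percentage'].to_f }
    doc['duration'] >= min_duration_us && io_share <= max_io_share
  end.map { |doc| doc['path'] }.uniq
end
```

With the sample data, `redis_calls_by_path(SAMPLE_DOCS)` points at `/orders`, and `slow_with_low_io(SAMPLE_DOCS, min_duration_us: 500_000, max_io_share: 0.3)` flags the same path as slow for reasons other than I/O.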


Aggregated statistics charts
