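An analysis of how chewy's witchcraft works
Looking at the definition of the witchcraft! method, it turns out that it only sets the class attribute _witchcraft to true on the current index class and checks whether gems such as parser and unparser have been loaded. It does nothing else.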
# chewy-5.1.0/lib/chewy/type/witchcraft.rb
module Chewy
  class Type
    module Witchcraft
      extend ActiveSupport::Concern

      included do
        class_attribute :_witchcraft, instance_reader: false, instance_writer: false
      end

      module ClassMethods
        def witchcraft!
          self._witchcraft = true
          check_requirements!
        end
        # ...
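To make the point concrete, here is a minimal plain-Ruby sketch of the same idea: calling `witchcraft!` only flips a class-level flag and compiles nothing. `ToyWitchcraft`, `PlainType`, and `FastType` are hypothetical names, and a class-level instance variable stands in for ActiveSupport's `class_attribute` (so, unlike the real thing, the flag is not inherited by subclasses).

```ruby
# Toy stand-in for Chewy::Type::Witchcraft; every name here is hypothetical.
module ToyWitchcraft
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Flip the class-level flag. No code is generated at this point.
    def witchcraft!
      @_witchcraft = true
    end

    # Query the flag, defaulting to false when witchcraft! was never called.
    def witchcraft?
      @_witchcraft == true
    end
  end
end

class PlainType
  include ToyWitchcraft
end

class FastType
  include ToyWitchcraft
  witchcraft!
end

puts PlainType.witchcraft?
puts FastType.witchcraft?
```

The flag is only consulted later, when a document is composed, which is why searching for its usages is the natural next step.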
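My guess was that the function is only assembled when the data is actually fetched, so I searched the whole chewy codebase for "witchcraft" and found the following usage, which seems to confirm the guess.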
# chewy-5.1.0/lib/chewy/type/import.rb
def compose(object, crutches = nil, fields: [])
  crutches ||= Chewy::Type::Crutch::Crutches.new self, [object]

  if witchcraft? && root.children.present?
    cauldron(fields: fields).brew(object, crutches)
  else
    root.compose(object, crutches, fields: fields)
  end
end
So let's trace the import process in witchcraft mode. First, define an index class:
class BookIndex < Chewy::Index
  define_type Book.includes(:author, :tags, :comments) do
    witchcraft!

    field :title, type: 'text'
    field :author, value: ->{ author.name }
    field :tags, type: 'keyword', value: ->{ tags.map(&:name) }
    field :comments, type: 'nested' do
      field :comment, value: -> (book, comment) { comment.comment }
      field :rating
    end
  end
end
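(This index class must be written in a file, because the parser extracts code according to its position in the syntax tree, so it needs a file and line numbers.)
Then run an import once: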
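As a small illustration of why a file and line number matter, core Ruby's `Proc#source_location` reports where a lambda was defined. This sketch only shows that lookup. The extraction chewy performs on top of it goes through additional gems and is not reproduced here, and the variable names are made up for the example.

```ruby
# A lambda similar in shape to a field's value: option.
value = -> { title.upcase }

# Core Ruby records where the lambda literal was written.
# The first two elements are the file name and the line number.
file, line = value.source_location

puts "defined in #{file} at line #{line}"
```

A lambda typed directly into a console has no real file behind it, which is consistent with the requirement that the index class live in a file.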
[4] pry(main)> binding.trace_tree(htmp: 'witchcraft_import', transcode: true, no_methods: /perform_request/){ BookIndex::Book.import [1] }
(46.6ms) SELECT COUNT(*) FROM "books" WHERE "books"."id" IN (1)
Book Load (59.8ms) SELECT "books".* FROM "books" WHERE "books"."id" IN (1)
Dude Load (65.4ms) SELECT "dudes".* FROM "dudes" WHERE "dudes"."id" IN (1)
SQL (159.2ms) SELECT "taggings".*, "taggings"."id" AS t0_r0, "taggings"."tag_id" AS t0_r1, "taggings"."taggable_id" AS t0_r2, "taggings"."taggable_type" AS t0_r3, "taggings"."tagger_id" AS t0_r4, "taggings"."tagger_type" AS t0_r5, "taggings"."context" AS t0_r6, "taggings"."created_at" AS t0_r7, "tags"."id" AS t1_r0, "tags"."name" AS t1_r1, "tags"."taggings_count" AS t1_r2 FROM "taggings" LEFT OUTER JOIN "tags" ON "tags"."id" = "taggings"."tag_id" WHERE "taggings"."context" = ? AND "taggings"."taggable_type" = 'Book' AND "taggings"."taggable_id" IN (1) [["context", "tags"]]
Comment Load (69.1ms) SELECT "comments".* FROM "comments" WHERE "comments"."book_id" IN (1)
BookIndex::Book Import (29470.0ms) {:index=>1}
=> true
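This gives the call stack below. Even with faraday's perform_request filtered out, it is still huge.
So I tried to locate the key points. As the compose method shows, witchcraft uses Cauldron#brew, so let's look at its code: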
# chewy-5.1.0/lib/chewy/type/witchcraft.rb
class Cauldron
  attr_reader :locals

  # @param type [Chewy::Type] type for composition
  # @param fields [Array] restricts the fields for composition
  def initialize(type, fields: [])
    @type = type
    @locals = []
    @fields = fields
  end

  def brew(object, crutches = nil)
    alicorn.call(locals, object, crutches).as_json
  end

  private

  def alicorn
    @alicorn ||= class_eval <<-RUBY, __FILE__, __LINE__ + 1
      -> (locals, object0, crutches) do
        #{composed_values(@type.root, 0)}
      end
    RUBY
  end

  def composed_values(field, nesting)
    source = <<-RUBY
      non_proc_values#{nesting} = #{non_proc_values(field, nesting)}
      proc_values#{nesting} = #{proc_values(field, nesting)}
      non_proc_values#{nesting}.merge!(proc_values#{nesting})
    RUBY
    source.gsub("\n,", ',')
  end
  # ...
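As you can see, composed_values returns a string containing non_proc_values, so I searched the call stack for that return value and got:

In other words, the alicorn method generates the following block and caches it in @alicorn: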
-> (locals, object0, crutches) do
  non_proc_values0 = (if object0.is_a?(Hash)
    {
      'title'.freeze => object0.has_key?(:title) ? object0[:title] : object0['title'], 'comments'.freeze =>
      # ...
    }
  else
    {
      'title'.freeze => object0.title, 'comments'.freeze =>
      # ...
    }
  end)
  proc_values0 = {
    'author'.freeze => (object0.author.name), 'tags'.freeze => (object0.tags.map(&:name))
  }
  non_proc_values0.merge!(proc_values0)
end
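When the block is called, the Book object is passed in as object0. As for the crutches parameter, the index definition does not use crutches for the comments data, so the generated code contains no crutches substitutions either.
Now look at the non-witchcraft source. According to the source, in witchcraft mode the JSON data is produced by cauldron(fields: fields).brew(object, crutches), whereas without witchcraft it is produced by root.compose(object, crutches, fields: fields). Looking up compose gives: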
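The general technique (build Ruby source once from the field definitions, eval it into a lambda, then reuse that lambda for every object) can be sketched in a few lines. This is a toy under stated assumptions, not chewy's implementation: `DEMO_FIELDS`, `build_source`, and the `Demo*` structs are hypothetical, and the value expressions are given as ready-made strings instead of being extracted from lambdas with a parser.

```ruby
# Hypothetical field list: name => Ruby source for the value, or nil for a
# plain attribute read. This is not chewy's internal representation.
DEMO_FIELDS = {
  'title'  => nil,
  'author' => 'object0.author.name',
  'tags'   => 'object0.tags.map(&:name)'
}.freeze

# Build the lambda source once; every field becomes a hard-coded expression.
def build_source(fields)
  pairs = fields.map do |name, expr|
    "#{name.inspect} => (#{expr || "object0.#{name}"})"
  end
  "->(object0) do\n  { #{pairs.join(', ')} }\nend"
end

DemoAuthor = Struct.new(:name)
DemoTag    = Struct.new(:name)
DemoBook   = Struct.new(:title, :author, :tags)

source   = build_source(DEMO_FIELDS)
composer = eval(source) # compiled once, then reused for every object

book   = DemoBook.new('Dune', DemoAuthor.new('Frank Herbert'), [DemoTag.new('sci-fi')])
result = composer.call(book)

puts source
puts result.inspect
```

After the one-time eval, composing a document is a single lambda call with no per-field branching, which is the property the generated block above relies on.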
# chewy-5.1.0/lib/chewy/fields/root.rb
def compose(object, crutches = nil, fields: [])
  result = evaluate([object, crutches])

  if children.present?
    child_fields = if fields.present?
      child_hash.slice(*fields).values
    else
      children
    end

    child_fields.each_with_object({}) do |field, memo|
      memo.merge!(field.compose(result, crutches) || {})
    end.as_json
  elsif fields.present?
    result.as_json(only: fields, root: false)
  else
    result.as_json(root: false)
  end
end
That is root's compose. The fields under root, and the fields under those fields, recurse as follows:
# chewy-5.1.0/lib/chewy/fields/base.rb
def compose(*objects)
  result = evaluate(objects)

  if children.present? && !multi_field?
    result = if result.respond_to?(:to_ary)
      result.to_ary.map { |item| compose_children(item, *objects) }
    else
      compose_children(result, *objects)
    end
  end

  {name => result}
end

private

def evaluate(objects)
  object = objects.first

  if value.is_a?(Proc)
    if value.arity.zero?
      object.instance_exec(&value)
    elsif value.arity < 0
      value.call(*object)
    else
      value.call(*objects.first(value.arity))
    end
  else
    message = value.is_a?(Symbol) || value.is_a?(String) ? value.to_sym : name

    if object.is_a?(Hash)
      if object.key?(message)
        object[message]
      else
        object[message.to_s]
      end
    else
      object.send(message)
    end
  end
end

def compose_children(value, *parent_objects)
  return unless value

  children.each_with_object({}) do |field, result|
    result.merge!(field.compose(value, *parent_objects) || {})
  end
end
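As shown, every time a field value is read, the code has to check whether the index field was defined with a Proc, and there are also operations such as instance_exec and send, so performance should be worse than the hard-coded, concatenated code that witchcraft produces.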
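The per-call branching described above can be reproduced in isolation. The sketch below is a simplified version of the Proc branch of evaluate (the negative-arity case is omitted). `evaluate_value`, `Writer`, `Remark`, and `Novel` are hypothetical names for illustration. Following `objects.first(value.arity)`, the nearest object is passed first and its parent second.

```ruby
# Simplified from the Proc branch of evaluate; hypothetical helper, not chewy's API.
def evaluate_value(value, objects)
  object = objects.first
  if value.arity.zero?
    # Zero-argument lambda: run it with the object as self.
    object.instance_exec(&value)
  else
    # Otherwise pass as many leading objects as the lambda declares.
    value.call(*objects.first(value.arity))
  end
end

Writer = Struct.new(:name)
Remark = Struct.new(:comment)
Novel  = Struct.new(:title, :writer)

novel  = Novel.new('Dune', Writer.new('Frank Herbert'))
remark = Remark.new('Great read')

# Zero-arity lambda: bare method names resolve against the novel.
by_self = evaluate_value(-> { writer.name }, [novel])

# Two-argument lambda: nearest object first, then its parent; nil plays the crutches slot.
by_args = evaluate_value(->(item, parent) { "#{parent.title}: #{item.comment}" }, [remark, novel, nil])

puts by_self
puts by_args
```

These arity checks and the instance_exec call happen for every field of every object, which is the overhead the generated code avoids.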
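Next, check how many function calls sit beneath brew and compose in the runtime call stack (brew should be called twice before checking, because the first call has to generate the code):
compose has 2225 calls

brew has 2115 calls

Well, the difference is small (about 5% fewer calls), and in terms of actual time the difference on my own machine is also only a few tens of milliseconds. Still, since there is some improvement, it is worth using witchcraft where possible...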
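For anyone who wants to repeat this kind of comparison outside chewy, a rough sketch with Ruby's standard Benchmark module follows. It compares a toy dynamic lookup against a toy generated lambda, not chewy's real compose and brew, so the numbers it prints say nothing about chewy itself. `Item`, `FIELD_NAMES`, and the iteration count are made up for the example.

```ruby
require 'benchmark'

Item = Struct.new(:title, :rating)
FIELD_NAMES = %w[title rating].freeze

# Dynamic path: look each field up by name on every call.
dynamic = lambda do |object|
  FIELD_NAMES.each_with_object({}) { |name, memo| memo[name] = object.send(name) }
end

# Generated path: build the source once, eval it, and reuse the lambda.
pairs = FIELD_NAMES.map { |name| "#{name.inspect} => object.#{name}" }.join(', ')
generated = eval("->(object) { { #{pairs} } }")

item = Item.new('Dune', 5)
iterations = 100_000

Benchmark.bm(10) do |x|
  x.report('dynamic')   { iterations.times { dynamic.call(item) } }
  x.report('generated') { iterations.times { generated.call(item) } }
end
```

Both lambdas return the same hash, so any timing difference comes only from how the values are looked up.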