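Intuitively, includes runs two SQL statements: the second takes the ids returned by the first as its parameter, so that all associated records are fetched in one go by the second statement, avoiding the N+1 problem.

Let's trace it: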

irb(main):034:0> binding.trace_tree(html: true, tmp: ['rails', 'includes.html']){Article.includes(:comments).limit(2).each{}}
  Article Load (2752.8ms)  SELECT  "articles".* FROM "articles" LIMIT ?  [["LIMIT", 2]]
  Comment Load (7870.3ms)  SELECT "comments".* FROM "comments" WHERE "comments"."article_id" IN (2, 3)
=> [#<Article id: 2, title: "brandy of the damned", text: "to say nothing of the dog", created_at: "2017-03-28 06:23:20", updated_at: "2017-03-28 06:23:20", user_id: 6>, #<Article id: 3, title: "a farewell to arms", text: "vile bodies", created_at: "2017-03-28 06:23:20", updated_at: "2017-03-28 06:23:20", user_id: 2>]


The call stack is as follows:

includes.html

includes simply stashes the associations to be included in includes_values for later use:

def includes(*args)
  check_if_method_has_arguments!(:includes, args)
  spawn.includes!(*args)
end

def includes!(*args) # :nodoc:
  args.reject!(&:blank?)
  args.flatten!

  self.includes_values |= args
  self
end
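
The stash-for-later pattern above can be tried out without Rails. The following is only a minimal sketch: TinyRelation and its spawn are hypothetical stand-ins (not ActiveRecord classes), and blank? is approximated with nil?/empty? because plain Ruby has no blank?.

```ruby
# Hypothetical stand-in for a relation; this is NOT ActiveRecord's real class.
class TinyRelation
  attr_reader :includes_values

  def initialize(includes_values = [])
    @includes_values = includes_values
  end

  # Non-bang version: work on a copy so the receiver is left untouched.
  def includes(*args)
    raise ArgumentError, "includes needs at least one argument" if args.empty?
    spawn.includes!(*args)
  end

  # Bang version: mutate in place; array union (|) drops duplicates.
  def includes!(*args)
    # Approximation of blank?, which only exists with ActiveSupport loaded.
    args.reject! { |a| a.nil? || (a.respond_to?(:empty?) && a.empty?) }
    args.flatten!
    @includes_values |= args
    self
  end

  private

  # Assumed behaviour for this sketch: return an independent copy.
  def spawn
    TinyRelation.new(@includes_values.dup)
  end
end

base = TinyRelation.new
chained = base.includes(:comments).includes(:comments, :user)

p base.includes_values     # => []
p chained.includes_values  # => [:comments, :user]
```

Nothing is queried at this point; the names are only recorded, and the original relation stays unchanged because each call works on a copy.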


When each triggers the actual query, the execution path is as follows; note that eager_loading? is false here.



Judging from the source, eager_loading? is what decides whether to JOIN or not:

def exec_queries(&block)
  @records = eager_loading? ? find_with_associations.freeze : @klass.find_by_sql(arel, bound_attributes, &block).freeze

  preload = preload_values
  preload +=  includes_values unless eager_loading?
  preloader = build_preloader
  preload.each do |associations|
    preloader.preload @records, associations
  end

  @records.each(&:readonly!) if readonly_value

  @loaded = true
  @records
end
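
The branching in exec_queries can be restated as a small pure-Ruby sketch. loading_plan and its keyword arguments are hypothetical names introduced for illustration only; they mirror the two decisions visible in the source above (which loader runs the main query, and which associations are handed to the preloader).

```ruby
# Hypothetical helper: summarises what exec_queries would do for given inputs.
def loading_plan(eager_loading:, preload_values:, includes_values:)
  preload = preload_values
  # includes are only preloaded separately when no JOIN-based eager load happens.
  preload += includes_values unless eager_loading
  {
    main_query: eager_loading ? :find_with_associations : :find_by_sql,
    preload: preload
  }
end

separate = loading_plan(eager_loading: false, preload_values: [], includes_values: [:comments])
joined   = loading_plan(eager_loading: true,  preload_values: [], includes_values: [:comments])

p separate[:main_query]  # => :find_by_sql
p separate[:preload]     # => [:comments]
p joined[:main_query]    # => :find_with_associations
p joined[:preload]       # => []
```

In the traced example eager_loading? is false, so the first case applies: one plain query, then one preload pass for :comments.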


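So what exactly is eager_loading?? Stepping into it shows that the result depends on whether four groups of values (lists of association or table names) contain anything: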

From: /home/z/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/activerecord-5.0.2/lib/active_record/relation.rb @ line 640 ActiveRecord::Relation#eager_loading?:

    637: def eager_loading?
    638:   @should_eager_load ||=
    639:     eager_load_values.any? ||
 => 640:     includes_values.any? && (joined_includes_values.any? || references_eager_loaded_tables?)
    641: end

[1] pry(#)> eager_load_values
=> []
[2] pry(#)> includes_values
=> [:comments]
[3] pry(#)> joined_includes_values
=> []
[4] pry(#)> references_eager_loaded_tables?
=> false
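
To make the operator precedence in that predicate explicit, here is a pure-Ruby restatement. It is a sketch only: the keyword arguments are hypothetical stand-ins for the relation's values, and references_missing stands in for the result of references_eager_loaded_tables?.

```ruby
# Hypothetical restatement of the predicate, without memoization.
def eager_loading?(eager_load_values:, includes_values:, joined_includes_values:, references_missing:)
  # && binds tighter than ||, so this reads as: a || (b && (c || d))
  eager_load_values.any? ||
    includes_values.any? && (joined_includes_values.any? || references_missing)
end

# The values from the pry session above: plain includes(:comments).
plain = eager_loading?(eager_load_values: [], includes_values: [:comments],
                       joined_includes_values: [], references_missing: false)

# The association appears in both includes and joins.
joined = eager_loading?(eager_load_values: [], includes_values: [:comments],
                        joined_includes_values: [:comments], references_missing: false)

# eager_load alone is enough, regardless of includes.
forced = eager_loading?(eager_load_values: [:comments], includes_values: [],
                        joined_includes_values: [], references_missing: false)

p plain   # => false
p joined  # => true
p forced  # => true
```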


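Searching through all of activerecord shows that eager_load_values is set as follows (it is not yet clear to me why one would not simply write a LEFT OUTER JOIN directly, e.g. with left_outer_joins):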

# Forces eager loading by performing a LEFT OUTER JOIN on +args+:
#
#   User.eager_load(:posts)
#   # SELECT "users"."id" AS t0_r0, "users"."name" AS t0_r1, ...
#   # FROM "users" LEFT OUTER JOIN "posts" ON "posts"."user_id" =
#   # "users"."id"
def eager_load(*args)
  check_if_method_has_arguments!(:eager_load, args)
  spawn.eager_load!(*args)
end

def eager_load!(*args) # :nodoc:
  self.eager_load_values += args
  self
end


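joined_includes_values, on the other hand, looks like this: if an association passed to includes also appears in joins, it is eager loaded as well (that is, exec_queries takes the find_with_associations path).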

# Joins that are also marked for preloading. In which case we should just eager load them.
# Note that this is a naive implementation because we could have strings and symbols which
# represent the same association, but that aren't matched by this. Also, we could have
# nested hashes which partially match, e.g. { a: :b } & { a: [:b, :c] }
def joined_includes_values
  includes_values & joins_values
end
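
The "naive implementation" caveat in the comment comes down to how Array#& compares elements. A small sketch with plain arrays (the association names are made up for illustration):

```ruby
includes_values = [:comments]

# Same symbol on both sides: the intersection finds it.
symbol_match = includes_values & [:comments]

# A string naming the same association is a different object to Array#&.
string_match = includes_values & ["comments"]

# Nested hashes that only partially overlap are not equal either.
nested_match = [{ author: :profile }] & [{ author: [:profile, :posts] }]

p symbol_match.any?  # => true
p string_match.any?  # => false
p nested_match.any?  # => false
```

Only the first case would make joined_includes_values non-empty; the other two are the mismatches the source comment warns about.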


references_eager_loaded_tables? looks complicated at first glance because it involves arel, but judging by its last line alone, only references_values and joined_tables matter:

From: /home/z/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/activerecord-5.0.2/lib/active_record/relation.rb @ line 736 ActiveRecord::Relation#references_eager_loaded_tables?:

    722: def references_eager_loaded_tables?
    723:   joined_tables = arel.join_sources.map do |join|
    724:     if join.is_a?(Arel::Nodes::StringJoin)
    725:       tables_in_string(join.left)
    726:     else
    727:       [join.left.table_name, join.left.table_alias]
    728:     end
    729:   end
    730:
    731:   joined_tables += [table.name, table.table_alias]
    732:
    733:   # always convert table names to downcase as in Oracle quoted table names are in uppercase
    734:   joined_tables = joined_tables.flatten.compact.map(&:downcase).uniq
    735:
 => 736:   (references_values - joined_tables).any?
    737: end

[9] pry(#)> joined_tables
=> ["articles"]
[10] pry(#)> references_values
=> []


references_values is set as follows:

# Use to indicate that the given +table_names+ are referenced by an SQL string,
# and should therefore be JOINed in any query rather than loaded separately.
# This method only works in conjunction with #includes.
# See #includes for more details.
#
#   User.includes(:posts).where("posts.name = 'foo'")
#   # Doesn't JOIN the posts table, resulting in an error.
#
#   User.includes(:posts).where("posts.name = 'foo'").references(:posts)
#   # Query now knows the string references posts, so adds a JOIN
def references(*table_names)
  check_if_method_has_arguments!(:references, table_names)
  spawn.references!(*table_names)
end

def references!(*table_names) # :nodoc:
  table_names.flatten!
  table_names.map!(&:to_s)

  self.references_values |= table_names
  self
end
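
Putting references! together with the last line of references_eager_loaded_tables?, the check reduces to an array difference. The sketch below uses two hypothetical helpers, add_references and references_unjoined_table?, that mirror only the array handling shown in the source; it does not touch arel.

```ruby
# Hypothetical helper mirroring references!: flatten, stringify, union.
def add_references(references_values, *table_names)
  names = table_names.flatten.map(&:to_s)
  references_values | names
end

# Hypothetical helper mirroring the final lines of references_eager_loaded_tables?.
def references_unjoined_table?(references_values, joined_tables)
  normalized = joined_tables.flatten.compact.map(&:downcase).uniq
  (references_values - normalized).any?
end

# As in the pry session: only the base table is present (its alias is nil).
joined_tables = [["articles", nil]]

no_refs       = []
comments_refs = add_references([], :comments, [:comments])
articles_refs = add_references([], "articles")

p comments_refs                                             # => ["comments"]
p references_unjoined_table?(no_refs, joined_tables)        # => false
p references_unjoined_table?(comments_refs, joined_tables)  # => true
p references_unjoined_table?(articles_refs, joined_tables)  # => false
```

A referenced table that is not already part of the query is what flips the result to true, which in turn makes eager_loading? true when includes_values is non-empty.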


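Read together with the Rails guide, references is used to name the tables that appear in SQL string fragments; the effect is the same as includes...where with a hash argument. Either way it results in a JOIN, and it relies on eager_loading? returning true so that exec_queries takes the find_with_associations path (the Rails guide suggests that writing includes...where as joins is more straightforward).

With includes alone, the desired effect is to run two SQL statements, the second taking the ids from the first as its parameter, and to attach the returned records to the models loaded by the first, while keeping the row count of the first statement instead of doing an inner join.

At a rough glance, the call stack shows no call to an ids method, but there is owner_keys: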

From: /home/z/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/activerecord-5.0.2/lib/active_record/associations/preloader/association.rb @ line 75 ActiveRecord::Associations::Preloader::Association#associated_records_by_owner:

    64: def associated_records_by_owner(preloader)
    65:   records = load_records do |record|
    66:     owner = owners_by_key[convert_key(record[association_key_name])]
    67:     association = owner.association(reflection.name)
    68:     association.set_inverse_instance(record)
    69:   end
    70:
    71:   owners.each_with_object({}) do |owner, result|
    72:     result[owner] = records[convert_key(owner[owner_key_name])] || []
    73:   end
    74:   binding.pry
 => 75: end

[2] pry(#)> owners
=> [#<Article:0x007fb7c31efe60 id: 81, title: "the daffodil sky", text: "beyond the mexique bay", created_at: Mon, 27 Mar 2017 15:57:57 UTC +00:00, updated_at: Mon, 27 Mar 2017 15:57:57 UTC +00:00, user_id: 38>,
 #<Article:0x007fb7c31dfdf8 id: 82, title: "this side of paradise", text: "edna o'brien", created_at: Mon, 27 Mar 2017 15:57:57 UTC +00:00, updated_at: Mon, 27 Mar 2017 15:57:57 UTC +00:00, user_id: 39>]
[3] pry(#)> owner_keys
=> [81, 82]
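
The grouping done in associated_records_by_owner can be imitated with plain Ruby objects. This is a simplified sketch under stated assumptions: the Struct classes, the COMMENTS array and find_comments are hypothetical stand-ins for the models and for the second SQL statement, not the preloader's real code.

```ruby
Article = Struct.new(:id, :title, :comments)
Comment = Struct.new(:id, :article_id, :body)

# Hypothetical in-memory "table" for this sketch.
COMMENTS = [
  Comment.new(1, 81, "first"),
  Comment.new(2, 81, "second"),
  Comment.new(3, 99, "belongs to an article that was not loaded")
].freeze

# Plays the role of: SELECT ... FROM comments WHERE article_id IN (...)
def find_comments(article_ids)
  COMMENTS.select { |c| article_ids.include?(c.article_id) }
end

def preload_comments(owners)
  owner_keys = owners.map(&:id)  # keys come from the already-loaded owners
  records_by_key = find_comments(owner_keys).group_by(&:article_id)
  owners.each do |owner|
    # Owners without matching records still get an empty list.
    owner.comments = records_by_key[owner.id] || []
  end
end

articles = [Article.new(81, "the daffodil sky"), Article.new(82, "this side of paradise")]
preload_comments(articles)

p articles.size                   # => 2
p articles[0].comments.map(&:id)  # => [1, 2]
p articles[1].comments            # => []
```

The owner list keeps its original size, and an article with no comments ends up with an empty array rather than disappearing, which is the difference from an inner join noted above.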


The way the preloader runs the second SQL statement is roughly as follows: