Commit dac5992

Add reference explorer
Allows analysis of the references to an object.
1 parent fa70dab commit dac5992

4 files changed: 259 additions, 0 deletions

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 ## HEAD
 
+- New command `heapy ref-explore` (https://github.com/zombocom/heapy/pull/33)
+
 ## 0.2.0
 
 - Heapy::Alive is removed (https://github.com/schneems/heapy/pull/27)

README.md

Lines changed: 68 additions & 0 deletions
@@ -133,6 +133,74 @@ $ heapy read tmp/2015-10-01T10:18:59-05:00-heap.dump all
 
 You can also use T-Lo's online JS based [Heap Analyzer](http://tenderlove.github.io/heap-analyzer/) for visualizations. Another tool is [HARB](https://github.com/csfrancis/harb)
 
+### Following references to an object
+
+Using the methods above you might find out that certain kinds of objects are retained for many generations, but you might still not know what keeps them retained.
+
+For this purpose you can use `heapy ref-explore`, which will follow the references to an object until it finds a GC root node. This should give you an
+indication of _why_ an object is still retained:
+
+```
+$ heapy ref-explore spec/fixtures/dumps/00-heap.dump 0x7fb47763feb0
+
+## Reference chain
+<OBJECT ActiveRecord::Attribute::FromDatabase 0x7FB47763FEB0> (allocated at activerecord-4.2.3/lib/active_record/attribute.rb:5)
+<HASH 0x7FB474CA1A90> (allocated at lib/active_record/attribute_set/builder.rb:30)
+<OBJECT ActiveRecord::LazyAttributeHash 0x7FB474CA1B30> (allocated at lib/active_record/attribute_set/builder.rb:16)
+<OBJECT ActiveRecord::AttributeSet 0x7FB474CA1A68> (allocated at lib/active_record/attribute_set/builder.rb:17)
+<OBJECT Repo 0x7FB474CA1A40> (allocated at activerecord-4.2.3/lib/active_record/core.rb:114)
+<ARRAY 996 items 0x7FB474D790A8> (allocated at activerecord-4.2.3/lib/active_record/querying.rb:50)
+<OBJECT Repo::ActiveRecord_Relation 0x7FB476A8BE98> (allocated at lib/active_record/relation/spawn_methods.rb:10)
+<OBJECT PagesController 0x7FB476AB25C0> (allocated at actionpack-4.2.3/lib/action_controller/metal.rb:237)
+<HASH 0x7FB4772EAE68> (allocated at rack-1.6.4/lib/rack/mock.rb:92)
+<OBJECT ActionDispatch::Request 0x7FB476AB2480> (allocated at actionpack-4.2.3/lib/action_controller/metal.rb:237)
+<OBJECT ActionDispatch::Response 0x7FB476AB2458> (allocated at lib/action_controller/metal/rack_delegation.rb:28)
+<ROOT machine_context 0x2> (allocated at )
+
+## All references to 0x7fb47763feb0
+ * <HASH 0x7FB474CA1A90> (allocated at lib/active_record/attribute_set/builder.rb:30)
+```
+
+#### Obtaining object addresses for inspection
+
+Heapy does not _yet_ include a way to obtain suitable addresses for further inspection. You might work around this using `grep`. Assuming you are
+looking for a string in generation 35 of your dump, you can filter like this:
+
+```
+grep "generation\":35" spec/fixtures/dumps/00-heap.dump | grep STRING
+```
+
+You can then try any of the addresses returned in the result.
+
+#### Interactive mode
+
+Loading a larger heap dump for reference exploration might take some time, and you might want to try more than one object address to see if they all share the same path to a root node. When called without an address, `ref-explore` enters interactive mode, where you can enter an address, see the result, and then enter the next address until you quit (Ctrl+C):
+
+```
+heapy ref-explore spec/fixtures/dumps/00-heap.dump
+Enter address > 0xdeadbeef
+
+Could not find a reference chain leading to a root node. Searching for a non-specific chain now.
+
+## Reference chain
+
+## All references to 0xdeadbeef
+
+Enter address > 0x7fb47763df70
+
+## Reference chain
+<STRING 0x7FB47763DF70> (allocated at lib/active_record/type/string.rb:35)
+<OBJECT ActiveRecord::Attribute::FromDatabase 0x7FB47763DF98> (allocated at activerecord-4.2.3/lib/active_record/attribute.rb:5)
+--- shortened for documentation purposes ---
+<OBJECT ActionDispatch::Response 0x7FB476AB2458> (allocated at lib/action_controller/metal/rack_delegation.rb:28)
+<ROOT machine_context 0x2> (allocated at )
+
+## All references to 0x7fb47763df70
+ * <OBJECT ActiveRecord::Attribute::FromDatabase 0x7FB47763DF98> (allocated at activerecord-4.2.3/lib/active_record/attribute.rb:5)
+
+Enter address >
+```
+
 ## Development
 
 After checking out the repo, run `$ bundle install` to install dependencies. Then, run `rake spec` to run the tests.
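
The `grep` workaround described in the README change can also be written in Ruby. The following is a minimal sketch and not part of this commit: the helper name `candidate_addresses` is hypothetical, and it assumes that every line of the dump is one JSON object carrying `address`, `type`, and `generation` keys (the `generation` key is the one the `grep` example filters on).

```ruby
require 'json'

# Hypothetical helper (not part of heapy): returns the addresses of all
# objects of the given type that were allocated in the given generation.
# Assumes one JSON document per line.
def candidate_addresses(lines, type:, generation:)
  lines.filter_map do |line|
    obj = JSON.parse(line)
    obj['address'] if obj['type'] == type && obj['generation'] == generation
  end
end

# Tiny synthetic dump for illustration; real dumps are much larger.
SAMPLE_DUMP = <<~DUMP
  {"address":"0x7fb47763df70","type":"STRING","generation":35}
  {"address":"0x7fb474ca1a90","type":"HASH","generation":35}
  {"address":"0x7fb47763feb0","type":"STRING","generation":12}
  {"type":"ROOT","root":"vm","references":["0x7fb474ca1a90"]}
DUMP

puts candidate_addresses(SAMPLE_DUMP.each_line, type: 'STRING', generation: 35)
```

For a real dump, `File.foreach(path)` could be passed in place of `SAMPLE_DUMP.each_line`; any address printed this way can then be handed to `heapy ref-explore`.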

lib/heapy.rb

Lines changed: 28 additions & 0 deletions
@@ -67,6 +67,33 @@ def diff(before, after, retained = nil)
       Diff.new(before: before, after: after, retained: retained, output_diff: options[:output_diff] || nil).call
     end
 
+    long_desc <<-DESC
+      Follows references to given object addresses and prints them as a reference stack. This can for example be useful
+      if you are wondering why a given object has not been garbage collected.
+
+      Run with a list of addresses to get results for reference stacks to all the given addresses
+
+        $ heapy ref-explore my.dump 0xabcdef 0xdeadbeef\x5
+
+      Run without specifying addresses to get an interactive prompt that asks you to enter one address at a time
+
+        $ heapy ref-explore my.dump\x5
+
+    DESC
+    desc "ref-explore <file> [<address>...]", "Follows references to a given object"
+    def ref_explore(file, *addresses)
+      explorer = ReferenceExplorer.new(file)
+      if addresses.any?
+        explorer.drill_down_list(addresses)
+      else
+        begin
+          explorer.drill_down_interactive
+        rescue Interrupt
+          nil
+        end
+      end
+    end
+
     map %w[--version -v] => :version
     desc "version", "Show heapy version"
     def version
@@ -103,3 +130,4 @@ def wat
 
 require 'heapy/analyzer'
 require 'heapy/diff'
+require 'heapy/reference_explorer'
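
The branching in `ref_explore` can be exercised without a heap dump. The sketch below is an illustration only: `StubExplorer` and `explore` are hypothetical stand-ins written for this example and are not heapy or Thor API. It shows that passing addresses selects list mode, while passing none selects interactive mode, where an `Interrupt` (Ctrl+C) is treated as a normal exit.

```ruby
# Stand-in for Heapy::ReferenceExplorer, used only for this illustration.
class StubExplorer
  attr_reader :calls

  def initialize
    @calls = []
  end

  def drill_down_list(addresses)
    @calls << [:list, addresses]
  end

  def drill_down_interactive
    @calls << [:interactive]
    raise Interrupt # simulate the user pressing Ctrl+C at the prompt
  end
end

# Mirrors the branching in `ref_explore`: addresses given -> list mode,
# otherwise interactive mode, with Ctrl+C swallowed instead of crashing.
def explore(explorer, *addresses)
  if addresses.any?
    explorer.drill_down_list(addresses)
  else
    begin
      explorer.drill_down_interactive
    rescue Interrupt
      nil
    end
  end
end

explorer = StubExplorer.new
explore(explorer, '0xabcdef', '0xdeadbeef')
explore(explorer)
p explorer.calls
```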

lib/heapy/reference_explorer.rb

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
+require 'json'
+require 'readline'
+require 'set'
+
+module Heapy
+
+  # Follows references to given object addresses and prints
+  # them as a reference stack.
+  # Since multiple reference stacks are possible, it will preferably
+  # try to print a stack that leads to a root node, since reference chains
+  # leading to a root node will make an object non-collectible by GC.
+  #
+  # In case no chain to a root node can be found one possible stack is printed
+  # as a fallback.
+  class ReferenceExplorer
+    def initialize(filename)
+      @objects = {}
+      @reverse_references = {}
+      @virtual_root_address = 0
+      File.open(filename) do |f|
+        f.each.with_index do |line, i|
+          o = JSON.parse(line)
+          addr = add_object(o)
+          add_reverse_references(o, addr)
+          add_class_references(o, addr)
+        end
+      end
+    end
+
+    def drill_down_list(addresses)
+      addresses.each { |addr| drill_down(addr) }
+    end
+
+    def drill_down_interactive
+      while buf = Readline.readline("Enter address > ", true)
+        drill_down(buf)
+      end
+    end
+
+    def drill_down(addr_string)
+      addr = addr_string.to_i(16)
+      puts
+
+      chain = find_root_chain(addr)
+      unless chain
+        puts 'Could not find a reference chain leading to a root node. Searching for a non-specific chain now.'
+        puts
+        chain = find_any_chain(addr)
+      end
+
+      puts '## Reference chain'
+      chain.each do |ref|
+        puts format_object(ref)
+      end
+
+      puts
+      puts "## All references to #{addr_string}"
+      refs = @reverse_references[addr] || []
+      refs.each do |ref|
+        puts " * #{format_object(ref)}"
+      end
+
+      puts
+    end
+
+    def inspect
+      "<ReferenceExplorer #{@objects.size} objects; #{@reverse_references.size} back-refs>"
+    end
+
+    private
+
+    def add_object(o)
+      addr = o['address']&.to_i(16)
+      if !addr && o['type'] == 'ROOT'
+        addr = @virtual_root_address
+        o['name'] ||= o['root']
+        @virtual_root_address += 1
+      end
+
+      return unless addr
+
+      simple_object = o.slice('type', 'file', 'name', 'class', 'length', 'imemo_type')
+      simple_object['class'] = simple_object['class'].to_i(16) if simple_object.key?('class')
+      simple_object['file'] = o['file'] + ":#{o['line']}" if o.key?('file') && o.key?('line')
+
+      @objects[addr] = simple_object
+
+      addr
+    end
+
+    def add_reverse_references(o, addr)
+      return unless o.key?('references')
+      o.fetch('references').map { |r| r.to_i(16) }.each do |ref|
+        (@reverse_references[ref] ||= []) << addr
+      end
+    end
+
+    # An instance of a class keeps that class marked by the GC.
+    # This is not directly indicated as a reference in a heap dump,
+    # so we manually introduce the back-reference.
+    def add_class_references(o, addr)
+      return unless o.key?('class')
+      return if o['type'] == 'IMEMO'
+
+      class_addr = o.fetch('class').to_i(16)
+      (@reverse_references[class_addr] ||= []) << addr
+    end
+
+    def find_root_chain(addr, known_addresses = Set.new)
+      known_addresses << addr
+
+      return [addr] if addr < @virtual_root_address # assumption: only root objects have smallest possible addresses
+
+      references = @reverse_references[addr] || []
+
+      references.reject { |a| known_addresses.include?(a) }.each do |ref|
+        path = find_root_chain(ref, known_addresses)
+        return [addr] + path if path
+      end
+
+      nil
+    end
+
+    def find_any_chain(addr, known_addresses = Set.new)
+      known_addresses << addr
+
+      references = @reverse_references[addr] || []
+
+      next_ref = references.reject { |a| known_addresses.include?(a) }.first
+      if next_ref
+        [addr] + find_any_chain(next_ref, known_addresses)
+      else
+        []
+      end
+    end
+
+    def format_path(path)
+      return '' unless path
+
+      path.split('/').reverse.take(4).reverse.join('/')
+    end
+
+    def format_object(addr)
+      obj = @objects[addr]
+      return "<Unknown 0x#{addr.to_s(16)}>" unless obj
+
+      desc = if obj['name']
+        obj['name']
+      elsif obj['type'] == 'OBJECT'
+        @objects.dig(obj['class'], 'name')
+      elsif obj['type'] == 'ARRAY'
+        "#{obj['length']} items"
+      elsif obj['type'] == 'IMEMO'
+        obj['imemo_type']
+      end
+      desc = desc ? " #{desc}" : ''
+      addr = addr ? " 0x#{addr.to_s(16).upcase}" : ''
+      "<#{obj['type']}#{desc}#{addr}> (allocated at #{format_path obj['file']})"
+    end
+  end
+end
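
The core idea of `ReferenceExplorer` is a reverse-reference index (referenced address to list of referrers) that is walked from the object towards a root. The sketch below reproduces that idea on a tiny synthetic dump. It is not the commit's implementation: it swaps the recursive depth-first search for an iterative breadth-first search, which yields a shortest chain and avoids deep recursion on long chains. The names `build_index` and `root_chain` are hypothetical, and the class back-references added by `add_class_references` are left out for brevity.

```ruby
require 'json'

# Hypothetical helper: builds { referenced_address => [referrer, ...] } from
# dump lines. Entries without an address (roots) get small synthetic
# addresses, following the same assumption as the commit.
def build_index(lines)
  reverse = Hash.new { |hash, key| hash[key] = [] }
  roots = []
  next_root = 0
  lines.each do |line|
    obj = JSON.parse(line)
    addr = obj['address']&.to_i(16)
    if addr.nil? && obj['type'] == 'ROOT'
      addr = next_root
      next_root += 1
      roots << addr
    end
    next unless addr

    obj.fetch('references', []).each { |ref| reverse[ref.to_i(16)] << addr }
  end
  [reverse, roots]
end

# Breadth-first search from the object towards a root.
# Returns the chain [object, referrer, ..., root], or nil if none exists.
def root_chain(reverse, roots, start)
  parent = { start => nil } # doubles as the visited set
  queue = [start]
  until queue.empty?
    addr = queue.shift
    if roots.include?(addr)
      chain = []
      while addr
        chain.unshift(addr)
        addr = parent[addr]
      end
      return chain
    end
    reverse.fetch(addr, []).each do |referrer|
      next if parent.key?(referrer)

      parent[referrer] = addr
      queue << referrer
    end
  end
  nil
end

# Synthetic dump: ROOT -> 0x30 -> 0x20 -> 0x10, and an unreferenced 0x40.
SAMPLE = [
  '{"type":"ROOT","root":"vm","references":["0x30"]}',
  '{"address":"0x30","type":"OBJECT","references":["0x20"]}',
  '{"address":"0x20","type":"HASH","references":["0x10"]}',
  '{"address":"0x10","type":"STRING"}',
  '{"address":"0x40","type":"STRING"}'
].freeze

reverse, roots = build_index(SAMPLE)
p root_chain(reverse, roots, 0x10)
p root_chain(reverse, roots, 0x40)
```

The first call finds the chain from `0x10` up to the synthetic root; the second returns `nil`, which corresponds to the case where the commit falls back to `find_any_chain`.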
