diff options
Diffstat (limited to 'doc/heapprofile.html')
-rw-r--r-- | doc/heapprofile.html | 355 |
1 files changed, 355 insertions, 0 deletions
diff --git a/doc/heapprofile.html b/doc/heapprofile.html new file mode 100644 index 0000000..b155eef --- /dev/null +++ b/doc/heapprofile.html @@ -0,0 +1,355 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<HTML> + +<HEAD> + <link rel="stylesheet" href="designstyle.css"> + <title>Google Heap Profiler</title> +</HEAD> + +<BODY> + +<p align=right> + <i>Last modified + <script type=text/javascript> + var lm = new Date(document.lastModified); + document.write(lm.toDateString()); + </script></i> +</p> + +<p>This is the heap profiler we use at Google, to explore how C++ +programs manage memory. This facility can be useful for</p> +<ul> + <li> Figuring out what is in the program heap at any given time + <li> Locating memory leaks + <li> Finding places that do a lot of allocation +</ul> + +<p>The profiling system instruments all allocations and frees. It +keeps track of various pieces of information per allocation site. An +allocation site is defined as the active stack trace at the call to +<code>malloc</code>, <code>calloc</code>, <code>realloc</code>, or, +<code>new</code>.</p> + +<p>There are three parts to using it: linking the library into an +application, running the code, and analyzing the output.</p> + + +<h1>Linking in the Library</h1> + +<p>To install the heap profiler into your executable, add +<code>-ltcmalloc</code> to the link-time step for your executable. +Also, while we don't necessarily recommend this form of usage, it's +possible to add in the profiler at run-time using +<code>LD_PRELOAD</code>: +<pre>% env LD_PRELOAD="/usr/lib/libtcmalloc.so" <binary></pre> + +<p>This does <i>not</i> turn on heap profiling; it just inserts the +code. For that reason, it's practical to just always link +<code>-ltcmalloc</code> into a binary while developing; that's what we +do at Google. (However, since any user can turn on the profiler by +setting an environment variable, it's not necessarily recommended to +install profiler-linked binaries into a production, running +system.) Note that if you wish to use the heap profiler, you must +also use the tcmalloc memory-allocation library. There is no way +currently to use the heap profiler separate from tcmalloc.</p> + + +<h1>Running the Code</h1> + +<p>There are several alternatives to actually turn on heap profiling +for a given run of an executable:</p> + +<ol> + <li> <p>Define the environment variable HEAPPROFILE to the filename + to dump the profile to. For instance, to profile + <code>/usr/local/bin/my_binary_compiled_with_tcmalloc</code>:</p> + <pre>% env HEAPPROFILE=/tmp/mybin.hprof /usr/local/bin/my_binary_compiled_with_tcmalloc</pre> + <li> <p>In your code, bracket the code you want profiled in calls to + <code>HeapProfilerStart()</code> and <code>HeapProfilerStop()</code>. + <code>HeapProfilerStart()</code> will take + the profile-filename-prefix as an argument. You can then use + <code>HeapProfileDump()</code> or <code>HeapProfile()</code> to + examine the profile.</p> +</ol> + + +<p>For security reasons, heap profiling will not write to a file -- +and is thus not usable -- for setuid programs.</p> + +<H2>Modifying Runtime Behavior</H2> + +<p>You can more finely control the behavior of the heap profiler via +commandline flags.</p> + +<table frame=box rules=sides cellpadding=5 width=100%> + +<tr valign=top> + <td><code>HEAP_PROFILE_ALLOCATION_INTERVAL</code></td> + <td>default: 1073741824 (1 Gb)</td> + <td> + Dump heap profiling information once every specified number of + bytes has been allocated by the program. + </td> +</tr> + +<tr valign=top> + <td><code>HEAP_PROFILE_INUSE_INTERVAL</code></td> + <td>default: 104857600 (100 Mb)</td> + <td> + Dump heap profiling information whenever the high-water memory + usage mark increases by the specified number of bytes. + </td> +</tr> + +<tr valign=top> + <td><code>HEAP_PROFILE_MMAP</code></td> + <td>default: false</td> + <td> + Profile <code>mmap</code> calls in addition to + <code>malloc</code>, <code>calloc</code>, <code>realloc</code>, + and <code>new</code>. + </td> +</tr> + +<tr valign=top> + <td><code>HEAP_PROFILE_MMAP_LOG</code></td> + <td>default: false</td> + <td> + Log <code>mmap</code>/<code>munmap</code>calls. + </td> +</tr> + +</table> + +<H2>Checking for Leaks</H2> + +<p>You can use the heap profiler to manually check for leaks, for +instance by reading the profiler output and looking for large +allocations. However, for that task, it's easier to use the <A +HREF="heap_checker.html">automatic heap-checking facility</A> built +into tcmalloc.</p> + + +<h1><a name="pprof">Analyzing the Output</a></h1> + +<p>If heap-profiling is turned on in a program, the program will +periodically write profiles to the filesystem. The sequence of +profiles will be named:</p> +<pre> + <prefix>.0000.heap + <prefix>.0001.heap + <prefix>.0002.heap + ... +</pre> +<p>where <code><prefix></code> is the filename-prefix supplied +when running the code (e.g. via the <code>HEAPPROFILE</code> +environment variable). Note that if the supplied prefix +does not start with a <code>/</code>, the profile files will be +written to the program's working directory.</p> + +<p>The profile output can be viewed by passing it to the +<code>pprof</code> tool -- the same tool that's used to analyze <A +HREF="cpuprofile.html">CPU profiles</A>. + +<p>Here are some examples. These examples assume the binary is named +<code>gfs_master</code>, and a sequence of heap profile files can be +found in files named:</p> +<pre> + /tmp/profile.0001.heap + /tmp/profile.0002.heap + ... + /tmp/profile.0100.heap +</pre> + +<h3>Why is a process so big</h3> + +<pre> + % pprof --gv gfs_master /tmp/profile.0100.heap +</pre> + +<p>This command will pop-up a <code>gv</code> window that displays +the profile information as a directed graph. Here is a portion +of the resulting output:</p> + +<p><center> +<img src="heap-example1.png"> +</center></p> + +A few explanations: +<ul> +<li> <code>GFS_MasterChunk::AddServer</code> accounts for 255.6 MB + of the live memory, which is 25% of the total live memory. +<li> <code>GFS_MasterChunkTable::UpdateState</code> is directly + accountable for 176.2 MB of the live memory (i.e., it directly + allocated 176.2 MB that has not been freed yet). Furthermore, + it and its callees are responsible for 729.9 MB. The + labels on the outgoing edges give a good indication of the + amount allocated by each callee. +</ul> + +<h3>Comparing Profiles</h3> + +<p>You often want to skip allocations during the initialization phase +of a program so you can find gradual memory leaks. One simple way to +do this is to compare two profiles -- both collected after the program +has been running for a while. Specify the name of the first profile +using the <code>--base</code> option. For example:</p> +<pre> + % pprof --base=/tmp/profile.0004.heap gfs_master /tmp/profile.0100.heap +</pre> + +<p>The memory-usage in <code>/tmp/profile.0004.heap</code> will be +subtracted from the memory-usage in +<code>/tmp/profile.0100.heap</code> and the result will be +displayed.</p> + +<h3>Text display</h3> + +<pre> +% pprof --text gfs_master /tmp/profile.0100.heap + 255.6 24.7% 24.7% 255.6 24.7% GFS_MasterChunk::AddServer + 184.6 17.8% 42.5% 298.8 28.8% GFS_MasterChunkTable::Create + 176.2 17.0% 59.5% 729.9 70.5% GFS_MasterChunkTable::UpdateState + 169.8 16.4% 75.9% 169.8 16.4% PendingClone::PendingClone + 76.3 7.4% 83.3% 76.3 7.4% __default_alloc_template::_S_chunk_alloc + 49.5 4.8% 88.0% 49.5 4.8% hashtable::resize + ... +</pre> + +<p> +<ul> + <li> The first column contains the direct memory use in MB. + <li> The fourth column contains memory use by the procedure + and all of its callees. + <li> The second and fifth columns are just percentage + representations of the numbers in the first and fifth columns. + <li> The third column is a cumulative sum of the second column + (i.e., the <code>k</code>th entry in the third column is the + sum of the first <code>k</code> entries in the second column.) +</ul> + +<h3>Ignoring or focusing on specific regions</h3> + +<p>The following command will give a graphical display of a subset of +the call-graph. Only paths in the call-graph that match the regular +expression <code>DataBuffer</code> are included:</p> +<pre> +% pprof --gv --focus=DataBuffer gfs_master /tmp/profile.0100.heap +</pre> + +<p>Similarly, the following command will omit all paths subset of the +call-graph. All paths in the call-graph that match the regular +expression <code>DataBuffer</code> are discarded:</p> +<pre> +% pprof --gv --ignore=DataBuffer gfs_master /tmp/profile.0100.heap +</pre> + +<h3>Total allocations + object-level information</h3> + +<p>All of the previous examples have displayed the amount of in-use +space. I.e., the number of bytes that have been allocated but not +freed. You can also get other types of information by supplying a +flag to <code>pprof</code>:</p> + +<center> +<table frame=box rules=sides cellpadding=5 width=100%> + +<tr valign=top> + <td><code>--inuse_space</code></td> + <td> + Display the number of in-use megabytes (i.e. space that has + been allocated but not freed). This is the default. + </td> +</tr> + +<tr valign=top> + <td><code>--inuse_objects</code></td> + <td> + Display the number of in-use objects (i.e. number of + objects that have been allocated but not freed). + </td> +</tr> + +<tr valign=top> + <td><code>--alloc_space</code></td> + <td> + Display the number of allocated megabytes. This includes + the space that has since been de-allocated. Use this + if you want to find the main allocation sites in the + program. + </td> +</tr> + +<tr valign=top> + <td><code>--alloc_objects</code></td> + <td> + Display the number of allocated objects. This includes + the objects that have since been de-allocated. Use this + if you want to find the main allocation sites in the + program. + </td> + +</table> +</center> + + +<h3>Interactive mode</a></h3> + +<p>By default -- if you don't specify any flags to the contrary -- +pprof runs in interactive mode. At the <code>(pprof)</code> prompt, +you can run many of the commands described above. You can type +<code>help</code> for a list of what commands are available in +interactive mode.</p> + + +<h1>Caveats</h1> + +<ul> + <li> Heap profiling requires the use of libtcmalloc. This + requirement may be removed in a future version of the heap + profiler, and the heap profiler separated out into its own + library. + + <li> If the program linked in a library that was not compiled + with enough symbolic information, all samples associated + with the library may be charged to the last symbol found + in the program before the libary. This will artificially + inflate the count for that symbol. + + <li> If you run the program on one machine, and profile it on + another, and the shared libraries are different on the two + machines, the profiling output may be confusing: samples that + fall within the shared libaries may be assigned to arbitrary + procedures. + + <li> Several libraries, such as some STL implementations, do their + own memory management. This may cause strange profiling + results. We have code in libtcmalloc to cause STL to use + tcmalloc for memory management (which in our tests is better + than STL's internal management), though it only works for some + STL implementations. + + <li> If your program forks, the children will also be profiled + (since they inherit the same HEAPPROFILE setting). Each + process is profiled separately; to distinguish the child + profiles from the parent profile and from each other, all + children will have their process-id attached to the HEAPPROFILE + name. + + <li> Due to a hack we make to work around a possible gcc bug, your + profiles may end up named strangely if the first character of + your HEAPPROFILE variable has ascii value greater than 127. + This should be exceedingly rare, but if you need to use such a + name, just set prepend <code>./</code> to your filename: + <code>HEAPPROFILE=./Ägypten</code>. +</ul> + +<hr> +<address>Sanjay Ghemawat<br> +<!-- Created: Tue Dec 19 10:43:14 PST 2000 --> +<!-- hhmts start --> +Last modified: Sat Feb 24 14:33:15 PST 2007 (csilvers) +<!-- hhmts end --> +</address> +</body> +</html> |