diff options
author | Ivan Maidanski <ivmai@mail.ru> | 2017-06-20 10:31:44 +0300 |
---|---|---|
committer | Ivan Maidanski <ivmai@mail.ru> | 2017-06-20 10:31:44 +0300 |
commit | 0e5ce680295709d971a32fa9d13008227332c1a0 (patch) | |
tree | 1871b026c5399d0c81e4a2b30088abed5695d019 /doc | |
parent | 0ac938d0823533cfca981c54a6316c3f482baea8 (diff) | |
download | bdwgc-0e5ce680295709d971a32fa9d13008227332c1a0.tar.gz |
Convert overview.html, tree.html to Markdown format
* README.md: Replace overview.html with overview.md.
* doc/doc.am (dist_doc_DATA): Likewise.
* doc/doc.am (dist_doc_DATA): Replace tree.html with tree.md.
* doc/gcdescr.md (Mark phase): Likewise.
* doc/overview.html: Change file suffix to .md; convert text format
from HTML to Markdown.
* doc/tree.html: Likewise.
Diffstat (limited to 'doc')
-rw-r--r-- | doc/doc.am | 4 | ||||
-rw-r--r-- | doc/gcdescr.md | 2 | ||||
-rw-r--r-- | doc/overview.html | 431 | ||||
-rw-r--r-- | doc/overview.md | 388 | ||||
-rw-r--r-- | doc/tree.html | 195 | ||||
-rw-r--r-- | doc/tree.md | 175 |
6 files changed, 566 insertions, 629 deletions
@@ -42,8 +42,8 @@ dist_doc_DATA = \ doc/gcdescr.md \ doc/gcinterface.md \ doc/leak.md \ - doc/overview.html \ + doc/overview.md \ doc/porting.md \ doc/scale.md \ doc/simple_example.md \ - doc/tree.html + doc/tree.md diff --git a/doc/gcdescr.md b/doc/gcdescr.md index 4dbae2fc..bddbab30 100644 --- a/doc/gcdescr.md +++ b/doc/gcdescr.md @@ -251,7 +251,7 @@ block. This is done in the following steps: actually consist of multiple such pages. HBLKSIZE is usually the page size divided by a small power of two.) * The page address part of the candidate pointer is looked up in - a [table](tree.html). Each table entry contains either 0, indicating that + a [table](tree.md). Each table entry contains either 0, indicating that the page is not part of the garbage collected heap, a small integer _n_, indicating that the page is part of large object, starting at least _n_ pages back, or a pointer to a descriptor for the page. In the first case, diff --git a/doc/overview.html b/doc/overview.html deleted file mode 100644 index 3c31c862..00000000 --- a/doc/overview.html +++ /dev/null @@ -1,431 +0,0 @@ -<!DOCTYPE HTML> -<html><head><title>A garbage collector for C and C++</title></head> -<body> -<table bgcolor="#f0f0ff" cellpadding="10%"> - <tbody><tr> - <td><a href="gcinterface.md">Interface Overview</a></td> - <td><a href="http://www.hboehm.info/gc/04tutorial.pdf">Tutorial Slides</a></td> - <td><a href="http://www.hboehm.info/gc/faq.html">FAQ</a></td> - <td><a href="simple_example.md">Example</a></td> - <td><a href="https://github.com/ivmai/bdwgc/wiki/Download">Download</a></td> - <td><a href="http://www.hboehm.info/gc/license.txt">License</a></td> - </tr> -</tbody></table> -<h1>A garbage collector for C and C++</h1> -<ul> -<li><a href="#platforms">Platforms</a> -</li><li><a href="#multiprocessors">Scalable multiprocessor versions</a> -</li><li><a href="#details">Some collector details</a> -</li><li><a href="#further">Further reading</a> -</li><li><a href="#users">Current users</a> -</li><li><a href="#collector">Local Links for this collector</a> -</li><li><a href="#background">Local Background Links</a> -</li><li><a href="#contacts">Contacts and Mailing List</a> -</li></ul> -[ This is an updated version of the page formerly at -<tt>www.hpl.hp.com/personal/Hans_Boehm/gc/</tt>, -before that at -<tt>http://reality.sgi.com/boehm/gc.html</tt> -and before that at -<tt>ftp://ftp.parc.xerox.com/pub/gc/gc.html</tt>. ] -<p> -The <a href="http://www.hboehm.info">Boehm</a>-<a href="http://www.cs.cornell.edu/annual_report/00-01/bios.htm#demers">Demers</a>-<a href="http://www.ubiq.com/hypertext/weiser/weiser.html">Weiser</a> -conservative Garbage Collector (<b>BDWGC</b>) can -be used as a garbage collecting -replacement for C <tt>malloc</tt> or C++ <tt>new</tt>. -It allows you to allocate memory basically as you normally would, -without explicitly deallocating memory that is no longer useful. -The collector automatically recycles memory when it determines -that it can no longer be otherwise accessed. -A simple example of such a use is given -<a href="simple_example.md">here</a>. -</p><p> -The collector is also used by a number of programming language -implementations that either use C as intermediate code, want -to facilitate easier interoperation with C libraries, or -just prefer the simple collector interface. -For a more detailed description of the interface, see -<a href="gcinterface.md">here</a>. -</p><p> -Alternatively, the garbage collector may be used as -a <a href="leak.md">leak detector</a> -for C or C++ programs, though that is not its primary goal. -</p><p> -Typically several versions are offered for -<a href="https://github.com/ivmai/bdwgc/wiki/Download">downloading</a>: -preview, stable, legacy. -Usually you should use the one marked as the <i>latest stable</i> release. -Preview versions may contain additional features, platform support, -but are likely to be less well tested. -The list of changes for each version is specified on the -<a href="https://github.com/ivmai/bdwgc/releases">releases</a> page. -</p><p> -The arguments for and against conservative garbage collection -in C and C++ are briefly -discussed in -<a href="http://www.hboehm.info/gc/issues.html">issues.html</a>. -The beginnings of a frequently-asked-questions list are -<a href="http://www.hboehm.info/gc/faq.html">here</a>. -</p><p> -The garbage collector code is copyrighted by -<a href="http://www.hboehm.info">Hans-J. Boehm</a>, -Alan J. Demers, -<a href="http://www.xerox.com/">Xerox Corporation</a>, -<a href="http://www.sgi.com/">Silicon Graphics</a>, -and -<a href="http://www.hp.com/">Hewlett-Packard Company</a>. -It may be used and copied without payment of a fee under minimal restrictions. -See the README file in the distribution or the -<a href="http://www.hboehm.info/gc/license.txt">license</a> for more details. -<b>IT IS PROVIDED AS IS, -WITH ABSOLUTELY NO WARRANTY EXPRESSED OR IMPLIED. ANY USE IS AT YOUR OWN RISK</b>. -</p><p> -Empirically, this collector works with most unmodified C programs, -simply by replacing -<tt>malloc</tt> with <tt>GC_malloc</tt> calls, -replacing <tt>realloc</tt> with <tt>GC_realloc</tt> calls, and removing -free calls. Exceptions are discussed -in <a href="http://www.hboehm.info/gc/issues.html">issues.html</a>. -</p><h2><a name="platforms">Platforms</a></h2> -The collector is not completely portable, but the distribution -includes ports to most standard PC and UNIX/Linux platforms. -The collector should work on Linux, *BSD, recent Windows versions, -MacOS X, HP/UX, Solaris, -Tru64, Irix and a few other operating systems. -Some ports are more polished than others. -<p> -Irix pthreads, Linux threads, Win32 threads, Solaris threads -(pthreads only), -HP/UX 11 pthreads, Tru64 pthreads, and MacOS X threads are supported -in recent versions. -</p><h3>Separately distributed ports</h3> -For MacOS 9/Classic use, Patrick Beard's latest port is available from -<tt>http://homepage.mac.com/pcbeard/gc/</tt>. -(Unfortunately, that's now quite dated. -I'm not in a position to test under MacOS. Although I try to -incorporate changes, it is impossible for -me to update the project file.) -<p> -Precompiled versions of the collector for NetBSD are available -<a href="ftp://ftp.netbsd.org/pub/pkgsrc/current/pkgsrc/devel/boehm-gc/README.html">here</a>. -</p><p> -<a href="http://www.debian.org/">Debian Linux</a> includes prepackaged -versions of the collector. -</p><h2><a name="multiprocessors">Scalable multiprocessor versions</a></h2> -Kenjiro Taura, Toshio Endo, and Akinori Yonezawa have made available -a <a href="http://ieeexplore.ieee.org/abstract/document/1592629/">parallel collector</a> -based on this one. Their collector takes advantage of multiple processors -during a collection. Starting with collector version 6.0alpha1 -we also do this, though with more modest processor scalability goals. -Our approach is discussed briefly in -<a href="scale.md">this document</a>. -<h2><a name="details">Some Collector Details</a></h2> -The collector uses a <a href="http://www.hboehm.info/gc/complexity.html">mark-sweep</a> algorithm. -It provides incremental and generational -collection under operating systems which provide the right kind of -virtual memory support. (Currently this includes SunOS[45], IRIX, -OSF/1, Linux, and Windows, with varying restrictions.) -It allows <a href="finalization.md"><i>finalization</i></a> code -to be invoked when an object is collected. -It can take advantage of type information to locate pointers if such -information is provided, but it is usually used without such information. -See the README and -<tt>gc.h</tt> files in the distribution for more details. -<p> -For an overview of the implementation, see <a href="gcdescr.md">here</a>. -</p><p> -The garbage collector distribution includes a C string -(<a type="text/plain" href="../include/cord.h"><i>cord</i></a>) package that provides -for fast concatenation and substring operations on long strings. -A simple curses- and win32-based editor that represents the entire file -as a cord is included as a -sample application. -</p><p> -Performance of the nonincremental collector is typically competitive -with malloc/free implementations. Both space and time overhead are -likely to be only slightly higher -for programs written for malloc/free -(see Detlefs, Dosser and Zorn's -<a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.3073&rep=rep1&type=ps">Memory Allocation Costs in Large C and C++ Programs</a>.) -For programs allocating primarily very small objects, the collector -may be faster; for programs allocating primarily large objects it will -be slower. If the collector is used in a multi-threaded environment -and configured for thread-local allocation, it may in some cases -significantly outperform malloc/free allocation in time. -</p><p> -We also expect that in many cases any additional overhead -will be more than compensated for by decreased copying etc. -if programs are written -and tuned for garbage collection. -</p><h1><a name="further">Further Reading:</a></h1> -<b>The beginnings of a frequently asked questions list for this -collector are <a href="http://www.hboehm.info/gc/faq.html">here</a></b>. -<p> -<b>The following provide information on garbage collection in general</b>: -</p><p> -Paul Wilson's <a href="ftp://ftp.cs.utexas.edu/pub/garbage">garbage collection ftp archive</a> and <a href="ftp://ftp.cs.utexas.edu/pub/garbage/gcsurvey.ps">GC survey</a>. -</p><p> -The Ravenbrook <a href="http://www.memorymanagement.org/"> -Memory Management Reference</a>. -</p><p> -David Chase's -<a href="http://www.iecc.com/gclist/GC-faq.html">GC FAQ</a>. -</p><p> -Richard Jones' -<a href="https://www.cs.kent.ac.uk/people/staff/rej/gc.html"> -Garbage Collection Page</a> and -<a href="http://www.cs.kent.ac.uk/people/staff/rej/gcbook/gcbook.html"> -his book</a>. -</p><p> -<b>The following papers describe the collector algorithms we use -and the underlying design decisions at -a higher level.</b> -</p><p> -(Some of the lower level details can be found -<a href="gcdescr.md">here</a>.) -</p><p> -The first one is not available -electronically due to copyright considerations. Most of the others are -subject to ACM copyright. -</p><p> -Boehm, H., "Dynamic Memory Allocation and Garbage Collection", <i>Computers in Physics -9</i>, 3, May/June 1995, pp. 297-303. This is directed at an otherwise sophisticated -audience unfamiliar with memory allocation issues. The algorithmic details differ -from those in the implementation. There is a related letter to the editor and a minor -correction in the next issue. -</p><p> -Boehm, H., and <a href="http://www.ubiq.com/hypertext/weiser/weiser.html">M. Weiser</a>, -<a href="http://www.hboehm.info/spe_gc_paper/">"Garbage Collection in an Uncooperative Environment"</a>, -<i>Software Practice & Experience</i>, September 1988, pp. 807-820. -</p><p> -Boehm, H., A. Demers, and S. Shenker, <a href="http://www.hboehm.info/gc/papers/pldi91.ps.Z">"Mostly Parallel Garbage Collection"</a>, -Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation, -<i>SIGPLAN Notices 26</i>, 6 (June 1991), pp. 157-164. -</p><p> -Boehm, H., <a href="http://www.hboehm.info/gc/papers/pldi93.ps.Z">"Space Efficient Conservative Garbage Collection"</a>, -Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design -and Implementation, <i>SIGPLAN Notices 28</i>, 6 (June 1993), pp. 197-206. -</p><p> -Boehm, H., "Reducing Garbage Collector Cache Misses", -<i> Proceedings of the 2000 International Symposium on Memory Management </i>. -<a href="http://portal.acm.org/citation.cfm?doid=362422.362438"> -Official version.</a> -<a href="http://www.hpl.hp.com/techreports/2000/HPL-2000-99.html"> -Technical report version.</a> Describes the prefetch strategy -incorporated into the collector for some platforms. Explains why -the sweep phase of a "mark-sweep" collector should not really be -a distinct phase. -</p><p> -M. Serrano, H. Boehm, -"Understanding Memory Allocation of Scheme Programs", -<i>Proceedings of the Fifth ACM SIGPLAN International Conference on -Functional Programming</i>, 2000, Montreal, Canada, pp. 245-256. -<a href="http://dl.acm.org/citation.cfm?id=351264"> -Official version.</a> -<a href="http://www.hpl.hp.com/techreports/2000/HPL-2000-62.html"> -Earlier Technical Report version.</a> Includes some discussion of the -collector debugging facilities for identifying causes of memory retention. -</p><p> -Boehm, H., -"Fast Multiprocessor Memory Allocation and Garbage Collection", -<a href="http://www.hpl.hp.com/techreports/2000/HPL-2000-165.html"> -HP Labs Technical Report HPL 2000-165</a>. Discusses the parallel -collection algorithms, and presents some performance results. -</p><p> -Boehm, H., "Bounding Space Usage of Conservative Garbage Collectors", -<i>Proceedings of the 2002 ACM SIGPLAN-SIGACT Symposium on Principles of -Programming Languages</i>, Jan. 2002, pp. 93-100. -<a href="http://portal.acm.org/citation.cfm?doid=503272.503282"> -Official version.</a> -<a href="http://www.hpl.hp.com/techreports/2001/HPL-2001-251.html"> -Technical report version.</a> -Includes a discussion of a collector facility to much more reliably test for -the potential of unbounded heap growth. -</p><p> -<b>The following papers discuss language and compiler restrictions necessary to guaranteed -safety of conservative garbage collection.</b> -</p><p> -We thank John Levine and JCLT for allowing -us to make the second paper available electronically, and providing PostScript for the final -version. -</p><p> -Boehm, H., <a href="http://www.hboehm.info/gc/papers/pldi96.ps.gz">"Simple Garbage-Collector-Safety"</a>, -Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design -and Implementation. -</p><p> -Boehm, H., and D. Chase, <a href="http://www.hboehm.info/gc/papers/boecha.ps.gz">"A Proposal for Garbage-Collector-Safe C Compilation"</a>, -<i>Journal of C Language Translation 4</i>, 2 (December 1992), pp. 126-141. -</p><p> -<b>Other related information: </b> -</p><p> -The Detlefs, Dosser and Zorn's <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.3073&rep=rep1&type=ps">Memory Allocation Costs in Large C and C++ Programs</a>. - This is a performance comparison of the Boehm-Demers-Weiser collector to malloc/free, -using programs written for malloc/free. -</p><p> -Joel Bartlett's <a href="ftp://gatekeeper.dec.com/pub/compaq/WRL/research-reports/WRL-TN-12.ps">mostly copying conservative garbage collector for C++</a>. -</p><p> -John Ellis and David Detlef's -<a href="http://dl.acm.org/citation.cfm?id=1267983">Safe Efficient Garbage Collection for C++</a> -proposal. -</p><p> -Henry Baker's <a href="http://home.pipeline.com/%7Ehbaker1/">paper collection</a>. -</p><p> -Slides for Hans Boehm's <a href="http://www.hboehm.info/gc/myths.ps">Allocation and GC Myths</a> talk. -</p><h1><a name="users">Current users:</a></h1> -Known current users of some variant of this collector include: -<p> -The runtime system for -<a href="https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcj/">GCJ</a>, -the static GNU java compiler. -</p><p> -<a href="http://w3m.sourceforge.net/">W3m</a>, a text-based web browser. -</p><p> -Some versions of the Xerox DocuPrint printer software. -</p><p> -The <a href="http://www.mozilla.org/">Mozilla</a> project, as leak -detector. -</p><p> -The <a href="http://www.mono-project.com">Mono</a> project, -an open source implementation of the .NET development framework. -</p><p> -The <a href="http://www.gnu.org/projects/dotgnu/">DotGNU Portable.NET -project</a>, another open source .NET implementation. -</p><p> -The <a href="http://irssi.org/">Irssi IRC client</a>. -</p><p> -<a href="http://titanium.cs.berkeley.edu/">The Berkeley Titanium project</a>. -</p><p> -<a href="http://www.nag.co.uk/nagware/NP/NP_desc.asp">The NAGWare f90 Fortran 90 compiler</a>. -</p><p> -Elwood Corporation's Eclipse Common Lisp system, C library, and translator. -</p><p> -The <a href="http://www-sop.inria.fr/mimosa/fp/Bigloo/">Bigloo Scheme</a> -and <a href="https://github.com/samoht/camloo">Camloo ML compilers</a> -written by Manuel Serrano and others. -</p><p> -Brent Benson's <a href="http://www.cs.indiana.edu/scheme-repository/libscheme-vhll/final.html">libscheme</a>. -</p><p> -The MzScheme scheme implementation. -</p><p> -The <a href="http://www.cs.washington.edu/research/projects/cecil/www/cecil-home.html">University of Washington Cecil Implementation</a>. -</p><p> -<a href="http://www1.icsi.berkeley.edu/~sather/">The Berkeley Sather implementation</a>. -</p><p> -<a href="http://www.cs.berkeley.edu/~harmonia/harmonia/index.html">The Berkeley Harmonia Project</a>. -</p><p> -The <a href="http://www.cs.arizona.edu/projects/sumatra/toba/">Toba</a> Java Virtual -Machine to C translator. -</p><p> -The <a href="http://www.gwydiondylan.org/">Gwydion Dylan compiler</a>. -</p><p> -The <a href="http://gcc.gnu.org/onlinedocs/gcc/Objective-C.html"> -GNU Objective C runtime</a>. -</p><p> -<a href="http://www.math.uiuc.edu/Macaulay2">Macaulay 2</a>, a system to support -research in algebraic geometry and commutative algebra. -</p><p> -The <a href="http://www.vestasys.org/">Vesta</a> configuration management -system. -</p><p> -<a href="http://www.visual-prolog.com/">Visual Prolog 6</a>. -</p><p> -<a href="http://asymptote.sf.net/">Asymptote LaTeX-compatible -vector graphics language.</a> -</p><h1><a name="collector">More information on the BDWGC primary site</a></h1> -<a href="simple_example.md">A simple illustration of how to build and -use the collector</a>. -<p> -<a href="gcinterface.md">Description of alternate interfaces to the -garbage collector.</a> -</p><p> -<a href="http://www.hboehm.info/gc/04tutorial.pdf">Slides from an ISMM 2004 tutorial about the GC</a>. -</p><p> -<a href="http://www.hboehm.info/gc/faq.html">A FAQ (frequently asked questions) list</a>. -</p><p> -<a href="leak.md">How to use the garbage collector as a leak detector.</a> -</p><p> -<a href="debugging.md">Some hints on debugging garbage collected -applications.</a> -</p><p> -<a href="gcdescr.md">An overview of the implementation of the -garbage collector.</a> -</p><p> -<a href="tree.html">The data structure used for fast pointer lookups.</a> -</p><p> -<a href="scale.md">Scalability of the collector to multiprocessors.</a> -</p><p> -<a href="http://www.hboehm.info/gc/gc_source/">Directory</a> containing -the distribution files of all garbage collector releases. -It duplicates -<a href="https://github.com/ivmai/bdwgc/wiki/Download">Download</a> page on -GitHub. -</p><h1><a name="background">More background information</a></h1> -<a href="http://www.hboehm.info/gc/bounds.html">An attempt to establish a bound on space usage of -conservative garbage collectors</a>. -<p> -<a href="http://www.hboehm.info/gc/complexity.html">Mark-sweep versus copying garbage collectors -and their complexity</a>. -</p><p> -<a href="http://www.hboehm.info/gc/conservative.html">Pros and cons of conservative garbage collectors, -in comparison to other collectors</a>. -</p><p> -<a href="http://www.hboehm.info/gc/issues.html">Issues related to garbage collection vs. -manual memory management in C/C++</a>. -</p><p> -<a href="http://www.hboehm.info/gc/example.html">An example of a case in which garbage collection -results in a much faster implementation as a result of reduced synchronization</a>. -</p><p> -<a href="http://www.hboehm.info/gc/nonmoving/">Slide set discussing performance of nonmoving -garbage collectors</a>. -</p><p> -<a href="http://www.hboehm.info/popl03/web/"> -Slide set discussing <i>Destructors, Finalizers, and Synchronization</i> -(POPL 2003)</a>. -</p><p> -<a href="http://portal.acm.org/citation.cfm?doid=604131.604153"> -Paper corresponding to above slide set</a> -(<a href="http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html"> -Technical Report version</a>). -</p><p> -<a href="http://www.hboehm.info/gc/gc_bench/">A Java/Scheme/C/C++ garbage collection benchmark</a>. -</p><p> -<a href="http://www.hboehm.info/gc/myths.ps">Slides for talk on memory allocation myths</a>. -</p><p> -<a href="http://www.hboehm.info/gc/gctalk.ps">Slides for OOPSLA 98 garbage collection talk</a>. -</p><p> -<a href="http://www.hboehm.info/gc/papers/">Related papers</a>. -</p><h1><a name="contacts">Contacts and new release announcements</a></h1> -GitHub and Stack Overflow are the major two places for communication. -<p> -Technical questions (how to, how does it work, etc.) should be posted to -<a href="https://stackoverflow.com/questions/tagged/boehm-gc">Stack Overflow</a> -with "boehm-gc" tag. -</p><p> -To contribute, please rebase your code to the latest -<a href="https://github.com/ivmai/bdwgc/tree/master/">master</a> and submit -a <a href="https://github.com/ivmai/bdwgc/pulls">pull request</a> to GitHub. -</p><p> -To report a bug, or propose (request) a new feature, create -a <a href="https://github.com/ivmai/bdwgc/issues">GitHub issue</a>. -Please make sure it has not been reported yet by someone else. -</p><p> -To receive notifications on every release, please subscribe to -<a href="https://github.com/ivmai/bdwgc/releases.atom">Releases RSS feed</a>. -Notifications on all issues and pull requests are available by -<a href="https://github.com/ivmai/bdwgc/watchers">watching</a> the project. -</p><p> -Mailing lists (bdwgc-announce@lists.opendylan.org, bdwgc@lists.opendylan.org, -and the former gc-announce@linux.hpl.hp.com and gc@linux.hpl.hp.com) are not -used at this moment. Their content is available in -<a href="https://github.com/ivmai/bdwgc/files/1037650/bdwgc-announce-mailing-list-archive-2014_02.tar.gz">bdwgc-announce</a> -and -<a href="https://github.com/ivmai/bdwgc/files/1038163/bdwgc-mailing-list-archive-2017_04.tar.gz">bdwgc</a> -archive files, respectively. -The gc list archive may also be read at -<a href="http://bdwgc.opendylan.narkive.com">Narkive</a>. -</p><p> -Some prior discussion of the collector has taken place on the gcc -java mailing list, whose archives appear -<a href="http://gcc.gnu.org/ml/java/">here</a>, and also on -<a href="http://lists.tunes.org/mailman/listinfo/gclist">gclist@iecc.com</a>. -</p></body></html> diff --git a/doc/overview.md b/doc/overview.md new file mode 100644 index 00000000..fb3aa2a4 --- /dev/null +++ b/doc/overview.md @@ -0,0 +1,388 @@ +[Interface Overview](gcinterface.md) | [Tutorial Slides](http://www.hboehm.info/gc/04tutorial.pdf) | [FAQ](http://www.hboehm.info/gc/faq.html) | [Example](simple_example.md) | [Download](https://github.com/ivmai/bdwgc/wiki/Download) | [License](http://www.hboehm.info/gc/license.txt) +---|---|---|---|---|--- + +# A garbage collector for C and C++ + + * Platforms + * Scalable multiprocessor versions + * Some collector details + * Further reading + * Current users + * Local Links for this collector + * Local Background Links + * Contacts and Mailing List + +[ This is an updated version of the page formerly at +`www.hpl.hp.com/personal/Hans_Boehm/gc/`, before that at +`http://reality.sgi.com/boehm/gc.html` and before that at +`ftp://ftp.parc.xerox.com/pub/gc/gc.html`. ] + +The +[Boehm](http://www.hboehm.info)-[Demers](http://www.cs.cornell.edu/annual_report/00-01/bios.htm#demers)-[Weiser](http://www.ubiq.com/hypertext/weiser/weiser.html) +conservative Garbage Collector (**BDWGC**) can be used as a garbage collecting +replacement for C `malloc` or C++ `new`. It allows you to allocate memory +basically as you normally would, without explicitly deallocating memory that +is no longer useful. The collector automatically recycles memory when +it determines that it can no longer be otherwise accessed. A simple example +of such a use is given [here](simple_example.md). + +The collector is also used by a number of programming language implementations +that either use C as intermediate code, want to facilitate easier +interoperation with C libraries, or just prefer the simple collector +interface. For a more detailed description of the interface, see +[here](gcinterface.md). + +Alternatively, the garbage collector may be used as a [leak detector](leak.md) +for C or C++ programs, though that is not its primary goal. + +Typically several versions are offered for +[downloading](https://github.com/ivmai/bdwgc/wiki/Download): preview, stable, +legacy. Usually you should use the one marked as the _latest stable_ release. +Preview versions may contain additional features, platform support, but are +likely to be less well tested. The list of changes for each version +is specified on the [releases](https://github.com/ivmai/bdwgc/releases) page. + +The arguments for and against conservative garbage collection in C and C++ are +briefly discussed [here](http://www.hboehm.info/gc/issues.html). The +beginnings of a frequently-asked-questions list are +[here](http://www.hboehm.info/gc/faq.html). + +The garbage collector code is copyrighted by +[Hans-J. Boehm](http://www.hboehm.info), Alan J. Demers, +[Xerox Corporation](http://www.xerox.com/), +[Silicon Graphics](http://www.sgi.com/), and +[Hewlett-Packard Company](http://www.hp.com/). It may be used and copied +without payment of a fee under minimal restrictions. See the README.md file +in the distribution or the [license](http://www.hboehm.info/gc/license.txt) +for more details. **IT IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY +EXPRESSED OR IMPLIED. ANY USE IS AT YOUR OWN RISK**. + +Empirically, this collector works with most unmodified C programs, simply +by replacing `malloc` with `GC_malloc` calls, replacing `realloc` with +`GC_realloc` calls, and removing free calls. Exceptions are discussed +[here](http://www.hboehm.info/gc/issues.html). + +## Platforms + +The collector is not completely portable, but the distribution includes ports +to most standard PC and UNIX/Linux platforms. The collector should work +on Linux, *BSD, recent Windows versions, MacOS X, HP/UX, Solaris, Tru64, Irix +and a few other operating systems. Some ports are more polished than others. + +Irix pthreads, Linux threads, Win32 threads, Solaris threads (pthreads only), +HP/UX 11 pthreads, Tru64 pthreads, and MacOS X threads are supported in recent +versions. + +### Separately distributed ports + +For MacOS 9/Classic use, Patrick Beard's latest port is available from +`http://homepage.mac.com/pcbeard/gc/`. (Unfortunately, that's now quite dated. +I'm not in a position to test under MacOS. Although I try to incorporate +changes, it is impossible for me to update the project file.) + +Precompiled versions of the collector for NetBSD are available +[here](ftp://ftp.netbsd.org/pub/pkgsrc/current/pkgsrc/devel/boehm-gc/README.html). + + +[Debian Linux](http://www.debian.org/) includes prepackaged versions of the +collector. + +## Scalable multiprocessor versions + +Kenjiro Taura, Toshio Endo, and Akinori Yonezawa have made available +a [parallel collector](http://ieeexplore.ieee.org/abstract/document/1592629/) +based on this one. Their collector takes advantage of multiple processors +during a collection. Starting with GC v6.0alpha1 we also do this, though with +more modest processor scalability goals. Our approach is discussed briefly +in [this document](scale.md). + +## Some Collector Details + +The collector uses a [mark-sweep](http://www.hboehm.info/gc/complexity.html) +algorithm. It provides incremental and generational collection under operating +systems which provide the right kind of virtual memory support. (Currently +this includes SunOS[45], IRIX, OSF/1, Linux, and Windows, with varying +restrictions.) It allows [_finalization_](finalization.md) code to be invoked +when an object is collected. It can take advantage of type information +to locate pointers if such information is provided, but it is usually used +without such information. See the README and `gc.h` files in the distribution +for more details. + +For an overview of the implementation, see [here](gcdescr.md). + +The garbage collector distribution includes a C string (`cord.h`) package that +provides for fast concatenation and substring operations on long strings. +A simple curses- and win32-based editor that represents the entire file as +a cord is included as a sample application. + +Performance of the non-incremental collector is typically competitive with +`malloc`/`free` implementations. Both space and time overhead are likely to be +only slightly higher for programs written for `malloc`/`free` (see Detlefs, +Dosser and Zorn's +[Memory Allocation Costs in Large C and C++ Programs](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.3073&rep=rep1&type=ps)). +For programs allocating primarily very small objects, the collector may be +faster; for programs allocating primarily large objects it will be slower. +If the collector is used in a multi-threaded environment and configured for +thread-local allocation, it may in some cases significantly outperform +`malloc`/`free` allocation in time. + +We also expect that in many cases any additional overhead will be more than +compensated for by decreased copying etc. if programs are written and tuned +for garbage collection. + +# Further Reading: + +**The beginnings of a frequently asked questions list for this collector are +[here](http://www.hboehm.info/gc/faq.html)**. + +**The following provide information on garbage collection in general**: Paul +Wilson's [garbage collection ftp archive](ftp://ftp.cs.utexas.edu/pub/garbage) +and [GC survey](ftp://ftp.cs.utexas.edu/pub/garbage/gcsurvey.ps). + +The Ravenbrook +[Memory Management Reference](http://www.memorymanagement.org/). + +David Chase's [GC FAQ](http://www.iecc.com/gclist/GC-faq.html). + +Richard Jones' +[Garbage Collection Page](https://www.cs.kent.ac.uk/people/staff/rej/gc.html) +and his [book](http://www.cs.kent.ac.uk/people/staff/rej/gcbook/gcbook.html). + +**The following papers describe the collector algorithms we use and the +underlying design decisions at a higher level.** + +(Some of the lower level details can be found [here](gcdescr.md).) + +The first one is not available electronically due to copyright considerations. +Most of the others are subject to ACM copyright. + +Boehm, H., Dynamic Memory Allocation and Garbage Collection, +_Computers in Physics 9_, 3, May/June 1995, pp. 297-303. This is directed +at an otherwise sophisticated audience unfamiliar with memory allocation +issues. The algorithmic details differ from those in the implementation. There +is a related letter to the editor and a minor correction in the next issue. + +Boehm, H., and [M. Weiser](http://www.ubiq.com/hypertext/weiser/weiser.html), +[Garbage Collection in an Uncooperative Environment](http://www.hboehm.info/spe_gc_paper/), +_Software Practice & Experience_, September 1988, pp. 807-820. + +Boehm, H., A. Demers, and S. Shenker, +[Mostly Parallel Garbage Collection](http://www.hboehm.info/gc/papers/pldi91.ps.Z), +Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design +and Implementation, _SIGPLAN Notices 26_, 6 (June 1991), pp. 157-164. + +Boehm, H., +[Space Efficient Conservative Garbage Collection](http://www.hboehm.info/gc/papers/pldi93.ps.Z), +Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design +and Implementation, _SIGPLAN Notices 28_, 6 (June 1993), pp. 197-206. + +Boehm, H., Reducing Garbage Collector Cache Misses, +_Proceedings of the 2000 International Symposium on Memory Management_. +[Official version](http://portal.acm.org/citation.cfm?doid=362422.362438). +[Technical report](http://www.hpl.hp.com/techreports/2000/HPL-2000-99.html) +version. Describes the prefetch strategy incorporated into the collector for +some platforms. Explains why the sweep phase of a _mark-sweep_ collector +should not really be a distinct phase. + +M. Serrano, H. Boehm, Understanding Memory Allocation of Scheme Programs, +_Proceedings of the Fifth ACM SIGPLAN International Conference on Functional +Programming_, 2000, Montreal, Canada, pp. 245-256. +[Official version](http://dl.acm.org/citation.cfm?id=351264). Earlier +[Technical Report](http://www.hpl.hp.com/techreports/2000/HPL-2000-62.html) +version. Includes some discussion of the collector debugging facilities for +identifying causes of memory retention. + +Boehm, H., Fast Multiprocessor Memory Allocation and Garbage Collection, +[HP Labs Technical Report HPL 2000-165](http://www.hpl.hp.com/techreports/2000/HPL-2000-165.html). +Discusses the parallel collection algorithms, and presents some performance +results. + +Boehm, H., Bounding Space Usage of Conservative Garbage Collectors, +_Proceedings of the 2002 ACM SIGPLAN-SIGACT Symposium on Principles +of Programming Languages_, Jan. 2002, pp. 93-100. +[Official version](http://portal.acm.org/citation.cfm?doid=503272.503282). +[Technical report](http://www.hpl.hp.com/techreports/2001/HPL-2001-251.html) +version. Includes a discussion of a collector facility to much more reliably +test for the potential of unbounded heap growth. + +**The following papers discuss language and compiler restrictions necessary +to guaranteed safety of conservative garbage collection.** + +We thank John Levine and JCLT for allowing us to make the second paper +available electronically, and providing PostScript for the final version. + +Boehm, H., +[Simple Garbage-Collector-Safety](http://www.hboehm.info/gc/papers/pldi96.ps.gz), +Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design +and Implementation. + +Boehm, H., and D. Chase, +[A Proposal for Garbage-Collector-Safe C Compilation](http://www.hboehm.info/gc/papers/boecha.ps.gz), +_Journal of C Language Translation 4_, 2 (December 1992), pp. 126-141. + +**Other related information:** + +The Detlefs, Dosser and Zorn's +[Memory Allocation Costs in Large C and C++ Programs](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.3073&rep=rep1&type=ps). +This is a performance comparison of the Boehm-Demers-Weiser collector +to `malloc`/`free`, using programs written for `malloc`/`free`. + +Joel Bartlett's +[mostly copying conservative garbage collector for C++](ftp://gatekeeper.dec.com/pub/compaq/WRL/research-reports/WRL-TN-12.ps). + +John Ellis and David Detlef's +[Safe Efficient Garbage Collection for C++](http://dl.acm.org/citation.cfm?id=1267983) +proposal. + +Henry Baker's [paper collection](http://home.pipeline.com/%7Ehbaker1/). + +Slides for Hans Boehm's +[Allocation and GC Myths](http://www.hboehm.info/gc/myths.ps) talk. + +# Current users: + +Known current users of some variant of this collector include: + +The runtime system for [GCJ](https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcj/), +the static GNU java compiler. + +[W3m](http://w3m.sourceforge.net/), a text-based web browser. + +Some versions of the Xerox DocuPrint printer software. + +The [Mozilla](http://www.mozilla.org/) project, as leak detector. + +The [Mono](http://www.mono-project.com) project, an open source implementation +of the .NET development framework. + +The [DotGNU Portable.NET project](http://www.gnu.org/projects/dotgnu/), +another open source .NET implementation. + +The [Irssi IRC client](http://irssi.org/). + +[The Berkeley Titanium project](http://titanium.cs.berkeley.edu/). + +[The NAGWare f90 Fortran 90 compiler](http://www.nag.co.uk/nagware/NP/NP_desc.asp). + +Elwood Corporation's Eclipse Common Lisp system, C library, and translator. + +The [Bigloo Scheme](http://www-sop.inria.fr/mimosa/fp/Bigloo/) and +[Camloo ML compilers](https://github.com/samoht/camloo) written by Manuel +Serrano and others. + +Brent Benson's +[libscheme](http://www.cs.indiana.edu/scheme-repository/libscheme-vhll/final.html). + +The MzScheme scheme implementation. + +The +[University of Washington Cecil Implementation](http://www.cs.washington.edu/research/projects/cecil/www/cecil-home.html). + +[The Berkeley Sather implementation](http://www1.icsi.berkeley.edu/~sather/). + +[The Berkeley Harmonia Project](http://www.cs.berkeley.edu/~harmonia/harmonia/index.html). + +The +[Toba](http://www.cs.arizona.edu/projects/sumatra/toba/) Java Virtual Machine +to C translator. + +The [Gwydion Dylan compiler](http://www.gwydiondylan.org/). + +The +[GNU Objective C runtime](http://gcc.gnu.org/onlinedocs/gcc/Objective-C.html). + +[Macaulay 2](http://www.math.uiuc.edu/Macaulay2), a system to support research +in algebraic geometry and commutative algebra. + +The [Vesta](http://www.vestasys.org/) configuration management system. + +[Visual Prolog 6](http://www.visual-prolog.com/). + +[Asymptote LaTeX-compatible vector graphics language](http://asymptote.sf.net/). + +# More information on the BDWGC primary site + +[A simple illustration of how to build and use the collector](simple_example.md). + +[Description of alternate interfaces to the garbage collector](gcinterface.md). + +[Slides from an ISMM 2004 tutorial about the GC](http://www.hboehm.info/gc/04tutorial.pdf). + +[A FAQ (frequently asked questions) list](http://www.hboehm.info/gc/faq.html). + +[How to use the garbage collector as a leak detector](leak.md). + +[Some hints on debugging garbage collected applications](debugging.md). + +[An overview of the implementation of the garbage collector](gcdescr.md). + +[The data structure used for fast pointer lookups](tree.md). + +[Scalability of the collector to multiprocessors](scale.md). + +[Directory](http://www.hboehm.info/gc/gc_source/) containing the distribution +files of all garbage collector releases. It duplicates +[Download](https://github.com/ivmai/bdwgc/wiki/Download) page on GitHub. + +# More background information + +[An attempt to establish a bound on space usage of conservative garbage collectors](http://www.hboehm.info/gc/bounds.html). + +[Mark-sweep versus copying garbage collectors and their complexity](http://www.hboehm.info/gc/complexity.html). + +[Pros and cons of conservative garbage collectors, in comparison to other collectors](http://www.hboehm.info/gc/conservative.html). + +[Issues related to garbage collection vs. manual memory management in C/C++](http://www.hboehm.info/gc/issues.html). + +[An example of a case in which garbage collection results in a much faster implementation as a result of reduced synchronization](http://www.hboehm.info/gc/example.html). + +[Slide set discussing performance of nonmoving garbage collectors](http://www.hboehm.info/gc/nonmoving/). + +[Slide set discussing _Destructors, Finalizers, and Synchronization_, POPL 2003](http://www.hboehm.info/popl03/web/). + +[Paper corresponding to above slide set](http://portal.acm.org/citation.cfm?doid=604131.604153) +([Technical Report](http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html) +version). + +[A Java/Scheme/C/C++ garbage collection benchmark](http://www.hboehm.info/gc/gc_bench/). + +[Slides for talk on memory allocation myths](http://www.hboehm.info/gc/myths.ps). + +[Slides for OOPSLA 98 garbage collection talk](http://www.hboehm.info/gc/gctalk.ps). + +[Related papers](http://www.hboehm.info/gc/papers/). + +# Contacts and new release announcements + +GitHub and Stack Overflow are the major two places for communication. + +Technical questions (how to, how does it work, etc.) should be posted +to [Stack Overflow](https://stackoverflow.com/questions/tagged/boehm-gc) with +_boehm-gc_ tag. + +To contribute, please rebase your code to the latest +[master](https://github.com/ivmai/bdwgc/tree/master/) and submit +a [pull request](https://github.com/ivmai/bdwgc/pulls) to GitHub. + +To report a bug, or propose (request) a new feature, create +a [GitHub issue](https://github.com/ivmai/bdwgc/issues). Please make sure +it has not been reported yet by someone else. + +To receive notifications on every release, please subscribe to +[Releases RSS feed](https://github.com/ivmai/bdwgc/releases.atom). +Notifications on all issues and pull requests are available +by [watching](https://github.com/ivmai/bdwgc/watchers) the project. + +Mailing lists (bdwgc-announce@lists.opendylan.org, bdwgc@lists.opendylan.org, +and the former gc-announce@linux.hpl.hp.com and gc@linux.hpl.hp.com) are not +used at this moment. Their content is available +in +[bdwgc-announce](https://github.com/ivmai/bdwgc/files/1037650/bdwgc-announce-mailing-list-archive-2014_02.tar.gz) +and +[bdwgc](https://github.com/ivmai/bdwgc/files/1038163/bdwgc-mailing-list-archive-2017_04.tar.gz) +archive files, respectively. The gc list archive may also be read +at [Narkive](http://bdwgc.opendylan.narkive.com). + +Some prior discussion of the collector has taken place on the gcc java mailing +list, whose archives appear [here](http://gcc.gnu.org/ml/java/), and also +on [gclist@iecc.com](http://lists.tunes.org/mailman/listinfo/gclist). diff --git a/doc/tree.html b/doc/tree.html deleted file mode 100644 index 2bfd5834..00000000 --- a/doc/tree.html +++ /dev/null @@ -1,195 +0,0 @@ -<HTML> -<HEAD> - <TITLE> Two-Level Tree Structure for Fast Pointer Lookup</TITLE> - <AUTHOR> Hans-J. Boehm, Silicon Graphics (now at HP)</author> -</HEAD> -<BODY> -<H1>Two-Level Tree Structure for Fast Pointer Lookup</h1> -<P> -The BDWGC conservative garbage collector uses a 2-level tree -data structure to aid in fast pointer identification. -This data structure is described in a bit more detail here, since -<OL> -<LI> Variations of the data structure are more generally useful. -<LI> It appears to be hard to understand by reading the code. -<LI> Some other collectors appear to use inferior data structures to -solve the same problem. -<LI> It is central to fast collector operation. -</ol> -A candidate pointer is divided into three sections, the <I>high</i>, -<I>middle</i>, and <I>low</i> bits. The exact division between these -three groups of bits is dependent on the detailed collector configuration. -<P> -The high and middle bits are used to look up an entry in the table described -here. The resulting table entry consists of either a block descriptor -(<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>) -identifying the layout of objects in the block, or an indication that this -address range corresponds to the middle of a large block, together with a -hint for locating the actual block descriptor. Such a hint consist -of a displacement that can be subtracted from the middle bits of the candidate -pointer without leaving the object. -<P> -In either case, the block descriptor (<TT>struct hblkhdr</tt>) -refers to a table of object starting addresses (the <TT>hb_map</tt> field). -The starting address table is indexed by the low bits if the candidate pointer. -The resulting entry contains a displacement to the beginning of the object, -or an indication that this cannot be a valid object pointer. -(If all interior pointer are recognized, pointers into large objects -are handled specially, as appropriate.) - -<H2>The Tree</h2> -<P> -The rest of this discussion focuses on the two level data structure -used to map the high and middle bits to the block descriptor. -<P> -The high bits are used as an index into the <TT>GC_top_index</tt> (really -<TT>GC_arrays._top_index</tt>) array. Each entry points to a -<TT>bottom_index</tt> data structure. This structure in turn consists -mostly of an array <TT>index</tt> indexed by the middle bits of -the candidate pointer. The <TT>index</tt> array contains the actual -<TT>hdr</tt> pointers. -<P> -Thus a pointer lookup consists primarily of a handful of memory references, -and can be quite fast: -<OL> -<LI> The appropriate <TT>bottom_index</tt> pointer is looked up in -<TT>GC_top_index</tt>, based on the high bits of the candidate pointer. -<LI> The appropriate <TT>hdr</tt> pointer is looked up in the -<TT>bottom_index</tt> structure, based on the middle bits. -<LI> The block layout map pointer is retrieved from the <TT>hdr</tt> -structure. (This memory reference is necessary since we try to share -block layout maps.) -<LI> The displacement to the beginning of the object is retrieved from the -above map. -</ol> -<P> -In order to conserve space, not all <TT>GC_top_index</tt> entries in fact -point to distinct <TT>bottom_index</tt> structures. If no address with -the corresponding high bits is part of the heap, then the entry points -to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting -only of NULL <TT>hdr</tt> pointers. -<P> -<TT>Bottom_index</tt> structures contain slightly more information than -just <TT>hdr</tt> pointers. The <TT>asc_link</tt> field is used to link -all <TT>bottom_index</tt> structures in ascending order for fast traversal. -This list is pointed to be <TT>GC_all_bottom_indices</tt>. -It is maintained with the aid of <TT>key</tt> field that contains the -high bits corresponding to the <TT>bottom_index</tt>. - -<H2>64 bit addresses</h2> -<P> -In the case of 64 bit addresses, this picture is complicated slightly -by the fact that one of the index structures would have to be huge to -cover the entire address space with a two level tree. We deal with this -by turning <TT>GC_top_index</tt> into a chained hash table, instead of -a simple array. This adds a <TT>hash_link</tt> field to the -<TT>bottom_index</tt> structure. -<P> -The "hash function" consists of dropping the high bits. This is cheap to -compute, and guarantees that there will be no collisions if the heap -is contiguous and not excessively large. - -<H2>A picture</h2> -<P> -The following is an ASCII diagram of the data structure. -This was contributed by Dave Barrett several years ago. -<PRE> - - Data Structure used by GC_base in gc3.7: - 21-Apr-94 - - - - - 63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13] - +------------------+----------------+------------------+------------------+ - p:| | TL_HASH(hi) | | HBLKDISPL(p) | - +------------------+----------------+------------------+------------------+ - \-----------------------HBLKPTR(p)-------------------/ - \------------hi-------------------/ - \______ ________/ \________ _______/ \________ _______/ - V V V - | | | - GC_top_index[] | | | - --- +--------------+ | | | - ^ | | | | | - | | | | | | - TOP +--------------+<--+ | | - _SZ +-<| [] | * | | -(items)| +--------------+ if 0 < bi< HBLKSIZE | | - | | | | then large object | | - | | | | starts at the bi'th | | - v | | | HBLK before p. | i | - --- | +--------------+ | (word- | - v | aligned) | - bi= |GET_BI(p){->hash_link}->key==hi | | - v | | - | (bottom_index) \ scratch_alloc'd | | - | ( struct bi ) / by get_index() | | - --- +->+--------------+ | | - ^ | | | | - ^ | | | | - BOTTOM | | ha=GET_HDR_ADDR(p) | | -_SZ(items)+--------------+<----------------------+ +-------+ - | +--<| index[] | | - | | +--------------+ GC_obj_map: v - | | | | from / +-+-+-----+-+-+-+-+ --- - v | | | GC_add < 0| | | | | | | | ^ - --- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ | - | | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ - | +--------------+ +-->| | | j | | | | | +1 - | | key | | +-+-+-----+-+-+-+-+ | - | +--------------+ | +-+-+-----+-+-+-+-+ | - | | hash_link | | | | | | | | | | v - | +--------------+ | +-+-+-----+-+-+-+-+ --- - | | |<--MAX_OFFSET--->| - | | (bytes) -HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->| - | \ from | =HBLKSIZE/WORDSZ - | (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha) - +-->+----------------------+ | (8/16 bits each) -GET_HDR(p)| word hb_sz (words) | | - +----------------------+ | - | struct hblk *hb_next | | - +----------------------+ | - |mark_proc hb_mark_proc| | - +----------------------+ | - | char * hb_map |>-------------+ - +----------------------+ - | ushort hb_obj_kind | - +----------------------+ - | hb_last_reclaimed | - --- +----------------------+ - ^ | | - MARK_BITS| hb_marks[] | *if hdr is free, hb_sz -_SZ(words)| | is the size of a heap chunk (struct hblk) - v | | of at least MININCR*HBLKSIZE bytes (below), - --- +----------------------+ otherwise, size of each object in chunk. - -Dynamic data structures above are interleaved throughout the heap in blocks of -size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be -freed; free lists are used (e.g. alloc_hdr). HBLK's below are collected. - - (struct hblk) HDR_BYTES - --- +----------------------+ < HBLKSIZE --- (bytes) - ^ +-----hb_body----------+ (and WORDSZ) ^ --- --- - | | | aligned | ^ ^ - | | | | hb_sz | - | | | | (words) | - | | Object 0 | | | | - | | | i |(word- v | - | + - - - - - - - - - - -+ --- (bytes)|aligned) --- | - | | | ^ | ^ | - | | | j (words) | | | - n * | Object 1 | v v hb_sz BODY_SZ - HBLKSIZE | |--------------- | (words) - (bytes) | | v MAX_OFFSET - | + - - - - - - - - - - -+ --- (bytes) - | | | !ALL_INTERIOR_POINTERS ^ | - | | | sets j only for hb_sz | - | | Object N | valid object offsets. | | - v | | All objects WORDSZ v v - --- +----------------------+ aligned. --- --- - -</PRE> -</body> diff --git a/doc/tree.md b/doc/tree.md new file mode 100644 index 00000000..cdf1aa7b --- /dev/null +++ b/doc/tree.md @@ -0,0 +1,175 @@ +# Two-Level Tree Structure for Fast Pointer Lookup + +The Boehm-Demers-Weiser conservative Garbage Collector uses a 2-level tree +data structure to aid in fast pointer identification. This data structure +is described in a bit more detail here, since + + 1. Variations of the data structure are more generally useful. + 2. It appears to be hard to understand by reading the code. + 3. Some other collectors appear to use inferior data structures to solve the + same problem. + 4. It is central to fast collector operation. A candidate pointer + is divided into three sections, the _high_, _middle_, and _low_ bits. The + exact division between these three groups of bits is dependent on the + detailed collector configuration. + +The high and middle bits are used to look up an entry in the table described +here. The resulting table entry consists of either a block descriptor +(`struct hblkhdr *` or `hdr *`) identifying the layout of objects in the +block, or an indication that this address range corresponds to the middle of +a large block, together with a hint for locating the actual block descriptor. +Such a hint consist of a displacement that can be subtracted from the middle +bits of the candidate pointer without leaving the object. + +In either case, the block descriptor (`struct hblkhdr`) refers to a table +of object starting addresses (the `hb_map` field). The starting address table +is indexed by the low bits if the candidate pointer. The resulting entry +contains a displacement to the beginning of the object, or an indication that +this cannot be a valid object pointer. (If all interior pointer are +recognized, pointers into large objects are handled specially, +as appropriate.) + +## The Tree + +The rest of this discussion focuses on the two level data structure used +to map the high and middle bits to the block descriptor. + +The high bits are used as an index into the `GC_top_index` (really +`GC_arrays._top_index`) array. Each entry points to a `bottom_index` data +structure. This structure in turn consists mostly of an array `index` indexed +by the middle bits of the candidate pointer. The `index` array contains the +actual `hdr` pointers. + +Thus a pointer lookup consists primarily of a handful of memory references, +and can be quite fast: + + 1. The appropriate `bottom_index` pointer is looked up in `GC_top_index`, + based on the high bits of the candidate pointer. + 2. The appropriate `hdr` pointer is looked up in the `bottom_index` + structure, based on the middle bits. + 3. The block layout map pointer is retrieved from the `hdr` structure. (This + memory reference is necessary since we try to share block layout maps.) + 4. The displacement to the beginning of the object is retrieved from the + above map. + +In order to conserve space, not all `GC_top_index` entries in fact point +to distinct `bottom_index` structures. If no address with the corresponding +high bits is part of the heap, then the entry points to `GC_all_nils`, +a single `bottom_index` structure consisting only of `NULL` `hdr` pointers. + +`Bottom_index` structures contain slightly more information than just `hdr` +pointers. The `asc_link` field is used to link all `bottom_index` structures +in ascending order for fast traversal. This list is pointed to be +`GC_all_bottom_indices`. It is maintained with the aid of `key` field that +contains the high bits corresponding to the `bottom_index`. + +## 64-bit addresses + +In the case of 64-bit addresses, this picture is complicated slightly by the +fact that one of the index structures would have to be huge to cover the +entire address space with a two level tree. We deal with this by turning +`GC_top_index` into a chained hash table, instead of a simple array. This adds +a `hash_link` field to the `bottom_index` structure. + +The _hash function_ consists of dropping the high bits. This is cheap +to compute, and guarantees that there will be no collisions if the heap +is contiguous and not excessively large. + +## A picture + +The following is an _ASCII_ diagram of the data structure used by GC_base (as +of GC v3.7, Apr 21, 1994). This was contributed by Dave Barrett. + + + 63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13] + +------------------+----------------+------------------+------------------+ + p:| | TL_HASH(hi) | | HBLKDISPL(p) | + +------------------+----------------+------------------+------------------+ + \-----------------------HBLKPTR(p)-------------------/ + \------------hi-------------------/ + \______ ________/ \________ _______/ \________ _______/ + V V V + | | | + GC_top_index[] | | | + --- +--------------+ | | | + ^ | | | | | + | | | | | | + TOP +--------------+<--+ | | + _SZ +-<| [] | * | | + (items)| +--------------+ if 0 < bi< HBLKSIZE | | + | | | | then large object | | + | | | | starts at the bi'th | | + v | | | HBLK before p. | i | + --- | +--------------+ | (word- | + v | aligned) | + bi= |GET_BI(p){->hash_link}->key==hi | | + v | | + | (bottom_index) \ scratch_alloc'd | | + | ( struct bi ) / by get_index() | | + --- +->+--------------+ | | + ^ | | | | + ^ | | | | + BOTTOM | | ha=GET_HDR_ADDR(p) | | + _SZ(items)+--------------+<----------------------+ +-------+ + | +--<| index[] | | + | | +--------------+ GC_obj_map: v + | | | | from / +-+-+-----+-+-+-+-+ --- + v | | | GC_add < 0| | | | | | | | ^ + --- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ | + | | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ + | +--------------+ +-->| | | j | | | | | +1 + | | key | | +-+-+-----+-+-+-+-+ | + | +--------------+ | +-+-+-----+-+-+-+-+ | + | | hash_link | | | | | | | | | | v + | +--------------+ | +-+-+-----+-+-+-+-+ --- + | | |<--MAX_OFFSET--->| + | | (bytes) + HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->| + | \ from | =HBLKSIZE/WORDSZ + | (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha) + +-->+----------------------+ | (8/16 bits each) + GET_HDR(p)| word hb_sz (words) | | + +----------------------+ | + | struct hblk *hb_next | | + +----------------------+ | + |mark_proc hb_mark_proc| | + +----------------------+ | + | char * hb_map |>-------------+ + +----------------------+ + | ushort hb_obj_kind | + +----------------------+ + | hb_last_reclaimed | + --- +----------------------+ + ^ | | + MARK_BITS| hb_marks[] | * if hdr is free, hb_sz is the size of + _SZ(words)| | a heap chunk (struct hblk) of at least + v | | MININCR*HBLKSIZE bytes (below), + --- +----------------------+ otherwise, size of each object in chunk. + + +Dynamic data structures above are interleaved throughout the heap in blocks +of size `MININCR * HBLKSIZE` bytes as done by `gc_scratch_alloc` which cannot +be freed; free lists are used (e.g. `alloc_hdr`). `HBLK`'s below are +collected. + + + (struct hblk) HDR_BYTES + --- +----------------------+ < HBLKSIZE --- (bytes) + ^ +-----hb_body----------+ (and WORDSZ) ^ --- --- + | | | aligned | ^ ^ + | | | | hb_sz | + | | | | (words) | + | | Object 0 | | | | + | | | i |(word- v | + | + - - - - - - - - - - -+ --- (bytes)|aligned) --- | + | | | ^ | ^ | + | | | j (words) | | | + n * | Object 1 | v v hb_sz BODY_SZ + HBLKSIZE | |--------------- | (words) + (bytes) | | v MAX_OFFSET + | + - - - - - - - - - - -+ --- (bytes) + | | | !ALL_INTERIOR_POINTERS ^ | + | | | sets j only for hb_sz | + | | Object N | valid object offsets. | | + v | | All objects WORDSZ v v + --- +----------------------+ aligned. --- --- |