by others. Notably, some of the run-time systems developed at Xerox PARC
in the early 1980s conservatively scanned thread stacks to locate possible
pointers (cf. Paul Rovner, "On Adding Garbage Collection and Runtime Types
-to a Strongly-Typed Statically Checked, Concurrent Language" Xerox PARC
+to a Strongly-Typed Statically Checked, Concurrent Language" Xerox PARC
CSL 84-7). Doug McIlroy wrote a simpler fully conservative collector that
was part of version 8 UNIX (tm), but appears to not have received
widespread use.
visible to the collector.
Note that the garbage collector does not need to be informed of shared
-read-only data. However if the shared library mechanism can introduce
-discontiguous data areas that may contain pointers, then the collector does
+read-only data. However, if the shared library mechanism can introduce
+discontiguous data areas that may contain pointers then the collector does
need to be informed.
Signal processing for most signals may be deferred during collection,
## Installation and Portability
-As distributed, the collector operates silently
+The collector operates silently in the default configuration.
In the event of problems, this can usually be changed by defining the
`GC_PRINT_STATS` or `GC_PRINT_VERBOSE_STATS` environment variables. This
will result in a few lines of descriptive output for each collection.
Try `./configure --help` to see the configuration options. It is currently
not possible to exercise all combinations of build options this way.
-It is suggested that if you need to replace a piece of the collector
-(e.g. GC_mark_rts.c) you simply list your version ahead of gc.a on the
-ld command line, rather than replacing the one in gc.a. (This will
-generate numerous warnings under some versions of AIX, but it still
-works.)
-
All include files that need to be used by clients will be put in the
include subdirectory. (Normally this is just gc.h. `make cords` adds
"cord.h" and "ec.h".)
The collector currently is designed to run essentially unmodified on
machines that use a flat 32-bit or 64-bit address space.
That includes the vast majority of Workstations and X86 (X >= 3) PCs.
-(The list here was deleted because it was getting too long and constantly
-out of date.)
In a few cases (Amiga, OS/2, Win32, MacOS) a separate makefile
or equivalent is supplied. Many of these have separate README.system
a given size. Returns a pointer to the new object, which may, or may not,
be the same as the pointer to the old object. The new object is taken to
be atomic if and only if the old one was. If the new object is composite
- and larger than the original object then the newly added bytes are cleared
- (we hope). This is very likely to allocate a new object, unless
- `MERGE_SIZES` is defined in gc_priv.h. Even then, it is likely to recycle
- the old object only if the object is grown in small additive increments
- (which, we claim, is generally bad coding practice).
+ and larger than the original object then the newly added bytes are cleared.
+ This is very likely to allocate a new object.
4. `GC_free(object)` - Explicitly deallocate an object returned by
`GC_malloc` or `GC_malloc_atomic`, or friends. Not necessary, but can be
All externally visible names in the garbage collector start with `GC_`.
To avoid name conflicts, client code should avoid this prefix, except when
-accessing garbage collector routines or variables.
+accessing garbage collector routines.
There are provisions for allocation with explicit type information.
This is rarely necessary. Details can be found in gc_typed.h.
## The C++ Interface to the Allocator
-The Ellis-Hull C++ interface to the collector is included in
-the collector distribution. If you intend to use this, type
-`make c++` after the initial build of the collector is complete.
-See gc_cpp.h for the definition of the interface. This interface
-tries to approximate the Ellis-Detlefs C++ garbage collection
-proposal without compiler changes.
+The Ellis-Hull C++ interface to the collector is included in the collector
+distribution. If you intend to use this, type
+`./configure --enable-cplusplus; make` (or `make -f Makefile.direct c++`)
+after the initial build of the collector is complete. See gc_cpp.h for the
+definition of the interface. This interface tries to approximate the
+Ellis-Detlefs C++ garbage collection proposal without compiler changes.
Very often it will also be necessary to use gc_allocator.h and the
allocator declared there to construct STL data structures. Otherwise
The collector may be used to track down leaks in C programs that are
intended to run with malloc/free (e.g. code with extreme real-time or
portability constraints). To do so define `FIND_LEAK` in Makefile.
-This will cause the collector to invoke the `report_leak`
-routine defined near the top of reclaim.c whenever an inaccessible
-object is found that has not been explicitly freed. Such objects will
-also be automatically reclaimed.
-
-If all objects are allocated with `GC_DEBUG_MALLOC` (see next section), then
-the default version of report_leak will report at least the source file and
-line number at which the leaked object was allocated. This may sometimes be
-sufficient. (On a few machines, it will also report a cryptic stack trace.
-If this is not symbolic, it can sometimes be called into a symbolic stack
-trace by invoking program "foo" with "tools/callprocs.sh foo". It is a short
-shell script that invokes adb to expand program counter values to symbolic
-addresses. It was largely supplied by Scott Schwartz.)
+This will cause the collector to print a human-readable object description
+whenever an inaccessible object is found that has not been explicitly freed.
+Such objects will also be automatically reclaimed.
+
+If all objects are allocated with `GC_DEBUG_MALLOC` (see the next section)
+then, by default, the human-readable object description will at least contain
+the source file and the line number at which the leaked object was allocated.
+This may sometimes be sufficient. (On a few machines, it will also report
+a cryptic stack trace. If this is not symbolic, it can sometimes be converted
+into a symbolic stack trace by invoking program "foo" with
+`tools/callprocs.sh foo`. It is a short shell script that invokes adb to
+expand program counter values to symbolic addresses. It was largely supplied
+by Scott Schwartz.)
Note that the debugging facilities described in the next section can
-sometimes be slightly LESS effective in leak finding mode, since in
-leak finding mode, `GC_debug_free` actually results in reuse of the object.
-(Otherwise the object is simply marked invalid.) Also note that the test
-program is not designed to run meaningfully in `FIND_LEAK` mode.
-Use "make gc.a" to build the collector.
+sometimes be slightly LESS effective in leak finding mode, since in the latter
+`GC_debug_free` actually results in reuse of the object. (Otherwise the
+object is simply marked invalid.) Also, note that most GC tests are not
+designed to run meaningfully in `FIND_LEAK` mode.
## Debugging Facilities
deallocation of an object without debugging information. Out of
memory errors will be reported to stderr, in addition to returning `NULL`.
-`GC_debug_malloc` checking during garbage collection is enabled
-with the first call to `GC_debug_malloc`. This will result in some
+`GC_debug_malloc` checking during garbage collection is enabled
+with the first call to this function. This will result in some
slowdown during collections. If frequent heap checks are desired,
this can be achieved by explicitly invoking `GC_gcollect`, e.g. from
the debugger.
having been overwritten. This should happen with probability at most
one in 2**32. This probability is zero if `GC_debug_malloc` is never called.
-`GC_debug_malloc`, `GC_malloc_atomic`, and `GC_debug_realloc` take two
+`GC_debug_malloc`, `GC_debug_malloc_atomic`, and `GC_debug_realloc` take two
additional trailing arguments, a string and an integer. These are not
interpreted by the allocator. They are stored in the object (the string is
not copied). If an error involving the object is detected, they are printed.
-The macros `GC_MALLOC`, `GC_MALLOC_ATOMIC`, `GC_REALLOC`, `GC_FREE`, and
-`GC_REGISTER_FINALIZER` are also provided. These require the same arguments
-as the corresponding (nondebugging) routines. If gc.h is included
+The macros `GC_MALLOC`, `GC_MALLOC_ATOMIC`, `GC_REALLOC`, `GC_FREE`,
+`GC_REGISTER_FINALIZER` and friends are also provided. These require the same
+arguments as the corresponding (nondebugging) routines. If gc.h is included
with `GC_DEBUG` defined, they call the debugging versions of these
functions, passing the current file name and line number as the two
extra arguments, where appropriate. If gc.h is included without `GC_DEBUG`
-defined, then all these macros will instead be defined to their nondebugging
+defined then all these macros will instead be defined to their nondebugging
equivalents. (`GC_REGISTER_FINALIZER` is necessary, since pointers to
objects with debugging information are really pointers to a displacement
of 16 bytes from the object beginning, and some translation is necessary
1. Information provided by the VM system. This may be provided in one of
several forms. Under Solaris 2.X (and potentially under other similar
systems) information on dirty pages can be read from the /proc file system.
- Under other systems (currently SunOS4.X) it is possible to write-protect
+ Under other systems (e.g. SunOS4.X) it is possible to write-protect
the heap, and catch the resulting faults. On these systems we require that
system calls writing to the heap (other than read) be handled specially by
client code. See `os_dep.c` for details.
checksums.c
dbg_mlc.c
finalize.c
+ fnlz_mlc.c
headers.c
mach_dep.c
- MacOS.c -- contains MacOS code
+ extra/MacOS.c -- contains MacOS code
malloc.c
mallocx.c
mark.c
ptr_chck.c
reclaim.c
typd_mlc.c
- gc++.cc -- this is 'gc_cpp.cc' with less 'inline' and
- -- throw std::bad_alloc when out of memory
- -- gc_cpp.cc works just fine too
+ gc_cpp.cc
== 2. Test that the library works with 'test.c' ==
Unless --prefix is set (or --exec-prefix or one of the more obscure options),
-make install will install libgc.a and libgc.so in /usr/local/bin, which
-would typically require the "make install" to be run as root.
+"make install" will install libgc.a and libgc.so in /usr/local/lib and
+/usr/local/bin, respectively, which would typically require the "make install"
+to be run as root.
It is not recommended to turn off parallel marking for multiprocessors unless
a poor support of the feature on the platform.
== Important Usage Notes ==
-GC_init() MUST be called before calling any other GC functions. This
+GC_INIT() MUST be called before calling any other GC functions. This
is necessary to properly register segments in dynamic libraries. This
call is required even if your code does not use dynamic libraries as the
dyld code handles registering all data segments.
When your use of the garbage collector is confined to dylibs and you
-cannot call GC_init() before your libraries' static initializers have
+cannot call GC_INIT() before your libraries' static initializers have
run and perhaps called GC_malloc(), create an initialization routine
-for each library to call GC_init():
+for each library to call GC_INIT(), e.g.:
#include "gc.h"
-extern "C" void my_library_init() { GC_init(); }
+extern "C" void my_library_init() { GC_INIT(); }
Compile this code into a my_library_init.o, and link it into your
dylib. When you link the dylib, pass the -init argument with
my_library_init() to be called before any static initializers, and
will initialize the garbage collector properly.
-Note: It doesn't hurt to call GC_init() more than once, so it's best,
+Note: It doesn't hurt to call GC_INIT() more than once, so it's best,
if you have an application or set of libraries that all use the
garbage collector, to create an initialization routine for each of
-them that calls GC_init(). Better safe than sorry.
+them that calls GC_INIT(). Better safe than sorry.
The incremental collector is still a bit flaky on darwin. It seems to
work reliably with workarounds for a few possible bugs in place however
work in combination with it.
The stack finding code can be confused by putenv calls before collector
-initialization. Call GC_malloc or GC_init before any putenv calls.
+initialization. Call GC_malloc() or GC_INIT() before any putenv() calls.
See README.alpha for Linux on DEC AXP info.
-This file applies mostly to Linux/Intel IA32. Ports to Linux on an M68K,
-IA64, SPARC, MIPS, Alpha and PowerPC are integrated too. They should behave
+This file applies mostly to Linux/Intel IA-32. Ports to Linux on an M68K,
+IA-64, SPARC, MIPS, Alpha and PowerPC are integrated too. They should behave
similarly, except that the PowerPC port lacks incremental GC support, and
it is unknown to what extent the Linux threads code is functional.
See below for M68K specific notes.
in the Makefile.
3a) Every file that makes thread calls should define GC_THREADS, and then
- include gc.h. Gc.h redefines some of the pthread primitives as macros
- which also provide the collector with information it requires.
+ include gc.h. The latter redefines some of the pthread primitives as
+ macros which also provide the collector with information it requires.
3b) A new alternative to (3a) is to build the collector and compile GC clients
with -DGC_USE_LD_WRAP, and to link the final program with
dynamic libraries are used, but the collector is in a static
library. Tested by gc_config_macros.h.
-GC_REQUIRE_WCSDUP Force GC to export GC_wcsdup() (the Unicode version
- of GC_strdup); could be useful in the leak-finding mode.
-
These define arguments influence the collector configuration:
+GC_REQUIRE_WCSDUP Force GC to export GC_wcsdup() (the Unicode version
+ of GC_strdup); could be useful in the leak-finding mode. Clients should
+ define it before including gc.h if the function is needed.
+
FIND_LEAK Causes GC_find_leak to be initially set. This causes the
collector to assume that all inaccessible objects should have been
explicitly deallocated, and reports exceptions. Finalization and the test
PCR Set if the collector is being built as part of the Xerox Portable
Common Runtime.
-IMPORTANT: Any of the _THREADS options must normally also be defined in
- the client before including gc.h. This redefines thread primitives to
- invoke the GC_ versions instead. Alternatively, linker-based symbol
- interception can be used on a few platforms.
-
GC_THREADS Should set the appropriate one of the below macros,
except GC_WIN32_PTHREADS, which must be set explicitly. Tested by gc.h.
+ IMPORTANT: GC_THREADS macro (or the relevant platform-specific deprecated
+ one) must normally also be defined by the client before including gc.h.
+ This redefines thread primitives to invoke the GC_ wrappers instead.
+ Alternatively, linker-based symbol interception can be used on a few
+ platforms.
GC_SOLARIS_THREADS Enables support for Solaris pthreads.
Must also define _REENTRANT. Deprecated, use GC_THREADS instead.
See README.DGUX386. (Probably has not been tested recently.) Deprecated,
use GC_THREADS instead.
-GC_WIN32_THREADS Enables support for Win32 threads. That makes sense
- for Makefile (and Makefile.direct) only under Cygwin or MinGW. Deprecated,
+GC_WIN32_THREADS Enables support for Win32 threads. Deprecated,
use GC_THREADS instead.
GC_WIN32_PTHREADS Enables support for pthreads-win32 (or other
GC_DISABLE_INCREMENTAL Turn off the incremental collection support.
-NO_INCREMENTAL Causes the gctest program to not invoke the incremental
- collector. This has no impact on the generated library, only on the test
- program. (This is often useful for debugging failures unrelated to
+NO_INCREMENTAL Causes the GC test programs to not invoke the incremental mode
+ of the collector. This has no impact on the generated library, only on the
+ test programs. (This is often useful for debugging failures unrelated to
incremental GC.)
LARGE_CONFIG Tunes the collector for unusually large heaps.
execute permission is required.
GC_NO_OPERATOR_NEW_ARRAY Declares that the C++ compiler does not
- support the new syntax "operator new[]" for allocating and deleting arrays.
+ support the new syntax "operator new[]" for allocating and deleting arrays.
See gc_cpp.h for details. No effect on the C part of the collector.
This is defined implicitly in a few environments. Must also be defined
by clients that use gc_cpp.h.
for debugging/profiling purposes. The gc_backptr.h interface is
implemented only if this is defined.
-GC_ASSERTIONS Enable some internal GC assertion checking. Currently
- this facility is only used in a few places. It is intended primarily
- for debugging of the garbage collector itself, but could also...
+GC_ASSERTIONS Enable some internal GC assertion checking. It is intended
+ primarily for debugging of the garbage collector itself, but could also
+ help to identify cases of incorrect GC usage by a client.
DBG_HDRS_ALL Make sure that all objects have debug headers. Increases
the reliability (from 99.9999% to 100% mod. bugs) of some of the debugging
SAVE_CALL_COUNT=<n> Set the number of call frames saved with objects
allocated through the debugging interface. Affects the amount of
information generated in leak reports. Only matters on platforms
- on which we can quickly generate call stacks, currently Linux/(X86 & SPARC)
- and Solaris/SPARC and platforms that provide execinfo.h.
- Default is zero. On X86, client
- code should NOT be compiled with -fomit-frame-pointer.
+ on which we can quickly generate call stacks, currently Linux/X86,
+ Linux/SPARC, Solaris/SPARC, and platforms that provide execinfo.h.
+ Default is zero. On X86, client code should NOT be compiled with
+ -fomit-frame-pointer.
SAVE_CALL_NARGS=<n> Set the number of functions arguments to be saved
with each call frame. Default is zero. Ignored if we don't know how to
that include a pointer to a type descriptor in each allocated object).
USE_I686_PREFETCH Causes the collector to issue Pentium III style
- prefetch instructions. No effect except on X86 Linux platforms.
- Assumes a very recent gcc-compatible compiler and assembler.
- (Gas prefetcht0 support was added around May 1999.)
+ prefetch instructions. No effect except on Linux/X86 platforms.
Empirically the code appears to still run correctly on Pentium II
processors, though with no performance benefit. May not run on other
- X86 processors? In some cases this improves performance by
- 15% or so.
+ X86 processors. In some cases this improves performance by 15%
+ or so.
USE_3DNOW_PREFETCH Causes the collector to issue AMD 3DNow style
prefetch instructions. Same restrictions as USE_I686_PREFETCH.
THREAD_LOCAL_ALLOC Defines GC_malloc(), GC_malloc_atomic() and
GC_gcj_malloc() to use a per-thread set of free-lists. These then allocate
in a way that usually does not involve acquisition of a global lock.
- Recommended for multiprocessors. Requires explicit GC_INIT() call, unless
- REDIRECT_MALLOC is defined and GC_malloc is used first.
+ Recommended for multiprocessors.
USE_COMPILER_TLS Causes thread local allocation to use
the compiler-supported "__thread" thread-local variables. This is the
GC_PREFER_MPROTECT_VDB Choose MPROTECT_VDB manually in case of multiple
virtual dirty bit strategies are implemented (at present useful on Win32 and
Solaris to force MPROTECT_VDB strategy instead of the default GWW_VDB or
- PROC_VDB ones).
+ PROC_VDB ones, respectively).
GC_IGNORE_GCJ_INFO Disable GCJ-style type information (useful for
debugging on WinCE).
to expect that this is not safe if the client program also calls the system
malloc, or especially realloc. The sbrk man page strongly suggests this is
not safe: "Many library routines use malloc() internally, so use brk()
-and sbrk() only when you know that malloc() definitely will not be used by
+and sbrk() only when you know that malloc() definitely will not be used by
any library routine." This doesn't make a lot of sense to me, since there
seems to be no documentation as to which routines can transitively call malloc.
Nonetheless, under Solaris2, the collector now allocates
It is also essential that gc.h be included in files that call pthread_create,
pthread_join, pthread_detach, or dlopen. gc.h macro defines these to also do
-GC bookkeeping, etc. gc.h must be included with one or both of these macros
-defined, otherwise these replacements are not visible. A collector built in
-this way way only be used by programs that are linked with the threads library.
+GC bookkeeping, etc. gc.h must be included with GC_THREADS macro defined
+first, otherwise these replacements are not visible. A collector built in
+this way may only be used by programs that are linked with the threads library.
Unless USE_PROC_FOR_LIBRARIES is defined, dlopen disables collection
temporarily. In some unlikely cases, this can result in unpleasant heap
first thread. (This avoids a deadlock arising from calling GC_thr_init
with the allocation lock held.)
-It appears that there is a problem in using gc_cpp.h in conjunction with
-Solaris threads and Sun's C++ runtime. Apparently the overloaded new operator
-is invoked by some iostream initialization code before threads are correctly
-initialized. As a result, call to thr_self() in garbage collector
-initialization SEGV faults. Currently the only known workaround is to not
+There could be an issue when using gc_cpp.h in conjunction with Solaris
+threads and Sun's C++ runtime. Apparently the overloaded new operator
+may be invoked by some iostream initialization code before threads are
+correctly initialized. This may cause a SIGSEGV during initialization
+of the garbage collector. Currently the only known workaround is to not
invoke the garbage collector from a user defined global operator new, or to
have it invoke the garbage-collector's allocators only after main has started.
(Note that the latter requires a moderately expensive test in operator
-The collector has at various times been compiled under Windows 95 & later, NT,
-and XP, with the original Microsoft SDK, with Visual C++ 2.0, 4.0, and 6, with
-the GNU win32 tools, with Borland C++ Builder, with Watcom C, and
-with the Digital Mars compiler. It is likely that some of these have been
-broken in the meantime. Patches are appreciated.
+The collector has at various times been compiled under Windows 95 and later,
+NT, and XP, with the original Microsoft SDK, with Visual C++ 2.0, 4.0, and 6,
+with the GNU win32 tools, with Borland C++ Builder, with Watcom C, with EMX,
+and with the Digital Mars compiler (DMC).
For historical reasons,
the collector test program "gctest" is linked as a GUI application,
cursor may appear as long as it's running. If it is started from the
command line, it will usually run in the background. Wait a few
minutes (a few seconds on a modern machine) before you check the output.
-You should see either a failure indication or a "Collector appears to
-work" message.
+You should see either a failure indication or a "Collector appears to work"
+message.
A toy editor (cord/de.exe) based on cords (heavyweight
strings represented as trees) has been ported and is included.
Microsoft Tools
---------------
+
For Microsoft development tools, type
"nmake -f NT_MAKEFILE cpu=i386 make_as_lib=1 nothreads=1 nodebug=1"
to build the release variant of the collector as a static library without
GNU Tools
---------
+
The collector should be buildable under Cygwin with the
"./configure; make check" machinery.
Borland Tools
-------------
+
[Rarely tested.]
For Borland tools, use BCC_MAKEFILE. Note that
Borland's compiler defaults to 1 byte alignment in structures (-a1),
including "gc.h" (for example, with -DGC_DLL compiler option). It's
important, otherwise resulting programs will not run.
-
Special note for OpenWatcom users: the C (unlike the C++) compiler (of the
latest stable release, not sure for older ones) doesn't force pointer global
variables (i.e. not struct fields, not sure for locals) to be aligned unless
a feature (see an old report of same kind -
http://bugzilla.openwatcom.org/show_bug.cgi?id=664), so you are warned.
-
Incremental Collection
----------------------
+
There is some support for incremental collection. By default, the
collector chooses between explicit page protection, and GetWriteWatch-based
write tracking automatically, depending on the platform.
For the normal, non-dll-based thread tracking to work properly,
threads should be created with GC_CreateThread or GC_beginthreadex,
-and exit normally or call GC_endthreadex or GC_ExitThread. (For Cygwin, the
+and exit normally, or call GC_endthreadex or GC_ExitThread. (For Cygwin, the
standard pthread_create/exit calls could be used instead.) As in the pthread
case, including gc.h will redefine CreateThread, _beginthreadex,
_endthreadex, and ExitThread to call the GC_ versions instead.
The garbage collector generates warning messages of the form:
- Needed to allocate blacklisted block at 0x...
-
-
-or
-
-
Repeated allocation of very large block ...
+ May lead to memory leak and poor performance
when it needs to allocate a block at a location that it knows to be referenced
recently used logically open files. Any other needed files would be closed
after saving their state. They would then be reopened on demand.
Finalization would logically close the file, closing the real descriptor
- only if it happened to be cached.) Note that most modern systems (e.g. Irix)
- allow hundreds or thousands of open files, and this is typically not
- an issue.
+ only if it happened to be cached.) Note that most modern systems allow
+ thousands of open files, and this is typically not an issue.
* Finalization code may be run anyplace an allocation or other call to the
collector takes place. In multi-threaded programs, finalizers have to obey
the normal locking conventions to ensure safety. Code run directly from
finalizers should not acquire locks that may be held during allocation.
- This restriction can be easily circumvented by registering a finalizer which
- enqueues the real action for execution in a separate thread.
+ This restriction can be easily circumvented by calling
+ `GC_set_finalize_on_demand(1)` at program start and creating a separate
+ thread dedicated to periodic invocation of `GC_invoke_finalizers()`.
-In single-threaded code, it is also often easiest to have finalizers queue
-actions, which are then explicitly run during an explicit call by the user's
-program.
+In single-threaded code, it is also often easiest to have finalizers queued
+and then to have them explicitly executed by `GC_invoke_finalizers()`.
## Topologically ordered finalization
-.TH BDWGC 3 "15 Aug 2018"
+.TH BDWGC 3 "26 Mar 2019"
.SH NAME
GC_malloc, GC_malloc_atomic, GC_free, GC_realloc, GC_enable_incremental, GC_register_finalizer, GC_malloc_ignore_off_page, GC_malloc_atomic_ignore_off_page, GC_set_warn_proc \- Garbage collecting malloc replacement
.SH SYNOPSIS
.br
void * GC_malloc(size_t size);
.br
+void * GC_malloc_atomic(size_t size);
+.br
void GC_free(void *ptr);
.br
void * GC_realloc(void *ptr, size_t size);
.br
+void GC_enable_incremental(void);
+.br
+void * GC_malloc_ignore_off_page(size_t size);
+.br
+void * GC_malloc_atomic_ignore_off_page(size_t size);
+.br
+void GC_set_warn_proc(void (*proc)(char *, GC_word));
+.br
.sp
cc ... -lgc
.LP
.LP
It is also possible to use the collector to find storage leaks in programs destined to be run with standard malloc/free. The collector can be compiled for thread-safe operation. Unlike standard malloc, it is safe to call malloc after a previous malloc call was interrupted by a signal, provided the original malloc call is not resumed.
.LP
-The collector may, on rare occasion produce warning messages. On UNIX machines these appear on stderr. Warning messages can be filtered, redirected, or ignored with
+The collector may, on rare occasion, produce warning messages. On UNIX machines these appear on stderr. Warning messages can be filtered, redirected, or ignored with
.I
GC_set_warn_proc
This is recommended for production code. See gc.h for details.
Fully portable code should call
.I
GC_INIT
-from the main program before making any other GC calls.
+from the primordial thread of the main program before making any other GC calls.
On most platforms this does nothing and the collector is initialized on first use.
On a few platforms explicit initialization is necessary. And it can never hurt.
.LP
allows the garbage collector to easily ignore the collectors own data
structures when it searches for root pointers. Other allocator and collector
internal data structures are allocated dynamically with `GC_scratch_alloc`.
-`GC_scratch_alloc` does not allow for deallocation, and is therefore used only
-for permanent data structures.
+The latter does not allow for deallocation, and is therefore used only for
+permanent data structures.
-The allocator allocates objects of different _kinds_. Different kinds are
+The allocator returns objects of different _kinds_. Different _kinds_ are
handled somewhat differently by certain parts of the garbage collector.
Certain kinds are scanned for pointers, others are not. Some may have
per-object type descriptors that determine pointer locations. Or a specific
kind may correspond to one specific object layout. Two built-in kinds are
-uncollectible.
-In spite of that, it is very likely that most C clients of the collector
-currently use at most two kinds: `NORMAL` and `PTRFREE` objects. The
+uncollectible. In spite of that, it is very likely that most C clients of the
+collector currently use at most two kinds: `NORMAL` and `PTRFREE` objects. The
[GCJ](https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcj/) runtime also makes heavy
use of a kind (allocated with `GC_gcj_malloc`) that stores type information
at a known offset in method tables.
a length. (For other possibilities, see `gc_mark.h`.)
At the beginning of the mark phase, all root segments (as described above) are
-pushed on the stack by `GC_push_roots`. (Registers and eagerly processed stack
+pushed on the stack by `GC_push_roots`. (Registers and eagerly scanned stack
sections are processed by pushing the referenced objects instead of the stack
section itself.) If `ALL_INTERIOR_POINTERS` is not defined, then stack roots
require special treatment. In this case, the normal marking code ignores
attempt results in additional marked objects.
Each mark stack entry is processed by examining all candidate pointers in the
-range described by the entry. If the region has no associated type
-information, then this typically requires that each 4-byte aligned quantity
-(8-byte aligned with 64-bit pointers) be considered a candidate pointer.
+range described by the entry. If the region has no associated type information
+then this typically requires that each 4-byte aligned quantity (8-byte aligned
+with 64-bit pointers) be considered a candidate pointer.
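
The word-by-word scan described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `sketch_count_candidates` is a hypothetical helper, and the simple address-range test stands in for the collector's real table-based lookup, which is described next.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: treat every pointer-sized, pointer-aligned word
 * of an untyped region as a candidate pointer.  The [heap_lo, heap_hi)
 * range test is a stand-in for the real heap block lookup.
 */
static size_t sketch_count_candidates(const uintptr_t *region,
                                      size_t n_words,
                                      uintptr_t heap_lo,
                                      uintptr_t heap_hi)
{
    size_t hits = 0;
    size_t i;

    for (i = 0; i < n_words; i++) {
        uintptr_t candidate = region[i]; /* one aligned word */

        if (candidate >= heap_lo && candidate < heap_hi)
            hits++; /* plausible heap address, so examine it further */
    }
    return hits;
}
```

Because the region carries no type information, an integer that happens to look like a heap address is counted as well, which is what makes the scan conservative.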
We determine whether a candidate pointer is actually the address of a heap
block. This is done in the following steps:
* The candidate pointer is divided into two pieces; the most significant
bits identify a `HBLKSIZE`-sized page in the address space, and the least
significant bits specify an offset within that page. (A hardware page may
- actually consist of multiple such pages. HBLKSIZE is usually the page size
- divided by a small power of two.)
+ actually consist of multiple such pages. Normally, HBLKSIZE is the
+ page size divided by a small power of two. Alternatively, if the collector
+ is built with `-DLARGE_CONFIG`, such a page may consist of multiple hardware
+ pages.)
* The page address part of the candidate pointer is looked up in
a [table](tree.md). Each table entry contains either 0, indicating that
the page is not part of the garbage collected heap, a small integer _n_,
operation in computing the object start address.
* The mark bit for the target object is checked and set. If the object was
previously unmarked, the object is pushed on the mark stack. The descriptor
- is read from the page descriptor. (This is computed from information
- `GC_obj_kinds` when the page is first allocated.)
+ is read from the page descriptor. (This is computed from information stored
+ in `GC_obj_kinds` when the page is first allocated.)
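As a minimal sketch of the first step above, the split of a candidate pointer can be done with two mask operations when the block size is a power of two. The names `SKETCH_HBLKSIZE`, `split_ptr` and `split_candidate` are hypothetical, and the value 4096 is an assumption for illustration, not the collector's actual configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed block size for this sketch only; the real HBLKSIZE is set by */
/* the collector's build configuration.                                 */
#define SKETCH_HBLKSIZE ((uintptr_t)4096)

typedef struct {
    uintptr_t block_addr; /* start address of the HBLKSIZE-sized page */
    size_t offset;        /* displacement of the pointer inside it    */
} split_ptr;

/* Split a candidate pointer into its page address and in-page offset. */
static split_ptr split_candidate(uintptr_t p)
{
    split_ptr r;

    /* The masking below is only valid for a power-of-two block size. */
    assert((SKETCH_HBLKSIZE & (SKETCH_HBLKSIZE - 1)) == 0);
    r.block_addr = p & ~(SKETCH_HBLKSIZE - 1);
    r.offset = (size_t)(p & (SKETCH_HBLKSIZE - 1));
    return r;
}
```

The page address would then serve as the key for the table lookup, and the offset would be used to locate the object start within the block.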
At the end of the mark phase, mark bits for left-over free lists are cleared,
in case a free list was accidentally marked due to a stray pointer.
In incremental mode, the heap is always expanded when we encounter
insufficient space for an allocation. Garbage collection is triggered whenever
-we notice that more than `GC_heap_size`/2 * `GC_free_space_divisor` bytes
+we notice that more than `GC_heap_size / (2 * GC_free_space_divisor)` bytes
of allocation have taken place. After `GC_full_freq` minor collections a major
collection is started.
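The trigger condition can be sketched as follows, reading the formula as the heap size divided by twice the free space divisor (this reading is an assumption). The function names are hypothetical, and the collector's real computation takes further factors into account.

```c
#include <assert.h>
#include <stddef.h>

/* Approximate number of allocated bytes after which a collection */
/* would be triggered in incremental mode (sketch only).          */
static size_t sketch_collection_threshold(size_t heap_size,
                                          size_t free_space_divisor)
{
    assert(free_space_divisor > 0);
    return heap_size / (2 * free_space_divisor);
}

/* Returns 1 if more bytes have been allocated than the threshold. */
static int sketch_should_collect(size_t bytes_allocd, size_t heap_size,
                                 size_t free_space_divisor)
{
    return bytes_allocd
           > sketch_collection_threshold(heap_size, free_space_divisor);
}
```

Under this reading, a larger divisor lowers the threshold, so collections happen more often and the heap stays smaller.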
## Thread-local allocation
-If thread-local allocation is enabled, the collector keeps separate arrays
-of free lists for each thread. Thread-local allocation is currently only
-supported on a few platforms.
+If thread-local allocation is enabled (which is true in the default
+configuration for most supported platforms), the collector keeps separate
+arrays of free lists for each thread.
The free list arrays associated with each thread are only used to satisfy
requests for objects that are both very small, and belong to one of a small
-number of well-known kinds. These currently include _normal_ and pointer-free
-objects. Depending on the configuration, _gcj_ objects may also be included.
+number of well-known kinds. These include _normal_, pointer-free, _gcj_ and
+_disclaim_ objects.
Thread-local free list entries contain either a pointer to the first element
of a free list, or they contain a counter of the number of allocation
granules, corresponding to objects of this size, allocated so far. Initially
they contain the value one, i.e. a small counter value.
-Thread-local allocation allocates directly through the global allocator,
-if the object is of a size or kind not covered by the local free lists.
+Thread-local allocation goes directly through the global allocator if the
+object is of a size or kind not covered by the local free lists.
If there is an appropriate local free list, the allocator checks whether
it contains a sufficiently small counter value. If so, the counter is simply
-incremented by the counter value, and the global allocator is used. In this
-way, the initial few allocations of a given size bypass the local allocator.
+incremented by a value, and the global allocator is used. In this way,
+the initial few allocations of a given size bypass the local allocator.
A thread that only allocates a handful of objects of a given size will not
build up its own free list for that size. This avoids wasting space for
unpopular object sizes or kinds.
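The counter-or-pointer scheme described above can be sketched roughly as below. All names and limits are hypothetical. The sketch assumes that small integer values never collide with real object addresses, that the counter is bumped by the request size in granules, and that a counter past the limit means the local list should be built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits for this sketch only. */
#define SKETCH_DIRECT_LIMIT 16u /* counters up to this use the global path */
#define SKETCH_MAX_GRANULES 16u /* largest request size handled locally    */
#define SKETCH_MAX_COUNTER (SKETCH_DIRECT_LIMIT + SKETCH_MAX_GRANULES)

typedef struct sketch_obj { struct sketch_obj *next; } sketch_obj;

typedef enum {
    SKETCH_USE_GLOBAL,   /* counter was small: allocate globally          */
    SKETCH_REFILL_LOCAL, /* counter is large: time to build a local list  */
    SKETCH_FROM_LOCAL    /* entry was a pointer: object popped from list  */
} sketch_path;

/* Decide how to satisfy one allocation from a thread-local entry that */
/* holds either a small counter or a pointer to the first free object. */
static sketch_path sketch_local_alloc(void **entry, size_t granules,
                                      void **result)
{
    uintptr_t v = (uintptr_t)*entry;

    assert(granules >= 1 && granules <= SKETCH_MAX_GRANULES);
    *result = NULL;
    if (v >= 1 && v <= SKETCH_DIRECT_LIMIT) {
        /* Small counter: bump it and let the global allocator serve. */
        *entry = (void *)(v + granules);
        return SKETCH_USE_GLOBAL;
    }
    if (v <= SKETCH_MAX_COUNTER) {
        /* Empty list or exhausted counter: the caller should refill. */
        return SKETCH_REFILL_LOCAL;
    }
    /* Otherwise the entry is a real pointer: pop the first object. */
    sketch_obj *head = (sketch_obj *)*entry;
    *result = head;
    *entry = head->next;
    return SKETCH_FROM_LOCAL;
}
```

Storing the counter in the same word as the list head keeps the fast path to a single load and comparison, which is presumably why the two roles share one slot.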
The following describes the standard C interface to the garbage collector.
It is not a complete definition of the interface. It describes only the most
commonly used functionality, approximately in decreasing order of frequency
-of use. The full interface is described in `gc.h` file.
+of use. This somewhat duplicates the information in the `gc.man` file. The
+full interface is described in the `gc.h` file.
Clients should include `gc.h` (i.e., not `gc_config_macros.h`,
`gc_pthread_redirects.h`, `gc_version.h`). In the case of multi-threaded code,
Thread users should also be aware that on many platforms objects reachable
only from thread-local variables may be prematurely reclaimed. Thus objects
pointed to by thread-local variables should also be pointed to by a globally
-visible data structure. (This is viewed as a bug, but as one that
-is exceedingly hard to fix without some `libc` hooks.)
+visible data area, e.g. the thread's stack. (This behavior is viewed as a bug,
+but as one that is exceedingly hard to fix without some `libc` hooks.)
-**void * `GC_MALLOC`(size_t _nbytes_)** - Allocates and clears _nbytes_
-of storage. Requires (amortized) time proportional to _nbytes_. The resulting
+**void * `GC_MALLOC`(size_t _bytes_)** - Allocates and clears _bytes_
+of storage. Requires (amortized) time proportional to _bytes_. The resulting
object will be automatically deallocated when unreferenced. References from
objects allocated with the system malloc are usually not considered by the
collector. (See `GC_MALLOC_UNCOLLECTABLE`, however. Building the collector
is defined before `gc.h` is included, a debugging version that checks
occasionally for overwrite errors, and the like.
-**void * `GC_MALLOC_ATOMIC`(size_t _nbytes_)** - Allocates _nbytes_
-of storage. Requires (amortized) time proportional to _nbytes_. The resulting
+**void * `GC_MALLOC_ATOMIC`(size_t _bytes_)** - Allocates _bytes_
+of storage. Requires (amortized) time proportional to _bytes_. The resulting
object will be automatically deallocated when unreferenced. The client
promises that the resulting object will never contain any pointers. The memory
is not cleared. This is the preferred way to allocate strings, floating point
arrays, bitmaps, etc. More precise information about pointer locations can be
communicated to the collector using the interface in `gc_typed.h`.
-**void * `GC_MALLOC_UNCOLLECTABLE`(size_t _nbytes_)** - Identical
+**void * `GC_MALLOC_UNCOLLECTABLE`(size_t _bytes_)** - Identical
to `GC_MALLOC`, except that the resulting object is not automatically
deallocated. Unlike the system-provided `malloc`, the collector does scan the
object for pointers to garbage-collectible memory, even if the block itself
does not appear to be reachable. (Objects allocated in this way are
effectively treated as roots by the collector.)
-**void * `GC_REALLOC`(void * _old_, size_t _new_size_)** - Allocates a new
-object of the indicated size and copy (a prefix of) the old object into the
+**void * `GC_REALLOC`(void * _old_object_, size_t _new_bytes_)** - Allocates
+a new object of the indicated size and copies the old object's content into the
new object. The old object is reused in place if convenient. If the original
object was allocated with `GC_MALLOC_ATOMIC`, the new object is subject to the
same constraints. If it was allocated as an uncollectible object, then the new
object is uncollectible, and the old object (if different) is deallocated.
-**void `GC_FREE`(void * _dead_)** - Explicitly deallocates an object.
+**void `GC_FREE`(void * _object_)** - Explicitly deallocates an _object_.
Typically not useful for small collectible objects.
-**void * `GC_MALLOC_IGNORE_OFF_PAGE`(size_t _nbytes_)** and
-**void * `GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE`(size_t _nbytes_)** - Analogous
+**void * `GC_MALLOC_IGNORE_OFF_PAGE`(size_t _bytes_)** and
+**void * `GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE`(size_t _bytes_)** - Analogous
to `GC_MALLOC` and `GC_MALLOC_ATOMIC`, respectively, except that the client
guarantees that as long as the resulting object is of use, a pointer
is maintained to someplace inside the first 512 bytes of the object. This
This is the preferred way to allocate objects that are likely to be
more than 100 KB in size. It greatly reduces the risk that such objects will
be accidentally retained when they are no longer needed. Thus space usage may
-be significantly reduced.
+be significantly reduced. Another way is to call
+`GC_set_all_interior_pointers(0)` at program start (this, however, is
+generally not suitable for C++ code because of multiple inheritance).
**void `GC_INIT()`** - On some platforms, it is necessary to invoke this _from
the main executable_, _not from a dynamic library_, before the initial
to perform a small amount of work every few invocations of `GC_MALLOC` or the
like, instead of performing an entire collection at once. This is likely
to increase total running time. It will improve response on a platform that
-either has suitable support in the garbage collector (Linux and most Unix
-versions, Win32 if the collector was suitably built). On many platforms this
-interacts poorly with system calls that write to the garbage collected heap.
+has suitable support in the garbage collector (Linux and most Unix versions,
+Win32 if the collector was suitably built). On many platforms this interacts
+poorly with system calls that write to the garbage collected heap.
**void `GC_set_warn_proc`(GC_warn_proc)** - Replaces the default procedure
used by the collector to print warnings. The collector may otherwise
them becomes inaccessible. It is not an acceptable method to perform actions
that must be performed in a timely fashion. See `gc.h` for details of the
interface. See also [here](finalization.md) for a more detailed discussion
-of the design.
-
-Note that an object may become inaccessible before client code is done
-operating on objects referenced by its fields. Suitable synchronization
-is usually required. See
-[here](http://portal.acm.org/citation.cfm?doid=604131.604153)
-or [here](http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html) for
-details.
+of the design. Note that an object may become inaccessible before client code
+is done operating on objects referenced by its fields. Suitable
+synchronization is usually required. See
+[here](http://portal.acm.org/citation.cfm?doid=604131.604153) or
+[here](http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html) for details.
If you are concerned with multiprocessor performance and scalability, you
should consider enabling and using thread local allocation.
-If your platform supports it, you should build the collector with parallel
-marking support (`-DPARALLEL_MARK`); configure has it on by default.
+If your platform supports it, you should also build the collector with
+parallel marking support (`-DPARALLEL_MARK`); configure has it on by default.
If the collector is used in an environment in which pointer location
information for heap objects is easily available, this can be passed on to the
The garbage collector provides leak detection support. This includes the
following features:
- 1. Leak detection mode can be initiated at run-time by setting
- `GC_find_leak` instead of building the collector with `FIND_LEAK` defined.
- This variable should be set to a nonzero value at program startup.
+ 1. Leak detection mode can be initiated at run-time by calling
+ `GC_set_find_leak(1)` at program startup instead of building the collector
+ with the `FIND_LEAK` macro defined.
2. Leaked objects should be reported and then correctly garbage collected.
-To use the collector as a leak detector, follow the following steps:
+To use the collector as a leak detector, perform the following steps:
- 1. Build the collector with `-DFIND_LEAK`. Otherwise use default build
- options.
+ 1. Activate the leak detection mode as described above.
2. Change the program so that all allocation and deallocation goes through
the garbage collector.
- 3. Arrange to call `GC_gcollect` at appropriate points to check for leaks.
- (For sufficiently long running programs, this will happen implicitly, but
- probably not with sufficient frequency.)
+ 3. Arrange to call `GC_gcollect` (or `CHECK_LEAKS()`) at appropriate points
+ to check for leaks. (For sufficiently long running programs, this happens
+ implicitly, but probably not with sufficient frequency.)
The second step can usually be accomplished with the
`-DREDIRECT_MALLOC=GC_malloc` option when the collector is built, or by
-defining `malloc`, `calloc`, `realloc` and `free` to call the corresponding
+defining `malloc`, `calloc`, `realloc`, `free` (as well as `strdup`,
+`strndup`, `wcsdup`, `memalign`, `posix_memalign`) to call the corresponding
garbage collector functions. But this, by itself, will not yield very
-informative diagnostics, since the collector does not keep track of
+informative diagnostics, since the collector does not keep track of the
information about how objects were allocated. The error reports will include
only object addresses.
reports should be generated with linuxthreads, at least.
On a few platforms (currently Solaris/SPARC, Irix, and, with
--DSAVE_CALL_CHAIN, Linux/X86), `GC_MALLOC` also causes some more information
+`-DSAVE_CALL_CHAIN`, Linux/X86), `GC_MALLOC` also causes some more information
about its call stack to be saved in the object. Such information is reproduced
in the error reports in very non-symbolic form, but it can be very useful with
the aid of a debugger.
Assume the collector has been built with `-DFIND_LEAK` or
`GC_set_find_leak(1)` exists as the first statement in `main`.
-The program to be tested for leaks can then look like "leak_test.c" file
-in the "tests" subdirectory of the distribution.
+The program to be tested for leaks could look like the `tests/leak_test.c`
+file in the distribution.
On an Intel X86 Linux system this produces on the stderr stream:
- Leaked composite object at 0x806dff0 (leak_test.c:8, sz=4)
+ Found 1 leaked objects:
+ 0x806dff0 (tests/leak_test.c:19, sz=4, NORMAL)
(On most unmentioned operating systems, the output is similar to this. If the
On Irix it reports:
- Leaked composite object at 0x10040fe0 (leak_test.c:8, sz=4)
+ Found 1 leaked objects:
+ 0x10040fe0 (tests/leak_test.c:19, sz=4, NORMAL)
Caller at allocation:
##PC##= 0x10004910
and on Solaris the error report is:
- Leaked composite object at 0xef621fc8 (leak_test.c:8, sz=4)
+ Found 1 leaked objects:
+ 0xef621fc8 (tests/leak_test.c:19, sz=4, NORMAL)
Call chain at allocation:
args: 4 (0x4), 200656 (0x30FD0)
##PC##= 0x14ADC
In the latter two cases some additional information is given about how malloc
was called when the leaked object was allocated. For Solaris, the first line
specifies the arguments to `GC_debug_malloc` (the actual allocation routine),
-The second the program counter inside main, the third the arguments to `main`,
-and finally the program counter inside the caller to main (i.e. in the
-C startup code).
-
-In the Irix case, only the address inside the caller to main is given.
+the second one specifies the program counter inside `main`, the third one
+specifies the arguments to `main`, and the last one specifies the program
+counter inside the caller of `main` (i.e. in the C startup code). In the Irix
+case, only the address inside the caller of `main` is given.
In many cases, a debugger is needed to interpret the additional information.
-On systems supporting the "adb" debugger, the `tools/callprocs.sh` script can
+On systems supporting the `adb` debugger, the `tools/callprocs.sh` script can
be used to replace program counter values with symbolic names. The collector
tries to generate symbolic names for call stacks if it knows how to do so on
the platform. This is true on Linux/X86, but not on most other platforms.
* Platforms
* Some collector details
* Further reading
- * Local Links for this collector
- * Local Background Links
- * Contacts and Mailing List
+ * Information provided on the BDWGC site
+ * More background information
+ * Contacts and new release announcements
[ This is an updated version of the page formerly at
`www.hpl.hp.com/personal/Hans_Boehm/gc/`, before that at
Preview versions may contain additional features, platform support, but are
likely to be less well tested. The list of changes for each version
is specified on the [releases](https://github.com/ivmai/bdwgc/releases) page.
+The development version (snapshot) is available in the master branch of the
+[bdwgc git](https://github.com/ivmai/bdwgc) repository on GitHub.
The arguments for and against conservative garbage collection in C and C++ are
briefly discussed [here](http://www.hboehm.info/gc/issues.html). The
The garbage collector code is copyrighted by
[Hans-J. Boehm](http://www.hboehm.info), Alan J. Demers,
[Xerox Corporation](http://www.xerox.com/),
-[Silicon Graphics](http://www.sgi.com/), and
-[Hewlett-Packard Company](http://www.hp.com/). It may be used and copied
-without payment of a fee under minimal restrictions. See the README.md file
-in the distribution or the [license](http://www.hboehm.info/gc/license.txt)
-for more details. **IT IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY
-EXPRESSED OR IMPLIED. ANY USE IS AT YOUR OWN RISK**.
+[Silicon Graphics](http://www.sgi.com/),
+[Hewlett-Packard Company](http://www.hp.com/),
+[Ivan Maidanski](https://github.com/ivmai), and partially by some others.
+It may be used and copied without payment of a fee under minimal restrictions.
+See the README.md file in the distribution or the
+[license](http://www.hboehm.info/gc/license.txt) for more details.
+**IT IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED OR IMPLIED.
+ANY USE IS AT YOUR OWN RISK.**
Empirically, this collector works with most unmodified C programs, simply
-by replacing `malloc` with `GC_malloc` calls, replacing `realloc` with
-`GC_realloc` calls, and removing free calls. Exceptions are discussed
+by replacing `malloc` and `calloc` with `GC_malloc` calls, replacing `realloc`
+with `GC_realloc` calls, and removing `free` calls. Exceptions are discussed
[here](http://www.hboehm.info/gc/issues.html).
## Platforms
The collector is not completely portable, but the distribution includes ports
to most standard PC and UNIX/Linux platforms. The collector should work
-on Linux, *BSD, recent Windows versions, MacOS X, HP/UX, Solaris, Tru64, Irix
-and a few other operating systems. Some ports are more polished than others.
+on Linux, Android, BSD variants, OS/2, Windows (Win32 and Win64), MacOS X,
+iOS, HP/UX, Solaris, Tru64, Irix, Symbian and other operating systems. Some
+platforms are more polished (better supported) than others.
-Irix pthreads, Linux threads, Win32 threads, Solaris threads (pthreads only),
-HP/UX 11 pthreads, Tru64 pthreads, and MacOS X threads are supported in recent
-versions.
+Irix pthreads, Linux threads, Windows threads, Solaris threads (pthreads
+only), HP/UX 11 pthreads, Tru64 pthreads, and MacOS X threads are supported.
## Some Collector Details
algorithm. It provides incremental and generational collection under operating
systems which provide the right kind of virtual memory support. (Currently
this includes SunOS[45], IRIX, OSF/1, Linux, and Windows, with varying
-restrictions.) It allows [_finalization_](finalization.md) code to be invoked
+restrictions.) It allows [finalization](finalization.md) code to be invoked
when an object is collected. It can take advantage of type information
to locate pointers if such information is provided, but it is usually used
without such information. See the README and `gc.h` files in the distribution
`malloc`/`free` allocation in time.
We also expect that in many cases any additional overhead will be more than
-compensated for by decreased copying etc. if programs are written and tuned
+compensated for by e.g. decreased copying if programs are written and tuned
for garbage collection.
## Further reading
**The beginnings of a frequently asked questions list for this collector are
-[here](http://www.hboehm.info/gc/faq.html)**.
+[here](http://www.hboehm.info/gc/faq.html).**
-**The following provide information on garbage collection in general**: Paul
-Wilson's [garbage collection ftp archive](ftp://ftp.cs.utexas.edu/pub/garbage)
+**The following provide information on garbage collection in general:**
+
+Paul Wilson's
+[garbage collection ftp archive](ftp://ftp.cs.utexas.edu/pub/garbage)
and [GC survey](ftp://ftp.cs.utexas.edu/pub/garbage/gcsurvey.ps).
The Ravenbrook
and his [book](http://www.cs.kent.ac.uk/people/staff/rej/gcbook/gcbook.html).
**The following papers describe the collector algorithms we use and the
-underlying design decisions at a higher level.**
+underlying design decisions at a higher level:**
(Some of the lower level details can be found [here](gcdescr.md).)
test for the potential of unbounded heap growth.
**The following papers discuss language and compiler restrictions necessary
-to guaranteed safety of conservative garbage collection.**
+to guarantee the safety of conservative garbage collection:**
We thank John Levine and JCLT for allowing us to make the second paper
available electronically, and providing PostScript for the final version.
All of the following assumes that the collector is being ported to
a byte-addressable 32- or 64-bit machine. Currently all successful ports
-to 64-bit machines involve LP64 targets. The code base includes some
-provisions for P64 targets (notably Win64), but that has not been tested. You
+to 64-bit machines involve LP64 and LLP64 targets (notably Win64). You
are hereby discouraged from attempting a port to non-byte-addressable,
or 8-bit, or 16-bit machines.
is found. This often works on Posix-like platforms. It makes it harder
to debug client programs, since startup involves generating and catching
a segmentation fault, which tends to confuse users.
- * `DATAEND` - Set to the end of the main data segment. Defaults to `end`,
+ * `DATAEND` - Set to the end of the main data segment. Defaults to `_end`,
where that is declared as an array. This works in some cases, since the
linker introduces a suitable symbol.
* `DATASTART2`, `DATAEND2` - Some platforms have two discontiguous main data
plausible page boundary, and use that as the stack base.
* `DYNAMIC_LOADING` - Should be defined if `dyn_load.c` has been updated for
this platform and tracing of dynamic library roots is supported.
- * `MPROTECT_VDB`, `PROC_VDB` - May be defined if the corresponding
- _virtual dirty bit_ implementation in `os_dep.c` is usable on this platform.
- This allows incremental/generational garbage collection. `MPROTECT_VDB`
- identifies modified pages by write protecting the heap and catching faults.
- `PROC_VDB` uses the /proc primitives to read dirty bits.
+ * `GWW_VDB`, `MPROTECT_VDB`, `PROC_VDB` - May be defined if the
+ corresponding _virtual dirty bit_ implementation in `os_dep.c` is usable on
+ this platform. This allows incremental/generational garbage collection.
+ (`GWW_VDB` uses the Win32 `GetWriteWatch` function to read dirty bits,
+ `MPROTECT_VDB` identifies modified pages by write protecting the heap and
+ catching faults. `PROC_VDB` uses the /proc primitives to read dirty bits.)
* `PREFETCH`, `GC_PREFETCH_FOR_WRITE` - The collector uses `PREFETCH(x)`
to preload the cache with the data at _x_ address. This defaults to a no-op.
* `CLEAR_DOUBLE` - If `CLEAR_DOUBLE` is defined, then `CLEAR_DOUBLE(x)`
workarounds are common. Non-preemptive threads packages will probably
require further work. Similarly thread-local allocation and parallel marking
requires further work in `pthread_support.c`, and may require better
- `atomic_ops` support.
+ `atomic_ops` support for the target platform.
## Dynamic library support
# Garbage collector scalability
-In its default configuration, the Boehm-Demers-Weiser garbage collector is not
-thread-safe. It can be made thread-safe for a number of environments
-by building the collector with `-DGC_THREADS` compilation flag. This has
-primarily two effects:
+If `Makefile.direct` is used, the Boehm-Demers-Weiser garbage collector is
+not thread-safe in its default configuration. Generally, it can be made
+thread-safe by building the collector with the `-DGC_THREADS` compilation
+flag. This has primarily the following effects:
1. It causes the garbage collector to stop all other threads when it needs
- to see a consistent memory state.
+ to see a consistent memory state. It intercepts thread creation and
+ termination events to maintain a list of client threads to be stopped when
+ needed.
2. It causes the collector to acquire a lock around essentially all
allocation and garbage collection activity. Since a single lock is used for
all allocation-related activity, only one thread can be allocating
On most platforms, the allocator/collector lock is implemented as a spin lock
with exponential back-off. Longer wait times are implemented by yielding
and/or sleeping. If a collection is in progress, the pure spinning stage
-is skipped. This has the advantage that uncontested and thus most uniprocessor
-lock acquisitions are very cheap. It has the disadvantage that the application
-may sleep for small periods of time even when there is work to be done. And
+is skipped. This has the advantage that uncontested, and thus most
+uniprocessor, lock acquisitions are very cheap. It has the disadvantage that
+the application may sleep for small periods of time even when there is work
+to be done. And
threads may be unnecessarily woken up for short periods. Nonetheless, this
scheme empirically outperforms native queue-based mutual exclusion
implementations in most cases, sometimes drastically so.
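A spin lock of the general shape described above might look like the following sketch. It uses C11 atomics and POSIX `sched_yield` as assumptions about the environment. The names and the spin limit are hypothetical, and this is not the collector's actual lock implementation.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_MAX_SPIN 128u /* hypothetical upper bound on the pause length */

typedef struct { atomic_flag held; } sketch_lock;

/* Try once to take the lock; returns 1 on success, 0 if already held. */
static int sketch_lock_try(sketch_lock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

static void sketch_lock_release(sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Acquire the lock. The uncontested case costs a single atomic        */
/* operation. Under contention, spin with exponentially growing pauses */
/* unless a collection is in progress, then fall back to yielding.     */
static void sketch_lock_acquire(sketch_lock *l, int collecting)
{
    unsigned pause = 1;

    if (sketch_lock_try(l))
        return;
    if (!collecting) {
        while (pause <= SKETCH_MAX_SPIN) {
            for (volatile unsigned i = 0; i < pause; i++) {
                /* busy wait */
            }
            if (sketch_lock_try(l))
                return;
            pause *= 2; /* exponential back-off */
        }
    }
    while (!sketch_lock_try(l))
        sched_yield(); /* longer waits give up the processor */
}
```

A lock would be initialized with `sketch_lock l = { ATOMIC_FLAG_INIT };` before its first use.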
* Building the collector with `-DPARALLEL_MARK` allows the collector to run
the mark phase in parallel in multiple threads, and thus on multiple
- processors. The mark phase typically consumes the large majority of the
- collection time. Thus this largely parallelizes the garbage collector
- itself, though not the allocation process. Currently the marking
+ processors (or processor cores). The mark phase typically consumes the large
+ majority of the collection time. Thus, this largely parallelizes the garbage
+ collector itself, though not the allocation process. Currently the marking
is performed by the thread that triggered the collection, together with
- _N_ - 1 dedicated threads, where _N_ is the number of processors detected
- by the collector. The dedicated threads are created once at initialization
- time. A second effect of this flag is to switch to a more concurrent
- implementation of `GC_malloc_many`, so that free lists can be built, and
- memory can be cleared, by more than one thread concurrently.
+ _N_ - 1 dedicated threads, where _N_ is the number of processors (cores)
+ detected by the collector. The dedicated marker threads are created once at
+ initialization time. Another effect of this flag is to switch to a more
+ concurrent implementation of `GC_malloc_many`, so that free lists can be
+ built and memory can be cleared by more than one thread concurrently.
* Building the collector with `-DTHREAD_LOCAL_ALLOC` adds support for
- thread-local allocation. This causes `GC_malloc`, `GC_malloc_atomic`, and
- `GC_gcj_malloc` to be redefined to perform thread-local allocation.
+ thread-local allocation. This causes `GC_malloc` (actually `GC_malloc_kind`)
+ and `GC_gcj_malloc` to be redefined to perform thread-local allocation.
Memory returned from thread-local allocators is completely interchangeable
with that returned by the standard allocators. It may be used by other
spin-then-sleep lock to be replaced by a spin-then-queue based implementation.
This _reduces performance_ for the standard allocation functions, though
it usually improves performance when thread-local allocation is used heavily,
-and thus the number of short-duration lock acquisitions is greatly reduced.
+and, thus, the number of short-duration lock acquisitions is greatly reduced.
## The Parallel Marking Algorithm
the other, but not both.
The number of marker threads is set on startup to the number of available
-processors (or to the value of the `GC_NPROCS` environment variable). If only
-a single processor is detected, parallel marking is disabled.
+processor cores (or to the value of either `GC_MARKERS` or `GC_NPROCS`
+environment variable, if provided). If only a single processor is detected,
+parallel marking is disabled.
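The selection of the marker count could be sketched as below. The precedence of `GC_MARKERS` over `GC_NPROCS`, the accepted value range and all function names are assumptions made for illustration. The environment values are passed in as strings so that the logic can be exercised without touching the process environment.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a positive decimal count; returns 0 if unset or invalid. */
static int sketch_parse_count(const char *s)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return 0;
    v = strtol(s, &end, 10);
    if (*end != '\0' || v <= 0 || v > 1024)
        return 0;
    return (int)v;
}

/* Pick the marker count from the (possibly NULL) environment values, */
/* falling back to the number of detected processor cores.            */
static int sketch_marker_count(const char *gc_markers,
                               const char *gc_nprocs, int detected)
{
    int n;

    assert(detected >= 1);
    n = sketch_parse_count(gc_markers);
    if (n == 0)
        n = sketch_parse_count(gc_nprocs);
    if (n == 0)
        n = detected;
    return n;
}

/* Parallel marking only makes sense with more than one marker. */
static int sketch_parallel_mark_enabled(int markers)
{
    return markers > 1;
}
```

A caller would typically pass `getenv("GC_MARKERS")` and `getenv("GC_NPROCS")` as the first two arguments.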
Note that setting `GC_NPROCS` to 1 also causes some lock acquisitions inside
the collector to immediately yield the processor instead of busy waiting
time increased to 10.3 seconds, or 23.5 elapsed seconds with two clients. (The
times for the `malloc`/`free` version with glibc `malloc` are 10.51 (standard
library, pthreads not linked), 20.90 (one thread, pthreads linked), and 24.55
-seconds respectively. The benchmark favors a garbage collector, since most
+seconds, respectively. The benchmark favors a garbage collector, since most
objects are small.)
The following table gives execution times for the collector built with
kind of hardware even with such a small number of processors, since the memory
system is a major constraint for the garbage collector, the processors usually
share a single memory bus, and thus the aggregate memory bandwidth does not
-increase in proportion to the number of processors.
+increase in proportion to the number of processors (cores).
These results are likely to be very sensitive to both hardware and OS issues.
Preliminary experiments with an older Pentium Pro machine running an older