Following are change highlights associated with official releases.  Important
bug fixes are all mentioned, but some internal enhancements are omitted here for
brevity.  Much more detail can be found in the git revision history:

    https://github.com/jemalloc/jemalloc

* 5.3.0 (May 6, 2022)

  This release contains many speed and space optimizations, from micro
  optimizations on common paths to rework of internal data structures and
  locking schemes, and many more too detailed to list below.  Improvements of
  several percent in system-level metrics were measured in tested production
  workloads.  The release has gone through large-scale production testing.

  New features:
  - Add the thread.idle mallctl which hints that the calling thread will be
    idle for a nontrivial period of time.  (@davidtgoldblatt)
  - Allow small size classes to be the maximum size class to cache in the
    thread-specific cache, through the opt.[lg_]tcache_max option.  (@interwq,
    @jordalgo)
  - Make the behavior of realloc(ptr, 0) configurable with opt.zero_realloc.
    (@davidtgoldblatt)
  - Add 'make uninstall' support.  (@sangshuduo, @Lapenkov)
  - Support C++17 over-aligned allocation.  (@marksantaniello)
  - Add the thread.peak mallctl for approximate per-thread peak memory tracking.
    (@davidtgoldblatt)
  - Add interval-based stats output opt.stats_interval.  (@interwq)
  - Add prof.prefix to override filename prefixes for dumps.  (@zhxchen17)
  - Add high resolution timestamp support for profiling.  (@tyroguru)
  - Add the --collapsed flag to jeprof for flamegraph generation.
    (@igorwwwwwwwwwwwwwwwwwwww)
  - Add the --debug-syms-by-id option to jeprof for debug symbols discovery.
    (@DeannaGelbart)
  - Add the opt.prof_leak_error option to exit with an error code when a leak
    is detected using opt.prof_final.  (@yunxuo)
  - Add opt.cache_oblivious as a runtime alternative to config.cache_oblivious.
    (@interwq)
  - Add mallctl interfaces:
    + opt.zero_realloc  (@davidtgoldblatt)
    + opt.cache_oblivious  (@interwq)
    + opt.prof_leak_error  (@yunxuo)
    + opt.stats_interval  (@interwq)
    + opt.stats_interval_opts  (@interwq)
    + opt.tcache_max  (@interwq)
    + opt.trust_madvise  (@azat)
    + prof.prefix  (@zhxchen17)
    + stats.zero_reallocs  (@davidtgoldblatt)
    + thread.idle  (@davidtgoldblatt)
    + thread.peak.{read,reset}  (@davidtgoldblatt)
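
  The realloc(ptr, 0) policies selectable through opt.zero_realloc can be
  pictured with a small wrapper.  This is an illustrative sketch in plain C,
  not jemalloc code: it assumes the option accepts the values "alloc", "free"
  and "abort" as described in the jemalloc manual, and the enum and function
  names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names that mirror the assumed option values. */
typedef enum {
    ZERO_REALLOC_ALLOC, /* treat size 0 as a minimal allocation */
    ZERO_REALLOC_FREE,  /* treat size 0 as free(ptr) and return NULL */
    ZERO_REALLOC_ABORT  /* treat size 0 as a programming error */
} zero_realloc_policy_t;

void *
realloc_with_policy(void *ptr, size_t size, zero_realloc_policy_t policy) {
    if (size != 0) {
        /* Non-zero sizes are unaffected by the policy. */
        return realloc(ptr, size);
    }
    switch (policy) {
    case ZERO_REALLOC_ALLOC:
        /* Shrink to the smallest usable size; non-NULL unless out of memory. */
        return realloc(ptr, 1);
    case ZERO_REALLOC_FREE:
        free(ptr);
        return NULL;
    case ZERO_REALLOC_ABORT:
    default:
        assert(policy == ZERO_REALLOC_ABORT);
        abort();
    }
}
```

  With the "free" policy a NULL return does not indicate failure, so callers
  that test the result of realloc() for errors need to special-case size 0.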

  Bug fixes:
  - Fix the synchronization around explicit tcache creation which could cause
    invalid tcache identifiers.  This regression was first released in 5.0.0.
    (@yoshinorim, @davidtgoldblatt)
  - Fix a profiling biasing issue which could cause incorrect heap usage and
    object counts.  This issue existed in all previous releases with the heap
    profiling feature.  (@davidtgoldblatt)
  - Fix the order of stats counter updating on large realloc which could cause
    failed assertions.  This regression was first released in 5.0.0.  (@azat)
  - Fix the locking on the arena destroy mallctl, which could cause concurrent
    arena creations to fail.  This functionality was first introduced in 5.0.0.
    (@interwq)

  Portability improvements:
  - Remove nothrow from system function declarations on macOS and FreeBSD.
    (@davidtgoldblatt, @fredemmott, @leres)
  - Improve overcommit and page alignment settings on NetBSD.  (@zoulasc)
  - Improve CPU affinity support on BSD platforms.  (@devnexen)
  - Improve utrace detection and support.  (@devnexen)
  - Improve QEMU support with MADV_DONTNEED zeroed pages detection.  (@azat)
  - Add memcntl support on Solaris / illumos.  (@devnexen)
  - Improve CPU_SPINWAIT on ARM.  (@AWSjswinney)
  - Improve TSD cleanup on FreeBSD.  (@Lapenkov)
  - Disable percpu_arena if the CPU count cannot be reliably detected.  (@azat)
  - Add malloc_size(3) override support.  (@devnexen)
  - Add mmap VM_MAKE_TAG support.  (@devnexen)
  - Add support for MADV_[NO]CORE.  (@devnexen)
  - Add support for DragonFlyBSD.  (@devnexen)
  - Fix the QUANTUM setting on MIPS64.  (@brooksdavis)
  - Add the QUANTUM setting for ARC.  (@vineetgarc)
  - Add the QUANTUM setting for LoongArch.  (@wangjl-uos)
  - Add QNX support.  (@jqian-aurora)
  - Avoid atexit(3) calls unless the relevant profiling features are enabled.
    (@BusyJay, @laiwei-rice, @interwq)
  - Fix unknown option detection when using Clang.  (@Lapenkov)
  - Fix symbol conflict with musl libc.  (@georgthegreat)
  - Add -Wimplicit-fallthrough checks.  (@nickdesaulniers)
  - Add __forceinline support on MSVC.  (@santagada)
  - Improve FreeBSD and Windows CI support.  (@Lapenkov)
  - Add CI support for PPC64LE architecture.  (@ezeeyahoo)

  Incompatible changes:
  - Maximum size class allowed in tcache (opt.[lg_]tcache_max) now has an upper
    bound of 8MiB.  (@interwq)

  Optimizations and refactors (@davidtgoldblatt, @Lapenkov, @interwq):
  - Optimize the common cases of the thread cache operations.
  - Optimize internal data structures, including RB tree and pairing heap.
  - Optimize the internal locking on extent management.
  - Extract and refactor the internal page allocator and interface modules.

  Documentation:
  - Fix doc build with --with-install-suffix.  (@lawmurray, @interwq)
  - Add PROFILING_INTERNALS.md.  (@davidtgoldblatt)
  - Ensure the proper order of doc building and installation.  (@Mingli-Yu)

* 5.2.1 (August 5, 2019)

  This release is primarily about Windows.  A critical virtual memory leak is
  resolved on all Windows platforms.  The regression was present in all releases
  since 5.0.0.

  Bug fixes:
  - Fix a severe virtual memory leak on Windows.  This regression was first
    released in 5.0.0.  (@Ignition, @j0t, @frederik-h, @davidtgoldblatt,
    @interwq)
  - Fix size 0 handling in posix_memalign().  This regression was first released
    in 5.2.0.  (@interwq)
  - Fix the prof_log unit test which may observe unexpected backtraces from
    compiler optimizations.  The test was first added in 5.2.0.  (@marxin,
    @gnzlbg, @interwq)
  - Fix the declaration of the extent_avail tree.  This regression was first
    released in 5.1.0.  (@zoulasc)
  - Fix an incorrect reference in jeprof.  This functionality was first released
    in 3.0.0.  (@prehistoric-penguin)
  - Fix an assertion on the deallocation fast-path.  This regression was first
    released in 5.2.0.  (@yinan1048576)
  - Fix the TLS_MODEL attribute in headers.  This regression was first released
    in 5.0.0.  (@zoulasc, @interwq)

  Optimizations and refactors:
  - Implement opt.retain on Windows and enable by default on 64-bit.  (@interwq,
    @davidtgoldblatt)
  - Optimize away a branch on the operator delete[] path.  (@mgrice)
  - Add format annotation to the format generator function.  (@zoulasc)
  - Refactor and improve the size class header generation.  (@yinan1048576)
  - Remove best fit.  (@djwatson)
  - Avoid blocking on background thread locks for stats.  (@oranagra, @interwq)

* 5.2.0 (April 2, 2019)

  This release includes a few notable improvements, which are summarized below:
  1) improved fast-path performance from the optimizations by @djwatson; 2)
  reduced virtual memory fragmentation and metadata usage; and 3) bug fixes on
  setting the number of background threads.  In addition, peak / spike memory
  usage is improved with certain allocation patterns.  As usual, the release and
  prior dev versions have gone through large-scale production testing.

  New features:
  - Implement oversize_threshold, which uses a dedicated arena for allocations
    crossing the specified threshold to reduce fragmentation.  (@interwq)
  - Add extents usage information to stats.  (@tyleretzel)
  - Log time information for sampled allocations.  (@tyleretzel)
  - Support 0 size in sdallocx.  (@djwatson)
  - Output rate for certain counters in malloc_stats.  (@zinoale)
  - Add configure option --enable-readlinkat, which allows the use of readlinkat
    over readlink.  (@davidtgoldblatt)
  - Add configure options --{enable,disable}-{static,shared} to allow not
    building unwanted libraries.  (@Ericson2314)
  - Add configure option --disable-libdl to enable fully static builds.
    (@interwq)
  - Add mallctl interfaces:
    + opt.oversize_threshold  (@interwq)
    + stats.arenas.<i>.extent_avail  (@tyleretzel)
    + stats.arenas.<i>.extents.<j>.n{dirty,muzzy,retained}  (@tyleretzel)
    + stats.arenas.<i>.extents.<j>.{dirty,muzzy,retained}_bytes
      (@tyleretzel)

  Portability improvements:
  - Update MSVC builds.  (@maksqwe, @rustyx)
  - Work around a compiler optimizer bug on s390x.  (@rkmisra)
  - Make use of pthread_set_name_np(3) on FreeBSD.  (@trasz)
  - Implement malloc_getcpu() to enable percpu_arena for Windows.  (@santagada)
  - Link against -pthread instead of -lpthread.  (@paravoid)
  - Make background_thread not dependent on libdl.  (@interwq)
  - Add stringify to fix a linker directive issue on MSVC.  (@daverigby)
  - Detect and fall back when 8-bit atomics are unavailable.  (@interwq)
  - Fall back to the default pthread_create if dlsym(3) fails.  (@interwq)

  Optimizations and refactors:
  - Refactor the TSD module.  (@davidtgoldblatt)
  - Avoid taking extents_muzzy mutex when muzzy is disabled.  (@interwq)
  - Avoid taking large_mtx for auto arenas on the tcache flush path.  (@interwq)
  - Optimize ixalloc by avoiding a size lookup.  (@interwq)
  - Implement opt.oversize_threshold which uses a dedicated arena for requests
    crossing the threshold, and eagerly purges the oversize extents.  The
    threshold defaults to 8 MiB.  (@interwq)
  - Clean compilation with -Wextra.  (@gnzlbg, @jasone)
  - Refactor the size class module.  (@davidtgoldblatt)
  - Refactor the stats emitter.  (@tyleretzel)
  - Optimize pow2_ceil.  (@rkmisra)
  - Avoid runtime detection of lazy purging on FreeBSD.  (@trasz)
  - Optimize mmap(2) alignment handling on FreeBSD.  (@trasz)
  - Improve error handling for THP state initialization.  (@jsteemann)
  - Rework the malloc() fast path.  (@djwatson)
  - Rework the free() fast path.  (@djwatson)
  - Refactor and optimize the tcache fill / flush paths.  (@djwatson)
  - Optimize sync / lwsync on PowerPC.  (@chmeeedalf)
  - Bypass extent_dalloc() when retain is enabled.  (@interwq)
  - Optimize the locking on large deallocation.  (@interwq)
  - Reduce the number of pages committed from sanity checking in debug build.
    (@trasz, @interwq)
  - Deprecate OSSpinLock.  (@interwq)
  - Lower the default number of background threads to 4 (when the feature
    is enabled).  (@interwq)
  - Optimize the trylock spin wait.  (@djwatson)
  - Use arena index for arena-matching checks.  (@interwq)
  - Avoid forced decay on thread termination when using background threads.
    (@interwq)
  - Disable muzzy decay by default.  (@djwatson, @interwq)
  - Only initialize libgcc unwinder when profiling is enabled.  (@paravoid,
    @interwq)
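
  For readers unfamiliar with the pow2_ceil operation mentioned above, it
  rounds a value up to the nearest power of two.  The entry does not say which
  technique the optimization used; the sketch below shows only the classic
  branch-free bit-smearing form, under a hypothetical function name, and
  chooses to return 1 for an input of 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest power of two that is >= x. */
uint64_t
pow2_ceil_u64_sketch(uint64_t x) {
    /* Inputs above 2^63 have no representable answer in 64 bits. */
    assert(x <= (UINT64_C(1) << 63));
    if (x == 0) {
        return 1;
    }
    x--;            /* so that exact powers of two map to themselves */
    x |= x >> 1;    /* copy the highest set bit into every lower bit */
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    x |= x >> 32;
    x++;            /* all-ones below the top bit, plus one */
    return x;
}
```

  An alternative on compilers that provide a count-leading-zeros builtin is to
  compute the shift directly; the smearing form is shown here because it is
  portable C.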

  Bug fixes (all only relevant to jemalloc 5.x):
  - Fix background thread index issues with max_background_threads.  (@djwatson,
    @interwq)
  - Fix stats output for opt.lg_extent_max_active_fit.  (@interwq)
  - Fix opt.prof_prefix initialization.  (@davidtgoldblatt)
  - Properly trigger decay on tcache destroy.  (@interwq, @amosbird)
  - Fix tcache.flush.  (@interwq)
  - Detect whether explicit extent zero out is necessary with huge pages or
    custom extent hooks, which may change the purge semantics.  (@interwq)
  - Fix a side effect caused by extent_max_active_fit combined with decay-based
    purging, where freed extents can accumulate and not be reused for an
    extended period of time.  (@interwq, @mpghf)
  - Fix a missing unlock on extent register error handling.  (@zoulasc)

  Testing:
  - Simplify the Travis script output.  (@gnzlbg)
  - Update the test scripts for FreeBSD.  (@devnexen)
  - Add unit tests for the producer-consumer pattern.  (@interwq)
  - Add Cirrus-CI config for FreeBSD builds.  (@jasone)
  - Add size-matching sanity checks on tcache flush.  (@davidtgoldblatt,
    @interwq)

  Incompatible changes:
  - Remove --with-lg-page-sizes.  (@davidtgoldblatt)

  Documentation:
  - Attempt to build docs by default, but skip doc building when xsltproc is
    missing.  (@interwq, @cmuellner)

* 5.1.0 (May 4, 2018)

  This release is primarily about fine-tuning, ranging from several new features
  to numerous notable performance and portability enhancements.  The release and
  prior dev versions have been running in multiple large scale applications for
  months, and the cumulative improvements are substantial in many cases.

  Given the long and successful production runs, this release is likely a good
  candidate for applications to upgrade to, from both jemalloc 5.0 and earlier
  versions.  For performance-critical applications, the newly added TUNING.md
  provides guidelines on jemalloc tuning.

  New features:
  - Implement transparent huge page support for internal metadata.  (@interwq)
  - Add opt.thp to allow enabling / disabling transparent huge pages for all
    mappings.  (@interwq)
  - Add maximum background thread count option.  (@djwatson)
  - Allow prof_active to control opt.lg_prof_interval and prof.gdump.
    (@interwq)
  - Allow arena index lookup based on allocation addresses via mallctl.
    (@lionkov)
  - Allow disabling initial-exec TLS model.  (@davidtgoldblatt, @KenMacD)
  - Add opt.lg_extent_max_active_fit to set the max ratio between the size of
    the active extent selected (to split off from) and the size of the requested
    allocation.  (@interwq, @davidtgoldblatt)
  - Add retain_grow_limit to set the max size when growing virtual address
    space.  (@interwq)
  - Add mallctl interfaces:
    + arena.<i>.retain_grow_limit  (@interwq)
    + arenas.lookup  (@lionkov)
    + max_background_threads  (@djwatson)
    + opt.lg_extent_max_active_fit  (@interwq)
    + opt.max_background_threads  (@djwatson)
    + opt.metadata_thp  (@interwq)
    + opt.thp  (@interwq)
    + stats.metadata_thp  (@interwq)

  Portability improvements:
  - Support GNU/kFreeBSD configuration.  (@paravoid)
  - Support m68k, nios2 and SH3 architectures.  (@paravoid)
  - Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable.  (@zonyitoo)
  - Fix symbol listing for cross-compiling.  (@tamird)
  - Fix high bits computation on ARM.  (@davidtgoldblatt, @paravoid)
  - Disable the CPU_SPINWAIT macro for Power.  (@davidtgoldblatt, @marxin)
  - Fix MSVC 2015 & 2017 builds.  (@rustyx)
  - Improve RISC-V support.  (@EdSchouten)
  - Set name mangling script in strict mode.  (@nicolov)
  - Avoid MADV_HUGEPAGE on ARM.  (@marxin)
  - Modify configure to determine return value of strerror_r.
    (@davidtgoldblatt, @cferris1000)
  - Make sure CXXFLAGS is tested with CPP compiler.  (@nehaljwani)
  - Fix 32-bit build on MSVC.  (@rustyx)
  - Fix external symbol on MSVC.  (@maksqwe)
  - Avoid a printf format specifier warning.  (@jasone)
  - Add configure option --disable-initial-exec-tls which can allow jemalloc to
    be dynamically loaded after program startup.  (@davidtgoldblatt, @KenMacD)
  - AArch64: Add ILP32 support.  (@cmuellner)
  - Add --with-lg-vaddr configure option to support cross compiling.
    (@cmuellner, @davidtgoldblatt)
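
  The FD_CLOEXEC fallback listed above follows a common POSIX pattern, sketched
  here.  This is not jemalloc's code; the function names are hypothetical, and
  the sketch assumes the caller never passes O_CREAT (no mode argument is
  forwarded).

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Set the close-on-exec flag on an already open descriptor. */
int
set_cloexec_sketch(int fd) {
    int fdflags = fcntl(fd, F_GETFD);
    if (fdflags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC);
}

/* Open a file so that the descriptor is not inherited across exec. */
int
open_cloexec_sketch(const char *path, int flags) {
    assert((flags & O_CREAT) == 0);
#ifdef O_CLOEXEC
    /* Preferred: the flag is applied atomically by open(2). */
    return open(path, flags | O_CLOEXEC);
#else
    /* Fallback: two steps, so a concurrent fork+exec can still leak fd. */
    int fd = open(path, flags);
    if (fd == -1) {
        return -1;
    }
    if (set_cloexec_sketch(fd) == -1) {
        close(fd);
        return -1;
    }
    return fd;
#endif
}
```

  The fallback path is not atomic, which is why O_CLOEXEC is used whenever the
  platform defines it.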

  Optimizations and refactors:
  - Improve active extent fit with extent_max_active_fit.  This considerably
    reduces fragmentation over time and improves virtual memory and metadata
    usage.  (@davidtgoldblatt, @interwq)
  - Eagerly coalesce large extents to reduce fragmentation.  (@interwq)
  - sdallocx: only read size info when page aligned (i.e. possibly sampled),
    which speeds up the sized deallocation path significantly.  (@interwq)
  - Avoid attempting new mappings for in place expansion with retain, since
    it rarely succeeds in practice and causes high overhead.  (@interwq)
  - Refactor OOM handling in newImpl.  (@wqfish)
  - Add internal fine-grained logging functionality for debugging use.
    (@davidtgoldblatt)
  - Refactor arena / tcache interactions.  (@davidtgoldblatt)
  - Refactor extent management with dumpable flag.  (@davidtgoldblatt)
  - Add runtime detection of lazy purging.  (@interwq)
  - Use pairing heap instead of red-black tree for extents_avail.  (@djwatson)
  - Use sysctl on startup in FreeBSD.  (@trasz)
  - Use thread local prng state instead of atomic.  (@djwatson)
  - Make decay always purge one more extent than before, because in practice
    large extents are usually the ones that cross the decay threshold.
    Purging the additional extent helps save memory as well as reduce VM
    fragmentation.  (@interwq)
  - Fast division by dynamic values.  (@davidtgoldblatt)
  - Improve the fit for aligned allocation.  (@interwq, @edwinsmith)
  - Refactor extent_t bitpacking.  (@rkmisra)
  - Optimize the generated assembly for ticker operations.  (@davidtgoldblatt)
  - Convert stats printing to use a structured text emitter.  (@davidtgoldblatt)
  - Remove preserve_lru feature for extents management.  (@djwatson)
  - Consolidate two memory loads into one on the fast deallocation path.
    (@davidtgoldblatt, @interwq)
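
  "Fast division by dynamic values" generally means replacing a hardware divide
  with a multiplication by a precomputed reciprocal.  The entry gives no
  details, so the following is only a sketch of the standard technique, not
  necessarily what jemalloc does.  All names are hypothetical, and the sketch
  assumes a divisor of at least 2, a dividend below 2^32, and a dividend that
  is an exact multiple of the divisor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: holds ceil(2^32 / d) for a divisor d chosen at runtime. */
typedef struct {
    uint32_t magic;
} div_info_sketch_t;

void
div_init_sketch(div_info_sketch_t *info, size_t d) {
    /* d == 1 would need magic == 2^32, which does not fit in 32 bits. */
    assert(d >= 2 && d <= UINT32_MAX);
    uint64_t two_to_32 = UINT64_C(1) << 32;
    uint64_t magic = two_to_32 / d;
    if (two_to_32 % d != 0) {
        magic++; /* round up so that exact multiples never undershoot */
    }
    info->magic = (uint32_t)magic;
}

/* Returns n / d, provided n is a multiple of d and n < 2^32. */
size_t
div_compute_sketch(const div_info_sketch_t *info, size_t n) {
    assert(n <= UINT32_MAX);
    return (size_t)(((uint64_t)n * info->magic) >> 32);
}
```

  The precondition matters: with n = q * d the rounding error added by the
  rounded-up reciprocal is below q * d = n < 2^32, so it vanishes in the
  shift.  For arbitrary n the result is not guaranteed to equal n / d.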

  Bug fixes (most of the issues are only relevant to jemalloc 5.0):
  - Fix deadlock with multithreaded fork in OS X.  (@davidtgoldblatt)
  - Validate returned file descriptor before use.  (@zonyitoo)
  - Fix a few background thread initialization and shutdown issues.  (@interwq)
  - Fix an extent coalesce + decay race by taking both coalescing extents off
    the LRU list.  (@interwq)
  - Fix a potentially unbounded increase during decay, caused by one thread
    continuing to stash memory to purge while other threads generate new pages.
    The number of pages to purge is checked to prevent this.  (@interwq)
  - Fix a FreeBSD bootstrap assertion.  (@strejda, @interwq)
  - Handle 32 bit mutex counters.  (@rkmisra)
  - Fix an indexing bug when creating background threads.  (@davidtgoldblatt,
    @binliu19)
  - Fix arguments passed to extent_init.  (@yuleniwo, @interwq)
  - Fix addresses used for ordering mutexes.  (@rkmisra)
  - Fix abort_conf processing during bootstrap.  (@interwq)
  - Fix include path order for out-of-tree builds.  (@cmuellner)

  Incompatible changes:
  - Remove --disable-thp.  (@interwq)
  - Remove mallctl interfaces:
    + config.thp  (@interwq)

  Documentation:
  - Add TUNING.md.  (@interwq, @davidtgoldblatt, @djwatson)

* 5.0.1 (July 1, 2017)

  This bugfix release fixes several issues, most of which are obscure enough
  that typical applications are not impacted.

  Bug fixes:
  - Update decay->nunpurged before purging, in order to avoid potential update
    races and subsequent incorrect purging volume.  (@interwq)
  - Only abort on dlsym(3) error if the failure impacts an enabled feature (lazy
    locking and/or background threads).  This mitigates an initialization
    failure bug for which we still do not have a clear reproduction test case.
    (@interwq)
  - Modify tsd management so that it neither crashes nor leaks if a thread's
    only allocation activity is to call free() after TLS destructors have been
    executed.  This behavior was observed when operating with GNU libc, and is
    unlikely to be an issue with other libc implementations.  (@interwq)
  - Mask signals during background thread creation.  This prevents signals from
    being inadvertently delivered to background threads.  (@jasone,
    @davidtgoldblatt, @interwq)
  - Avoid inactivity checks within background threads, in order to prevent
    recursive mutex acquisition.  (@interwq)
  - Fix extent_grow_retained() to use the specified hooks when the
    arena.<i>.extent_hooks mallctl is used to override the default hooks.
    (@interwq)
  - Add missing reentrancy support for custom extent hooks which allocate.
    (@interwq)
  - Post-fork(2), re-initialize the list of tcaches associated with each arena
    to contain no tcaches except the forking thread's.  (@interwq)
  - Add missing post-fork(2) mutex reinitialization for extent_grow_mtx.  This
    fixes potential deadlocks after fork(2).  (@interwq)
  - Enforce minimum autoconf version (currently 2.68), since 2.63 is known to
    generate corrupt configure scripts.  (@jasone)
  - Ensure that the configured page size (--with-lg-page) is no larger than the
    configured huge page size (--with-lg-hugepage).  (@jasone)
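
  Masking signals around thread creation, as in the background thread fix
  above, is usually done by blocking every signal in the creating thread,
  creating the new thread (which inherits that mask), and then restoring the
  creator's mask.  The sketch below illustrates that general POSIX pattern; it
  is not jemalloc's implementation and both function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Create a thread that starts with all signals blocked. */
int
create_thread_signals_masked(pthread_t *thr, void *(*fn)(void *), void *arg) {
    sigset_t all, old;
    assert(thr != NULL && fn != NULL);
    sigfillset(&all);
    int err = pthread_sigmask(SIG_SETMASK, &all, &old);
    if (err != 0) {
        return err;
    }
    /* The new thread inherits the creator's (fully blocked) mask. */
    err = pthread_create(thr, NULL, fn, arg);
    /* Restore the creator's mask whether or not creation succeeded. */
    int restore_err = pthread_sigmask(SIG_SETMASK, &old, NULL);
    return err != 0 ? err : restore_err;
}

/* Example thread body: record whether SIGUSR1 is blocked in this thread. */
void *
probe_sigusr1_blocked(void *arg) {
    int *blocked = arg;
    sigset_t cur;
    if (pthread_sigmask(SIG_SETMASK, NULL, &cur) == 0) {
        *blocked = sigismember(&cur, SIGUSR1);
    }
    return NULL;
}
```

  Signals that cannot be blocked (SIGKILL, SIGSTOP) are unaffected; the system
  silently ignores attempts to mask them.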
3948b2f5aafSJason Evans
395b7eaed25SJason Evans* 5.0.0 (June 13, 2017)
396b7eaed25SJason Evans
  Unlike all previous jemalloc releases, this release does not use naturally
  aligned "chunks" for virtual memory management, and instead uses page-aligned
  "extents".  This change has few externally visible effects, but the internal
  impacts are... extensive.  Many other internal changes combine to make this
  the most cohesively designed version of jemalloc so far, with ample
  opportunity for further enhancements.

  Continuous integration is now an integral aspect of development thanks to the
  efforts of @davidtgoldblatt, and the dev branch tends to remain reasonably
  stable on the tested platforms (Linux, FreeBSD, macOS, and Windows).  As a
  side effect the official release frequency may decrease over time.

  New features:
  - Implement optional per-CPU arena support; threads choose which arena to use
    based on current CPU rather than on fixed thread-->arena associations.
    (@interwq)
  - Implement two-phase decay of unused dirty pages.  Pages transition from
    dirty-->muzzy-->clean, where the first phase transition relies on
    madvise(... MADV_FREE) semantics, and the second phase transition discards
    pages such that they are replaced with demand-zeroed pages on next access.
    (@jasone)
  - Increase decay time resolution from seconds to milliseconds.  (@jasone)
  - Implement opt-in per CPU background threads, and use them for asynchronous
    decay-driven unused dirty page purging.  (@interwq)
  - Add mutex profiling, which collects a variety of statistics useful for
    diagnosing overhead/contention issues.  (@interwq)
  - Add C++ new/delete operator bindings.  (@djwatson)
  - Support manually created arena destruction, such that all data and metadata
    are discarded.  Add MALLCTL_ARENAS_DESTROYED for accessing merged stats
    associated with destroyed arenas.  (@jasone)
  - Add MALLCTL_ARENAS_ALL as a fixed index for use in accessing
    merged/destroyed arena statistics via mallctl.  (@jasone)
  - Add opt.abort_conf to optionally abort if invalid configuration options are
    detected during initialization.  (@interwq)
  - Add opt.stats_print_opts, so that e.g. JSON output can be selected for the
    stats dumped during exit if opt.stats_print is true.  (@jasone)
  - Add --with-version=VERSION for use when embedding jemalloc into another
    project's git repository.  (@jasone)
  - Add --disable-thp to support cross compiling.  (@jasone)
  - Add --with-lg-hugepage to support cross compiling.  (@jasone)
  - Add mallctl interfaces (various authors):
    + background_thread
    + opt.abort_conf
    + opt.retain
    + opt.percpu_arena
    + opt.background_thread
    + opt.{dirty,muzzy}_decay_ms
    + opt.stats_print_opts
    + arena.<i>.initialized
    + arena.<i>.destroy
    + arena.<i>.{dirty,muzzy}_decay_ms
    + arena.<i>.extent_hooks
    + arenas.{dirty,muzzy}_decay_ms
    + arenas.bin.<i>.slab_size
    + arenas.nlextents
    + arenas.lextent.<i>.size
    + arenas.create
    + stats.background_thread.{num_threads,num_runs,run_interval}
    + stats.mutexes.{ctl,background_thread,prof,reset}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.{dirty,muzzy}_decay_ms
    + stats.arenas.<i>.uptime
    + stats.arenas.<i>.{pmuzzy,base,internal,resident}
    + stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}
    + stats.arenas.<i>.bins.<j>.{nslabs,reslabs,curslabs}
    + stats.arenas.<i>.bins.<j>.mutex.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.lextents.<j>.{nmalloc,ndalloc,nrequests,curlextents}
    + stats.arenas.<i>.mutexes.{large,extent_avail,extents_dirty,extents_muzzy,
      extents_retained,decay_dirty,decay_muzzy,base,tcache_list}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
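
  The two-phase decay listed above leans on two madvise(2) behaviors.  The
  sketch below is not jemalloc code: purge_demo() is a hypothetical helper that
  dirties one anonymous page, hints MADV_FREE where the headers define it (the
  dirty-->muzzy step), and then forces the discard with MADV_DONTNEED (the
  muzzy-->clean step).  It assumes Linux, where a private anonymous page
  discarded with MADV_DONTNEED is demand-zeroed on next access; other systems
  may behave differently.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS and madvise() on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of jemalloc).  Returns the first byte of the
 * page as read back after purging, or -1 if a system call fails.
 */
int
purge_demo(void) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    memset(p, 0xa5, len); /* The page is now dirty. */

#ifdef MADV_FREE
    /* Lazy purge hint; older kernels reject it, so the result is ignored. */
    (void)madvise(p, len, MADV_FREE);
#endif

    /* Forced discard; assumed (Linux) to zero-fill on the next touch. */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int first = p[0];
    munmap(p, len);
    return first;
}
```

  Under the Linux assumption, purge_demo() returns 0 because the 0xa5 fill is
  gone once the page has been discarded.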

  Portability improvements:
  - Improve reentrant allocation support, such that deadlock is less likely if
    e.g. a system library call in turn allocates memory.  (@davidtgoldblatt,
    @interwq)
  - Support static linking of jemalloc with glibc.  (@djwatson)

  Optimizations and refactors:
  - Organize virtual memory as "extents" of virtual memory pages, rather than as
    naturally aligned "chunks", and store all metadata in arbitrarily distant
    locations.  This reduces virtual memory external fragmentation, and will
    interact better with huge pages (not yet explicitly supported).  (@jasone)
  - Fold large and huge size classes together; only small and large size classes
    remain.  (@jasone)
  - Unify the allocation paths, and merge most fast-path branching decisions.
    (@davidtgoldblatt, @interwq)
  - Embed per thread automatic tcache into thread-specific data, which reduces
    conditional branches and dereferences.  Also reorganize tcache to increase
    fast-path data locality.  (@interwq)
  - Rewrite atomics to closely model the C11 API, convert various
    synchronization from mutex-based to atomic, and use the explicit memory
    ordering control to resolve various hypothetical races without increasing
    synchronization overhead.  (@davidtgoldblatt)
  - Extensively optimize rtree via various methods:
    + Add multiple layers of rtree lookup caching, since rtree lookups are now
      part of fast-path deallocation.  (@interwq)
    + Determine rtree layout at compile time.  (@jasone)
    + Make the tree shallower for common configurations.  (@jasone)
    + Embed the root node in the top-level rtree data structure, thus avoiding
      one level of indirection.  (@jasone)
    + Further specialize leaf elements as compared to internal node elements,
      and directly embed extent metadata needed for fast-path deallocation.
      (@jasone)
    + Ignore leading always-zero address bits (architecture-specific).
      (@jasone)
  - Reorganize headers (ongoing work) to make them hermetic, and disentangle
    various module dependencies.  (@davidtgoldblatt)
  - Convert various internal data structures such as size class metadata from
    boot-time-initialized to compile-time-initialized.  Propagate resulting data
    structure simplifications, such as making arena metadata fixed-size.
    (@jasone)
  - Simplify size class lookups when constrained to size classes that are
    multiples of the page size.  This speeds lookups, but the primary benefit is
    complexity reduction in code that was the source of numerous regressions.
    (@jasone)
  - Lock individual extents when possible for localized extent operations,
    rather than relying on a top-level arena lock.  (@davidtgoldblatt, @jasone)
  - Use first fit layout policy instead of best fit, in order to improve
    packing.  (@jasone)
  - If munmap(2) is not in use, use an exponential series to grow each arena's
    virtual memory, so that the number of disjoint virtual memory mappings
    remains low.  (@jasone)
  - Implement per arena base allocators, so that arenas never share any virtual
    memory pages.  (@jasone)
  - Automatically generate private symbol name mangling macros.  (@jasone)
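
  To see why an exponential growth series keeps the number of disjoint mappings
  low, the following model counts mappings for a fixed-size and a geometric
  policy.  It is a sketch under stated assumptions, not jemalloc code:
  mappings_needed() is a hypothetical function, and plain doubling stands in
  for whatever series jemalloc actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model (not jemalloc code): count how many separate mappings
 * are needed to reach `target` bytes when the first mapping is `first` bytes
 * and each later one is `factor` times the previous (factor 1 = fixed size).
 */
unsigned
mappings_needed(uint64_t target, uint64_t first, unsigned factor) {
    assert(first > 0 && factor >= 1);
    uint64_t mapped = 0;
    uint64_t next = first;
    unsigned n = 0;

    while (mapped < target) {
        mapped += next;
        next *= factor;
        n++;
    }
    return n;
}
```

  Starting from 2 MiB mappings, reaching 1 GiB takes 512 fixed-size mappings
  (factor 1) but only 10 when each mapping doubles the previous one (factor 2).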

  Incompatible changes:
  - Replace chunk hooks with an expanded/normalized set of extent hooks.
    (@jasone)
  - Remove ratio-based purging.  (@jasone)
  - Remove --disable-tcache.  (@jasone)
  - Remove --disable-tls.  (@jasone)
  - Remove --enable-ivsalloc.  (@jasone)
  - Remove --with-lg-size-class-group.  (@jasone)
  - Remove --with-lg-tiny-min.  (@jasone)
  - Remove --disable-cc-silence.  (@jasone)
  - Remove --enable-code-coverage.  (@jasone)
  - Remove --disable-munmap (replaced by opt.retain).  (@jasone)
  - Remove Valgrind support.  (@jasone)
  - Remove quarantine support.  (@jasone)
  - Remove redzone support.  (@jasone)
  - Remove mallctl interfaces (various authors):
    + config.munmap
    + config.tcache
    + config.tls
    + config.valgrind
    + opt.lg_chunk
    + opt.purge
    + opt.lg_dirty_mult
    + opt.decay_time
    + opt.quarantine
    + opt.redzone
    + opt.thp
    + arena.<i>.lg_dirty_mult
    + arena.<i>.decay_time
    + arena.<i>.chunk_hooks
    + arenas.initialized
    + arenas.lg_dirty_mult
    + arenas.decay_time
    + arenas.bin.<i>.run_size
    + arenas.nlruns
    + arenas.lrun.<i>.size
    + arenas.nhchunks
    + arenas.hchunk.<i>.size
    + arenas.extend
    + stats.cactive
    + stats.arenas.<i>.lg_dirty_mult
    + stats.arenas.<i>.decay_time
    + stats.arenas.<i>.metadata.{mapped,allocated}
    + stats.arenas.<i>.{npurge,nmadvise,purged}
    + stats.arenas.<i>.huge.{allocated,nmalloc,ndalloc,nrequests}
    + stats.arenas.<i>.bins.<j>.{nruns,reruns,curruns}
    + stats.arenas.<i>.lruns.<j>.{nmalloc,ndalloc,nrequests,curruns}
    + stats.arenas.<i>.hchunks.<j>.{nmalloc,ndalloc,nrequests,curhchunks}

  Bug fixes:
  - Improve interval-based profile dump triggering to dump only one profile when
    a single allocation's size exceeds the interval.  (@jasone)
  - Use prefixed function names (as controlled by --with-jemalloc-prefix) when
    pruning backtrace frames in jeprof.  (@jasone)
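
  The interval-based dump fix above can be pictured with a small byte
  accumulator.  This is a minimal sketch, not the jemalloc implementation:
  dump_trigger_t and dump_trigger_add() are hypothetical names.  The point is
  that an allocation spanning several intervals folds into a single trigger
  rather than one trigger per interval crossed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accumulator; the names are illustrative, not jemalloc's. */
typedef struct {
    uint64_t interval; /* Bytes between dumps; must be nonzero. */
    uint64_t accum;    /* Bytes allocated since the last dump. */
} dump_trigger_t;

/*
 * Account for an allocation of `size` bytes.  Returns true if exactly one
 * dump should be triggered, no matter how many intervals `size` spans.
 */
bool
dump_trigger_add(dump_trigger_t *t, uint64_t size) {
    assert(t->interval > 0);
    t->accum += size;
    if (t->accum < t->interval) {
        return false;
    }
    /* Fold every crossed interval into a single trigger. */
    t->accum %= t->interval;
    return true;
}
```

  With an interval of 100 bytes, a single 350 byte allocation yields one
  trigger and leaves 50 bytes accumulated toward the next dump.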

* 4.5.0 (February 28, 2017)

  This is the first release to benefit from much broader continuous integration
  testing, thanks to @davidtgoldblatt.  Had we had this testing infrastructure
  in place for prior releases, it would have caught all of the most serious
  regressions fixed by this release.

  New features:
  - Add --disable-thp and the opt.thp mallctl to provide opt-out mechanisms for
    transparent huge page integration.  (@jasone)
  - Update zone allocator integration to work with macOS 10.12.  (@glandium)
  - Restructure *CFLAGS configuration, so that CFLAGS behaves typically, and
    EXTRA_CFLAGS provides a way to specify e.g. -Werror during building, but not
    during configuration.  (@jasone, @ronawho)

  Bug fixes:
  - Fix DSS (sbrk(2)-based) allocation.  This regression was first released in
    4.3.0.  (@jasone)
  - Handle race in per size class utilization computation.  This functionality
    was first released in 4.0.0.  (@interwq)
  - Fix lock order reversal during gdump.  (@jasone)
  - Fix/refactor tcache synchronization.  This regression was first released in
    4.0.0.  (@jasone)
  - Fix various JSON-formatted malloc_stats_print() bugs.  This functionality
    was first released in 4.3.0.  (@jasone)
  - Fix huge-aligned allocation.  This regression was first released in 4.4.0.
    (@jasone)
  - When transparent huge page integration is enabled, detect what state pages
    start in according to the kernel's current operating mode, and only convert
    arena chunks to non-huge during purging if that is not their initial state.
    This functionality was first released in 4.4.0.  (@jasone)
  - Fix lg_chunk clamping for the --enable-cache-oblivious --disable-fill case.
    This regression was first released in 4.0.0.  (@jasone, @428desmo)
  - Properly detect sparc64 when building for Linux.  (@glaubitz)

* 4.4.0 (December 3, 2016)

  New features:
  - Add configure support for *-*-linux-android.  (@cferris1000, @jasone)
  - Add the --disable-syscall configure option, for use on systems that place
    security-motivated limitations on syscall(2).  (@jasone)
  - Add support for Debian GNU/kFreeBSD.  (@thesam)

  Optimizations:
  - Add extent serial numbers and use them where appropriate as a sort key that
    is higher priority than address, so that the allocation policy prefers older
    extents.  This tends to improve locality (decrease fragmentation) when
    memory grows downward.  (@jasone)
  - Refactor madvise(2) configuration so that MADV_FREE is detected and utilized
    on Linux 4.5 and newer.  (@jasone)
  - Mark partially purged arena chunks as non-huge-page.  This improves
    interaction with Linux's transparent huge page functionality.  (@jasone)
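
  The serial number ordering above amounts to a two-level sort key.  A minimal
  sketch, assuming a smaller serial number means an older extent;
  extent_key_t and extent_key_cmp() are hypothetical names, not jemalloc's
  data structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical extent record; field names are illustrative only. */
typedef struct {
    uint64_t sn;    /* Serial number; assumed smaller means older. */
    uintptr_t addr; /* Base address. */
} extent_key_t;

/* qsort(3) comparator: serial number first, address as the tie-breaker. */
int
extent_key_cmp(const void *a, const void *b) {
    assert(a != NULL && b != NULL);
    const extent_key_t *x = a;
    const extent_key_t *y = b;

    if (x->sn != y->sn) {
        return (x->sn > y->sn) - (x->sn < y->sn);
    }
    return (x->addr > y->addr) - (x->addr < y->addr);
}
```

  Sorting with qsort(v, n, sizeof(v[0]), extent_key_cmp) places the oldest
  extent first, and address only decides between extents whose serial numbers
  are equal.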

  Bug fixes:
  - Fix size class computations for edge conditions involving extremely large
    allocations.  This regression was first released in 4.0.0.  (@jasone,
    @ingvarha)
  - Remove overly restrictive assertions related to the cactive statistic.  This
    regression was first released in 4.1.0.  (@jasone)
  - Implement a more reliable detection scheme for os_unfair_lock on macOS.
    (@jszakmeister)

* 4.3.1 (November 7, 2016)

  Bug fixes:
  - Fix a severe virtual memory leak.  This regression was first released in
    4.3.0.  (@interwq, @jasone)
  - Refactor atomic and prng APIs to restore support for 32-bit platforms that
    use pre-C11 toolchains, e.g. FreeBSD's mips.  (@jasone)

* 4.3.0 (November 4, 2016)

  This is the first release that passes the test suite for multiple Windows
  configurations, thanks in large part to @glandium setting up continuous
  integration via AppVeyor (and Travis CI for Linux and OS X).

  New features:
  - Add "J" (JSON) support to malloc_stats_print().  (@jasone)
  - Add Cray compiler support.  (@ronawho)

  Optimizations:
  - Add/use adaptive spinning for bootstrapping and radix tree node
    initialization.  (@jasone)

  Bug fixes:
  - Fix large allocation to search starting in the optimal size class heap,
    which can substantially reduce virtual memory churn and fragmentation.  This
    regression was first released in 4.0.0.  (@mjp41, @jasone)
  - Fix stats.arenas.<i>.nthreads accounting.  (@interwq)
  - Fix and simplify decay-based purging.  (@jasone)
  - Make DSS (sbrk(2)-related) operations lockless, which resolves potential
    deadlocks during thread exit.  (@jasone)
  - Fix over-sized allocation of radix tree leaf nodes.  (@mjp41, @ogaun,
    @jasone)
  - Fix over-sized allocation of arena_t (plus associated stats) data
    structures.  (@jasone, @interwq)
  - Fix EXTRA_CFLAGS to not affect configuration.  (@jasone)
  - Fix a Valgrind integration bug.  (@ronawho)
  - Disallow 0x5a junk filling when running in Valgrind.  (@jasone)
  - Fix a file descriptor leak on Linux.  This regression was first released in
    4.2.0.  (@vsarunas, @jasone)
  - Fix static linking of jemalloc with glibc.  (@djwatson)
  - Use syscall(2) rather than {open,read,close}(2) during boot on Linux.  This
    works around other libraries' system call wrappers performing reentrant
    allocation.  (@kspinka, @Whissi, @jasone)
  - Fix OS X default zone replacement to work with OS X 10.12.  (@glandium,
    @jasone)
  - Fix cached memory management to avoid needless commit/decommit operations
    during purging, which resolves permanent virtual memory map fragmentation
    issues on Windows.  (@mjp41, @jasone)
  - Fix TSD fetches to avoid (recursive) allocation.  This is relevant to
    non-TLS and Windows configurations.  (@jasone)
  - Fix malloc_conf overriding to work on Windows.  (@jasone)
  - Forcibly disable lazy-lock on Windows (was forcibly *enabled*).  (@jasone)

* 4.2.1 (June 8, 2016)

  Bug fixes:
  - Fix bootstrapping issues for configurations that require allocation during
    tsd initialization (e.g. --disable-tls).  (@cferris1000, @jasone)
  - Fix gettimeofday() version of nstime_update().  (@ronawho)
  - Fix Valgrind regressions in calloc() and chunk_alloc_wrapper().  (@ronawho)
  - Fix potential VM map fragmentation regression.  (@jasone)
  - Fix opt_zero-triggered in-place huge reallocation zeroing.  (@jasone)
  - Fix heap profiling context leaks in reallocation edge cases.  (@jasone)

* 4.2.0 (May 12, 2016)

  New features:
  - Add the arena.<i>.reset mallctl, which makes it possible to discard all of
    an arena's allocations in a single operation.  (@jasone)
  - Add the stats.retained and stats.arenas.<i>.retained statistics.  (@jasone)
  - Add the --with-version configure option.  (@jasone)
  - Support --with-lg-page values larger than actual page size.  (@jasone)

  Optimizations:
  - Use pairing heaps rather than red-black trees for various hot data
    structures.  (@djwatson, @jasone)
  - Streamline fast paths of rtree operations.  (@jasone)
  - Optimize the fast paths of calloc() and [m,d,sd]allocx().  (@jasone)
  - Decommit unused virtual memory if the OS does not overcommit.  (@jasone)
  - Specify MAP_NORESERVE on Linux if [heuristic] overcommit is active, in order
    to avoid unfortunate interactions during fork(2).  (@jasone)

  Bug fixes:
  - Fix chunk accounting related to triggering gdump profiles.  (@jasone)
  - Link against librt for clock_gettime(2) if glibc < 2.17.  (@jasone)
  - Scale leak report summary according to sampling probability.  (@jasone)

* 4.1.1 (May 3, 2016)

  This bugfix release resolves a variety of mostly minor issues, though the
  bitmap fix is critical for 64-bit Windows.

  Bug fixes:
  - Fix the linear scan version of bitmap_sfu() to shift by the proper amount
    even when sizeof(long) is not the same as sizeof(void *), as on 64-bit
    Windows.  (@jasone)
  - Fix hashing functions to avoid unaligned memory accesses (and resulting
    crashes).  This is relevant at least to some ARM-based platforms.
    (@rkmisra)
  - Fix fork()-related lock rank ordering reversals.  These reversals were
    unlikely to cause deadlocks in practice except when heap profiling was
    enabled and active.  (@jasone)
  - Fix various chunk leaks in OOM code paths.  (@jasone)
  - Fix malloc_stats_print() to print opt.narenas correctly.  (@jasone)
  - Fix MSVC-specific build/test issues.  (@rustyx, @yuslepukhin)
  - Fix a variety of test failures that were due to test fragility rather than
    core bugs.  (@jasone)
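
  For the unaligned access fix in the hashing functions, one common way to
  read a block from an arbitrarily aligned key is to copy it out with
  memcpy().  The sketch below shows that general technique; load_u32() is a
  hypothetical helper, and whether the jemalloc fix takes this exact form is
  not stated here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit block from a possibly unaligned address.  Casting `p` to
 * `const uint32_t *` and dereferencing it is undefined behavior when `p` is
 * misaligned and can trap on some ARM cores; memcpy() has no alignment
 * requirement.
 */
uint32_t
load_u32(const void *p) {
    assert(p != NULL);
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}
```

  A hash loop would call load_u32(key + i) for each block instead of indexing
  through a casted pointer.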

* 4.1.0 (February 28, 2016)

  This release is primarily about optimizations, but it also incorporates a lot
  of portability-motivated refactoring and enhancements.  Many people worked on
  this release.  Even with minor changes (see git revision history) and the
  people who reported and diagnosed issues omitted here, so much of the work
  was contributed that, starting with this release, changes are annotated with
  author credits to help reflect the collaborative effort involved.

  New features:
  - Implement decay-based unused dirty page purging, a major optimization with
    mallctl API impact.  This is an alternative to the existing ratio-based
    unused dirty page purging, and is intended to eventually become the sole
    purging mechanism.  New mallctls:
    + opt.purge
    + opt.decay_time
    + arena.<i>.decay
    + arena.<i>.decay_time
    + arenas.decay_time
    + stats.arenas.<i>.decay_time
    (@jasone, @cevans87)
  - Add --with-malloc-conf, which makes it possible to embed a default
    options string during configuration.  This was motivated by the desire to
    specify --with-malloc-conf=purge:decay, since the default must remain
    purge:ratio until the 5.0.0 release.  (@jasone)
  - Add MS Visual Studio 2015 support.  (@rustyx, @yuslepukhin)
  - Make *allocx() size class overflow behavior defined.  The maximum
    size class is now less than PTRDIFF_MAX to protect applications against
    numerical overflow, and all allocation functions are guaranteed to indicate
    errors rather than potentially crashing if the request size exceeds the
    maximum size class.  (@jasone)
  - jeprof:
    + Add raw heap profile support.  (@jasone)
    + Add --retain and --exclude for backtrace symbol filtering.  (@jasone)
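
  The PTRDIFF_MAX limit above exists so that the difference between two
  pointers into one allocation always fits in ptrdiff_t.  A minimal sketch of
  the idea follows; checked_malloc() is a hypothetical wrapper, not a jemalloc
  API, and it uses PTRDIFF_MAX itself as the cutoff even though the real
  maximum size class is somewhat smaller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper (not a jemalloc API): refuse requests larger than
 * PTRDIFF_MAX so that pointer differences within any returned object are
 * always representable in ptrdiff_t.
 */
void *
checked_malloc(size_t size) {
    if (size > (size_t)PTRDIFF_MAX) {
        return NULL; /* Report an error instead of risking overflow. */
    }
    return malloc(size);
}
```

  An oversized request such as checked_malloc(SIZE_MAX) fails cleanly with
  NULL, which callers must check as they would for malloc().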

  Optimizations:
  - Optimize the fast path to combine various bootstrapping and configuration
    checks and execute more streamlined code in the common case.  (@interwq)
  - Use linear scan for small bitmaps (used for small object tracking).  In
    addition to speeding up bitmap operations on 64-bit systems, this reduces
    allocator metadata overhead by approximately 0.2%.  (@djwatson)
  - Separate arena_avail trees, which substantially speeds up run tree
    operations.  (@djwatson)
  - Use memoization (boot-time-computed table) for run quantization.  Separate
    arena_avail trees reduced the importance of this optimization.  (@jasone)
  - Attempt mmap-based in-place huge reallocation.  This can dramatically speed
    up incremental huge reallocation.  (@jasone)
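
  A linear scan over a small bitmap can be sketched as follows.  This is an
  illustration rather than jemalloc's bitmap code: bitmap_set_first_unset() is
  a hypothetical name, the word size is fixed at 64 bits, and a set bit is
  assumed to mean "in use".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Set the first unset bit in a bitmap of `nwords` 64-bit words and return
 * its index.  The caller must guarantee that at least one bit is unset.
 */
size_t
bitmap_set_first_unset(uint64_t *words, size_t nwords) {
    for (size_t i = 0; i < nwords; i++) {
        if (words[i] == UINT64_MAX) {
            continue; /* Word is full; keep scanning. */
        }
        unsigned bit = 0;
        while ((words[i] >> bit) & 1) {
            bit++;
        }
        /* Shift a 64-bit one; a plain `1 << bit` would overflow int. */
        words[i] |= (uint64_t)1 << bit;
        return i * 64 + bit;
    }
    assert(0 && "bitmap is full");
    return (size_t)-1;
}
```

  For bitmaps that span only a word or two, this loop touches less memory than
  a multi-level summary tree, which is the trade-off the entry above describes.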

  Incompatible changes:
  - Make opt.narenas unsigned rather than size_t.  (@jasone)

  Bug fixes:
  - Fix stats.cactive accounting regression.  (@rustyx, @jasone)
  - Handle unaligned keys in hash().  This caused problems for some ARM systems.
    (@jasone, @cferris1000)
  - Refactor arenas array.  In addition to fixing a fork-related deadlock, this
    makes arena lookups faster and simpler.  (@jasone)
  - Move retained memory allocation out of the default chunk allocation
    function, to a location that gets executed even if the application installs
    a custom chunk allocation function.  This resolves a virtual memory leak.
    (@buchgr)
  - Fix a potential tsd cleanup leak.  (@cferris1000, @jasone)
  - Fix run quantization.  In practice this bug had no impact unless
    applications requested memory with alignment exceeding one page.
    (@jasone, @djwatson)
  - Fix LinuxThreads-specific bootstrapping deadlock.  (Cosmin Paraschiv)
  - jeprof:
    + Don't discard curl options if timeout is not defined.  (@djwatson)
    + Detect failed profile fetches.  (@djwatson)
  - Fix stats.arenas.<i>.{dss,lg_dirty_mult,decay_time,pactive,pdirty} for
    --disable-stats case.  (@jasone)

* 4.0.4 (October 24, 2015)

  This bugfix release fixes another xallocx() regression.  No other regressions
  have come to light in over a month, so this is likely a good starting point
  for people who prefer to wait for "dot one" releases with all the major issues
  shaken out.

  Bug fixes:
833ba4f5cc0SJason Evans  - Fix xallocx(..., MALLOCX_ZERO to zero the last full trailing page of large
834ba4f5cc0SJason Evans    allocations that have been randomly assigned an offset of 0 when
835ba4f5cc0SJason Evans    --enable-cache-oblivious configure option is enabled.
836ba4f5cc0SJason Evans
837ba4f5cc0SJason Evans* 4.0.3 (September 24, 2015)
838ba4f5cc0SJason Evans
839ba4f5cc0SJason Evans  This bugfix release continues the trend of xallocx() and heap profiling fixes.
840ba4f5cc0SJason Evans
841ba4f5cc0SJason Evans  Bug fixes:
842ba4f5cc0SJason Evans  - Fix xallocx(..., MALLOCX_ZERO) to zero all trailing bytes of large
843ba4f5cc0SJason Evans    allocations when --enable-cache-oblivious configure option is enabled.
844ba4f5cc0SJason Evans  - Fix xallocx(..., MALLOCX_ZERO) to zero trailing bytes of huge allocations
845ba4f5cc0SJason Evans    when resizing from/to a size class that is not a multiple of the chunk size.
846ba4f5cc0SJason Evans  - Fix prof_tctx_dump_iter() to filter out nodes that were created after heap
847ba4f5cc0SJason Evans    profile dumping started.
848ba4f5cc0SJason Evans  - Work around a potentially bad thread-specific data initialization
849ba4f5cc0SJason Evans    interaction with NPTL (glibc's pthreads implementation).
850ba4f5cc0SJason Evans
851536b3538SJason Evans* 4.0.2 (September 21, 2015)
852536b3538SJason Evans
853536b3538SJason Evans  This bugfix release addresses a few bugs specific to heap profiling.
854536b3538SJason Evans
855536b3538SJason Evans  Bug fixes:
856536b3538SJason Evans  - Fix ixallocx_prof_sample() to never modify nor create sampled small
857536b3538SJason Evans    allocations.  xallocx() is in general incapable of moving small allocations,
858536b3538SJason Evans    so this fix removes buggy code without loss of generality.
859536b3538SJason Evans  - Fix irallocx_prof_sample() to always allocate large regions, even when
860536b3538SJason Evans    alignment is non-zero.
861536b3538SJason Evans  - Fix prof_alloc_rollback() to read tdata from thread-specific data rather
862536b3538SJason Evans    than dereferencing a potentially invalid tctx.
863536b3538SJason Evans
* 4.0.1 (September 15, 2015)

  This is a bugfix release that is somewhat high risk due to the amount of
  refactoring required to address deep xallocx() problems.  As a side effect of
  these fixes, xallocx() now tries harder to partially fulfill requests for
  optional extra space.  Note that a couple of minor heap profiling
  optimizations are included, but these are better thought of as performance
  fixes that were integral to discovering most of the other bugs.

  Optimizations:
  - Avoid a chunk metadata read in arena_prof_tctx_set(), since it is in the
    fast path when heap profiling is enabled.  Additionally, split a special
    case out into arena_prof_tctx_reset(), which also avoids chunk metadata
    reads.
  - Optimize irallocx_prof() to optimistically update the sampler state.  The
    prior implementation appears to have been a holdover from when
    rallocx()/xallocx() functionality was combined as rallocm().

  Bug fixes:
  - Fix TLS configuration such that it is enabled by default for platforms on
    which it works correctly.
  - Fix arenas_cache_cleanup() and arena_get_hard() to handle
    allocation/deallocation within the application's thread-specific data
    cleanup functions even after arenas_cache is torn down.
  - Fix xallocx() bugs related to size+extra exceeding HUGE_MAXCLASS.
  - Fix chunk purge hook calls for in-place huge shrinking reallocation to
    specify the old chunk size rather than the new chunk size.  This bug caused
    no correctness issues for the default chunk purge function, but was
    visible to custom functions set via the "arena.<i>.chunk_hooks" mallctl.
  - Fix heap profiling bugs:
    + Fix heap profiling to distinguish among otherwise identical sample sites
      with interposed resets (triggered via the "prof.reset" mallctl).  This bug
      could cause data structure corruption that would most likely result in a
      segfault.
    + Fix irealloc_prof() to call prof_alloc_rollback() on OOM.
    + Make one call to prof_active_get_unlocked() per allocation event, and use
      the result throughout the relevant functions that handle an allocation
      event.  Also add a missing check in prof_realloc().  These fixes protect
      allocation events against concurrent prof_active changes.
    + Fix ixallocx_prof() to pass usize_max and zero to ixallocx_prof_sample()
      in the correct order.
    + Fix prof_realloc() to call prof_free_sampled_object() after calling
      prof_malloc_sample_object().  Prior to this fix, if tctx and old_tctx were
      the same, the tctx could have been prematurely destroyed.
  - Fix portability bugs:
    + Don't bitshift by negative amounts when encoding/decoding run sizes in
      chunk header maps.  This affected systems with page sizes greater than 8
      KiB.
    + Rename index_t to szind_t to avoid an existing type on Solaris.
    + Add JEMALLOC_CXX_THROW to the memalign() function prototype, in order to
      match glibc and avoid compilation errors when including both
      jemalloc/jemalloc.h and malloc.h in C++ code.
    + Don't assume that /bin/sh is appropriate when running size_classes.sh
      during configuration.
    + Consider __sparcv9 a synonym for __sparc64__ when defining LG_QUANTUM.
    + Link tests to librt if it contains clock_gettime(2).

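  The negative-shift portability fix listed above relates to the C rule that
  shifting by a negative amount is undefined behavior.  The following is a
  minimal sketch of one way to guard against it at run time.  The function
  name and the signed shift parameter are hypothetical and are not taken from
  jemalloc, and the sketch assumes the magnitude of the shift is smaller than
  the width of size_t.

```c
#include <stddef.h>

/*
 * Hypothetical helper: shift left for positive amounts and right for
 * negative ones, so that a negative count is never passed to << or >>.
 * Assumes |shift| is smaller than the bit width of size_t.
 */
static size_t
shift_signed(size_t bits, int shift)
{
	if (shift > 0)
		return bits << shift;
	if (shift < 0)
		return bits >> -shift;
	return bits;
}
```

  For example, shift_signed(1, 3) yields 8 and shift_signed(8, -3) yields 1.
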
* 4.0.0 (August 17, 2015)

  This version contains many speed and space optimizations, both minor and
  major.  The major themes are generalization, unification, and simplification.
  Although many of these optimizations cause no visible behavior change, their
  cumulative effect is substantial.

  New features:
  - Normalize size class spacing to be consistent across the complete size
    range.  By default there are four size classes per size doubling, but this
    is now configurable via the --with-lg-size-class-group option.  Also add the
    --with-lg-page, --with-lg-page-sizes, --with-lg-quantum, and
    --with-lg-tiny-min options, which can be used to tweak page and size class
    settings.  Impacts:
    + Worst case performance for incrementally growing/shrinking reallocation
      is improved because there are far fewer size classes, and therefore
      copying happens less often.
    + Internal fragmentation is limited to 20% for all but the smallest size
      classes (those less than four times the quantum).  Requests for
      (1B + 4 KiB) and (1B + 4 MiB) previously suffered nearly 50% internal
      fragmentation.
    + Chunk fragmentation tends to be lower because there are fewer distinct run
      sizes to pack.
  - Add support for explicit tcaches.  The "tcache.create", "tcache.flush", and
    "tcache.destroy" mallctls control tcache lifetime and flushing, and the
    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to the *allocx() API
    control which tcache is used for each operation.
  - Implement per thread heap profiling, as well as the ability to
    enable/disable heap profiling on a per thread basis.  Add the "prof.reset",
    "prof.lg_sample", "thread.prof.name", "thread.prof.active",
    "opt.prof_thread_active_init", and "prof.thread_active_init" mallctls.
  - Add support for per arena application-specified chunk allocators, configured
    via the "arena.<i>.chunk_hooks" mallctl.
  - Refactor huge allocation to be managed by arenas, so that arenas now
    function as general purpose independent allocators.  This is important in
    the context of user-specified chunk allocators, aside from the scalability
    benefits.  Related new statistics:
    + The "stats.arenas.<i>.huge.allocated", "stats.arenas.<i>.huge.nmalloc",
      "stats.arenas.<i>.huge.ndalloc", and "stats.arenas.<i>.huge.nrequests"
      mallctls provide high level per arena huge allocation statistics.
    + The "arenas.nhchunks", "arenas.hchunk.<i>.size",
      "stats.arenas.<i>.hchunks.<j>.nmalloc",
      "stats.arenas.<i>.hchunks.<j>.ndalloc",
      "stats.arenas.<i>.hchunks.<j>.nrequests", and
      "stats.arenas.<i>.hchunks.<j>.curhchunks" mallctls provide per size class
      statistics.
  - Add the 'util' column to malloc_stats_print() output, which reports the
    proportion of available regions that are currently in use for each small
    size class.
  - Add "alloc" and "free" modes for junk filling (see the "opt.junk"
    mallctl), so that it is possible to separately enable junk filling for
    allocation versus deallocation.
  - Add the jemalloc-config script, which provides information about how
    jemalloc was configured, and how to integrate it into application builds.
  - Add metadata statistics, which are accessible via the "stats.metadata",
    "stats.arenas.<i>.metadata.mapped", and
    "stats.arenas.<i>.metadata.allocated" mallctls.
  - Add the "stats.resident" mallctl, which reports the upper limit of
    physically resident memory mapped by the allocator.
  - Add per arena control over unused dirty page purging, via the
    "arenas.lg_dirty_mult", "arena.<i>.lg_dirty_mult", and
    "stats.arenas.<i>.lg_dirty_mult" mallctls.
  - Add the "prof.gdump" mallctl, which makes it possible to toggle the gdump
    feature on/off during program execution.
  - Add sdallocx(), which implements sized deallocation.  The primary
    optimization over dallocx() is the removal of a metadata read, which often
    suffers an L1 cache miss.
  - Add missing header includes in jemalloc/jemalloc.h, so that applications
    only have to #include <jemalloc/jemalloc.h>.
  - Add support for additional platforms:
    + Bitrig
    + Cygwin
    + DragonFlyBSD
    + iOS
    + OpenBSD
    + OpenRISC/or1k

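  As an illustration of the size class spacing described in the first item of
  the list above, the following sketch rounds a request up to the next class
  when each size doubling is divided into 2^lg_group equally spaced classes
  (lg_group = 2 gives the default of four).  The function name is
  hypothetical, the code is not jemalloc's implementation, and it ignores the
  separately handled smallest classes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: round size up to the next size class, assuming
 * 2^lg_group equally spaced classes per size doubling.
 */
static size_t
size_class_ceil(size_t size, unsigned lg_group)
{
	size_t pow2 = 1;
	size_t delta;

	/* Keep the doubling loop below from overflowing. */
	assert(size > 0 && size <= SIZE_MAX / 2);

	/* Find the smallest power of two that is >= size. */
	while (pow2 < size)
		pow2 <<= 1;

	/* Class spacing within the doubling (pow2/2, pow2]. */
	delta = (pow2 >> 1) >> lg_group;
	if (delta == 0)
		return size;	/* Too small for this spacing scheme. */

	/* Round up to a multiple of the spacing. */
	return (size + delta - 1) & ~(delta - 1);
}
```

  With lg_group = 2, a request for (1B + 4 KiB) rounds up to 5 KiB, which
  wastes just under 20% of the allocation and is consistent with the bound
  stated above.
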
  Optimizations:
  - Maintain dirty runs in per arena LRUs rather than in per arena trees of
    dirty-run-containing chunks.  In practice this change significantly reduces
    dirty page purging volume.
  - Integrate whole chunks into the unused dirty page purging machinery.  This
    reduces the cost of repeated huge allocation/deallocation, because it
    effectively introduces a cache of chunks.
  - Split the arena chunk map into two separate arrays, in order to increase
    cache locality for the frequently accessed bits.
  - Move small run metadata out of runs, into arena chunk headers.  This reduces
    run fragmentation; smaller runs reduce external fragmentation for small
    size classes, and the packed (less uniformly aligned) metadata layout
    improves CPU cache set distribution.
  - Randomly distribute large allocation base pointer alignment relative to page
    boundaries in order to more uniformly utilize CPU cache sets.  This can be
    disabled via the --disable-cache-oblivious configure option, and queried via
    the "config.cache_oblivious" mallctl.
  - Micro-optimize the fast paths for the public API functions.
  - Refactor thread-specific data to reside in a single structure.  This assures
    that only a single TLS read is necessary per call into the public API.
  - Implement in-place huge allocation growing and shrinking.
  - Refactor rtree (radix tree for chunk lookups) to be lock-free, and make
    additional optimizations that reduce maximum lookup depth to one or two
    levels.  This resolves what was a concurrency bottleneck for per arena huge
    allocation, because a global data structure is critical for determining
    which arenas own which huge allocations.

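  The cache-oblivious placement of large allocations described above can be
  pictured as choosing a random, cache-line-aligned offset within a page.  The
  sketch below assumes a 4 KiB page, a 64-byte cache line, and a random value
  supplied by the caller; all names are hypothetical and the details of
  jemalloc's own implementation may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE		((size_t)4096)	/* Assumed page size. */
#define SKETCH_CACHELINE	((size_t)64)	/* Assumed cache line size. */

/*
 * Hypothetical helper: map a random value to a cache-line-aligned offset
 * that lies within a single page.
 */
static size_t
random_page_offset(uint64_t r)
{
	size_t nslots = SKETCH_PAGE / SKETCH_CACHELINE;

	return (size_t)(r % nslots) * SKETCH_CACHELINE;
}
```

  Adding such an offset to an otherwise page-aligned base address spreads
  allocation start addresses across cache sets instead of concentrating them
  on the sets that page-aligned addresses map to.
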
  Incompatible changes:
  - Replace --enable-cc-silence with --disable-cc-silence to suppress spurious
    warnings by default.
  - Assure that the constness of malloc_usable_size()'s return type matches that
    of the system implementation.
  - Change the heap profile dump format to support per thread heap profiling,
    rename pprof to jeprof, and enhance it with the --thread=<n> option.  As a
    result, the bundled jeprof must now be used rather than the upstream
    (gperftools) pprof.
  - Disable "opt.prof_final" by default, in order to avoid atexit(3), which can
    internally deadlock on some platforms.
  - Change the "arenas.nlruns" mallctl type from size_t to unsigned.
  - Replace the "stats.arenas.<i>.bins.<j>.allocated" mallctl with
    "stats.arenas.<i>.bins.<j>.curregs".
  - Ignore MALLOC_CONF in set{uid,gid,cap} binaries.
  - Ignore MALLOCX_ARENA(a) in dallocx(), in favor of using the
    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to control tcache usage.

  Removed features:
  - Remove the *allocm() API, which is superseded by the *allocx() API.
  - Remove the --enable-dss option, and make dss non-optional on all platforms
    which support sbrk(2).
  - Remove the "arenas.purge" mallctl, which was obsoleted by the
    "arena.<i>.purge" mallctl in 3.1.0.
  - Remove the unnecessary "opt.valgrind" mallctl; jemalloc automatically
    detects whether it is running inside Valgrind.
  - Remove the "stats.huge.allocated", "stats.huge.nmalloc", and
    "stats.huge.ndalloc" mallctls.
  - Remove the --enable-mremap option.
  - Remove the "stats.chunks.current", "stats.chunks.total", and
    "stats.chunks.high" mallctls.

  Bug fixes:
  - Fix the cactive statistic to decrease (rather than increase) when active
    memory decreases.  This regression was first released in 3.5.0.
  - Fix OOM handling in memalign() and valloc().  A variant of this bug existed
    in all releases since 2.0.0, which introduced these functions.
  - Fix an OOM-related regression in arena_tcache_fill_small(), which could
    cause cache corruption on OOM.  This regression was present in all releases
    from 2.2.0 through 3.6.0.
  - Fix size class overflow handling for malloc(), posix_memalign(), memalign(),
    calloc(), and realloc() when profiling is enabled.
  - Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
    "secondary" precedence is specified, but sbrk(2) is not supported.
  - Fix fallback lg_floor() implementations to handle extremely large inputs.
  - Ensure the default purgeable zone is after the default zone on OS X.
  - Fix latent bugs in atomic_*().
  - Fix the "arena.<i>.dss" mallctl to handle read-only calls.
  - Fix tls_model configuration to enable the initial-exec model when possible.
  - Mark malloc_conf as a weak symbol so that the application can override it.
  - Correctly detect glibc's adaptive pthread mutexes.
  - Fix the --without-export configure option.

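  The fallback lg_floor() fix listed above concerns computing
  floor(log2(x)) without compiler intrinsics.  The following is a minimal
  portable sketch of one possible approach.  It is not the code jemalloc
  ships, and the name lg_floor_sketch is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical portable fallback: floor(log2(x)) for nonzero x.  Only right
 * shifts are used, so no intermediate value can overflow, even for inputs
 * near the maximum size_t value.
 */
static unsigned
lg_floor_sketch(size_t x)
{
	unsigned lg = 0;

	assert(x != 0);
	while ((x >>= 1) != 0)
		lg++;
	return lg;
}
```

  The loop runs once per bit, so this favors clarity over speed; approaches
  that round x up to a power of two first must take care not to overflow for
  very large inputs.
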
* 3.6.0 (March 31, 2014)

  This version contains a critical bug fix for a regression present in 3.5.0 and
  3.5.1.

  Bug fixes:
  - Fix a regression in arena_chunk_alloc() that caused crashes during
    small/large allocation if chunk allocation failed.  In the absence of this
    bug, chunk allocation failure would result in allocation failure, e.g. a
    NULL return from malloc().  This regression was introduced in 3.5.0.
  - Fix gcc intrinsics-based backtracing by specifying -fno-omit-frame-pointer
    to gcc.  Note that the application (and all the libraries it links to) must
    also be compiled with this option for backtracing to be reliable.
  - Use dss allocation precedence for huge allocations as well as small/large
    allocations.
  - Fix test assertion failure message formatting.  This bug did not manifest on
    x86_64 systems because of implementation subtleties in va_list.
  - Fix inconsequential test failures for hash and SFMT code.

  New features:
  - Support heap profiling on FreeBSD.  This feature depends on the proc
    filesystem being mounted during heap profile dumping.

* 3.5.1 (February 25, 2014)

  This version primarily addresses minor bugs in test code.

  Bug fixes:
  - Configure Solaris/Illumos to use MADV_FREE.
  - Fix junk filling for mremap(2)-based huge reallocation.  This is only
    relevant if configuring with the --enable-mremap option specified.
  - Avoid compilation failure if 'restrict' C99 keyword is not supported by the
    compiler.
  - Add a configure test for SSE2 rather than assuming it is usable on i686
    systems.  This fixes test compilation errors, especially on 32-bit Linux
    systems.
  - Fix mallctl argument size mismatches (size_t vs. uint64_t) in the stats unit
    test.
  - Fix/remove flawed alignment-related overflow tests.
  - Prevent compiler optimizations that could change backtraces in the
    prof_accum unit test.

* 3.5.0 (January 22, 2014)

  This version focuses on refactoring and automated testing, though it also
  includes some non-trivial heap profiling optimizations not mentioned below.

  New features:
  - Add the *allocx() API, which is a successor to the experimental *allocm()
    API.  The *allocx() functions are slightly simpler to use because they have
    fewer parameters, they directly return the results of primary interest, and
    mallocx()/rallocx() avoid the strict aliasing pitfall that
    allocm()/rallocm() share with posix_memalign().  Note that *allocm() is
    slated for removal in the next non-bugfix release.
  - Add support for LinuxThreads.

  Bug fixes:
  - Unless heap profiling is enabled, disable floating point code and don't link
    with libm.  This, in combination with e.g. EXTRA_CFLAGS=-mno-sse on x64
    systems, makes it possible to completely disable floating point register
    use.  Some versions of glibc neglect to save/restore caller-saved floating
    point registers during dynamic lazy symbol loading, and the symbol loading
    code uses whatever malloc the application happens to have linked/loaded
    with, the result being potential floating point register corruption.
  - Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
    backtrace creation in imemalign().  This bug impacted posix_memalign() and
    aligned_alloc().
  - Fix a file descriptor leak in a prof_dump_maps() error path.
  - Fix prof_dump() to close the dump file descriptor for all relevant error
    paths.
  - Fix rallocm() to use the arena specified by the ALLOCM_ARENA(s) flag for
    allocation, not just deallocation.
  - Fix a data race for large allocation stats counters.
  - Fix a potential infinite loop during thread exit.  This bug occurred on
    Solaris, and could affect other platforms with similar pthreads TSD
    implementations.
  - Don't junk-fill reallocations unless usable size changes.  This fixes a
    violation of the *allocx()/*allocm() semantics.
  - Fix growing large reallocation to junk fill new space.
  - Fix huge deallocation to junk fill when munmap is disabled.
  - Change the default private namespace prefix from empty to je_, and change
    --with-private-namespace-prefix so that it prepends an additional prefix
    rather than replacing je_.  This reduces the likelihood of applications
    which statically link jemalloc experiencing symbol name collisions.
  - Add missing private namespace mangling (relevant when
    --with-private-namespace is specified).
  - Add and use JEMALLOC_INLINE_C so that static inline functions are marked as
    static even for debug builds.
  - Add a missing mutex unlock in a malloc_init_hard() error path.  In practice
    this error path is never executed.
  - Fix numerous bugs in malloc_strtoumax() error handling/reporting.  These
    bugs had no impact except for malformed inputs.
  - Fix numerous bugs in malloc_snprintf().  These bugs were not exercised by
    existing calls, so they had no impact.

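  The junk-filling fixes listed above can be pictured with a small sketch that
  fills only the bytes that become newly usable when a buffer grows.  It uses
  0xa5, the byte that jemalloc's documentation gives for junk-filled
  allocations, but the function name and the use of realloc() with requested
  sizes (rather than usable sizes) are illustrative assumptions, and new_size
  is assumed to be nonzero.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_JUNK_ALLOC 0xa5	/* Byte written to newly usable memory. */

/*
 * Hypothetical helper: resize buf and junk-fill only the bytes beyond
 * old_size.  Existing contents are left untouched, and nothing is filled
 * when the size does not grow.
 */
static void *
realloc_junk_sketch(void *buf, size_t old_size, size_t new_size)
{
	unsigned char *p = realloc(buf, new_size);

	if (p == NULL)
		return NULL;	/* buf is still valid on failure. */
	if (new_size > old_size)
		memset(p + old_size, SKETCH_JUNK_ALLOC, new_size - old_size);
	return p;
}
```

  Filling the new space with a recognizable pattern makes reads of
  uninitialized memory easier to spot during debugging.
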
* 3.4.1 (October 20, 2013)

  Bug fixes:
  - Fix a race in the "arenas.extend" mallctl that could cause memory corruption
    of internal data structures and subsequent crashes.
  - Fix Valgrind integration flaws that caused Valgrind warnings about reads of
    uninitialized memory in:
    + arena chunk headers
    + internal zero-initialized data structures (relevant to tcache and prof
      code)
  - Preserve errno during the first allocation.  A readlink(2) call during
    initialization fails unless /etc/malloc.conf exists, so errno was typically
    set during the first allocation prior to this fix.
  - Fix compilation warnings reported by gcc 4.8.1.

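  The errno fix listed above follows a common pattern: save errno before an
  internal call that is expected to fail, and restore it afterward.  The
  sketch below illustrates that pattern with readlink(2).  The function name
  is hypothetical and the code is not taken from jemalloc.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical initialization step: probe a configuration symlink without
 * letting a failed readlink(2) change the caller's view of errno.
 */
static void
init_probe_preserving_errno(const char *path)
{
	char buf[256];
	int saved_errno = errno;
	ssize_t n;

	/* readlink(2) sets errno (e.g. to ENOENT) if path does not exist. */
	n = readlink(path, buf, sizeof(buf));
	(void)n;	/* The link contents are not needed in this sketch. */

	errno = saved_errno;
}
```

  A caller that inspects errno after its first allocation then sees the value
  it had before the call, whether or not the probed path exists.
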
* 3.4.0 (June 2, 2013)

  This version is essentially a small bugfix release, but the addition of
  aarch64 support requires that the minor version be incremented.

  Bug fixes:
  - Fix race-triggered deadlocks in chunk_record().  These deadlocks were
    typically triggered by multiple threads concurrently deallocating huge
    objects.

  New features:
  - Add support for the aarch64 architecture.

* 3.3.1 (March 6, 2013)

  This version fixes bugs that are typically encountered only when utilizing
  custom run-time options.

  Bug fixes:
  - Fix a locking order bug that could cause deadlock during fork if heap
    profiling were enabled.
  - Fix a chunk recycling bug that could cause the allocator to lose track of
    whether a chunk was zeroed.  On FreeBSD, NetBSD, and OS X, it could cause
    corruption if allocating via sbrk(2) (unlikely unless running with the
    "dss:primary" option specified).  This was completely harmless on Linux
    unless using mlockall(2) (and unlikely even then, unless the
    --disable-munmap configure option or the "dss:primary" option was
    specified).  This regression was introduced in 3.1.0 by the
    mlockall(2)/madvise(2) interaction fix.
  - Fix TLS-related memory corruption that could occur during thread exit if the
    thread never allocated memory.  Only the quarantine and prof facilities were
    susceptible.
  - Fix two quarantine bugs:
    + Internal reallocation of the quarantined object array leaked the old
      array.
    + Reallocation failure for internal reallocation of the quarantined object
      array (very unlikely) resulted in memory corruption.
  - Fix Valgrind integration to annotate all internally allocated memory in a
    way that keeps Valgrind happy about internal data structure access.
  - Fix building for s390 systems.

* 3.3.0 (January 23, 2013)

  This version includes a few minor performance improvements in addition to the
  listed new features and bug fixes.

  New features:
  - Add clipping support to lg_chunk option processing.
  - Add the --enable-ivsalloc option.
  - Add the --without-export option.
  - Add the --disable-zone-allocator option.

  Bug fixes:
  - Fix "arenas.extend" mallctl to output the number of arenas.
  - Fix chunk_recycle() to unconditionally inform Valgrind that returned memory
    is undefined.
  - Fix build break on FreeBSD related to alloca.h.

* 3.2.0 (November 9, 2012)

  In addition to a couple of bug fixes, this version modifies page run
  allocation and dirty page purging algorithms in order to better control
  page-level virtual memory fragmentation.

  Incompatible changes:
  - Change the "opt.lg_dirty_mult" default from 5 to 3 (32:1 to 8:1).

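  The ratios in parentheses above follow from the option being a base 2
  logarithm: 2^5 = 32 and 2^3 = 8.  The sketch below works through the
  arithmetic under the assumption that the number of dirty pages tolerated
  before purging is the active page count shifted right by lg_dirty_mult.  The
  function name is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of unused dirty pages tolerated for a given
 * number of active pages, assuming an active:dirty ratio of
 * 2^lg_dirty_mult:1.
 */
static size_t
dirty_page_limit(size_t nactive, unsigned lg_dirty_mult)
{
	return nactive >> lg_dirty_mult;
}
```

  Under this assumption, an arena with 1024 active pages tolerates 32 dirty
  pages at the old default and 128 at the new one.
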
  Bug fixes:
  - Fix dss/mmap allocation precedence code to use recyclable mmap memory only
    after primary dss allocation fails.
  - Fix deadlock in the "arenas.purge" mallctl.  This regression was introduced
    in 3.1.0 by the addition of the "arena.<i>.purge" mallctl.

* 3.1.0 (October 16, 2012)

  New features:
  - Auto-detect whether running inside Valgrind, thus removing the need to
    manually specify MALLOC_CONF=valgrind:true.
  - Add the "arenas.extend" mallctl, which allows applications to create
    manually managed arenas.
  - Add the ALLOCM_ARENA() flag for {,r,d}allocm().
  - Add the "opt.dss", "arena.<i>.dss", and "stats.arenas.<i>.dss" mallctls,
    which provide control over dss/mmap precedence.
  - Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".
  - Define LG_QUANTUM for hppa.

  Incompatible changes:
  - Disable tcache by default if running inside Valgrind, in order to avoid
    making unallocated objects appear reachable to Valgrind.
  - Drop const from malloc_usable_size() argument on Linux.

  Bug fixes:
  - Fix heap profiling crash if a sampled object is freed via realloc(p, 0).
  - Remove const from __*_hook variable declarations, so that glibc can modify
    them during process forking.
  - Fix mlockall(2)/madvise(2) interaction.
  - Fix fork(2)-related deadlocks.
  - Fix error return value for "thread.tcache.enabled" mallctl.

128835dad073SJason Evans* 3.0.0 (May 11, 2012)
1289a4bd5210SJason Evans
1290a4bd5210SJason Evans  Although this version adds some major new features, the primary focus is on
1291a4bd5210SJason Evans  internal code cleanup that facilitates maintainability and portability, most
1292a4bd5210SJason Evans  of which is not reflected in the ChangeLog.  This is the first release to
1293a4bd5210SJason Evans  incorporate substantial contributions from numerous other developers, and the
1294a4bd5210SJason Evans  result is a more broadly useful allocator (see the git revision history for
1295a4bd5210SJason Evans  contribution details).  Note that the license has been unified, thanks to
1296a4bd5210SJason Evans  Facebook granting a license under the same terms as the other copyright
1297a4bd5210SJason Evans  holders (see COPYING).
1298a4bd5210SJason Evans
1299a4bd5210SJason Evans  New features:
  - Implement Valgrind support, redzones, and quarantine.
  - Add support for additional platforms:
    + FreeBSD
    + Mac OS X Lion
    + MinGW
    + Windows (no support yet for replacing the system malloc)
  - Add support for additional architectures:
    + MIPS
    + SH4
    + Tilera
  - Add support for cross compiling.
  - Add nallocm(), which rounds a request size up to the nearest size class
    without actually allocating.
  - Implement aligned_alloc() (blame C11).
  - Add the "thread.tcache.enabled" mallctl.
  - Add the "opt.prof_final" mallctl.
  - Update pprof (from gperftools 2.0).
  - Add the --with-mangling option.
  - Add the --disable-experimental option.
  - Add the --disable-munmap option, and make it the default on Linux.
  - Add the --enable-mremap option.  Use of mremap(2) is now disabled by
    default.
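
  For readers unfamiliar with the C11 aligned_alloc() interface named in the
  list above, the following is a minimal usage sketch.  It is not code from
  jemalloc; alloc_aligned() and is_aligned() are hypothetical helper names
  introduced only for this illustration.  The sketch assumes a power-of-two
  alignment and rounds the size up, since C11 expects the size to be a
  multiple of the alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of jemalloc): allocate at least `size` bytes
 * aligned to `alignment`, which must be a power of two, via C11
 * aligned_alloc().  Returns NULL on failure.
 */
static void *alloc_aligned(size_t alignment, size_t size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* C11 expects size to be a multiple of alignment, so round it up. */
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    if (rounded < size) {
        return NULL; /* The rounding overflowed size_t. */
    }
    return aligned_alloc(alignment, rounded);
}

/* Hypothetical helper: report whether `p` is aligned to `alignment`. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

  Memory obtained this way is released with free(), for example
  free(alloc_aligned(64, 100)).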

  Incompatible changes:
  - Enable stats by default.
  - Enable fill by default.
  - Disable lazy locking by default.
  - Rename the "tcache.flush" mallctl to "thread.tcache.flush".
  - Rename the "arenas.pagesize" mallctl to "arenas.page".
  - Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).
  - Change the "opt.prof_accum" default from true to false.

  Removed features:
  - Remove the swap feature, including the "config.swap", "swap.avail",
    "swap.prezeroed", "swap.nfds", and "swap.fds" mallctls.
  - Remove highruns statistics, including the
    "stats.arenas.<i>.bins.<j>.highruns" and
    "stats.arenas.<i>.lruns.<j>.highruns" mallctls.
  - As part of small size class refactoring, remove the "opt.lg_[qc]space_max",
    "arenas.cacheline", "arenas.subpage", "arenas.[tqcs]space_{min,max}", and
    "arenas.[tqcs]bins" mallctls.
  - Remove the "arenas.chunksize" mallctl.
  - Remove the "opt.lg_prof_tcmax" option.
  - Remove the "opt.lg_prof_bt_max" option.
  - Remove the "opt.lg_tcache_gc_sweep" option.
  - Remove the --disable-tiny option, including the "config.tiny" mallctl.
  - Remove the --enable-dynamic-page-shift configure option.
  - Remove the --enable-sysv configure option.

  Bug fixes:
  - Fix a statistics-related bug in the "thread.arena" mallctl that could cause
    invalid statistics and crashes.
  - Work around TLS deallocation via free() on Linux.  This bug could cause
    write-after-free memory corruption.
  - Fix a potential deadlock that could occur during interval- and
    growth-triggered heap profile dumps.
  - Fix large calloc() zeroing bugs due to dropping chunk map unzeroed flags.
  - Fix chunk_alloc_dss() to stop claiming memory is zeroed.  This bug could
    cause memory corruption and crashes with --enable-dss specified.
  - Fix fork-related bugs that could cause deadlock in children between fork
    and exec.
  - Fix malloc_stats_print() to honor 'b' and 'l' in the opts parameter.
  - Fix realloc(p, 0) to act like free(p).
  - Do not enforce minimum alignment in memalign().
  - Check for NULL pointer in malloc_usable_size().
  - Fix an off-by-one heap profile statistics bug that could be observed in
    interval- and growth-triggered heap profiles.
  - Fix the "epoch" mallctl to update cached stats even if the passed in epoch
    is 0.
  - Fix bin->runcur management to fix a layout policy bug.  This bug did not
    affect correctness.
  - Fix a bug in choose_arena_hard() that potentially caused more arenas to be
    initialized than necessary.
  - Add missing "opt.lg_tcache_max" mallctl implementation.
  - Use glibc allocator hooks to make mixed allocator usage less likely.
  - Fix build issues for --disable-tcache.
  - Don't mangle pthread_create() when --with-private-namespace is specified.

* 2.2.5 (November 14, 2011)

  Bug fixes:
  - Fix huge_ralloc() race when using mremap(2).  This is a serious bug that
    could cause memory corruption and/or crashes.
  - Fix huge_ralloc() to maintain chunk statistics.
  - Fix malloc_stats_print(..., "a") output.

* 2.2.4 (November 5, 2011)

  Bug fixes:
  - Initialize arenas_tsd before using it.  This bug existed for 2.2.[0-3], as
    well as for --disable-tls builds in earlier releases.
  - Do not assume a 4 KiB page size in test/rallocm.c.

* 2.2.3 (August 31, 2011)

  This version fixes numerous bugs related to heap profiling.

  Bug fixes:
  - Fix a prof-related race condition.  This bug could cause memory corruption,
    but only occurred in non-default configurations (prof_accum:false).
  - Fix off-by-one backtracing issues (make sure that prof_alloc_prep() is
    excluded from backtraces).
  - Fix a prof-related bug in realloc() (only triggered by OOM errors).
  - Fix prof-related bugs in allocm() and rallocm().
  - Fix prof_tdata_cleanup() for --disable-tls builds.
  - Fix a relative include path, to fix objdir builds.

* 2.2.2 (July 30, 2011)

  Bug fixes:
  - Fix a build error for --disable-tcache.
  - Fix assertions in arena_purge() (for real this time).
  - Add the --with-private-namespace option.  This is a workaround for symbol
    conflicts that can inadvertently arise when using static libraries.

* 2.2.1 (March 30, 2011)

  Bug fixes:
  - Implement atomic operations for x86/x64.  This fixes compilation failures
    for versions of gcc that are still in wide use.
  - Fix an assertion in arena_purge().

* 2.2.0 (March 22, 2011)

  This version incorporates several improvements to algorithms and data
  structures that tend to reduce fragmentation and increase speed.

  New features:
  - Add the "stats.cactive" mallctl.
  - Update pprof (from google-perftools 1.7).
  - Improve backtracing-related configuration logic, and add the
    --disable-prof-libgcc option.

  Bug fixes:
  - Change default symbol visibility from "internal" to "hidden", which
    decreases the overhead of library-internal function calls.
  - Fix symbol visibility so that it is also set on OS X.
  - Fix a build dependency regression caused by the introduction of the .pic.o
    suffix for PIC object files.
  - Add missing checks for mutex initialization failures.
  - Don't use libgcc-based backtracing except on x64, where it is known to work.
  - Fix deadlocks on OS X that were due to memory allocation in
    pthread_mutex_lock().
  - Heap profiling-specific fixes:
    + Fix memory corruption due to integer overflow in small region index
      computation, when using a small enough sample interval that profiling
      context pointers are stored in small run headers.
    + Fix a bootstrap ordering bug that only occurred with TLS disabled.
    + Fix a rallocm() rsize bug.
    + Fix error detection bugs for aligned memory allocation.

* 2.1.3 (March 14, 2011)

  Bug fixes:
  - Fix a cpp logic regression (due to the "thread.{de,}allocatedp" mallctl fix
    for OS X in 2.1.2).
  - Fix a "thread.arena" mallctl bug.
  - Fix a thread cache stats merging bug.

* 2.1.2 (March 2, 2011)

  Bug fixes:
  - Fix "thread.{de,}allocatedp" mallctl for OS X.
  - Add missing jemalloc.a to build system.

* 2.1.1 (January 31, 2011)

  Bug fixes:
  - Fix aligned huge reallocation (affected allocm()).
  - Fix the ALLOCM_LG_ALIGN macro definition.
  - Fix a heap dumping deadlock.
  - Fix a "thread.arena" mallctl bug.

* 2.1.0 (December 3, 2010)

  This version incorporates some optimizations that can't quite be considered
  bug fixes.

  New features:
  - Use Linux's mremap(2) for huge object reallocation when possible.
  - Avoid locking in mallctl*() when possible.
  - Add the "thread.[de]allocatedp" mallctls.
  - Convert the manual page source from roff to DocBook, and generate both roff
    and HTML manuals.

  Bug fixes:
  - Fix a crash due to incorrect bootstrap ordering.  This only impacted
    --enable-debug --enable-dss configurations.
  - Fix a minor statistics bug for mallctl("swap.avail", ...).

* 2.0.1 (October 29, 2010)

  Bug fixes:
  - Fix a race condition in heap profiling that could cause undefined behavior
    if "opt.prof_accum" were disabled.
  - Add missing mutex unlocks for some OOM error paths in the heap profiling
    code.
  - Fix a compilation error for non-C99 builds.

* 2.0.0 (October 24, 2010)

  This version focuses on the experimental *allocm() API, and on improved
  run-time configuration/introspection.  Nonetheless, numerous performance
  improvements are also included.

  New features:
  - Implement the experimental {,r,s,d}allocm() API, which provides a superset
    of the functionality available via malloc(), calloc(), posix_memalign(),
    realloc(), malloc_usable_size(), and free().  These functions can be used to
    allocate/reallocate aligned zeroed memory, ask for optional extra memory
    during reallocation, prevent object movement during reallocation, etc.
  - Replace JEMALLOC_OPTIONS/JEMALLOC_PROF_PREFIX with MALLOC_CONF, which is
    more human-readable, and more flexible.  For example:
      JEMALLOC_OPTIONS=AJP
    is now:
      MALLOC_CONF=abort:true,fill:true,stats_print:true
  - Port to Apple OS X.  Sponsored by Mozilla.
  - Make it possible for the application to control thread-->arena mappings via
    the "thread.arena" mallctl.
  - Add compile-time support for all TLS-related functionality via pthreads TSD.
    This is mainly of interest for OS X, which does not support TLS, but has a
    TSD implementation with similar performance.
  - Override memalign() and valloc() if they are provided by the system.
  - Add the "arenas.purge" mallctl, which can be used to synchronously purge all
    dirty unused pages.
  - Make cumulative heap profiling data optional, so that it is possible to
    limit the amount of memory consumed by heap profiling data structures.
  - Add per thread allocation counters that can be accessed via the
    "thread.allocated" and "thread.deallocated" mallctls.
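
  The MALLOC_CONF value in the example above is a comma-separated list of
  key:value pairs.  As a rough illustration of that format only, the sketch
  below looks up one key in such a string.  It is not jemalloc's option
  parser, and conf_lookup() is a hypothetical name introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not jemalloc's parser): find `key` in a
 * "key:value,key:value" string and copy its value into `out`.
 * Returns 1 if the key is found and the value fits, 0 otherwise.
 */
static int conf_lookup(const char *conf, const char *key, char *out,
    size_t outlen)
{
    assert(conf != NULL && key != NULL && out != NULL);

    size_t klen = strlen(key);
    const char *p = conf;

    while (*p != '\0') {
        /* Each pair runs up to the next comma or the end of the string. */
        const char *comma = strchr(p, ',');
        size_t len = (comma != NULL) ? (size_t)(comma - p) : strlen(p);

        /* Match only a whole key followed directly by ':'. */
        if (len > klen && strncmp(p, key, klen) == 0 && p[klen] == ':') {
            size_t vlen = len - klen - 1;
            if (vlen >= outlen) {
                return 0; /* Value does not fit in the caller's buffer. */
            }
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }

        p += len;
        if (*p == ',') {
            p++; /* Skip the separator before the next pair. */
        }
    }
    return 0;
}
```

  With the string from the example, looking up "fill" yields "true", while
  looking up "stats" finds nothing because only whole keys match.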

  Incompatible changes:
  - Remove JEMALLOC_OPTIONS and malloc_options (see MALLOC_CONF above).
  - Increase default backtrace depth from 4 to 128 for heap profiling.
  - Disable interval-based profile dumps by default.

  Bug fixes:
  - Remove bad assertions in fork handler functions.  These assertions could
    cause aborts for some combinations of configure settings.
  - Fix strerror_r() usage to deal with non-standard semantics in GNU libc.
  - Fix leak context reporting.  This bug tended to cause the number of contexts
    to be underreported (though the reported number of objects and bytes were
    correct).
  - Fix a realloc() bug for large in-place growing reallocation.  This bug could
    cause memory corruption, but it was hard to trigger.
  - Fix an allocation bug for small allocations that could be triggered if
    multiple threads raced to create a new run of backing pages.
  - Enhance the heap profiler to trigger samples based on usable size, rather
    than request size.
  - Fix a heap profiling bug due to sometimes losing track of requested object
    size for sampled objects.

* 1.0.3 (August 12, 2010)

  Bug fixes:
  - Fix the libunwind-based implementation of stack backtracing (used for heap
    profiling).  This bug could cause zero-length backtraces to be reported.
  - Add a missing mutex unlock in library initialization code.  If multiple
    threads raced to initialize malloc, some of them could end up permanently
    blocked.

* 1.0.2 (May 11, 2010)

  Bug fixes:
  - Fix junk filling of large objects, which could cause memory corruption.
  - Add MAP_NORESERVE support for chunk mapping, because otherwise virtual
    memory limits could cause swap file configuration to fail.  Contributed by
    Jordan DeLong.

* 1.0.1 (April 14, 2010)

  Bug fixes:
  - Fix compilation when --enable-fill is specified.
  - Fix threads-related profiling bugs that affected accuracy and caused memory
    to be leaked during thread exit.
  - Fix dirty page purging race conditions that could cause crashes.
  - Fix crash in tcache flushing code during thread destruction.

* 1.0.0 (April 11, 2010)

  This release focuses on speed and run-time introspection.  Numerous
  algorithmic improvements make this release substantially faster than its
  predecessors.

  New features:
  - Implement autoconf-based configuration system.
  - Add mallctl*(), for the purposes of introspection and run-time
    configuration.
  - Make it possible for the application to manually flush a thread's cache, via
    the "tcache.flush" mallctl.
  - Base maximum dirty page count on proportion of active memory.
  - Compute various additional run-time statistics, including per size class
    statistics for large objects.
  - Expose malloc_stats_print(), which can be called repeatedly by the
    application.
  - Simplify the malloc_message() signature to only take one string argument,
    and incorporate an opaque data pointer argument for use by the application
    in combination with malloc_stats_print().
  - Add support for allocation backed by one or more swap files, and allow the
    application to disable over-commit if swap files are in use.
  - Implement allocation profiling and leak checking.

  Removed features:
  - Remove the dynamic arena rebalancing code, since thread-specific caching
    reduces its utility.

  Bug fixes:
  - Modify chunk allocation to work when address space layout randomization
    (ASLR) is in use.
  - Fix thread cleanup bugs related to TLS destruction.
  - Handle 0-size allocation requests in posix_memalign().
  - Fix a chunk leak.  The leaked chunks were never touched, so this impacted
    virtual memory usage, but not physical memory usage.

* linux_2008082[78]a (August 27/28, 2008)

  These snapshot releases are the simple result of incorporating Linux-specific
  support into the FreeBSD malloc sources.

--------------------------------------------------------------------------------
vim:filetype=text:textwidth=80