Following are change highlights associated with official releases.  Important
bug fixes are all mentioned, but some internal enhancements are omitted here for
brevity.  Much more detail can be found in the git revision history:

    https://github.com/jemalloc/jemalloc

* 5.0.0 (June 13, 2017)

  Unlike all previous jemalloc releases, this release does not use naturally
  aligned "chunks" for virtual memory management, and instead uses page-aligned
  "extents".  This change has few externally visible effects, but the internal
  impacts are... extensive.  Many other internal changes combine to make this
  the most cohesively designed version of jemalloc so far, with ample
  opportunity for further enhancements.

  Continuous integration is now an integral aspect of development thanks to the
  efforts of @davidtgoldblatt, and the dev branch tends to remain reasonably
  stable on the tested platforms (Linux, FreeBSD, macOS, and Windows).  As a
  side effect the official release frequency may decrease over time.

  New features:
  - Implement optional per-CPU arena support; threads choose which arena to use
    based on current CPU rather than on fixed thread-->arena associations.
    (@interwq)
  - Implement two-phase decay of unused dirty pages.  Pages transition from
    dirty-->muzzy-->clean, where the first phase transition relies on
    madvise(... MADV_FREE) semantics, and the second phase transition discards
    pages such that they are replaced with demand-zeroed pages on next access.
    (@jasone)
  - Increase decay time resolution from seconds to milliseconds.  (@jasone)
  - Implement opt-in per CPU background threads, and use them for asynchronous
    decay-driven unused dirty page purging.  (@interwq)
  - Add mutex profiling, which collects a variety of statistics useful for
    diagnosing overhead/contention issues.  (@interwq)
  - Add C++ new/delete operator bindings.  (@djwatson)
  - Support manually created arena destruction, such that all data and metadata
    are discarded.  Add MALLCTL_ARENAS_DESTROYED for accessing merged stats
    associated with destroyed arenas.  (@jasone)
  - Add MALLCTL_ARENAS_ALL as a fixed index for use in accessing
    merged/destroyed arena statistics via mallctl.  (@jasone)
  - Add opt.abort_conf to optionally abort if invalid configuration options are
    detected during initialization.  (@interwq)
  - Add opt.stats_print_opts, so that e.g. JSON output can be selected for the
    stats dumped during exit if opt.stats_print is true.  (@jasone)
  - Add --with-version=VERSION for use when embedding jemalloc into another
    project's git repository.  (@jasone)
  - Add --disable-thp to support cross compiling.  (@jasone)
  - Add --with-lg-hugepage to support cross compiling.  (@jasone)
  - Add mallctl interfaces (various authors):
    + background_thread
    + opt.abort_conf
    + opt.retain
    + opt.percpu_arena
    + opt.background_thread
    + opt.{dirty,muzzy}_decay_ms
    + opt.stats_print_opts
    + arena.<i>.initialized
    + arena.<i>.destroy
    + arena.<i>.{dirty,muzzy}_decay_ms
    + arena.<i>.extent_hooks
    + arenas.{dirty,muzzy}_decay_ms
    + arenas.bin.<i>.slab_size
    + arenas.nlextents
    + arenas.lextent.<i>.size
    + arenas.create
    + stats.background_thread.{num_threads,num_runs,run_interval}
    + stats.mutexes.{ctl,background_thread,prof,reset}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.{dirty,muzzy}_decay_ms
    + stats.arenas.<i>.uptime
    + stats.arenas.<i>.{pmuzzy,base,internal,resident}
    + stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}
    + stats.arenas.<i>.bins.<j>.{nslabs,reslabs,curslabs}
    + stats.arenas.<i>.bins.<j>.mutex.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.lextents.<j>.{nmalloc,ndalloc,nrequests,curlextents}
    + stats.arenas.<i>.mutexes.{large,extent_avail,extents_dirty,extents_muzzy,
      extents_retained,decay_dirty,decay_muzzy,base,tcache_list}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}

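  The dirty-->muzzy-->clean progression listed under the 5.0.0 new features
  can be pictured as a small state machine.  The following is a toy sketch
  only: the names page_state_t, toy_page_t, toy_page_mark_dirty, and
  toy_page_tick are hypothetical and do not correspond to jemalloc internals,
  and it assumes a single deadline per phase, which is a simplification of
  whatever schedule the allocator actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page states mirroring the dirty-->muzzy-->clean progression. */
typedef enum { PAGE_DIRTY, PAGE_MUZZY, PAGE_CLEAN } page_state_t;

typedef struct {
	page_state_t state;
	uint64_t since_ms; /* Time at which the page entered its current state. */
} toy_page_t;

/* Record that an unused page became dirty at time now_ms. */
static void
toy_page_mark_dirty(toy_page_t *page, uint64_t now_ms) {
	page->state = PAGE_DIRTY;
	page->since_ms = now_ms;
}

/*
 * Advance a page through at most one transition per call and return its
 * resulting state.  All times are in milliseconds, matching the resolution
 * of the {dirty,muzzy}_decay_ms settings.
 */
static page_state_t
toy_page_tick(toy_page_t *page, uint64_t now_ms, uint64_t dirty_decay_ms,
    uint64_t muzzy_decay_ms) {
	assert(now_ms >= page->since_ms);
	uint64_t age_ms = now_ms - page->since_ms;

	if (page->state == PAGE_DIRTY && age_ms >= dirty_decay_ms) {
		/* Phase 1: where a real allocator would rely on MADV_FREE. */
		page->state = PAGE_MUZZY;
		page->since_ms = now_ms;
	} else if (page->state == PAGE_MUZZY && age_ms >= muzzy_decay_ms) {
		/* Phase 2: discard, so the next access is demand-zeroed. */
		page->state = PAGE_CLEAN;
		page->since_ms = now_ms;
	}
	return page->state;
}
```

  In this model each phase has its own timer, so a page that turned muzzy at
  time T is not discarded before T + muzzy_decay_ms, regardless of how long
  it was dirty beforehand.
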
  Portability improvements:
  - Improve reentrant allocation support, such that deadlock is less likely if
    e.g. a system library call in turn allocates memory.  (@davidtgoldblatt,
    @interwq)
  - Support static linking of jemalloc with glibc.  (@djwatson)

  Optimizations and refactors:
  - Organize virtual memory as "extents" of virtual memory pages, rather than as
    naturally aligned "chunks", and store all metadata in arbitrarily distant
    locations.  This reduces virtual memory external fragmentation, and will
    interact better with huge pages (not yet explicitly supported).  (@jasone)
  - Fold large and huge size classes together; only small and large size classes
    remain.  (@jasone)
  - Unify the allocation paths, and merge most fast-path branching decisions.
    (@davidtgoldblatt, @interwq)
  - Embed per thread automatic tcache into thread-specific data, which reduces
    conditional branches and dereferences.  Also reorganize tcache to increase
    fast-path data locality.  (@interwq)
  - Rewrite atomics to closely model the C11 API, convert various
    synchronization from mutex-based to atomic, and use the explicit memory
    ordering control to resolve various hypothetical races without increasing
    synchronization overhead.  (@davidtgoldblatt)
  - Extensively optimize rtree via various methods:
    + Add multiple layers of rtree lookup caching, since rtree lookups are now
      part of fast-path deallocation.  (@interwq)
    + Determine rtree layout at compile time.  (@jasone)
    + Make the tree shallower for common configurations.  (@jasone)
    + Embed the root node in the top-level rtree data structure, thus avoiding
      one level of indirection.  (@jasone)
    + Further specialize leaf elements as compared to internal node elements,
      and directly embed extent metadata needed for fast-path deallocation.
      (@jasone)
    + Ignore leading always-zero address bits (architecture-specific).
      (@jasone)
  - Reorganize headers (ongoing work) to make them hermetic, and disentangle
    various module dependencies.  (@davidtgoldblatt)
  - Convert various internal data structures such as size class metadata from
    boot-time-initialized to compile-time-initialized.  Propagate resulting data
    structure simplifications, such as making arena metadata fixed-size.
    (@jasone)
  - Simplify size class lookups when constrained to size classes that are
    multiples of the page size.  This speeds lookups, but the primary benefit is
    complexity reduction in code that was the source of numerous regressions.
    (@jasone)
  - Lock individual extents when possible for localized extent operations,
    rather than relying on a top-level arena lock.  (@davidtgoldblatt, @jasone)
  - Use first fit layout policy instead of best fit, in order to improve
    packing.  (@jasone)
  - If munmap(2) is not in use, use an exponential series to grow each arena's
    virtual memory, so that the number of disjoint virtual memory mappings
    remains low.  (@jasone)
  - Implement per arena base allocators, so that arenas never share any virtual
    memory pages.  (@jasone)
  - Automatically generate private symbol name mangling macros.  (@jasone)

  Incompatible changes:
  - Replace chunk hooks with an expanded/normalized set of extent hooks.
    (@jasone)
  - Remove ratio-based purging.  (@jasone)
  - Remove --disable-tcache.  (@jasone)
  - Remove --disable-tls.  (@jasone)
  - Remove --enable-ivsalloc.  (@jasone)
  - Remove --with-lg-size-class-group.  (@jasone)
  - Remove --with-lg-tiny-min.  (@jasone)
  - Remove --disable-cc-silence.  (@jasone)
  - Remove --enable-code-coverage.  (@jasone)
  - Remove --disable-munmap (replaced by opt.retain).  (@jasone)
  - Remove Valgrind support.  (@jasone)
  - Remove quarantine support.  (@jasone)
  - Remove redzone support.  (@jasone)
  - Remove mallctl interfaces (various authors):
    + config.munmap
    + config.tcache
    + config.tls
    + config.valgrind
    + opt.lg_chunk
    + opt.purge
    + opt.lg_dirty_mult
    + opt.decay_time
    + opt.quarantine
    + opt.redzone
    + opt.thp
    + arena.<i>.lg_dirty_mult
    + arena.<i>.decay_time
    + arena.<i>.chunk_hooks
    + arenas.initialized
    + arenas.lg_dirty_mult
    + arenas.decay_time
    + arenas.bin.<i>.run_size
    + arenas.nlruns
    + arenas.lrun.<i>.size
    + arenas.nhchunks
    + arenas.hchunk.<i>.size
    + arenas.extend
    + stats.cactive
    + stats.arenas.<i>.lg_dirty_mult
    + stats.arenas.<i>.decay_time
    + stats.arenas.<i>.metadata.{mapped,allocated}
    + stats.arenas.<i>.{npurge,nmadvise,purged}
    + stats.arenas.<i>.huge.{allocated,nmalloc,ndalloc,nrequests}
    + stats.arenas.<i>.bins.<j>.{nruns,reruns,curruns}
    + stats.arenas.<i>.lruns.<j>.{nmalloc,ndalloc,nrequests,curruns}
    + stats.arenas.<i>.hchunks.<j>.{nmalloc,ndalloc,nrequests,curhchunks}

  Bug fixes:
  - Improve interval-based profile dump triggering to dump only one profile when
    a single allocation's size exceeds the interval.  (@jasone)
  - Use prefixed function names (as controlled by --with-jemalloc-prefix) when
    pruning backtrace frames in jeprof.  (@jasone)

* 4.5.0 (February 28, 2017)

  This is the first release to benefit from much broader continuous integration
  testing, thanks to @davidtgoldblatt.  Had we had this testing infrastructure
  in place for prior releases, it would have caught all of the most serious
  regressions fixed by this release.

  New features:
  - Add --disable-thp and the opt.thp mallctl to provide opt-out mechanisms for
    transparent huge page integration.  (@jasone)
  - Update zone allocator integration to work with macOS 10.12.  (@glandium)
  - Restructure *CFLAGS configuration, so that CFLAGS behaves typically, and
    EXTRA_CFLAGS provides a way to specify e.g. -Werror during building, but not
    during configuration.  (@jasone, @ronawho)

  Bug fixes:
  - Fix DSS (sbrk(2)-based) allocation.  This regression was first released in
    4.3.0.  (@jasone)
  - Handle race in per size class utilization computation.  This functionality
    was first released in 4.0.0.  (@interwq)
  - Fix lock order reversal during gdump.  (@jasone)
  - Fix/refactor tcache synchronization.  This regression was first released in
    4.0.0.  (@jasone)
  - Fix various JSON-formatted malloc_stats_print() bugs.  This functionality
    was first released in 4.3.0.  (@jasone)
  - Fix huge-aligned allocation.  This regression was first released in 4.4.0.
    (@jasone)
  - When transparent huge page integration is enabled, detect what state pages
    start in according to the kernel's current operating mode, and only convert
    arena chunks to non-huge during purging if that is not their initial state.
    This functionality was first released in 4.4.0.  (@jasone)
  - Fix lg_chunk clamping for the --enable-cache-oblivious --disable-fill case.
    This regression was first released in 4.0.0.  (@jasone, @428desmo)
  - Properly detect sparc64 when building for Linux.  (@glaubitz)

* 4.4.0 (December 3, 2016)

  New features:
  - Add configure support for *-*-linux-android.  (@cferris1000, @jasone)
  - Add the --disable-syscall configure option, for use on systems that place
    security-motivated limitations on syscall(2).  (@jasone)
  - Add support for Debian GNU/kFreeBSD.  (@thesam)

  Optimizations:
  - Add extent serial numbers and use them where appropriate as a sort key that
    is higher priority than address, so that the allocation policy prefers older
    extents.  This tends to improve locality (decrease fragmentation) when
    memory grows downward.  (@jasone)
  - Refactor madvise(2) configuration so that MADV_FREE is detected and utilized
    on Linux 4.5 and newer.  (@jasone)
  - Mark partially purged arena chunks as non-huge-page.  This improves
    interaction with Linux's transparent huge page functionality.  (@jasone)

  Bug fixes:
  - Fix size class computations for edge conditions involving extremely large
    allocations.  This regression was first released in 4.0.0.  (@jasone,
    @ingvarha)
  - Remove overly restrictive assertions related to the cactive statistic.  This
    regression was first released in 4.1.0.  (@jasone)
  - Implement a more reliable detection scheme for os_unfair_lock on macOS.
    (@jszakmeister)

* 4.3.1 (November 7, 2016)

  Bug fixes:
  - Fix a severe virtual memory leak.  This regression was first released in
    4.3.0.  (@interwq, @jasone)
  - Refactor atomic and prng APIs to restore support for 32-bit platforms that
    use pre-C11 toolchains, e.g. FreeBSD's mips.  (@jasone)

* 4.3.0 (November 4, 2016)

  This is the first release that passes the test suite for multiple Windows
  configurations, thanks in large part to @glandium setting up continuous
  integration via AppVeyor (and Travis CI for Linux and OS X).

  New features:
  - Add "J" (JSON) support to malloc_stats_print().  (@jasone)
  - Add Cray compiler support.  (@ronawho)

  Optimizations:
  - Add/use adaptive spinning for bootstrapping and radix tree node
    initialization.  (@jasone)

  Bug fixes:
  - Fix large allocation to search starting in the optimal size class heap,
    which can substantially reduce virtual memory churn and fragmentation.  This
    regression was first released in 4.0.0.  (@mjp41, @jasone)
  - Fix stats.arenas.<i>.nthreads accounting.  (@interwq)
  - Fix and simplify decay-based purging.  (@jasone)
  - Make DSS (sbrk(2)-related) operations lockless, which resolves potential
    deadlocks during thread exit.  (@jasone)
  - Fix over-sized allocation of radix tree leaf nodes.  (@mjp41, @ogaun,
    @jasone)
  - Fix over-sized allocation of arena_t (plus associated stats) data
    structures.  (@jasone, @interwq)
  - Fix EXTRA_CFLAGS to not affect configuration.  (@jasone)
  - Fix a Valgrind integration bug.  (@ronawho)
  - Disallow 0x5a junk filling when running in Valgrind.  (@jasone)
  - Fix a file descriptor leak on Linux.  This regression was first released in
    4.2.0.  (@vsarunas, @jasone)
  - Fix static linking of jemalloc with glibc.  (@djwatson)
  - Use syscall(2) rather than {open,read,close}(2) during boot on Linux.  This
    works around other libraries' system call wrappers performing reentrant
    allocation.  (@kspinka, @Whissi, @jasone)
  - Fix OS X default zone replacement to work with OS X 10.12.  (@glandium,
    @jasone)
  - Fix cached memory management to avoid needless commit/decommit operations
    during purging, which resolves permanent virtual memory map fragmentation
    issues on Windows.  (@mjp41, @jasone)
  - Fix TSD fetches to avoid (recursive) allocation.  This is relevant to
    non-TLS and Windows configurations.  (@jasone)
  - Fix malloc_conf overriding to work on Windows.  (@jasone)
  - Forcibly disable lazy-lock on Windows (was forcibly *enabled*).  (@jasone)

* 4.2.1 (June 8, 2016)

  Bug fixes:
  - Fix bootstrapping issues for configurations that require allocation during
    tsd initialization (e.g. --disable-tls).  (@cferris1000, @jasone)
  - Fix gettimeofday() version of nstime_update().  (@ronawho)
  - Fix Valgrind regressions in calloc() and chunk_alloc_wrapper().  (@ronawho)
  - Fix potential VM map fragmentation regression.  (@jasone)
  - Fix opt_zero-triggered in-place huge reallocation zeroing.  (@jasone)
  - Fix heap profiling context leaks in reallocation edge cases.  (@jasone)

* 4.2.0 (May 12, 2016)

  New features:
  - Add the arena.<i>.reset mallctl, which makes it possible to discard all of
    an arena's allocations in a single operation.  (@jasone)
  - Add the stats.retained and stats.arenas.<i>.retained statistics.  (@jasone)
  - Add the --with-version configure option.  (@jasone)
  - Support --with-lg-page values larger than actual page size.  (@jasone)

  Optimizations:
  - Use pairing heaps rather than red-black trees for various hot data
    structures.  (@djwatson, @jasone)
  - Streamline fast paths of rtree operations.  (@jasone)
  - Optimize the fast paths of calloc() and [m,d,sd]allocx().  (@jasone)
  - Decommit unused virtual memory if the OS does not overcommit.  (@jasone)
  - Specify MAP_NORESERVE on Linux if [heuristic] overcommit is active, in order
    to avoid unfortunate interactions during fork(2).  (@jasone)

  Bug fixes:
  - Fix chunk accounting related to triggering gdump profiles.  (@jasone)
  - Link against librt for clock_gettime(2) if glibc < 2.17.  (@jasone)
  - Scale leak report summary according to sampling probability.  (@jasone)

* 4.1.1 (May 3, 2016)

  This bugfix release resolves a variety of mostly minor issues, though the
  bitmap fix is critical for 64-bit Windows.

  Bug fixes:
  - Fix the linear scan version of bitmap_sfu() to shift by the proper amount
    even when sizeof(long) is not the same as sizeof(void *), as on 64-bit
    Windows.  (@jasone)
  - Fix hashing functions to avoid unaligned memory accesses (and resulting
    crashes).  This is relevant at least to some ARM-based platforms.
    (@rkmisra)
  - Fix fork()-related lock rank ordering reversals.  These reversals were
    unlikely to cause deadlocks in practice except when heap profiling was
    enabled and active.  (@jasone)
  - Fix various chunk leaks in OOM code paths.  (@jasone)
  - Fix malloc_stats_print() to print opt.narenas correctly.  (@jasone)
  - Fix MSVC-specific build/test issues.  (@rustyx, @yuslepukhin)
  - Fix a variety of test failures that were due to test fragility rather than
    core bugs.  (@jasone)

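  The unaligned-access entry in the 4.1.1 bug fixes concerns reading
  multi-byte words from keys whose address is not suitably aligned.  The
  following is a minimal sketch of the usual portable technique, copying
  through memcpy() instead of dereferencing a cast pointer.  The helper name
  load_u32_unaligned is hypothetical, and this is not necessarily how
  jemalloc's hash functions are written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit word from an arbitrarily aligned address.  Dereferencing
 * (const uint32_t *)p directly is undefined behavior when p is misaligned
 * and can trap on some platforms; memcpy() has no alignment requirement.
 */
static uint32_t
load_u32_unaligned(const void *p) {
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

  The value is returned in host byte order, exactly as a direct load would
  produce on a platform that tolerates misaligned access.
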
* 4.1.0 (February 28, 2016)

  This release is primarily about optimizations, but it also incorporates a lot
  of portability-motivated refactoring and enhancements.  Many people worked on
  this release, to an extent that even with the omission here of minor changes
  (see git revision history), and of the people who reported and diagnosed
  issues, so much of the work was contributed that starting with this release,
  changes are annotated with author credits to help reflect the collaborative
  effort involved.

  New features:
  - Implement decay-based unused dirty page purging, a major optimization with
    mallctl API impact.  This is an alternative to the existing ratio-based
    unused dirty page purging, and is intended to eventually become the sole
    purging mechanism.  New mallctls:
    + opt.purge
    + opt.decay_time
    + arena.<i>.decay
    + arena.<i>.decay_time
    + arenas.decay_time
    + stats.arenas.<i>.decay_time
    (@jasone, @cevans87)
  - Add --with-malloc-conf, which makes it possible to embed a default
    options string during configuration.  This was motivated by the desire to
    specify --with-malloc-conf=purge:decay, since the default must remain
    purge:ratio until the 5.0.0 release.  (@jasone)
  - Add MS Visual Studio 2015 support.  (@rustyx, @yuslepukhin)
  - Make *allocx() size class overflow behavior defined.  The maximum
    size class is now less than PTRDIFF_MAX to protect applications against
    numerical overflow, and all allocation functions are guaranteed to indicate
    errors rather than potentially crashing if the request size exceeds the
    maximum size class.  (@jasone)
  - jeprof:
    + Add raw heap profile support.  (@jasone)
    + Add --retain and --exclude for backtrace symbol filtering.  (@jasone)

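  The PTRDIFF_MAX guarantee in the 4.1.0 new features means an oversized
  request fails cleanly rather than overflowing.  The following is a minimal
  sketch of the same contract from the caller's side, written against
  standard malloc() so that it stands alone.  The wrapper name
  checked_array_alloc is hypothetical, and the actual maximum size class is
  internal to jemalloc (the entry above says only that it is less than
  PTRDIFF_MAX).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate an array of nelems elements, each elem_size bytes.  Return NULL
 * instead of attempting the allocation if the total byte count would exceed
 * PTRDIFF_MAX, which also rules out size_t wraparound in the multiplication.
 */
static void *
checked_array_alloc(size_t nelems, size_t elem_size) {
	assert(elem_size > 0);
	if (nelems > (size_t)PTRDIFF_MAX / elem_size) {
		return NULL;
	}
	return malloc(nelems * elem_size);
}
```

  Keeping every object smaller than PTRDIFF_MAX ensures that subtracting two
  pointers into the same allocation cannot overflow ptrdiff_t.
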
  Optimizations:
  - Optimize the fast path to combine various bootstrapping and configuration
    checks and execute more streamlined code in the common case.  (@interwq)
  - Use linear scan for small bitmaps (used for small object tracking).  In
    addition to speeding up bitmap operations on 64-bit systems, this reduces
    allocator metadata overhead by approximately 0.2%.  (@djwatson)
  - Separate arena_avail trees, which substantially speeds up run tree
    operations.  (@djwatson)
  - Use memoization (boot-time-computed table) for run quantization.  Separate
    arena_avail trees reduced the importance of this optimization.  (@jasone)
  - Attempt mmap-based in-place huge reallocation.  This can dramatically speed
    up incremental huge reallocation.  (@jasone)

  Incompatible changes:
  - Make opt.narenas unsigned rather than size_t.  (@jasone)

  Bug fixes:
  - Fix stats.cactive accounting regression.  (@rustyx, @jasone)
  - Handle unaligned keys in hash().  This caused problems for some ARM systems.
    (@jasone, @cferris1000)
  - Refactor arenas array.  In addition to fixing a fork-related deadlock, this
    makes arena lookups faster and simpler.  (@jasone)
  - Move retained memory allocation out of the default chunk allocation
    function, to a location that gets executed even if the application installs
    a custom chunk allocation function.  This resolves a virtual memory leak.
    (@buchgr)
  - Fix a potential tsd cleanup leak.  (@cferris1000, @jasone)
  - Fix run quantization.  In practice this bug had no impact unless
    applications requested memory with alignment exceeding one page.
    (@jasone, @djwatson)
  - Fix LinuxThreads-specific bootstrapping deadlock.  (Cosmin Paraschiv)
  - jeprof:
    + Don't discard curl options if timeout is not defined.  (@djwatson)
    + Detect failed profile fetches.  (@djwatson)
  - Fix stats.arenas.<i>.{dss,lg_dirty_mult,decay_time,pactive,pdirty} for
    --disable-stats case.  (@jasone)

* 4.0.4 (October 24, 2015)

  This bugfix release fixes another xallocx() regression.  No other regressions
  have come to light in over a month, so this is likely a good starting point
  for people who prefer to wait for "dot one" releases with all the major issues
  shaken out.

  Bug fixes:
445ba4f5cc0SJason Evans  - Fix xallocx(..., MALLOCX_ZERO) to zero the last full trailing page of large
446ba4f5cc0SJason Evans    allocations that have been randomly assigned an offset of 0 when
447ba4f5cc0SJason Evans    --enable-cache-oblivious configure option is enabled.
448ba4f5cc0SJason Evans
449ba4f5cc0SJason Evans* 4.0.3 (September 24, 2015)
450ba4f5cc0SJason Evans
451ba4f5cc0SJason Evans  This bugfix release continues the trend of xallocx() and heap profiling fixes.
452ba4f5cc0SJason Evans
453ba4f5cc0SJason Evans  Bug fixes:
454ba4f5cc0SJason Evans  - Fix xallocx(..., MALLOCX_ZERO) to zero all trailing bytes of large
455ba4f5cc0SJason Evans    allocations when --enable-cache-oblivious configure option is enabled.
456ba4f5cc0SJason Evans  - Fix xallocx(..., MALLOCX_ZERO) to zero trailing bytes of huge allocations
457ba4f5cc0SJason Evans    when resizing from/to a size class that is not a multiple of the chunk size.
458ba4f5cc0SJason Evans  - Fix prof_tctx_dump_iter() to filter out nodes that were created after heap
459ba4f5cc0SJason Evans    profile dumping started.
460ba4f5cc0SJason Evans  - Work around a potentially bad thread-specific data initialization
461ba4f5cc0SJason Evans    interaction with NPTL (glibc's pthreads implementation).
462ba4f5cc0SJason Evans
463536b3538SJason Evans* 4.0.2 (September 21, 2015)
464536b3538SJason Evans
465536b3538SJason Evans  This bugfix release addresses a few bugs specific to heap profiling.
466536b3538SJason Evans
467536b3538SJason Evans  Bug fixes:
468536b3538SJason Evans  - Fix ixallocx_prof_sample() to never modify nor create sampled small
469536b3538SJason Evans    allocations.  xallocx() is in general incapable of moving small allocations,
470536b3538SJason Evans    so this fix removes buggy code without loss of generality.
471536b3538SJason Evans  - Fix irallocx_prof_sample() to always allocate large regions, even when
472536b3538SJason Evans    alignment is non-zero.
473536b3538SJason Evans  - Fix prof_alloc_rollback() to read tdata from thread-specific data rather
474536b3538SJason Evans    than dereferencing a potentially invalid tctx.
475536b3538SJason Evans
476536b3538SJason Evans* 4.0.1 (September 15, 2015)
477536b3538SJason Evans
478536b3538SJason Evans  This is a bugfix release that is somewhat high risk due to the amount of
479536b3538SJason Evans  refactoring required to address deep xallocx() problems.  As a side effect of
480536b3538SJason Evans  these fixes, xallocx() now tries harder to partially fulfill requests for
481536b3538SJason Evans  optional extra space.  Note that a couple of minor heap profiling
482536b3538SJason Evans  optimizations are included, but these are better thought of as performance
483536b3538SJason Evans    fixes that were integral to discovering most of the other bugs.
484536b3538SJason Evans
485536b3538SJason Evans  Optimizations:
486536b3538SJason Evans  - Avoid a chunk metadata read in arena_prof_tctx_set(), since it is in the
487536b3538SJason Evans    fast path when heap profiling is enabled.  Additionally, split a special
488536b3538SJason Evans    case out into arena_prof_tctx_reset(), which also avoids chunk metadata
489536b3538SJason Evans    reads.
490536b3538SJason Evans  - Optimize irallocx_prof() to optimistically update the sampler state.  The
491536b3538SJason Evans    prior implementation appears to have been a holdover from when
492536b3538SJason Evans    rallocx()/xallocx() functionality was combined as rallocm().
493536b3538SJason Evans
494536b3538SJason Evans  Bug fixes:
495536b3538SJason Evans  - Fix TLS configuration such that it is enabled by default for platforms on
496536b3538SJason Evans    which it works correctly.
497536b3538SJason Evans  - Fix arenas_cache_cleanup() and arena_get_hard() to handle
498536b3538SJason Evans    allocation/deallocation within the application's thread-specific data
499536b3538SJason Evans    cleanup functions even after arenas_cache is torn down.
500536b3538SJason Evans  - Fix xallocx() bugs related to size+extra exceeding HUGE_MAXCLASS.
501536b3538SJason Evans  - Fix chunk purge hook calls for in-place huge shrinking reallocation to
502536b3538SJason Evans    specify the old chunk size rather than the new chunk size.  This bug caused
503536b3538SJason Evans    no correctness issues for the default chunk purge function, but was
504536b3538SJason Evans    visible to custom functions set via the "arena.<i>.chunk_hooks" mallctl.
505536b3538SJason Evans  - Fix heap profiling bugs:
506536b3538SJason Evans    + Fix heap profiling to distinguish among otherwise identical sample sites
507536b3538SJason Evans      with interposed resets (triggered via the "prof.reset" mallctl).  This bug
508536b3538SJason Evans      could cause data structure corruption that would most likely result in a
509536b3538SJason Evans      segfault.
510536b3538SJason Evans    + Fix irealloc_prof() to prof_alloc_rollback() on OOM.
511536b3538SJason Evans    + Make one call to prof_active_get_unlocked() per allocation event, and use
512536b3538SJason Evans      the result throughout the relevant functions that handle an allocation
513536b3538SJason Evans      event.  Also add a missing check in prof_realloc().  These fixes protect
514536b3538SJason Evans      allocation events against concurrent prof_active changes.
515536b3538SJason Evans    + Fix ixallocx_prof() to pass usize_max and zero to ixallocx_prof_sample()
516536b3538SJason Evans      in the correct order.
517536b3538SJason Evans    + Fix prof_realloc() to call prof_free_sampled_object() after calling
518536b3538SJason Evans      prof_malloc_sample_object().  Prior to this fix, if tctx and old_tctx were
519536b3538SJason Evans      the same, the tctx could have been prematurely destroyed.
520536b3538SJason Evans  - Fix portability bugs:
521536b3538SJason Evans    + Don't bitshift by negative amounts when encoding/decoding run sizes in
522536b3538SJason Evans      chunk header maps.  This affected systems with page sizes greater than 8
523536b3538SJason Evans      KiB.
524536b3538SJason Evans    + Rename index_t to szind_t to avoid an existing type on Solaris.
525536b3538SJason Evans    + Add JEMALLOC_CXX_THROW to the memalign() function prototype, in order to
526536b3538SJason Evans      match glibc and avoid compilation errors when including both
527536b3538SJason Evans      jemalloc/jemalloc.h and malloc.h in C++ code.
528536b3538SJason Evans    + Don't assume that /bin/sh is appropriate when running size_classes.sh
529536b3538SJason Evans      during configuration.
530536b3538SJason Evans    + Consider __sparcv9 a synonym for __sparc64__ when defining LG_QUANTUM.
531536b3538SJason Evans    + Link tests to librt if it contains clock_gettime(2).
532536b3538SJason Evans
533d0e79aa3SJason Evans* 4.0.0 (August 17, 2015)
534d0e79aa3SJason Evans
535d0e79aa3SJason Evans  This version contains many speed and space optimizations, both minor and
536d0e79aa3SJason Evans  major.  The major themes are generalization, unification, and simplification.
537d0e79aa3SJason Evans  Although many of these optimizations cause no visible behavior change, their
538d0e79aa3SJason Evans  cumulative effect is substantial.
539d0e79aa3SJason Evans
540d0e79aa3SJason Evans  New features:
541d0e79aa3SJason Evans  - Normalize size class spacing to be consistent across the complete size
542d0e79aa3SJason Evans    range.  By default there are four size classes per size doubling, but this
543d0e79aa3SJason Evans    is now configurable via the --with-lg-size-class-group option.  Also add the
544d0e79aa3SJason Evans    --with-lg-page, --with-lg-page-sizes, --with-lg-quantum, and
545d0e79aa3SJason Evans    --with-lg-tiny-min options, which can be used to tweak page and size class
546d0e79aa3SJason Evans    settings.  Impacts:
547d0e79aa3SJason Evans    + Worst case performance for incrementally growing/shrinking reallocation
548d0e79aa3SJason Evans      is improved because there are far fewer size classes, and therefore
549d0e79aa3SJason Evans      copying happens less often.
550d0e79aa3SJason Evans    + Internal fragmentation is limited to 20% for all but the smallest size
551d0e79aa3SJason Evans      classes (those less than four times the quantum).  (1B + 4 KiB)
552d0e79aa3SJason Evans      and (1B + 4 MiB) previously suffered nearly 50% internal fragmentation.
553d0e79aa3SJason Evans    + Chunk fragmentation tends to be lower because there are fewer distinct run
554d0e79aa3SJason Evans      sizes to pack.
555d0e79aa3SJason Evans  - Add support for explicit tcaches.  The "tcache.create", "tcache.flush", and
556d0e79aa3SJason Evans    "tcache.destroy" mallctls control tcache lifetime and flushing, and the
557d0e79aa3SJason Evans    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to the *allocx() API
558d0e79aa3SJason Evans    control which tcache is used for each operation.
559d0e79aa3SJason Evans  - Implement per thread heap profiling, as well as the ability to
560d0e79aa3SJason Evans    enable/disable heap profiling on a per thread basis.  Add the "prof.reset",
561d0e79aa3SJason Evans    "prof.lg_sample", "thread.prof.name", "thread.prof.active",
562d0e79aa3SJason Evans    "opt.prof_thread_active_init", and "prof.thread_active_init"
563d0e79aa3SJason Evans    mallctls.
564d0e79aa3SJason Evans  - Add support for per arena application-specified chunk allocators, configured
565d0e79aa3SJason Evans    via the "arena.<i>.chunk_hooks" mallctl.
566d0e79aa3SJason Evans  - Refactor huge allocation to be managed by arenas, so that arenas now
567d0e79aa3SJason Evans    function as general purpose independent allocators.  This is important in
568d0e79aa3SJason Evans    the context of user-specified chunk allocators, aside from the scalability
569d0e79aa3SJason Evans    benefits.  Related new statistics:
570d0e79aa3SJason Evans    + The "stats.arenas.<i>.huge.allocated", "stats.arenas.<i>.huge.nmalloc",
571d0e79aa3SJason Evans      "stats.arenas.<i>.huge.ndalloc", and "stats.arenas.<i>.huge.nrequests"
572d0e79aa3SJason Evans      mallctls provide high level per arena huge allocation statistics.
573d0e79aa3SJason Evans    + The "arenas.nhchunks", "arenas.hchunk.<i>.size",
574d0e79aa3SJason Evans      "stats.arenas.<i>.hchunks.<j>.nmalloc",
575d0e79aa3SJason Evans      "stats.arenas.<i>.hchunks.<j>.ndalloc",
576d0e79aa3SJason Evans      "stats.arenas.<i>.hchunks.<j>.nrequests", and
577d0e79aa3SJason Evans      "stats.arenas.<i>.hchunks.<j>.curhchunks" mallctls provide per size class
578d0e79aa3SJason Evans      statistics.
579d0e79aa3SJason Evans  - Add the 'util' column to malloc_stats_print() output, which reports the
580d0e79aa3SJason Evans    proportion of available regions that are currently in use for each small
581d0e79aa3SJason Evans    size class.
582d0e79aa3SJason Evans  - Add "alloc" and "free" modes for junk filling (see the "opt.junk"
583d0e79aa3SJason Evans    mallctl), so that it is possible to separately enable junk filling for
584d0e79aa3SJason Evans    allocation versus deallocation.
585d0e79aa3SJason Evans  - Add the jemalloc-config script, which provides information about how
586d0e79aa3SJason Evans    jemalloc was configured, and how to integrate it into application builds.
587d0e79aa3SJason Evans  - Add metadata statistics, which are accessible via the "stats.metadata",
588d0e79aa3SJason Evans    "stats.arenas.<i>.metadata.mapped", and
589d0e79aa3SJason Evans    "stats.arenas.<i>.metadata.allocated" mallctls.
590d0e79aa3SJason Evans  - Add the "stats.resident" mallctl, which reports the upper limit of
591d0e79aa3SJason Evans    physically resident memory mapped by the allocator.
592d0e79aa3SJason Evans  - Add per arena control over unused dirty page purging, via the
593d0e79aa3SJason Evans    "arenas.lg_dirty_mult", "arena.<i>.lg_dirty_mult", and
594d0e79aa3SJason Evans    "stats.arenas.<i>.lg_dirty_mult" mallctls.
595d0e79aa3SJason Evans  - Add the "prof.gdump" mallctl, which makes it possible to toggle the gdump
596d0e79aa3SJason Evans    feature on/off during program execution.
597d0e79aa3SJason Evans  - Add sdallocx(), which implements sized deallocation.  The primary
598d0e79aa3SJason Evans    optimization over dallocx() is the removal of a metadata read, which often
599d0e79aa3SJason Evans    suffers an L1 cache miss.
600d0e79aa3SJason Evans  - Add missing header includes in jemalloc/jemalloc.h, so that applications
601d0e79aa3SJason Evans    only have to #include <jemalloc/jemalloc.h>.
602d0e79aa3SJason Evans  - Add support for additional platforms:
603d0e79aa3SJason Evans    + Bitrig
604d0e79aa3SJason Evans    + Cygwin
605d0e79aa3SJason Evans    + DragonFlyBSD
606d0e79aa3SJason Evans    + iOS
607d0e79aa3SJason Evans    + OpenBSD
608d0e79aa3SJason Evans    + OpenRISC/or1k
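
  The size class normalization listed first under "New features" can be
  illustrated with a simplified model.  This is a sketch rather than
  jemalloc's implementation: the function names are hypothetical, the quantum
  of 16 bytes is an assumption, and sizes at or below four times the quantum
  are excluded because the ChangeLog treats those classes separately.

```python
QUANTUM = 16             # assumed quantum in bytes; not stated in this ChangeLog
LG_SIZE_CLASS_GROUP = 2  # default: 2**2 = four size classes per size doubling


def size_class(size, lg_group=LG_SIZE_CLASS_GROUP, quantum=QUANTUM):
    """Round size up to its size class under the simplified model."""
    if size <= (quantum << lg_group):
        raise ValueError("model only covers sizes above the smallest classes")
    # base is the largest power of two strictly below size, so that
    # base < size <= 2 * base.
    base = 1 << ((size - 1).bit_length() - 1)
    # Each doubling (base, 2 * base] is split into 2**lg_group equal steps.
    delta = base >> lg_group
    # Round size up to the next multiple of delta.
    return -(-size // delta) * delta


def internal_fragmentation(size):
    """Fraction of the size class that a request of this size leaves unused."""
    cls = size_class(size)
    return (cls - size) / cls


if __name__ == "__main__":
    # The two requests singled out above: 1 B + 4 KiB and 1 B + 4 MiB.
    for request in (4096 + 1, 4 * 1024 * 1024 + 1):
        print(request, size_class(request), internal_fragmentation(request))
```

  Under this model a request of 1 B + 4 KiB lands in a 5 KiB class instead of
  being rounded up to 8 KiB, which is where the bound of 20% comes from.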
609d0e79aa3SJason Evans
610d0e79aa3SJason Evans  Optimizations:
611d0e79aa3SJason Evans  - Maintain dirty runs in per arena LRUs rather than in per arena trees of
612d0e79aa3SJason Evans    dirty-run-containing chunks.  In practice this change significantly reduces
613d0e79aa3SJason Evans    dirty page purging volume.
614d0e79aa3SJason Evans  - Integrate whole chunks into the unused dirty page purging machinery.  This
615d0e79aa3SJason Evans    reduces the cost of repeated huge allocation/deallocation, because it
616d0e79aa3SJason Evans    effectively introduces a cache of chunks.
617d0e79aa3SJason Evans  - Split the arena chunk map into two separate arrays, in order to increase
618d0e79aa3SJason Evans    cache locality for the frequently accessed bits.
619d0e79aa3SJason Evans  - Move small run metadata out of runs, into arena chunk headers.  This reduces
620d0e79aa3SJason Evans    run fragmentation, smaller runs reduce external fragmentation for small size
621d0e79aa3SJason Evans    classes, and packed (less uniformly aligned) metadata layout improves CPU
622d0e79aa3SJason Evans    cache set distribution.
623d0e79aa3SJason Evans  - Randomly distribute large allocation base pointer alignment relative to page
624d0e79aa3SJason Evans    boundaries in order to more uniformly utilize CPU cache sets.  This can be
625d0e79aa3SJason Evans    disabled via the --disable-cache-oblivious configure option, and queried via
626d0e79aa3SJason Evans    the "config.cache_oblivious" mallctl.
627d0e79aa3SJason Evans  - Micro-optimize the fast paths for the public API functions.
628d0e79aa3SJason Evans  - Refactor thread-specific data to reside in a single structure.  This assures
629d0e79aa3SJason Evans    that only a single TLS read is necessary per call into the public API.
630d0e79aa3SJason Evans  - Implement in-place huge allocation growing and shrinking.
631d0e79aa3SJason Evans  - Refactor rtree (radix tree for chunk lookups) to be lock-free, and make
632d0e79aa3SJason Evans    additional optimizations that reduce maximum lookup depth to one or two
633d0e79aa3SJason Evans    levels.  This resolves what was a concurrency bottleneck for per arena huge
634d0e79aa3SJason Evans    allocation, because a global data structure is critical for determining
635d0e79aa3SJason Evans    which arenas own which huge allocations.
636d0e79aa3SJason Evans
637d0e79aa3SJason Evans  Incompatible changes:
638d0e79aa3SJason Evans  - Replace --enable-cc-silence with --disable-cc-silence to suppress spurious
639d0e79aa3SJason Evans    warnings by default.
640d0e79aa3SJason Evans  - Assure that the constness of malloc_usable_size()'s return type matches that
641d0e79aa3SJason Evans    of the system implementation.
642d0e79aa3SJason Evans  - Change the heap profile dump format to support per thread heap profiling,
643d0e79aa3SJason Evans    rename pprof to jeprof, and enhance it with the --thread=<n> option.  As a
644d0e79aa3SJason Evans    result, the bundled jeprof must now be used rather than the upstream
645d0e79aa3SJason Evans    (gperftools) pprof.
646d0e79aa3SJason Evans  - Disable "opt.prof_final" by default, in order to avoid atexit(3), which can
647d0e79aa3SJason Evans    internally deadlock on some platforms.
648d0e79aa3SJason Evans  - Change the "arenas.nlruns" mallctl type from size_t to unsigned.
649d0e79aa3SJason Evans  - Replace the "stats.arenas.<i>.bins.<j>.allocated" mallctl with
650d0e79aa3SJason Evans    "stats.arenas.<i>.bins.<j>.curregs".
651d0e79aa3SJason Evans  - Ignore MALLOC_CONF in set{uid,gid,cap} binaries.
652d0e79aa3SJason Evans  - Ignore MALLOCX_ARENA(a) in dallocx(), in favor of using the
653d0e79aa3SJason Evans    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to control tcache usage.
654d0e79aa3SJason Evans
655d0e79aa3SJason Evans  Removed features:
656d0e79aa3SJason Evans  - Remove the *allocm() API, which is superseded by the *allocx() API.
657d0e79aa3SJason Evans  - Remove the --enable-dss option, and make dss non-optional on all platforms
658d0e79aa3SJason Evans    which support sbrk(2).
659d0e79aa3SJason Evans  - Remove the "arenas.purge" mallctl, which was obsoleted by the
660d0e79aa3SJason Evans    "arena.<i>.purge" mallctl in 3.1.0.
661d0e79aa3SJason Evans  - Remove the unnecessary "opt.valgrind" mallctl; jemalloc automatically
662d0e79aa3SJason Evans    detects whether it is running inside Valgrind.
663d0e79aa3SJason Evans  - Remove the "stats.huge.allocated", "stats.huge.nmalloc", and
664d0e79aa3SJason Evans    "stats.huge.ndalloc" mallctls.
665d0e79aa3SJason Evans  - Remove the --enable-mremap option.
666d0e79aa3SJason Evans  - Remove the "stats.chunks.current", "stats.chunks.total", and
667d0e79aa3SJason Evans    "stats.chunks.high" mallctls.
668d0e79aa3SJason Evans
669d0e79aa3SJason Evans  Bug fixes:
670d0e79aa3SJason Evans  - Fix the cactive statistic to decrease (rather than increase) when active
671d0e79aa3SJason Evans    memory decreases.  This regression was first released in 3.5.0.
672d0e79aa3SJason Evans  - Fix OOM handling in memalign() and valloc().  A variant of this bug existed
673d0e79aa3SJason Evans    in all releases since 2.0.0, which introduced these functions.
674d0e79aa3SJason Evans  - Fix an OOM-related regression in arena_tcache_fill_small(), which could
675d0e79aa3SJason Evans    cause cache corruption on OOM.  This regression was present in all releases
676d0e79aa3SJason Evans    from 2.2.0 through 3.6.0.
677d0e79aa3SJason Evans  - Fix size class overflow handling for malloc(), posix_memalign(), memalign(),
678d0e79aa3SJason Evans    calloc(), and realloc() when profiling is enabled.
679d0e79aa3SJason Evans  - Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
680d0e79aa3SJason Evans    "secondary" precedence is specified, but sbrk(2) is not supported.
681d0e79aa3SJason Evans  - Fix fallback lg_floor() implementations to handle extremely large inputs.
682d0e79aa3SJason Evans  - Ensure the default purgeable zone is after the default zone on OS X.
683d0e79aa3SJason Evans  - Fix latent bugs in atomic_*().
684d0e79aa3SJason Evans  - Fix the "arena.<i>.dss" mallctl to handle read-only calls.
685d0e79aa3SJason Evans  - Fix tls_model configuration to enable the initial-exec model when possible.
686d0e79aa3SJason Evans  - Mark malloc_conf as a weak symbol so that the application can override it.
687d0e79aa3SJason Evans  - Correctly detect glibc's adaptive pthread mutexes.
688d0e79aa3SJason Evans  - Fix the --without-export configure option.
689d0e79aa3SJason Evans
6902fff27f8SJason Evans* 3.6.0 (March 31, 2014)
6912fff27f8SJason Evans
6922fff27f8SJason Evans  This version contains a critical bug fix for a regression present in 3.5.0 and
6932fff27f8SJason Evans  3.5.1.
6942fff27f8SJason Evans
6952fff27f8SJason Evans  Bug fixes:
6962fff27f8SJason Evans  - Fix a regression in arena_chunk_alloc() that caused crashes during
6972fff27f8SJason Evans    small/large allocation if chunk allocation failed.  In the absence of this
6982fff27f8SJason Evans    bug, chunk allocation failure would result in allocation failure, e.g. a NULL
6992fff27f8SJason Evans    return from malloc().  This regression was introduced in 3.5.0.
7002fff27f8SJason Evans  - Fix gcc intrinsics-based backtracing by specifying
7012fff27f8SJason Evans    -fno-omit-frame-pointer to gcc.  Note that the application (and all the
7022fff27f8SJason Evans    libraries it links to) must also be compiled with this option for
7032fff27f8SJason Evans    backtracing to be reliable.
7042fff27f8SJason Evans  - Use dss allocation precedence for huge allocations as well as small/large
7052fff27f8SJason Evans    allocations.
706d0e79aa3SJason Evans  - Fix test assertion failure message formatting.  This bug did not manifest on
7072fff27f8SJason Evans    x86_64 systems because of implementation subtleties in va_list.
7082fff27f8SJason Evans  - Fix inconsequential test failures for hash and SFMT code.
7092fff27f8SJason Evans
7102fff27f8SJason Evans  New features:
7112fff27f8SJason Evans  - Support heap profiling on FreeBSD.  This feature depends on the proc
7122fff27f8SJason Evans    filesystem being mounted during heap profile dumping.
7132fff27f8SJason Evans
714706d9bd1SJason Evans* 3.5.1 (February 25, 2014)
715706d9bd1SJason Evans
716706d9bd1SJason Evans  This version primarily addresses minor bugs in test code.
717706d9bd1SJason Evans
718706d9bd1SJason Evans  Bug fixes:
719706d9bd1SJason Evans  - Configure Solaris/Illumos to use MADV_FREE.
720706d9bd1SJason Evans  - Fix junk filling for mremap(2)-based huge reallocation.  This is only
721706d9bd1SJason Evans    relevant if configuring with the --enable-mremap option specified.
722706d9bd1SJason Evans  - Avoid compilation failure if 'restrict' C99 keyword is not supported by the
723706d9bd1SJason Evans    compiler.
724706d9bd1SJason Evans  - Add a configure test for SSE2 rather than assuming it is usable on i686
725706d9bd1SJason Evans    systems.  This fixes test compilation errors, especially on 32-bit Linux
726706d9bd1SJason Evans    systems.
727706d9bd1SJason Evans  - Fix mallctl argument size mismatches (size_t vs. uint64_t) in the stats unit
728706d9bd1SJason Evans    test.
729706d9bd1SJason Evans  - Fix/remove flawed alignment-related overflow tests.
730706d9bd1SJason Evans  - Prevent compiler optimizations that could change backtraces in the
731706d9bd1SJason Evans    prof_accum unit test.
732a4bd5210SJason Evans
733f921d10fSJason Evans* 3.5.0 (January 22, 2014)
734f921d10fSJason Evans
735f921d10fSJason Evans  This version focuses on refactoring and automated testing, though it also
736f921d10fSJason Evans  includes some non-trivial heap profiling optimizations not mentioned below.
737f921d10fSJason Evans
738f921d10fSJason Evans  New features:
739f921d10fSJason Evans  - Add the *allocx() API, which is a successor to the experimental *allocm()
740f921d10fSJason Evans    API.  The *allocx() functions are slightly simpler to use because they have
741f921d10fSJason Evans    fewer parameters, they directly return the results of primary interest, and
742f921d10fSJason Evans    mallocx()/rallocx() avoid the strict aliasing pitfall that
743706d9bd1SJason Evans    allocm()/rallocm() share with posix_memalign().  Note that *allocm() is
744f921d10fSJason Evans    slated for removal in the next non-bugfix release.
745f921d10fSJason Evans  - Add support for LinuxThreads.
746f921d10fSJason Evans
747f921d10fSJason Evans  Bug fixes:
748f921d10fSJason Evans  - Unless heap profiling is enabled, disable floating point code and don't link
749f921d10fSJason Evans    with libm.  This, in combination with e.g. EXTRA_CFLAGS=-mno-sse on x64
750f921d10fSJason Evans    systems, makes it possible to completely disable floating point register
751f921d10fSJason Evans    use.  Some versions of glibc neglect to save/restore caller-saved floating
752f921d10fSJason Evans    point registers during dynamic lazy symbol loading, and the symbol loading
753f921d10fSJason Evans    code uses whatever malloc the application happens to have linked/loaded
754f921d10fSJason Evans    with, the result being potential floating point register corruption.
755f921d10fSJason Evans  - Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
756f921d10fSJason Evans    backtrace creation in imemalign().  This bug impacted posix_memalign() and
757f921d10fSJason Evans    aligned_alloc().
758f921d10fSJason Evans  - Fix a file descriptor leak in a prof_dump_maps() error path.
759f921d10fSJason Evans  - Fix prof_dump() to close the dump file descriptor for all relevant error
760f921d10fSJason Evans    paths.
761f921d10fSJason Evans  - Fix rallocm() to use the arena specified by the ALLOCM_ARENA(s) flag for
762f921d10fSJason Evans    allocation, not just deallocation.
763f921d10fSJason Evans  - Fix a data race for large allocation stats counters.
764f921d10fSJason Evans  - Fix a potential infinite loop during thread exit.  This bug occurred on
765f921d10fSJason Evans    Solaris, and could affect other platforms with similar pthreads TSD
766f921d10fSJason Evans    implementations.
767f921d10fSJason Evans  - Don't junk-fill reallocations unless usable size changes.  This fixes a
768f921d10fSJason Evans    violation of the *allocx()/*allocm() semantics.
769f921d10fSJason Evans  - Fix growing large reallocation to junk fill new space.
770f921d10fSJason Evans  - Fix huge deallocation to junk fill when munmap is disabled.
771f921d10fSJason Evans  - Change the default private namespace prefix from empty to je_, and change
772f921d10fSJason Evans    --with-private-namespace-prefix so that it prepends an additional prefix
773f921d10fSJason Evans    rather than replacing je_.  This reduces the likelihood of applications
774f921d10fSJason Evans    which statically link jemalloc experiencing symbol name collisions.
775f921d10fSJason Evans  - Add missing private namespace mangling (relevant when
776f921d10fSJason Evans    --with-private-namespace is specified).
777f921d10fSJason Evans  - Add and use JEMALLOC_INLINE_C so that static inline functions are marked as
778f921d10fSJason Evans    static even for debug builds.
779f921d10fSJason Evans  - Add a missing mutex unlock in a malloc_init_hard() error path.  In practice
780f921d10fSJason Evans    this error path is never executed.
781f921d10fSJason Evans  - Fix numerous bugs in malloc_strtoumax() error handling/reporting.  These
782f921d10fSJason Evans    bugs had no impact except for malformed inputs.
783f921d10fSJason Evans  - Fix numerous bugs in malloc_snprintf().  These bugs were not exercised by
784f921d10fSJason Evans    existing calls, so they had no impact.
785f921d10fSJason Evans
7862b06b201SJason Evans* 3.4.1 (October 20, 2013)
7872b06b201SJason Evans
7882b06b201SJason Evans  Bug fixes:
7892b06b201SJason Evans  - Fix a race in the "arenas.extend" mallctl that could cause memory corruption
7902b06b201SJason Evans    of internal data structures and subsequent crashes.
7912b06b201SJason Evans  - Fix Valgrind integration flaws that caused Valgrind warnings about reads of
7922b06b201SJason Evans    uninitialized memory in:
7932b06b201SJason Evans    + arena chunk headers
7942b06b201SJason Evans    + internal zero-initialized data structures (relevant to tcache and prof
7952b06b201SJason Evans      code)
7962b06b201SJason Evans  - Preserve errno during the first allocation.  A readlink(2) call during
7972b06b201SJason Evans    initialization fails unless /etc/malloc.conf exists, so errno was typically
7982b06b201SJason Evans    set during the first allocation prior to this fix.
7992b06b201SJason Evans  - Fix compilation warnings reported by gcc 4.8.1.
8002b06b201SJason Evans
801f8ca2db1SJason Evans* 3.4.0 (June 2, 2013)
802f8ca2db1SJason Evans
803f8ca2db1SJason Evans  This version is essentially a small bugfix release, but the addition of
804f8ca2db1SJason Evans  aarch64 support requires that the minor version be incremented.
805f8ca2db1SJason Evans
806f8ca2db1SJason Evans  Bug fixes:
807f8ca2db1SJason Evans  - Fix race-triggered deadlocks in chunk_record().  These deadlocks were
808f8ca2db1SJason Evans    typically triggered by multiple threads concurrently deallocating huge
809f8ca2db1SJason Evans    objects.
810f8ca2db1SJason Evans
811f8ca2db1SJason Evans  New features:
812f8ca2db1SJason Evans  - Add support for the aarch64 architecture.
813f8ca2db1SJason Evans
814f8ca2db1SJason Evans* 3.3.1 (March 6, 2013)
815f8ca2db1SJason Evans
816f8ca2db1SJason Evans  This version fixes bugs that are typically encountered only when utilizing
817f8ca2db1SJason Evans  custom run-time options.
818f8ca2db1SJason Evans
819f8ca2db1SJason Evans  Bug fixes:
820f8ca2db1SJason Evans  - Fix a locking order bug that could cause deadlock during fork if heap
821f8ca2db1SJason Evans    profiling were enabled.
822f8ca2db1SJason Evans  - Fix a chunk recycling bug that could cause the allocator to lose track of
823f8ca2db1SJason Evans    whether a chunk was zeroed.  On FreeBSD, NetBSD, and OS X, it could cause
824f8ca2db1SJason Evans    corruption if allocating via sbrk(2) (unlikely unless running with the
825f8ca2db1SJason Evans    "dss:primary" option specified).  This was completely harmless on Linux
826f8ca2db1SJason Evans    unless using mlockall(2) (and unlikely even then, unless the
827f8ca2db1SJason Evans    --disable-munmap configure option or the "dss:primary" option was
828f8ca2db1SJason Evans    specified).  This regression was introduced in 3.1.0 by the
829f8ca2db1SJason Evans    mlockall(2)/madvise(2) interaction fix.
830f8ca2db1SJason Evans  - Fix TLS-related memory corruption that could occur during thread exit if the
831f8ca2db1SJason Evans    thread never allocated memory.  Only the quarantine and prof facilities were
832f8ca2db1SJason Evans    susceptible.
833f8ca2db1SJason Evans  - Fix two quarantine bugs:
834f8ca2db1SJason Evans    + Internal reallocation of the quarantined object array leaked the old
835f8ca2db1SJason Evans      array.
836f8ca2db1SJason Evans    + Reallocation failure for internal reallocation of the quarantined object
837f8ca2db1SJason Evans      array (very unlikely) resulted in memory corruption.
838f8ca2db1SJason Evans  - Fix Valgrind integration to annotate all internally allocated memory in a
839f8ca2db1SJason Evans    way that keeps Valgrind happy about internal data structure access.
840f8ca2db1SJason Evans  - Fix building for s390 systems.
841f8ca2db1SJason Evans
84288ad2f8dSJason Evans* 3.3.0 (January 23, 2013)
84388ad2f8dSJason Evans
84488ad2f8dSJason Evans  This version includes a few minor performance improvements in addition to the
84588ad2f8dSJason Evans  listed new features and bug fixes.
84688ad2f8dSJason Evans
84788ad2f8dSJason Evans  New features:
84888ad2f8dSJason Evans  - Add clipping support to lg_chunk option processing.
84988ad2f8dSJason Evans  - Add the --enable-ivsalloc option.
85088ad2f8dSJason Evans  - Add the --without-export option.
85188ad2f8dSJason Evans  - Add the --disable-zone-allocator option.
85288ad2f8dSJason Evans
85388ad2f8dSJason Evans  Bug fixes:
85488ad2f8dSJason Evans  - Fix "arenas.extend" mallctl to output the number of arenas.
8552b06b201SJason Evans  - Fix chunk_recycle() to unconditionally inform Valgrind that returned memory
85688ad2f8dSJason Evans    is undefined.
85788ad2f8dSJason Evans  - Fix build break on FreeBSD related to alloca.h.
85888ad2f8dSJason Evans
85982872ac0SJason Evans* 3.2.0 (November 9, 2012)
86082872ac0SJason Evans
86182872ac0SJason Evans  In addition to a couple of bug fixes, this version modifies page run
86282872ac0SJason Evans  allocation and dirty page purging algorithms in order to better control
86382872ac0SJason Evans  page-level virtual memory fragmentation.
86482872ac0SJason Evans
86582872ac0SJason Evans  Incompatible changes:
86682872ac0SJason Evans  - Change the "opt.lg_dirty_mult" default from 5 to 3 (32:1 to 8:1).
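
  The ratios in parentheses follow from "opt.lg_dirty_mult" being the base 2
  logarithm of the minimum active:dirty page ratio.  A minimal sketch of that
  arithmetic, assuming a negative value disables purging; the helper name is
  hypothetical and this is not the allocator's actual purging code:

```python
def dirty_page_limit(active_pages, lg_dirty_mult):
    """Unused dirty pages tolerated for a given number of active pages.

    The active:dirty ratio is 2**lg_dirty_mult : 1, so the limit is
    active_pages / 2**lg_dirty_mult.  A negative setting is treated here
    as "no limit" (purging disabled), which is an assumption of this sketch.
    """
    if lg_dirty_mult < 0:
        return None
    return active_pages >> lg_dirty_mult


if __name__ == "__main__":
    # Compare the old default (5, i.e. 32:1) with the new one (3, i.e. 8:1).
    for lg in (5, 3):
        print(lg, 1 << lg, dirty_page_limit(1024, lg))
```

  With 1024 active pages, the old default tolerates 32 unused dirty pages and
  the new default tolerates 128, so purging is triggered less eagerly.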

  Bug fixes:
  - Fix dss/mmap allocation precedence code to use recyclable mmap memory only
    after primary dss allocation fails.
  - Fix deadlock in the "arenas.purge" mallctl.  This regression was introduced
    in 3.1.0 by the addition of the "arena.<i>.purge" mallctl.

* 3.1.0 (October 16, 2012)

  New features:
  - Auto-detect whether running inside Valgrind, thus removing the need to
    manually specify MALLOC_CONF=valgrind:true.
  - Add the "arenas.extend" mallctl, which allows applications to create
    manually managed arenas.
  - Add the ALLOCM_ARENA() flag for {,r,d}allocm().
  - Add the "opt.dss", "arena.<i>.dss", and "stats.arenas.<i>.dss" mallctls,
    which provide control over dss/mmap precedence.
  - Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".
  - Define LG_QUANTUM for hppa.

  Incompatible changes:
  - Disable tcache by default if running inside Valgrind, in order to avoid
    making unallocated objects appear reachable to Valgrind.
  - Drop const from malloc_usable_size() argument on Linux.

  Bug fixes:
  - Fix heap profiling crash if sampled object is freed via realloc(p, 0).
  - Remove const from __*_hook variable declarations, so that glibc can modify
    them during process forking.
  - Fix mlockall(2)/madvise(2) interaction.
  - Fix fork(2)-related deadlocks.
  - Fix error return value for "thread.tcache.enabled" mallctl.
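
  Fork-related deadlocks of the kind listed above typically arise when a child
  process inherits a lock that some other thread held at the moment of
  fork(2).  One common mitigation is pthread_atfork(3).  The following is a
  minimal sketch of that general pattern, not jemalloc's actual code;
  alloc_lock stands in for an allocator-internal mutex and every name is
  hypothetical:

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for an allocator-internal lock. */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Acquire the lock before fork() so no other thread holds it mid-fork. */
static void prefork(void) { pthread_mutex_lock(&alloc_lock); }
/* Release it again on both sides once fork() has returned. */
static void postfork_parent(void) { pthread_mutex_unlock(&alloc_lock); }
static void postfork_child(void) { pthread_mutex_unlock(&alloc_lock); }

/* Register the handlers; returns 0 on success. */
int
install_fork_handlers(void)
{
    return pthread_atfork(prefork, postfork_parent, postfork_child);
}

/*
 * Fork, take and release the lock in the child, and return the child's exit
 * status (0 on success) or -1 on error.
 */
int
fork_and_lock_in_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_mutex_lock(&alloc_lock);
        pthread_mutex_unlock(&alloc_lock);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

  Holding the lock across fork() guarantees that the child sees it in a known
  state; an allocator with many locks would need to acquire them in a fixed
  order in the prepare handler.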

* 3.0.0 (May 11, 2012)

  Although this version adds some major new features, the primary focus is on
  internal code cleanup that facilitates maintainability and portability, most
  of which is not reflected in the ChangeLog.  This is the first release to
  incorporate substantial contributions from numerous other developers, and the
  result is a more broadly useful allocator (see the git revision history for
  contribution details).  Note that the license has been unified, thanks to
  Facebook granting a license under the same terms as the other copyright
  holders (see COPYING).

  New features:
  - Implement Valgrind support, redzones, and quarantine.
  - Add support for additional platforms:
    + FreeBSD
    + Mac OS X Lion
    + MinGW
    + Windows (no support yet for replacing the system malloc)
  - Add support for additional architectures:
    + MIPS
    + SH4
    + Tilera
  - Add support for cross compiling.
  - Add nallocm(), which rounds a request size up to the nearest size class
    without actually allocating.
  - Implement aligned_alloc() (blame C11).
  - Add the "thread.tcache.enabled" mallctl.
  - Add the "opt.prof_final" mallctl.
  - Update pprof (from gperftools 2.0).
  - Add the --with-mangling option.
  - Add the --disable-experimental option.
  - Add the --disable-munmap option, and make it the default on Linux.
  - Add the --enable-mremap option; use of mremap(2) is now disabled by
    default.
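
  The aligned_alloc() interface mentioned above is the standard C11 one.  A
  minimal usage sketch, assuming a C11 libc; the wrapper names are
  hypothetical.  C11 requires the size to be a multiple of the alignment, so
  the wrapper rounds up first:

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper around C11 aligned_alloc().  alignment must be a
 * power of two; size is rounded up to a multiple of it.  Returns NULL on
 * overflow or allocation failure.
 */
void *
alloc_aligned(size_t alignment, size_t size)
{
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);

    if (rounded < size)
        return NULL;    /* Rounding overflowed. */
    return aligned_alloc(alignment, rounded);
}

/* Nonzero if p is aligned to alignment (a power of two). */
int
is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

  Memory obtained this way is released with free(), as for malloc().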

  Incompatible changes:
  - Enable stats by default.
  - Enable fill by default.
  - Disable lazy locking by default.
  - Rename the "tcache.flush" mallctl to "thread.tcache.flush".
  - Rename the "arenas.pagesize" mallctl to "arenas.page".
  - Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).
  - Change the "opt.prof_accum" default from true to false.

  Removed features:
  - Remove the swap feature, including the "config.swap", "swap.avail",
    "swap.prezeroed", "swap.nfds", and "swap.fds" mallctls.
  - Remove highruns statistics, including the
    "stats.arenas.<i>.bins.<j>.highruns" and
    "stats.arenas.<i>.lruns.<j>.highruns" mallctls.
  - As part of small size class refactoring, remove the "opt.lg_[qc]space_max",
    "arenas.cacheline", "arenas.subpage", "arenas.[tqcs]space_{min,max}", and
    "arenas.[tqcs]bins" mallctls.
  - Remove the "arenas.chunksize" mallctl.
  - Remove the "opt.lg_prof_tcmax" option.
  - Remove the "opt.lg_prof_bt_max" option.
  - Remove the "opt.lg_tcache_gc_sweep" option.
  - Remove the --disable-tiny option, including the "config.tiny" mallctl.
  - Remove the --enable-dynamic-page-shift configure option.
  - Remove the --enable-sysv configure option.

  Bug fixes:
  - Fix a statistics-related bug in the "thread.arena" mallctl that could cause
    invalid statistics and crashes.
  - Work around TLS deallocation via free() on Linux.  This bug could cause
    write-after-free memory corruption.
  - Fix a potential deadlock that could occur during interval- and
    growth-triggered heap profile dumps.
  - Fix large calloc() zeroing bugs due to dropping chunk map unzeroed flags.
  - Fix chunk_alloc_dss() to stop claiming memory is zeroed.  This bug could
    cause memory corruption and crashes with --enable-dss specified.
  - Fix fork-related bugs that could cause deadlock in children between fork
    and exec.
  - Fix malloc_stats_print() to honor 'b' and 'l' in the opts parameter.
  - Fix realloc(p, 0) to act like free(p).
  - Do not enforce minimum alignment in memalign().
  - Check for NULL pointer in malloc_usable_size().
  - Fix an off-by-one heap profile statistics bug that could be observed in
    interval- and growth-triggered heap profiles.
  - Fix the "epoch" mallctl to update cached stats even if the passed in epoch
    is 0.
  - Fix bin->runcur management to fix a layout policy bug.  This bug did not
    affect correctness.
  - Fix a bug in choose_arena_hard() that potentially caused more arenas to be
    initialized than necessary.
  - Add missing "opt.lg_tcache_max" mallctl implementation.
  - Use glibc allocator hooks to make mixed allocator usage less likely.
  - Fix build issues for --disable-tcache.
  - Don't mangle pthread_create() when --with-private-namespace is specified.

* 2.2.5 (November 14, 2011)

  Bug fixes:
  - Fix huge_ralloc() race when using mremap(2).  This is a serious bug that
    could cause memory corruption and/or crashes.
  - Fix huge_ralloc() to maintain chunk statistics.
  - Fix malloc_stats_print(..., "a") output.

* 2.2.4 (November 5, 2011)

  Bug fixes:
  - Initialize arenas_tsd before using it.  This bug existed for 2.2.[0-3], as
    well as for --disable-tls builds in earlier releases.
  - Do not assume a 4 KiB page size in test/rallocm.c.

* 2.2.3 (August 31, 2011)

  This version fixes numerous bugs related to heap profiling.

  Bug fixes:
  - Fix a prof-related race condition.  This bug could cause memory corruption,
    but only occurred in non-default configurations (prof_accum:false).
  - Fix off-by-one backtracing issues (make sure that prof_alloc_prep() is
    excluded from backtraces).
  - Fix a prof-related bug in realloc() (only triggered by OOM errors).
  - Fix prof-related bugs in allocm() and rallocm().
  - Fix prof_tdata_cleanup() for --disable-tls builds.
  - Fix a relative include path, to fix objdir builds.

* 2.2.2 (July 30, 2011)

  Bug fixes:
  - Fix a build error for --disable-tcache.
  - Fix assertions in arena_purge() (for real this time).
  - Add the --with-private-namespace option.  This is a workaround for symbol
    conflicts that can inadvertently arise when using static libraries.

* 2.2.1 (March 30, 2011)

  Bug fixes:
  - Implement atomic operations for x86/x64.  This fixes compilation failures
    for versions of gcc that are still in wide use.
  - Fix an assertion in arena_purge().

* 2.2.0 (March 22, 2011)

  This version incorporates several improvements to algorithms and data
  structures that tend to reduce fragmentation and increase speed.

  New features:
  - Add the "stats.cactive" mallctl.
  - Update pprof (from google-perftools 1.7).
  - Improve backtracing-related configuration logic, and add the
    --disable-prof-libgcc option.

  Bug fixes:
  - Change default symbol visibility from "internal" to "hidden", which
    decreases the overhead of library-internal function calls.
  - Fix symbol visibility so that it is also set on OS X.
  - Fix a build dependency regression caused by the introduction of the .pic.o
    suffix for PIC object files.
  - Add missing checks for mutex initialization failures.
  - Don't use libgcc-based backtracing except on x64, where it is known to work.
  - Fix deadlocks on OS X that were due to memory allocation in
    pthread_mutex_lock().
  - Heap profiling-specific fixes:
    + Fix memory corruption due to integer overflow in small region index
      computation, when using a small enough sample interval that profiling
      context pointers are stored in small run headers.
    + Fix a bootstrap ordering bug that only occurred with TLS disabled.
    + Fix a rallocm() rsize bug.
    + Fix error detection bugs for aligned memory allocation.

* 2.1.3 (March 14, 2011)

  Bug fixes:
  - Fix a cpp logic regression (due to the "thread.{de,}allocatedp" mallctl fix
    for OS X in 2.1.2).
  - Fix a "thread.arena" mallctl bug.
  - Fix a thread cache stats merging bug.

* 2.1.2 (March 2, 2011)

  Bug fixes:
  - Fix "thread.{de,}allocatedp" mallctl for OS X.
  - Add missing jemalloc.a to build system.

* 2.1.1 (January 31, 2011)

  Bug fixes:
  - Fix aligned huge reallocation (affected allocm()).
  - Fix the ALLOCM_LG_ALIGN macro definition.
  - Fix a heap dumping deadlock.
  - Fix a "thread.arena" mallctl bug.

* 2.1.0 (December 3, 2010)

  This version incorporates some optimizations that can't quite be considered
  bug fixes.

  New features:
  - Use Linux's mremap(2) for huge object reallocation when possible.
  - Avoid locking in mallctl*() when possible.
  - Add the "thread.[de]allocatedp" mallctls.
  - Convert the manual page source from roff to DocBook, and generate both roff
    and HTML manuals.

  Bug fixes:
  - Fix a crash due to incorrect bootstrap ordering.  This only impacted
    --enable-debug --enable-dss configurations.
  - Fix a minor statistics bug for mallctl("swap.avail", ...).

* 2.0.1 (October 29, 2010)

  Bug fixes:
  - Fix a race condition in heap profiling that could cause undefined behavior
    if "opt.prof_accum" were disabled.
  - Add missing mutex unlocks for some OOM error paths in the heap profiling
    code.
  - Fix a compilation error for non-C99 builds.

* 2.0.0 (October 24, 2010)

  This version focuses on the experimental *allocm() API, and on improved
  run-time configuration/introspection.  Nonetheless, numerous performance
  improvements are also included.

  New features:
  - Implement the experimental {,r,s,d}allocm() API, which provides a superset
    of the functionality available via malloc(), calloc(), posix_memalign(),
    realloc(), malloc_usable_size(), and free().  These functions can be used to
    allocate/reallocate aligned zeroed memory, ask for optional extra memory
    during reallocation, prevent object movement during reallocation, etc.
  - Replace JEMALLOC_OPTIONS/JEMALLOC_PROF_PREFIX with MALLOC_CONF, which is
    more human-readable, and more flexible.  For example:
      JEMALLOC_OPTIONS=AJP
    is now:
      MALLOC_CONF=abort:true,fill:true,stats_print:true
  - Port to Apple OS X.  Sponsored by Mozilla.
  - Make it possible for the application to control thread-->arena mappings via
    the "thread.arena" mallctl.
  - Add compile-time support for all TLS-related functionality via pthreads TSD.
    This is mainly of interest for OS X, which does not support TLS, but has a
    TSD implementation with similar performance.
  - Override memalign() and valloc() if they are provided by the system.
  - Add the "arenas.purge" mallctl, which can be used to synchronously purge all
    dirty unused pages.
  - Make cumulative heap profiling data optional, so that it is possible to
    limit the amount of memory consumed by heap profiling data structures.
  - Add per thread allocation counters that can be accessed via the
    "thread.allocated" and "thread.deallocated" mallctls.
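
  The MALLOC_CONF example above is a comma-separated list of key:value pairs.
  As an illustration of that syntax only, here is a minimal lookup sketch in
  plain C.  conf_lookup() is a hypothetical helper and does not reflect how
  jemalloc itself parses the variable:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find key in a "key:value,key:value" string and copy
 * its value into val (at most vallen bytes including the terminator).
 * Returns 1 if the key was found and the value fit, 0 otherwise.
 */
int
conf_lookup(const char *conf, const char *key, char *val, size_t vallen)
{
    size_t klen = strlen(key);
    const char *p = conf;

    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t plen = (end != NULL) ? (size_t)(end - p) : strlen(p);

        /* Match "key:" at the start of this pair. */
        if (plen > klen && strncmp(p, key, klen) == 0 && p[klen] == ':') {
            size_t vlen = plen - klen - 1;

            if (vlen >= vallen)
                return 0;    /* Value does not fit. */
            memcpy(val, p + klen + 1, vlen);
            val[vlen] = '\0';
            return 1;
        }
        if (end == NULL)
            break;
        p = end + 1;
    }
    return 0;
}
```

  For the string shown above, looking up "fill" would yield "true", while a
  key that is absent from the string is reported as not found.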

  Incompatible changes:
  - Remove JEMALLOC_OPTIONS and malloc_options (see MALLOC_CONF above).
  - Increase default backtrace depth from 4 to 128 for heap profiling.
  - Disable interval-based profile dumps by default.

  Bug fixes:
  - Remove bad assertions in fork handler functions.  These assertions could
    cause aborts for some combinations of configure settings.
  - Fix strerror_r() usage to deal with non-standard semantics in GNU libc.
  - Fix leak context reporting.  This bug tended to cause the number of contexts
    to be underreported (though the reported number of objects and bytes were
    correct).
  - Fix a realloc() bug for large in-place growing reallocation.  This bug could
    cause memory corruption, but it was hard to trigger.
  - Fix an allocation bug for small allocations that could be triggered if
    multiple threads raced to create a new run of backing pages.
  - Enhance the heap profiler to trigger samples based on usable size, rather
    than request size.
  - Fix a heap profiling bug due to sometimes losing track of requested object
    size for sampled objects.

* 1.0.3 (August 12, 2010)

  Bug fixes:
  - Fix the libunwind-based implementation of stack backtracing (used for heap
    profiling).  This bug could cause zero-length backtraces to be reported.
  - Add a missing mutex unlock in library initialization code.  If multiple
    threads raced to initialize malloc, some of them could end up permanently
    blocked.

* 1.0.2 (May 11, 2010)

  Bug fixes:
  - Fix junk filling of large objects, which could cause memory corruption.
  - Add MAP_NORESERVE support for chunk mapping, because otherwise virtual
    memory limits could cause swap file configuration to fail.  Contributed by
    Jordan DeLong.

* 1.0.1 (April 14, 2010)

  Bug fixes:
  - Fix compilation when --enable-fill is specified.
  - Fix threads-related profiling bugs that affected accuracy and caused memory
    to be leaked during thread exit.
  - Fix dirty page purging race conditions that could cause crashes.
  - Fix crash in tcache flushing code during thread destruction.

* 1.0.0 (April 11, 2010)

  This release focuses on speed and run-time introspection.  Numerous
  algorithmic improvements make this release substantially faster than its
  predecessors.

  New features:
  - Implement autoconf-based configuration system.
  - Add mallctl*(), for the purposes of introspection and run-time
    configuration.
  - Make it possible for the application to manually flush a thread's cache, via
    the "tcache.flush" mallctl.
  - Base maximum dirty page count on proportion of active memory.
  - Compute various additional run-time statistics, including per size class
    statistics for large objects.
  - Expose malloc_stats_print(), which can be called repeatedly by the
    application.
  - Simplify the malloc_message() signature to only take one string argument,
    and incorporate an opaque data pointer argument for use by the application
    in combination with malloc_stats_print().
  - Add support for allocation backed by one or more swap files, and allow the
    application to disable over-commit if swap files are in use.
  - Implement allocation profiling and leak checking.

  Removed features:
  - Remove the dynamic arena rebalancing code, since thread-specific caching
    reduces its utility.

  Bug fixes:
  - Modify chunk allocation to work when address space layout randomization
    (ASLR) is in use.
  - Fix thread cleanup bugs related to TLS destruction.
  - Handle 0-size allocation requests in posix_memalign().
  - Fix a chunk leak.  The leaked chunks were never touched, so this impacted
    virtual memory usage, but not physical memory usage.

* linux_2008082[78]a (August 27/28, 2008)

  These snapshot releases are the simple result of incorporating Linux-specific
  support into the FreeBSD malloc sources.

--------------------------------------------------------------------------------
vim:filetype=text:textwidth=80