10 optimizations on common paths to rework of internal data structures and
13 workloads. The release has gone through large-scale production testing.
16 - Add the thread.idle mallctl which hints that the calling thread will be
18 - Allow small size classes to be the maximum size class to cache in the
19 thread-specific cache, through the opt.[lg_]tcache_max option. (@interwq,
21 - Make the behavior of realloc(ptr, 0) configurable with opt.zero_realloc.
23 - Add 'make uninstall' support. (@sangshuduo, @Lapenkov)
24 - Support C++17 over-aligned allocation. (@marksantaniello)
25 - Add the thread.peak mallctl for approximate per-thread peak memory tracking.
27 - Add interval-based stats output opt.stats_interval. (@interwq)
28 - Add prof.prefix to override filename prefixes for dumps. (@zhxchen17)
29 - Add high resolution timestamp support for profiling. (@tyroguru)
30 - Add the --collapsed flag to jeprof for flamegraph generation.
32 - Add the --debug-syms-by-id option to jeprof for debug symbols discovery.
34 - Add the opt.prof_leak_error option to exit with error code when leak is
36 - Add opt.cache_oblivious as a runtime alternative to config.cache_oblivious.
38 - Add mallctl interfaces:
52 - Fix the synchronization around explicit tcache creation which could cause
55 - Fix a profiling biasing issue which could cause incorrect heap usage and
58 - Fix the order of stats counter updating on large realloc which could cause
60 - Fix the locking on the arena destroy mallctl, which could cause concurrent
65 - Remove nothrow from system function declarations on macOS and FreeBSD.
67 - Improve overcommit and page alignment settings on NetBSD. (@zoulasc)
68 - Improve CPU affinity support on BSD platforms. (@devnexen)
69 - Improve utrace detection and support. (@devnexen)
70 - Improve QEMU support with MADV_DONTNEED zeroed pages detection. (@azat)
71 - Add memcntl support on Solaris / illumos. (@devnexen)
72 - Improve CPU_SPINWAIT on ARM. (@AWSjswinney)
73 - Improve TSD cleanup on FreeBSD. (@Lapenkov)
74 - Disable percpu_arena if the CPU count cannot be reliably detected. (@azat)
75 - Add malloc_size(3) override support. (@devnexen)
76 - Add mmap VM_MAKE_TAG support. (@devnexen)
77 - Add support for MADV_[NO]CORE. (@devnexen)
78 - Add support for DragonFlyBSD. (@devnexen)
79 - Fix the QUANTUM setting on MIPS64. (@brooksdavis)
80 - Add the QUANTUM setting for ARC. (@vineetgarc)
81 - Add the QUANTUM setting for LoongArch. (@wangjl-uos)
82 - Add QNX support. (@jqian-aurora)
83 - Avoid atexit(3) calls unless the relevant profiling features are enabled.
84 (@BusyJay, @laiwei-rice, @interwq)
85 - Fix unknown option detection when using Clang. (@Lapenkov)
86 - Fix symbol conflict with musl libc. (@georgthegreat)
87 - Add -Wimplicit-fallthrough checks. (@nickdesaulniers)
88 - Add __forceinline support on MSVC. (@santagada)
89 - Improve FreeBSD and Windows CI support. (@Lapenkov)
90 - Add CI support for PPC64LE architecture. (@ezeeyahoo)
93 - Maximum size class allowed in tcache (opt.[lg_]tcache_max) now has an upper
97 - Optimize the common cases of the thread cache operations.
98 - Optimize internal data structures, including RB tree and pairing heap.
99 - Optimize the internal locking on extent management.
100 - Extract and refactor the internal page allocator and interface modules.
103 - Fix doc build with --with-install-suffix. (@lawmurray, @interwq)
104 - Add PROFILING_INTERNALS.md. (@davidtgoldblatt)
105 - Ensure the proper order of doc building and installation. (@Mingli-Yu)
114 - Fix a severe virtual memory leak on Windows. This regression was first
115 released in 5.0.0. (@Ignition, @j0t, @frederik-h, @davidtgoldblatt,
117 - Fix size 0 handling in posix_memalign(). This regression was first released
119 - Fix the prof_log unit test which may observe unexpected backtraces from
120 compiler optimizations. The test was first added in 5.2.0. (@marxin,
122 - Fix the declaration of the extent_avail tree. This regression was first
124 - Fix an incorrect reference in jeprof. This functionality was first released
125 in 3.0.0. (@prehistoric-penguin)
126 - Fix an assertion on the deallocation fast-path. This regression was first
128 - Fix the TLS_MODEL attribute in headers. This regression was first released
132 - Implement opt.retain on Windows and enable by default on 64-bit. (@interwq,
134 - Optimize away a branch on the operator delete[] path. (@mgrice)
135 - Add format annotation to the format generator function. (@zoulasc)
136 - Refactor and improve the size class header generation. (@yinan1048576)
137 - Remove best fit. (@djwatson)
138 - Avoid blocking on background thread locks for stats. (@oranagra, @interwq)
143 1) improved fast-path performance from the optimizations by @djwatson; 2)
147 prior dev versions have gone through large-scale production testing.
150 - Implement oversize_threshold, which uses a dedicated arena for allocations
152 - Add extents usage information to stats. (@tyleretzel)
153 - Log time information for sampled allocations. (@tyleretzel)
154 - Support 0 size in sdallocx. (@djwatson)
155 - Output rate for certain counters in malloc_stats. (@zinoale)
156 - Add configure option --enable-readlinkat, which allows the use of readlinkat
158 - Add configure options --{enable,disable}-{static,shared} to allow not
160 - Add configure option --disable-libdl to enable fully static builds.
162 - Add mallctl interfaces:
170 - Update MSVC builds. (@maksqwe, @rustyx)
171 - Work around a compiler optimizer bug on s390x. (@rkmisra)
172 - Make use of pthread_set_name_np(3) on FreeBSD. (@trasz)
173 - Implement malloc_getcpu() to enable percpu_arena for Windows. (@santagada)
174 - Link against -pthread instead of -lpthread. (@paravoid)
175 - Make background_thread not dependent on libdl. (@interwq)
176 - Add stringify to fix a linker directive issue on MSVC. (@daverigby)
177 - Detect and fall back when 8-bit atomics are unavailable. (@interwq)
178 - Fall back to the default pthread_create if dlsym(3) fails. (@interwq)
181 - Refactor the TSD module. (@davidtgoldblatt)
182 - Avoid taking extents_muzzy mutex when muzzy is disabled. (@interwq)
183 - Avoid taking large_mtx for auto arenas on the tcache flush path. (@interwq)
184 - Optimize ixalloc by avoiding a size lookup. (@interwq)
185 - Implement opt.oversize_threshold which uses a dedicated arena for requests
188 - Clean compilation with -Wextra. (@gnzlbg, @jasone)
189 - Refactor the size class module. (@davidtgoldblatt)
190 - Refactor the stats emitter. (@tyleretzel)
191 - Optimize pow2_ceil. (@rkmisra)
192 - Avoid runtime detection of lazy purging on FreeBSD. (@trasz)
193 - Optimize mmap(2) alignment handling on FreeBSD. (@trasz)
194 - Improve error handling for THP state initialization. (@jsteemann)
195 - Rework the malloc() fast path. (@djwatson)
196 - Rework the free() fast path. (@djwatson)
197 - Refactor and optimize the tcache fill / flush paths. (@djwatson)
198 - Optimize sync / lwsync on PowerPC. (@chmeeedalf)
199 - Bypass extent_dalloc() when retain is enabled. (@interwq)
200 - Optimize the locking on large deallocation. (@interwq)
201 - Reduce the number of pages committed from sanity checking in debug build.
203 - Deprecate OSSpinLock. (@interwq)
204 - Lower the default number of background threads to 4 (when the feature
206 - Optimize the trylock spin wait. (@djwatson)
207 - Use arena index for arena-matching checks. (@interwq)
208 - Avoid forced decay on thread termination when using background threads.
210 - Disable muzzy decay by default. (@djwatson, @interwq)
211 - Only initialize libgcc unwinder when profiling is enabled. (@paravoid,
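
The "Optimize pow2_ceil" entry above concerns rounding a value up to the next power of two. As a rough illustration only, one common branch-free way to do this is bit smearing. The function name below is hypothetical, and this sketch is not claimed to be the code the allocator uses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (name invented for this sketch): round x up to the
 * nearest power of two.  Precondition: 1 <= x <= 2^63, so the result fits.
 */
uint64_t
pow2_ceil_u64_sketch(uint64_t x) {
	assert(x != 0 && x <= (UINT64_C(1) << 63));
	x--;            /* So that exact powers of two map to themselves. */
	x |= x >> 1;    /* Smear the highest set bit into every lower bit. */
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	x |= x >> 32;
	x++;            /* All-ones below the top bit, plus one. */
	return x;
}
```

An implementation built on a hardware bit-scan instruction is another option; the smearing form is shown because it needs nothing beyond standard C.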
215 - Fix background thread index issues with max_background_threads. (@djwatson,
217 - Fix stats output for opt.lg_extent_max_active_fit. (@interwq)
218 - Fix opt.prof_prefix initialization. (@davidtgoldblatt)
219 - Properly trigger decay on tcache destroy. (@interwq, @amosbird)
220 - Fix tcache.flush. (@interwq)
221 - Detect whether explicit extent zero out is necessary with huge pages or
223 - Fix a side effect caused by extent_max_active_fit combined with decay-based
226 - Fix a missing unlock on extent register error handling. (@zoulasc)
229 - Simplify the Travis script output. (@gnzlbg)
230 - Update the test scripts for FreeBSD. (@devnexen)
231 - Add unit tests for the producer-consumer pattern. (@interwq)
232 - Add Cirrus-CI config for FreeBSD builds. (@jasone)
233 - Add size-matching sanity checks on tcache flush. (@davidtgoldblatt,
237 - Remove --with-lg-page-sizes. (@davidtgoldblatt)
240 - Attempt to build docs by default, however skip doc building when xsltproc
245 This release is primarily about fine-tuning, ranging from several new features
247 prior dev versions have been running in multiple large-scale applications for
252 performance-critical applications, the newly added TUNING.md provides
256 - Implement transparent huge page support for internal metadata. (@interwq)
257 - Add opt.thp to allow enabling / disabling transparent huge pages for all
259 - Add maximum background thread count option. (@djwatson)
260 - Allow prof_active to control opt.lg_prof_interval and prof.gdump.
262 - Allow arena index lookup based on allocation addresses via mallctl.
264 - Allow disabling initial-exec TLS model. (@davidtgoldblatt, @KenMacD)
265 - Add opt.lg_extent_max_active_fit to set the max ratio between the size of
268 - Add retain_grow_limit to set the max size when growing virtual address
270 - Add mallctl interfaces:
281 - Support GNU/kFreeBSD configuration. (@paravoid)
282 - Support m68k, nios2 and SH3 architectures. (@paravoid)
283 - Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable. (@zonyitoo)
284 - Fix symbol listing for cross-compiling. (@tamird)
285 - Fix high bits computation on ARM. (@davidtgoldblatt, @paravoid)
286 - Disable the CPU_SPINWAIT macro for Power. (@davidtgoldblatt, @marxin)
287 - Fix MSVC 2015 & 2017 builds. (@rustyx)
288 - Improve RISC-V support. (@EdSchouten)
289 - Set name mangling script in strict mode. (@nicolov)
290 - Avoid MADV_HUGEPAGE on ARM. (@marxin)
291 - Modify configure to determine return value of strerror_r.
293 - Make sure CXXFLAGS is tested with CPP compiler. (@nehaljwani)
294 - Fix 32-bit build on MSVC. (@rustyx)
295 - Fix external symbol on MSVC. (@maksqwe)
296 - Avoid a printf format specifier warning. (@jasone)
297 - Add configure option --disable-initial-exec-tls which can allow jemalloc to
299 - AArch64: Add ILP32 support. (@cmuellner)
300 - Add --with-lg-vaddr configure option to support cross compiling.
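
The portability list above includes falling back to FD_CLOEXEC when O_CLOEXEC is unavailable. A minimal sketch of that pattern follows, assuming a POSIX system. The helper name is hypothetical and the code is an illustration of the general technique, not the project's implementation.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (name invented for this sketch): open a file
 * read-only with the close-on-exec flag set.  Returns the descriptor,
 * or -1 on failure.
 */
int
open_cloexec_sketch(const char *path) {
	int fd;

	assert(path != NULL);
#ifdef O_CLOEXEC
	/* Preferred: open(2) applies the flag atomically. */
	fd = open(path, O_RDONLY | O_CLOEXEC);
#else
	/* Fallback: set FD_CLOEXEC in a second step via fcntl(2). */
	fd = open(path, O_RDONLY);
	if (fd != -1) {
		int flags = fcntl(fd, F_GETFD);
		if (flags == -1 ||
		    fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) {
			close(fd);
			return -1;
		}
	}
#endif
	return fd;
}
```

The fallback leaves a short window between open(2) and fcntl(2) in which a concurrent fork/exec could inherit the descriptor, which is why the atomic flag is preferred whenever it exists.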
304 - Improve active extent fit with extent_max_active_fit. This considerably
307 - Eagerly coalesce large extents to reduce fragmentation. (@interwq)
308 - sdallocx: only read size info when page aligned (i.e. possibly sampled),
310 - Avoid attempting new mappings for in place expansion with retain, since
312 - Refactor OOM handling in newImpl. (@wqfish)
313 - Add internal fine-grained logging functionality for debugging use.
315 - Refactor arena / tcache interactions. (@davidtgoldblatt)
316 - Refactor extent management with dumpable flag. (@davidtgoldblatt)
317 - Add runtime detection of lazy purging. (@interwq)
318 - Use pairing heap instead of red-black tree for extents_avail. (@djwatson)
319 - Use sysctl on startup in FreeBSD. (@trasz)
320 - Use thread local prng state instead of atomic. (@djwatson)
321 - Make decay always purge one more extent than before, because in
322 practice large extents are usually the ones that cross the decay threshold.
325 - Fast division by dynamic values. (@davidtgoldblatt)
326 - Improve the fit for aligned allocation. (@interwq, @edwinsmith)
327 - Refactor extent_t bitpacking. (@rkmisra)
328 - Optimize the generated assembly for ticker operations. (@davidtgoldblatt)
329 - Convert stats printing to use a structured text emitter. (@davidtgoldblatt)
330 - Remove preserve_lru feature for extents management. (@djwatson)
331 - Consolidate two memory loads into one on the fast deallocation path.
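
The "Fast division by dynamic values" entry above refers to replacing a hardware divide with a multiply and a shift when the divisor is only known at run time but is reused many times. The sketch below shows one well-known form of the idea, restricted to dividends that are exact multiples of the divisor. All type and function names are hypothetical, and the restriction to 32-bit operands is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: the precomputed multiplier for one divisor. */
typedef struct {
	uint32_t magic;
} div_info_sketch_t;

/*
 * Precompute magic = ceil(2^32 / d).  Precondition: d >= 2, because for
 * d == 1 the multiplier would be 2^32, which does not fit in 32 bits.
 */
void
div_init_sketch(div_info_sketch_t *info, uint32_t d) {
	uint64_t two_to_32 = UINT64_C(1) << 32;
	uint64_t magic;

	assert(d >= 2);
	magic = two_to_32 / d;
	if (two_to_32 % d != 0) {
		magic++;
	}
	info->magic = (uint32_t)magic;
}

/*
 * Return n / d, where d is the divisor passed to div_init_sketch().
 * The result is exact only when n is a multiple of d.
 */
uint32_t
div_compute_sketch(const div_info_sketch_t *info, uint32_t n) {
	return (uint32_t)(((uint64_t)n * info->magic) >> 32);
}
```

Exactness follows from writing magic as (2^32 + r) / d with 0 <= r < d: for n = q * d the product is q * 2^32 + q * r, and q * r < n < 2^32, so the shift recovers q.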
335 - Fix deadlock with multithreaded fork in OS X. (@davidtgoldblatt)
336 - Validate returned file descriptor before use. (@zonyitoo)
337 - Fix a few background thread initialization and shutdown issues. (@interwq)
338 - Fix an extent coalesce + decay race by taking both coalescing extents off
340 - Fix potentially unbounded increase during decay, caused by one thread keeping
343 - Fix a FreeBSD bootstrap assertion. (@strejda, @interwq)
344 - Handle 32-bit mutex counters. (@rkmisra)
345 - Fix an indexing bug when creating background threads. (@davidtgoldblatt,
347 - Fix arguments passed to extent_init. (@yuleniwo, @interwq)
348 - Fix addresses used for ordering mutexes. (@rkmisra)
349 - Fix abort_conf processing during bootstrap. (@interwq)
350 - Fix include path order for out-of-tree builds. (@cmuellner)
353 - Remove --disable-thp. (@interwq)
354 - Remove mallctl interfaces:
358 - Add TUNING.md. (@interwq, @davidtgoldblatt, @djwatson)
366 - Update decay->nunpurged before purging, in order to avoid potential update
368 - Only abort on dlsym(3) error if the failure impacts an enabled feature (lazy
370 failure bug for which we still do not have a clear reproduction test case.
372 - Modify tsd management so that it neither crashes nor leaks if a thread's
376 - Mask signals during background thread creation. This prevents signals from
379 - Avoid inactivity checks within background threads, in order to prevent
381 - Fix extent_grow_retained() to use the specified hooks when the
384 - Add missing reentrancy support for custom extent hooks which allocate.
386 - Post-fork(2), re-initialize the list of tcaches associated with each arena
388 - Add missing post-fork(2) mutex reinitialization for extent_grow_mtx. This
390 - Enforce minimum autoconf version (currently 2.68), since 2.63 is known to
392 - Ensure that the configured page size (--with-lg-page) is no larger than the
393 configured huge page size (--with-lg-hugepage). (@jasone)
398 aligned "chunks" for virtual memory management, and instead uses page-aligned
410 - Implement optional per-CPU arena support; threads choose which arena to use
411 based on current CPU rather than on fixed thread-->arena associations.
413 - Implement two-phase decay of unused dirty pages. Pages transition from
414 dirty-->muzzy-->clean, where the first phase transition relies on
416 pages such that they are replaced with demand-zeroed pages on next access.
418 - Increase decay time resolution from seconds to milliseconds. (@jasone)
419 - Implement opt-in per CPU background threads, and use them for asynchronous
420 decay-driven unused dirty page purging. (@interwq)
421 - Add mutex profiling, which collects a variety of statistics useful for
423 - Add C++ new/delete operator bindings. (@djwatson)
424 - Support manually created arena destruction, such that all data and metadata
427 - Add MALLCTL_ARENAS_ALL as a fixed index for use in accessing
429 - Add opt.abort_conf to optionally abort if invalid configuration options are
431 - Add opt.stats_print_opts, so that e.g. JSON output can be selected for the
433 - Add --with-version=VERSION for use when embedding jemalloc into another
435 - Add --disable-thp to support cross compiling. (@jasone)
436 - Add --with-lg-hugepage to support cross compiling. (@jasone)
437 - Add mallctl interfaces (various authors):
467 + stats.arenas.i.mutexes.{large,extent_avail,extents_dirty,extents_muzzy,
473 - Improve reentrant allocation support, such that deadlock is less likely if
476 - Support static linking of jemalloc with glibc. (@djwatson)
479 - Organize virtual memory as "extents" of virtual memory pages, rather than as
483 - Fold large and huge size classes together; only small and large size classes
485 - Unify the allocation paths, and merge most fast-path branching decisions.
487 - Embed per thread automatic tcache into thread-specific data, which reduces
489 fast-path data locality. (@interwq)
490 - Rewrite atomics to closely model the C11 API, convert various
491 synchronization from mutex-based to atomic, and use the explicit memory
494 - Extensively optimize rtree via various methods:
496 part of fast-path deallocation. (@interwq)
499 + Embed the root node in the top-level rtree data structure, thus avoiding
502 and directly embed extent metadata needed for fast-path deallocation.
504 + Ignore leading always-zero address bits (architecture-specific).
506 - Reorganize headers (ongoing work) to make them hermetic, and disentangle
508 - Convert various internal data structures such as size class metadata from
509 boot-time-initialized to compile-time-initialized. Propagate resulting data
510 structure simplifications, such as making arena metadata fixed-size.
512 - Simplify size class lookups when constrained to size classes that are
516 - Lock individual extents when possible for localized extent operations,
517 rather than relying on a top-level arena lock. (@davidtgoldblatt, @jasone)
518 - Use first fit layout policy instead of best fit, in order to improve
520 - If munmap(2) is not in use, use an exponential series to grow each arena's
523 - Implement per arena base allocators, so that arenas never share any virtual
525 - Automatically generate private symbol name mangling macros. (@jasone)
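
One of the refactoring entries above describes rewriting atomics to closely model the C11 API, moving some synchronization from mutexes to atomics, and using explicit memory orderings. As a hedged illustration of what code in that style looks like, here is a statistics counter built directly on C11 `<stdatomic.h>`. The struct and function names are hypothetical and do not correspond to any internal wrapper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stats record with a single lock-free counter. */
typedef struct {
	atomic_uint_fast64_t nrequests;
} stats_sketch_t;

void
stats_init_sketch(stats_sketch_t *s) {
	assert(s != NULL);
	atomic_init(&s->nrequests, 0);
}

/*
 * Relaxed ordering is enough here: the counter is only ever summed and
 * reported, and it does not publish any other data to readers.
 */
void
stats_inc_sketch(stats_sketch_t *s, uint64_t n) {
	atomic_fetch_add_explicit(&s->nrequests, n, memory_order_relaxed);
}

uint64_t
stats_read_sketch(stats_sketch_t *s) {
	return (uint64_t)atomic_load_explicit(&s->nrequests,
	    memory_order_relaxed);
}
```

Spelling out the ordering at each call site makes the intended guarantee visible to reviewers, which is the main appeal of the explicit-ordering style over the default sequentially consistent operations.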
528 - Replace chunk hooks with an expanded/normalized set of extent hooks.
530 - Remove ratio-based purging. (@jasone)
531 - Remove --disable-tcache. (@jasone)
532 - Remove --disable-tls. (@jasone)
533 - Remove --enable-ivsalloc. (@jasone)
534 - Remove --with-lg-size-class-group. (@jasone)
535 - Remove --with-lg-tiny-min. (@jasone)
536 - Remove --disable-cc-silence. (@jasone)
537 - Remove --enable-code-coverage. (@jasone)
538 - Remove --disable-munmap (replaced by opt.retain). (@jasone)
539 - Remove Valgrind support. (@jasone)
540 - Remove quarantine support. (@jasone)
541 - Remove redzone support. (@jasone)
542 - Remove mallctl interfaces (various authors):
577 - Improve interval-based profile dump triggering to dump only one profile when
579 - Use prefixed function names (as controlled by --with-jemalloc-prefix) when
590 - Add --disable-thp and the opt.thp mallctl to provide opt-out mechanisms for
592 - Update zone allocator integration to work with macOS 10.12. (@glandium)
593 - Restructure *CFLAGS configuration, so that CFLAGS behaves typically, and
594 EXTRA_CFLAGS provides a way to specify e.g. -Werror during building, but not
598 - Fix DSS (sbrk(2)-based) allocation. This regression was first released in
600 - Handle race in per size class utilization computation. This functionality
602 - Fix lock order reversal during gdump. (@jasone)
603 - Fix/refactor tcache synchronization. This regression was first released in
605 - Fix various JSON-formatted malloc_stats_print() bugs. This functionality
607 - Fix huge-aligned allocation. This regression was first released in 4.4.0.
609 - When transparent huge page integration is enabled, detect what state pages
611 arena chunks to non-huge during purging if that is not their initial state.
613 - Fix lg_chunk clamping for the --enable-cache-oblivious --disable-fill case.
615 - Properly detect sparc64 when building for Linux. (@glaubitz)
620 - Add configure support for *-*-linux-android. (@cferris1000, @jasone)
621 - Add the --disable-syscall configure option, for use on systems that place
622 security-motivated limitations on syscall(2). (@jasone)
623 - Add support for Debian GNU/kFreeBSD. (@thesam)
626 - Add extent serial numbers and use them where appropriate as a sort key that
630 - Refactor madvise(2) configuration so that MADV_FREE is detected and utilized
632 - Mark partially purged arena chunks as non-huge-page. This improves
636 - Fix size class computations for edge conditions involving extremely large
639 - Remove overly restrictive assertions related to the cactive statistic. This
641 - Implement a more reliable detection scheme for os_unfair_lock on macOS.
647 - Fix a severe virtual memory leak. This regression was first released in
649 - Refactor atomic and prng APIs to restore support for 32-bit platforms that
650 use pre-C11 toolchains, e.g. FreeBSD's mips. (@jasone)
654 This is the first release that passes the test suite for multiple Windows
655 configurations, thanks in large part to @glandium setting up continuous
659 - Add "J" (JSON) support to malloc_stats_print(). (@jasone)
660 - Add Cray compiler support. (@ronawho)
663 - Add/use adaptive spinning for bootstrapping and radix tree node
667 - Fix large allocation to search starting in the optimal size class heap,
670 - Fix stats.arenas.<i>.nthreads accounting. (@interwq)
671 - Fix and simplify decay-based purging. (@jasone)
672 - Make DSS (sbrk(2)-related) operations lockless, which resolves potential
674 - Fix over-sized allocation of radix tree leaf nodes. (@mjp41, @ogaun,
676 - Fix over-sized allocation of arena_t (plus associated stats) data
678 - Fix EXTRA_CFLAGS to not affect configuration. (@jasone)
679 - Fix a Valgrind integration bug. (@ronawho)
680 - Disallow 0x5a junk filling when running in Valgrind. (@jasone)
681 - Fix a file descriptor leak on Linux. This regression was first released in
683 - Fix static linking of jemalloc with glibc. (@djwatson)
684 - Use syscall(2) rather than {open,read,close}(2) during boot on Linux. This
687 - Fix OS X default zone replacement to work with OS X 10.12. (@glandium,
689 - Fix cached memory management to avoid needless commit/decommit operations
692 - Fix TSD fetches to avoid (recursive) allocation. This is relevant to
693 non-TLS and Windows configurations. (@jasone)
694 - Fix malloc_conf overriding to work on Windows. (@jasone)
695 - Forcibly disable lazy-lock on Windows (was forcibly *enabled*). (@jasone)
700 - Fix bootstrapping issues for configurations that require allocation during
701 tsd initialization (e.g. --disable-tls). (@cferris1000, @jasone)
702 - Fix gettimeofday() version of nstime_update(). (@ronawho)
703 - Fix Valgrind regressions in calloc() and chunk_alloc_wrapper(). (@ronawho)
704 - Fix potential VM map fragmentation regression. (@jasone)
705 - Fix opt_zero-triggered in-place huge reallocation zeroing. (@jasone)
706 - Fix heap profiling context leaks in reallocation edge cases. (@jasone)
711 - Add the arena.<i>.reset mallctl, which makes it possible to discard all of
713 - Add the stats.retained and stats.arenas.<i>.retained statistics. (@jasone)
714 - Add the --with-version configure option. (@jasone)
715 - Support --with-lg-page values larger than actual page size. (@jasone)
718 - Use pairing heaps rather than red-black trees for various hot data
720 - Streamline fast paths of rtree operations. (@jasone)
721 - Optimize the fast paths of calloc() and [m,d,sd]allocx(). (@jasone)
722 - Decommit unused virtual memory if the OS does not overcommit. (@jasone)
723 - Specify MAP_NORESERVE on Linux if [heuristic] overcommit is active, in order
727 - Fix chunk accounting related to triggering gdump profiles. (@jasone)
728 - Link against librt for clock_gettime(2) if glibc < 2.17. (@jasone)
729 - Scale leak report summary according to sampling probability. (@jasone)
734 bitmap fix is critical for 64-bit Windows.
737 - Fix the linear scan version of bitmap_sfu() to shift by the proper amount
738 even when sizeof(long) is not the same as sizeof(void *), as on 64-bit
740 - Fix hashing functions to avoid unaligned memory accesses (and resulting
741 crashes). This is relevant at least to some ARM-based platforms.
743 - Fix fork()-related lock rank ordering reversals. These reversals were
746 - Fix various chunk leaks in OOM code paths. (@jasone)
747 - Fix malloc_stats_print() to print opt.narenas correctly. (@jasone)
748 - Fix MSVC-specific build/test issues. (@rustyx, @yuslepukhin)
749 - Fix a variety of test failures that were due to test fragility rather than
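
The bitmap_sfu() entry above mentions shifting by the proper amount when sizeof(long) differs from sizeof(void *), as on 64-bit Windows. The sketch below illustrates that class of problem with a linear-scan "set first unset" over 64-bit groups. It is an assumption-laden illustration with hypothetical names, not a reconstruction of the actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical linear scan: find the first unset bit across an array of
 * 64-bit groups, set it, and return its index.  A set bit means "in use".
 * Precondition: at least one bit is unset.
 */
size_t
bitmap_sfu_sketch(uint64_t *groups, size_t ngroups) {
	size_t i;
	unsigned bit;

	for (i = 0; i < ngroups; i++) {
		if (groups[i] == UINT64_MAX) {
			continue;	/* Group is full; keep scanning. */
		}
		for (bit = 0; bit < 64; bit++) {
			/*
			 * The constant is as wide as the group.  Writing
			 * 1L << bit instead would shift a 32-bit long on
			 * LLP64 targets and break for bit >= 32.
			 */
			uint64_t mask = UINT64_C(1) << bit;
			if ((groups[i] & mask) == 0) {
				groups[i] |= mask;
				return i * 64 + bit;
			}
		}
	}
	assert(0 && "bitmap is full");
	return (size_t)-1;
}
```

The general rule the sketch demonstrates is to derive shift operands from the bitmap word type rather than from `long`, whose width varies across 64-bit ABIs.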
755 of portability-motivated refactoring and enhancements. Many people worked on
763 - Implement decay-based unused dirty page purging, a major optimization with
764 mallctl API impact. This is an alternative to the existing ratio-based
774 - Add --with-malloc-conf, which makes it possible to embed a default
776 specify --with-malloc-conf=purge:decay , since the default must remain
778 - Add MS Visual Studio 2015 support. (@rustyx, @yuslepukhin)
779 - Make *allocx() size class overflow behavior defined. The maximum
784 - jeprof:
786 + Add --retain and --exclude for backtrace symbol filtering. (@jasone)
789 - Optimize the fast path to combine various bootstrapping and configuration
791 - Use linear scan for small bitmaps (used for small object tracking). In
792 addition to speeding up bitmap operations on 64-bit systems, this reduces
794 - Separate arena_avail trees, which substantially speeds up run tree
796 - Use memoization (boot-time-computed table) for run quantization. Separate
798 - Attempt mmap-based in-place huge reallocation. This can dramatically speed
802 - Make opt.narenas unsigned rather than size_t. (@jasone)
805 - Fix stats.cactive accounting regression. (@rustyx, @jasone)
806 - Handle unaligned keys in hash(). This caused problems for some ARM systems.
808 - Refactor arenas array. In addition to fixing a fork-related deadlock, this
810 - Move retained memory allocation out of the default chunk allocation
814 - Fix a potential tsd cleanup leak. (@cferris1000, @jasone)
815 - Fix run quantization. In practice this bug had no impact unless
818 - Fix LinuxThreads-specific bootstrapping deadlock. (Cosmin Paraschiv)
819 - jeprof:
822 - Fix stats.arenas.<i>.{dss,lg_dirty_mult,decay_time,pactive,pdirty} for
823 --disable-stats case. (@jasone)
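
The bug-fix list above includes handling unaligned keys in hash(), which caused problems on some ARM systems. A common portable remedy for that kind of problem is to load words through memcpy() rather than through a cast pointer. The helper below is a hypothetical sketch of the technique and is not claimed to match the actual change.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read four bytes from an arbitrarily aligned
 * address.  Dereferencing (const uint32_t *)p would be undefined
 * behavior for misaligned p and can trap on strict-alignment CPUs;
 * memcpy() is always valid, and compilers typically lower it to a
 * single load where the target permits.
 */
uint32_t
load_u32_unaligned_sketch(const void *p) {
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

A hash function written this way can consume its key in 4-byte blocks from any starting address, at the cost of producing byte-order-dependent block values, just as a direct load would.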
833 - Fix xallocx(..., MALLOCX_ZERO) to zero the last full trailing page of large
835 --enable-cache-oblivious configure option is enabled.
842 - Fix xallocx(..., MALLOCX_ZERO) to zero all trailing bytes of large
843 allocations when the --enable-cache-oblivious configure option is enabled.
844 - Fix xallocx(..., MALLOCX_ZERO) to zero trailing bytes of huge allocations
846 - Fix prof_tctx_dump_iter() to filter out nodes that were created after heap
848 - Work around a potentially bad thread-specific data initialization
856 - Fix ixallocx_prof_sample() to never modify nor create sampled small
859 - Fix irallocx_prof_sample() to always allocate large regions, even when
860 alignment is non-zero.
861 - Fix prof_alloc_rollback() to read tdata from thread-specific data rather
874 - Avoid a chunk metadata read in arena_prof_tctx_set(), since it is in the
878 - Optimize irallocx_prof() to optimistically update the sampler state. The
883 - Fix TLS configuration such that it is enabled by default for platforms on
885 - Fix arenas_cache_cleanup() and arena_get_hard() to handle
886 allocation/deallocation within the application's thread-specific data
888 - Fix xallocx() bugs related to size+extra exceeding HUGE_MAXCLASS.
889 - Fix chunk purge hook calls for in-place huge shrinking reallocation to
893 - Fix heap profiling bugs:
896 could cause data structure corruption that would most likely result in a
908 - Fix portability bugs:
929 - Normalize size class spacing to be consistent across the complete size
931 is now configurable via the --with-lg-size-class-group option. Also add the
932 --with-lg-page, --with-lg-page-sizes, --with-lg-quantum, and
933 --with-lg-tiny-min options, which can be used to tweak page and size class
943 - Add support for explicit tcaches. The "tcache.create", "tcache.flush", and
947 - Implement per thread heap profiling, as well as the ability to
952 - Add support for per arena application-specified chunk allocators, configured
954 - Refactor huge allocation to be managed by arenas, so that arenas now
956 the context of user-specified chunk allocators, aside from the scalability
967 - Add the 'util' column to malloc_stats_print() output, which reports the
970 - Add "alloc" and "free" modes for junk filling (see the "opt.junk"
973 - Add the jemalloc-config script, which provides information about how
975 - Add metadata statistics, which are accessible via the "stats.metadata",
978 - Add the "stats.resident" mallctl, which reports the upper limit of
980 - Add per arena control over unused dirty page purging, via the
983 - Add the "prof.gdump" mallctl, which makes it possible to toggle the gdump
985 - Add sdallocx(), which implements sized deallocation. The primary
988 - Add missing header includes in jemalloc/jemalloc.h, so that applications
990 - Add support for additional platforms:
999 - Maintain dirty runs in per arena LRUs rather than in per arena trees of
1000 dirty-run-containing chunks. In practice this change significantly reduces
1002 - Integrate whole chunks into the unused dirty page purging machinery. This
1005 - Split the arena chunk map into two separate arrays, in order to increase
1007 - Move small run metadata out of runs, into arena chunk headers. This reduces
1011 - Randomly distribute large allocation base pointer alignment relative to page
1013 disabled via the --disable-cache-oblivious configure option, and queried via
1015 - Micro-optimize the fast paths for the public API functions.
1016 - Refactor thread-specific data to reside in a single structure. This assures
1018 - Implement in-place huge allocation growing and shrinking.
1019 - Refactor rtree (radix tree for chunk lookups) to be lock-free, and make
1022 allocation, because a global data structure is critical for determining
1026 - Replace --enable-cc-silence with --disable-cc-silence to suppress spurious
1028 - Assure that the constness of malloc_usable_size()'s return type matches that
1030 - Change the heap profile dump format to support per thread heap profiling,
1031 rename pprof to jeprof, and enhance it with the --thread=<n> option. As a
1034 - Disable "opt.prof_final" by default, in order to avoid atexit(3), which can
1036 - Change the "arenas.nlruns" mallctl type from size_t to unsigned.
1037 - Replace the "stats.arenas.<i>.bins.<j>.allocated" mallctl with
1039 - Ignore MALLOC_CONF in set{uid,gid,cap} binaries.
1040 - Ignore MALLOCX_ARENA(a) in dallocx(), in favor of using the
1044 - Remove the *allocm() API, which is superseded by the *allocx() API.
1045 - Remove the --enable-dss options, and make dss non-optional on all platforms
1047 - Remove the "arenas.purge" mallctl, which was obsoleted by the
1049 - Remove the unnecessary "opt.valgrind" mallctl; jemalloc automatically
1051 - Remove the "stats.huge.allocated", "stats.huge.nmalloc", and
1053 - Remove the --enable-mremap option.
1054 - Remove the "stats.chunks.current", "stats.chunks.total", and
1058 - Fix the cactive statistic to decrease (rather than increase) when active
1060 - Fix OOM handling in memalign() and valloc(). A variant of this bug existed
1062 - Fix an OOM-related regression in arena_tcache_fill_small(), which could
1065 - Fix size class overflow handling for malloc(), posix_memalign(), memalign(),
1067 - Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
1069 - Fix fallback lg_floor() implementations to handle extremely large inputs.
1070 - Ensure the default purgeable zone is after the default zone on OS X.
1071 - Fix latent bugs in atomic_*().
1072 - Fix the "arena.<i>.dss" mallctl to handle read-only calls.
1073 - Fix tls_model configuration to enable the initial-exec model when possible.
1074 - Mark malloc_conf as a weak symbol so that the application can override it.
1075 - Correctly detect glibc's adaptive pthread mutexes.
1076 - Fix the --without-export configure option.
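
The list above mentions fixing the fallback lg_floor() implementations to handle extremely large inputs. As an illustration of why large inputs are delicate, the sketch below computes floor(log2(x)) by shifting down only, so no intermediate value can overflow, even for inputs near the top of the range. The function name is hypothetical and the code is not the project's fallback.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical portable fallback: floor(log2(x)) for x != 0.
 * Only right shifts are used, so the computation is safe for every
 * nonzero 64-bit input.  Approaches that first round x up to a power
 * of two can overflow when x exceeds 2^63.
 */
unsigned
lg_floor_sketch(uint64_t x) {
	unsigned lg = 0;

	assert(x != 0);
	while ((x >>= 1) != 0) {
		lg++;
	}
	return lg;
}
```

The loop runs at most 63 times, which is acceptable for a fallback that is only used when no compiler builtin or bit-scan instruction is available.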
1084 - Fix a regression in arena_chunk_alloc() that caused crashes during
1085 small/large allocation if chunk allocation failed. In the absence of this
1088 - Fix backtracing for gcc intrinsics-based backtracing by specifying
1089 -fno-omit-frame-pointer to gcc. Note that the application (and all the
1092 - Use dss allocation precedence for huge allocations as well as small/large
1094 - Fix test assertion failure message formatting. This bug did not manifest on
1096 - Fix inconsequential test failures for hash and SFMT code.
1099 - Support heap profiling on FreeBSD. This feature depends on the proc
1104 This version primarily addresses minor bugs in test code.
1107 - Configure Solaris/Illumos to use MADV_FREE.
1108 - Fix junk filling for mremap(2)-based huge reallocation. This is only
1109 relevant if configuring with the --enable-mremap option specified.
1110 - Avoid compilation failure if 'restrict' C99 keyword is not supported by the
1112 - Add a configure test for SSE2 rather than assuming it is usable on i686
1113 systems. This fixes test compilation errors, especially on 32-bit Linux
1115 - Fix mallctl argument size mismatches (size_t vs. uint64_t) in the stats unit
1116 test.
1117 - Fix/remove flawed alignment-related overflow tests.
1118 - Prevent compiler optimizations that could change backtraces in the
1119 prof_accum unit test.
1124 includes some non-trivial heap profiling optimizations not mentioned below.
1127 - Add the *allocx() API, which is a successor to the experimental *allocm()
1132 slated for removal in the next non-bugfix release.
1133 - Add support for LinuxThreads.
1136 - Unless heap profiling is enabled, disable floating point code and don't link
1137 with libm. This, in combination with e.g. EXTRA_CFLAGS=-mno-sse on x64
1139 use. Some versions of glibc neglect to save/restore caller-saved floating
1143 - Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
1146 - Fix a file descriptor leak in a prof_dump_maps() error path.
1147 - Fix prof_dump() to close the dump file descriptor for all relevant error
1149 - Fix rallocm() to use the arena specified by the ALLOCM_ARENA(s) flag for
1151 - Fix a data race for large allocation stats counters.
1152 - Fix a potential infinite loop during thread exit. This bug occurred on
1155 - Don't junk-fill reallocations unless usable size changes. This fixes a
1157 - Fix growing large reallocation to junk fill new space.
1158 - Fix huge deallocation to junk fill when munmap is disabled.
1159 - Change the default private namespace prefix from empty to je_, and change
1160 --with-private-namespace-prefix so that it prepends an additional prefix
1163 - Add missing private namespace mangling (relevant when
1164 --with-private-namespace is specified).
1165 - Add and use JEMALLOC_INLINE_C so that static inline functions are marked as
1167 - Add a missing mutex unlock in a malloc_init_hard() error path. In practice
1169 - Fix numerous bugs in malloc_strtoumax() error handling/reporting. These
1171 - Fix numerous bugs in malloc_snprintf(). These bugs were not exercised by
1177 - Fix a race in the "arenas.extend" mallctl that could cause memory corruption
1178 of internal data structures and subsequent crashes.
1179 - Fix Valgrind integration flaws that caused Valgrind warnings about reads of
1182 + internal zero-initialized data structures (relevant to tcache and prof
1184 - Preserve errno during the first allocation. A readlink(2) call during
1187 - Fix compilation warnings reported by gcc 4.8.1.
1195 - Fix race-triggered deadlocks in chunk_record(). These deadlocks were
1200 - Add support for the aarch64 architecture.
1205 custom run-time options.
1208 - Fix a locking order bug that could cause deadlock during fork if heap
1210 - Fix a chunk recycling bug that could cause the allocator to lose track of
1215 --disable-munmap configure option or the "dss:primary" option was
1218 - Fix TLS-related memory corruption that could occur during thread exit if the
1221 - Fix two quarantine bugs:
1226 - Fix Valgrind integration to annotate all internally allocated memory in a
1227 way that keeps Valgrind happy about internal data structure access.
1228 - Fix building for s390 systems.
1236 - Add clipping support to lg_chunk option processing.
1237 - Add the --enable-ivsalloc option.
1238 - Add the --without-export option.
1239 - Add the --disable-zone-allocator option.
1242 - Fix "arenas.extend" mallctl to output the number of arenas.
1243 - Fix chunk_recycle() to unconditionally inform Valgrind that returned memory
1245 - Fix build break on FreeBSD related to alloca.h.
1251 page-level virtual memory fragmentation.
1254 - Change the "opt.lg_dirty_mult" default from 5 to 3 (32:1 to 8:1).
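The ratios quoted for "opt.lg_dirty_mult" follow from reading the option as a base-2 logarithm of the active:dirty page ratio. The sketch below works through that arithmetic. The function names are hypothetical, and the dirty-page ceiling (active pages shifted right by the option value) is an assumed model for illustration rather than a statement of jemalloc's purging code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: active:dirty ratio implied by lg_dirty_mult (2^lg : 1). */
size_t
dirty_ratio(unsigned lg_dirty_mult)
{
    assert(lg_dirty_mult < sizeof(size_t) * 8); /* keep the shift defined */
    return (size_t)1 << lg_dirty_mult;
}

/*
 * Hypothetical helper: dirty-page ceiling for a given number of active
 * pages, assuming the ceiling is nactive / 2^lg_dirty_mult.
 */
size_t
max_dirty_pages(size_t nactive, unsigned lg_dirty_mult)
{
    assert(lg_dirty_mult < sizeof(size_t) * 8);
    return nactive >> lg_dirty_mult;
}
```

Under this model, an arena with 4096 active pages would be allowed 128 dirty pages at the old default of 5 and 512 at the new default of 3, so the lower value tolerates more unpurged memory.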
1257 - Fix dss/mmap allocation precedence code to use recyclable mmap memory only
1259 - Fix deadlock in the "arenas.purge" mallctl. This regression was introduced
1265 - Auto-detect whether running inside Valgrind, thus removing the need to
1267 - Add the "arenas.extend" mallctl, which allows applications to create
1269 - Add the ALLOCM_ARENA() flag for {,r,d}allocm().
1270 - Add the "opt.dss", "arena.<i>.dss", and "stats.arenas.<i>.dss" mallctls,
1272 - Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".
1273 - Define LG_QUANTUM for hppa.
1276 - Disable tcache by default if running inside Valgrind, in order to avoid
1278 - Drop const from malloc_usable_size() argument on Linux.
1281 - Fix heap profiling crash if sampled object is freed via realloc(p, 0).
1282 - Remove const from __*_hook variable declarations, so that glibc can modify
1284 - Fix mlockall(2)/madvise(2) interaction.
1285 - Fix fork(2)-related deadlocks.
1286 - Fix error return value for "thread.tcache.enabled" mallctl.
1300 - Implement Valgrind support, redzones, and quarantine.
1301 - Add support for additional platforms:
1306 - Add support for additional architectures:
1310 - Add support for cross compiling.
1311 - Add nallocm(), which rounds a request size up to the nearest size class
1313 - Implement aligned_alloc() (blame C11).
1314 - Add the "thread.tcache.enabled" mallctl.
1315 - Add the "opt.prof_final" mallctl.
1316 - Update pprof (from gperftools 2.0).
1317 - Add the --with-mangling option.
1318 - Add the --disable-experimental option.
1319 - Add the --disable-munmap option, and make it the default on Linux.
1320 - Add the --enable-mremap option; use of mremap(2) is now disabled unless it is specified.
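For the aligned_alloc() entry in the list above, here is a minimal usage sketch based on the C11 standard interface rather than anything jemalloc-specific. The helper names are hypothetical. C11 as originally published requires the size argument to be a multiple of the alignment, so the helper takes a slot count and multiplies; overflow of that product is not checked here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate nslots slots of `alignment` bytes each,
 * aligned to `alignment`.  The size passed to aligned_alloc() is a
 * multiple of the alignment by construction.
 */
void *
alloc_aligned_slots(size_t alignment, size_t nslots)
{
    /* The alignment must be a nonzero power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return aligned_alloc(alignment, alignment * nslots);
}

/* Hypothetical helper: nonzero if p is aligned to a power-of-two alignment. */
int
is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Memory obtained this way is released with free(), as with malloc().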
1323 - Enable stats by default.
1324 - Enable fill by default.
1325 - Disable lazy locking by default.
1326 - Rename the "tcache.flush" mallctl to "thread.tcache.flush".
1327 - Rename the "arenas.pagesize" mallctl to "arenas.page".
1328 - Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).
1329 - Change the "opt.prof_accum" default from true to false.
1332 - Remove the swap feature, including the "config.swap", "swap.avail",
1334 - Remove highruns statistics, including the
1337 - As part of small size class refactoring, remove the "opt.lg_[qc]space_max",
1340 - Remove the "arenas.chunksize" mallctl.
1341 - Remove the "opt.lg_prof_tcmax" option.
1342 - Remove the "opt.lg_prof_bt_max" option.
1343 - Remove the "opt.lg_tcache_gc_sweep" option.
1344 - Remove the --disable-tiny option, including the "config.tiny" mallctl.
1345 - Remove the --enable-dynamic-page-shift configure option.
1346 - Remove the --enable-sysv configure option.
1349 - Fix a statistics-related bug in the "thread.arena" mallctl that could cause
1351 - Work around TLS deallocation via free() on Linux. This bug could cause
1352 write-after-free memory corruption.
1353 - Fix a potential deadlock that could occur during interval- and
1354 growth-triggered heap profile dumps.
1355 - Fix large calloc() zeroing bugs due to dropping chunk map unzeroed flags.
1356 - Fix chunk_alloc_dss() to stop claiming memory is zeroed. This bug could
1357 cause memory corruption and crashes with --enable-dss specified.
1358 - Fix fork-related bugs that could cause deadlock in children between fork
1360 - Fix malloc_stats_print() to honor 'b' and 'l' in the opts parameter.
1361 - Fix realloc(p, 0) to act like free(p).
1362 - Do not enforce minimum alignment in memalign().
1363 - Check for NULL pointer in malloc_usable_size().
1364 - Fix an off-by-one heap profile statistics bug that could be observed in
1365 interval- and growth-triggered heap profiles.
1366 - Fix the "epoch" mallctl to update cached stats even if the passed-in epoch
1368 - Fix bin->runcur management to fix a layout policy bug. This bug did not
1370 - Fix a bug in choose_arena_hard() that potentially caused more arenas to be
1372 - Add missing "opt.lg_tcache_max" mallctl implementation.
1373 - Use glibc allocator hooks to make mixed allocator usage less likely.
1374 - Fix build issues for --disable-tcache.
1375 - Don't mangle pthread_create() when --with-private-namespace is specified.
1380 - Fix huge_ralloc() race when using mremap(2). This is a serious bug that
1382 - Fix huge_ralloc() to maintain chunk statistics.
1383 - Fix malloc_stats_print(..., "a") output.
1388 - Initialize arenas_tsd before using it. This bug existed for 2.2.[0-3], as
1389 well as for --disable-tls builds in earlier releases.
1390 - Do not assume a 4 KiB page size in test/rallocm.c.
1397 - Fix a prof-related race condition. This bug could cause memory corruption,
1398 but only occurred in non-default configurations (prof_accum:false).
1399 - Fix off-by-one backtracing issues (make sure that prof_alloc_prep() is
1401 - Fix a prof-related bug in realloc() (only triggered by OOM errors).
1402 - Fix prof-related bugs in allocm() and rallocm().
1403 - Fix prof_tdata_cleanup() for --disable-tls builds.
1404 - Fix a relative include path, to fix objdir builds.
1409 - Fix a build error for --disable-tcache.
1410 - Fix assertions in arena_purge() (for real this time).
1411 - Add the --with-private-namespace option. This is a workaround for symbol
1417 - Implement atomic operations for x86/x64. This fixes compilation failures
1419 - Fix an assertion in arena_purge().
1423 This version incorporates several improvements to algorithms and data
1427 - Add the "stats.cactive" mallctl.
1428 - Update pprof (from google-perftools 1.7).
1429 - Improve backtracing-related configuration logic, and add the
1430 --disable-prof-libgcc option.
1433 - Change default symbol visibility from "internal" to "hidden", which
1434 decreases the overhead of library-internal function calls.
1435 - Fix symbol visibility so that it is also set on OS X.
1436 - Fix a build dependency regression caused by the introduction of the .pic.o
1438 - Add missing checks for mutex initialization failures.
1439 - Don't use libgcc-based backtracing except on x64, where it is known to work.
1440 - Fix deadlocks on OS X that were due to memory allocation in
1442 - Heap profiling-specific fixes:
1453 - Fix a cpp logic regression (due to the "thread.{de,}allocatedp" mallctl fix
1455 - Fix a "thread.arena" mallctl bug.
1456 - Fix a thread cache stats merging bug.
1461 - Fix "thread.{de,}allocatedp" mallctl for OS X.
1462 - Add missing jemalloc.a to build system.
1467 - Fix aligned huge reallocation (affected allocm()).
1468 - Fix the ALLOCM_LG_ALIGN macro definition.
1469 - Fix a heap dumping deadlock.
1470 - Fix a "thread.arena" mallctl bug.
1478 - Use Linux's mremap(2) for huge object reallocation when possible.
1479 - Avoid locking in mallctl*() when possible.
1480 - Add the "thread.[de]allocatedp" mallctls.
1481 - Convert the manual page source from roff to DocBook, and generate both roff
1485 - Fix a crash due to incorrect bootstrap ordering. This only impacted
1486 --enable-debug --enable-dss configurations.
1487 - Fix a minor statistics bug for mallctl("swap.avail", ...).
1492 - Fix a race condition in heap profiling that could cause undefined behavior
1494 - Add missing mutex unlocks for some OOM error paths in the heap profiling
1496 - Fix a compilation error for non-C99 builds.
1501 run-time configuration/introspection. Nonetheless, numerous performance
1505 - Implement the experimental {,r,s,d}allocm() API, which provides a superset
1510 - Replace JEMALLOC_OPTIONS/JEMALLOC_PROF_PREFIX with MALLOC_CONF, which is
1511 more human-readable and more flexible. For example:
1515 - Port to Apple OS X. Sponsored by Mozilla.
1516 - Make it possible for the application to control thread-->arena mappings via
1518 - Add compile-time support for all TLS-related functionality via pthreads TSD.
1521 - Override memalign() and valloc() if they are provided by the system.
1522 - Add the "arenas.purge" mallctl, which can be used to synchronously purge all
1524 - Make cumulative heap profiling data optional, so that it is possible to
1525 limit the amount of memory consumed by heap profiling data structures.
1526 - Add per-thread allocation counters that can be accessed via the
1530 - Remove JEMALLOC_OPTIONS and malloc_options (see MALLOC_CONF above).
1531 - Increase default backtrace depth from 4 to 128 for heap profiling.
1532 - Disable interval-based profile dumps by default.
1535 - Remove bad assertions in fork handler functions. These assertions could
1537 - Fix strerror_r() usage to deal with non-standard semantics in GNU libc.
1538 - Fix leak context reporting. This bug tended to cause the number of contexts
1541 - Fix a realloc() bug for large in-place growing reallocation. This bug could
1543 - Fix an allocation bug for small allocations that could be triggered if
1545 - Enhance the heap profiler to trigger samples based on usable size, rather
1547 - Fix a heap profiling bug due to sometimes losing track of requested object
1553 - Fix the libunwind-based implementation of stack backtracing (used for heap
1554 profiling). This bug could cause zero-length backtraces to be reported.
1555 - Add a missing mutex unlock in library initialization code. If multiple
1562 - Fix junk filling of large objects, which could cause memory corruption.
1563 - Add MAP_NORESERVE support for chunk mapping, because otherwise virtual
1570 - Fix compilation when --enable-fill is specified.
1571 - Fix threads-related profiling bugs that affected accuracy and caused memory
1573 - Fix dirty page purging race conditions that could cause crashes.
1574 - Fix crash in tcache flushing code during thread destruction.
1578 This release focuses on speed and run-time introspection. Numerous
1583 - Implement autoconf-based configuration system.
1584 - Add mallctl*(), for the purposes of introspection and run-time
1586 - Make it possible for the application to manually flush a thread's cache, via
1588 - Base maximum dirty page count on proportion of active memory.
1589 - Compute various additional run-time statistics, including per size class
1590 statistics for large objects.
1591 - Expose malloc_stats_print(), which can be called repeatedly by the
1593 - Simplify the malloc_message() signature to only take one string argument,
1594 and incorporate an opaque data pointer argument for use by the application
1596 - Add support for allocation backed by one or more swap files, and allow the
1597 application to disable over-commit if swap files are in use.
1598 - Implement allocation profiling and leak checking.
1601 - Remove the dynamic arena rebalancing code, since thread-specific caching
1605 - Modify chunk allocation to work when address space layout randomization
1607 - Fix thread cleanup bugs related to TLS destruction.
1608 - Handle 0-size allocation requests in posix_memalign().
1609 - Fix a chunk leak. The leaked chunks were never touched, so this impacted
1614 These snapshot releases are the simple result of incorporating Linux-specific
1617 --------------------------------------------------------------------------------