Following are change highlights associated with official releases. Important
bug fixes are all mentioned, but some internal enhancements are omitted here for
brevity. Much more detail can be found in the git revision history:

    https://github.com/jemalloc/jemalloc

* 5.3.0 (May 6, 2022)

  This release contains many speed and space optimizations, from micro
  optimizations on common paths to rework of internal data structures and
  locking schemes, and many more too detailed to list below. Multiple percent
  of system level metric improvements were measured in tested production
  workloads. The release has gone through large-scale production testing.

  New features:
  - Add the thread.idle mallctl which hints that the calling thread will be
    idle for a nontrivial period of time. (@davidtgoldblatt)
  - Allow small size classes to be the maximum size class to cache in the
    thread-specific cache, through the opt.[lg_]tcache_max option. (@interwq,
    @jordalgo)
  - Make the behavior of realloc(ptr, 0) configurable with opt.zero_realloc.
    (@davidtgoldblatt)
  - Add 'make uninstall' support. (@sangshuduo, @Lapenkov)
  - Support C++17 over-aligned allocation. (@marksantaniello)
  - Add the thread.peak mallctl for approximate per-thread peak memory tracking.
    (@davidtgoldblatt)
  - Add interval-based stats output opt.stats_interval.
    (@interwq)
  - Add prof.prefix to override filename prefixes for dumps. (@zhxchen17)
  - Add high resolution timestamp support for profiling. (@tyroguru)
  - Add the --collapsed flag to jeprof for flamegraph generation.
    (@igorwwwwwwwwwwwwwwwwwwww)
  - Add the --debug-syms-by-id option to jeprof for debug symbols discovery.
    (@DeannaGelbart)
  - Add the opt.prof_leak_error option to exit with an error code when a leak
    is detected using opt.prof_final. (@yunxuo)
  - Add opt.cache_oblivious as a runtime alternative to config.cache_oblivious.
    (@interwq)
  - Add mallctl interfaces:
    + opt.zero_realloc (@davidtgoldblatt)
    + opt.cache_oblivious (@interwq)
    + opt.prof_leak_error (@yunxuo)
    + opt.stats_interval (@interwq)
    + opt.stats_interval_opts (@interwq)
    + opt.tcache_max (@interwq)
    + opt.trust_madvise (@azat)
    + prof.prefix (@zhxchen17)
    + stats.zero_reallocs (@davidtgoldblatt)
    + thread.idle (@davidtgoldblatt)
    + thread.peak.{read,reset} (@davidtgoldblatt)

  Bug fixes:
  - Fix the synchronization around explicit tcache creation which could cause
    invalid tcache identifiers. This regression was first released in 5.0.0.
    (@yoshinorim, @davidtgoldblatt)
  - Fix a profiling biasing issue which could cause incorrect heap usage and
    object counts.
    This issue existed in all previous releases with the heap
    profiling feature. (@davidtgoldblatt)
  - Fix the order of stats counter updating on large realloc which could cause
    failed assertions. This regression was first released in 5.0.0. (@azat)
  - Fix the locking on the arena destroy mallctl, which could cause concurrent
    arena creations to fail. This functionality was first introduced in 5.0.0.
    (@interwq)

  Portability improvements:
  - Remove nothrow from system function declarations on macOS and FreeBSD.
    (@davidtgoldblatt, @fredemmott, @leres)
  - Improve overcommit and page alignment settings on NetBSD. (@zoulasc)
  - Improve CPU affinity support on BSD platforms. (@devnexen)
  - Improve utrace detection and support. (@devnexen)
  - Improve QEMU support with MADV_DONTNEED zeroed pages detection. (@azat)
  - Add memcntl support on Solaris / illumos. (@devnexen)
  - Improve CPU_SPINWAIT on ARM. (@AWSjswinney)
  - Improve TSD cleanup on FreeBSD. (@Lapenkov)
  - Disable percpu_arena if the CPU count cannot be reliably detected. (@azat)
  - Add malloc_size(3) override support. (@devnexen)
  - Add mmap VM_MAKE_TAG support. (@devnexen)
  - Add support for MADV_[NO]CORE. (@devnexen)
  - Add support for DragonFlyBSD. (@devnexen)
  - Fix the QUANTUM setting on MIPS64. (@brooksdavis)
  - Add the QUANTUM setting for ARC. (@vineetgarc)
  - Add the QUANTUM setting for LoongArch.
    (@wangjl-uos)
  - Add QNX support. (@jqian-aurora)
  - Avoid atexit(3) calls unless the relevant profiling features are enabled.
    (@BusyJay, @laiwei-rice, @interwq)
  - Fix unknown option detection when using Clang. (@Lapenkov)
  - Fix symbol conflict with musl libc. (@georgthegreat)
  - Add -Wimplicit-fallthrough checks. (@nickdesaulniers)
  - Add __forceinline support on MSVC. (@santagada)
  - Improve FreeBSD and Windows CI support. (@Lapenkov)
  - Add CI support for PPC64LE architecture. (@ezeeyahoo)

  Incompatible changes:
  - Maximum size class allowed in tcache (opt.[lg_]tcache_max) now has an upper
    bound of 8 MiB. (@interwq)

  Optimizations and refactors (@davidtgoldblatt, @Lapenkov, @interwq):
  - Optimize the common cases of the thread cache operations.
  - Optimize internal data structures, including RB tree and pairing heap.
  - Optimize the internal locking on extent management.
  - Extract and refactor the internal page allocator and interface modules.

  Documentation:
  - Fix doc build with --with-install-suffix. (@lawmurray, @interwq)
  - Add PROFILING_INTERNALS.md. (@davidtgoldblatt)
  - Ensure the proper order of doc building and installation. (@Mingli-Yu)

* 5.2.1 (August 5, 2019)

  This release is primarily about Windows.
  A critical virtual memory leak is
  resolved on all Windows platforms. The regression was present in all releases
  since 5.0.0.

  Bug fixes:
  - Fix a severe virtual memory leak on Windows. This regression was first
    released in 5.0.0. (@Ignition, @j0t, @frederik-h, @davidtgoldblatt,
    @interwq)
  - Fix size 0 handling in posix_memalign(). This regression was first released
    in 5.2.0. (@interwq)
  - Fix the prof_log unit test which may observe unexpected backtraces from
    compiler optimizations. The test was first added in 5.2.0. (@marxin,
    @gnzlbg, @interwq)
  - Fix the declaration of the extent_avail tree. This regression was first
    released in 5.1.0. (@zoulasc)
  - Fix an incorrect reference in jeprof. This functionality was first released
    in 3.0.0. (@prehistoric-penguin)
  - Fix an assertion on the deallocation fast-path. This regression was first
    released in 5.2.0. (@yinan1048576)
  - Fix the TLS_MODEL attribute in headers. This regression was first released
    in 5.0.0. (@zoulasc, @interwq)

  Optimizations and refactors:
  - Implement opt.retain on Windows and enable by default on 64-bit. (@interwq,
    @davidtgoldblatt)
  - Optimize away a branch on the operator delete[] path. (@mgrice)
  - Add format annotation to the format generator function.
    (@zoulasc)
  - Refactor and improve the size class header generation. (@yinan1048576)
  - Remove best fit. (@djwatson)
  - Avoid blocking on background thread locks for stats. (@oranagra, @interwq)

* 5.2.0 (April 2, 2019)

  This release includes a few notable improvements, which are summarized below:
  1) improved fast-path performance from the optimizations by @djwatson; 2)
  reduced virtual memory fragmentation and metadata usage; and 3) bug fixes on
  setting the number of background threads. In addition, peak / spike memory
  usage is improved with certain allocation patterns. As usual, the release and
  prior dev versions have gone through large-scale production testing.

  New features:
  - Implement oversize_threshold, which uses a dedicated arena for allocations
    crossing the specified threshold to reduce fragmentation. (@interwq)
  - Add extents usage information to stats. (@tyleretzel)
  - Log time information for sampled allocations. (@tyleretzel)
  - Support 0 size in sdallocx. (@djwatson)
  - Output rate for certain counters in malloc_stats. (@zinoale)
  - Add configure option --enable-readlinkat, which allows the use of readlinkat
    over readlink. (@davidtgoldblatt)
  - Add configure options --{enable,disable}-{static,shared} to allow not
    building unwanted libraries.
    (@Ericson2314)
  - Add configure option --disable-libdl to enable fully static builds.
    (@interwq)
  - Add mallctl interfaces:
    + opt.oversize_threshold (@interwq)
    + stats.arenas.<i>.extent_avail (@tyleretzel)
    + stats.arenas.<i>.extents.<j>.n{dirty,muzzy,retained} (@tyleretzel)
    + stats.arenas.<i>.extents.<j>.{dirty,muzzy,retained}_bytes
      (@tyleretzel)

  Portability improvements:
  - Update MSVC builds. (@maksqwe, @rustyx)
  - Work around a compiler optimizer bug on s390x. (@rkmisra)
  - Make use of pthread_set_name_np(3) on FreeBSD. (@trasz)
  - Implement malloc_getcpu() to enable percpu_arena for Windows. (@santagada)
  - Link against -pthread instead of -lpthread. (@paravoid)
  - Make background_thread not dependent on libdl. (@interwq)
  - Add stringify to fix a linker directive issue on MSVC. (@daverigby)
  - Detect and fall back when 8-bit atomics are unavailable. (@interwq)
  - Fall back to the default pthread_create if dlsym(3) fails. (@interwq)

  Optimizations and refactors:
  - Refactor the TSD module. (@davidtgoldblatt)
  - Avoid taking extents_muzzy mutex when muzzy is disabled. (@interwq)
  - Avoid taking large_mtx for auto arenas on the tcache flush path. (@interwq)
  - Optimize ixalloc by avoiding a size lookup.
    (@interwq)
  - Implement opt.oversize_threshold, which uses a dedicated arena for requests
    crossing the threshold and eagerly purges the oversize extents. The
    threshold defaults to 8 MiB. (@interwq)
  - Clean compilation with -Wextra. (@gnzlbg, @jasone)
  - Refactor the size class module. (@davidtgoldblatt)
  - Refactor the stats emitter. (@tyleretzel)
  - Optimize pow2_ceil. (@rkmisra)
  - Avoid runtime detection of lazy purging on FreeBSD. (@trasz)
  - Optimize mmap(2) alignment handling on FreeBSD. (@trasz)
  - Improve error handling for THP state initialization. (@jsteemann)
  - Rework the malloc() fast path. (@djwatson)
  - Rework the free() fast path. (@djwatson)
  - Refactor and optimize the tcache fill / flush paths. (@djwatson)
  - Optimize sync / lwsync on PowerPC. (@chmeeedalf)
  - Bypass extent_dalloc() when retain is enabled. (@interwq)
  - Optimize the locking on large deallocation. (@interwq)
  - Reduce the number of pages committed from sanity checking in debug build.
    (@trasz, @interwq)
  - Deprecate OSSpinLock. (@interwq)
  - Lower the default number of background threads to 4 (when the feature
    is enabled). (@interwq)
  - Optimize the trylock spin wait. (@djwatson)
  - Use arena index for arena-matching checks. (@interwq)
  - Avoid forced decay on thread termination when using background threads.
    (@interwq)
  - Disable muzzy decay by default. (@djwatson, @interwq)
  - Only initialize libgcc unwinder when profiling is enabled. (@paravoid,
    @interwq)

  Bug fixes (all only relevant to jemalloc 5.x):
  - Fix background thread index issues with max_background_threads. (@djwatson,
    @interwq)
  - Fix stats output for opt.lg_extent_max_active_fit. (@interwq)
  - Fix opt.prof_prefix initialization. (@davidtgoldblatt)
  - Properly trigger decay on tcache destroy. (@interwq, @amosbird)
  - Fix tcache.flush. (@interwq)
  - Detect whether explicit extent zero out is necessary with huge pages or
    custom extent hooks, which may change the purge semantics. (@interwq)
  - Fix a side effect caused by extent_max_active_fit combined with decay-based
    purging, where freed extents can accumulate and not be reused for an
    extended period of time. (@interwq, @mpghf)
  - Fix a missing unlock on extent register error handling. (@zoulasc)

  Testing:
  - Simplify the Travis script output. (@gnzlbg)
  - Update the test scripts for FreeBSD. (@devnexen)
  - Add unit tests for the producer-consumer pattern. (@interwq)
  - Add Cirrus-CI config for FreeBSD builds. (@jasone)
  - Add size-matching sanity checks on tcache flush.
    (@davidtgoldblatt, @interwq)

  Incompatible changes:
  - Remove --with-lg-page-sizes. (@davidtgoldblatt)

  Documentation:
  - Attempt to build docs by default, but skip doc building when xsltproc
    is missing. (@interwq, @cmuellner)

* 5.1.0 (May 4, 2018)

  This release is primarily about fine-tuning, ranging from several new features
  to numerous notable performance and portability enhancements. The release and
  prior dev versions have been running in multiple large scale applications for
  months, and the cumulative improvements are substantial in many cases.

  Given the long and successful production runs, this release is likely a good
  candidate for applications to upgrade, from both jemalloc 5.0 and before. For
  performance-critical applications, the newly added TUNING.md provides
  guidelines on jemalloc tuning.

  New features:
  - Implement transparent huge page support for internal metadata. (@interwq)
  - Add opt.thp to allow enabling / disabling transparent huge pages for all
    mappings. (@interwq)
  - Add maximum background thread count option. (@djwatson)
  - Allow prof_active to control opt.lg_prof_interval and prof.gdump.
    (@interwq)
  - Allow arena index lookup based on allocation addresses via mallctl.
    (@lionkov)
  - Allow disabling initial-exec TLS model. (@davidtgoldblatt, @KenMacD)
  - Add opt.lg_extent_max_active_fit to set the max ratio between the size of
    the active extent selected (to split off from) and the size of the requested
    allocation. (@interwq, @davidtgoldblatt)
  - Add retain_grow_limit to set the max size when growing virtual address
    space. (@interwq)
  - Add mallctl interfaces:
    + arena.<i>.retain_grow_limit (@interwq)
    + arenas.lookup (@lionkov)
    + max_background_threads (@djwatson)
    + opt.lg_extent_max_active_fit (@interwq)
    + opt.max_background_threads (@djwatson)
    + opt.metadata_thp (@interwq)
    + opt.thp (@interwq)
    + stats.metadata_thp (@interwq)

  Portability improvements:
  - Support GNU/kFreeBSD configuration. (@paravoid)
  - Support m68k, nios2 and SH3 architectures. (@paravoid)
  - Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable. (@zonyitoo)
  - Fix symbol listing for cross-compiling. (@tamird)
  - Fix high bits computation on ARM. (@davidtgoldblatt, @paravoid)
  - Disable the CPU_SPINWAIT macro for Power. (@davidtgoldblatt, @marxin)
  - Fix MSVC 2015 & 2017 builds. (@rustyx)
  - Improve RISC-V support. (@EdSchouten)
  - Set name mangling script in strict mode. (@nicolov)
  - Avoid MADV_HUGEPAGE on ARM. (@marxin)
  - Modify configure to determine return value of strerror_r.
    (@davidtgoldblatt, @cferris1000)
  - Make sure CXXFLAGS is tested with CPP compiler. (@nehaljwani)
  - Fix 32-bit build on MSVC. (@rustyx)
  - Fix external symbol on MSVC. (@maksqwe)
  - Avoid a printf format specifier warning. (@jasone)
  - Add configure option --disable-initial-exec-tls which can allow jemalloc to
    be dynamically loaded after program startup. (@davidtgoldblatt, @KenMacD)
  - AArch64: Add ILP32 support. (@cmuellner)
  - Add --with-lg-vaddr configure option to support cross compiling.
    (@cmuellner, @davidtgoldblatt)

  Optimizations and refactors:
  - Improve active extent fit with extent_max_active_fit. This considerably
    reduces fragmentation over time and improves virtual memory and metadata
    usage. (@davidtgoldblatt, @interwq)
  - Eagerly coalesce large extents to reduce fragmentation. (@interwq)
  - sdallocx: only read size info when page aligned (i.e. possibly sampled),
    which speeds up the sized deallocation path significantly. (@interwq)
  - Avoid attempting new mappings for in place expansion with retain, since
    it rarely succeeds in practice and causes high overhead. (@interwq)
  - Refactor OOM handling in newImpl. (@wqfish)
  - Add internal fine-grained logging functionality for debugging use.
    (@davidtgoldblatt)
  - Refactor arena / tcache interactions. (@davidtgoldblatt)
  - Refactor extent management with dumpable flag.
    (@davidtgoldblatt)
  - Add runtime detection of lazy purging. (@interwq)
  - Use pairing heap instead of red-black tree for extents_avail. (@djwatson)
  - Use sysctl on startup in FreeBSD. (@trasz)
  - Use thread local prng state instead of atomic. (@djwatson)
  - Make decay always purge one more extent than before, because in
    practice large extents are usually the ones that cross the decay threshold.
    Purging the additional extent helps save memory as well as reduce VM
    fragmentation. (@interwq)
  - Fast division by dynamic values. (@davidtgoldblatt)
  - Improve the fit for aligned allocation. (@interwq, @edwinsmith)
  - Refactor extent_t bitpacking. (@rkmisra)
  - Optimize the generated assembly for ticker operations. (@davidtgoldblatt)
  - Convert stats printing to use a structured text emitter. (@davidtgoldblatt)
  - Remove preserve_lru feature for extents management. (@djwatson)
  - Consolidate two memory loads into one on the fast deallocation path.
    (@davidtgoldblatt, @interwq)

  Bug fixes (most of the issues are only relevant to jemalloc 5.0):
  - Fix deadlock with multithreaded fork in OS X. (@davidtgoldblatt)
  - Validate returned file descriptor before use. (@zonyitoo)
  - Fix a few background thread initialization and shutdown issues. (@interwq)
  - Fix an extent coalesce + decay race by taking both coalescing extents off
    the LRU list.
    (@interwq)
  - Fix potentially unbounded increase during decay, caused by one thread that
    keeps stashing memory to purge while other threads generate new pages. The
    number of pages to purge is checked to prevent this. (@interwq)
  - Fix a FreeBSD bootstrap assertion. (@strejda, @interwq)
  - Handle 32 bit mutex counters. (@rkmisra)
  - Fix an indexing bug when creating background threads. (@davidtgoldblatt,
    @binliu19)
  - Fix arguments passed to extent_init. (@yuleniwo, @interwq)
  - Fix addresses used for ordering mutexes. (@rkmisra)
  - Fix abort_conf processing during bootstrap. (@interwq)
  - Fix include path order for out-of-tree builds. (@cmuellner)

  Incompatible changes:
  - Remove --disable-thp. (@interwq)
  - Remove mallctl interfaces:
    + config.thp (@interwq)

  Documentation:
  - Add TUNING.md. (@interwq, @davidtgoldblatt, @djwatson)

* 5.0.1 (July 1, 2017)

  This bugfix release fixes several issues, most of which are obscure enough
  that typical applications are not impacted.

  Bug fixes:
  - Update decay->nunpurged before purging, in order to avoid potential update
    races and subsequent incorrect purging volume. (@interwq)
  - Only abort on dlsym(3) error if the failure impacts an enabled feature (lazy
    locking and/or background threads).
    This mitigates an initialization
    failure bug for which we still do not have a clear reproduction test case.
    (@interwq)
  - Modify tsd management so that it neither crashes nor leaks if a thread's
    only allocation activity is to call free() after TLS destructors have been
    executed. This behavior was observed when operating with GNU libc, and is
    unlikely to be an issue with other libc implementations. (@interwq)
  - Mask signals during background thread creation. This prevents signals from
    being inadvertently delivered to background threads. (@jasone,
    @davidtgoldblatt, @interwq)
  - Avoid inactivity checks within background threads, in order to prevent
    recursive mutex acquisition. (@interwq)
  - Fix extent_grow_retained() to use the specified hooks when the
    arena.<i>.extent_hooks mallctl is used to override the default hooks.
    (@interwq)
  - Add missing reentrancy support for custom extent hooks which allocate.
    (@interwq)
  - Post-fork(2), re-initialize the list of tcaches associated with each arena
    to contain no tcaches except the forking thread's. (@interwq)
  - Add missing post-fork(2) mutex reinitialization for extent_grow_mtx. This
    fixes potential deadlocks after fork(2). (@interwq)
  - Enforce minimum autoconf version (currently 2.68), since 2.63 is known to
    generate corrupt configure scripts.
    (@jasone)
  - Ensure that the configured page size (--with-lg-page) is no larger than the
    configured huge page size (--with-lg-hugepage). (@jasone)

* 5.0.0 (June 13, 2017)

  Unlike all previous jemalloc releases, this release does not use naturally
  aligned "chunks" for virtual memory management, and instead uses page-aligned
  "extents". This change has few externally visible effects, but the internal
  impacts are... extensive. Many other internal changes combine to make this
  the most cohesively designed version of jemalloc so far, with ample
  opportunity for further enhancements.

  Continuous integration is now an integral aspect of development thanks to the
  efforts of @davidtgoldblatt, and the dev branch tends to remain reasonably
  stable on the tested platforms (Linux, FreeBSD, macOS, and Windows). As a
  side effect the official release frequency may decrease over time.

  New features:
  - Implement optional per-CPU arena support; threads choose which arena to use
    based on current CPU rather than on fixed thread-->arena associations.
    (@interwq)
  - Implement two-phase decay of unused dirty pages. Pages transition from
    dirty-->muzzy-->clean, where the first phase transition relies on
    madvise(... MADV_FREE) semantics, and the second phase transition discards
    pages such that they are replaced with demand-zeroed pages on next access.
    (@jasone)
  - Increase decay time resolution from seconds to milliseconds. (@jasone)
  - Implement opt-in per CPU background threads, and use them for asynchronous
    decay-driven unused dirty page purging. (@interwq)
  - Add mutex profiling, which collects a variety of statistics useful for
    diagnosing overhead/contention issues. (@interwq)
  - Add C++ new/delete operator bindings. (@djwatson)
  - Support manually created arena destruction, such that all data and metadata
    are discarded. Add MALLCTL_ARENAS_DESTROYED for accessing merged stats
    associated with destroyed arenas. (@jasone)
  - Add MALLCTL_ARENAS_ALL as a fixed index for use in accessing
    merged/destroyed arena statistics via mallctl. (@jasone)
  - Add opt.abort_conf to optionally abort if invalid configuration options are
    detected during initialization. (@interwq)
  - Add opt.stats_print_opts, so that e.g. JSON output can be selected for the
    stats dumped during exit if opt.stats_print is true. (@jasone)
  - Add --with-version=VERSION for use when embedding jemalloc into another
    project's git repository. (@jasone)
  - Add --disable-thp to support cross compiling. (@jasone)
  - Add --with-lg-hugepage to support cross compiling.
    (@jasone)
  - Add mallctl interfaces (various authors):
    + background_thread
    + opt.abort_conf
    + opt.retain
    + opt.percpu_arena
    + opt.background_thread
    + opt.{dirty,muzzy}_decay_ms
    + opt.stats_print_opts
    + arena.<i>.initialized
    + arena.<i>.destroy
    + arena.<i>.{dirty,muzzy}_decay_ms
    + arena.<i>.extent_hooks
    + arenas.{dirty,muzzy}_decay_ms
    + arenas.bin.<i>.slab_size
    + arenas.nlextents
    + arenas.lextent.<i>.size
    + arenas.create
    + stats.background_thread.{num_threads,num_runs,run_interval}
    + stats.mutexes.{ctl,background_thread,prof,reset}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.{dirty,muzzy}_decay_ms
    + stats.arenas.<i>.uptime
    + stats.arenas.<i>.{pmuzzy,base,internal,resident}
    + stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}
    + stats.arenas.<i>.bins.<j>.{nslabs,reslabs,curslabs}
    + stats.arenas.<i>.bins.<j>.mutex.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.lextents.<j>.{nmalloc,ndalloc,nrequests,curlextents}
    + stats.arenas.<i>.mutexes.{large,extent_avail,extents_dirty,extents_muzzy,
      extents_retained,decay_dirty,decay_muzzy,base,tcache_list}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}

  Portability improvements:
  - Improve reentrant allocation support, such that deadlock is less likely if
    e.g. a system library call in turn allocates memory. (@davidtgoldblatt,
    @interwq)
  - Support static linking of jemalloc with glibc. (@djwatson)

  Optimizations and refactors:
  - Organize virtual memory as "extents" of virtual memory pages, rather than as
    naturally aligned "chunks", and store all metadata in arbitrarily distant
    locations. This reduces virtual memory external fragmentation, and will
    interact better with huge pages (not yet explicitly supported). (@jasone)
  - Fold large and huge size classes together; only small and large size classes
    remain. (@jasone)
  - Unify the allocation paths, and merge most fast-path branching decisions.
    (@davidtgoldblatt, @interwq)
  - Embed per thread automatic tcache into thread-specific data, which reduces
    conditional branches and dereferences. Also reorganize tcache to increase
    fast-path data locality. (@interwq)
  - Rewrite atomics to closely model the C11 API, convert various
    synchronization from mutex-based to atomic, and use the explicit memory
    ordering control to resolve various hypothetical races without increasing
    synchronization overhead.
    (@davidtgoldblatt)
  - Extensively optimize rtree via various methods:
    + Add multiple layers of rtree lookup caching, since rtree lookups are now
      part of fast-path deallocation. (@interwq)
    + Determine rtree layout at compile time. (@jasone)
    + Make the tree shallower for common configurations. (@jasone)
    + Embed the root node in the top-level rtree data structure, thus avoiding
      one level of indirection. (@jasone)
    + Further specialize leaf elements as compared to internal node elements,
      and directly embed extent metadata needed for fast-path deallocation.
      (@jasone)
    + Ignore leading always-zero address bits (architecture-specific).
      (@jasone)
  - Reorganize headers (ongoing work) to make them hermetic, and disentangle
    various module dependencies. (@davidtgoldblatt)
  - Convert various internal data structures such as size class metadata from
    boot-time-initialized to compile-time-initialized. Propagate resulting data
    structure simplifications, such as making arena metadata fixed-size.
    (@jasone)
  - Simplify size class lookups when constrained to size classes that are
    multiples of the page size. This speeds lookups, but the primary benefit is
    complexity reduction in code that was the source of numerous regressions.
    (@jasone)
  - Lock individual extents when possible for localized extent operations,
    rather than relying on a top-level arena lock.
    (@davidtgoldblatt, @jasone)
  - Use first fit layout policy instead of best fit, in order to improve
    packing. (@jasone)
  - If munmap(2) is not in use, use an exponential series to grow each arena's
    virtual memory, so that the number of disjoint virtual memory mappings
    remains low. (@jasone)
  - Implement per arena base allocators, so that arenas never share any virtual
    memory pages. (@jasone)
  - Automatically generate private symbol name mangling macros. (@jasone)

  Incompatible changes:
  - Replace chunk hooks with an expanded/normalized set of extent hooks.
    (@jasone)
  - Remove ratio-based purging. (@jasone)
  - Remove --disable-tcache. (@jasone)
  - Remove --disable-tls. (@jasone)
  - Remove --enable-ivsalloc. (@jasone)
  - Remove --with-lg-size-class-group. (@jasone)
  - Remove --with-lg-tiny-min. (@jasone)
  - Remove --disable-cc-silence. (@jasone)
  - Remove --enable-code-coverage. (@jasone)
  - Remove --disable-munmap (replaced by opt.retain). (@jasone)
  - Remove Valgrind support. (@jasone)
  - Remove quarantine support. (@jasone)
  - Remove redzone support.
    (@jasone)
  - Remove mallctl interfaces (various authors):
    + config.munmap
    + config.tcache
    + config.tls
    + config.valgrind
    + opt.lg_chunk
    + opt.purge
    + opt.lg_dirty_mult
    + opt.decay_time
    + opt.quarantine
    + opt.redzone
    + opt.thp
    + arena.<i>.lg_dirty_mult
    + arena.<i>.decay_time
    + arena.<i>.chunk_hooks
    + arenas.initialized
    + arenas.lg_dirty_mult
    + arenas.decay_time
    + arenas.bin.<i>.run_size
    + arenas.nlruns
    + arenas.lrun.<i>.size
    + arenas.nhchunks
    + arenas.hchunk.<i>.size
    + arenas.extend
    + stats.cactive
    + stats.arenas.<i>.lg_dirty_mult
    + stats.arenas.<i>.decay_time
    + stats.arenas.<i>.metadata.{mapped,allocated}
    + stats.arenas.<i>.{npurge,nmadvise,purged}
    + stats.arenas.<i>.huge.{allocated,nmalloc,ndalloc,nrequests}
    + stats.arenas.<i>.bins.<j>.{nruns,reruns,curruns}
    + stats.arenas.<i>.lruns.<j>.{nmalloc,ndalloc,nrequests,curruns}
    + stats.arenas.<i>.hchunks.<j>.{nmalloc,ndalloc,nrequests,curhchunks}

  Bug fixes:
  - Improve interval-based profile dump triggering to dump only one profile when
    a single allocation's size exceeds the interval.
    (@jasone)
  - Use prefixed function names (as controlled by --with-jemalloc-prefix) when
    pruning backtrace frames in jeprof. (@jasone)

* 4.5.0 (February 28, 2017)

  This is the first release to benefit from much broader continuous integration
  testing, thanks to @davidtgoldblatt. Had we had this testing infrastructure
  in place for prior releases, it would have caught all of the most serious
  regressions fixed by this release.

  New features:
  - Add --disable-thp and the opt.thp mallctl to provide opt-out mechanisms for
    transparent huge page integration. (@jasone)
  - Update zone allocator integration to work with macOS 10.12. (@glandium)
  - Restructure *CFLAGS configuration, so that CFLAGS behaves typically, and
    EXTRA_CFLAGS provides a way to specify e.g. -Werror during building, but not
    during configuration. (@jasone, @ronawho)

  Bug fixes:
  - Fix DSS (sbrk(2)-based) allocation. This regression was first released in
    4.3.0. (@jasone)
  - Handle race in per size class utilization computation. This functionality
    was first released in 4.0.0. (@interwq)
  - Fix lock order reversal during gdump. (@jasone)
  - Fix/refactor tcache synchronization. This regression was first released in
    4.0.0. (@jasone)
  - Fix various JSON-formatted malloc_stats_print() bugs. This functionality
    was first released in 4.3.0.
    (@jasone)
  - Fix huge-aligned allocation. This regression was first released in 4.4.0.
    (@jasone)
  - When transparent huge page integration is enabled, detect what state pages
    start in according to the kernel's current operating mode, and only convert
    arena chunks to non-huge during purging if that is not their initial state.
    This functionality was first released in 4.4.0. (@jasone)
  - Fix lg_chunk clamping for the --enable-cache-oblivious --disable-fill case.
    This regression was first released in 4.0.0. (@jasone, @428desmo)
  - Properly detect sparc64 when building for Linux. (@glaubitz)

* 4.4.0 (December 3, 2016)

  New features:
  - Add configure support for *-*-linux-android. (@cferris1000, @jasone)
  - Add the --disable-syscall configure option, for use on systems that place
    security-motivated limitations on syscall(2). (@jasone)
  - Add support for Debian GNU/kFreeBSD. (@thesam)

  Optimizations:
  - Add extent serial numbers and use them where appropriate as a sort key that
    is higher priority than address, so that the allocation policy prefers older
    extents. This tends to improve locality (decrease fragmentation) when
    memory grows downward. (@jasone)
  - Refactor madvise(2) configuration so that MADV_FREE is detected and utilized
    on Linux 4.5 and newer. (@jasone)
  - Mark partially purged arena chunks as non-huge-page.
    This improves interaction with Linux's transparent huge page
    functionality. (@jasone)

  Bug fixes:
  - Fix size class computations for edge conditions involving extremely large
    allocations. This regression was first released in 4.0.0. (@jasone,
    @ingvarha)
  - Remove overly restrictive assertions related to the cactive statistic. This
    regression was first released in 4.1.0. (@jasone)
  - Implement a more reliable detection scheme for os_unfair_lock on macOS.
    (@jszakmeister)

* 4.3.1 (November 7, 2016)

  Bug fixes:
  - Fix a severe virtual memory leak. This regression was first released in
    4.3.0. (@interwq, @jasone)
  - Refactor atomic and prng APIs to restore support for 32-bit platforms that
    use pre-C11 toolchains, e.g. FreeBSD's mips. (@jasone)

* 4.3.0 (November 4, 2016)

  This is the first release that passes the test suite for multiple Windows
  configurations, thanks in large part to @glandium setting up continuous
  integration via AppVeyor (and Travis CI for Linux and OS X).

  New features:
  - Add "J" (JSON) support to malloc_stats_print(). (@jasone)
  - Add Cray compiler support. (@ronawho)

  Optimizations:
  - Add/use adaptive spinning for bootstrapping and radix tree node
    initialization.
    (@jasone)

  Bug fixes:
  - Fix large allocation to search starting in the optimal size class heap,
    which can substantially reduce virtual memory churn and fragmentation. This
    regression was first released in 4.0.0. (@mjp41, @jasone)
  - Fix stats.arenas.<i>.nthreads accounting. (@interwq)
  - Fix and simplify decay-based purging. (@jasone)
  - Make DSS (sbrk(2)-related) operations lockless, which resolves potential
    deadlocks during thread exit. (@jasone)
  - Fix over-sized allocation of radix tree leaf nodes. (@mjp41, @ogaun,
    @jasone)
  - Fix over-sized allocation of arena_t (plus associated stats) data
    structures. (@jasone, @interwq)
  - Fix EXTRA_CFLAGS to not affect configuration. (@jasone)
  - Fix a Valgrind integration bug. (@ronawho)
  - Disallow 0x5a junk filling when running in Valgrind. (@jasone)
  - Fix a file descriptor leak on Linux. This regression was first released in
    4.2.0. (@vsarunas, @jasone)
  - Fix static linking of jemalloc with glibc. (@djwatson)
  - Use syscall(2) rather than {open,read,close}(2) during boot on Linux. This
    works around other libraries' system call wrappers performing reentrant
    allocation. (@kspinka, @Whissi, @jasone)
  - Fix OS X default zone replacement to work with OS X 10.12.
    (@glandium, @jasone)
  - Fix cached memory management to avoid needless commit/decommit operations
    during purging, which resolves permanent virtual memory map fragmentation
    issues on Windows. (@mjp41, @jasone)
  - Fix TSD fetches to avoid (recursive) allocation. This is relevant to
    non-TLS and Windows configurations. (@jasone)
  - Fix malloc_conf overriding to work on Windows. (@jasone)
  - Forcibly disable lazy-lock on Windows (was forcibly *enabled*). (@jasone)

* 4.2.1 (June 8, 2016)

  Bug fixes:
  - Fix bootstrapping issues for configurations that require allocation during
    tsd initialization (e.g. --disable-tls). (@cferris1000, @jasone)
  - Fix gettimeofday() version of nstime_update(). (@ronawho)
  - Fix Valgrind regressions in calloc() and chunk_alloc_wrapper(). (@ronawho)
  - Fix potential VM map fragmentation regression. (@jasone)
  - Fix opt_zero-triggered in-place huge reallocation zeroing. (@jasone)
  - Fix heap profiling context leaks in reallocation edge cases. (@jasone)

* 4.2.0 (May 12, 2016)

  New features:
  - Add the arena.<i>.reset mallctl, which makes it possible to discard all of
    an arena's allocations in a single operation. (@jasone)
  - Add the stats.retained and stats.arenas.<i>.retained statistics. (@jasone)
  - Add the --with-version configure option.
    (@jasone)
  - Support --with-lg-page values larger than actual page size. (@jasone)

  Optimizations:
  - Use pairing heaps rather than red-black trees for various hot data
    structures. (@djwatson, @jasone)
  - Streamline fast paths of rtree operations. (@jasone)
  - Optimize the fast paths of calloc() and [m,d,sd]allocx(). (@jasone)
  - Decommit unused virtual memory if the OS does not overcommit. (@jasone)
  - Specify MAP_NORESERVE on Linux if [heuristic] overcommit is active, in order
    to avoid unfortunate interactions during fork(2). (@jasone)

  Bug fixes:
  - Fix chunk accounting related to triggering gdump profiles. (@jasone)
  - Link against librt for clock_gettime(2) if glibc < 2.17. (@jasone)
  - Scale leak report summary according to sampling probability. (@jasone)

* 4.1.1 (May 3, 2016)

  This bugfix release resolves a variety of mostly minor issues, though the
  bitmap fix is critical for 64-bit Windows.

  Bug fixes:
  - Fix the linear scan version of bitmap_sfu() to shift by the proper amount
    even when sizeof(long) is not the same as sizeof(void *), as on 64-bit
    Windows. (@jasone)
  - Fix hashing functions to avoid unaligned memory accesses (and resulting
    crashes). This is relevant at least to some ARM-based platforms.
    (@rkmisra)
  - Fix fork()-related lock rank ordering reversals.
    These reversals were unlikely to cause deadlocks in practice except when
    heap profiling was enabled and active. (@jasone)
  - Fix various chunk leaks in OOM code paths. (@jasone)
  - Fix malloc_stats_print() to print opt.narenas correctly. (@jasone)
  - Fix MSVC-specific build/test issues. (@rustyx, @yuslepukhin)
  - Fix a variety of test failures that were due to test fragility rather than
    core bugs. (@jasone)

* 4.1.0 (February 28, 2016)

  This release is primarily about optimizations, but it also incorporates a lot
  of portability-motivated refactoring and enhancements. Many people worked on
  this release. Even with minor changes (see git revision history) and the
  people who reported and diagnosed issues omitted here, so much of the work
  was contributed that, starting with this release, changes are annotated with
  author credits to help reflect the collaborative effort involved.

  New features:
  - Implement decay-based unused dirty page purging, a major optimization with
    mallctl API impact. This is an alternative to the existing ratio-based
    unused dirty page purging, and is intended to eventually become the sole
    purging mechanism.
    New mallctls:
    + opt.purge
    + opt.decay_time
    + arena.<i>.decay
    + arena.<i>.decay_time
    + arenas.decay_time
    + stats.arenas.<i>.decay_time
    (@jasone, @cevans87)
  - Add --with-malloc-conf, which makes it possible to embed a default
    options string during configuration. This was motivated by the desire to
    specify --with-malloc-conf=purge:decay, since the default must remain
    purge:ratio until the 5.0.0 release. (@jasone)
  - Add MS Visual Studio 2015 support. (@rustyx, @yuslepukhin)
  - Make *allocx() size class overflow behavior defined. The maximum
    size class is now less than PTRDIFF_MAX to protect applications against
    numerical overflow, and all allocation functions are guaranteed to indicate
    errors rather than potentially crashing if the request size exceeds the
    maximum size class. (@jasone)
  - jeprof:
    + Add raw heap profile support. (@jasone)
    + Add --retain and --exclude for backtrace symbol filtering. (@jasone)

  Optimizations:
  - Optimize the fast path to combine various bootstrapping and configuration
    checks and execute more streamlined code in the common case. (@interwq)
  - Use linear scan for small bitmaps (used for small object tracking). In
    addition to speeding up bitmap operations on 64-bit systems, this reduces
    allocator metadata overhead by approximately 0.2%.
    (@djwatson)
  - Separate arena_avail trees, which substantially speeds up run tree
    operations. (@djwatson)
  - Use memoization (boot-time-computed table) for run quantization. Separate
    arena_avail trees reduced the importance of this optimization. (@jasone)
  - Attempt mmap-based in-place huge reallocation. This can dramatically speed
    up incremental huge reallocation. (@jasone)

  Incompatible changes:
  - Make opt.narenas unsigned rather than size_t. (@jasone)

  Bug fixes:
  - Fix stats.cactive accounting regression. (@rustyx, @jasone)
  - Handle unaligned keys in hash(). This caused problems for some ARM systems.
    (@jasone, @cferris1000)
  - Refactor arenas array. In addition to fixing a fork-related deadlock, this
    makes arena lookups faster and simpler. (@jasone)
  - Move retained memory allocation out of the default chunk allocation
    function, to a location that gets executed even if the application installs
    a custom chunk allocation function. This resolves a virtual memory leak.
    (@buchgr)
  - Fix a potential tsd cleanup leak. (@cferris1000, @jasone)
  - Fix run quantization. In practice this bug had no impact unless
    applications requested memory with alignment exceeding one page.
    (@jasone, @djwatson)
  - Fix LinuxThreads-specific bootstrapping deadlock. (Cosmin Paraschiv)
  - jeprof:
    + Don't discard curl options if timeout is not defined.
      (@djwatson)
    + Detect failed profile fetches. (@djwatson)
  - Fix stats.arenas.<i>.{dss,lg_dirty_mult,decay_time,pactive,pdirty} for
    the --disable-stats case. (@jasone)

* 4.0.4 (October 24, 2015)

  This bugfix release fixes another xallocx() regression. No other regressions
  have come to light in over a month, so this is likely a good starting point
  for people who prefer to wait for "dot one" releases with all the major issues
  shaken out.

  Bug fixes:
  - Fix xallocx(..., MALLOCX_ZERO) to zero the last full trailing page of large
    allocations that have been randomly assigned an offset of 0 when the
    --enable-cache-oblivious configure option is enabled.

* 4.0.3 (September 24, 2015)

  This bugfix release continues the trend of xallocx() and heap profiling fixes.

  Bug fixes:
  - Fix xallocx(..., MALLOCX_ZERO) to zero all trailing bytes of large
    allocations when the --enable-cache-oblivious configure option is enabled.
  - Fix xallocx(..., MALLOCX_ZERO) to zero trailing bytes of huge allocations
    when resizing from/to a size class that is not a multiple of the chunk size.
  - Fix prof_tctx_dump_iter() to filter out nodes that were created after heap
    profile dumping started.
  - Work around a potentially bad thread-specific data initialization
    interaction with NPTL (glibc's pthreads implementation).

* 4.0.2 (September 21, 2015)

  This bugfix release addresses a few bugs specific to heap profiling.

  Bug fixes:
  - Fix ixallocx_prof_sample() to never modify nor create sampled small
    allocations. xallocx() is in general incapable of moving small allocations,
    so this fix removes buggy code without loss of generality.
  - Fix irallocx_prof_sample() to always allocate large regions, even when
    alignment is non-zero.
  - Fix prof_alloc_rollback() to read tdata from thread-specific data rather
    than dereferencing a potentially invalid tctx.

* 4.0.1 (September 15, 2015)

  This is a bugfix release that is somewhat high risk due to the amount of
  refactoring required to address deep xallocx() problems. As a side effect of
  these fixes, xallocx() now tries harder to partially fulfill requests for
  optional extra space. Note that a couple of minor heap profiling
  optimizations are included, but these are better thought of as performance
  fixes that were integral to discovering most of the other bugs.

  Optimizations:
  - Avoid a chunk metadata read in arena_prof_tctx_set(), since it is in the
    fast path when heap profiling is enabled. Additionally, split a special
    case out into arena_prof_tctx_reset(), which also avoids chunk metadata
    reads.
  - Optimize irallocx_prof() to optimistically update the sampler state.
    The prior implementation appears to have been a holdover from when
    rallocx()/xallocx() functionality was combined as rallocm().

  Bug fixes:
  - Fix TLS configuration such that it is enabled by default for platforms on
    which it works correctly.
  - Fix arenas_cache_cleanup() and arena_get_hard() to handle
    allocation/deallocation within the application's thread-specific data
    cleanup functions even after arenas_cache is torn down.
  - Fix xallocx() bugs related to size+extra exceeding HUGE_MAXCLASS.
  - Fix chunk purge hook calls for in-place huge shrinking reallocation to
    specify the old chunk size rather than the new chunk size. This bug caused
    no correctness issues for the default chunk purge function, but was
    visible to custom functions set via the "arena.<i>.chunk_hooks" mallctl.
  - Fix heap profiling bugs:
    + Fix heap profiling to distinguish among otherwise identical sample sites
      with interposed resets (triggered via the "prof.reset" mallctl). This bug
      could cause data structure corruption that would most likely result in a
      segfault.
    + Fix irealloc_prof() to call prof_alloc_rollback() on OOM.
    + Make one call to prof_active_get_unlocked() per allocation event, and use
      the result throughout the relevant functions that handle an allocation
      event. Also add a missing check in prof_realloc(). These fixes protect
      allocation events against concurrent prof_active changes.
    + Fix ixallocx_prof() to pass usize_max and zero to ixallocx_prof_sample()
      in the correct order.
    + Fix prof_realloc() to call prof_free_sampled_object() after calling
      prof_malloc_sample_object(). Prior to this fix, if tctx and old_tctx were
      the same, the tctx could have been prematurely destroyed.
  - Fix portability bugs:
    + Don't bitshift by negative amounts when encoding/decoding run sizes in
      chunk header maps. This affected systems with page sizes greater than 8
      KiB.
    + Rename index_t to szind_t to avoid an existing type on Solaris.
    + Add JEMALLOC_CXX_THROW to the memalign() function prototype, in order to
      match glibc and avoid compilation errors when including both
      jemalloc/jemalloc.h and malloc.h in C++ code.
    + Don't assume that /bin/sh is appropriate when running size_classes.sh
      during configuration.
    + Consider __sparcv9 a synonym for __sparc64__ when defining LG_QUANTUM.
    + Link tests to librt if it contains clock_gettime(2).

* 4.0.0 (August 17, 2015)

  This version contains many speed and space optimizations, both minor and
  major. The major themes are generalization, unification, and simplification.
  Although many of these optimizations cause no visible behavior change, their
  cumulative effect is substantial.

  New features:
  - Normalize size class spacing to be consistent across the complete size
    range.
    By default there are four size classes per size doubling, but this
    is now configurable via the --with-lg-size-class-group option. Also add the
    --with-lg-page, --with-lg-page-sizes, --with-lg-quantum, and
    --with-lg-tiny-min options, which can be used to tweak page and size class
    settings. Impacts:
    + Worst case performance for incrementally growing/shrinking reallocation
      is improved because there are far fewer size classes, and therefore
      copying happens less often.
    + Internal fragmentation is limited to 20% for all but the smallest size
      classes (those less than four times the quantum). Requests of
      (1 B + 4 KiB) and (1 B + 4 MiB) previously suffered nearly 50% internal
      fragmentation.
    + Chunk fragmentation tends to be lower because there are fewer distinct run
      sizes to pack.
  - Add support for explicit tcaches. The "tcache.create", "tcache.flush", and
    "tcache.destroy" mallctls control tcache lifetime and flushing, and the
    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to the *allocx() API
    control which tcache is used for each operation.
  - Implement per thread heap profiling, as well as the ability to
    enable/disable heap profiling on a per thread basis. Add the "prof.reset",
    "prof.lg_sample", "thread.prof.name", "thread.prof.active",
    "opt.prof_thread_active_init", and "prof.thread_active_init" mallctls.
  - Add support for per arena application-specified chunk allocators, configured
    via the "arena.<i>.chunk_hooks" mallctl.
  - Refactor huge allocation to be managed by arenas, so that arenas now
    function as general purpose independent allocators. This is important in
    the context of user-specified chunk allocators, aside from the scalability
    benefits. Related new statistics:
    + The "stats.arenas.<i>.huge.allocated", "stats.arenas.<i>.huge.nmalloc",
      "stats.arenas.<i>.huge.ndalloc", and "stats.arenas.<i>.huge.nrequests"
      mallctls provide high level per arena huge allocation statistics.
    + The "arenas.nhchunks", "arenas.hchunk.<i>.size",
      "stats.arenas.<i>.hchunks.<j>.nmalloc",
      "stats.arenas.<i>.hchunks.<j>.ndalloc",
      "stats.arenas.<i>.hchunks.<j>.nrequests", and
      "stats.arenas.<i>.hchunks.<j>.curhchunks" mallctls provide per size class
      statistics.
  - Add the 'util' column to malloc_stats_print() output, which reports the
    proportion of available regions that are currently in use for each small
    size class.
  - Add "alloc" and "free" modes for junk filling (see the "opt.junk"
    mallctl), so that it is possible to separately enable junk filling for
    allocation versus deallocation.
  - Add the jemalloc-config script, which provides information about how
    jemalloc was configured, and how to integrate it into application builds.
  - Add metadata statistics, which are accessible via the "stats.metadata",
    "stats.arenas.<i>.metadata.mapped", and
    "stats.arenas.<i>.metadata.allocated" mallctls.
  - Add the "stats.resident" mallctl, which reports the upper limit of
    physically resident memory mapped by the allocator.
  - Add per arena control over unused dirty page purging, via the
    "arenas.lg_dirty_mult", "arena.<i>.lg_dirty_mult", and
    "stats.arenas.<i>.lg_dirty_mult" mallctls.
  - Add the "prof.gdump" mallctl, which makes it possible to toggle the gdump
    feature on/off during program execution.
  - Add sdallocx(), which implements sized deallocation. The primary
    optimization over dallocx() is the removal of a metadata read, which often
    suffers an L1 cache miss.
  - Add missing header includes in jemalloc/jemalloc.h, so that applications
    only have to #include <jemalloc/jemalloc.h>.
  - Add support for additional platforms:
    + Bitrig
    + Cygwin
    + DragonFlyBSD
    + iOS
    + OpenBSD
    + OpenRISC/or1k

  Optimizations:
  - Maintain dirty runs in per arena LRUs rather than in per arena trees of
    dirty-run-containing chunks. In practice this change significantly reduces
    dirty page purging volume.
  - Integrate whole chunks into the unused dirty page purging machinery.
    This reduces the cost of repeated huge allocation/deallocation, because it
    effectively introduces a cache of chunks.
  - Split the arena chunk map into two separate arrays, in order to increase
    cache locality for the frequently accessed bits.
  - Move small run metadata out of runs, into arena chunk headers. This reduces
    run fragmentation, smaller runs reduce external fragmentation for small size
    classes, and packed (less uniformly aligned) metadata layout improves CPU
    cache set distribution.
  - Randomly distribute large allocation base pointer alignment relative to page
    boundaries in order to more uniformly utilize CPU cache sets. This can be
    disabled via the --disable-cache-oblivious configure option, and queried via
    the "config.cache_oblivious" mallctl.
  - Micro-optimize the fast paths for the public API functions.
  - Refactor thread-specific data to reside in a single structure. This assures
    that only a single TLS read is necessary per call into the public API.
  - Implement in-place huge allocation growing and shrinking.
  - Refactor rtree (radix tree for chunk lookups) to be lock-free, and make
    additional optimizations that reduce maximum lookup depth to one or two
    levels. This resolves what was a concurrency bottleneck for per arena huge
    allocation, because a global data structure is critical for determining
    which arenas own which huge allocations.
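
  The size class spacing described under "New features" for 4.0.0 (four size
  classes per size doubling, internal fragmentation limited to 20%) can be
  checked with a little arithmetic. The following is only an illustrative
  sketch, not jemalloc's source: the names size_class_round_up() and
  internal_fragmentation_pct() are hypothetical, a 16-byte quantum is assumed,
  and tiny size classes are ignored.

```c
#include <stddef.h>

/* Assumed parameters for this sketch (not read from any jemalloc build). */
#define LG_QUANTUM 4           /* assumed 16-byte quantum */
#define LG_SIZE_CLASS_GROUP 2  /* four size classes per size doubling */

/* Floor of log2(x), for x > 0. */
static unsigned lg_floor_sz(size_t x) {
    unsigned lg = 0;
    while (x >>= 1) {
        lg++;
    }
    return lg;
}

/*
 * Hypothetical helper: round a request up to the next size class when each
 * power-of-two interval is divided into 2^LG_SIZE_CLASS_GROUP equal steps.
 * Valid for 0 < size <= SIZE_MAX / 2.
 */
static size_t size_class_round_up(size_t size) {
    /* ceil(log2(size)) */
    unsigned x = lg_floor_sz((size << 1) - 1);
    /* Spacing between classes in this interval, never below the quantum. */
    unsigned lg_delta = (x < LG_SIZE_CLASS_GROUP + LG_QUANTUM + 1)
        ? LG_QUANTUM
        : x - LG_SIZE_CLASS_GROUP - 1;
    size_t delta = (size_t)1 << lg_delta;
    return (size + delta - 1) & ~(delta - 1);
}

/* Share of the rounded-up allocation that the request does not use, in %. */
static double internal_fragmentation_pct(size_t size) {
    size_t usize = size_class_round_up(size);
    return 100.0 * (double)(usize - size) / (double)usize;
}
```

  Under these assumptions a request of 1 B + 4 KiB (4097 bytes) rounds up to
  5120 bytes and a request of 1 B + 4 MiB rounds up to 5 MiB, so in both cases
  the wasted share stays just under 20%.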

  Incompatible changes:
  - Replace --enable-cc-silence with --disable-cc-silence to suppress spurious
    warnings by default.
  - Assure that the constness of malloc_usable_size()'s return type matches that
    of the system implementation.
  - Change the heap profile dump format to support per thread heap profiling,
    rename pprof to jeprof, and enhance it with the --thread=<n> option. As a
    result, the bundled jeprof must now be used rather than the upstream
    (gperftools) pprof.
  - Disable "opt.prof_final" by default, in order to avoid atexit(3), which can
    internally deadlock on some platforms.
  - Change the "arenas.nlruns" mallctl type from size_t to unsigned.
  - Replace the "stats.arenas.<i>.bins.<j>.allocated" mallctl with
    "stats.arenas.<i>.bins.<j>.curregs".
  - Ignore MALLOC_CONF in set{uid,gid,cap} binaries.
  - Ignore MALLOCX_ARENA(a) in dallocx(), in favor of using the
    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to control tcache usage.

  Removed features:
  - Remove the *allocm() API, which is superseded by the *allocx() API.
  - Remove the --enable-dss option, and make dss non-optional on all platforms
    which support sbrk(2).
  - Remove the "arenas.purge" mallctl, which was obsoleted by the
    "arena.<i>.purge" mallctl in 3.1.0.
  - Remove the unnecessary "opt.valgrind" mallctl; jemalloc automatically
    detects whether it is running inside Valgrind.
  - Remove the "stats.huge.allocated", "stats.huge.nmalloc", and
    "stats.huge.ndalloc" mallctls.
  - Remove the --enable-mremap option.
  - Remove the "stats.chunks.current", "stats.chunks.total", and
    "stats.chunks.high" mallctls.

  Bug fixes:
  - Fix the cactive statistic to decrease (rather than increase) when active
    memory decreases. This regression was first released in 3.5.0.
  - Fix OOM handling in memalign() and valloc(). A variant of this bug existed
    in all releases since 2.0.0, which introduced these functions.
  - Fix an OOM-related regression in arena_tcache_fill_small(), which could
    cause cache corruption on OOM. This regression was present in all releases
    from 2.2.0 through 3.6.0.
  - Fix size class overflow handling for malloc(), posix_memalign(), memalign(),
    calloc(), and realloc() when profiling is enabled.
  - Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
    "secondary" precedence is specified, but sbrk(2) is not supported.
  - Fix fallback lg_floor() implementations to handle extremely large inputs.
  - Ensure the default purgeable zone is after the default zone on OS X.
  - Fix latent bugs in atomic_*().
  - Fix the "arena.<i>.dss" mallctl to handle read-only calls.
  - Fix tls_model configuration to enable the initial-exec model when possible.
  - Mark malloc_conf as a weak symbol so that the application can override it.
  - Correctly detect glibc's adaptive pthread mutexes.
  - Fix the --without-export configure option.

* 3.6.0 (March 31, 2014)

  This version contains a critical bug fix for a regression present in 3.5.0 and
  3.5.1.

  Bug fixes:
  - Fix a regression in arena_chunk_alloc() that caused crashes during
    small/large allocation if chunk allocation failed. In the absence of this
    bug, chunk allocation failure would result in allocation failure, e.g. NULL
    return from malloc(). This regression was introduced in 3.5.0.
  - Fix gcc intrinsics-based backtracing by specifying
    -fno-omit-frame-pointer to gcc. Note that the application (and all the
    libraries it links to) must also be compiled with this option for
    backtracing to be reliable.
  - Use dss allocation precedence for huge allocations as well as small/large
    allocations.
  - Fix test assertion failure message formatting. This bug did not manifest on
    x86_64 systems because of implementation subtleties in va_list.
  - Fix inconsequential test failures for hash and SFMT code.

  New features:
  - Support heap profiling on FreeBSD.
    This feature depends on the proc
    filesystem being mounted during heap profile dumping.

* 3.5.1 (February 25, 2014)

  This version primarily addresses minor bugs in test code.

  Bug fixes:
  - Configure Solaris/Illumos to use MADV_FREE.
  - Fix junk filling for mremap(2)-based huge reallocation. This is only
    relevant if configuring with the --enable-mremap option specified.
  - Avoid compilation failure if 'restrict' C99 keyword is not supported by the
    compiler.
  - Add a configure test for SSE2 rather than assuming it is usable on i686
    systems. This fixes test compilation errors, especially on 32-bit Linux
    systems.
  - Fix mallctl argument size mismatches (size_t vs. uint64_t) in the stats unit
    test.
  - Fix/remove flawed alignment-related overflow tests.
  - Prevent compiler optimizations that could change backtraces in the
    prof_accum unit test.

* 3.5.0 (January 22, 2014)

  This version focuses on refactoring and automated testing, though it also
  includes some non-trivial heap profiling optimizations not mentioned below.

  New features:
  - Add the *allocx() API, which is a successor to the experimental *allocm()
    API.
    The *allocx() functions are slightly simpler to use because they have
    fewer parameters, they directly return the results of primary interest, and
    mallocx()/rallocx() avoid the strict aliasing pitfall that
    allocm()/rallocm() share with posix_memalign(). Note that *allocm() is
    slated for removal in the next non-bugfix release.
  - Add support for LinuxThreads.

  Bug fixes:
  - Unless heap profiling is enabled, disable floating point code and don't link
    with libm. This, in combination with e.g. EXTRA_CFLAGS=-mno-sse on x64
    systems, makes it possible to completely disable floating point register
    use. Some versions of glibc neglect to save/restore caller-saved floating
    point registers during dynamic lazy symbol loading, and the symbol loading
    code uses whatever malloc the application happens to have linked/loaded
    with, the result being potential floating point register corruption.
  - Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
    backtrace creation in imemalign(). This bug impacted posix_memalign() and
    aligned_alloc().
  - Fix a file descriptor leak in a prof_dump_maps() error path.
  - Fix prof_dump() to close the dump file descriptor for all relevant error
    paths.
  - Fix rallocm() to use the arena specified by the ALLOCM_ARENA(s) flag for
    allocation, not just deallocation.
  - Fix a data race for large allocation stats counters.
  - Fix a potential infinite loop during thread exit. This bug occurred on
    Solaris, and could affect other platforms with similar pthreads TSD
    implementations.
  - Don't junk-fill reallocations unless usable size changes. This fixes a
    violation of the *allocx()/*allocm() semantics.
  - Fix growing large reallocation to junk fill new space.
  - Fix huge deallocation to junk fill when munmap is disabled.
  - Change the default private namespace prefix from empty to je_, and change
    --with-private-namespace-prefix so that it prepends an additional prefix
    rather than replacing je_. This reduces the likelihood of applications
    which statically link jemalloc experiencing symbol name collisions.
  - Add missing private namespace mangling (relevant when
    --with-private-namespace is specified).
  - Add and use JEMALLOC_INLINE_C so that static inline functions are marked as
    static even for debug builds.
  - Add a missing mutex unlock in a malloc_init_hard() error path. In practice
    this error path is never executed.
  - Fix numerous bugs in malloc_strtoumax() error handling/reporting. These
    bugs had no impact except for malformed inputs.
  - Fix numerous bugs in malloc_snprintf(). These bugs were not exercised by
    existing calls, so they had no impact.

* 3.4.1 (October 20, 2013)

  Bug fixes:
  - Fix a race in the "arenas.extend" mallctl that could cause memory corruption
    of internal data structures and subsequent crashes.
  - Fix Valgrind integration flaws that caused Valgrind warnings about reads of
    uninitialized memory in:
    + arena chunk headers
    + internal zero-initialized data structures (relevant to tcache and prof
      code)
  - Preserve errno during the first allocation. A readlink(2) call during
    initialization fails unless /etc/malloc.conf exists, so errno was typically
    set during the first allocation prior to this fix.
  - Fix compilation warnings reported by gcc 4.8.1.

* 3.4.0 (June 2, 2013)

  This version is essentially a small bugfix release, but the addition of
  aarch64 support requires that the minor version be incremented.

  Bug fixes:
  - Fix race-triggered deadlocks in chunk_record(). These deadlocks were
    typically triggered by multiple threads concurrently deallocating huge
    objects.

  New features:
  - Add support for the aarch64 architecture.

* 3.3.1 (March 6, 2013)

  This version fixes bugs that are typically encountered only when utilizing
  custom run-time options.

  Bug fixes:
  - Fix a locking order bug that could cause deadlock during fork if heap
    profiling were enabled.
  - Fix a chunk recycling bug that could cause the allocator to lose track of
    whether a chunk was zeroed. On FreeBSD, NetBSD, and OS X, it could cause
    corruption if allocating via sbrk(2) (unlikely unless running with the
    "dss:primary" option specified). This was completely harmless on Linux
    unless using mlockall(2) (and unlikely even then, unless the
    --disable-munmap configure option or the "dss:primary" option was
    specified). This regression was introduced in 3.1.0 by the
    mlockall(2)/madvise(2) interaction fix.
  - Fix TLS-related memory corruption that could occur during thread exit if the
    thread never allocated memory. Only the quarantine and prof facilities were
    susceptible.
  - Fix two quarantine bugs:
    + Internal reallocation of the quarantined object array leaked the old
      array.
    + Reallocation failure for internal reallocation of the quarantined object
      array (very unlikely) resulted in memory corruption.
  - Fix Valgrind integration to annotate all internally allocated memory in a
    way that keeps Valgrind happy about internal data structure access.
  - Fix building for s390 systems.

* 3.3.0 (January 23, 2013)

  This version includes a few minor performance improvements in addition to the
  listed new features and bug fixes.

  New features:
  - Add clipping support to lg_chunk option processing.
  - Add the --enable-ivsalloc option.
  - Add the --without-export option.
  - Add the --disable-zone-allocator option.

  Bug fixes:
  - Fix "arenas.extend" mallctl to output the number of arenas.
  - Fix chunk_recycle() to unconditionally inform Valgrind that returned memory
    is undefined.
  - Fix build break on FreeBSD related to alloca.h.

* 3.2.0 (November 9, 2012)

  In addition to a couple of bug fixes, this version modifies page run
  allocation and dirty page purging algorithms in order to better control
  page-level virtual memory fragmentation.

  Incompatible changes:
  - Change the "opt.lg_dirty_mult" default from 5 to 3 (32:1 to 8:1).

  Bug fixes:
  - Fix dss/mmap allocation precedence code to use recyclable mmap memory only
    after primary dss allocation fails.
  - Fix deadlock in the "arenas.purge" mallctl. This regression was introduced
    in 3.1.0 by the addition of the "arena.<i>.purge" mallctl.
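
  The "opt.lg_dirty_mult" change in 3.2.0 is easiest to read as a ratio: the
  option is the base-2 logarithm of the active:dirty page ratio, so 5 means
  32:1 and 3 means 8:1. The sketch below only illustrates that arithmetic.
  dirty_page_limit() is a hypothetical helper, not a jemalloc function, and
  treating a negative value as "no limit" is an assumption of this sketch.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: the largest number of unused dirty pages tolerated
 * for a given number of active pages, when the active:dirty ratio is
 * 2^lg_dirty_mult : 1.
 */
static size_t dirty_page_limit(size_t nactive, int lg_dirty_mult) {
    if (lg_dirty_mult < 0) {
        /* Assumption of this sketch: a negative value means "no limit". */
        return SIZE_MAX;
    }
    if ((size_t)lg_dirty_mult >= sizeof(size_t) * CHAR_BIT) {
        /* Avoid an out-of-range shift; the limit is effectively zero. */
        return 0;
    }
    return nactive >> lg_dirty_mult;
}
```

  For an arena with 4096 active pages, the old default of 5 gives a limit of
  128 dirty pages, while the new default of 3 gives 512, so more dirty pages
  may accumulate before they are purged.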

* 3.1.0 (October 16, 2012)

  New features:
  - Auto-detect whether running inside Valgrind, thus removing the need to
    manually specify MALLOC_CONF=valgrind:true.
  - Add the "arenas.extend" mallctl, which allows applications to create
    manually managed arenas.
  - Add the ALLOCM_ARENA() flag for {,r,d}allocm().
  - Add the "opt.dss", "arena.<i>.dss", and "stats.arenas.<i>.dss" mallctls,
    which provide control over dss/mmap precedence.
  - Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".
  - Define LG_QUANTUM for hppa.

  Incompatible changes:
  - Disable tcache by default if running inside Valgrind, in order to avoid
    making unallocated objects appear reachable to Valgrind.
  - Drop const from malloc_usable_size() argument on Linux.

  Bug fixes:
  - Fix heap profiling crash if sampled object is freed via realloc(p, 0).
  - Remove const from __*_hook variable declarations, so that glibc can modify
    them during process forking.
  - Fix mlockall(2)/madvise(2) interaction.
  - Fix fork(2)-related deadlocks.
  - Fix error return value for "thread.tcache.enabled" mallctl.

* 3.0.0 (May 11, 2012)

  Although this version adds some major new features, the primary focus is on
  internal code cleanup that facilitates maintainability and portability, most
  of which is not reflected in the ChangeLog. This is the first release to
  incorporate substantial contributions from numerous other developers, and the
  result is a more broadly useful allocator (see the git revision history for
  contribution details). Note that the license has been unified, thanks to
  Facebook granting a license under the same terms as the other copyright
  holders (see COPYING).

  New features:
  - Implement Valgrind support, redzones, and quarantine.
  - Add support for additional platforms:
    + FreeBSD
    + Mac OS X Lion
    + MinGW
    + Windows (no support yet for replacing the system malloc)
  - Add support for additional architectures:
    + MIPS
    + SH4
    + Tilera
  - Add support for cross compiling.
  - Add nallocm(), which rounds a request size up to the nearest size class
    without actually allocating.
  - Implement aligned_alloc() (blame C11).
  - Add the "thread.tcache.enabled" mallctl.
  - Add the "opt.prof_final" mallctl.
  - Update pprof (from gperftools 2.0).
  - Add the --with-mangling option.
  - Add the --disable-experimental option.
  - Add the --disable-munmap option, and make it the default on Linux.
  - Add the --enable-mremap option; use of mremap(2) is now disabled by
    default.

  Incompatible changes:
  - Enable stats by default.
  - Enable fill by default.
  - Disable lazy locking by default.
  - Rename the "tcache.flush" mallctl to "thread.tcache.flush".
  - Rename the "arenas.pagesize" mallctl to "arenas.page".
  - Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).
  - Change the "opt.prof_accum" default from true to false.

  Removed features:
  - Remove the swap feature, including the "config.swap", "swap.avail",
    "swap.prezeroed", "swap.nfds", and "swap.fds" mallctls.
  - Remove highruns statistics, including the
    "stats.arenas.<i>.bins.<j>.highruns" and
    "stats.arenas.<i>.lruns.<j>.highruns" mallctls.
  - As part of small size class refactoring, remove the "opt.lg_[qc]space_max",
    "arenas.cacheline", "arenas.subpage", "arenas.[tqcs]space_{min,max}", and
    "arenas.[tqcs]bins" mallctls.
  - Remove the "arenas.chunksize" mallctl.
  - Remove the "opt.lg_prof_tcmax" option.
  - Remove the "opt.lg_prof_bt_max" option.
  - Remove the "opt.lg_tcache_gc_sweep" option.
  - Remove the --disable-tiny option, including the "config.tiny" mallctl.
  - Remove the --enable-dynamic-page-shift configure option.
  - Remove the --enable-sysv configure option.

  Bug fixes:
  - Fix a statistics-related bug in the "thread.arena" mallctl that could cause
    invalid statistics and crashes.
  - Work around TLS deallocation via free() on Linux. This bug could cause
    write-after-free memory corruption.
  - Fix a potential deadlock that could occur during interval- and
    growth-triggered heap profile dumps.
  - Fix large calloc() zeroing bugs due to dropping chunk map unzeroed flags.
  - Fix chunk_alloc_dss() to stop claiming memory is zeroed. This bug could
    cause memory corruption and crashes with --enable-dss specified.
  - Fix fork-related bugs that could cause deadlock in children between fork
    and exec.
  - Fix malloc_stats_print() to honor 'b' and 'l' in the opts parameter.
  - Fix realloc(p, 0) to act like free(p).
  - Do not enforce minimum alignment in memalign().
  - Check for NULL pointer in malloc_usable_size().
  - Fix an off-by-one heap profile statistics bug that could be observed in
    interval- and growth-triggered heap profiles.
  - Fix the "epoch" mallctl to update cached stats even if the passed in epoch
    is 0.
  - Fix bin->runcur management to fix a layout policy bug. This bug did not
    affect correctness.
  - Fix a bug in choose_arena_hard() that potentially caused more arenas to be
    initialized than necessary.
  - Add missing "opt.lg_tcache_max" mallctl implementation.
  - Use glibc allocator hooks to make mixed allocator usage less likely.
  - Fix build issues for --disable-tcache.
  - Don't mangle pthread_create() when --with-private-namespace is specified.

* 2.2.5 (November 14, 2011)

  Bug fixes:
  - Fix huge_ralloc() race when using mremap(2). This is a serious bug that
    could cause memory corruption and/or crashes.
  - Fix huge_ralloc() to maintain chunk statistics.
  - Fix malloc_stats_print(..., "a") output.

* 2.2.4 (November 5, 2011)

  Bug fixes:
  - Initialize arenas_tsd before using it. This bug existed for 2.2.[0-3], as
    well as for --disable-tls builds in earlier releases.
  - Do not assume a 4 KiB page size in test/rallocm.c.

* 2.2.3 (August 31, 2011)

  This version fixes numerous bugs related to heap profiling.

  Bug fixes:
  - Fix a prof-related race condition. This bug could cause memory corruption,
    but only occurred in non-default configurations (prof_accum:false).
  - Fix off-by-one backtracing issues (make sure that prof_alloc_prep() is
    excluded from backtraces).
  - Fix a prof-related bug in realloc() (only triggered by OOM errors).
  - Fix prof-related bugs in allocm() and rallocm().
  - Fix prof_tdata_cleanup() for --disable-tls builds.
  - Fix a relative include path, to fix objdir builds.

* 2.2.2 (July 30, 2011)

  Bug fixes:
  - Fix a build error for --disable-tcache.
  - Fix assertions in arena_purge() (for real this time).
  - Add the --with-private-namespace option. This is a workaround for symbol
    conflicts that can inadvertently arise when using static libraries.

* 2.2.1 (March 30, 2011)

  Bug fixes:
  - Implement atomic operations for x86/x64. This fixes compilation failures
    for versions of gcc that are still in wide use.
  - Fix an assertion in arena_purge().

* 2.2.0 (March 22, 2011)

  This version incorporates several improvements to algorithms and data
  structures that tend to reduce fragmentation and increase speed.

  New features:
  - Add the "stats.cactive" mallctl.
  - Update pprof (from google-perftools 1.7).
  - Improve backtracing-related configuration logic, and add the
    --disable-prof-libgcc option.

  Bug fixes:
  - Change default symbol visibility from "internal" to "hidden", which
    decreases the overhead of library-internal function calls.
  - Fix symbol visibility so that it is also set on OS X.
  - Fix a build dependency regression caused by the introduction of the .pic.o
    suffix for PIC object files.
  - Add missing checks for mutex initialization failures.
  - Don't use libgcc-based backtracing except on x64, where it is known to work.
  - Fix deadlocks on OS X that were due to memory allocation in
    pthread_mutex_lock().
  - Heap profiling-specific fixes:
    + Fix memory corruption due to integer overflow in small region index
      computation, when using a small enough sample interval that profiling
      context pointers are stored in small run headers.
    + Fix a bootstrap ordering bug that only occurred with TLS disabled.
    + Fix a rallocm() rsize bug.
    + Fix error detection bugs for aligned memory allocation.

* 2.1.3 (March 14, 2011)

  Bug fixes:
  - Fix a cpp logic regression (due to the "thread.{de,}allocatedp" mallctl fix
    for OS X in 2.1.2).
  - Fix a "thread.arena" mallctl bug.
  - Fix a thread cache stats merging bug.

* 2.1.2 (March 2, 2011)

  Bug fixes:
  - Fix "thread.{de,}allocatedp" mallctl for OS X.
  - Add missing jemalloc.a to build system.

* 2.1.1 (January 31, 2011)

  Bug fixes:
  - Fix aligned huge reallocation (affected allocm()).
  - Fix the ALLOCM_LG_ALIGN macro definition.
  - Fix a heap dumping deadlock.
  - Fix a "thread.arena" mallctl bug.

* 2.1.0 (December 3, 2010)

  This version incorporates some optimizations that can't quite be considered
  bug fixes.

  New features:
  - Use Linux's mremap(2) for huge object reallocation when possible.
  - Avoid locking in mallctl*() when possible.
  - Add the "thread.[de]allocatedp" mallctls.
  - Convert the manual page source from roff to DocBook, and generate both roff
    and HTML manuals.

  Bug fixes:
  - Fix a crash due to incorrect bootstrap ordering. This only impacted
    --enable-debug --enable-dss configurations.
  - Fix a minor statistics bug for mallctl("swap.avail", ...).

* 2.0.1 (October 29, 2010)

  Bug fixes:
  - Fix a race condition in heap profiling that could cause undefined behavior
    if "opt.prof_accum" were disabled.
  - Add missing mutex unlocks for some OOM error paths in the heap profiling
    code.
  - Fix a compilation error for non-C99 builds.

* 2.0.0 (October 24, 2010)

  This version focuses on the experimental *allocm() API, and on improved
  run-time configuration/introspection. Nonetheless, numerous performance
  improvements are also included.

  New features:
  - Implement the experimental {,r,s,d}allocm() API, which provides a superset
    of the functionality available via malloc(), calloc(), posix_memalign(),
    realloc(), malloc_usable_size(), and free(). These functions can be used to
    allocate/reallocate aligned zeroed memory, ask for optional extra memory
    during reallocation, prevent object movement during reallocation, etc.
  - Replace JEMALLOC_OPTIONS/JEMALLOC_PROF_PREFIX with MALLOC_CONF, which is
    more human-readable, and more flexible. For example:
      JEMALLOC_OPTIONS=AJP
    is now:
      MALLOC_CONF=abort:true,fill:true,stats_print:true
  - Port to Apple OS X. Sponsored by Mozilla.
  - Make it possible for the application to control thread-->arena mappings via
    the "thread.arena" mallctl.
  - Add compile-time support for all TLS-related functionality via pthreads TSD.
    This is mainly of interest for OS X, which does not support TLS, but has a
    TSD implementation with similar performance.
  - Override memalign() and valloc() if they are provided by the system.
  - Add the "arenas.purge" mallctl, which can be used to synchronously purge all
    dirty unused pages.
  - Make cumulative heap profiling data optional, so that it is possible to
    limit the amount of memory consumed by heap profiling data structures.
  - Add per thread allocation counters that can be accessed via the
    "thread.allocated" and "thread.deallocated" mallctls.

  Incompatible changes:
  - Remove JEMALLOC_OPTIONS and malloc_options (see MALLOC_CONF above).
  - Increase default backtrace depth from 4 to 128 for heap profiling.
  - Disable interval-based profile dumps by default.

  Bug fixes:
  - Remove bad assertions in fork handler functions. These assertions could
    cause aborts for some combinations of configure settings.
  - Fix strerror_r() usage to deal with non-standard semantics in GNU libc.
  - Fix leak context reporting. This bug tended to cause the number of contexts
    to be underreported (though the reported number of objects and bytes were
    correct).
  - Fix a realloc() bug for large in-place growing reallocation. This bug could
    cause memory corruption, but it was hard to trigger.
  - Fix an allocation bug for small allocations that could be triggered if
    multiple threads raced to create a new run of backing pages.
  - Enhance the heap profiler to trigger samples based on usable size, rather
    than request size.
  - Fix a heap profiling bug due to sometimes losing track of requested object
    size for sampled objects.

* 1.0.3 (August 12, 2010)

  Bug fixes:
  - Fix the libunwind-based implementation of stack backtracing (used for heap
    profiling). This bug could cause zero-length backtraces to be reported.
  - Add a missing mutex unlock in library initialization code. If multiple
    threads raced to initialize malloc, some of them could end up permanently
    blocked.

* 1.0.2 (May 11, 2010)

  Bug fixes:
  - Fix junk filling of large objects, which could cause memory corruption.
  - Add MAP_NORESERVE support for chunk mapping, because otherwise virtual
    memory limits could cause swap file configuration to fail. Contributed by
    Jordan DeLong.

* 1.0.1 (April 14, 2010)

  Bug fixes:
  - Fix compilation when --enable-fill is specified.
  - Fix threads-related profiling bugs that affected accuracy and caused memory
    to be leaked during thread exit.
  - Fix dirty page purging race conditions that could cause crashes.
  - Fix crash in tcache flushing code during thread destruction.

* 1.0.0 (April 11, 2010)

  This release focuses on speed and run-time introspection. Numerous
  algorithmic improvements make this release substantially faster than its
  predecessors.

  New features:
  - Implement autoconf-based configuration system.
  - Add mallctl*(), for the purposes of introspection and run-time
    configuration.
  - Make it possible for the application to manually flush a thread's cache, via
    the "tcache.flush" mallctl.
  - Base maximum dirty page count on proportion of active memory.
  - Compute various additional run-time statistics, including per size class
    statistics for large objects.
  - Expose malloc_stats_print(), which can be called repeatedly by the
    application.
  - Simplify the malloc_message() signature to only take one string argument,
    and incorporate an opaque data pointer argument for use by the application
    in combination with malloc_stats_print().
  - Add support for allocation backed by one or more swap files, and allow the
    application to disable over-commit if swap files are in use.
  - Implement allocation profiling and leak checking.

  Removed features:
  - Remove the dynamic arena rebalancing code, since thread-specific caching
    reduces its utility.

  Bug fixes:
  - Modify chunk allocation to work when address space layout randomization
    (ASLR) is in use.
  - Fix thread cleanup bugs related to TLS destruction.
  - Handle 0-size allocation requests in posix_memalign().
  - Fix a chunk leak.
    The leaked chunks were never touched, so this impacted
    virtual memory usage, but not physical memory usage.

* linux_2008082[78]a (August 27/28, 2008)

  These snapshot releases are the simple result of incorporating Linux-specific
  support into the FreeBSD malloc sources.

--------------------------------------------------------------------------------
vim:filetype=text:textwidth=80