Commit message | Author | Age | Files | Lines
* Allow overriding `LG_PAGE` (HEAD, dev) | Kevin Svetlitski | 2023-05-17 | 2 | -0/+18
This is useful for our internal builds where we override the configuration in the header files generated by autoconf.
* Remove dead stores detected by static analysis | Kevin Svetlitski | 2023-05-11 | 6 | -13/+7
None of these are harmful, and they are almost certainly optimized away by the compiler. The motivation for fixing them anyway is that we'd like to enable static analysis as part of CI, and the first step towards that is resolving the warnings it produces at present.
* Fix possible `NULL` pointer dereference from `mallctl("prof.prefix", ...)` | Kevin Svetlitski | 2023-05-11 | 1 | -0/+3
Static analysis flagged this issue. Here is a minimal program which causes a segfault within Jemalloc:
```
#include <jemalloc/jemalloc.h>

const char *malloc_conf = "prof:true";

int main() {
  mallctl("prof.prefix", NULL, NULL, NULL, 0);
}
```
Fixed by checking if `prefix` is `NULL`.
* Add the prof_sys_thread_name feature in the prof_recent unit test. | Qi Wang | 2023-05-11 | 4 | -11/+29
This tests the combination of the prof_recent and thread_name features. Verified that it catches the issue being fixed in this PR.
Also explicitly set thread name in test/unit/prof_recent. This fixes the name testing when no default thread name is set (e.g. FreeBSD).
* Fix the prof thread_name reference in prof_recent dump. | Qi Wang | 2023-05-11 | 1 | -2/+4
As pointed out in #2434, the thread_name in prof_tdata_t was changed in #2407. The prof_recent dump needs a matching update: the emitter expects a "char **", which this commit now passes.
* Add config detection for JEMALLOC_HAVE_PTHREAD_SET_NAME_NP. | Qi Wang | 2023-05-11 | 3 | -1/+14
and use it on the background thread name setting.
* If ptr present check if alloc_ctx.edata == NULL | auxten | 2023-05-10 | 1 | -1/+1
* Make arenas_lookup_ctl triable | auxten | 2023-05-10 | 1 | -4/+6
* Fix possible `NULL` pointer dereference in `VERIFY_READ` | Kevin Svetlitski | 2023-05-09 | 1 | -1/+3
Static analysis flagged this. Fixed by simply checking `oldlenp` before dereferencing it.
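The guard described above can be sketched as follows. This is a minimal illustration, not jemalloc's actual `VERIFY_READ` macro: the function name `verify_read_sketch` and its parameters are hypothetical, and only the shape of the check (test `oldlenp` before writing through it) is taken from the commit message.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a read-verification step: the caller passes an
 * output buffer and a pointer to its length, either of which may be NULL.
 */
static int
verify_read_sketch(void *oldp, size_t *oldlenp, size_t expected_len) {
	if (oldp == NULL || oldlenp == NULL || *oldlenp != expected_len) {
		/* Only report a zero length back if there is somewhere to write it. */
		if (oldlenp != NULL) {
			*oldlenp = 0;
		}
		return EINVAL;
	}
	return 0;
}
```

Calling `verify_read_sketch(NULL, NULL, sizeof(int))` returns `EINVAL` without touching a NULL pointer, which is the property the fix is after.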
* Fix segfault in `extent_try_coalesce_impl` | Kevin Svetlitski | 2023-05-09 | 1 | -1/+3
Static analysis flagged this. `extent_record` was passing `NULL` as the value for `coalesced` to `extent_try_coalesce`, which in turn passes that argument to `extent_try_coalesce_impl`, where it is written to without checking if it is `NULL`. I can confirm from reviewing the fleetwide coredump data that this was in fact being hit in production.
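One common way to make an optional out-parameter safe is to guard the write, sketched below. The commit message does not say which form the real fix took, so treat this as an assumption; `extent_sketch_t` and `try_coalesce_sketch` are hypothetical names, not jemalloc APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an extent; only the size matters here. */
typedef struct {
	size_t size;
} extent_sketch_t;

/*
 * Merge `neighbor` into `a` when a neighbor exists. `coalesced` is an
 * optional out-parameter: callers that do not care may pass NULL.
 */
static extent_sketch_t *
try_coalesce_sketch(extent_sketch_t *a, const extent_sketch_t *neighbor,
    bool *coalesced) {
	bool merged = false;
	if (neighbor != NULL) {
		a->size += neighbor->size;
		merged = true;
	}
	/* Guard the write so a NULL out-parameter is not dereferenced. */
	if (coalesced != NULL) {
		*coalesced = merged;
	}
	return a;
}
```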
* Make eligible functions `static` | Kevin Svetlitski | 2023-05-08 | 9 | -14/+16
The codebase is already very disciplined about marking as `static` any function that can be, but there are a few that appear to have slipped through the cracks.
* Make `edata_cmp_summary_comp` 30% faster | Kevin Svetlitski | 2023-05-04 | 1 | -16/+19
`edata_cmp_summary_comp` is one of the very hottest functions, taking up 3% of all time spent inside Jemalloc. I noticed that all existing callsites rely only on the sign of the value returned by this function, so I came up with this equivalent branchless implementation which preserves this property. After empirical measurement, I have found that this implementation is 30% faster, therefore representing a 1% speed-up to the allocator as a whole.
At @interwq's suggestion, I've applied the same optimization to `edata_esnead_comp` in case this function becomes hotter in the future.
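To illustrate the idea of a sign-preserving branchless comparator, here is a minimal sketch over a two-field key. The struct `summary_sketch_t` and both function names are hypothetical, and the arithmetic shown is one well-known way to get the property, not necessarily the exact code in the commit. Whether the compiler actually emits branch-free machine code depends on the target and optimization level.

```c
#include <stdint.h>

/* Hypothetical two-field key: compare by sn first, then by addr. */
typedef struct {
	uint64_t sn;
	uintptr_t addr;
} summary_sketch_t;

/* Reference version with explicit branches. */
static int
summary_comp_branchy(summary_sketch_t a, summary_sketch_t b) {
	if (a.sn != b.sn) {
		return a.sn < b.sn ? -1 : 1;
	}
	if (a.addr != b.addr) {
		return a.addr < b.addr ? -1 : 1;
	}
	return 0;
}

/*
 * Each (x > y) - (x < y) term is -1, 0, or 1. Doubling the sn term lets it
 * dominate the addr term, so the sign of the sum matches the branchy version
 * even though the magnitude may differ.
 */
static int
summary_comp_branchless(summary_sketch_t a, summary_sketch_t b) {
	int sn_cmp = (a.sn > b.sn) - (a.sn < b.sn);
	int addr_cmp = (a.addr > b.addr) - (a.addr < b.addr);
	return 2 * sn_cmp + addr_cmp;
}
```

The trade-off is that callers may rely only on the sign of the result, which is the precondition the commit message calls out.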
* Some nits in cache_bin.h | Amaury Séchet | 2023-05-01 | 1 | -4/+4
* Remove errant `assert` in `arena_extent_alloc_large` | Kevin Svetlitski | 2023-05-01 | 1 | -1/+0
This codepath may generate deferred work when the HPA is enabled. See also [@davidtgoldblatt's relevant comment on the PR which introduced this](https://github.com/jemalloc/jemalloc/pull/2107#discussion_r699770967) which prevented a similarly incorrect `assert` from being added elsewhere.
* Check for equality instead of assigning in asserts in hpa_from_pai. | Eric Mueller | 2023-04-17 | 1 | -4/+4
It appears that a simple typo means we're unconditionally overwriting some fields in hpa_from_pai when asserts are enabled. From hpa_shard_init, it looks like these fields have these values anyway, so this shouldn't cause bugs, but if something is wrong it seems better to have these asserts in place.
See issue #2412.
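The bug class described above can be shown in a few lines. This is a sketch with hypothetical names (`iface_sketch_t`, `alloc_a`, `alloc_b`), not the code from hpa_from_pai: an assignment inside `assert` silently overwrites the field and then passes, while the comparison leaves the field alone and actually checks it.

```c
#include <assert.h>

/* Hypothetical interface struct holding one function pointer. */
typedef struct {
	int (*alloc)(void);
} iface_sketch_t;

static int alloc_a(void) { return 1; }
static int alloc_b(void) { return 2; }

/*
 * Buggy pattern: `=` stores alloc_a into the field, and the assert then only
 * tests that the stored pointer is non-NULL. The extra parentheses are here
 * solely to silence the compiler warning that would otherwise flag it.
 */
static void
check_buggy(iface_sketch_t *self) {
	assert((self->alloc = &alloc_a));
}

/* Fixed pattern: `==` compares and has no side effect. */
static void
check_fixed(const iface_sketch_t *self) {
	assert(self->alloc == &alloc_a);
}
```

With asserts enabled, calling `check_buggy` on a struct whose field holds `alloc_b` leaves it holding `alloc_a`; with `NDEBUG` defined the assignment disappears entirely, so behavior differs between build modes.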
* Remove locked flag set in malloc_mutex_trylock | guangli-dai | 2023-04-06 | 1 | -1/+0
As a hint flag of the lock, the parameter locked should be set only when the lock is acquired or released.
* Disallow decay during reentrancy. | Qi Wang | 2023-04-05 | 3 | -21/+76
Decay should not be triggered during reentrant calls (may cause lock order reversal / deadlocks). Added a delay_trigger flag to the tickers to bypass decay when reentrancy_level is not zero.
* Rearrange the bools in prof_tdata_t to save some bytes. | Qi Wang | 2023-04-05 | 1 | -3/+3
This lowered the sizeof(prof_tdata_t) from 200 to 192, which is a round size class. Afterwards the tdata_t size remains unchanged with the last commit, which effectively inlined the storage of thread names for free.
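Why reordering bools can shrink a struct is easiest to see with a toy example. The two structs below are hypothetical and unrelated to the real prof_tdata_t layout; they only show how scattered one-byte fields each pull in alignment padding, while grouped ones share it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bools interleaved with 8-byte fields: each bool is followed by padding. */
typedef struct {
	bool a;
	uint64_t x;
	bool b;
	uint64_t y;
	bool c;
} scattered_sketch_t;

/* Same fields with the bools grouped at the end: they share one padded slot. */
typedef struct {
	uint64_t x;
	uint64_t y;
	bool a;
	bool b;
	bool c;
} grouped_sketch_t;

/* Bytes saved by grouping; the exact value depends on the target ABI. */
static size_t
bytes_saved_sketch(void) {
	return sizeof(scattered_sketch_t) - sizeof(grouped_sketch_t);
}
```

On a typical 64-bit ABI where `uint64_t` is 8-byte aligned, the scattered layout occupies 40 bytes and the grouped one 24; other ABIs give different numbers, but the grouped layout is never larger.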
* Inline the storage for thread name in prof_tdata_t. | Qi Wang | 2023-04-05 | 11 | -103/+120
The previous approach managed the thread name in a separate buffer, which causes races because the thread name update (triggered by new samples) can happen at the same time as prof dumping (which reads the thread names) -- these two operations are under separate locks to avoid blocking each other. Implemented the thread name storage as part of the tdata struct, which resolves the lifetime issue and also avoids internal alloc / dalloc during prof_sample.
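A rough sketch of inline name storage follows. The struct `tdata_sketch_t`, the capacity `NAME_CAP_SKETCH`, and the setter are all hypothetical; the point is only that a fixed-size array inside the struct shares the struct's lifetime and needs no allocation or free on update. Any locking that the real code requires around readers and writers is deliberately left out.

```c
#include <stdio.h>

/* Hypothetical capacity, including the terminating NUL. */
#define NAME_CAP_SKETCH 16

/* The name lives inside the struct rather than in a separately owned buffer. */
typedef struct {
	char thread_name[NAME_CAP_SKETCH];
} tdata_sketch_t;

/* Copy the name in place, truncating if needed; NULL clears the name. */
static void
tdata_set_name_sketch(tdata_sketch_t *tdata, const char *name) {
	snprintf(tdata->thread_name, sizeof(tdata->thread_name), "%s",
	    name == NULL ? "" : name);
}
```

Because the update writes into storage that already exists, there is never a moment where the name pointer is NULL or dangling, at the cost of a fixed upper bound on name length.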
* Add a multithreaded test for prof_sys_thread_name. | Qi Wang | 2023-04-05 | 1 | -1/+49
Verified that this catches the issue being fixed in 5fd5583.
* Simplify the logic in ph_remove | Amaury Séchet | 2023-03-31 | 1 | -44/+20
* Do not maintain root->prev in ph_remove. | Amaury Séchet | 2023-03-31 | 1 | -3/+0
* Simplify the logic in ph_insert | Amaury Séchet | 2023-03-31 | 1 | -29/+30
Also fixes what looks like an off-by-one error in the lazy aux list merge part of the code, which previously never touched the last node in the aux list.
* Fix the rdtscp detection bug and add prefix for the macro. | guangli-dai | 2023-03-23 | 3 | -3/+9
* Explicit arena assignment in test_tcache_max. | Qi Wang | 2023-03-22 | 1 | -0/+7
Otherwise the associated arena could change with percpu arena enabled.
* Explicit arena assignment in test_thread_idle. | Qi Wang | 2023-03-22 | 1 | -5/+10
Otherwise the associated arena could change with percpu arena enabled.
* Fix exception specification error for hosts using musl libc | Marvin Schmidt | 2023-03-16 | 1 | -1/+1
It turns out that the previous commit did not suffice since the JEMALLOC_SYS_NOTHROW definition also causes the same exception specification errors as JEMALLOC_USE_CXX_THROW did:
```
x86_64-pc-linux-musl-cc -std=gnu11 -Werror=unknown-warning-option -Wall -Wextra -Wshorten-64-to-32 -Wsign-compare -Wundef -Wno-format-zero-length -Wpointer-arith -Wno-missing-braces -Wno-missing-field-initializers -pipe -g3 -fvisibility=hidden -Wimplicit-fallthrough -O3 -funroll-loops -march=native -O2 -pipe -c -march=native -O2 -pipe -D_GNU_SOURCE -D_REENTRANT -Iinclude -Iinclude -o src/background_thread.o src/background_thread.c
In file included from src/jemalloc_cpp.cpp:9:
In file included from include/jemalloc/internal/jemalloc_preamble.h:27:
include/jemalloc/internal/../jemalloc.h:254:32: error: exception specification in declaration does not match previous declaration
void JEMALLOC_SYS_NOTHROW *je_malloc(size_t size)
                               ^
include/jemalloc/internal/../jemalloc.h:75:21: note: expanded from macro 'je_malloc'
                    ^
/usr/x86_64-pc-linux-musl/include/stdlib.h:40:7: note: previous declaration is here
void *malloc (size_t);
      ^
```
On systems using the musl C library we have to omit the exception specification on the malloc function family, as is done for macOS, FreeBSD and OpenBSD.
* configure: Handle *-linux-musl* hosts properly | Marvin Schmidt | 2023-03-16 | 1 | -0/+13
This is the same as the `*-*-linux*` case with the two exceptions that we don't set glibc=1 and don't define JEMALLOC_USE_CXX_THROW
* Add the missing descriptions in AC_DEFINE | Qi Wang | 2023-03-14 | 1 | -2/+2
* Avoid assuming the arena id in test when percpu_arena is used. | Qi Wang | 2023-03-13 | 1 | -0/+3
* Remove unused mutex from hpa_central | Amaury Séchet | 2023-03-10 | 2 | -10/+1
* switch to https | Chris Seymour | 2023-03-09 | 3 | -15/+15
* Use asm volatile during benchmarks. | guangli-dai | 2023-02-24 | 6 | -7/+51
* [MSVC] support for Visual Studio 2019 and 2022 | Fernando Pelliccioni | 2023-02-21 | 10 | -0/+1982
* Makefile.in: link with g++ when cxx enabled | barracuda156 | 2023-02-21 | 1 | -0/+4
* Add a header in HPA stats for the nonfull slabs. | Qi Wang | 2023-02-17 | 1 | -2/+3
* Add an explicit name to the dedicated oversize arena. | Qi Wang | 2023-02-17 | 1 | -0/+5
* More conservative setting for /test/unit/background_thread_enable. | Qi Wang | 2023-02-16 | 1 | -6/+3
Lower the thread and arena count to avoid resource exhaustion on 32-bit.
* Fix thread_name updating for heap profiling. | Qi Wang | 2023-02-15 | 1 | -11/+10
The current thread name reading path updates the name every time, which requires both alloc and dalloc -- and the temporary NULL value in the middle causes races where the prof dump read path sees the name NULLed mid-update.
Minimize the changes in this commit to isolate the bugfix testing; will also refactor the whole thread name paths later.
* Implement prof sample hooks "experimental.hooks.prof_sample(_free)". | Qi Wang | 2022-12-07 | 7 | -19/+307
The added hooks, hooks.prof_sample and hooks.prof_sample_free, are intended to allow advanced users to track additional information, to enable new ways of profiling on top of the jemalloc heap profile and sample features.
The sample hook is invoked after the allocation and backtracing, and forwards both the allocation and the backtrace to the user hook; the sample_free hook happens before the actual deallocation, and forwards only the ptr and usz to the hook.
* Fix dividing 0 error in stress/cpp/microbench | guangli-dai | 2022-12-06 | 2 | -18/+29
Summary: Per issue #2356, some CXX compilers may optimize away the new/delete operation in stress/cpp/microbench.cpp. Thus, this commit (1) bumps the time interval to 1 if it is 0, and (2) modifies the pointers in the microbench to volatile.
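The first part of that fix, bumping a zero interval to 1 before dividing, can be sketched as below. The function names and the percent-style ratio are hypothetical and do not mirror bench.h; only the clamp itself comes from the commit message.

```c
#include <stdint.h>

/* Treat a zero elapsed time as 1 so that it is safe to divide by. */
static uint64_t
nonzero_interval_sketch(uint64_t elapsed) {
	return elapsed == 0 ? 1 : elapsed;
}

/*
 * Ratio of two timings as an integer percentage. Both inputs are clamped,
 * so a benchmark body that was optimized away (elapsed == 0) cannot cause
 * a division by zero.
 */
static uint64_t
time_ratio_percent_sketch(uint64_t time_a, uint64_t time_b) {
	return 100 * nonzero_interval_sketch(time_a) /
	    nonzero_interval_sketch(time_b);
}
```

The clamp only prevents the crash; making the pointers `volatile`, as the commit also does, is what keeps the compiler from eliding the measured work in the first place.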
* Inline free and sdallocx into operator delete | Guangli Dai | 2022-11-21 | 6 | -228/+241
* Benchmark operator delete | guangli-dai | 2022-11-21 | 4 | -10/+102
Added the microbenchmark for operator delete. Also modified bench.h so that it can be used in C++.
* Update the ratio display in benchmark | guangli-dai | 2022-11-21 | 1 | -1/+1
In bench.h, specify the ratio as the time consumption ratio and modify the display of the ratio.
* Add a configure option --enable-force-getenv. | Qi Wang | 2022-11-04 | 3 | -6/+32
Allows the use of getenv() rather than secure_getenv() to read MALLOC_CONF. This helps in situations where hosts are under full control, and setting MALLOC_CONF is needed for binaries that are also setuid. Disabled by default.
* Enable fast thread locals for dealloc-only threads. | Qi Wang | 2022-10-25 | 3 | -1/+77
Previously, if a thread did only deallocations, it stayed on the slow path / minimal initialized state forever. However, dealloc-only is a valid pattern for dedicated reclamation threads -- this means thread cache is disabled (no batched flush) for them, which causes high overhead and contention.
Added the condition to fully initialize TSD when a fair amount of dealloc activities are observed.
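The "promote after enough deallocations" condition can be sketched as a small counter-driven state change. Everything here is hypothetical: the state names, the threshold value `DALLOC_THRESHOLD_SKETCH`, and the struct are illustrative and do not reflect jemalloc's TSD internals or the threshold the commit actually uses.

```c
/* Hypothetical thread states: minimal (slow path) and fully initialized. */
typedef enum {
	TSD_SKETCH_MINIMAL,
	TSD_SKETCH_NOMINAL
} tsd_sketch_state_t;

/* Hypothetical threshold for "a fair amount" of deallocation activity. */
#define DALLOC_THRESHOLD_SKETCH 128u

typedef struct {
	tsd_sketch_state_t state;
	unsigned ndalloc_slow; /* deallocations seen while still minimal */
} tsd_sketch_t;

/*
 * Called on each slow-path deallocation. Once the thread has freed enough
 * objects, switch it to the fully initialized state so later frees can use
 * the fast path and the thread cache.
 */
static void
on_slow_dalloc_sketch(tsd_sketch_t *tsd) {
	if (tsd->state == TSD_SKETCH_MINIMAL &&
	    ++tsd->ndalloc_slow >= DALLOC_THRESHOLD_SKETCH) {
		tsd->state = TSD_SKETCH_NOMINAL;
	}
}
```

Using a threshold rather than promoting on the first free avoids paying full initialization cost for threads that only ever free a handful of objects.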
* jemalloc_internal_types.h: Use alloca if __STDC_NO_VLA__ is defined | Paul Smith | 2022-10-14 | 1 | -1/+1
No currently available version of the Visual Studio C compiler supports variable length arrays, even if it defines __STDC_VERSION__ >= C99. As far as I know Microsoft has no plans to ever support VLAs in MSVC.
The C11 standard requires that the __STDC_NO_VLA__ macro be defined if the compiler doesn't support VLAs, so fall back to alloca() if so.
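A fallback of this kind can be sketched with a small macro. The macro name `VARIABLE_ARRAY_SKETCH` and the helper below are hypothetical, and the header choice (`<malloc.h>` on MSVC, `<alloca.h>` elsewhere) is an assumption about the target platforms rather than what jemalloc's header does.

```c
#include <stddef.h>

#if defined(__STDC_NO_VLA__) || defined(_MSC_VER)
/* No VLA support: carve the array out of the stack with alloca(). */
#  ifdef _MSC_VER
#    include <malloc.h>
#  else
#    include <alloca.h>
#  endif
#  define VARIABLE_ARRAY_SKETCH(type, name, count) \
	type *name = (type *)alloca(sizeof(type) * (count))
#else
/* VLA support available: declare a plain variable length array. */
#  define VARIABLE_ARRAY_SKETCH(type, name, count) type name[(count)]
#endif

/* Fill a runtime-sized scratch array with 0..n-1 and return the sum (n > 0). */
static size_t
sum_first_n_sketch(size_t n) {
	VARIABLE_ARRAY_SKETCH(size_t, vals, n);
	size_t total = 0;
	for (size_t i = 0; i < n; i++) {
		vals[i] = i;
		total += vals[i];
	}
	return total;
}
```

Either branch gives the caller an indexable, stack-backed `vals`, so code using the macro does not need to know which mechanism was chosen.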
* Fix safety_check segfault in double free test | divanorama | 2022-10-03 | 1 | -2/+1
* update PROFILING_INTERNALS.md | Jordan Rome | 2022-10-03 | 1 | -1/+19
Expand the bad example of summing before unbiasing.
* fix build for non linux/BSD platforms. | David Carlier | 2022-10-03 | 3 | -3/+15