Commit message | Author | Age | Files | Lines
* Make PPC64 use 64K of internal page size for tcmalloc by default [rzinsly-master] | Raphael Moreira Zinsly | 2014-12-11 | 1 | -2/+4
| This patch sets the default tcmalloc internal page size to 64K when built on PPC.
* New compiler flags to set the size and alignment of tcmalloc pages | Raphael Moreira Zinsly | 2014-12-11 | 3 | -28/+65
| Added two new compiler flags, --with-tcmalloc-pagesize and --with-tcmalloc-alignment, to set the tcmalloc internal page size and alignment without needing a compiler directive, and to make the choice of page size independent of the alignment.
* Added option to disable libunwind linking | Raphael Moreira Zinsly | 2014-11-27 | 1 | -6/+19
| This patch adds a configure option to enable or disable libunwind linking. The patch also disables libunwind on ppc by default.
* bumped version to 2.3rc [gperftools-2.2.90] | Aliaksey Kandratsenka | 2014-11-02 | 3 | -7/+7
* updated NEWS for gperftools 2.3rc | Aliaksey Kandratsenka | 2014-11-02 | 1 | -0/+62
* implemented cpu-profiling mode that profiles threads separately | Aliaksey Kandratsenka | 2014-11-02 | 6 | -24/+183
| The default mode of operation of the cpu profiler uses itimer and SIGPROF.
|
| This timer is by definition per-process, and no spec defines which thread is going to receive SIGPROF. It provides correct profiles only if we assume that the probability of picking a thread is proportional to the cpu time spent by that thread.
|
| It is easy to see that recent Linux (at least on common SMP hardware) doesn't satisfy that assumption. Quite big skews of SIGPROF ticks between threads are visible. I.e. I could see as big as a 70%/20% division instead of 50%/50% for a pair of cpu-hog threads. (And I do see it become 50/50 with the new mode.)
|
| Fortunately, POSIX provides a mechanism to track per-thread cpu time via the posix timers facility. And even more fortunately, Linux also provides a mechanism to deliver timer ticks to specific threads.
|
| Interestingly, it looks like FreeBSD also has a very similar facility and seems to suffer from the same skew. But due to the difference in how threads are identified, I haven't bothered to try to support this mode on FreeBSD.
|
| This commit implements a new profiling mode where every thread creates a posix timer which tracks that thread's cpu time. Threads also set up signal delivery to themselves on overflows of that timer.
|
| This new mode requires every thread to be registered in the cpu profiler. The existing ProfilerRegisterThread function is used for that.
|
| Because registering threads requires application support (or a suitable LD_PRELOAD-able wrapper for the thread creation API), the new mode is off by default. It has to be manually activated by setting the environment variable CPUPROFILE_PER_THREAD_TIMERS.
|
| The new mode also requires librt symbols to be available. We do not link to librt because of its dependency on libpthread, which we avoid because of the perf impact of bringing libpthread into otherwise single-threaded programs. So librt has to be either already loaded by the profiled program or LD_PRELOAD-ed.
* drop workaround for too old redhat 7 | Aliaksey Kandratsenka | 2014-11-02 | 1 | -7/+0
| Note that this is _not_ RHEL7 but the original redhat 7 from the early 2000s.
* don't add leaf function twice to profile under libunwind | Aliaksey Kandratsenka | 2014-11-02 | 1 | -2/+11
* pprof: indicate if using remote profile | Aliaksey Kandratsenka | 2014-11-02 | 1 | -0/+1
| A missing profile file is a common source of confusion. So a bit more clarity is useful.
* issue-493: correctly detect __ARM_ARCH_6ZK__ for MemoryBarrier | Aliaksey Kandratsenka | 2014-11-02 | 1 | -1/+1
| Which should fix the issue reported by user pedronavf.
* issue-655: use safe getenv for aggressive decommit mode flag | Aliaksey Kandratsenka | 2014-11-02 | 2 | -5/+46
| Because otherwise we risk deadlock due to too early use of getenv on Windows.
* issue-654: [pprof] handle split text segments | Aliaksey Kandratsenka | 2014-10-18 | 1 | -0/+10
| This applies patch by user simonb. Quoting:
|
| Relocation packing splits a single executable load segment into two.
|
| Before:
|   LOAD 0x000000 0x00000000 0x00000000 0x2034d28 0x2034d28 R E 0x1000
|   LOAD 0x2035888 0x02036888 0x02036888 0x182d38 0x1a67d0 RW 0x1000
|
| After:
|   LOAD 0x000000 0x00000000 0x00000000 0x14648 0x14648 R E 0x1000
|   LOAD 0x014648 0x0020c648 0x0020c648 0x1e286e0 0x1e286e0 R E 0x1000
|   ...
|   LOAD 0x1e3d888 0x02036888 0x02036888 0x182d38 0x1a67d0 RW 0x1000
|
| The .text section is in the second LOAD, and this is not at offset/address zero. The result is that this library shows up in /proc/self/maps as multiple executable entries, for example (note: this trace is not from the library dissected above, but rather from an earlier version of it):
|
|   73b0c000-73b21000 r-xp 00000000 b3:19 786460 /data/.../libchrome.2160.0.so
|   73b21000-73d12000 ---p 00000000 00:00 0
|   73d12000-75a90000 r-xp 00014000 b3:19 786460 /data/.../libchrome.2160.0.so
|   75a90000-75c0d000 rw-p 01d91000 b3:19 786460 /data/.../libchrome.2160.0.so
|
| When parsing this, pprof needs to merge the two r-xp entries above into a single entry, otherwise the addresses it prints are incorrect.
|
| The following fix against 2.2.1 was sufficient to make pprof --text print the correct output. Untested with other pprof options.
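The merge described above can be illustrated with a small Python sketch. This is not the pprof patch itself (pprof is written in Perl); the `Mapping` class and both function names are hypothetical, and the merge rule used here (fold every later executable entry for the same path into the first one) is an assumption about what "merge the two r-xp entries" means.

```python
from dataclasses import dataclass, replace


@dataclass
class Mapping:
    start: int
    end: int
    perms: str
    offset: int
    path: str


def parse_maps(text):
    """Parse /proc/<pid>/maps style text into Mapping records."""
    mappings = []
    for line in text.strip().splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a maps line
        start, end = (int(x, 16) for x in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(start, end, fields[1], int(fields[2], 16), path))
    return mappings


def merge_executable_entries(mappings):
    """Fold later executable entries of a file into its first one (assumed rule)."""
    first_text = {}
    merged = []
    for m in mappings:
        is_text = bool(m.path) and "x" in m.perms
        if is_text and m.path in first_text:
            first = first_text[m.path]
            first.end = max(first.end, m.end)  # extend the first entry
            continue
        m = replace(m)  # copy, so the caller's list is left untouched
        if is_text:
            first_text[m.path] = m
        merged.append(m)
    return merged


# Sample taken from the commit message above (the path is elided there too).
SAMPLE = """
73b0c000-73b21000 r-xp 00000000 b3:19 786460 /data/.../libchrome.2160.0.so
73b21000-73d12000 ---p 00000000 00:00 0
73d12000-75a90000 r-xp 00014000 b3:19 786460 /data/.../libchrome.2160.0.so
75a90000-75c0d000 rw-p 01d91000 b3:19 786460 /data/.../libchrome.2160.0.so
"""

merged = merge_executable_entries(parse_maps(SAMPLE))
for m in merged:
    print(f"{m.start:08x}-{m.end:08x} {m.perms} {m.offset:08x} {m.path}")
```

With this rule the single text entry covers the whole range from the first r-xp start to the last r-xp end, so an address in the second segment is resolved relative to the same base as one in the first.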
* Fix parsing /proc/pid/maps dump in CPU profile data file | Ricardo M. Correia | 2014-10-11 | 1 | -1/+1
| When trying to use pprof on my machine, the symbols of my program were not being recognized.
|
| It turned out that pprof, when calculating the offset of the text list of mapped objects (the last section of the CPU profile data file), was assuming that the slot size was always 4 bytes, even on 64-bit machines.
|
| This led to ParseLibraries() reading a lot of garbage data at the beginning of the map, and consequently the regex was failing to match on the first line of the real (non-garbage) map.
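A rough Python sketch of why the slot size matters when locating the trailing maps text. The synthetic file layout (a five-slot header, one sample, a three-slot trailer, then the maps text) is an assumption made for illustration, and `build_profile` and `maps_section` are hypothetical helpers, not pprof functions.

```python
import struct


def build_profile(slot_size, maps_text):
    """Build a tiny synthetic profile: binary slots followed by maps text."""
    fmt = "<I" if slot_size == 4 else "<Q"
    slots = [
        0, 3, 0, 10000, 0,  # header (assumed layout)
        1, 1, 0x400123,     # one sample: count, depth, pc
        0, 1, 0,            # trailer (assumed layout)
    ]
    data = b"".join(struct.pack(fmt, s) for s in slots)
    return data + maps_text.encode("ascii"), len(slots)


def maps_section(data, n_slots, slot_size):
    """Return everything after the binary part, given the slot size in bytes."""
    return data[n_slots * slot_size:]


MAPS = "00400000-00401000 r-xp 00000000 08:01 42 /bin/demo\n"
data, n_slots = build_profile(8, MAPS)  # profile written on a 64-bit machine

right = maps_section(data, n_slots, 8)  # correct slot size
wrong = maps_section(data, n_slots, 4)  # the 4-byte assumption

print(right == MAPS.encode("ascii"))  # offset lands exactly on the text
print(wrong[:8])                      # leftover binary slots, not text
```

With the 4-byte assumption the computed offset is only half of the real one, so the "map" starts in the middle of the binary slots, which matches the garbage described in the commit message.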
* Added remaining memory allocated info to 'Exiting' dump message | Aliaksey Kandratsenka | 2014-09-06 | 1 | -1/+21
| This applies patch by user yurivict.
* Cope with new addr2line outputs for DWARF4 | Adam McNeeney | 2014-08-23 | 1 | -0/+6
| Copes with ? for line number (converts to 0).
| Copes with (discriminator <num>) suffixes to file/linenum (removes).
|
| Change-Id: I96207165e4852c71d3512157864f12d101cdf44a
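The two normalizations listed above can be sketched in Python as follows. The function name and the exact regular expression are hypothetical; the real change lives in pprof's Perl code and may differ in detail.

```python
import re

# Matches a trailing " (discriminator N)" suffix.
_DISCRIMINATOR = re.compile(r"\s*\(discriminator \d+\)\s*$")


def normalize_location(line):
    """Turn one 'file:line' string into a (filename, line_number) pair."""
    line = _DISCRIMINATOR.sub("", line.strip())  # drop the discriminator suffix
    filename, sep, lineno = line.rpartition(":")
    if not sep:
        return line, 0  # no line number at all
    # "?" (or anything else that is not a number) becomes 0.
    return filename, int(lineno) if lineno.isdigit() else 0


print(normalize_location("foo.cc:42 (discriminator 3)"))
print(normalize_location("foo.cc:?"))
```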
* issue-641: Added --show_addresses option | Aliaksey Kandratsenka | 2014-08-23 | 1 | -2/+8
| This applies patch by user yurivict.
* issue-644: fix possible out-of-bounds access in GetenvBeforeMain | Aliaksey Kandratsenka | 2014-08-19 | 1 | -0/+3
| As suggested by user Ivan L.
* Add an option to allow disabling stripping template argument in pprof | jiakai | 2014-08-01 | 1 | -1/+9
* issue-635: allow whitespace in libraries paths | Aliaksey Kandratsenka | 2014-07-26 | 1 | -1/+1
| This applies change suggested by user mich...@sebesbefut.com
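As an illustration of the problem, here is a Python sketch of a maps-line pattern that keeps everything after the inode as the path, so embedded spaces survive. The regular expression and the function name are assumptions for this example; they are not the expression used in pprof.

```python
import re

# start-end perms offset dev inode [path]; the path is "the rest of the line".
MAPS_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+([0-9a-f]+)\s+(\S+)\s+(\d+)\s*(.*)$"
)


def parse_maps_line(line):
    """Return (start, end, perms, offset, path), or None if the line does not match."""
    m = MAPS_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    start, end, perms, offset, _dev, _inode, path = m.groups()
    return int(start, 16), int(end, 16), perms, int(offset, 16), path


# Hypothetical library path containing a space.
print(parse_maps_line("7f0000-7f1000 r-xp 00000000 08:01 42 /opt/my app/libfoo.so"))
```

A pattern that matched the path with `\S+` would stop at the first space and report `/opt/my` as the library, which is the kind of failure the commit addresses.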
* issue-636: fix prof/web command on Windows/MinGW | Aliaksey Kandratsenka | 2014-07-26 | 1 | -0/+6
| This applies patch sent by user chaishushan.
* added option to display stack traces in output for heap checker | Michael Pasieka | 2014-07-13 | 1 | -1/+27
| Quoting from email:
|
| I had the same question as William posted to stack overflow back on Dec 9, 2013: How to display symbols in stack trace of google-perftools heap profiler (*). I dug into the source and realized the functionality was not there but could be added. I am hoping that someone else will find this useful/helpful.
|
| The patch I created will not attach so I am adding below.
|
| Enjoy!
|
| -- Michael
|
| * http://stackoverflow.com/questions/20476918/how-to-display-symbols-in-stack-trace-of-google-perftools-heap-profiler
* issue-630: The env var should be "CPUPROFILE" | WenSheng He | 2014-07-06 | 1 | -1/+1
| To enable cpu profile, the env var should be "CPUPROFILE", not "PROFILE" actually.
|
| Signed-off-by: Aliaksey Kandratsenka <alk@tut.by>
* issue-631: fixed miscompilation of debugallocation without mmap | Aliaksey Kandratsenka | 2014-06-28 | 1 | -1/+1
| This applies patch sent by user iamxujian.
|
| Clearly, when I updated debugallocation to fix issue-464 I broke the no-mmap path by forgetting a closing brace.
* bumped version to 2.2.1 [gperftools-2.2.1] | Aliaksey Kandratsenka | 2014-06-21 | 3 | -7/+7
* updated NEWS for 2.2.1 | Aliaksey Kandratsenka | 2014-06-21 | 1 | -0/+12
* applied chromium patch fixing some build issue on android | Aliaksey Kandratsenka | 2014-06-21 | 1 | -0/+5
| This applies patch from: https://codereview.chromium.org/284843002/ by jungjik.lee@samsung.com
* issue-628: package missing stacktrace_powerpc-{linux,darwin}-inl.h | Aliaksey Kandratsenka | 2014-06-15 | 1 | -0/+2
| These headers were missing from the .tar.gz because they were not mentioned anywhere in Makefile.am.
* issue-626: Fix SetupAggressiveDecommit initialization [master-issue_626] | Adhemerval Zanella | 2014-06-03 | 1 | -7/+1
| This patch fixes the SetupAggressiveDecommit initialization to run after pageheap_ creation. The current code is not enforcing it, since InitStaticVars is being called outside the static_vars module.
* bumped version to 2.2 [gperftools-2.2] | Aliaksey Kandratsenka | 2014-05-03 | 3 | -7/+7
* updated NEWS for 2.2 | Aliaksey Kandratsenka | 2014-05-03 | 1 | -0/+9
* issue-620: windows dll patching: fixed delete of old stub code | Aliaksey Kandratsenka | 2014-05-03 | 1 | -7/+4
| After the code for issue 359 was applied, PreamblePatcher started using its own code to manage memory of stub code fragments. It's not using new[] anymore. And it automatically frees stub code memory on Unpatch.
|
| Clearly, the author of that code forgot to remove the delete call that is no longer needed. With that delete call we end up trying to free memory that was never allocated with any of the known allocators, and crash.
* bumped version to 2.1.90 | Aliaksey Kandratsenka | 2014-04-19 | 3 | -7/+7
* updated NEWS for 2.2rc [gperftools-2.1.90] | Aliaksey Kandratsenka | 2014-04-19 | 1 | -0/+80
* issue-610: use TCMallocGetenvSafe from inside malloc | Aliaksey Kandratsenka | 2014-04-12 | 2 | -4/+7
| Instead of plain getenv, so that a Windows getenv implementation that may call malloc does not deadlock.
* issue-610: made dynamic_annotations.c use TCMallocGetenvSafe | Aliaksey Kandratsenka | 2014-04-12 | 1 | -13/+2
* issue-610: introduced TCMallocGetenvSafe | Aliaksey Kandratsenka | 2014-04-12 | 3 | -0/+70
| This is a version of GetenvBeforeMain that's available to C code.
* don't enable backtrace() for stacktrace capturing by default | Aliaksey Kandratsenka | 2014-04-12 | 1 | -4/+12
| Because we don't yet have a treatment for deadlocks that are caused by (recursive) use of malloc from within that facility.
* PowerPC: stacktrace function refactor and fixes | Raphael Moreira Zinsly | 2014-04-12 | 3 | -1/+394
| This patch fixes stacktrace creation when the function is interrupted by a signal. For Linux, the vDSO signal trampoline symbol is compared against the LR from the stack backchain and handled differently in that case (since the signal trampoline lays out a different stack frame).
|
| Because of this extensive change, the PowerPC stacktrace code has been refactored and split into Linux-specific and Darwin-specific code.
* VDSOsupport cleanup | Raphael Moreira Zinsly | 2014-04-12 | 2 | -69/+0
| This patch removes the unused VDSO getcpu tracking from the VDSOsupport class, since the code is not used anywhere in gperftools and the symbol name is not architecture independent.
* Fixed issues with heap checker on PPC64 LE. | Raphael Moreira Zinsly | 2014-04-12 | 2 | -5/+21
| Fixed the wrapper for the syscall sys_clone and the test for heap checker on PPC64 LE. Both use the OPD structure, which is only used on BE architectures.
* Fixed the way that pprof packed profile data in BE. | Raphael Moreira Zinsly | 2014-04-12 | 1 | -5/+21
| pprof was writing profile data in a way that only works for little-endian files. This patch checks whether the system is big-endian and writes packed data correctly.
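To make the byte-order issue concrete, here is a small Python sketch that packs the same slot values for either byte order. `pack_slots` is a hypothetical helper, not part of pprof, and defaulting to the host byte order is an assumption about what "writes packed data correctly" means.

```python
import struct
import sys


def pack_slots(values, byteorder=sys.byteorder, slot_size=8):
    """Pack unsigned integers as fixed-size slots in the given byte order."""
    prefix = "<" if byteorder == "little" else ">"
    code = "Q" if slot_size == 8 else "I"  # 8-byte or 4-byte unsigned slot
    return struct.pack(f"{prefix}{len(values)}{code}", *values)


little = pack_slots([1], "little")
big = pack_slots([1], "big")
print(little.hex())
print(big.hex())
```

The two outputs contain the same bytes in reverse order, so a reader that assumes little-endian data would decode the big-endian slot as a very large number instead of 1.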
* Fixed the use of addr2line to discover the separator symbol. | Raphael Moreira Zinsly | 2014-04-12 | 1 | -1/+33
| On systems where addr2line has a version greater than 2.22, pprof fails to discover the separator symbol (_fini). This patch checks whether addr2line can find the symbol; otherwise pprof uses objdump to recover an address that newer versions of addr2line can recognize as the separator function.
* issue-614: use tc_memalign in ReallocAfterMemalloc test | Aliaksey Kandratsenka | 2014-04-07 | 2 | -6/+3
| Because some OSes lack plain memalign. And we really need to test our implementation, which is always available via tc_malloc.
* added tc_malloc_skip_new_handler | Aliaksey Kandratsenka | 2014-04-01 | 6 | -0/+20
| This is a port of the corresponding chromium change at: https://codereview.chromium.org/55333002/
|
| The basic idea is that apps which use tc_set_new_mode in order to have the C++ out-of-memory handler catch OOMs in malloc sometimes need to invoke the usual malloc that returns 0 on OOM.
|
| That new API is exactly for that. It'll always return NULL on OOM even if tc_new_mode is set to true.
* issue deprecation warning on use of google/ headers | Aliaksey Kandratsenka | 2014-04-01 | 9 | -0/+27
* speed up MallocExtension::instance() | Aliaksey Kandratsenka | 2014-03-29 | 1 | -4/+9
| It was reported that pthread_once is expensive, especially on ppc.
|
| In the new implementation, instead of doing a potentially expensive atomic read with barrier in the hot path, we do just a plain read.
|
| It's slightly less robust than the older implementation, but it should be faster.
|
| The new code assumes that programs do not spawn threads before main() is called, and therefore that all variables and modules are initialized before threads are created. That looks like a pretty safe assumption. With it, doing a plain read is safe, because current_instance is initialized as part of module init and therefore before threads are spawned.
|
| This patch is based on feedback from Adhemerval Zanella.
* Fix getpc_test for PPC64v2 LE | Adhemerval Zanella | 2014-03-29 | 1 | -2/+4
| This patch fixes the PPC64 guard to get the function address for PPC64v2. It removes the use of an indirection (to get the OPD text address), since the PPCv2 does not have function descriptors.
* issue-613: remove friend declaration from HeapLeakChecker | Aliaksey Kandratsenka | 2014-03-29 | 1 | -3/+0
| This applies patch by davide.italiano@10gen.com:
|
| heap-checker.h contains the following friend declaration of main: friend int main(int, char**).
|
| C99 allows another declaration of main, i.e. int main(int, char**, char**), and if code uses it and includes the heap-checker header, this might result in a conflict, e.g.
|
|   error: declaration of C function 'int main(int, char**, char**)' conflicts with int main(int argc, char* argv[], char** envp)
|
| Actually the comment above the friend declaration of main() mentions that this is required to get the unittest working and for other internal usage, but I'm not completely sure if this is true as long as I'm able to build and run the unittest removing the declaration.
* issue-612: added missing include for std::min | Aliaksey Kandratsenka | 2014-03-29 | 1 | -0/+1
| Otherwise Visual Studio 2013 rightfully complains.
* unbreak building with libunwind | Aliaksey Kandratsenka | 2014-03-01 | 1 | -2/+2
| Caused by premature merging of previous patch.
|
| When we're searching for backtrace in libexecinfo and don't find it, we should not reset UNWIND_LIBS to empty value.
|
| Correct fix is to first search for backtrace in libunwind and then to search for it in libexecinfo.