path: root/include/my_cpu.h
Commit message (Author, Age, Files, Lines)
* Merge 10.4 into 10.5 (Marko Mäkelä, 2020-09-04, 1 file, -0/+4)
|\
| * MDEV-23633 fixup: Add missing semicolon (Marko Mäkelä, 2020-09-04, 1 file, -1/+1)
| |
| * MDEV-23633 MY_RELAX_CPU performs unnecessary compare-and-swap on ARM (Marko Mäkelä, 2020-09-04, 1 file, -0/+4)
| |
| |     This follows up MDEV-14374, which was filed against MariaDB Server 10.3.
| |     Back then, on a 48-core Qualcomm Centriq 2400, the performance of delay
| |     loops for spinloops was tested both with and without the dummy
| |     compare-and-swap operation, and it was decided to keep the dummy
| |     operation.
| |
| |     On target architectures where nothing special is available (other than
| |     x86 (IA-32, AMD64) or POWER), we perform a dummy compare-and-swap
| |     operation. This is contrary to the idea of the x86 PAUSE instruction and
| |     the __ppc_get_timebase(), which aim to keep the memory bus idle for a
| |     while, to allow other cores to better execute code while a spinloop is
| |     waiting for something to be changed.
| |
| |     On MariaDB Server 10.4 and another implementation of the ARMv8 ISA,
| |     omitting the dummy compare-and-swap improved performance by up to 12%.
| |     So, let us avoid the dummy compare-and-swap on ARM.
| |
| |     For now, we are retaining the dummy compare-and-swap on other ISAs
| |     (such as SPARC, MIPS, S390x, RISC-V) because we do not have any
| |     performance data for them.
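The per-architecture choice described in MDEV-23633 can be pictured roughly as follows. This is a minimal sketch, not the code in my_cpu.h: the names relax_cpu_sketch() and spin_rounds_sketch() are hypothetical, the POWER branch is left out, a GCC-compatible compiler is assumed, and using a plain compiler barrier on ARM is an assumption about what takes the place of the dummy compare-and-swap.

```c
#include <stdint.h>

/* Hypothetical stand-in for a "relax the CPU" primitive.
   Not the actual MY_RELAX_CPU() definition. */
static inline void relax_cpu_sketch(void)
{
#if defined(__i386__) || defined(__x86_64__)
  /* x86: PAUSE idles the core briefly without touching memory. */
  __asm__ __volatile__ ("pause");
#elif defined(__arm__) || defined(__aarch64__)
  /* ARM (assumption): only a compiler barrier, so that the delay loop
     is not optimized away; no compare-and-swap is issued. */
  __asm__ __volatile__ ("" ::: "memory");
#else
  /* Other ISAs: dummy compare-and-swap on a local variable. */
  int32_t var = 0, oldval = 0;
  __atomic_compare_exchange_n(&var, &oldval, 1, 0,
                              __ATOMIC_RELAXED, __ATOMIC_RELAXED);
#endif
}

/* Run `rounds` relax operations and return how many were performed. */
static unsigned spin_rounds_sketch(unsigned rounds)
{
  unsigned done = 0;
  for (unsigned i = 0; i < rounds; i++) {
    relax_cpu_sketch();
    done++;
  }
  return done;
}
```

A spinloop would call relax_cpu_sketch() between polls of the variable it is waiting on. The point of the ARM branch is that the wait generates no memory traffic of its own.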
* | Fix build on aarch64, after MDEV-21534 (Vladislav Vaintroub, 2020-03-02, 1 file, -0/+1)
|/
|     MY_RELAX_CPU on this arch needs int32, defined in my_global.h
* MDEV-19845: Make my_cpu.h self-contained (Marko Mäkelä, 2020-02-01, 1 file, -1/+8)
|
|     Fix up commit f5c080c7353cc9c30d0b269c07024cd38253c3bc
* MDEV-19845: Adaptive spin loops (Marko Mäkelä, 2019-06-27, 1 file, -13/+41)
|
|     Starting with the Intel Skylake microarchitecture, the PAUSE
|     instruction latency is about 140 clock cycles instead of the earlier 10.
|     On AMD processors, the latency could be 10 or 50 clock cycles,
|     depending on microarchitecture.
|
|     Because of this big range of latency, let us scale the loops around
|     the PAUSE instruction based on timing results at server startup.
|
|     my_cpu_relax_multiplier: New variable: How many times to invoke PAUSE
|     in a loop. Only defined for IA-32 and AMD64.
|
|     my_cpu_init(): Determine with RDTSC the time to run 16 PAUSE
|     instructions in two unrolled loops, and based on the quicker of the
|     two runs, initialize my_cpu_relax_multiplier. This form of calibration
|     was suggested by Mikhail Sinyavin from Intel.
|
|     LF_BACKOFF(), ut_delay(): Use my_cpu_relax_multiplier when available.
|
|     ut_delay(): Define inline in my_cpu.h.
|
|     UT_COMPILER_BARRIER(): Remove. This does not seem to have any effect,
|     because in our ut_delay() implementation, no computations are being
|     performed inside the loop. The purpose of UT_COMPILER_BARRIER() was to
|     prohibit the compiler from reordering computations. It was not
|     emitting any code.
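The calibration described in MDEV-19845 can be sketched as below. This is an illustration under stated assumptions, not the my_cpu_init() or ut_delay() from the server: every name ending in _sketch is hypothetical, the threshold of 30 cycles per PAUSE and the multipliers 20 and 200 are made-up illustrative constants (the commit message gives no numbers), and a GCC-compatible compiler is assumed.

```c
#include <stdint.h>
#if defined(__i386__) || defined(__x86_64__)
# include <x86intrin.h>   /* __rdtsc() */
#endif

/* Illustrative constants only; assumptions, not taken from the commit. */
enum { SLOW_PAUSE_CYCLES = 30, MULT_SLOW_PAUSE = 20, MULT_FAST_PAUSE = 200 };

/* Hypothetical counterpart of my_cpu_relax_multiplier. */
static unsigned relax_multiplier_sketch = MULT_FAST_PAUSE;

/* Pick a multiplier from the cycle counts of two runs of 16 PAUSEs each,
   using the quicker run: a slow PAUSE gets the small multiplier. */
static unsigned choose_multiplier_sketch(uint64_t run1, uint64_t run2)
{
  uint64_t quicker = run1 < run2 ? run1 : run2;
  return quicker > (uint64_t) SLOW_PAUSE_CYCLES * 16
         ? MULT_SLOW_PAUSE : MULT_FAST_PAUSE;
}

static inline void relax_once_sketch(void)
{
#if defined(__i386__) || defined(__x86_64__)
  __asm__ __volatile__ ("pause");
#else
  __asm__ __volatile__ ("" ::: "memory");   /* compiler barrier only */
#endif
}

#if defined(__i386__) || defined(__x86_64__)
# define PAUSE4  relax_once_sketch(); relax_once_sketch(); \
                 relax_once_sketch(); relax_once_sketch()
# define PAUSE16 PAUSE4; PAUSE4; PAUSE4; PAUSE4
/* Time two unrolled runs of 16 PAUSE instructions with RDTSC. */
static void cpu_init_sketch(void)
{
  uint64_t t0 = __rdtsc();
  PAUSE16;
  uint64_t t1 = __rdtsc();
  PAUSE16;
  uint64_t t2 = __rdtsc();
  relax_multiplier_sketch = choose_multiplier_sketch(t1 - t0, t2 - t1);
}
#else
/* No calibration outside IA-32/AMD64; keep the default multiplier. */
static void cpu_init_sketch(void) {}
#endif

/* Scaled delay loop; returns the number of relax operations performed. */
static unsigned delay_sketch(unsigned delay)
{
  unsigned rounds = 0;
  for (unsigned i = delay * relax_multiplier_sketch; i--; ) {
    relax_once_sketch();
    rounds++;
  }
  return rounds;
}
```

Calling cpu_init_sketch() once at startup and delay_sketch(n) in spin loops keeps the wall-clock length of a delay roughly comparable across CPUs whose PAUSE latency differs by an order of magnitude. Taking the quicker of the two runs reduces the chance that an interrupt during one measurement skews the result.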
* Merge 10.2 into 10.3 (Marko Mäkelä, 2019-05-14, 1 file, -1/+1)
|\
| * Update FSF address (Michal Schorm, 2019-05-10, 1 file, -1/+1)
| |
| |     This commit is based on the work of Michal Schorm, rebased on the
| |     earliest MariaDB version.
| |
| |     The command line used to generate this diff was:
| |
| |     find ./ -type f \
| |       -exec sed -i -e 's/Foundation, Inc., 59 Temple Place, Suite 330, Boston, /Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, /g' {} \; \
| |       -exec sed -i -e 's/Foundation, Inc. 59 Temple Place.* Suite 330, Boston, /Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, /g' {} \; \
| |       -exec sed -i -e 's/MA.*.....-1307.*USA/MA 02110-1335 USA/g' {} \; \
| |       -exec sed -i -e 's/Foundation, Inc., 59 Temple/Foundation, Inc., 51 Franklin/g' {} \; \
| |       -exec sed -i -e 's/Place, Suite 330, Boston, MA.*02111-1307.*USA/Street, Fifth Floor, Boston, MA 02110-1335 USA/g' {} \; \
| |       -exec sed -i -e 's/MA.*.....-1307/MA 02110-1335/g' {} \;
* | Merge 10.2 into 10.3 (Marko Mäkelä, 2018-11-06, 1 file, -1/+1)
|\ \
| | |
| | |     main.derived_cond_pushdown: Move all 10.3 tests to the end, trim
| | |     trailing white space, and add an "End of 10.3 tests" marker. Add
| | |     --sorted_result to tests where the ordering is not deterministic.
| | |
| | |     main.win_percentile: Add --sorted_result to tests where the ordering
| | |     is no longer deterministic.
| * | MDEV-14267: correct FSF address (Daniel Black, 2018-10-30, 1 file, -1/+1)
| |/
* | Merge bb-10.2-ext into 10.3 (Marko Mäkelä, 2017-12-12, 1 file, -0/+57)
|\ \
| * | Restore LF_BACKOFF (Sergey Vojtovich, 2017-12-08, 1 file, -0/+57)
| |/
| |     Moved InnoDB UT_RELAX_CPU() to the server. Restored the cross-platform
| |     LF_BACKOFF implementation, based on UT_RELAX_CPU().
* | Cleanup UT_LOW_PRIORITY_CPU/UT_RESUME_PRIORITY_CPU (Sergey Vojtovich, 2017-11-28, 1 file, -6/+7)
|/
|     Server already has HMT_low/HMT_medium.
* MDEV-6450 - MariaDB crash on Power8 when built with advance tool chain (Michael Widenius, 2014-08-19, 1 file, -0/+44)
      Part of this work is based on Stewart Smith's memory barrier and
      lower priority patches for power8.

      - Added memory synchronization for innodb & xtradb for power8.
      - Added HAVE_WINDOWS_MM_FENCE to CMakeLists.txt
      - Added os_isync to fix a synchronization problem on power
      - Added log_get_lsn_nowait, which is now used by
        srv_error_monitor_thread so that it does not wait if the log mutex
        is locked.

      All changes done both for InnoDB and Xtradb