author | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2021-08-04 15:30:56 +0000 |
---|---|---|
committer | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2021-09-10 15:07:38 -0300 |
commit | c8315ccd30fcecc1b93a9bc3f073010190a86e05 (patch) | |
tree | 5f8cc1af7a951206e25b72eeca7875561f024a4f /sysdeps/aarch64/tst-audit28mod.c | |
parent | 171fdd4bd4f337001db053721477add60d205ed8 (diff) | |
download | glibc-c8315ccd30fcecc1b93a9bc3f073010190a86e05.tar.gz |
elf: Add SVE support for aarch64 rtld-audit
Branch: azanella/ld-audit-fixes
To implement this, lazy binding is enabled when profiling or auditing is
used, even when STO_AARCH64_VARIANT_PCS is set. Also, to avoid incurring
performance penalties on architectures without SVE, the PLT entrypoint
is set to a new one, _dl_runtime_profile_sve, which is used iff
'hwcap' has the HWCAP_SVE bit set.
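The selection described above can be sketched as a simple bit test. This is only an illustration, not the code from the patch: the helper name select_profile_entry is hypothetical, and it returns the entrypoint name as a string instead of installing a PLT resolver. The fallback definition of HWCAP_SVE uses the value from the Linux aarch64 headers so that the sketch also builds on other hosts.

```c
/* Value of HWCAP_SVE in the Linux aarch64 UAPI headers.  Defined here
   only so the sketch compiles on hosts that lack the definition.  */
#ifndef HWCAP_SVE
# define HWCAP_SVE (1UL << 22)
#endif

/* Hypothetical helper: return the name of the profiling entrypoint
   that would be chosen for a given 'hwcap' value.  The SVE variant is
   picked only when the HWCAP_SVE bit is set.  */
static const char *
select_profile_entry (unsigned long hwcap)
{
  return (hwcap & HWCAP_SVE) != 0
	 ? "_dl_runtime_profile_sve"
	 : "_dl_runtime_profile";
}
```

In a user program the 'hwcap' value could be obtained with getauxval (AT_HWCAP) from <sys/auxv.h>; how the dynamic loader obtains it internally is not shown here.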
This should be a fair assumption since SVE has a defined set of
registers for argument passing and return values. A new ABI with either
different argument passing or different registers would require a
different PLT entry, but I assume this would require another symbol flag
anyway (or at least a different ELF mark to indicate so).
The '_dl_runtime_profile_sve' entrypoint assumes the largest possible SVE
register size (2048 bits) and thus requires quite a large stack
(8976 bytes). I think it would be possible to make the stack
requirement dynamic depending on the vector length, but it would make
the PLT audit function much more complex.
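To give a feel for why the worst case is large, the per-register sizes can be computed from the vector length. The sketch below is an assumption-labeled illustration only: the helper names are hypothetical, and it does not reproduce the actual stack layout or the 8976-byte total. It relies on the architectural rule that a Z register holds the full vector length and a P register holds one bit per byte of a Z register.

```c
#include <stddef.h>

/* Largest vector length allowed by SVE, in bits.  */
enum { SVE_MAX_VL_BITS = 2048 };

/* Hypothetical helper: bytes needed to save one Z register.  */
static size_t
sve_z_bytes (size_t vl_bits)
{
  return vl_bits / 8;
}

/* Hypothetical helper: bytes needed to save one P register (one
   predicate bit per byte of a Z register).  */
static size_t
sve_p_bytes (size_t vl_bits)
{
  return vl_bits / 64;
}
```

A dynamically sized save area would use the runtime vector length in place of SVE_MAX_VL_BITS, at the cost of computing offsets at run time.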
It extends La_aarch64_vector with a long double pointer to a
stack-allocated buffer that holds the SVE Z register, and adds a pointer
to La_aarch64_regs that holds the P registers.
This means that if 'lr_sve' is 0 in either La_aarch64_regs or
La_aarch64_retval, the La_aarch64_vector contains the floating-point
registers, which can be accessed directly (non-SVE hardware). Otherwise,
'La_aarch64_vector.z' points to a memory area that holds up to 'lr_sve'
bytes for the Z registers, which can be loaded with the svld1 intrinsic,
for instance (as tst-audit28.c does). The P registers follow the same
logic, with each La_aarch64_regs.lr_sve_pregs pointing to an area of
memory 'lr_sve/8' bytes in size.
So, to access the FP register as float you can use:
  static inline float regs_vec_to_float (const La_aarch64_regs *regs,
                                         int idx)
  {
    float r;
    if (regs->lr_sve == 0)
      r = regs->lr_vreg[idx].s;
    else
      memcpy (&r, &regs->lr_vreg[idx].z[0], sizeof (r));
    return r;
  }
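The same pattern extends to other element types. The following sketch reads the register as a double instead. It is self-contained so it can be built without the patched <link.h>: mock_vector and mock_regs are hypothetical stand-ins that model only the fields named in this message, not the real La_aarch64_vector and La_aarch64_regs definitions.

```c
#include <string.h>

/* Hypothetical stand-in for the extended La_aarch64_vector.  */
typedef union
{
  float s;
  double d;
  long double q;
  long double *z;	/* Saved Z register, valid when lr_sve != 0.  */
} mock_vector;

/* Hypothetical stand-in for La_aarch64_regs; only the fields used by
   the accessor are modeled.  */
typedef struct
{
  mock_vector lr_vreg[8];
  unsigned int lr_sve;	/* 0 on non-SVE hardware, else size in bytes.  */
} mock_regs;

/* Read the low 64 bits of vector register IDX as a double.  */
static double
mock_regs_vec_to_double (const mock_regs *regs, int idx)
{
  double r;
  if (regs->lr_sve == 0)
    /* Non-SVE: the value is stored inline in the union.  */
    r = regs->lr_vreg[idx].d;
  else
    /* SVE: the value sits at the start of the pointed-to buffer.  */
    memcpy (&r, &regs->lr_vreg[idx].z[0], sizeof (r));
  return r;
}
```

Using memcpy rather than a pointer cast keeps the read free of aliasing problems, which is the same choice the float accessor makes.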
This patch is not complete yet: tst-audit28 does not check whether the
compiler supports SVE (we would need a configure check to disable it in
that case), I need to add a proper comment for the
_dl_runtime_profile_sve stack layout, and the test needs to check for P
register state clobbering.
I also haven't checked the performance penalties of this approach, and
the way I am saving/restoring the SVE registers could perhaps be optimized.
In any case, I checked on an SVE machine and at least the testcase works
as expected without any regressions. I also did a sniff test on a
non-SVE machine.
Diffstat (limited to 'sysdeps/aarch64/tst-audit28mod.c')
-rw-r--r-- | sysdeps/aarch64/tst-audit28mod.c | 48 |
1 files changed, 48 insertions, 0 deletions
diff --git a/sysdeps/aarch64/tst-audit28mod.c b/sysdeps/aarch64/tst-audit28mod.c
new file mode 100644
index 0000000000..f5e24346b4
--- /dev/null
+++ b/sysdeps/aarch64/tst-audit28mod.c
@@ -0,0 +1,48 @@
+/* Check DT_AUDIT for aarch64 ABI specifics.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <array_length.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <support/check.h>
+#include "tst-audit28mod.h"
+
+svint8_t
+tst_audit28_func_sve_args (svint8_t z0, svint16_t z1, svint32_t z2,
+			   svint64_t z3, svuint8_t z4, svuint16_t z5,
+			   svuint32_t z6, svuint64_t z7)
+{
+  assert (svptest_any (svptrue_b8 (), svcmpeq_s8 (svptrue_b8 (),
+						  z0, sve_args_z0 ())));
+  assert (svptest_any (svptrue_b16 (), svcmpeq_s16 (svptrue_b16 (),
+						    z1, sve_args_z1 ())));
+  assert (svptest_any (svptrue_b32 (), svcmpeq_s32 (svptrue_b32 (),
+						    z2, sve_args_z2 ())));
+  assert (svptest_any (svptrue_b64 (), svcmpeq_s64 (svptrue_b64 (),
+						    z3, sve_args_z3 ())));
+  assert (svptest_any (svptrue_b16 (), svcmpeq_u8 (svptrue_b8 (),
+						   z4, sve_args_z4 ())));
+  assert (svptest_any (svptrue_b16 (), svcmpeq_u16 (svptrue_b16 (),
+						    z5, sve_args_z5 ())));
+  assert (svptest_any (svptrue_b16 (), svcmpeq_u32 (svptrue_b32 (),
+						    z6, sve_args_z6 ())));
+  assert (svptest_any (svptrue_b16 (), svcmpeq_u64 (svptrue_b64 (),
+						    z7, sve_args_z7 ())));
+
+  return sve_ret ();
+}