summaryrefslogtreecommitdiff
path: root/tools/objtool
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'objtool-core-2023-04-27' of ↵Linus Torvalds2023-04-288-271/+290
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Mark arch_cpu_idle_dead() __noreturn, make all architectures & drivers that did this inconsistently follow this new, common convention, and fix all the fallout that objtool can now detect statically - Fix/improve the ORC unwinder becoming unreliable due to UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK and UNWIND_HINT_UNDEFINED to resolve it - Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code - Generate ORC data for __pfx code - Add more __noreturn annotations to various kernel startup/shutdown and panic functions - Misc improvements & fixes * tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) x86/hyperv: Mark hv_ghcb_terminate() as noreturn scsi: message: fusion: Mark mpt_halt_firmware() __noreturn x86/cpu: Mark {hlt,resume}_play_dead() __noreturn btrfs: Mark btrfs_assertfail() __noreturn objtool: Include weak functions in global_noreturns check cpu: Mark nmi_panic_self_stop() __noreturn cpu: Mark panic_smp_self_stop() __noreturn arm64/cpu: Mark cpu_park_loop() and friends __noreturn x86/head: Mark *_start_kernel() __noreturn init: Mark start_kernel() __noreturn init: Mark [arch_call_]rest_init() __noreturn objtool: Generate ORC data for __pfx code x86/linkage: Fix padding for typed functions objtool: Separate prefix code from stack validation code objtool: Remove superfluous dead_end_function() check objtool: Add symbol iteration helpers objtool: Add WARN_INSN() scripts/objdump-func: Support multiple functions context_tracking: Fix KCSAN noinstr violation objtool: Add stackleak instrumentation to uaccess safe list ...
| * x86/hyperv: Mark hv_ghcb_terminate() as noreturnGuilherme G. Piccoli2023-04-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Annotate the function prototype and definition as noreturn to prevent objtool warnings like: vmlinux.o: warning: objtool: hyperv_init+0x55c: unreachable instruction Also, as per Josh's suggestion, add it to the global_noreturns list. As a comparison, an objdump output without the annotation: [...] 1b63: mov $0x1,%esi 1b68: xor %edi,%edi 1b6a: callq ffffffff8102f680 <hv_ghcb_terminate> 1b6f: jmpq ffffffff82f217ec <hyperv_init+0x9c> # unreachable 1b74: cmpq $0xffffffffffffffff,-0x702a24(%rip) [...] Now, after adding the __noreturn to the function prototype: [...] 17df: callq ffffffff8102f6d0 <hv_ghcb_negotiate_protocol> 17e4: test %al,%al 17e6: je ffffffff82f21bb9 <hyperv_init+0x469> [...] <many insns> 1bb9: mov $0x1,%esi 1bbe: xor %edi,%edi 1bc0: callq ffffffff8102f680 <hv_ghcb_terminate> 1bc5: nopw %cs:0x0(%rax,%rax,1) # end of function Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/32453a703dfcf0d007b473c9acbf70718222b74b.1681342859.git.jpoimboe@kernel.org
| * scsi: message: fusion: Mark mpt_halt_firmware() __noreturnJosh Poimboeuf2023-04-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mpt_halt_firmware() doesn't return. Mark it as such. Fixes the following warnings: vmlinux.o: warning: objtool: mptscsih_abort+0x7f4: unreachable instruction vmlinux.o: warning: objtool: mptctl_timeout_expired+0x310: unreachable instruction Reported-by: kernel test robot <lkp@intel.com> Reported-by: Mark Rutland <mark.rutland@arm.com> Debugged-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/d8129817423422355bf30e90dadc6764261b53e0.1681342859.git.jpoimboe@kernel.org
| * x86/cpu: Mark {hlt,resume}_play_dead() __noreturnJosh Poimboeuf2023-04-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | Fixes the following warning: vmlinux.o: warning: objtool: resume_play_dead+0x21: unreachable instruction Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/ce1407c4bf88b1334fe40413126343792a77ca50.1681342859.git.jpoimboe@kernel.org
| * btrfs: Mark btrfs_assertfail() __noreturnJosh Poimboeuf2023-04-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes a bunch of warnings including: vmlinux.o: warning: objtool: select_reloc_root+0x314: unreachable instruction vmlinux.o: warning: objtool: finish_inode_if_needed+0x15b1: unreachable instruction vmlinux.o: warning: objtool: get_bio_sector_nr+0x259: unreachable instruction vmlinux.o: warning: objtool: raid_wait_read_end_io+0xc26: unreachable instruction vmlinux.o: warning: objtool: raid56_parity_alloc_scrub_rbio+0x37b: unreachable instruction ... Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/960bd9c0c9e3cfc409ba9c35a17644b11b832956.1681342859.git.jpoimboe@kernel.org
| * objtool: Include weak functions in global_noreturns checkJosh Poimboeuf2023-04-141-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a global function doesn't return, and its prototype has the __noreturn attribute, its weak counterpart must also not return so that it matches the prototype and meets call site expectations. To properly follow the compiled control flow at the call sites, change the global_noreturns check to include both global and weak functions. On the other hand, if a weak function isn't in global_noreturns, assume the prototype doesn't have __noreturn. Even if the weak function doesn't return, call sites treat it like a returnable function. Fixes the following warning: kernel/sched/build_policy.o: warning: objtool: do_idle() falls through to next function play_idle_precise() Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lore.kernel.org/r/ede3460d63f4a65d282c86f1175bd2662c2286ba.1681342859.git.jpoimboe@kernel.org
| * cpu: Mark nmi_panic_self_stop() __noreturnJosh Poimboeuf2023-04-141-0/+1
| | | | | | | | | | | | | | | | | | In preparation for improving objtool's handling of weak noreturn functions, mark nmi_panic_self_stop() __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/316fc6dfab5a8c4e024c7185484a1ee5fb0afb79.1681342859.git.jpoimboe@kernel.org
| * cpu: Mark panic_smp_self_stop() __noreturnJosh Poimboeuf2023-04-141-0/+1
| | | | | | | | | | | | | | | | | | In preparation for improving objtool's handling of weak noreturn functions, mark panic_smp_self_stop() __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/92d76ab5c8bf660f04fdcd3da1084519212de248.1681342859.git.jpoimboe@kernel.org
| * x86/head: Mark *_start_kernel() __noreturnJosh Poimboeuf2023-04-141-0/+2
| | | | | | | | | | | | | | | | | | Now that start_kernel() is __noreturn, mark its chain of callers __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/c2525f96b88be98ee027ee0291d58003036d4120.1681342859.git.jpoimboe@kernel.org
| * init: Mark start_kernel() __noreturnJosh Poimboeuf2023-04-141-0/+1
| | | | | | | | | | | | | | | | | | Now that arch_call_rest_init() is __noreturn, mark its caller start_kernel() __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/7069acf026a195f26a88061227fba5a3b0337b9a.1681342859.git.jpoimboe@kernel.org
| * init: Mark [arch_call_]rest_init() __noreturnJosh Poimboeuf2023-04-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | In preparation for improving objtool's handling of weak noreturn functions, mark start_kernel(), arch_call_rest_init(), and rest_init() __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/7194ed8a989a85b98d92e62df660f4a90435a723.1681342859.git.jpoimboe@kernel.org
| * objtool: Generate ORC data for __pfx codeJosh Poimboeuf2023-04-141-0/+14
| | | | | | | | | | | | | | | | | | | | | | Allow unwinding from prefix code by copying the CFI from the starting instruction of the corresponding function. Even when the NOPs are replaced, they're still stack-invariant instructions so the same ORC entry can be reused everywhere. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/bc3344e51f3e87102f1301a0be0f72a7689ea4a4.1681331135.git.jpoimboe@kernel.org
| * objtool: Separate prefix code from stack validation codeJosh Poimboeuf2023-04-141-38/+50
| | | | | | | | | | | | | | | | | | Simplify the prefix code by moving it after validate_reachable_instructions(). Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/d7f31ac2de462d0cd7b1db01b7ecb525c057c8f6.1681331135.git.jpoimboe@kernel.org
| * objtool: Remove superfluous dead_end_function() checkJosh Poimboeuf2023-04-141-2/+1
| | | | | | | | | | | | | | | | | | annotate_call_site() already sets 'insn->dead_end' for calls to dead end functions. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/5d603a301e9a8b1036b61503385907e154867ace.1681325924.git.jpoimboe@kernel.org
| * objtool: Add symbol iteration helpersJosh Poimboeuf2023-04-143-58/+51
| | | | | | | | | | | | | | | | Add [sec_]for_each_sym() and use them. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/59023e5886ab125aa30702e633be7732b1acaa7e.1681325924.git.jpoimboe@kernel.org
| * objtool: Add WARN_INSN()Josh Poimboeuf2023-04-143-115/+70
| | | | | | | | | | | | | | | | | | | | | | It's easier to use and also gives easy access to the instruction's containing function, which is useful for printing that function's symbol. It will also be useful in the future for rate-limiting and disassembly of warned functions. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/2eaa3155c90fba683d8723599f279c46025b75f3.1681325924.git.jpoimboe@kernel.org
| * objtool: Add stackleak instrumentation to uaccess safe listJosh Poimboeuf2023-04-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a function has a large stack frame, the stackleak plugin adds a call to stackleak_track_stack() after the prologue. This function may be called in uaccess-enabled code. Add it to the uaccess safe list. Fixes the following warning: vmlinux.o: warning: objtool: kasan_report+0x12: call to stackleak_track_stack() with UACCESS enabled Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/42e9b487ef89e9b237fd5220ad1c7cf1a2ad7eb8.1681320562.git.jpoimboe@kernel.org
| * Revert "objtool: Support addition to set CFA base"Josh Poimboeuf2023-04-141-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 468af56a7bba ("objtool: Support addition to set CFA base") was added as a preparatory patch for arm64 support, but that support never came. It triggers a false positive warning on x86, so just revert it for now. Fixes the following warning: vmlinux.o: warning: objtool: cdce925_regmap_i2c_write+0xdb: stack state mismatch: cfa1=4+120 cfa2=5+40 Fixes: 468af56a7bba ("objtool: Support addition to set CFA base") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/oe-kbuild-all/202304080538.j5G6h1AB-lkp@intel.com/
| * x86,objtool: Split UNWIND_HINT_EMPTY in twoJosh Poimboeuf2023-03-233-16/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark reported that the ORC unwinder incorrectly marks an unwind as reliable when the unwind terminates prematurely in the dark corners of return_to_handler() due to lack of information about the next frame. The problem is UNWIND_HINT_EMPTY is used in two different situations: 1) The end of the kernel stack unwind before hitting user entry, boot code, or fork entry 2) A blind spot in ORC coverage where the unwinder has to bail due to lack of information about the next frame The ORC unwinder has no way to tell the difference between the two. When it encounters an undefined stack state with 'end=1', it blindly marks the stack reliable, which can break the livepatch consistency model. Fix it by splitting UNWIND_HINT_EMPTY into UNWIND_HINT_UNDEFINED and UNWIND_HINT_END_OF_STACK. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/fd6212c8b450d3564b855e1cb48404d6277b4d9f.1677683419.git.jpoimboe@kernel.org
| * x86,objtool: Separate unret validation from unwind hintsJosh Poimboeuf2023-03-232-22/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | The ENTRY unwind hint type is serving double duty as both an empty unwind hint and an unret validation annotation. Unret validation is unrelated to unwinding. Separate it out into its own annotation. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/ff7448d492ea21b86d8a90264b105fbd0d751077.1677683419.git.jpoimboe@kernel.org
| * x86,objtool: Introduce ORC_TYPE_*Josh Poimboeuf2023-03-232-6/+20
| | | | | | | | | | | | | | | | | | | | Unwind hints and ORC entry types are two distinct things. Separate them out more explicitly. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/cc879d38fff8a43f8f7beb2fd56e35a5a384d7cd.1677683419.git.jpoimboe@kernel.org
| * objtool: Add objtool_types.hJosh Poimboeuf2023-03-234-4/+4
| | | | | | | | | | | | | | | | | | | | Reduce the amount of header sync churn by splitting the shared objtool.h types into a new file. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/dec622720851210ceafa12d4f4c5f9e73c832152.1677683419.git.jpoimboe@kernel.org
| * sched/idle: Mark arch_cpu_idle_dead() __noreturnJosh Poimboeuf2023-03-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before commit 076cbf5d2163 ("x86/xen: don't let xen_pv_play_dead() return"), in Xen, when a previously offlined CPU was brought back online, it unexpectedly resumed execution where it left off in the middle of the idle loop. There were some hacks to make that work, but the behavior was surprising as do_idle() doesn't expect an offlined CPU to return from the dead (in arch_cpu_idle_dead()). Now that Xen has been fixed, and the arch-specific implementations of arch_cpu_idle_dead() also don't return, give it a __noreturn attribute. This will cause the compiler to complain if an arch-specific implementation might return. It also improves code generation for both caller and callee. Also fixes the following warning: vmlinux.o: warning: objtool: do_idle+0x25f: unreachable instruction Reported-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/60d527353da8c99d4cf13b6473131d46719ed16d.1676358308.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
* | Merge tag 'for-6.4-tag' of ↵Linus Torvalds2023-04-261-0/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "Mostly core changes and cleanups, some notable fixes and two performance improvements in directory logging. The IO path cleanups are removing or refactoring old code, scrub main loop has been completely rewritten also refactoring old code. There are some changes to non-btrfs code, mostly trivial, the cgroup punt bio logic is only moved from generic code. Performance improvements: - improve logging changes in a directory during one transaction, avoid iterating over items and reduce lock contention (fsync time 4x lower) - when logging directory entries during one transaction, reduce locking of subvolume trees by checking tree-log instead (improvement in throughput and latency for concurrent access to a subvolume) Notable fixes: - dev-replace: - properly honor read mode when requested to avoid reading from source device - target device won't be used for eventual read repair, this is unreliable for NODATASUM files - when there are unpaired (and unrepairable) metadata during replace, exit early with error and don't try to finish whole operation - scrub ioctl properly rejects unknown flags - fix global block reserve calculations - fix partial direct io write when there's a page fault in the middle, iomap will try to continue with partial request but the btrfs part did not match that, this can lead to zeros written instead of data Core changes: - io path: - continued cleanups and refactoring around bio handling - extent io submit path simplifications and cleanups - flush write path simplifications and cleanups - rework logic of passing sync mode of bio, with further cleanups - rewrite scrub code flow, restructure how the stripes are enumerated and verified in a more unified way - allow to set lower threshold for block group reclaim in debug mode to aid zoned mode testing - remove obsolete time-based delayed ref throttling logic when truncating items - DREW locks are not using percpu variables anymore - more warning fixes (-Wmaybe-uninitialized) - u64 division simplifications - error handling improvements Non-btrfs code changes: - push cgroup punt bio logic to btrfs code (there was no other user of that), the functionality can be now selected separately by BLK_CGROUP_PUNT_BIO - crc32c_impl removed after removing last uses in btrfs code - add btrfs_assertfail() to objtool table" * tag 'for-6.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits) btrfs: mark btrfs_assertfail() __noreturn btrfs: fix uninitialized variable warnings btrfs: use log root when iterating over index keys when logging directory btrfs: avoid iterating over all indexes when logging directory btrfs: dev-replace: error out if we have unrepaired metadata error during btrfs: remove pointless loop at btrfs_get_next_valid_item() btrfs: scrub: reject unsupported scrub flags btrfs: reinterpret async discard iops_limit=0 as no delay btrfs: set default discard iops_limit to 1000 btrfs: remove unused raid56 functions which were dedicated for scrub btrfs: scrub: remove scrub_bio structure btrfs: scrub: remove scrub_block and scrub_sector structures btrfs: scrub: remove the old scrub recheck code btrfs: scrub: remove the old writeback infrastructure btrfs: scrub: remove scrub_parity structure btrfs: scrub: use scrub_stripe to implement RAID56 P/Q scrub btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure btrfs: scrub: introduce helper to queue a stripe for scrub btrfs: scrub: introduce error reporting functionality for scrub_stripe btrfs: scrub: introduce a writeback helper for scrub_stripe ...
| * | btrfs: mark btrfs_assertfail() __noreturnJosh Poimboeuf2023-04-171-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes a bunch of warnings including: vmlinux.o: warning: objtool: select_reloc_root+0x314: unreachable instruction vmlinux.o: warning: objtool: finish_inode_if_needed+0x15b1: unreachable instruction vmlinux.o: warning: objtool: get_bio_sector_nr+0x259: unreachable instruction vmlinux.o: warning: objtool: raid_wait_read_end_io+0xc26: unreachable instruction vmlinux.o: warning: objtool: raid56_parity_alloc_scrub_rbio+0x37b: unreachable instruction ... Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202302210709.IlXfgMpX-lkp@intel.com/ Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* | Merge tag 'docs-6.4' of git://git.lwn.net/linuxLinus Torvalds2023-04-241-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull documentation updates from Jonathan Corbet: "Commit volume in documentation is relatively low this time, but there is still a fair amount going on, including: - Reorganize the architecture-specific documentation under Documentation/arch This makes the structure match the source directory and helps to clean up the mess that is the top-level Documentation directory a bit. This work creates the new directory and moves x86 and most of the less-active architectures there. The current plan is to move the rest of the architectures in 6.5, with the patches going through the appropriate subsystem trees. - Some more Spanish translations and maintenance of the Italian translation - A new "Kernel contribution maturity model" document from Ted - A new tutorial on quickly building a trimmed kernel from Thorsten Plus the usual set of updates and fixes" * tag 'docs-6.4' of git://git.lwn.net/linux: (47 commits) media: Adjust column width for pdfdocs media: Fix building pdfdocs docs: clk: add documentation to log which clocks have been disabled docs: trace: Fix typo in ftrace.rst Documentation/process: always CC responsible lists docs: kmemleak: adjust to config renaming ELF: document some de-facto PT_* ABI quirks Documentation: arm: remove stih415/stih416 related entries docs: turn off "smart quotes" in the HTML build Documentation: firmware: Clarify firmware path usage docs/mm: Physical Memory: Fix grammar Documentation: Add document for false sharing dma-api-howto: typo fix docs: move m68k architecture documentation under Documentation/arch/ docs: move parisc documentation under Documentation/arch/ docs: move ia64 architecture docs under Documentation/arch/ docs: Move arc architecture docs under Documentation/arch/ docs: move nios2 documentation under Documentation/arch/ docs: move openrisc documentation under Documentation/arch/ docs: move superh documentation under Documentation/arch/ ...
| * | docs: move x86 documentation into Documentation/arch/Jonathan Corbet2023-03-301-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the x86 documentation under Documentation/arch/ as a way of cleaning up the top-level directory and making the structure of our docs more closely match the structure of the source directories it describes. All in-kernel references to the old paths have been updated. Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/20230315211523.108836-1-corbet@lwn.net/ Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* | x86: improve on the non-rep 'copy_user' functionLinus Torvalds2023-04-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The old 'copy_user_generic_unrolled' function was oddly implemented for largely historical reasons: it had been largely based on the uncached copy case, which has some other concerns. For example, the __copy_user_nocache() function uses 'movnti' for the destination stores, and those want the destination to be aligned. In contrast, the regular copy function doesn't really care, and trying to align things only complicates matters. Also, like the clear_user function, the copy function had some odd handling of the repeat counts, complicating the exception handling for no really good reason. So as with clear_user, just write it to keep all the byte counts in the %rcx register, exactly like the 'rep movs' functionality that this replaces. Unlike a real 'rep movs', we do allow for this to trash a few temporary registers to not have to unnecessarily save/restore registers on the stack. And like the clearing case, rename this to what it now clearly is: 'rep_movs_alternative', and make it one coherent function, so that it shows up as such in profiles (instead of the odd split between "copy_user_generic_unrolled" and "copy_user_short_string", the latter of which was not about strings at all, and which was shared with the uncached case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | x86: improve on the non-rep 'clear_user' functionLinus Torvalds2023-04-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The old version was oddly written to have the repeat count in multiple registers. So instead of taking advantage of %rax being zero, it had some sub-counts in it. All just for a "single word clearing" loop, which isn't even efficient to begin with. So get rid of those games, and just keep all the state in the same registers we got it in (and that we should return things in). That not only makes this act much more like 'rep stos' (which this function is replacing), but makes it much easier to actually do the obvious loop unrolling. Also rename the function from the now nonsensical 'clear_user_original' to what it now clearly is: 'rep_stos_alternative'. End result: if we don't have a fast 'rep stosb', at least we can have a fast fallback for it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | x86: inline the 'rep movs' in user copies for the FSRM caseLinus Torvalds2023-04-181-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This does the same thing for the user copies as commit 0db7058e8e23 ("x86/clear_user: Make it faster") did for clear_user(). In other words, it inlines the "rep movs" case when X86_FEATURE_FSRM is set, avoiding the function call entirely. In order to do that, it makes the calling convention for the out-of-line case ("copy_user_generic_unrolled") match the 'rep movs' calling convention, although it does also end up clobbering a number of additional registers. Also, to simplify code sharing in the low-level assembly with the __copy_user_nocache() function (that uses the normal C calling convention), we end up with a kind of mixed return value for the low-level asm code: it will return the result in both %rcx (to work as an alternative for the 'rep movs' case), _and_ in %rax (for the nocache case). We could avoid this by wrapping __copy_user_nocache() callers in an inline asm, but since the cost is just an extra register copy, it's probably not worth it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | x86: move stac/clac from user copy routines into callersLinus Torvalds2023-04-181-0/+3
| | | | | | | | | | | | | | | | | | This is preparatory work for inlining the 'rep movs' case, but also a cleanup. The __copy_user_nocache() function was mis-used by the rdma code to do uncached kernel copies that don't actually want user copies at all, and as a result doesn't want the stac/clac either. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | x86: don't use REP_GOOD or ERMS for user memory clearingLinus Torvalds2023-04-181-2/+0
|/ | | | | | | | | | | | | | | The modern target to use is FSRS (Fast Short REP STOS), and the other cases should only be used for bigger areas (ie mainly things like page clearing). Note! This changes the conditional for the inlining from FSRM ("fast short rep movs") to FSRS ("fast short rep stos"). We'll have a separate fixup for AMD microarchitectures that have a good 'rep stosb' yet do not set the new Intel-specific FSRS bit (because FSRM was there first). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'objtool-core-2023-03-02' of ↵Linus Torvalds2023-03-0220-300/+419
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Shrink 'struct instruction', to improve objtool performance & memory footprint - Other maximum memory usage reductions - this makes the build both faster, and fixes kernel build OOM failures on allyesconfig and similar configs when they try to build the final (large) vmlinux.o - Fix ORC unwinding when a kprobe (INT3) is set on a stack-modifying single-byte instruction (PUSH/POP or LEAVE). This requires the extension of the ORC metadata structure with a 'signal' field - Misc fixes & cleanups * tag 'objtool-core-2023-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) objtool: Fix ORC 'signal' propagation objtool: Remove instruction::list x86: Fix FILL_RETURN_BUFFER objtool: Fix overlapping alternatives objtool: Union instruction::{call_dest,jump_table} objtool: Remove instruction::reloc objtool: Shrink instruction::{type,visited} objtool: Make instruction::alts a single-linked list objtool: Make instruction::stack_ops a single-linked list objtool: Change arch_decode_instruction() signature x86/entry: Fix unwinding from kprobe on PUSH/POP instruction x86/unwind/orc: Add 'signal' field to ORC metadata objtool: Optimize layout of struct special_alt objtool: Optimize layout of struct symbol objtool: Allocate multiple structures with calloc() objtool: Make struct check_options static objtool: Make struct entries[] static and const objtool: Fix HOSTCC flag usage objtool: Properly support make V=1 objtool: Install libsubcmd in build ...
| * objtool: Fix ORC 'signal' propagationJosh Poimboeuf2023-02-233-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There have been some recently reported ORC unwinder warnings like: WARNING: can't access registers at entry_SYSCALL_64_after_hwframe+0x63/0xcd WARNING: stack going in the wrong direction? at __sys_setsockopt+0x2c6/0x5b0 net/socket.c:2271 And a KASAN warning: BUG: KASAN: stack-out-of-bounds in unwind_next_frame (arch/x86/include/asm/ptrace.h:136 arch/x86/kernel/unwind_orc.c:455) It turns out the 'signal' bit isn't getting propagated from the unwind hints to the ORC entries, making the unwinder confused at times. Fixes: ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC metadata") Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/97eef9db60cd86d376a9a40d49d77bb67a8f6526.1676579666.git.jpoimboe@kernel.org
| * objtool: Remove instruction::listPeter Zijlstra2023-02-234-86/+133
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the instruction::list by allocating instructions in arrays of 256 entries and stringing them together by (amortized) find_insn(). This shrinks instruction by 16 bytes and brings it down to 128. struct instruction { - struct list_head list; /* 0 16 */ - struct hlist_node hash; /* 16 16 */ - struct list_head call_node; /* 32 16 */ - struct section * sec; /* 48 8 */ - long unsigned int offset; /* 56 8 */ - /* --- cacheline 1 boundary (64 bytes) --- */ - long unsigned int immediate; /* 64 8 */ - unsigned int len; /* 72 4 */ - u8 type; /* 76 1 */ - - /* Bitfield combined with previous fields */ + struct hlist_node hash; /* 0 16 */ + struct list_head call_node; /* 16 16 */ + struct section * sec; /* 32 8 */ + long unsigned int offset; /* 40 8 */ + long unsigned int immediate; /* 48 8 */ + u8 len; /* 56 1 */ + u8 prev_len; /* 57 1 */ + u8 type; /* 58 1 */ + s8 instr; /* 59 1 */ + u32 idx:8; /* 60: 0 4 */ + u32 dead_end:1; /* 60: 8 4 */ + u32 ignore:1; /* 60: 9 4 */ + u32 ignore_alts:1; /* 60:10 4 */ + u32 hint:1; /* 60:11 4 */ + u32 save:1; /* 60:12 4 */ + u32 restore:1; /* 60:13 4 */ + u32 retpoline_safe:1; /* 60:14 4 */ + u32 noendbr:1; /* 60:15 4 */ + u32 entry:1; /* 60:16 4 */ + u32 visited:4; /* 60:17 4 */ + u32 no_reloc:1; /* 60:21 4 */ - u16 dead_end:1; /* 76: 8 2 */ - u16 ignore:1; /* 76: 9 2 */ - u16 ignore_alts:1; /* 76:10 2 */ - u16 hint:1; /* 76:11 2 */ - u16 save:1; /* 76:12 2 */ - u16 restore:1; /* 76:13 2 */ - u16 retpoline_safe:1; /* 76:14 2 */ - u16 noendbr:1; /* 76:15 2 */ - u16 entry:1; /* 78: 0 2 */ - u16 visited:4; /* 78: 1 2 */ - u16 no_reloc:1; /* 78: 5 2 */ + /* XXX 10 bits hole, try to pack */ - /* XXX 2 bits hole, try to pack */ - /* Bitfield combined with next fields */ - - s8 instr; /* 79 1 */ - struct alt_group * alt_group; /* 80 8 */ - struct instruction * jump_dest; /* 88 8 */ - struct instruction * first_jump_src; /* 96 8 */ + /* --- cacheline 1 boundary (64 bytes) --- */ + struct alt_group * alt_group; /* 64 8 */ + struct instruction * jump_dest; /* 72 8 */ + struct instruction * first_jump_src; /* 80 8 */ union { - struct symbol * _call_dest; /* 104 8 */ - struct reloc * _jump_table; /* 104 8 */ - }; /* 104 8 */ - struct alternative * alts; /* 112 8 */ - struct symbol * sym; /* 120 8 */ - /* --- cacheline 2 boundary (128 bytes) --- */ - struct stack_op * stack_ops; /* 128 8 */ - struct cfi_state * cfi; /* 136 8 */ + struct symbol * _call_dest; /* 88 8 */ + struct reloc * _jump_table; /* 88 8 */ + }; /* 88 8 */ + struct alternative * alts; /* 96 8 */ + struct symbol * sym; /* 104 8 */ + struct stack_op * stack_ops; /* 112 8 */ + struct cfi_state * cfi; /* 120 8 */ - /* size: 144, cachelines: 3, members: 28 */ - /* sum members: 142 */ - /* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */ - /* last cacheline: 16 bytes */ + /* size: 128, cachelines: 2, members: 29 */ + /* sum members: 124 */ + /* sum bitfield members: 22 bits, bit holes: 1, sum bit holes: 10 bits */ }; pre: 5:38.18 real, 213.25 user, 124.90 sys, 23449040 mem post: 5:03.34 real, 210.75 user, 88.80 sys, 20241232 mem Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build only Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run Link: https://lore.kernel.org/r/20230208172245.851307606@infradead.org
| * objtool: Fix overlapping alternativesPeter Zijlstra2023-02-231-26/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Things like ALTERNATIVE_{2,3}() generate multiple alternatives on the same place, objtool would override the first orig_alt_group with the second (or third), failing to check the CFI among all the different variants. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build only Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run Link: https://lore.kernel.org/r/20230208172245.711471461@infradead.org
| * objtool: Union instruction::{call_dest,jump_table}Peter Zijlstra2023-02-232-29/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The instruction call_dest and jump_table members can never be used at the same time, their usage depends on type. struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ long unsigned int immediate; /* 64 8 */ unsigned int len; /* 72 4 */ u8 type; /* 76 1 */ /* Bitfield combined with previous fields */ u16 dead_end:1; /* 76: 8 2 */ u16 ignore:1; /* 76: 9 2 */ u16 ignore_alts:1; /* 76:10 2 */ u16 hint:1; /* 76:11 2 */ u16 save:1; /* 76:12 2 */ u16 restore:1; /* 76:13 2 */ u16 retpoline_safe:1; /* 76:14 2 */ u16 noendbr:1; /* 76:15 2 */ u16 entry:1; /* 78: 0 2 */ u16 visited:4; /* 78: 1 2 */ u16 no_reloc:1; /* 78: 5 2 */ /* XXX 2 bits hole, try to pack */ /* Bitfield combined with next fields */ s8 instr; /* 79 1 */ struct alt_group * alt_group; /* 80 8 */ - struct symbol * call_dest; /* 88 8 */ - struct instruction * jump_dest; /* 96 8 */ - struct instruction * first_jump_src; /* 104 8 */ - struct reloc * jump_table; /* 112 8 */ - struct alternative * alts; /* 120 8 */ + struct instruction * jump_dest; /* 88 8 */ + struct instruction * first_jump_src; /* 96 8 */ + union { + struct symbol * _call_dest; /* 104 8 */ + struct reloc * _jump_table; /* 104 8 */ + }; /* 104 8 */ + struct alternative * alts; /* 112 8 */ + struct symbol * sym; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ - struct symbol * sym; /* 128 8 */ - struct stack_op * stack_ops; /* 136 8 */ - struct cfi_state * cfi; /* 144 8 */ + struct stack_op * stack_ops; /* 128 8 */ + struct cfi_state * cfi; /* 136 8 */ - /* size: 152, cachelines: 3, members: 29 */ - /* sum members: 150 */ + /* size: 144, cachelines: 3, members: 28 */ + /* sum members: 142 */ /* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */ - /* last cacheline: 24 bytes */ + /* last cacheline: 16 bytes */ }; pre: 5:39.35 real, 215.58 user, 123.69 sys, 23448736 mem post: 5:38.18 real, 213.25 user, 124.90 sys, 23449040 mem Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build only Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run Link: https://lore.kernel.org/r/20230208172245.640914454@infradead.org
| * objtool: Remove instruction::relocPeter Zijlstra2023-02-232-16/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of caching the reloc for each instruction, only keep a negative cache of not having a reloc (by far the most common case). struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ long unsigned int immediate; /* 64 8 */ unsigned int len; /* 72 4 */ u8 type; /* 76 1 */ /* Bitfield combined with previous fields */ u16 dead_end:1; /* 76: 8 2 */ u16 ignore:1; /* 76: 9 2 */ u16 ignore_alts:1; /* 76:10 2 */ u16 hint:1; /* 76:11 2 */ u16 save:1; /* 76:12 2 */ u16 restore:1; /* 76:13 2 */ u16 retpoline_safe:1; /* 76:14 2 */ u16 noendbr:1; /* 76:15 2 */ u16 entry:1; /* 78: 0 2 */ u16 visited:4; /* 78: 1 2 */ + u16 no_reloc:1; /* 78: 5 2 */ - /* XXX 3 bits hole, try to pack */ + /* XXX 2 bits hole, try to pack */ /* Bitfield combined with next fields */ s8 instr; /* 79 1 */ struct alt_group * alt_group; /* 80 8 */ struct symbol * call_dest; /* 88 8 */ struct instruction * jump_dest; /* 96 8 */ struct instruction * first_jump_src; /* 104 8 */ struct reloc * jump_table; /* 112 8 */ - struct reloc * reloc; /* 120 8 */ + struct alternative * alts; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ - struct alternative * alts; /* 128 8 */ - struct symbol * sym; /* 136 8 */ - struct stack_op * stack_ops; /* 144 8 */ - struct cfi_state * cfi; /* 152 8 */ + struct symbol * sym; /* 128 8 */ + struct stack_op * stack_ops; /* 136 8 */ + struct cfi_state * cfi; /* 144 8 */ - /* size: 160, cachelines: 3, members: 29 */ - /* sum members: 158 */ - /* sum bitfield members: 13 bits, bit holes: 1, sum bit holes: 3 bits */ - /* last cacheline: 32 bytes */ + /* size: 152, cachelines: 3, members: 29 */ + /* sum members: 150 */ + /* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */ + /* last cacheline: 24 bytes */ }; pre: 5:48.89 real, 220.96 user, 127.55 sys, 24834672 mem post: 5:39.35 real, 215.58 user, 123.69 sys, 23448736 mem Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build only Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run Link: https://lore.kernel.org/r/20230208172245.572145269@infradead.org
| * objtool: Shrink instruction::{type,visited}Peter Zijlstra2023-02-231-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we don't have that many types in enum insn_type, force it into a u8 and re-arrange member to get rid of the holes, saves another 8 bytes. struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ - unsigned int len; /* 64 4 */ - enum insn_type type; /* 68 4 */ - long unsigned int immediate; /* 72 8 */ - u16 dead_end:1; /* 80: 0 2 */ - u16 ignore:1; /* 80: 1 2 */ - u16 ignore_alts:1; /* 80: 2 2 */ - u16 hint:1; /* 80: 3 2 */ - u16 save:1; /* 80: 4 2 */ - u16 restore:1; /* 80: 5 2 */ - u16 retpoline_safe:1; /* 80: 6 2 */ - u16 noendbr:1; /* 80: 7 2 */ - u16 entry:1; /* 80: 8 2 */ + long unsigned int immediate; /* 64 8 */ + unsigned int len; /* 72 4 */ + u8 type; /* 76 1 */ - /* XXX 7 bits hole, try to pack */ + /* Bitfield combined with previous fields */ - s8 instr; /* 82 1 */ - u8 visited; /* 83 1 */ + u16 dead_end:1; /* 76: 8 2 */ + u16 ignore:1; /* 76: 9 2 */ + u16 ignore_alts:1; /* 76:10 2 */ + u16 hint:1; /* 76:11 2 */ + u16 save:1; /* 76:12 2 */ + u16 restore:1; /* 76:13 2 */ + u16 retpoline_safe:1; /* 76:14 2 */ + u16 noendbr:1; /* 76:15 2 */ + u16 entry:1; /* 78: 0 2 */ + u16 visited:4; /* 78: 1 2 */ - /* XXX 4 bytes hole, try to pack */ + /* XXX 3 bits hole, try to pack */ + /* Bitfield combined with next fields */ - struct alt_group * alt_group; /* 88 8 */ - struct symbol * call_dest; /* 96 8 */ - struct instruction * jump_dest; /* 104 8 */ - struct instruction * first_jump_src; /* 112 8 */ - struct reloc * jump_table; /* 120 8 */ + s8 instr; /* 79 1 */ + struct alt_group * alt_group; /* 80 8 */ + struct symbol * call_dest; /* 88 8 */ + struct instruction * jump_dest; /* 96 8 */ + struct instruction * first_jump_src; /* 104 8 */ + struct reloc * jump_table; /* 112 8 */ + struct reloc * reloc; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ - struct reloc * reloc; /* 128 8 */ - struct alternative * alts; /* 136 8 */ - struct symbol * sym; /* 144 8 */ - struct stack_op * stack_ops; /* 152 8 */ - struct cfi_state * cfi; /* 160 8 */ + struct alternative * alts; /* 128 8 */ + struct symbol * sym; /* 136 8 */ + struct stack_op * stack_ops; /* 144 8 */ + struct cfi_state * cfi; /* 152 8 */ - /* size: 168, cachelines: 3, members: 29 */ - /* sum members: 162, holes: 1, sum holes: 4 */ - /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits */ - /* last cacheline: 40 bytes */ + /* size: 160, cachelines: 3, members: 29 */ + /* sum members: 158 */ + /* sum bitfield members: 13 bits, bit holes: 1, sum bit holes: 3 bits */ + /* last cacheline: 32 bytes */ }; pre: 5:48.86 real, 220.30 user, 128.34 sys, 24834672 mem post: 5:48.89 real, 220.96 user, 127.55 sys, 24834672 mem Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build only Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run Link: https://lore.kernel.org/r/20230208172245.501847188@infradead.org
| * objtool: Make instruction::alts a single-linked listPeter Zijlstra2023-02-232-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ unsigned int len; /* 64 4 */ enum insn_type type; /* 68 4 */ long unsigned int immediate; /* 72 8 */ u16 dead_end:1; /* 80: 0 2 */ u16 ignore:1; /* 80: 1 2 */ u16 ignore_alts:1; /* 80: 2 2 */ u16 hint:1; /* 80: 3 2 */ u16 save:1; /* 80: 4 2 */ u16 restore:1; /* 80: 5 2 */ u16 retpoline_safe:1; /* 80: 6 2 */ u16 noendbr:1; /* 80: 7 2 */ u16 entry:1; /* 80: 8 2 */ /* XXX 7 bits hole, try to pack */ s8 instr; /* 82 1 */ u8 visited; /* 83 1 */ /* XXX 4 bytes hole, try to pack */ struct alt_group * alt_group; /* 88 8 */ struct symbol * call_dest; /* 96 8 */ struct instruction * jump_dest; /* 104 8 */ struct instruction * first_jump_src; /* 112 8 */ struct reloc * jump_table; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ struct reloc * reloc; /* 128 8 */ - struct list_head alts; /* 136 16 */ - struct symbol * sym; /* 152 8 */ - struct stack_op * stack_ops; /* 160 8 */ - struct cfi_state * cfi; /* 168 8 */ + struct alternative * alts; /* 136 8 */ + struct symbol * sym; /* 144 8 */ + struct stack_op * stack_ops; /* 152 8 */ + struct cfi_state * cfi; /* 160 8 */ - /* size: 176, cachelines: 3, members: 29 */ - /* sum members: 170, holes: 1, sum holes: 4 */ + /* size: 168, cachelines: 3, members: 29 */ + /* sum members: 162, holes: 1, sum holes: 4 */ /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits */ - /* last cacheline: 48 bytes */ + /* last cacheline: 40 bytes */ }; pre: 5:58.50 real, 229.64 user, 128.65 sys, 26221520 mem post: 5:48.86 real, 220.30 user, 128.34 sys, 24834672 mem Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build only Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run Link: https://lore.kernel.org/r/20230208172245.430556498@infradead.org
| * objtool: Make instruction::stack_ops a single-linked listPeter Zijlstra2023-02-234-10/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ unsigned int len; /* 64 4 */ enum insn_type type; /* 68 4 */ long unsigned int immediate; /* 72 8 */ u16 dead_end:1; /* 80: 0 2 */ u16 ignore:1; /* 80: 1 2 */ u16 ignore_alts:1; /* 80: 2 2 */ u16 hint:1; /* 80: 3 2 */ u16 save:1; /* 80: 4 2 */ u16 restore:1; /* 80: 5 2 */ u16 retpoline_safe:1; /* 80: 6 2 */ u16 noendbr:1; /* 80: 7 2 */ u16 entry:1; /* 80: 8 2 */ /* XXX 7 bits hole, try to pack */ s8 instr; /* 82 1 */ u8 visited; /* 83 1 */ /* XXX 4 bytes hole, try to pack */ struct alt_group * alt_group; /* 88 8 */ struct symbol * call_dest; /* 96 8 */ struct instruction * jump_dest; /* 104 8 */ struct instruction * first_jump_src; /* 112 8 */ struct reloc * jump_table; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ struct reloc * reloc; /* 128 8 */ struct list_head alts; /* 136 16 */ struct symbol * sym; /* 152 8 */ - struct list_head stack_ops; /* 160 16 */ - struct cfi_state * cfi; /* 176 8 */ + struct stack_op * stack_ops; /* 160 8 */ + struct cfi_state * cfi; /* 168 8 */ - /* size: 184, cachelines: 3, members: 29 */ - /* sum members: 178, holes: 1, sum holes: 4 */ + /* size: 176, cachelines: 3, members: 29 */ + /* sum members: 170, holes: 1, sum holes: 4 */ /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits */ - /* last cacheline: 56 bytes */ + /* last cacheline: 48 bytes */ }; pre: 5:58.22 real, 226.69 user, 131.22 sys, 26221520 mem post: 5:58.50 real, 229.64 user, 128.65 sys, 26221520 mem Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build only Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run Link: https://lore.kernel.org/r/20230208172245.362196959@infradead.org
| * objtool: Change arch_decode_instruction() signaturePeter Zijlstra2023-02-234-71/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation to changing struct instruction around a bit, avoid passing it's members by pointer and instead pass the whole thing. A cleanup in it's own right too. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build only Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run Link: https://lore.kernel.org/r/20230208172245.291087549@infradead.org
| * Merge branch 'linus' into objtool/core, to pick up Xen dependenciesIngo Molnar2023-02-232-3/+27
| |\ | | | | | | | | | | | | | | | | | | Pick up dependencies - freshly merged upstream via xen-next - before applying dependent objtool changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/unwind/orc: Add 'signal' field to ORC metadataJosh Poimboeuf2023-02-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a 'signal' field which allows unwind hints to specify whether the instruction pointer should be taken literally (like for most interrupts and exceptions) rather than decremented (like for call stack return addresses) when used to find the next ORC entry. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/d2c5ec4d83a45b513d8fd72fab59f1a8cfa46871.1676068346.git.jpoimboe@kernel.org
| * | objtool: Optimize layout of struct special_altThomas Weißschuh2023-02-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Reduce the size of struct special_alt from 72 to 64 bytes. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-7-17968f85a464@weissschuh.net Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
| * | objtool: Optimize layout of struct symbolThomas Weißschuh2023-02-011-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduce the size of struct symbol on x86_64 from 208 to 200 bytes. This structure is allocated a lot and never freed. This reduces maximum memory usage while processing vmlinux.o from 2919716 KB to 2917988 KB (-0.5%) on my notebooks "localmodconfig". Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-6-17968f85a464@weissschuh.net Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
| * | objtool: Allocate multiple structures with calloc()Thomas Weißschuh2023-02-012-21/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By using calloc() instead of malloc() in a loop, libc does not have to keep around bookkeeping information for each single structure. This reduces maximum memory usage while processing vmlinux.o from 3153325 KB to 3035668 KB (-3.7%) on my notebooks "localmodconfig". Note this introduces memory leaks, because some additional structs get added to the lists later after reading the symbols and sections from the original object. Luckily we don't really care about memory leaks in objtool. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-3-17968f85a464@weissschuh.net Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
| * | objtool: Make struct check_options staticThomas Weißschuh2023-02-012-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is not used outside of builtin-check.c. Also remove the unused declaration from builtin.h . Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-2-17968f85a464@weissschuh.net Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
| * | objtool: Make struct entries[] static and constThomas Weißschuh2023-02-011-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This data is not modified and not used outside of special.c. Also adapt its users to the constness. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-1-17968f85a464@weissschuh.net Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
| * | objtool: Fix HOSTCC flag usageIan Rogers2023-02-011-12/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HOSTCC is always wanted when building objtool. Setting CC to HOSTCC happens after tools/scripts/Makefile.include is included, meaning flags (like CFLAGS) are set assuming say CC is gcc, but then it can be later set to HOSTCC which may be clang. tools/scripts/Makefile.include is needed for host set up and common macros in objtool's Makefile. Rather than override the CC variable to HOSTCC, just pass CC as HOSTCC to the sub-makes of Makefile.build, the libsubcmd builds and also to the linkage step. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230126190606.40739-4-irogers@google.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>