diff options
221 files changed, 3506 insertions, 1591 deletions
diff --git a/Documentation/bpf/classic_vs_extended.rst b/Documentation/bpf/classic_vs_extended.rst new file mode 100644 index 000000000000..2f81a81f5267 --- /dev/null +++ b/Documentation/bpf/classic_vs_extended.rst @@ -0,0 +1,376 @@ + +=================== +Classic BPF vs eBPF +=================== + +eBPF is designed to be JITed with one to one mapping, which can also open up +the possibility for GCC/LLVM compilers to generate optimized eBPF code through +an eBPF backend that performs almost as fast as natively compiled code. + +Some core changes of the eBPF format from classic BPF: + +- Number of registers increase from 2 to 10: + + The old format had two registers A and X, and a hidden frame pointer. The + new layout extends this to be 10 internal registers and a read-only frame + pointer. Since 64-bit CPUs are passing arguments to functions via registers + the number of args from eBPF program to in-kernel function is restricted + to 5 and one register is used to accept return value from an in-kernel + function. Natively, x86_64 passes first 6 arguments in registers, aarch64/ + sparcv9/mips64 have 7 - 8 registers for arguments; x86_64 has 6 callee saved + registers, and aarch64/sparcv9/mips64 have 11 or more callee saved registers. + + Thus, all eBPF registers map one to one to HW registers on x86_64, aarch64, + etc, and eBPF calling convention maps directly to ABIs used by the kernel on + 64-bit architectures. + + On 32-bit architectures JIT may map programs that use only 32-bit arithmetic + and may let more complex programs to be interpreted. + + R0 - R5 are scratch registers and eBPF program needs spill/fill them if + necessary across calls. Note that there is only one eBPF program (== one + eBPF main routine) and it cannot call other eBPF functions, it can only + call predefined in-kernel functions, though. + +- Register width increases from 32-bit to 64-bit: + + Still, the semantics of the original 32-bit ALU operations are preserved + via 32-bit subregisters. All eBPF registers are 64-bit with 32-bit lower + subregisters that zero-extend into 64-bit if they are being written to. + That behavior maps directly to x86_64 and arm64 subregister definition, but + makes other JITs more difficult. + + 32-bit architectures run 64-bit eBPF programs via interpreter. + Their JITs may convert BPF programs that only use 32-bit subregisters into + native instruction set and let the rest being interpreted. + + Operation is 64-bit, because on 64-bit architectures, pointers are also + 64-bit wide, and we want to pass 64-bit values in/out of kernel functions, + so 32-bit eBPF registers would otherwise require to define register-pair + ABI, thus, there won't be able to use a direct eBPF register to HW register + mapping and JIT would need to do combine/split/move operations for every + register in and out of the function, which is complex, bug prone and slow. + Another reason is the use of atomic 64-bit counters. + +- Conditional jt/jf targets replaced with jt/fall-through: + + While the original design has constructs such as ``if (cond) jump_true; + else jump_false;``, they are being replaced into alternative constructs like + ``if (cond) jump_true; /* else fall-through */``. + +- Introduces bpf_call insn and register passing convention for zero overhead + calls from/to other kernel functions: + + Before an in-kernel function call, the eBPF program needs to + place function arguments into R1 to R5 registers to satisfy calling + convention, then the interpreter will take them from registers and pass + to in-kernel function. If R1 - R5 registers are mapped to CPU registers + that are used for argument passing on given architecture, the JIT compiler + doesn't need to emit extra moves. Function arguments will be in the correct + registers and BPF_CALL instruction will be JITed as single 'call' HW + instruction. This calling convention was picked to cover common call + situations without performance penalty. + + After an in-kernel function call, R1 - R5 are reset to unreadable and R0 has + a return value of the function. Since R6 - R9 are callee saved, their state + is preserved across the call. + + For example, consider three C functions:: + + u64 f1() { return (*_f2)(1); } + u64 f2(u64 a) { return f3(a + 1, a); } + u64 f3(u64 a, u64 b) { return a - b; } + + GCC can compile f1, f3 into x86_64:: + + f1: + movl $1, %edi + movq _f2(%rip), %rax + jmp *%rax + f3: + movq %rdi, %rax + subq %rsi, %rax + ret + + Function f2 in eBPF may look like:: + + f2: + bpf_mov R2, R1 + bpf_add R1, 1 + bpf_call f3 + bpf_exit + + If f2 is JITed and the pointer stored to ``_f2``. The calls f1 -> f2 -> f3 and + returns will be seamless. Without JIT, __bpf_prog_run() interpreter needs to + be used to call into f2. + + For practical reasons all eBPF programs have only one argument 'ctx' which is + already placed into R1 (e.g. on __bpf_prog_run() startup) and the programs + can call kernel functions with up to 5 arguments. Calls with 6 or more arguments + are currently not supported, but these restrictions can be lifted if necessary + in the future. + + On 64-bit architectures all register map to HW registers one to one. For + example, x86_64 JIT compiler can map them as ... + + :: + + R0 - rax + R1 - rdi + R2 - rsi + R3 - rdx + R4 - rcx + R5 - r8 + R6 - rbx + R7 - r13 + R8 - r14 + R9 - r15 + R10 - rbp + + ... since x86_64 ABI mandates rdi, rsi, rdx, rcx, r8, r9 for argument passing + and rbx, r12 - r15 are callee saved. + + Then the following eBPF pseudo-program:: + + bpf_mov R6, R1 /* save ctx */ + bpf_mov R2, 2 + bpf_mov R3, 3 + bpf_mov R4, 4 + bpf_mov R5, 5 + bpf_call foo + bpf_mov R7, R0 /* save foo() return value */ + bpf_mov R1, R6 /* restore ctx for next call */ + bpf_mov R2, 6 + bpf_mov R3, 7 + bpf_mov R4, 8 + bpf_mov R5, 9 + bpf_call bar + bpf_add R0, R7 + bpf_exit + + After JIT to x86_64 may look like:: + + push %rbp + mov %rsp,%rbp + sub $0x228,%rsp + mov %rbx,-0x228(%rbp) + mov %r13,-0x220(%rbp) + mov %rdi,%rbx + mov $0x2,%esi + mov $0x3,%edx + mov $0x4,%ecx + mov $0x5,%r8d + callq foo + mov %rax,%r13 + mov %rbx,%rdi + mov $0x6,%esi + mov $0x7,%edx + mov $0x8,%ecx + mov $0x9,%r8d + callq bar + add %r13,%rax + mov -0x228(%rbp),%rbx + mov -0x220(%rbp),%r13 + leaveq + retq + + Which is in this example equivalent in C to:: + + u64 bpf_filter(u64 ctx) + { + return foo(ctx, 2, 3, 4, 5) + bar(ctx, 6, 7, 8, 9); + } + + In-kernel functions foo() and bar() with prototype: u64 (*)(u64 arg1, u64 + arg2, u64 arg3, u64 arg4, u64 arg5); will receive arguments in proper + registers and place their return value into ``%rax`` which is R0 in eBPF. + Prologue and epilogue are emitted by JIT and are implicit in the + interpreter. R0-R5 are scratch registers, so eBPF program needs to preserve + them across the calls as defined by calling convention. + + For example the following program is invalid:: + + bpf_mov R1, 1 + bpf_call foo + bpf_mov R0, R1 + bpf_exit + + After the call the registers R1-R5 contain junk values and cannot be read. + An in-kernel verifier.rst is used to validate eBPF programs. + +Also in the new design, eBPF is limited to 4096 insns, which means that any +program will terminate quickly and will only call a fixed number of kernel +functions. Original BPF and eBPF are two operand instructions, +which helps to do one-to-one mapping between eBPF insn and x86 insn during JIT. + +The input context pointer for invoking the interpreter function is generic, +its content is defined by a specific use case. For seccomp register R1 points +to seccomp_data, for converted BPF filters R1 points to a skb. + +A program, that is translated internally consists of the following elements:: + + op:16, jt:8, jf:8, k:32 ==> op:8, dst_reg:4, src_reg:4, off:16, imm:32 + +So far 87 eBPF instructions were implemented. 8-bit 'op' opcode field +has room for new instructions. Some of them may use 16/24/32 byte encoding. New +instructions must be multiple of 8 bytes to preserve backward compatibility. + +eBPF is a general purpose RISC instruction set. Not every register and +every instruction are used during translation from original BPF to eBPF. +For example, socket filters are not using ``exclusive add`` instruction, but +tracing filters may do to maintain counters of events, for example. Register R9 +is not used by socket filters either, but more complex filters may be running +out of registers and would have to resort to spill/fill to stack. + +eBPF can be used as a generic assembler for last step performance +optimizations, socket filters and seccomp are using it as assembler. Tracing +filters may use it as assembler to generate code from kernel. In kernel usage +may not be bounded by security considerations, since generated eBPF code +may be optimizing internal code path and not being exposed to the user space. +Safety of eBPF can come from the verifier.rst. In such use cases as +described, it may be used as safe instruction set. + +Just like the original BPF, eBPF runs within a controlled environment, +is deterministic and the kernel can easily prove that. The safety of the program +can be determined in two steps: first step does depth-first-search to disallow +loops and other CFG validation; second step starts from the first insn and +descends all possible paths. It simulates execution of every insn and observes +the state change of registers and stack. + +opcode encoding +=============== + +eBPF is reusing most of the opcode encoding from classic to simplify conversion +of classic BPF to eBPF. + +For arithmetic and jump instructions the 8-bit 'code' field is divided into three +parts:: + + +----------------+--------+--------------------+ + | 4 bits | 1 bit | 3 bits | + | operation code | source | instruction class | + +----------------+--------+--------------------+ + (MSB) (LSB) + +Three LSB bits store instruction class which is one of: + + =================== =============== + Classic BPF classes eBPF classes + =================== =============== + BPF_LD 0x00 BPF_LD 0x00 + BPF_LDX 0x01 BPF_LDX 0x01 + BPF_ST 0x02 BPF_ST 0x02 + BPF_STX 0x03 BPF_STX 0x03 + BPF_ALU 0x04 BPF_ALU 0x04 + BPF_JMP 0x05 BPF_JMP 0x05 + BPF_RET 0x06 BPF_JMP32 0x06 + BPF_MISC 0x07 BPF_ALU64 0x07 + =================== =============== + +The 4th bit encodes the source operand ... + + :: + + BPF_K 0x00 + BPF_X 0x08 + + * in classic BPF, this means:: + + BPF_SRC(code) == BPF_X - use register X as source operand + BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand + + * in eBPF, this means:: + + BPF_SRC(code) == BPF_X - use 'src_reg' register as source operand + BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand + +... and four MSB bits store operation code. + +If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 [ in eBPF ], BPF_OP(code) is one of:: + + BPF_ADD 0x00 + BPF_SUB 0x10 + BPF_MUL 0x20 + BPF_DIV 0x30 + BPF_OR 0x40 + BPF_AND 0x50 + BPF_LSH 0x60 + BPF_RSH 0x70 + BPF_NEG 0x80 + BPF_MOD 0x90 + BPF_XOR 0xa0 + BPF_MOV 0xb0 /* eBPF only: mov reg to reg */ + BPF_ARSH 0xc0 /* eBPF only: sign extending shift right */ + BPF_END 0xd0 /* eBPF only: endianness conversion */ + +If BPF_CLASS(code) == BPF_JMP or BPF_JMP32 [ in eBPF ], BPF_OP(code) is one of:: + + BPF_JA 0x00 /* BPF_JMP only */ + BPF_JEQ 0x10 + BPF_JGT 0x20 + BPF_JGE 0x30 + BPF_JSET 0x40 + BPF_JNE 0x50 /* eBPF only: jump != */ + BPF_JSGT 0x60 /* eBPF only: signed '>' */ + BPF_JSGE 0x70 /* eBPF only: signed '>=' */ + BPF_CALL 0x80 /* eBPF BPF_JMP only: function call */ + BPF_EXIT 0x90 /* eBPF BPF_JMP only: function return */ + BPF_JLT 0xa0 /* eBPF only: unsigned '<' */ + BPF_JLE 0xb0 /* eBPF only: unsigned '<=' */ + BPF_JSLT 0xc0 /* eBPF only: signed '<' */ + BPF_JSLE 0xd0 /* eBPF only: signed '<=' */ + +So BPF_ADD | BPF_X | BPF_ALU means 32-bit addition in both classic BPF +and eBPF. There are only two registers in classic BPF, so it means A += X. +In eBPF it means dst_reg = (u32) dst_reg + (u32) src_reg; similarly, +BPF_XOR | BPF_K | BPF_ALU means A ^= imm32 in classic BPF and analogous +src_reg = (u32) src_reg ^ (u32) imm32 in eBPF. + +Classic BPF is using BPF_MISC class to represent A = X and X = A moves. +eBPF is using BPF_MOV | BPF_X | BPF_ALU code instead. Since there are no +BPF_MISC operations in eBPF, the class 7 is used as BPF_ALU64 to mean +exactly the same operations as BPF_ALU, but with 64-bit wide operands +instead. So BPF_ADD | BPF_X | BPF_ALU64 means 64-bit addition, i.e.: +dst_reg = dst_reg + src_reg + +Classic BPF wastes the whole BPF_RET class to represent a single ``ret`` +operation. Classic BPF_RET | BPF_K means copy imm32 into return register +and perform function exit. eBPF is modeled to match CPU, so BPF_JMP | BPF_EXIT +in eBPF means function exit only. The eBPF program needs to store return +value into register R0 before doing a BPF_EXIT. Class 6 in eBPF is used as +BPF_JMP32 to mean exactly the same operations as BPF_JMP, but with 32-bit wide +operands for the comparisons instead. + +For load and store instructions the 8-bit 'code' field is divided as:: + + +--------+--------+-------------------+ + | 3 bits | 2 bits | 3 bits | + | mode | size | instruction class | + +--------+--------+-------------------+ + (MSB) (LSB) + +Size modifier is one of ... + +:: + + BPF_W 0x00 /* word */ + BPF_H 0x08 /* half word */ + BPF_B 0x10 /* byte */ + BPF_DW 0x18 /* eBPF only, double word */ + +... which encodes size of load/store operation:: + + B - 1 byte + H - 2 byte + W - 4 byte + DW - 8 byte (eBPF only) + +Mode modifier is one of:: + + BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */ + BPF_ABS 0x20 + BPF_IND 0x40 + BPF_MEM 0x60 + BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */ + BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */ + BPF_ATOMIC 0xc0 /* eBPF only, atomic operations */ diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst index 91ba5a62026b..ef5c996547ec 100644 --- a/Documentation/bpf/index.rst +++ b/Documentation/bpf/index.rst @@ -21,6 +21,7 @@ that goes into great technical depth about the BPF Architecture. helpers programs maps + classic_vs_extended.rst bpf_licensing test_debug other diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst index fa7cba59031e..1af51143ff9f 100644 --- a/Documentation/bpf/instruction-set.rst +++ b/Documentation/bpf/instruction-set.rst @@ -3,296 +3,68 @@ eBPF Instruction Set ==================== -eBPF is designed to be JITed with one to one mapping, which can also open up -the possibility for GCC/LLVM compilers to generate optimized eBPF code through -an eBPF backend that performs almost as fast as natively compiled code. - -Some core changes of the eBPF format from classic BPF: - -- Number of registers increase from 2 to 10: - - The old format had two registers A and X, and a hidden frame pointer. The - new layout extends this to be 10 internal registers and a read-only frame - pointer. Since 64-bit CPUs are passing arguments to functions via registers - the number of args from eBPF program to in-kernel function is restricted - to 5 and one register is used to accept return value from an in-kernel - function. Natively, x86_64 passes first 6 arguments in registers, aarch64/ - sparcv9/mips64 have 7 - 8 registers for arguments; x86_64 has 6 callee saved - registers, and aarch64/sparcv9/mips64 have 11 or more callee saved registers. - - Therefore, eBPF calling convention is defined as: - - * R0 - return value from in-kernel function, and exit value for eBPF program - * R1 - R5 - arguments from eBPF program to in-kernel function - * R6 - R9 - callee saved registers that in-kernel function will preserve - * R10 - read-only frame pointer to access stack - - Thus, all eBPF registers map one to one to HW registers on x86_64, aarch64, - etc, and eBPF calling convention maps directly to ABIs used by the kernel on - 64-bit architectures. - - On 32-bit architectures JIT may map programs that use only 32-bit arithmetic - and may let more complex programs to be interpreted. - - R0 - R5 are scratch registers and eBPF program needs spill/fill them if - necessary across calls. Note that there is only one eBPF program (== one - eBPF main routine) and it cannot call other eBPF functions, it can only - call predefined in-kernel functions, though. - -- Register width increases from 32-bit to 64-bit: - - Still, the semantics of the original 32-bit ALU operations are preserved - via 32-bit subregisters. All eBPF registers are 64-bit with 32-bit lower - subregisters that zero-extend into 64-bit if they are being written to. - That behavior maps directly to x86_64 and arm64 subregister definition, but - makes other JITs more difficult. - - 32-bit architectures run 64-bit eBPF programs via interpreter. - Their JITs may convert BPF programs that only use 32-bit subregisters into - native instruction set and let the rest being interpreted. - - Operation is 64-bit, because on 64-bit architectures, pointers are also - 64-bit wide, and we want to pass 64-bit values in/out of kernel functions, - so 32-bit eBPF registers would otherwise require to define register-pair - ABI, thus, there won't be able to use a direct eBPF register to HW register - mapping and JIT would need to do combine/split/move operations for every - register in and out of the function, which is complex, bug prone and slow. - Another reason is the use of atomic 64-bit counters. - -- Conditional jt/jf targets replaced with jt/fall-through: - - While the original design has constructs such as ``if (cond) jump_true; - else jump_false;``, they are being replaced into alternative constructs like - ``if (cond) jump_true; /* else fall-through */``. - -- Introduces bpf_call insn and register passing convention for zero overhead - calls from/to other kernel functions: - - Before an in-kernel function call, the eBPF program needs to - place function arguments into R1 to R5 registers to satisfy calling - convention, then the interpreter will take them from registers and pass - to in-kernel function. If R1 - R5 registers are mapped to CPU registers - that are used for argument passing on given architecture, the JIT compiler - doesn't need to emit extra moves. Function arguments will be in the correct - registers and BPF_CALL instruction will be JITed as single 'call' HW - instruction. This calling convention was picked to cover common call - situations without performance penalty. - - After an in-kernel function call, R1 - R5 are reset to unreadable and R0 has - a return value of the function. Since R6 - R9 are callee saved, their state - is preserved across the call. - - For example, consider three C functions:: - - u64 f1() { return (*_f2)(1); } - u64 f2(u64 a) { return f3(a + 1, a); } - u64 f3(u64 a, u64 b) { return a - b; } - - GCC can compile f1, f3 into x86_64:: - - f1: - movl $1, %edi - movq _f2(%rip), %rax - jmp *%rax - f3: - movq %rdi, %rax - subq %rsi, %rax - ret - - Function f2 in eBPF may look like:: - - f2: - bpf_mov R2, R1 - bpf_add R1, 1 - bpf_call f3 - bpf_exit - - If f2 is JITed and the pointer stored to ``_f2``. The calls f1 -> f2 -> f3 and - returns will be seamless. Without JIT, __bpf_prog_run() interpreter needs to - be used to call into f2. - - For practical reasons all eBPF programs have only one argument 'ctx' which is - already placed into R1 (e.g. on __bpf_prog_run() startup) and the programs - can call kernel functions with up to 5 arguments. Calls with 6 or more arguments - are currently not supported, but these restrictions can be lifted if necessary - in the future. - - On 64-bit architectures all register map to HW registers one to one. For - example, x86_64 JIT compiler can map them as ... - - :: - - R0 - rax - R1 - rdi - R2 - rsi - R3 - rdx - R4 - rcx - R5 - r8 - R6 - rbx - R7 - r13 - R8 - r14 - R9 - r15 - R10 - rbp - - ... since x86_64 ABI mandates rdi, rsi, rdx, rcx, r8, r9 for argument passing - and rbx, r12 - r15 are callee saved. - - Then the following eBPF pseudo-program:: - - bpf_mov R6, R1 /* save ctx */ - bpf_mov R2, 2 - bpf_mov R3, 3 - bpf_mov R4, 4 - bpf_mov R5, 5 - bpf_call foo - bpf_mov R7, R0 /* save foo() return value */ - bpf_mov R1, R6 /* restore ctx for next call */ - bpf_mov R2, 6 - bpf_mov R3, 7 - bpf_mov R4, 8 - bpf_mov R5, 9 - bpf_call bar - bpf_add R0, R7 - bpf_exit - - After JIT to x86_64 may look like:: - - push %rbp - mov %rsp,%rbp - sub $0x228,%rsp - mov %rbx,-0x228(%rbp) - mov %r13,-0x220(%rbp) - mov %rdi,%rbx - mov $0x2,%esi - mov $0x3,%edx - mov $0x4,%ecx - mov $0x5,%r8d - callq foo - mov %rax,%r13 - mov %rbx,%rdi - mov $0x6,%esi - mov $0x7,%edx - mov $0x8,%ecx - mov $0x9,%r8d - callq bar - add %r13,%rax - mov -0x228(%rbp),%rbx - mov -0x220(%rbp),%r13 - leaveq - retq - - Which is in this example equivalent in C to:: - - u64 bpf_filter(u64 ctx) - { - return foo(ctx, 2, 3, 4, 5) + bar(ctx, 6, 7, 8, 9); - } - - In-kernel functions foo() and bar() with prototype: u64 (*)(u64 arg1, u64 - arg2, u64 arg3, u64 arg4, u64 arg5); will receive arguments in proper - registers and place their return value into ``%rax`` which is R0 in eBPF. - Prologue and epilogue are emitted by JIT and are implicit in the - interpreter. R0-R5 are scratch registers, so eBPF program needs to preserve - them across the calls as defined by calling convention. - - For example the following program is invalid:: - - bpf_mov R1, 1 - bpf_call foo - bpf_mov R0, R1 - bpf_exit - - After the call the registers R1-R5 contain junk values and cannot be read. - An in-kernel `eBPF verifier`_ is used to validate eBPF programs. - -Also in the new design, eBPF is limited to 4096 insns, which means that any -program will terminate quickly and will only call a fixed number of kernel -functions. Original BPF and eBPF are two operand instructions, -which helps to do one-to-one mapping between eBPF insn and x86 insn during JIT. - -The input context pointer for invoking the interpreter function is generic, -its content is defined by a specific use case. For seccomp register R1 points -to seccomp_data, for converted BPF filters R1 points to a skb. - -A program, that is translated internally consists of the following elements:: - - op:16, jt:8, jf:8, k:32 ==> op:8, dst_reg:4, src_reg:4, off:16, imm:32 - -So far 87 eBPF instructions were implemented. 8-bit 'op' opcode field -has room for new instructions. Some of them may use 16/24/32 byte encoding. New -instructions must be multiple of 8 bytes to preserve backward compatibility. - -eBPF is a general purpose RISC instruction set. Not every register and -every instruction are used during translation from original BPF to eBPF. -For example, socket filters are not using ``exclusive add`` instruction, but -tracing filters may do to maintain counters of events, for example. Register R9 -is not used by socket filters either, but more complex filters may be running -out of registers and would have to resort to spill/fill to stack. - -eBPF can be used as a generic assembler for last step performance -optimizations, socket filters and seccomp are using it as assembler. Tracing -filters may use it as assembler to generate code from kernel. In kernel usage -may not be bounded by security considerations, since generated eBPF code -may be optimizing internal code path and not being exposed to the user space. -Safety of eBPF can come from the `eBPF verifier`_. In such use cases as -described, it may be used as safe instruction set. - -Just like the original BPF, eBPF runs within a controlled environment, -is deterministic and the kernel can easily prove that. The safety of the program -can be determined in two steps: first step does depth-first-search to disallow -loops and other CFG validation; second step starts from the first insn and -descends all possible paths. It simulates execution of every insn and observes -the state change of registers and stack. - -eBPF opcode encoding -==================== +Registers and calling convention +================================ + +eBPF has 10 general purpose registers and a read-only frame pointer register, +all of which are 64-bits wide. -eBPF is reusing most of the opcode encoding from classic to simplify conversion -of classic BPF to eBPF. For arithmetic and jump instructions the 8-bit 'code' -field is divided into three parts:: +The eBPF calling convention is defined as: - +----------------+--------+--------------------+ - | 4 bits | 1 bit | 3 bits | - | operation code | source | instruction class | - +----------------+--------+--------------------+ - (MSB) (LSB) + * R0: return value from function calls, and exit value for eBPF programs + * R1 - R5: arguments for function calls + * R6 - R9: callee saved registers that function calls will preserve + * R10: read-only frame pointer to access stack -Three LSB bits store instruction class which is one of: +R0 - R5 are scratch registers and eBPF programs needs to spill/fill them if +necessary across calls. - =================== =============== - Classic BPF classes eBPF classes - =================== =============== - BPF_LD 0x00 BPF_LD 0x00 - BPF_LDX 0x01 BPF_LDX 0x01 - BPF_ST 0x02 BPF_ST 0x02 - BPF_STX 0x03 BPF_STX 0x03 - BPF_ALU 0x04 BPF_ALU 0x04 - BPF_JMP 0x05 BPF_JMP 0x05 - BPF_RET 0x06 BPF_JMP32 0x06 - BPF_MISC 0x07 BPF_ALU64 0x07 - =================== =============== +Instruction classes +=================== -When BPF_CLASS(code) == BPF_ALU or BPF_JMP, 4th bit encodes source operand ... +The three LSB bits of the 'opcode' field store the instruction class: - :: + ========= ===== + class value + ========= ===== + BPF_LD 0x00 + BPF_LDX 0x01 + BPF_ST 0x02 + BPF_STX 0x03 + BPF_ALU 0x04 + BPF_JMP 0x05 + BPF_JMP32 0x06 + BPF_ALU64 0x07 + ========= ===== - BPF_K 0x00 - BPF_X 0x08 +Arithmetic and jump instructions +================================ - * in classic BPF, this means:: +For arithmetic and jump instructions (BPF_ALU, BPF_ALU64, BPF_JMP and +BPF_JMP32), the 8-bit 'opcode' field is divided into three parts: - BPF_SRC(code) == BPF_X - use register X as source operand - BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand + ============== ====== ================= + 4 bits (MSB) 1 bit 3 bits (LSB) + ============== ====== ================= + operation code source instruction class + ============== ====== ================= - * in eBPF, this means:: +The 4th bit encodes the source operand: - BPF_SRC(code) == BPF_X - use 'src_reg' register as source operand - BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand + ====== ===== ======================================== + source value description + ====== ===== ======================================== + BPF_K 0x00 use 32-bit immediate as source operand + BPF_X 0x08 use 'src_reg' register as source operand + ====== ===== ======================================== -... and four MSB bits store operation code. +The four MSB bits store the operation code. -If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 [ in eBPF ], BPF_OP(code) is one of:: +For class BPF_ALU or BPF_ALU64: + ======== ===== ========================= + code value description + ======== ===== ========================= BPF_ADD 0x00 BPF_SUB 0x10 BPF_MUL 0x20 @@ -304,116 +76,105 @@ If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 [ in eBPF ], BPF_OP(code) is one of:: BPF_NEG 0x80 BPF_MOD 0x90 BPF_XOR 0xa0 - BPF_MOV 0xb0 /* eBPF only: mov reg to reg */ - BPF_ARSH 0xc0 /* eBPF only: sign extending shift right */ - BPF_END 0xd0 /* eBPF only: endianness conversion */ + BPF_MOV 0xb0 mov reg to reg + BPF_ARSH 0xc0 sign extending shift right + BPF_END 0xd0 endianness conversion + ======== ===== ========================= -If BPF_CLASS(code) == BPF_JMP or BPF_JMP32 [ in eBPF ], BPF_OP(code) is one of:: +For class BPF_JMP or BPF_JMP32: - BPF_JA 0x00 /* BPF_JMP only */ + ======== ===== ========================= + code value description + ======== ===== ========================= + BPF_JA 0x00 BPF_JMP only BPF_JEQ 0x10 BPF_JGT 0x20 BPF_JGE 0x30 BPF_JSET 0x40 - BPF_JNE 0x50 /* eBPF only: jump != */ - BPF_JSGT 0x60 /* eBPF only: signed '>' */ - BPF_JSGE 0x70 /* eBPF only: signed '>=' */ - BPF_CALL 0x80 /* eBPF BPF_JMP only: function call */ - BPF_EXIT 0x90 /* eBPF BPF_JMP only: function return */ - BPF_JLT 0xa0 /* eBPF only: unsigned '<' */ - BPF_JLE 0xb0 /* eBPF only: unsigned '<=' */ - BPF_JSLT 0xc0 /* eBPF only: signed '<' */ - BPF_JSLE 0xd0 /* eBPF only: signed '<=' */ - -So BPF_ADD | BPF_X | BPF_ALU means 32-bit addition in both classic BPF -and eBPF. There are only two registers in classic BPF, so it means A += X. -In eBPF it means dst_reg = (u32) dst_reg + (u32) src_reg; similarly, -BPF_XOR | BPF_K | BPF_ALU means A ^= imm32 in classic BPF and analogous -src_reg = (u32) src_reg ^ (u32) imm32 in eBPF. - -Classic BPF is using BPF_MISC class to represent A = X and X = A moves. -eBPF is using BPF_MOV | BPF_X | BPF_ALU code instead. Since there are no -BPF_MISC operations in eBPF, the class 7 is used as BPF_ALU64 to mean -exactly the same operations as BPF_ALU, but with 64-bit wide operands -instead. So BPF_ADD | BPF_X | BPF_ALU64 means 64-bit addition, i.e.: -dst_reg = dst_reg + src_reg - -Classic BPF wastes the whole BPF_RET class to represent a single ``ret`` -operation. Classic BPF_RET | BPF_K means copy imm32 into return register -and perform function exit. eBPF is modeled to match CPU, so BPF_JMP | BPF_EXIT -in eBPF means function exit only. The eBPF program needs to store return -value into register R0 before doing a BPF_EXIT. Class 6 in eBPF is used as -BPF_JMP32 to mean exactly the same operations as BPF_JMP, but with 32-bit wide -operands for the comparisons instead. + BPF_JNE 0x50 jump '!=' + BPF_JSGT 0x60 signed '>' + BPF_JSGE 0x70 signed '>=' + BPF_CALL 0x80 function call + BPF_EXIT 0x90 function return + BPF_JLT 0xa0 unsigned '<' + BPF_JLE 0xb0 unsigned '<=' + BPF_JSLT 0xc0 signed '<' + BPF_JSLE 0xd0 signed '<=' + ======== ===== ========================= -For load and store instructions the 8-bit 'code' field is divided as:: +So BPF_ADD | BPF_X | BPF_ALU means:: - +--------+--------+-------------------+ - | 3 bits | 2 bits | 3 bits | - | mode | size | instruction class | - +--------+--------+-------------------+ - (MSB) (LSB) + dst_reg = (u32) dst_reg + (u32) src_reg; -Size modifier is one of ... +Similarly, BPF_XOR | BPF_K | BPF_ALU means:: -:: + src_reg = (u32) src_reg ^ (u32) imm32 - BPF_W 0x00 /* word */ - BPF_H 0x08 /* half word */ - BPF_B 0x10 /* byte */ - BPF_DW 0x18 /* eBPF only, double word */ +eBPF is using BPF_MOV | BPF_X | BPF_ALU to represent A = B moves. BPF_ALU64 +is used to mean exactly the same operations as BPF_ALU, but with 64-bit wide +operands instead. So BPF_ADD | BPF_X | BPF_ALU64 means 64-bit addition, i.e.:: -... which encodes size of load/store operation:: + dst_reg = dst_reg + src_reg - B - 1 byte - H - 2 byte - W - 4 byte - DW - 8 byte (eBPF only) +BPF_JMP | BPF_EXIT means function exit only. The eBPF program needs to store +the return value into register R0 before doing a BPF_EXIT. Class 6 is used as +BPF_JMP32 to mean exactly the same operations as BPF_JMP, but with 32-bit wide +operands for the comparisons instead. -Mode modifier is one of:: - BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */ - BPF_ABS 0x20 - BPF_IND 0x40 - BPF_MEM 0x60 - BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */ - BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */ - BPF_ATOMIC 0xc0 /* eBPF only, atomic operations */ +Load and store instructions +=========================== -eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and -(BPF_IND | <size> | BPF_LD) which are used to access packet data. +For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the +8-bit 'opcode' field is divided as: -They had to be carried over from classic to have strong performance of -socket filters running in eBPF interpreter. These instructions can only -be used when interpreter context is a pointer to ``struct sk_buff`` and -have seven implicit operands. Register R6 is an implicit input that must -contain pointer to sk_buff. Register R0 is an implicit output which contains -the data fetched from the packet. Registers R1-R5 are scratch registers -and must not be used to store the data across BPF_ABS | BPF_LD or -BPF_IND | BPF_LD instructions. + ============ ====== ================= + 3 bits (MSB) 2 bits 3 bits (LSB) + ============ ====== ================= + mode size instruction class + ============ ====== ================= -These instructions have implicit program exit condition as well. When -eBPF program is trying to access the data beyond the packet boundary, -the interpreter will abort the execution of the program. JIT compilers -therefore must preserve this property. src_reg and imm32 fields are -explicit inputs to these instructions. +The size modifier is one of: + + ============= ===== ===================== + size modifier value description + ============= ===== ===================== + BPF_W 0x00 word (4 bytes) + BPF_H 0x08 half word (2 bytes) + BPF_B 0x10 byte + BPF_DW 0x18 double word (8 bytes) + ============= ===== ===================== + +The mode modifier is one of: + + ============= ===== ===================== + mode modifier value description + ============= ===== ===================== + BPF_IMM 0x00 used for 64-bit mov + BPF_ABS 0x20 + BPF_IND 0x40 + BPF_MEM 0x60 + BPF_ATOMIC 0xc0 atomic operations + ============= ===== ===================== -For example:: +BPF_MEM | <size> | BPF_STX means:: - BPF_IND | BPF_W | BPF_LD means: + *(size *) (dst_reg + off) = src_reg - R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32)) - and R1 - R5 were scratched. +BPF_MEM | <size> | BPF_ST means:: -Unlike classic BPF instruction set, eBPF has generic load/store operations:: + *(size *) (dst_reg + off) = imm32 - BPF_MEM | <size> | BPF_STX: *(size *) (dst_reg + off) = src_reg - BPF_MEM | <size> | BPF_ST: *(size *) (dst_reg + off) = imm32 - BPF_MEM | <size> | BPF_LDX: dst_reg = *(size *) (src_reg + off) +BPF_MEM | <size> | BPF_LDX means:: + + dst_reg = *(size *) (src_reg + off) Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. -It also includes atomic operations, which use the immediate field for extra +Atomic operations +----------------- + +eBPF includes atomic operations, which use the immediate field for extra encoding:: .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg @@ -457,11 +218,36 @@ You may encounter ``BPF_XADD`` - this is a legacy name for ``BPF_ATOMIC``, referring to the exclusive-add operation encoded when the immediate field is zero. +16-byte instructions +-------------------- + eBPF has one 16-byte instruction: ``BPF_LD | BPF_DW | BPF_IMM`` which consists of two consecutive ``struct bpf_insn`` 8-byte blocks and interpreted as single instruction that loads 64-bit immediate value into a dst_reg. -Classic BPF has similar instruction: ``BPF_LD | BPF_W | BPF_IMM`` which loads -32-bit immediate value into a register. -.. Links: -.. _eBPF verifier: verifiers.rst +Packet access instructions +-------------------------- + +eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and +(BPF_IND | <size> | BPF_LD) which are used to access packet data. + +They had to be carried over from classic BPF to have strong performance of +socket filters running in eBPF interpreter. These instructions can only +be used when interpreter context is a pointer to ``struct sk_buff`` and +have seven implicit operands. Register R6 is an implicit input that must +contain pointer to sk_buff. Register R0 is an implicit output which contains +the data fetched from the packet. Registers R1-R5 are scratch registers +and must not be used to store the data across BPF_ABS | BPF_LD or +BPF_IND | BPF_LD instructions. + +These instructions have implicit program exit condition as well. When +eBPF program is trying to access the data beyond the packet boundary, +the interpreter will abort the execution of the program. JIT compilers +therefore must preserve this property. src_reg and imm32 fields are +explicit inputs to these instructions. + +For example, BPF_IND | BPF_W | BPF_LD means:: + + R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32)) + +and R1 - R5 are clobbered. diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c index da36d13ffc16..082793d497ec 100644 --- a/arch/s390/mm/hugetlbpage.c +++ b/arch/s390/mm/hugetlbpage.c @@ -9,6 +9,7 @@ #define KMSG_COMPONENT "hugetlb" #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt +#include <asm/pgalloc.h> #include <linux/mm.h> #include <linux/hugetlb.h> #include <linux/mman.h> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index c1205414de10..ce1f86f245c9 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1976,7 +1976,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i void *orig_call) { int ret, i, nr_args = m->nr_args; - int stack_size = nr_args * 8; + int regs_off, ip_off, args_off, stack_size = nr_args * 8; struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY]; struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT]; struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN]; @@ -1991,14 +1991,39 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i if (!is_valid_bpf_tramp_flags(flags)) return -EINVAL; + /* Generated trampoline stack layout: + * + * RBP + 8 [ return address ] + * RBP + 0 [ RBP ] + * + * RBP - 8 [ return value ] BPF_TRAMP_F_CALL_ORIG or + * BPF_TRAMP_F_RET_FENTRY_RET flags + * + * [ reg_argN ] always + * [ ... ] + * RBP - regs_off [ reg_arg1 ] program's ctx pointer + * + * RBP - args_off [ args count ] always + * + * RBP - ip_off [ traced function ] BPF_TRAMP_F_IP_ARG flag + */ + /* room for return value of orig_call or fentry prog */ save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET); if (save_ret) stack_size += 8; + regs_off = stack_size; + + /* args count */ + stack_size += 8; + args_off = stack_size; + if (flags & BPF_TRAMP_F_IP_ARG) stack_size += 8; /* room for IP address argument */ + ip_off = stack_size; + if (flags & BPF_TRAMP_F_SKIP_FRAME) /* skip patched call instruction and point orig_call to actual * body of the kernel function. @@ -2012,23 +2037,25 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */ EMIT1(0x53); /* push rbx */ + /* Store number of arguments of the traced function: + * mov rax, nr_args + * mov QWORD PTR [rbp - args_off], rax + */ + emit_mov_imm64(&prog, BPF_REG_0, 0, (u32) nr_args); + emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -args_off); + if (flags & BPF_TRAMP_F_IP_ARG) { /* Store IP address of the traced function: * mov rax, QWORD PTR [rbp + 8] * sub rax, X86_PATCH_SIZE - * mov QWORD PTR [rbp - stack_size], rax + * mov QWORD PTR [rbp - ip_off], rax */ emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8); EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE); - emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size); - - /* Continue with stack_size for regs storage, stack will - * be correctly restored with 'leave' instruction. - */ - stack_size -= 8; + emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -ip_off); } - save_regs(m, &prog, nr_args, stack_size); + save_regs(m, &prog, nr_args, regs_off); if (flags & BPF_TRAMP_F_CALL_ORIG) { /* arg1: mov rdi, im */ @@ -2040,7 +2067,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i } if (fentry->nr_progs) - if (invoke_bpf(m, &prog, fentry, stack_size, + if (invoke_bpf(m, &prog, fentry, regs_off, flags & BPF_TRAMP_F_RET_FENTRY_RET)) return -EINVAL; @@ -2050,7 +2077,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i if (!branches) return -ENOMEM; - if (invoke_bpf_mod_ret(m, &prog, fmod_ret, stack_size, + if (invoke_bpf_mod_ret(m, &prog, fmod_ret, regs_off, branches)) { ret = -EINVAL; goto cleanup; @@ -2058,7 +2085,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i } if (flags & BPF_TRAMP_F_CALL_ORIG) { - restore_regs(m, &prog, nr_args, stack_size); + restore_regs(m, &prog, nr_args, regs_off); /* call original function */ if (emit_call(&prog, orig_call, prog)) { @@ -2088,13 +2115,13 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i } if (fexit->nr_progs) - if (invoke_bpf(m, &prog, fexit, stack_size, false)) { + if (invoke_bpf(m, &prog, fexit, regs_off, false)) { ret = -EINVAL; goto cleanup; } if (flags & BPF_TRAMP_F_RESTORE_REGS) - restore_regs(m, &prog, nr_args, stack_size); + restore_regs(m, &prog, nr_args, regs_off); /* This needs to be done regardless. If there were fmod_ret programs, * the return value is only updated on the stack and still needs to be diff --git a/drivers/bluetooth/btqca.c b/drivers/bluetooth/btqca.c index be04d74037d2..f7bf311d7910 100644 --- a/drivers/bluetooth/btqca.c +++ b/drivers/bluetooth/btqca.c @@ -6,6 +6,7 @@ */ #include <linux/module.h> #include <linux/firmware.h> +#include <linux/vmalloc.h> #include <net/bluetooth/bluetooth.h> #include <net/bluetooth/hci_core.h> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c index 0c98dd3dee67..b79f816a7203 100644 --- a/drivers/infiniband/core/cache.c +++ b/drivers/infiniband/core/cache.c @@ -33,6 +33,7 @@ * SOFTWARE. */ +#include <linux/if_vlan.h> #include <linux/module.h> #include <linux/errno.h> #include <linux/slab.h> diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 7264f8c2f7d5..3141a9c85de5 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB /* Copyright (c) 2015 - 2021 Intel Corporation */ +#include <linux/etherdevice.h> + #include "osdep.h" #include "status.h" #include "hmc.h" diff --git a/drivers/infiniband/hw/irdma/uda.c b/drivers/infiniband/hw/irdma/uda.c index f5b1b6150cdc..7a9988ddbd01 100644 --- a/drivers/infiniband/hw/irdma/uda.c +++ b/drivers/infiniband/hw/irdma/uda.c @@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB /* Copyright (c) 2016 - 2021 Intel Corporation */ +#include <linux/etherdevice.h> + #include "osdep.h" #include "status.h" #include "hmc.h" diff --git a/drivers/infiniband/hw/mlx5/doorbell.c b/drivers/infiniband/hw/mlx5/doorbell.c index 6398e2f48579..e32111117a5e 100644 --- a/drivers/infiniband/hw/mlx5/doorbell.c +++ b/drivers/infiniband/hw/mlx5/doorbell.c @@ -32,6 +32,7 @@ #include <linux/kref.h> #include <linux/slab.h> +#include <linux/sched/mm.h> #include <rdma/ib_umem.h> #include "mlx5_ib.h" diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index e5abbcfc1d57..29475cf8c7c3 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -30,6 +30,7 @@ * SOFTWARE. */ +#include <linux/etherdevice.h> #include <linux/module.h> #include <rdma/ib_umem.h> #include <rdma/ib_cache.h> diff --git a/drivers/net/amt.c b/drivers/net/amt.c index b732ee9a50ef..a1c7a8acd9c8 100644 --- a/drivers/net/amt.c +++ b/drivers/net/amt.c @@ -11,6 +11,7 @@ #include <linux/net.h> #include <linux/igmp.h> #include <linux/workqueue.h> +#include <net/sch_generic.h> #include <net/net_namespace.h> #include <net/ip.h> #include <net/udp.h> diff --git a/drivers/net/appletalk/ipddp.c b/drivers/net/appletalk/ipddp.c index 5566daefbff4..d558535390f9 100644 --- a/drivers/net/appletalk/ipddp.c +++ b/drivers/net/appletalk/ipddp.c @@ -23,6 +23,7 @@ * of the GNU General Public License, incorporated herein by reference. */ +#include <linux/compat.h> #include <linux/module.h> #include <linux/kernel.h> #include <linux/init.h> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 1bb8fa9fd3aa..07fc603c2fa7 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -35,6 +35,7 @@ #include <linux/module.h> #include <linux/types.h> #include <linux/fcntl.h> +#include <linux/filter.h> #include <linux/interrupt.h> #include <linux/ptrace.h> #include <linux/ioport.h> diff --git a/drivers/net/can/usb/peak_usb/pcan_usb.c b/drivers/net/can/usb/peak_usb/pcan_usb.c index 876218752766..ac6772fe9746 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb.c @@ -8,6 +8,7 @@ * * Many thanks to Klaus Hitschler <klaus.hitschler@gmx.de> */ +#include <asm/unaligned.h> #include <linux/netdevice.h> #include <linux/usb.h> #include <linux/module.h> diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c index 013e9c02be71..991b9c6b6ce7 100644 --- a/drivers/net/dsa/microchip/ksz8795.c +++ b/drivers/net/dsa/microchip/ksz8795.c @@ -10,6 +10,7 @@ #include <linux/delay.h> #include <linux/export.h> #include <linux/gpio.h> +#include <linux/if_vlan.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/platform_data/microchip-ksz.h> diff --git a/drivers/net/dsa/xrs700x/xrs700x.c b/drivers/net/dsa/xrs700x/xrs700x.c index 35fa19ddaf19..0730352cdd57 100644 --- a/drivers/net/dsa/xrs700x/xrs700x.c +++ b/drivers/net/dsa/xrs700x/xrs700x.c @@ -5,6 +5,7 @@ */ #include <net/dsa.h> +#include <linux/etherdevice.h> #include <linux/if_bridge.h> #include <linux/of_device.h> #include <linux/netdev_features.h> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 7d5d885d85d5..3b46f1df5609 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -434,7 +434,7 @@ static int ena_xdp_execute(struct ena_ring *rx_ring, struct xdp_buff *xdp) xdp_stat = &rx_ring->rx_stats.xdp_pass; break; default: - bpf_warn_invalid_xdp_action(verdict); + bpf_warn_invalid_xdp_action(rx_ring->netdev, xdp_prog, verdict); xdp_stat = &rx_ring->rx_stats.xdp_invalid; } diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h index 0c39fc2fa345..9391c7101fba 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.h +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h @@ -14,6 +14,7 @@ #include <linux/interrupt.h> #include <linux/netdevice.h> #include <linux/skbuff.h> +#include <uapi/linux/bpf.h> #include "ena_com.h" #include "ena_eth_com.h" diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c index 951c4c569a9b..4da31b1b84f9 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c @@ -9,6 +9,7 @@ #include <linux/pci.h> #include <linux/netdevice.h> +#include <linux/vmalloc.h> #include <net/devlink.h> #include "bnxt_hsi.h" #include "bnxt.h" diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c index c8083df5e0ab..52fad0fdeacf 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c @@ -195,7 +195,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons, *event |= BNXT_REDIRECT_EVENT; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(bp->dev, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(bp->dev, xdp_prog, act); diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 63191692f624..a04aa206fddc 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -590,7 +590,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog, nicvf_xdp_sq_append_pkt(nic, sq, (u64)xdp.data, dma_addr, len); return true; default: - bpf_warn_invalid_xdp_action(action); + bpf_warn_invalid_xdp_action(nic->netdev, prog, action); fallthrough; case XDP_ABORTED: trace_xdp_exception(nic->netdev, prog, action); diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c index 50bbe79fb93d..4367edbdd579 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c @@ -10,6 +10,7 @@ #include <linux/iommu.h> #include <net/ip.h> #include <net/tso.h> +#include <uapi/linux/bpf.h> #include "nic_reg.h" #include "nic.h" diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c index d6871437d951..c78883c3a2c8 100644 --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -2623,7 +2623,7 @@ static u32 dpaa_run_xdp(struct dpaa_priv *priv, struct qm_fd *fd, void *vaddr, } break; default: - bpf_warn_invalid_xdp_action(xdp_act); + bpf_warn_invalid_xdp_action(priv->net_dev, xdp_prog, xdp_act); fallthrough; case XDP_ABORTED: trace_xdp_exception(priv->net_dev, xdp_prog, xdp_act); diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c index 8e643567abce..d21ba70ef4a3 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c @@ -374,7 +374,7 @@ static u32 dpaa2_eth_run_xdp(struct dpaa2_eth_priv *priv, dpaa2_eth_xdp_enqueue(priv, ch, fd, vaddr, rx_fq->flowid); break; default: - bpf_warn_invalid_xdp_action(xdp_act); + bpf_warn_invalid_xdp_action(priv->net_dev, xdp_prog, xdp_act); fallthrough; case XDP_ABORTED: trace_xdp_exception(priv->net_dev, xdp_prog, xdp_act); diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 504e12554079..eacb41f86bdb 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1547,7 +1547,7 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, switch (xdp_act) { default: - bpf_warn_invalid_xdp_action(xdp_act); + bpf_warn_invalid_xdp_action(rx_ring->ndev, prog, xdp_act); fallthrough; case XDP_ABORTED: trace_xdp_exception(rx_ring->ndev, prog, xdp_act); diff --git a/drivers/net/ethernet/huawei/hinic/hinic_tx.c b/drivers/net/ethernet/huawei/hinic/hinic_tx.c index a984a7a6dd2e..8d59babbf476 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_tx.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_tx.c @@ -4,6 +4,7 @@ * Copyright(c) 2017 Huawei Technologies Co., Ltd */ +#include <linux/if_vlan.h> #include <linux/kernel.h> #include <linux/netdevice.h> #include <linux/u64_stats_sync.h> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index 9e3991caa5c9..66cc79500c10 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -2322,7 +2322,7 @@ static int i40e_run_xdp(struct i40e_ring *rx_ring, struct xdp_buff *xdp) result = I40E_XDP_REDIR; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_ABORTED: out_failure: diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index ea06e957393e..945b1bb9c6f4 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -176,7 +176,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) goto out_failure; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_ABORTED: out_failure: diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h index ea88f4597a07..bb962987f300 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h @@ -22,7 +22,6 @@ struct i40e_vsi; struct xsk_buff_pool; -struct zero_copy_allocator; int i40e_queue_pair_disable(struct i40e_vsi *vsi, int queue_pair); int i40e_queue_pair_enable(struct i40e_vsi *vsi, int queue_pair); diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c index 5aa18219920b..a230edb38466 100644 --- a/drivers/net/ethernet/intel/ice/ice_devlink.c +++ b/drivers/net/ethernet/intel/ice/ice_devlink.c @@ -1,6 +1,8 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2020, Intel Corporation. */ +#include <linux/vmalloc.h> + #include "ice.h" #include "ice_lib.h" #include "ice_devlink.h" diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 486fc5509d13..3e38695f1c9d 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -576,7 +576,7 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, goto out_failure; return ICE_XDP_REDIR; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_ABORTED: out_failure: diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c index 1dd7e84f41f8..9520b140bdbf 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c @@ -1,6 +1,8 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2019, Intel Corporation. */ +#include <linux/filter.h> + #include "ice_txrx_lib.h" #include "ice_eswitch.h" #include "ice_lib.h" diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index c895351b25e0..2388837d6d6c 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -481,7 +481,7 @@ ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, goto out_failure; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_ABORTED: out_failure: diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 4603ae35fcab..38ba92022cd4 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -8500,7 +8500,7 @@ static struct sk_buff *igb_run_xdp(struct igb_adapter *adapter, result = IGB_XDP_REDIR; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(adapter->netdev, xdp_prog, act); fallthrough; case XDP_ABORTED: out_failure: diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 453ef9f17162..2f17f36e94fd 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -2241,7 +2241,7 @@ static int __igc_xdp_run_prog(struct igc_adapter *adapter, return IGC_XDP_REDIRECT; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(adapter->netdev, prog, act); fallthrough; case XDP_ABORTED: out_failure: diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c index a8cf5374be47..aeeb34e64610 100644 --- a/drivers/net/ethernet/intel/igc/igc_xdp.c +++ b/drivers/net/ethernet/intel/igc/igc_xdp.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2020, Intel Corporation. */ +#include <linux/if_vlan.h> #include <net/xdp_sock_drv.h> #include "igc.h" diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 15095ac868d5..c6ff656b2476 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -2235,7 +2235,7 @@ static struct sk_buff *ixgbe_run_xdp(struct ixgbe_adapter *adapter, result = IXGBE_XDP_REDIR; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_ABORTED: out_failure: diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_txrx_common.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_txrx_common.h index a82533f21d36..bba3feaf3318 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_txrx_common.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_txrx_common.h @@ -35,8 +35,6 @@ int ixgbe_xsk_pool_setup(struct ixgbe_adapter *adapter, struct xsk_buff_pool *pool, u16 qid); -void ixgbe_zca_free(struct zero_copy_allocator *alloc, unsigned long handle); - bool ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count); int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector, struct ixgbe_ring *rx_ring, diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c index db2bc58dfcfd..b3fd8e5cd85b 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c @@ -131,7 +131,7 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter, goto out_failure; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_ABORTED: out_failure: diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index ea73fb3026bc..0015fcf1df2b 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -1070,7 +1070,7 @@ static struct sk_buff *ixgbevf_run_xdp(struct ixgbevf_adapter *adapter, goto out_failure; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_ABORTED: out_failure: diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index e4c328f61188..83c8908f0cc7 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -2240,7 +2240,7 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq, mvneta_xdp_put_buff(pp, rxq, xdp, sinfo, sync); break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(pp->dev, prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(pp->dev, prog, act); diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c index 339ae274021e..7cdbf8b8bbf6 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c @@ -3823,7 +3823,7 @@ mvpp2_run_xdp(struct mvpp2_port *port, struct bpf_prog *prog, } break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(port->dev, prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(port->dev, prog, act); diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c index 0cc6353254bf..7c4068c5d1ac 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c @@ -1198,7 +1198,7 @@ static bool otx2_xdp_rcv_pkt_handler(struct otx2_nic *pfvf, put_page(page); break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(pfvf->netdev, prog, act); break; case XDP_ABORTED: trace_xdp_exception(pfvf->netdev, prog, act); diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c index ad1e4caf48bf..c61dc7ae0c05 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c @@ -33,6 +33,7 @@ #include <linux/bpf.h> #include <linux/etherdevice.h> +#include <linux/filter.h> #include <linux/tcp.h> #include <linux/if_vlan.h> #include <linux/delay.h> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c index 650e6a1844ae..8cfc649f226b 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -812,7 +812,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud trace_xdp_exception(dev, xdp_prog, act); goto xdp_drop_no_cnt; /* Drop on xmit failure */ default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(dev, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(dev, xdp_prog, act); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c index 50977f01a050..00449df98a5e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2020, Mellanox Technologies inc. All rights reserved. */ +#include <net/sch_generic.h> #include "en.h" #include "params.h" diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index 2f0df5cc1a2d..338d65e2c9ce 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -151,7 +151,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di, rq->stats->xdp_redirect++; return true; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rq->netdev, prog, act); fallthrough; case XDP_ABORTED: xdp_abort: diff --git a/drivers/net/ethernet/microsoft/mana/mana_bpf.c b/drivers/net/ethernet/microsoft/mana/mana_bpf.c index 1bc8ff388341..1d2f948b5c00 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_bpf.c +++ b/drivers/net/ethernet/microsoft/mana/mana_bpf.c @@ -60,7 +60,7 @@ u32 mana_run_xdp(struct net_device *ndev, struct mana_rxq *rxq, break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(ndev, prog, act); } out: diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index d37d35885579..498d0f999275 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -1,6 +1,8 @@ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause /* Copyright (c) 2021, Microsoft Corporation. */ +#include <uapi/linux/bpf.h> + #include <linux/inetdevice.h> #include <linux/etherdevice.h> #include <linux/ethtool.h> diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c index 6e38da4ad554..79257ec41987 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c @@ -1944,7 +1944,7 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget) xdp_prog, act); continue; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(dp->netdev, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(dp->netdev, xdp_prog, act); diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c index 5ea9cb4311a1..b242000a77fd 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_fp.c +++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c @@ -1153,7 +1153,7 @@ static bool qede_rx_xdp(struct qede_dev *edev, qede_rx_bd_ring_consume(rxq); break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(edev->ndev, prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(edev->ndev, prog, act); diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c index a8c252e2b252..302dc835ac3d 100644 --- a/drivers/net/ethernet/sfc/efx.c +++ b/drivers/net/ethernet/sfc/efx.c @@ -5,6 +5,7 @@ * Copyright 2005-2013 Solarflare Communications Inc. */ +#include <linux/filter.h> #include <linux/module.h> #include <linux/pci.h> #include <linux/netdevice.h> diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c index 3dbea028b325..b015d1f2e204 100644 --- a/drivers/net/ethernet/sfc/efx_channels.c +++ b/drivers/net/ethernet/sfc/efx_channels.c @@ -10,6 +10,7 @@ #include "net_driver.h" #include <linux/module.h> +#include <linux/filter.h> #include "efx_channels.h" #include "efx.h" #include "efx_common.h" diff --git a/drivers/net/ethernet/sfc/efx_common.c b/drivers/net/ethernet/sfc/efx_common.c index f187631b2c5c..af37c990217e 100644 --- a/drivers/net/ethernet/sfc/efx_common.c +++ b/drivers/net/ethernet/sfc/efx_common.c @@ -9,6 +9,7 @@ */ #include "net_driver.h" +#include <linux/filter.h> #include <linux/module.h> #include <linux/netdevice.h> #include <net/gre.h> diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c index 606750938b89..2375cef577e4 100644 --- a/drivers/net/ethernet/sfc/rx.c +++ b/drivers/net/ethernet/sfc/rx.c @@ -338,7 +338,7 @@ static bool efx_do_xdp(struct efx_nic *efx, struct efx_channel *channel, break; default: - bpf_warn_invalid_xdp_action(xdp_act); + bpf_warn_invalid_xdp_action(efx->net_dev, xdp_prog, xdp_act); efx_free_rx_buffers(rx_queue, rx_buf, 1); channel->n_rx_xdp_bad_drops++; trace_xdp_exception(efx->net_dev, xdp_prog, xdp_act); diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c index 2e6bd65f6f43..556bd353dd42 100644 --- a/drivers/net/ethernet/socionext/netsec.c +++ b/drivers/net/ethernet/socionext/netsec.c @@ -933,7 +933,7 @@ static u32 netsec_run_xdp(struct netsec_priv *priv, struct bpf_prog *prog, } break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(priv->ndev, prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(priv->ndev, prog, act); diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index aca3c3f0f0ad..40b5ed94cb54 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -22,6 +22,7 @@ #include <linux/net_tstamp.h> #include <linux/reset.h> #include <net/page_pool.h> +#include <uapi/linux/bpf.h> struct stmmac_resources { void __iomem *addr; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index fd50e5783e8f..63ff2dad8c85 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -4723,7 +4723,7 @@ static int __stmmac_xdp_run_prog(struct stmmac_priv *priv, res = STMMAC_XDP_REDIRECT; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(priv->dev, prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(priv->dev, prog, act); diff --git a/drivers/net/ethernet/ti/cpsw_priv.c b/drivers/net/ethernet/ti/cpsw_priv.c index 8624a044776f..3537502e5e8b 100644 --- a/drivers/net/ethernet/ti/cpsw_priv.c +++ b/drivers/net/ethernet/ti/cpsw_priv.c @@ -1362,7 +1362,7 @@ int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp, xdp_do_flush_map(); break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(ndev, prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(ndev, prog, act); diff --git a/drivers/net/ethernet/ti/cpsw_priv.h b/drivers/net/ethernet/ti/cpsw_priv.h index f33c882eb70e..74555970730c 100644 --- a/drivers/net/ethernet/ti/cpsw_priv.h +++ b/drivers/net/ethernet/ti/cpsw_priv.h @@ -6,6 +6,8 @@ #ifndef DRIVERS_NET_ETHERNET_TI_CPSW_PRIV_H_ #define DRIVERS_NET_ETHERNET_TI_CPSW_PRIV_H_ +#include <uapi/linux/bpf.h> + #include "davinci_cpdma.h" #define CPSW_DEBUG (NETIF_MSG_HW | NETIF_MSG_WOL | \ diff --git a/drivers/net/hamradio/hdlcdrv.c b/drivers/net/hamradio/hdlcdrv.c index b0edb91bb10a..8297411e87ea 100644 --- a/drivers/net/hamradio/hdlcdrv.c +++ b/drivers/net/hamradio/hdlcdrv.c @@ -30,6 +30,7 @@ /*****************************************************************************/ #include <linux/capability.h> +#include <linux/compat.h> #include <linux/module.h> #include <linux/types.h> #include <linux/net.h> diff --git a/drivers/net/hamradio/scc.c b/drivers/net/hamradio/scc.c index 3d59dac063ac..f90830d3dfa6 100644 --- a/drivers/net/hamradio/scc.c +++ b/drivers/net/hamradio/scc.c @@ -148,6 +148,7 @@ /* ----------------------------------------------------------------------- */ +#include <linux/compat.h> #include <linux/module.h> #include <linux/errno.h> #include <linux/signal.h> diff --git a/drivers/net/hyperv/netvsc_bpf.c b/drivers/net/hyperv/netvsc_bpf.c index aa877da113f8..7856905414eb 100644 --- a/drivers/net/hyperv/netvsc_bpf.c +++ b/drivers/net/hyperv/netvsc_bpf.c @@ -68,7 +68,7 @@ u32 netvsc_run_xdp(struct net_device *ndev, struct netvsc_channel *nvchan, break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(ndev, prog, act); } out: diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c index a1c77cc00416..ed0edf5884ef 100644 --- a/drivers/net/loopback.c +++ b/drivers/net/loopback.c @@ -44,6 +44,7 @@ #include <linux/etherdevice.h> #include <linux/skbuff.h> #include <linux/ethtool.h> +#include <net/sch_generic.h> #include <net/sock.h> #include <net/checksum.h> #include <linux/if_ether.h> /* For the statistics structure. */ diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 45a67e72a02c..fed85447701a 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1602,7 +1602,7 @@ static int tun_xdp_act(struct tun_struct *tun, struct bpf_prog *xdp_prog, case XDP_PASS: break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(tun->dev, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(tun->dev, xdp_prog, act); diff --git a/drivers/net/veth.c b/drivers/net/veth.c index a4be530b3a1b..d21dd25f429e 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -644,7 +644,7 @@ static struct xdp_frame *veth_xdp_rcv_one(struct veth_rq *rq, rcu_read_unlock(); goto xdp_xmit; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rq->dev, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(rq->dev, xdp_prog, act); @@ -794,7 +794,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, rcu_read_unlock(); goto xdp_xmit; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(rq->dev, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(rq->dev, xdp_prog, act); diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 87f8c896238f..569eecfbc2cd 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -812,7 +812,7 @@ static struct sk_buff *receive_small(struct net_device *dev, rcu_read_unlock(); goto xdp_xmit; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(vi->dev, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(vi->dev, xdp_prog, act); @@ -1022,7 +1022,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, rcu_read_unlock(); goto xdp_xmit; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(vi->dev, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(vi->dev, xdp_prog, act); diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index b4c64226b7ca..e0b1ab99a359 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -34,6 +34,7 @@ #include <net/addrconf.h> #include <net/l3mdev.h> #include <net/fib_rules.h> +#include <net/sch_generic.h> #include <net/netns/generic.h> #include <net/netfilter/nf_conntrack.h> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index d514d96027a6..8b18246ad999 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -948,7 +948,7 @@ static u32 xennet_run_xdp(struct netfront_queue *queue, struct page *pdata, break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(queue->info->netdev, prog, act); } return act; diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 731d31015b6a..347793626f19 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -18,6 +18,7 @@ * 6 Jun 1999 Cache readdir lookups in the page cache. -DaveM */ +#include <linux/compat.h> #include <linux/module.h> #include <linux/time.h> #include <linux/errno.h> diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c index 0d444a90f513..ea17fa1f31ec 100644 --- a/fs/nfs/fs_context.c +++ b/fs/nfs/fs_context.c @@ -10,6 +10,7 @@ * Split from fs/nfs/super.c by David Howells <dhowells@redhat.com> */ +#include <linux/compat.h> #include <linux/module.h> #include <linux/fs.h> #include <linux/fs_context.h> diff --git a/fs/select.c b/fs/select.c index 945896d0ac9e..02cd8cb5e69f 100644 --- a/fs/select.c +++ b/fs/select.c @@ -15,6 +15,7 @@ * of fds to overcome nfds < 16390 descriptors limit (Tigran Aivazian). */ +#include <linux/compat.h> #include <linux/kernel.h> #include <linux/sched/signal.h> #include <linux/sched/rt.h> diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h new file mode 100644 index 000000000000..695d1224a71b --- /dev/null +++ b/include/linux/bpf-cgroup-defs.h @@ -0,0 +1,70 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _BPF_CGROUP_DEFS_H +#define _BPF_CGROUP_DEFS_H + +#ifdef CONFIG_CGROUP_BPF + +#include <linux/list.h> +#include <linux/percpu-refcount.h> +#include <linux/workqueue.h> + +struct bpf_prog_array; + +enum cgroup_bpf_attach_type { + CGROUP_BPF_ATTACH_TYPE_INVALID = -1, + CGROUP_INET_INGRESS = 0, + CGROUP_INET_EGRESS, + CGROUP_INET_SOCK_CREATE, + CGROUP_SOCK_OPS, + CGROUP_DEVICE, + CGROUP_INET4_BIND, + CGROUP_INET6_BIND, + CGROUP_INET4_CONNECT, + CGROUP_INET6_CONNECT, + CGROUP_INET4_POST_BIND, + CGROUP_INET6_POST_BIND, + CGROUP_UDP4_SENDMSG, + CGROUP_UDP6_SENDMSG, + CGROUP_SYSCTL, + CGROUP_UDP4_RECVMSG, + CGROUP_UDP6_RECVMSG, + CGROUP_GETSOCKOPT, + CGROUP_SETSOCKOPT, + CGROUP_INET4_GETPEERNAME, + CGROUP_INET6_GETPEERNAME, + CGROUP_INET4_GETSOCKNAME, + CGROUP_INET6_GETSOCKNAME, + CGROUP_INET_SOCK_RELEASE, + MAX_CGROUP_BPF_ATTACH_TYPE +}; + +struct cgroup_bpf { + /* array of effective progs in this cgroup */ + struct bpf_prog_array __rcu *effective[MAX_CGROUP_BPF_ATTACH_TYPE]; + + /* attached progs to this cgroup and attach flags + * when flags == 0 or BPF_F_ALLOW_OVERRIDE the progs list will + * have either zero or one element + * when BPF_F_ALLOW_MULTI the list can have up to BPF_CGROUP_MAX_PROGS + */ + struct list_head progs[MAX_CGROUP_BPF_ATTACH_TYPE]; + u32 flags[MAX_CGROUP_BPF_ATTACH_TYPE]; + + /* list of cgroup shared storages */ + struct list_head storages; + + /* temp storage for effective prog array used by prog_attach/detach */ + struct bpf_prog_array *inactive; + + /* reference counter used to detach bpf programs after cgroup removal */ + struct percpu_ref refcnt; + + /* cgroup_bpf is released using a work queue */ + struct work_struct release_work; +}; + +#else /* CONFIG_CGROUP_BPF */ +struct cgroup_bpf {}; +#endif /* CONFIG_CGROUP_BPF */ + +#endif diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h index 11820a430d6c..b525d8cdc25b 100644 --- a/include/linux/bpf-cgroup.h +++ b/include/linux/bpf-cgroup.h @@ -3,10 +3,10 @@ #define _BPF_CGROUP_H #include <linux/bpf.h> +#include <linux/bpf-cgroup-defs.h> #include <linux/errno.h> #include <linux/jump_label.h> #include <linux/percpu.h> -#include <linux/percpu-refcount.h> #include <linux/rbtree.h> #include <uapi/linux/bpf.h> @@ -23,33 +23,6 @@ struct ctl_table_header; struct task_struct; #ifdef CONFIG_CGROUP_BPF -enum cgroup_bpf_attach_type { - CGROUP_BPF_ATTACH_TYPE_INVALID = -1, - CGROUP_INET_INGRESS = 0, - CGROUP_INET_EGRESS, - CGROUP_INET_SOCK_CREATE, - CGROUP_SOCK_OPS, - CGROUP_DEVICE, - CGROUP_INET4_BIND, - CGROUP_INET6_BIND, - CGROUP_INET4_CONNECT, - CGROUP_INET6_CONNECT, - CGROUP_INET4_POST_BIND, - CGROUP_INET6_POST_BIND, - CGROUP_UDP4_SENDMSG, - CGROUP_UDP6_SENDMSG, - CGROUP_SYSCTL, - CGROUP_UDP4_RECVMSG, - CGROUP_UDP6_RECVMSG, - CGROUP_GETSOCKOPT, - CGROUP_SETSOCKOPT, - CGROUP_INET4_GETPEERNAME, - CGROUP_INET6_GETPEERNAME, - CGROUP_INET4_GETSOCKNAME, - CGROUP_INET6_GETSOCKNAME, - CGROUP_INET_SOCK_RELEASE, - MAX_CGROUP_BPF_ATTACH_TYPE -}; #define CGROUP_ATYPE(type) \ case BPF_##type: return type @@ -127,33 +100,6 @@ struct bpf_prog_list { struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE]; }; -struct bpf_prog_array; - -struct cgroup_bpf { - /* array of effective progs in this cgroup */ - struct bpf_prog_array __rcu *effective[MAX_CGROUP_BPF_ATTACH_TYPE]; - - /* attached progs to this cgroup and attach flags - * when flags == 0 or BPF_F_ALLOW_OVERRIDE the progs list will - * have either zero or one element - * when BPF_F_ALLOW_MULTI the list can have up to BPF_CGROUP_MAX_PROGS - */ - struct list_head progs[MAX_CGROUP_BPF_ATTACH_TYPE]; - u32 flags[MAX_CGROUP_BPF_ATTACH_TYPE]; - - /* list of cgroup shared storages */ - struct list_head storages; - - /* temp storage for effective prog array used by prog_attach/detach */ - struct bpf_prog_array *inactive; - - /* reference counter used to detach bpf programs after cgroup removal */ - struct percpu_ref refcnt; - - /* cgroup_bpf is released using a work queue */ - struct work_struct release_work; -}; - int cgroup_bpf_inherit(struct cgroup *cgrp); void cgroup_bpf_offline(struct cgroup *cgrp); @@ -451,7 +397,6 @@ int cgroup_bpf_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr); #else -struct cgroup_bpf {}; static inline int cgroup_bpf_inherit(struct cgroup *cgrp) { return 0; } static inline void cgroup_bpf_offline(struct cgroup *cgrp) {} diff --git a/include/linux/bpf-netns.h b/include/linux/bpf-netns.h index 722f799c1a2e..413cfa5e4b07 100644 --- a/include/linux/bpf-netns.h +++ b/include/linux/bpf-netns.h @@ -3,15 +3,9 @@ #define _BPF_NETNS_H #include <linux/mutex.h> +#include <net/netns/bpf.h> #include <uapi/linux/bpf.h> -enum netns_bpf_attach_type { - NETNS_BPF_INVALID = -1, - NETNS_BPF_FLOW_DISSECTOR = 0, - NETNS_BPF_SK_LOOKUP, - MAX_NETNS_BPF_ATTACH_TYPE -}; - static inline enum netns_bpf_attach_type to_netns_bpf_attach_type(enum bpf_attach_type attach_type) { diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 0ceb54c6342f..26753139d5b4 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -297,6 +297,34 @@ bool bpf_map_meta_equal(const struct bpf_map *meta0, extern const struct bpf_map_ops bpf_map_offload_ops; +/* bpf_type_flag contains a set of flags that are applicable to the values of + * arg_type, ret_type and reg_type. For example, a pointer value may be null, + * or a memory is read-only. We classify types into two categories: base types + * and extended types. Extended types are base types combined with a type flag. + * + * Currently there are no more than 32 base types in arg_type, ret_type and + * reg_types. + */ +#define BPF_BASE_TYPE_BITS 8 + +enum bpf_type_flag { + /* PTR may be NULL. */ + PTR_MAYBE_NULL = BIT(0 + BPF_BASE_TYPE_BITS), + + /* MEM is read-only. When applied on bpf_arg, it indicates the arg is + * compatible with both mutable and immutable memory. + */ + MEM_RDONLY = BIT(1 + BPF_BASE_TYPE_BITS), + + __BPF_TYPE_LAST_FLAG = MEM_RDONLY, +}; + +/* Max number of base types. */ +#define BPF_BASE_TYPE_LIMIT (1UL << BPF_BASE_TYPE_BITS) + +/* Max number of all types. */ +#define BPF_TYPE_LIMIT (__BPF_TYPE_LAST_FLAG | (__BPF_TYPE_LAST_FLAG - 1)) + /* function argument constraints */ enum bpf_arg_type { ARG_DONTCARE = 0, /* unused argument in helper function */ @@ -308,13 +336,11 @@ enum bpf_arg_type { ARG_PTR_TO_MAP_KEY, /* pointer to stack used as map key */ ARG_PTR_TO_MAP_VALUE, /* pointer to stack used as map value */ ARG_PTR_TO_UNINIT_MAP_VALUE, /* pointer to valid memory used to store a map value */ - ARG_PTR_TO_MAP_VALUE_OR_NULL, /* pointer to stack used as map value or NULL */ /* the following constraints used to prototype bpf_memcmp() and other * functions that access data on eBPF program stack */ ARG_PTR_TO_MEM, /* pointer to valid memory (stack, packet, map value) */ - ARG_PTR_TO_MEM_OR_NULL, /* pointer to valid memory or NULL */ ARG_PTR_TO_UNINIT_MEM, /* pointer to memory does not need to be initialized, * helper function must fill all bytes or clear * them in error case. @@ -324,42 +350,65 @@ enum bpf_arg_type { ARG_CONST_SIZE_OR_ZERO, /* number of bytes accessed from memory or 0 */ ARG_PTR_TO_CTX, /* pointer to context */ - ARG_PTR_TO_CTX_OR_NULL, /* pointer to context or NULL */ ARG_ANYTHING, /* any (initialized) argument is ok */ ARG_PTR_TO_SPIN_LOCK, /* pointer to bpf_spin_lock */ ARG_PTR_TO_SOCK_COMMON, /* pointer to sock_common */ ARG_PTR_TO_INT, /* pointer to int */ ARG_PTR_TO_LONG, /* pointer to long */ ARG_PTR_TO_SOCKET, /* pointer to bpf_sock (fullsock) */ - ARG_PTR_TO_SOCKET_OR_NULL, /* pointer to bpf_sock (fullsock) or NULL */ ARG_PTR_TO_BTF_ID, /* pointer to in-kernel struct */ ARG_PTR_TO_ALLOC_MEM, /* pointer to dynamically allocated memory */ - ARG_PTR_TO_ALLOC_MEM_OR_NULL, /* pointer to dynamically allocated memory or NULL */ ARG_CONST_ALLOC_SIZE_OR_ZERO, /* number of allocated bytes requested */ ARG_PTR_TO_BTF_ID_SOCK_COMMON, /* pointer to in-kernel sock_common or bpf-mirrored bpf_sock */ ARG_PTR_TO_PERCPU_BTF_ID, /* pointer to in-kernel percpu type */ ARG_PTR_TO_FUNC, /* pointer to a bpf program function */ - ARG_PTR_TO_STACK_OR_NULL, /* pointer to stack or NULL */ + ARG_PTR_TO_STACK, /* pointer to stack */ ARG_PTR_TO_CONST_STR, /* pointer to a null terminated read-only string */ ARG_PTR_TO_TIMER, /* pointer to bpf_timer */ __BPF_ARG_TYPE_MAX, + + /* Extended arg_types. */ + ARG_PTR_TO_MAP_VALUE_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_MAP_VALUE, + ARG_PTR_TO_MEM_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_MEM, + ARG_PTR_TO_CTX_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_CTX, + ARG_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_SOCKET, + ARG_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_ALLOC_MEM, + ARG_PTR_TO_STACK_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_STACK, + + /* This must be the last entry. Its purpose is to ensure the enum is + * wide enough to hold the higher bits reserved for bpf_type_flag. + */ + __BPF_ARG_TYPE_LIMIT = BPF_TYPE_LIMIT, }; +static_assert(__BPF_ARG_TYPE_MAX <= BPF_BASE_TYPE_LIMIT); /* type of values returned from helper functions */ enum bpf_return_type { RET_INTEGER, /* function returns integer */ RET_VOID, /* function doesn't return anything */ RET_PTR_TO_MAP_VALUE, /* returns a pointer to map elem value */ - RET_PTR_TO_MAP_VALUE_OR_NULL, /* returns a pointer to map elem value or NULL */ - RET_PTR_TO_SOCKET_OR_NULL, /* returns a pointer to a socket or NULL */ - RET_PTR_TO_TCP_SOCK_OR_NULL, /* returns a pointer to a tcp_sock or NULL */ - RET_PTR_TO_SOCK_COMMON_OR_NULL, /* returns a pointer to a sock_common or NULL */ - RET_PTR_TO_ALLOC_MEM_OR_NULL, /* returns a pointer to dynamically allocated memory or NULL */ - RET_PTR_TO_BTF_ID_OR_NULL, /* returns a pointer to a btf_id or NULL */ - RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, /* returns a pointer to a valid memory or a btf_id or NULL */ + RET_PTR_TO_SOCKET, /* returns a pointer to a socket */ + RET_PTR_TO_TCP_SOCK, /* returns a pointer to a tcp_sock */ + RET_PTR_TO_SOCK_COMMON, /* returns a pointer to a sock_common */ + RET_PTR_TO_ALLOC_MEM, /* returns a pointer to dynamically allocated memory */ RET_PTR_TO_MEM_OR_BTF_ID, /* returns a pointer to a valid memory or a btf_id */ RET_PTR_TO_BTF_ID, /* returns a pointer to a btf_id */ + __BPF_RET_TYPE_MAX, + + /* Extended ret_types. */ + RET_PTR_TO_MAP_VALUE_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_MAP_VALUE, + RET_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCKET, + RET_PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK, + RET_PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON, + RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM, + RET_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID, + + /* This must be the last entry. Its purpose is to ensure the enum is + * wide enough to hold the higher bits reserved for bpf_type_flag. + */ + __BPF_RET_TYPE_LIMIT = BPF_TYPE_LIMIT, }; +static_assert(__BPF_RET_TYPE_MAX <= BPF_BASE_TYPE_LIMIT); /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs * to in-kernel helper functions and for adjusting imm32 field in BPF_CALL @@ -421,18 +470,15 @@ enum bpf_reg_type { PTR_TO_CTX, /* reg points to bpf_context */ CONST_PTR_TO_MAP, /* reg points to struct bpf_map */ PTR_TO_MAP_VALUE, /* reg points to map element value */ - PTR_TO_MAP_VALUE_OR_NULL,/* points to map elem value or NULL */ + PTR_TO_MAP_KEY, /* reg points to a map element key */ PTR_TO_STACK, /* reg == frame_pointer + offset */ PTR_TO_PACKET_META, /* skb->data - meta_len */ PTR_TO_PACKET, /* reg points to skb->data */ PTR_TO_PACKET_END, /* skb->data + headlen */ PTR_TO_FLOW_KEYS, /* reg points to bpf_flow_keys */ PTR_TO_SOCKET, /* reg points to struct bpf_sock */ - PTR_TO_SOCKET_OR_NULL, /* reg points to struct bpf_sock or NULL */ PTR_TO_SOCK_COMMON, /* reg points to sock_common */ - PTR_TO_SOCK_COMMON_OR_NULL, /* reg points to sock_common or NULL */ PTR_TO_TCP_SOCK, /* reg points to struct tcp_sock */ - PTR_TO_TCP_SOCK_OR_NULL, /* reg points to struct tcp_sock or NULL */ PTR_TO_TP_BUFFER, /* reg points to a writable raw tp's buffer */ PTR_TO_XDP_SOCK, /* reg points to struct xdp_sock */ /* PTR_TO_BTF_ID points to a kernel struct that does not need @@ -450,18 +496,25 @@ enum bpf_reg_type { * been checked for null. Used primarily to inform the verifier * an explicit null check is required for this struct. */ - PTR_TO_BTF_ID_OR_NULL, PTR_TO_MEM, /* reg points to valid memory region */ - PTR_TO_MEM_OR_NULL, /* reg points to valid memory region or NULL */ - PTR_TO_RDONLY_BUF, /* reg points to a readonly buffer */ - PTR_TO_RDONLY_BUF_OR_NULL, /* reg points to a readonly buffer or NULL */ - PTR_TO_RDWR_BUF, /* reg points to a read/write buffer */ - PTR_TO_RDWR_BUF_OR_NULL, /* reg points to a read/write buffer or NULL */ + PTR_TO_BUF, /* reg points to a read/write buffer */ PTR_TO_PERCPU_BTF_ID, /* reg points to a percpu kernel variable */ PTR_TO_FUNC, /* reg points to a bpf program function */ - PTR_TO_MAP_KEY, /* reg points to a map element key */ __BPF_REG_TYPE_MAX, + + /* Extended reg_types. */ + PTR_TO_MAP_VALUE_OR_NULL = PTR_MAYBE_NULL | PTR_TO_MAP_VALUE, + PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | PTR_TO_SOCKET, + PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | PTR_TO_SOCK_COMMON, + PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | PTR_TO_TCP_SOCK, + PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | PTR_TO_BTF_ID, + + /* This must be the last entry. Its purpose is to ensure the enum is + * wide enough to hold the higher bits reserved for bpf_type_flag. + */ + __BPF_REG_TYPE_LIMIT = BPF_TYPE_LIMIT, }; +static_assert(__BPF_REG_TYPE_MAX <= BPF_BASE_TYPE_LIMIT); /* The information passed from prog-specific *_is_valid_access * back to the verifier. @@ -777,6 +830,7 @@ void bpf_ksym_add(struct bpf_ksym *ksym); void bpf_ksym_del(struct bpf_ksym *ksym); int bpf_jit_charge_modmem(u32 pages); void bpf_jit_uncharge_modmem(u32 pages); +bool bpf_prog_has_trampoline(const struct bpf_prog *prog); #else static inline int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr) @@ -805,6 +859,10 @@ static inline bool is_bpf_image_address(unsigned long address) { return false; } +static inline bool bpf_prog_has_trampoline(const struct bpf_prog *prog) +{ + return false; +} #endif struct bpf_func_info_aux { @@ -2163,6 +2221,7 @@ extern const struct bpf_func_proto bpf_sk_getsockopt_proto; extern const struct bpf_func_proto bpf_kallsyms_lookup_name_proto; extern const struct bpf_func_proto bpf_find_vma_proto; extern const struct bpf_func_proto bpf_loop_proto; +extern const struct bpf_func_proto bpf_strncmp_proto; const struct bpf_func_proto *tracing_prog_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h index 24496bc28e7b..37b3906af8b1 100644 --- a/include/linux/bpf_local_storage.h +++ b/include/linux/bpf_local_storage.h @@ -8,6 +8,7 @@ #define _BPF_LOCAL_STORAGE_H #include <linux/bpf.h> +#include <linux/filter.h> #include <linux/rculist.h> #include <linux/list.h> #include <linux/hash.h> @@ -16,6 +17,9 @@ #define BPF_LOCAL_STORAGE_CACHE_SIZE 16 +#define bpf_rcu_lock_held() \ + (rcu_read_lock_held() || rcu_read_lock_trace_held() || \ + rcu_read_lock_bh_held()) struct bpf_local_storage_map_bucket { struct hlist_head list; raw_spinlock_t lock; @@ -161,4 +165,6 @@ struct bpf_local_storage_data * bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, void *value, u64 map_flags); +void bpf_local_storage_free_rcu(struct rcu_head *rcu); + #endif /* _BPF_LOCAL_STORAGE_H */ diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 182b16a91084..143401d4c9d9 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -18,6 +18,8 @@ * that converting umax_value to int cannot overflow. */ #define BPF_MAX_VAR_SIZ (1 << 29) +/* size of type_str_buf in bpf_verifier. */ +#define TYPE_STR_BUF_LEN 64 /* Liveness marks, used for registers and spilled-regs (in stack slots). * Read marks propagate upwards until they find a write mark; they record that @@ -388,6 +390,8 @@ static inline bool bpf_verifier_log_full(const struct bpf_verifier_log *log) #define BPF_LOG_LEVEL (BPF_LOG_LEVEL1 | BPF_LOG_LEVEL2) #define BPF_LOG_MASK (BPF_LOG_LEVEL | BPF_LOG_STATS) #define BPF_LOG_KERNEL (BPF_LOG_MASK + 1) /* kernel internal flag */ +#define BPF_LOG_MIN_ALIGNMENT 8U +#define BPF_LOG_ALIGNMENT 40U static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log) { @@ -474,6 +478,16 @@ struct bpf_verifier_env { /* longest register parentage chain walked for liveness marking */ u32 longest_mark_read_walk; bpfptr_t fd_array; + + /* bit mask to keep track of whether a register has been accessed + * since the last time the function state was printed + */ + u32 scratched_regs; + /* Same as scratched_regs but for stack slots */ + u64 scratched_stack_slots; + u32 prev_log_len, prev_insn_print_len; + /* buffer used in reg_type_str() to generate reg_type string */ + char type_str_buf[TYPE_STR_BUF_LEN]; }; __printf(2, 0) void bpf_verifier_vlog(struct bpf_verifier_log *log, @@ -536,5 +550,18 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, struct bpf_attach_target_info *tgt_info); void bpf_free_kfunc_btf_tab(struct bpf_kfunc_btf_tab *tab); +#define BPF_BASE_TYPE_MASK GENMASK(BPF_BASE_TYPE_BITS - 1, 0) + +/* extract base type from bpf_{arg, return, reg}_type. */ +static inline u32 base_type(u32 type) +{ + return type & BPF_BASE_TYPE_MASK; +} + +/* extract flags from an extended type. See bpf_type_flag in bpf.h. */ +static inline u32 type_flag(u32 type) +{ + return type & ~BPF_BASE_TYPE_MASK; +} #endif /* _LINUX_BPF_VERIFIER_H */ diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h index db2e147e069f..411684c80cf3 100644 --- a/include/linux/cgroup-defs.h +++ b/include/linux/cgroup-defs.h @@ -19,7 +19,7 @@ #include <linux/percpu-rwsem.h> #include <linux/u64_stats_sync.h> #include <linux/workqueue.h> -#include <linux/bpf-cgroup.h> +#include <linux/bpf-cgroup-defs.h> #include <linux/psi_types.h> #ifdef CONFIG_CGROUPS diff --git a/include/linux/dsa/loop.h b/include/linux/dsa/loop.h index 5a3470bcc8a7..b8fef35591aa 100644 --- a/include/linux/dsa/loop.h +++ b/include/linux/dsa/loop.h @@ -2,6 +2,7 @@ #ifndef DSA_LOOP_H #define DSA_LOOP_H +#include <linux/if_vlan.h> #include <linux/types.h> #include <linux/ethtool.h> #include <net/dsa.h> diff --git a/include/linux/filter.h b/include/linux/filter.h index 68c6b5c208e7..60eec80fa1d4 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1027,7 +1027,7 @@ void xdp_do_flush(void); */ #define xdp_do_flush_map xdp_do_flush -void bpf_warn_invalid_xdp_action(u32 act); +void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog, u32 act); #ifdef CONFIG_INET struct sock *bpf_run_sk_reuseport(struct sock_reuseport *reuse, struct sock *sk, diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 0dcfd265beed..4a021149eaf0 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -611,6 +611,7 @@ struct swevent_hlist { #define PERF_ATTACH_SCHED_CB 0x20 #define PERF_ATTACH_CHILD 0x40 +struct bpf_prog; struct perf_cgroup; struct perf_buffer; diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index 83b8070d1cc9..a9a4ccc0cdb5 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -20,6 +20,7 @@ #include <net/inetpeer.h> #include <net/fib_notifier.h> #include <linux/indirect_call_wrapper.h> +#include <uapi/linux/bpf.h> #ifdef CONFIG_IPV6_MULTIPLE_TABLES #define FIB6_TABLE_HASHSZ 256 diff --git a/include/net/ipv6.h b/include/net/ipv6.h index 53ac7707ca70..3afcb128e064 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -21,6 +21,8 @@ #include <net/snmp.h> #include <net/netns/hash.h> +struct ip_tunnel_info; + #define SIN6_LEN_RFC2133 24 #define IPV6_MAXPLEN 65535 diff --git a/include/net/netns/bpf.h b/include/net/netns/bpf.h index 0ca6a1b87185..2c01a278d1eb 100644 --- a/include/net/netns/bpf.h +++ b/include/net/netns/bpf.h @@ -6,11 +6,18 @@ #ifndef __NETNS_BPF_H__ #define __NETNS_BPF_H__ -#include <linux/bpf-netns.h> +#include <linux/list.h> struct bpf_prog; struct bpf_prog_array; +enum netns_bpf_attach_type { + NETNS_BPF_INVALID = -1, + NETNS_BPF_FLOW_DISSECTOR = 0, + NETNS_BPF_SK_LOOKUP, + MAX_NETNS_BPF_ATTACH_TYPE +}; + struct netns_bpf { /* Array of programs to run compiled from progs or links */ struct bpf_prog_array __rcu *run_array[MAX_NETNS_BPF_ATTACH_TYPE]; diff --git a/include/net/route.h b/include/net/route.h index 2e6c0e153e3a..4c858dcf1aa8 100644 --- a/include/net/route.h +++ b/include/net/route.h @@ -43,6 +43,7 @@ #define RT_CONN_FLAGS(sk) (RT_TOS(inet_sk(sk)->tos) | sock_flag(sk, SOCK_LOCALROUTE)) #define RT_CONN_FLAGS_TOS(sk,tos) (RT_TOS(tos) | sock_flag(sk, SOCK_LOCALROUTE)) +struct ip_tunnel_info; struct fib_nh; struct fib_info; struct uncached_list; diff --git a/include/net/sock.h b/include/net/sock.h index 44cc25f0bae7..7b4b4237e6e0 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -56,7 +56,6 @@ #include <linux/wait.h> #include <linux/cgroup-defs.h> #include <linux/rbtree.h> -#include <linux/filter.h> #include <linux/rculist_nulls.h> #include <linux/poll.h> #include <linux/sockptr.h> @@ -249,6 +248,7 @@ struct sock_common { }; struct bpf_local_storage; +struct sk_filter; /** * struct sock - network layer representation of sockets diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h index a9d5b7603b89..a2d58b1a12e1 100644 --- a/include/net/xdp_priv.h +++ b/include/net/xdp_priv.h @@ -10,7 +10,6 @@ struct xdp_mem_allocator { union { void *allocator; struct page_pool *page_pool; - struct zero_copy_allocator *zc_alloc; }; struct rhash_head node; struct rcu_head rcu; diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index fff069d2ed1b..3057e1a4a11c 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -6,6 +6,7 @@ #ifndef _LINUX_XDP_SOCK_H #define _LINUX_XDP_SOCK_H +#include <linux/bpf.h> #include <linux/workqueue.h> #include <linux/if_xdp.h> #include <linux/mutex.h> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c26871263f1f..b0383d371b9a 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -4983,6 +4983,41 @@ union bpf_attr { * Return * The number of loops performed, **-EINVAL** for invalid **flags**, * **-E2BIG** if **nr_loops** exceeds the maximum number of loops. + * + * long bpf_strncmp(const char *s1, u32 s1_sz, const char *s2) + * Description + * Do strncmp() between **s1** and **s2**. **s1** doesn't need + * to be null-terminated and **s1_sz** is the maximum storage + * size of **s1**. **s2** must be a read-only string. + * Return + * An integer less than, equal to, or greater than zero + * if the first **s1_sz** bytes of **s1** is found to be + * less than, to match, or be greater than **s2**. + * + * long bpf_get_func_arg(void *ctx, u32 n, u64 *value) + * Description + * Get **n**-th argument (zero based) of the traced function (for tracing programs) + * returned in **value**. + * + * Return + * 0 on success. + * **-EINVAL** if n >= arguments count of traced function. + * + * long bpf_get_func_ret(void *ctx, u64 *value) + * Description + * Get return value of the traced function (for tracing programs) + * in **value**. + * + * Return + * 0 on success. + * **-EOPNOTSUPP** for tracing programs other than BPF_TRACE_FEXIT or BPF_MODIFY_RETURN. + * + * long bpf_get_func_arg_cnt(void *ctx) + * Description + * Get number of arguments of the traced function (for tracing programs). + * + * Return + * The number of arguments of the traced function. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -5167,6 +5202,10 @@ union bpf_attr { FN(kallsyms_lookup_name), \ FN(find_vma), \ FN(loop), \ + FN(strncmp), \ + FN(get_func_arg), \ + FN(get_func_ret), \ + FN(get_func_arg_cnt), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/bloom_filter.c b/kernel/bpf/bloom_filter.c index 277a05e9c984..b141a1346f72 100644 --- a/kernel/bpf/bloom_filter.c +++ b/kernel/bpf/bloom_filter.c @@ -82,6 +82,11 @@ static int bloom_map_delete_elem(struct bpf_map *map, void *value) return -EOPNOTSUPP; } +static int bloom_map_get_next_key(struct bpf_map *map, void *key, void *next_key) +{ + return -EOPNOTSUPP; +} + static struct bpf_map *bloom_map_alloc(union bpf_attr *attr) { u32 bitset_bytes, bitset_mask, nr_hash_funcs, nr_bits; @@ -192,6 +197,7 @@ const struct bpf_map_ops bloom_filter_map_ops = { .map_meta_equal = bpf_map_meta_equal, .map_alloc = bloom_map_alloc, .map_free = bloom_map_free, + .map_get_next_key = bloom_map_get_next_key, .map_push_elem = bloom_map_push_elem, .map_peek_elem = bloom_map_peek_elem, .map_pop_elem = bloom_map_pop_elem, diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c index 96ceed0e0fb5..e29d9e3d853e 100644 --- a/kernel/bpf/bpf_inode_storage.c +++ b/kernel/bpf/bpf_inode_storage.c @@ -17,6 +17,7 @@ #include <linux/bpf_lsm.h> #include <linux/btf_ids.h> #include <linux/fdtable.h> +#include <linux/rcupdate_trace.h> DEFINE_BPF_STORAGE_CACHE(inode_cache); @@ -44,7 +45,8 @@ static struct bpf_local_storage_data *inode_storage_lookup(struct inode *inode, if (!bsb) return NULL; - inode_storage = rcu_dereference(bsb->storage); + inode_storage = + rcu_dereference_check(bsb->storage, bpf_rcu_lock_held()); if (!inode_storage) return NULL; @@ -172,6 +174,7 @@ BPF_CALL_4(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode, { struct bpf_local_storage_data *sdata; + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE)) return (unsigned long)NULL; @@ -204,6 +207,7 @@ BPF_CALL_4(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode, BPF_CALL_2(bpf_inode_storage_delete, struct bpf_map *, map, struct inode *, inode) { + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (!inode) return -EINVAL; diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c index b305270b7a4b..71de2a89869c 100644 --- a/kernel/bpf/bpf_local_storage.c +++ b/kernel/bpf/bpf_local_storage.c @@ -11,6 +11,9 @@ #include <net/sock.h> #include <uapi/linux/sock_diag.h> #include <uapi/linux/btf.h> +#include <linux/rcupdate.h> +#include <linux/rcupdate_trace.h> +#include <linux/rcupdate_wait.h> #define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK (BPF_F_NO_PREALLOC | BPF_F_CLONE) @@ -81,6 +84,22 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, return NULL; } +void bpf_local_storage_free_rcu(struct rcu_head *rcu) +{ + struct bpf_local_storage *local_storage; + + local_storage = container_of(rcu, struct bpf_local_storage, rcu); + kfree_rcu(local_storage, rcu); +} + +static void bpf_selem_free_rcu(struct rcu_head *rcu) +{ + struct bpf_local_storage_elem *selem; + + selem = container_of(rcu, struct bpf_local_storage_elem, rcu); + kfree_rcu(selem, rcu); +} + /* local_storage->lock must be held and selem->local_storage == local_storage. * The caller must ensure selem->smap is still valid to be * dereferenced for its smap->elem_size and smap->cache_idx. @@ -93,7 +112,7 @@ bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage, bool free_local_storage; void *owner; - smap = rcu_dereference(SDATA(selem)->smap); + smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held()); owner = local_storage->owner; /* All uncharging on the owner must be done first. @@ -118,12 +137,12 @@ bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage, * * Although the unlock will be done under * rcu_read_lock(), it is more intutivie to - * read if kfree_rcu(local_storage, rcu) is done + * read if the freeing of the storage is done * after the raw_spin_unlock_bh(&local_storage->lock). * * Hence, a "bool free_local_storage" is returned - * to the caller which then calls the kfree_rcu() - * after unlock. + * to the caller which then calls then frees the storage after + * all the RCU grace periods have expired. */ } hlist_del_init_rcu(&selem->snode); @@ -131,8 +150,7 @@ bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage, SDATA(selem)) RCU_INIT_POINTER(local_storage->cache[smap->cache_idx], NULL); - kfree_rcu(selem, rcu); - + call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_rcu); return free_local_storage; } @@ -146,7 +164,8 @@ static void __bpf_selem_unlink_storage(struct bpf_local_storage_elem *selem) /* selem has already been unlinked from sk */ return; - local_storage = rcu_dereference(selem->local_storage); + local_storage = rcu_dereference_check(selem->local_storage, + bpf_rcu_lock_held()); raw_spin_lock_irqsave(&local_storage->lock, flags); if (likely(selem_linked_to_storage(selem))) free_local_storage = bpf_selem_unlink_storage_nolock( @@ -154,7 +173,8 @@ static void __bpf_selem_unlink_storage(struct bpf_local_storage_elem *selem) raw_spin_unlock_irqrestore(&local_storage->lock, flags); if (free_local_storage) - kfree_rcu(local_storage, rcu); + call_rcu_tasks_trace(&local_storage->rcu, + bpf_local_storage_free_rcu); } void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage, @@ -174,7 +194,7 @@ void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem) /* selem has already be unlinked from smap */ return; - smap = rcu_dereference(SDATA(selem)->smap); + smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held()); b = select_bucket(smap, selem); raw_spin_lock_irqsave(&b->lock, flags); if (likely(selem_linked_to_map(selem))) @@ -213,12 +233,14 @@ bpf_local_storage_lookup(struct bpf_local_storage *local_storage, struct bpf_local_storage_elem *selem; /* Fast path (cache hit) */ - sdata = rcu_dereference(local_storage->cache[smap->cache_idx]); + sdata = rcu_dereference_check(local_storage->cache[smap->cache_idx], + bpf_rcu_lock_held()); if (sdata && rcu_access_pointer(sdata->smap) == smap) return sdata; /* Slow path (cache miss) */ - hlist_for_each_entry_rcu(selem, &local_storage->list, snode) + hlist_for_each_entry_rcu(selem, &local_storage->list, snode, + rcu_read_lock_trace_held()) if (rcu_access_pointer(SDATA(selem)->smap) == smap) break; @@ -306,7 +328,8 @@ int bpf_local_storage_alloc(void *owner, * bucket->list, first_selem can be freed immediately * (instead of kfree_rcu) because * bpf_local_storage_map_free() does a - * synchronize_rcu() before walking the bucket->list. + * synchronize_rcu_mult (waiting for both sleepable and + * normal programs) before walking the bucket->list. * Hence, no one is accessing selem from the * bucket->list under rcu_read_lock(). */ @@ -342,7 +365,8 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, !map_value_has_spin_lock(&smap->map))) return ERR_PTR(-EINVAL); - local_storage = rcu_dereference(*owner_storage(smap, owner)); + local_storage = rcu_dereference_check(*owner_storage(smap, owner), + bpf_rcu_lock_held()); if (!local_storage || hlist_empty(&local_storage->list)) { /* Very first elem for the owner */ err = check_flags(NULL, map_flags); diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c index bb69aea1a777..5da7bed0f5f6 100644 --- a/kernel/bpf/bpf_task_storage.c +++ b/kernel/bpf/bpf_task_storage.c @@ -17,6 +17,7 @@ #include <uapi/linux/btf.h> #include <linux/btf_ids.h> #include <linux/fdtable.h> +#include <linux/rcupdate_trace.h> DEFINE_BPF_STORAGE_CACHE(task_cache); @@ -59,7 +60,8 @@ task_storage_lookup(struct task_struct *task, struct bpf_map *map, struct bpf_local_storage *task_storage; struct bpf_local_storage_map *smap; - task_storage = rcu_dereference(task->bpf_storage); + task_storage = + rcu_dereference_check(task->bpf_storage, bpf_rcu_lock_held()); if (!task_storage) return NULL; @@ -229,6 +231,7 @@ BPF_CALL_4(bpf_task_storage_get, struct bpf_map *, map, struct task_struct *, { struct bpf_local_storage_data *sdata; + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE)) return (unsigned long)NULL; @@ -260,6 +263,7 @@ BPF_CALL_2(bpf_task_storage_delete, struct bpf_map *, map, struct task_struct *, { int ret; + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (!task) return -EINVAL; diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 8b00c6e4d6fb..33bb8ae4a804 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4826,7 +4826,7 @@ struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog) return prog->aux->attach_btf; } -static bool is_string_ptr(struct btf *btf, const struct btf_type *t) +static bool is_int_ptr(struct btf *btf, const struct btf_type *t) { /* t comes in already as a pointer */ t = btf_type_by_id(btf, t->type); @@ -4835,8 +4835,7 @@ static bool is_string_ptr(struct btf *btf, const struct btf_type *t) if (BTF_INFO_KIND(t->info) == BTF_KIND_CONST) t = btf_type_by_id(btf, t->type); - /* char, signed char, unsigned char */ - return btf_type_is_int(t) && t->size == 1; + return btf_type_is_int(t); } bool btf_ctx_access(int off, int size, enum bpf_access_type type, @@ -4941,10 +4940,12 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, /* check for PTR_TO_RDONLY_BUF_OR_NULL or PTR_TO_RDWR_BUF_OR_NULL */ for (i = 0; i < prog->aux->ctx_arg_info_size; i++) { const struct bpf_ctx_arg_aux *ctx_arg_info = &prog->aux->ctx_arg_info[i]; + u32 type, flag; - if (ctx_arg_info->offset == off && - (ctx_arg_info->reg_type == PTR_TO_RDONLY_BUF_OR_NULL || - ctx_arg_info->reg_type == PTR_TO_RDWR_BUF_OR_NULL)) { + type = base_type(ctx_arg_info->reg_type); + flag = type_flag(ctx_arg_info->reg_type); + if (ctx_arg_info->offset == off && type == PTR_TO_BUF && + (flag & PTR_MAYBE_NULL)) { info->reg_type = ctx_arg_info->reg_type; return true; } @@ -4957,7 +4958,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, */ return true; - if (is_string_ptr(btf, t)) + if (is_int_ptr(btf, t)) return true; /* this is a pointer to another type */ @@ -5575,12 +5576,53 @@ static u32 *reg2btf_ids[__BPF_REG_TYPE_MAX] = { #endif }; +/* Returns true if struct is composed of scalars, 4 levels of nesting allowed */ +static bool __btf_type_is_scalar_struct(struct bpf_verifier_log *log, + const struct btf *btf, + const struct btf_type *t, int rec) +{ + const struct btf_type *member_type; + const struct btf_member *member; + u32 i; + + if (!btf_type_is_struct(t)) + return false; + + for_each_member(i, t, member) { + const struct btf_array *array; + + member_type = btf_type_skip_modifiers(btf, member->type, NULL); + if (btf_type_is_struct(member_type)) { + if (rec >= 3) { + bpf_log(log, "max struct nesting depth exceeded\n"); + return false; + } + if (!__btf_type_is_scalar_struct(log, btf, member_type, rec + 1)) + return false; + continue; + } + if (btf_type_is_array(member_type)) { + array = btf_type_array(member_type); + if (!array->nelems) + return false; + member_type = btf_type_skip_modifiers(btf, array->type, NULL); + if (!btf_type_is_scalar(member_type)) + return false; + continue; + } + if (!btf_type_is_scalar(member_type)) + return false; + } + return true; +} + static int btf_check_func_arg_match(struct bpf_verifier_env *env, const struct btf *btf, u32 func_id, struct bpf_reg_state *regs, bool ptr_to_mem_ok) { struct bpf_verifier_log *log = &env->log; + bool is_kfunc = btf_is_kernel(btf); const char *func_name, *ref_tname; const struct btf_type *t, *ref_t; const struct btf_param *args; @@ -5633,7 +5675,20 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id); ref_tname = btf_name_by_offset(btf, ref_t->name_off); - if (btf_is_kernel(btf)) { + if (btf_get_prog_ctx_type(log, btf, t, + env->prog->type, i)) { + /* If function expects ctx type in BTF check that caller + * is passing PTR_TO_CTX. + */ + if (reg->type != PTR_TO_CTX) { + bpf_log(log, + "arg#%d expected pointer to ctx, but got %s\n", + i, btf_type_str(t)); + return -EINVAL; + } + if (check_ctx_reg(env, reg, regno)) + return -EINVAL; + } else if (is_kfunc && (reg->type == PTR_TO_BTF_ID || reg2btf_ids[reg->type])) { const struct btf_type *reg_ref_t; const struct btf *reg_btf; const char *reg_ref_tname; @@ -5649,14 +5704,9 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, if (reg->type == PTR_TO_BTF_ID) { reg_btf = reg->btf; reg_ref_id = reg->btf_id; - } else if (reg2btf_ids[reg->type]) { + } else { reg_btf = btf_vmlinux; reg_ref_id = *reg2btf_ids[reg->type]; - } else { - bpf_log(log, "kernel function %s args#%d expected pointer to %s %s but R%d is not a pointer to btf_id\n", - func_name, i, - btf_type_str(ref_t), ref_tname, regno); - return -EINVAL; } reg_ref_t = btf_type_skip_modifiers(reg_btf, reg_ref_id, @@ -5672,23 +5722,24 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, reg_ref_tname); return -EINVAL; } - } else if (btf_get_prog_ctx_type(log, btf, t, - env->prog->type, i)) { - /* If function expects ctx type in BTF check that caller - * is passing PTR_TO_CTX. - */ - if (reg->type != PTR_TO_CTX) { - bpf_log(log, - "arg#%d expected pointer to ctx, but got %s\n", - i, btf_type_str(t)); - return -EINVAL; - } - if (check_ctx_reg(env, reg, regno)) - return -EINVAL; } else if (ptr_to_mem_ok) { const struct btf_type *resolve_ret; u32 type_size; + if (is_kfunc) { + /* Permit pointer to mem, but only when argument + * type is pointer to scalar, or struct composed + * (recursively) of scalars. + */ + if (!btf_type_is_scalar(ref_t) && + !__btf_type_is_scalar_struct(log, btf, ref_t, 0)) { + bpf_log(log, + "arg#%d pointer type %s %s must point to scalar or struct with scalar\n", + i, btf_type_str(ref_t), ref_tname); + return -EINVAL; + } + } + resolve_ret = btf_resolve_size(btf, ref_t, &type_size); if (IS_ERR(resolve_ret)) { bpf_log(log, @@ -5701,6 +5752,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, if (check_mem_reg(env, reg, regno, type_size)) return -EINVAL; } else { + bpf_log(log, "reg type unsupported for arg#%d %sfunction %s#%d\n", i, + is_kfunc ? "kernel " : "", func_name, func_id); return -EINVAL; } } @@ -5750,7 +5803,7 @@ int btf_check_kfunc_arg_match(struct bpf_verifier_env *env, const struct btf *btf, u32 func_id, struct bpf_reg_state *regs) { - return btf_check_func_arg_match(env, btf, func_id, regs, false); + return btf_check_func_arg_match(env, btf, func_id, regs, true); } /* Convert BTF of a function into bpf_reg_state if possible @@ -5858,7 +5911,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, return -EINVAL; } - reg->type = PTR_TO_MEM_OR_NULL; + reg->type = PTR_TO_MEM | PTR_MAYBE_NULL; reg->id = ++env->id_gen; continue; @@ -6352,7 +6405,7 @@ const struct bpf_func_proto bpf_btf_find_by_name_kind_proto = { .func = bpf_btf_find_by_name_kind, .gpl_only = false, .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_MEM, + .arg1_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg2_type = ARG_CONST_SIZE, .arg3_type = ARG_ANYTHING, .arg4_type = ARG_ANYTHING, @@ -6534,12 +6587,11 @@ static struct bpf_cand_cache *populate_cand_cache(struct bpf_cand_cache *cands, bpf_free_cands_from_cache(*cc); *cc = NULL; } - new_cands = kmalloc(sizeof_cands(cands->cnt), GFP_KERNEL); + new_cands = kmemdup(cands, sizeof_cands(cands->cnt), GFP_KERNEL); if (!new_cands) { bpf_free_cands(cands); return ERR_PTR(-ENOMEM); } - memcpy(new_cands, cands, sizeof_cands(cands->cnt)); /* strdup the name, since it will stay in cache. * the cands->name points to strings in prog's BTF and the prog can be unloaded. */ @@ -6657,7 +6709,7 @@ bpf_core_find_cands(struct bpf_core_ctx *ctx, u32 local_type_id) main_btf = bpf_get_btf_vmlinux(); if (IS_ERR(main_btf)) - return (void *)main_btf; + return ERR_CAST(main_btf); local_type = btf_type_by_id(local_btf, local_type_id); if (!local_type) @@ -6684,14 +6736,14 @@ bpf_core_find_cands(struct bpf_core_ctx *ctx, u32 local_type_id) /* Attempt to find target candidates in vmlinux BTF first */ cands = bpf_core_add_cands(cands, main_btf, 1); if (IS_ERR(cands)) - return cands; + return ERR_CAST(cands); /* cands is a pointer to kmalloced memory here if cands->cnt > 0 */ /* populate cache even when cands->cnt == 0 */ cc = populate_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE); if (IS_ERR(cc)) - return cc; + return ERR_CAST(cc); /* if vmlinux BTF has any candidate, don't go for module BTFs */ if (cc->cnt) @@ -6717,7 +6769,7 @@ check_modules: cands = bpf_core_add_cands(cands, mod_btf, btf_nr_types(main_btf)); if (IS_ERR(cands)) { btf_put(mod_btf); - return cands; + return ERR_CAST(cands); } spin_lock_bh(&btf_idr_lock); btf_put(mod_btf); diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c index 43eb3501721b..514b4681a90a 100644 --- a/kernel/bpf/cgroup.c +++ b/kernel/bpf/cgroup.c @@ -1789,7 +1789,7 @@ static const struct bpf_func_proto bpf_sysctl_set_new_value_proto = { .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, }; diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index 585b2b77ccc4..0421061d95f1 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -195,7 +195,7 @@ static void cpu_map_bpf_prog_run_skb(struct bpf_cpu_map_entry *rcpu, } return; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(NULL, rcpu->prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(skb->dev, rcpu->prog, act); @@ -254,7 +254,7 @@ static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu, } break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(NULL, rcpu->prog, act); fallthrough; case XDP_DROP: xdp_return_frame(xdpf); diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index f02d04540c0c..6feea293ff10 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -348,7 +348,7 @@ static int dev_map_bpf_prog_run(struct bpf_prog *xdp_prog, frames[nframes++] = xdpf; break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(NULL, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(dev, xdp_prog, act); @@ -507,7 +507,7 @@ static u32 dev_map_bpf_prog_run_skb(struct sk_buff *skb, struct bpf_dtab_netdev __skb_push(skb, skb->mac_len); break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(NULL, dst->xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(dst->dev, dst->xdp_prog, act); diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 19fecfeaa9c2..01cfdf40c838 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -2,6 +2,7 @@ /* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com */ #include <linux/bpf.h> +#include <linux/bpf-cgroup.h> #include <linux/rcupdate.h> #include <linux/random.h> #include <linux/smp.h> @@ -530,7 +531,7 @@ const struct bpf_func_proto bpf_strtol_proto = { .func = bpf_strtol, .gpl_only = false, .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_MEM, + .arg1_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg2_type = ARG_CONST_SIZE, .arg3_type = ARG_ANYTHING, .arg4_type = ARG_PTR_TO_LONG, @@ -558,13 +559,27 @@ const struct bpf_func_proto bpf_strtoul_proto = { .func = bpf_strtoul, .gpl_only = false, .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_MEM, + .arg1_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg2_type = ARG_CONST_SIZE, .arg3_type = ARG_ANYTHING, .arg4_type = ARG_PTR_TO_LONG, }; #endif +BPF_CALL_3(bpf_strncmp, const char *, s1, u32, s1_sz, const char *, s2) +{ + return strncmp(s1, s2, s1_sz); +} + +const struct bpf_func_proto bpf_strncmp_proto = { + .func = bpf_strncmp, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_CONST_SIZE, + .arg3_type = ARG_PTR_TO_CONST_STR, +}; + BPF_CALL_4(bpf_get_ns_current_pid_tgid, u64, dev, u64, ino, struct bpf_pidns_info *, nsdata, u32, size) { @@ -630,7 +645,7 @@ const struct bpf_func_proto bpf_event_output_data_proto = { .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -667,7 +682,7 @@ BPF_CALL_2(bpf_per_cpu_ptr, const void *, ptr, u32, cpu) const struct bpf_func_proto bpf_per_cpu_ptr_proto = { .func = bpf_per_cpu_ptr, .gpl_only = false, - .ret_type = RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, + .ret_type = RET_PTR_TO_MEM_OR_BTF_ID | PTR_MAYBE_NULL | MEM_RDONLY, .arg1_type = ARG_PTR_TO_PERCPU_BTF_ID, .arg2_type = ARG_ANYTHING, }; @@ -680,7 +695,7 @@ BPF_CALL_1(bpf_this_cpu_ptr, const void *, percpu_ptr) const struct bpf_func_proto bpf_this_cpu_ptr_proto = { .func = bpf_this_cpu_ptr, .gpl_only = false, - .ret_type = RET_PTR_TO_MEM_OR_BTF_ID, + .ret_type = RET_PTR_TO_MEM_OR_BTF_ID | MEM_RDONLY, .arg1_type = ARG_PTR_TO_PERCPU_BTF_ID, }; @@ -1011,7 +1026,7 @@ const struct bpf_func_proto bpf_snprintf_proto = { .arg1_type = ARG_PTR_TO_MEM_OR_NULL, .arg2_type = ARG_CONST_SIZE_OR_ZERO, .arg3_type = ARG_PTR_TO_CONST_STR, - .arg4_type = ARG_PTR_TO_MEM_OR_NULL, + .arg4_type = ARG_PTR_TO_MEM | PTR_MAYBE_NULL | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -1378,6 +1393,8 @@ bpf_base_func_proto(enum bpf_func_id func_id) return &bpf_for_each_map_elem_proto; case BPF_FUNC_loop: return &bpf_loop_proto; + case BPF_FUNC_strncmp: + return &bpf_strncmp_proto; default: break; } diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c index 035e9e3a7132..23f7f9d08a62 100644 --- a/kernel/bpf/local_storage.c +++ b/kernel/bpf/local_storage.c @@ -163,8 +163,7 @@ static int cgroup_storage_update_elem(struct bpf_map *map, void *key, return 0; } - new = bpf_map_kmalloc_node(map, sizeof(struct bpf_storage_buffer) + - map->value_size, + new = bpf_map_kmalloc_node(map, struct_size(new, data, map->value_size), __GFP_ZERO | GFP_ATOMIC | __GFP_NOWARN, map->numa_node); if (!new) diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index 423549d2c52e..5763cc7ac4f1 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -412,7 +412,7 @@ static int trie_update_elem(struct bpf_map *map, rcu_assign_pointer(im_node->child[1], node); } - /* Finally, assign the intermediate node to the determined spot */ + /* Finally, assign the intermediate node to the determined slot */ rcu_assign_pointer(*slot, im_node); out: diff --git a/kernel/bpf/map_iter.c b/kernel/bpf/map_iter.c index 6a9542af4212..b0fa190b0979 100644 --- a/kernel/bpf/map_iter.c +++ b/kernel/bpf/map_iter.c @@ -174,9 +174,9 @@ static const struct bpf_iter_reg bpf_map_elem_reg_info = { .ctx_arg_info_size = 2, .ctx_arg_info = { { offsetof(struct bpf_iter__bpf_map_elem, key), - PTR_TO_RDONLY_BUF_OR_NULL }, + PTR_TO_BUF | PTR_MAYBE_NULL | MEM_RDONLY }, { offsetof(struct bpf_iter__bpf_map_elem, value), - PTR_TO_RDWR_BUF_OR_NULL }, + PTR_TO_BUF | PTR_MAYBE_NULL }, }, }; diff --git a/kernel/bpf/net_namespace.c b/kernel/bpf/net_namespace.c index 542f275bf252..868cc2c43899 100644 --- a/kernel/bpf/net_namespace.c +++ b/kernel/bpf/net_namespace.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/bpf.h> +#include <linux/bpf-netns.h> #include <linux/filter.h> #include <net/net_namespace.h> diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c index 93a55391791a..556a769b5b80 100644 --- a/kernel/bpf/reuseport_array.c +++ b/kernel/bpf/reuseport_array.c @@ -152,16 +152,12 @@ static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr) { int numa_node = bpf_map_attr_numa_node(attr); struct reuseport_array *array; - u64 array_size; if (!bpf_capable()) return ERR_PTR(-EPERM); - array_size = sizeof(*array); - array_size += (u64)attr->max_entries * sizeof(struct sock *); - /* allocate all map elements and zero-initialize them */ - array = bpf_map_area_alloc(array_size, numa_node); + array = bpf_map_area_alloc(struct_size(array, ptrs, attr->max_entries), numa_node); if (!array) return ERR_PTR(-ENOMEM); diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c index 9e0c10c6892a..638d7fd7b375 100644 --- a/kernel/bpf/ringbuf.c +++ b/kernel/bpf/ringbuf.c @@ -444,7 +444,7 @@ const struct bpf_func_proto bpf_ringbuf_output_proto = { .func = bpf_ringbuf_output, .ret_type = RET_INTEGER, .arg1_type = ARG_CONST_MAP_PTR, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE_OR_ZERO, .arg4_type = ARG_ANYTHING, }; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index ddd81d543203..fa4505f9b611 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2,6 +2,7 @@ /* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com */ #include <linux/bpf.h> +#include <linux/bpf-cgroup.h> #include <linux/bpf_trace.h> #include <linux/bpf_lirc.h> #include <linux/bpf_verifier.h> @@ -4772,7 +4773,7 @@ static const struct bpf_func_proto bpf_sys_bpf_proto = { .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_ANYTHING, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, }; diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index e98de5e73ba5..4b6974a195c1 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -27,6 +27,14 @@ static struct hlist_head trampoline_table[TRAMPOLINE_TABLE_SIZE]; /* serializes access to trampoline_table */ static DEFINE_MUTEX(trampoline_mutex); +bool bpf_prog_has_trampoline(const struct bpf_prog *prog) +{ + enum bpf_attach_type eatype = prog->expected_attach_type; + + return eatype == BPF_TRACE_FENTRY || eatype == BPF_TRACE_FEXIT || + eatype == BPF_MODIFY_RETURN; +} + void *bpf_jit_alloc_exec_page(void) { void *image; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index cd79bb12d076..b70c66c6db3b 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4,6 +4,7 @@ * Copyright (c) 2018 Covalent IO, Inc. http://covalent.io */ #include <uapi/linux/btf.h> +#include <linux/bpf-cgroup.h> #include <linux/kernel.h> #include <linux/types.h> #include <linux/slab.h> @@ -441,18 +442,6 @@ static bool reg_type_not_null(enum bpf_reg_type type) type == PTR_TO_SOCK_COMMON; } -static bool reg_type_may_be_null(enum bpf_reg_type type) -{ - return type == PTR_TO_MAP_VALUE_OR_NULL || - type == PTR_TO_SOCKET_OR_NULL || - type == PTR_TO_SOCK_COMMON_OR_NULL || - type == PTR_TO_TCP_SOCK_OR_NULL || - type == PTR_TO_BTF_ID_OR_NULL || - type == PTR_TO_MEM_OR_NULL || - type == PTR_TO_RDONLY_BUF_OR_NULL || - type == PTR_TO_RDWR_BUF_OR_NULL; -} - static bool reg_may_point_to_spin_lock(const struct bpf_reg_state *reg) { return reg->type == PTR_TO_MAP_VALUE && @@ -461,12 +450,14 @@ static bool reg_may_point_to_spin_lock(const struct bpf_reg_state *reg) static bool reg_type_may_be_refcounted_or_null(enum bpf_reg_type type) { - return type == PTR_TO_SOCKET || - type == PTR_TO_SOCKET_OR_NULL || - type == PTR_TO_TCP_SOCK || - type == PTR_TO_TCP_SOCK_OR_NULL || - type == PTR_TO_MEM || - type == PTR_TO_MEM_OR_NULL; + return base_type(type) == PTR_TO_SOCKET || + base_type(type) == PTR_TO_TCP_SOCK || + base_type(type) == PTR_TO_MEM; +} + +static bool type_is_rdonly_mem(u32 type) +{ + return type & MEM_RDONLY; } static bool arg_type_may_be_refcounted(enum bpf_arg_type type) @@ -474,14 +465,9 @@ static bool arg_type_may_be_refcounted(enum bpf_arg_type type) return type == ARG_PTR_TO_SOCK_COMMON; } -static bool arg_type_may_be_null(enum bpf_arg_type type) +static bool type_may_be_null(u32 type) { - return type == ARG_PTR_TO_MAP_VALUE_OR_NULL || - type == ARG_PTR_TO_MEM_OR_NULL || - type == ARG_PTR_TO_CTX_OR_NULL || - type == ARG_PTR_TO_SOCKET_OR_NULL || - type == ARG_PTR_TO_ALLOC_MEM_OR_NULL || - type == ARG_PTR_TO_STACK_OR_NULL; + return type & PTR_MAYBE_NULL; } /* Determine whether the function releases some resources allocated by another @@ -541,39 +527,54 @@ static bool is_cmpxchg_insn(const struct bpf_insn *insn) insn->imm == BPF_CMPXCHG; } -/* string representation of 'enum bpf_reg_type' */ -static const char * const reg_type_str[] = { - [NOT_INIT] = "?", - [SCALAR_VALUE] = "inv", - [PTR_TO_CTX] = "ctx", - [CONST_PTR_TO_MAP] = "map_ptr", - [PTR_TO_MAP_VALUE] = "map_value", - [PTR_TO_MAP_VALUE_OR_NULL] = "map_value_or_null", - [PTR_TO_STACK] = "fp", - [PTR_TO_PACKET] = "pkt", - [PTR_TO_PACKET_META] = "pkt_meta", - [PTR_TO_PACKET_END] = "pkt_end", - [PTR_TO_FLOW_KEYS] = "flow_keys", - [PTR_TO_SOCKET] = "sock", - [PTR_TO_SOCKET_OR_NULL] = "sock_or_null", - [PTR_TO_SOCK_COMMON] = "sock_common", - [PTR_TO_SOCK_COMMON_OR_NULL] = "sock_common_or_null", - [PTR_TO_TCP_SOCK] = "tcp_sock", - [PTR_TO_TCP_SOCK_OR_NULL] = "tcp_sock_or_null", - [PTR_TO_TP_BUFFER] = "tp_buffer", - [PTR_TO_XDP_SOCK] = "xdp_sock", - [PTR_TO_BTF_ID] = "ptr_", - [PTR_TO_BTF_ID_OR_NULL] = "ptr_or_null_", - [PTR_TO_PERCPU_BTF_ID] = "percpu_ptr_", - [PTR_TO_MEM] = "mem", - [PTR_TO_MEM_OR_NULL] = "mem_or_null", - [PTR_TO_RDONLY_BUF] = "rdonly_buf", - [PTR_TO_RDONLY_BUF_OR_NULL] = "rdonly_buf_or_null", - [PTR_TO_RDWR_BUF] = "rdwr_buf", - [PTR_TO_RDWR_BUF_OR_NULL] = "rdwr_buf_or_null", - [PTR_TO_FUNC] = "func", - [PTR_TO_MAP_KEY] = "map_key", -}; +/* string representation of 'enum bpf_reg_type' + * + * Note that reg_type_str() can not appear more than once in a single verbose() + * statement. + */ +static const char *reg_type_str(struct bpf_verifier_env *env, + enum bpf_reg_type type) +{ + char postfix[16] = {0}, prefix[16] = {0}; + static const char * const str[] = { + [NOT_INIT] = "?", + [SCALAR_VALUE] = "inv", + [PTR_TO_CTX] = "ctx", + [CONST_PTR_TO_MAP] = "map_ptr", + [PTR_TO_MAP_VALUE] = "map_value", + [PTR_TO_STACK] = "fp", + [PTR_TO_PACKET] = "pkt", + [PTR_TO_PACKET_META] = "pkt_meta", + [PTR_TO_PACKET_END] = "pkt_end", + [PTR_TO_FLOW_KEYS] = "flow_keys", + [PTR_TO_SOCKET] = "sock", + [PTR_TO_SOCK_COMMON] = "sock_common", + [PTR_TO_TCP_SOCK] = "tcp_sock", + [PTR_TO_TP_BUFFER] = "tp_buffer", + [PTR_TO_XDP_SOCK] = "xdp_sock", + [PTR_TO_BTF_ID] = "ptr_", + [PTR_TO_PERCPU_BTF_ID] = "percpu_ptr_", + [PTR_TO_MEM] = "mem", + [PTR_TO_BUF] = "buf", + [PTR_TO_FUNC] = "func", + [PTR_TO_MAP_KEY] = "map_key", + }; + + if (type & PTR_MAYBE_NULL) { + if (base_type(type) == PTR_TO_BTF_ID || + base_type(type) == PTR_TO_PERCPU_BTF_ID) + strncpy(postfix, "or_null_", 16); + else + strncpy(postfix, "_or_null", 16); + } + + if (type & MEM_RDONLY) + strncpy(prefix, "rdonly_", 16); + + snprintf(env->type_str_buf, TYPE_STR_BUF_LEN, "%s%s%s", + prefix, str[base_type(type)], postfix); + return env->type_str_buf; +} static char slot_type_char[] = { [STACK_INVALID] = '?', @@ -608,6 +609,44 @@ static const char *kernel_type_name(const struct btf* btf, u32 id) return btf_name_by_offset(btf, btf_type_by_id(btf, id)->name_off); } +static void mark_reg_scratched(struct bpf_verifier_env *env, u32 regno) +{ + env->scratched_regs |= 1U << regno; +} + +static void mark_stack_slot_scratched(struct bpf_verifier_env *env, u32 spi) +{ + env->scratched_stack_slots |= 1UL << spi; +} + +static bool reg_scratched(const struct bpf_verifier_env *env, u32 regno) +{ + return (env->scratched_regs >> regno) & 1; +} + +static bool stack_slot_scratched(const struct bpf_verifier_env *env, u64 regno) +{ + return (env->scratched_stack_slots >> regno) & 1; +} + +static bool verifier_state_scratched(const struct bpf_verifier_env *env) +{ + return env->scratched_regs || env->scratched_stack_slots; +} + +static void mark_verifier_state_clean(struct bpf_verifier_env *env) +{ + env->scratched_regs = 0U; + env->scratched_stack_slots = 0UL; +} + +/* Used for printing the entire verifier state. */ +static void mark_verifier_state_scratched(struct bpf_verifier_env *env) +{ + env->scratched_regs = ~0U; + env->scratched_stack_slots = ~0UL; +} + /* The reg state of a pointer or a bounded scalar was saved when * it was spilled to the stack. */ @@ -623,7 +662,8 @@ static void scrub_spilled_slot(u8 *stype) } static void print_verifier_state(struct bpf_verifier_env *env, - const struct bpf_func_state *state) + const struct bpf_func_state *state, + bool print_all) { const struct bpf_reg_state *reg; enum bpf_reg_type t; @@ -636,9 +676,11 @@ static void print_verifier_state(struct bpf_verifier_env *env, t = reg->type; if (t == NOT_INIT) continue; + if (!print_all && !reg_scratched(env, i)) + continue; verbose(env, " R%d", i); print_liveness(env, reg->live); - verbose(env, "=%s", reg_type_str[t]); + verbose(env, "=%s", reg_type_str(env, t)); if (t == SCALAR_VALUE && reg->precise) verbose(env, "P"); if ((t == SCALAR_VALUE || t == PTR_TO_STACK) && @@ -646,9 +688,8 @@ static void print_verifier_state(struct bpf_verifier_env *env, /* reg->off should be 0 for SCALAR_VALUE */ verbose(env, "%lld", reg->var_off.value + reg->off); } else { - if (t == PTR_TO_BTF_ID || - t == PTR_TO_BTF_ID_OR_NULL || - t == PTR_TO_PERCPU_BTF_ID) + if (base_type(t) == PTR_TO_BTF_ID || + base_type(t) == PTR_TO_PERCPU_BTF_ID) verbose(env, "%s", kernel_type_name(reg->btf, reg->btf_id)); verbose(env, "(id=%d", reg->id); if (reg_type_may_be_refcounted_or_null(t)) @@ -657,10 +698,9 @@ static void print_verifier_state(struct bpf_verifier_env *env, verbose(env, ",off=%d", reg->off); if (type_is_pkt_pointer(t)) verbose(env, ",r=%d", reg->range); - else if (t == CONST_PTR_TO_MAP || - t == PTR_TO_MAP_KEY || - t == PTR_TO_MAP_VALUE || - t == PTR_TO_MAP_VALUE_OR_NULL) + else if (base_type(t) == CONST_PTR_TO_MAP || + base_type(t) == PTR_TO_MAP_KEY || + base_type(t) == PTR_TO_MAP_VALUE) verbose(env, ",ks=%d,vs=%d", reg->map_ptr->key_size, reg->map_ptr->value_size); @@ -725,12 +765,14 @@ static void print_verifier_state(struct bpf_verifier_env *env, types_buf[BPF_REG_SIZE] = 0; if (!valid) continue; + if (!print_all && !stack_slot_scratched(env, i)) + continue; verbose(env, " fp%d", (-i - 1) * BPF_REG_SIZE); print_liveness(env, state->stack[i].spilled_ptr.live); if (is_spilled_reg(&state->stack[i])) { reg = &state->stack[i].spilled_ptr; t = reg->type; - verbose(env, "=%s", reg_type_str[t]); + verbose(env, "=%s", reg_type_str(env, t)); if (t == SCALAR_VALUE && reg->precise) verbose(env, "P"); if (t == SCALAR_VALUE && tnum_is_const(reg->var_off)) @@ -750,6 +792,26 @@ static void print_verifier_state(struct bpf_verifier_env *env, if (state->in_async_callback_fn) verbose(env, " async_cb"); verbose(env, "\n"); + mark_verifier_state_clean(env); +} + +static inline u32 vlog_alignment(u32 pos) +{ + return round_up(max(pos + BPF_LOG_MIN_ALIGNMENT / 2, BPF_LOG_ALIGNMENT), + BPF_LOG_MIN_ALIGNMENT) - pos - 1; +} + +static void print_insn_state(struct bpf_verifier_env *env, + const struct bpf_func_state *state) +{ + if (env->prev_log_len && env->prev_log_len == env->log.len_used) { + /* remove new line character */ + bpf_vlog_reset(&env->log, env->prev_log_len - 1); + verbose(env, "%*c;", vlog_alignment(env->prev_insn_print_len), ' '); + } else { + verbose(env, "%d:", env->insn_idx); + } + print_verifier_state(env, state, false); } /* copy array src of length n * size bytes to dst. dst is reallocated if it's too @@ -1143,8 +1205,7 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env, static void mark_ptr_not_null_reg(struct bpf_reg_state *reg) { - switch (reg->type) { - case PTR_TO_MAP_VALUE_OR_NULL: { + if (base_type(reg->type) == PTR_TO_MAP_VALUE) { const struct bpf_map *map = reg->map_ptr; if (map->inner_map_meta) { @@ -1163,32 +1224,10 @@ static void mark_ptr_not_null_reg(struct bpf_reg_state *reg) } else { reg->type = PTR_TO_MAP_VALUE; } - break; - } - case PTR_TO_SOCKET_OR_NULL: - reg->type = PTR_TO_SOCKET; - break; - case PTR_TO_SOCK_COMMON_OR_NULL: - reg->type = PTR_TO_SOCK_COMMON; - break; - case PTR_TO_TCP_SOCK_OR_NULL: - reg->type = PTR_TO_TCP_SOCK; - break; - case PTR_TO_BTF_ID_OR_NULL: - reg->type = PTR_TO_BTF_ID; - break; - case PTR_TO_MEM_OR_NULL: - reg->type = PTR_TO_MEM; - break; - case PTR_TO_RDONLY_BUF_OR_NULL: - reg->type = PTR_TO_RDONLY_BUF; - break; - case PTR_TO_RDWR_BUF_OR_NULL: - reg->type = PTR_TO_RDWR_BUF; - break; - default: - WARN_ONCE(1, "unknown nullable register type"); + return; } + + reg->type &= ~PTR_MAYBE_NULL; } static bool reg_is_pkt_pointer(const struct bpf_reg_state *reg) @@ -1546,6 +1585,7 @@ static void init_func_state(struct bpf_verifier_env *env, state->frameno = frameno; state->subprogno = subprogno; init_reg_state(env, state); + mark_verifier_state_scratched(env); } /* Similar to push_stack(), but for async callbacks */ @@ -2049,7 +2089,7 @@ static int mark_reg_read(struct bpf_verifier_env *env, break; if (parent->live & REG_LIVE_DONE) { verbose(env, "verifier BUG type %s var_off %lld off %d\n", - reg_type_str[parent->type], + reg_type_str(env, parent->type), parent->var_off.value, parent->off); return -EFAULT; } @@ -2233,6 +2273,8 @@ static int check_reg_arg(struct bpf_verifier_env *env, u32 regno, return -EINVAL; } + mark_reg_scratched(env, regno); + reg = ®s[regno]; rw64 = is_reg64(env, insn, regno, reg, t); if (t == SRC_OP) { @@ -2337,7 +2379,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, if (insn->code == 0) return 0; - if (env->log.level & BPF_LOG_LEVEL) { + if (env->log.level & BPF_LOG_LEVEL2) { verbose(env, "regs=%x stack=%llx before ", *reg_mask, *stack_mask); verbose(env, "%d: ", idx); print_bpf_insn(&cbs, insn, env->allow_ptr_leaks); @@ -2591,7 +2633,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno, DECLARE_BITMAP(mask, 64); u32 history = st->jmp_history_cnt; - if (env->log.level & BPF_LOG_LEVEL) + if (env->log.level & BPF_LOG_LEVEL2) verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx); for (i = last_idx;;) { if (skip_first) { @@ -2678,11 +2720,11 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno, new_marks = true; reg->precise = true; } - if (env->log.level & BPF_LOG_LEVEL) { - print_verifier_state(env, func); - verbose(env, "parent %s regs=%x stack=%llx marks\n", + if (env->log.level & BPF_LOG_LEVEL2) { + verbose(env, "parent %s regs=%x stack=%llx marks:", new_marks ? "didn't have" : "already had", reg_mask, stack_mask); + print_verifier_state(env, func, true); } if (!reg_mask && !stack_mask) @@ -2708,9 +2750,8 @@ static int mark_chain_precision_stack(struct bpf_verifier_env *env, int spi) static bool is_spillable_regtype(enum bpf_reg_type type) { - switch (type) { + switch (base_type(type)) { case PTR_TO_MAP_VALUE: - case PTR_TO_MAP_VALUE_OR_NULL: case PTR_TO_STACK: case PTR_TO_CTX: case PTR_TO_PACKET: @@ -2719,21 +2760,13 @@ static bool is_spillable_regtype(enum bpf_reg_type type) case PTR_TO_FLOW_KEYS: case CONST_PTR_TO_MAP: case PTR_TO_SOCKET: - case PTR_TO_SOCKET_OR_NULL: case PTR_TO_SOCK_COMMON: - case PTR_TO_SOCK_COMMON_OR_NULL: case PTR_TO_TCP_SOCK: - case PTR_TO_TCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: case PTR_TO_BTF_ID: - case PTR_TO_BTF_ID_OR_NULL: - case PTR_TO_RDONLY_BUF: - case PTR_TO_RDONLY_BUF_OR_NULL: - case PTR_TO_RDWR_BUF: - case PTR_TO_RDWR_BUF_OR_NULL: + case PTR_TO_BUF: case PTR_TO_PERCPU_BTF_ID: case PTR_TO_MEM: - case PTR_TO_MEM_OR_NULL: case PTR_TO_FUNC: case PTR_TO_MAP_KEY: return true; @@ -2838,6 +2871,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, env->insn_aux_data[insn_idx].sanitize_stack_spill = true; } + mark_stack_slot_scratched(env, spi); if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) && !register_is_null(reg) && env->bpf_capable) { if (dst_reg != BPF_REG_FP) { @@ -2959,6 +2993,7 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env, slot = -i - 1; spi = slot / BPF_REG_SIZE; stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE]; + mark_stack_slot_scratched(env, spi); if (!env->allow_ptr_leaks && *stype != NOT_INIT @@ -3375,11 +3410,8 @@ static int check_mem_region_access(struct bpf_verifier_env *env, u32 regno, /* We may have adjusted the register pointing to memory region, so we * need to try adding each of min_value and max_value to off * to make sure our theoretical access will be safe. - */ - if (env->log.level & BPF_LOG_LEVEL) - print_verifier_state(env, state); - - /* The minimum value is only important with signed + * + * The minimum value is only important with signed * comparisons where we can't assume the floor of a * value is 0. If we are using signed variables for our * index'es we need to make sure that whatever we use @@ -3574,7 +3606,7 @@ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off, */ *reg_type = info.reg_type; - if (*reg_type == PTR_TO_BTF_ID || *reg_type == PTR_TO_BTF_ID_OR_NULL) { + if (base_type(*reg_type) == PTR_TO_BTF_ID) { *btf = info.btf; *btf_id = info.btf_id; } else { @@ -3642,7 +3674,7 @@ static int check_sock_access(struct bpf_verifier_env *env, int insn_idx, } verbose(env, "R%d invalid %s access off=%d size=%d\n", - regno, reg_type_str[reg->type], off, size); + regno, reg_type_str(env, reg->type), off, size); return -EACCES; } @@ -4369,15 +4401,30 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn mark_reg_unknown(env, regs, value_regno); } } - } else if (reg->type == PTR_TO_MEM) { + } else if (base_type(reg->type) == PTR_TO_MEM) { + bool rdonly_mem = type_is_rdonly_mem(reg->type); + + if (type_may_be_null(reg->type)) { + verbose(env, "R%d invalid mem access '%s'\n", regno, + reg_type_str(env, reg->type)); + return -EACCES; + } + + if (t == BPF_WRITE && rdonly_mem) { + verbose(env, "R%d cannot write into %s\n", + regno, reg_type_str(env, reg->type)); + return -EACCES; + } + if (t == BPF_WRITE && value_regno >= 0 && is_pointer_value(env, value_regno)) { verbose(env, "R%d leaks addr into mem\n", value_regno); return -EACCES; } + err = check_mem_region_access(env, regno, off, size, reg->mem_size, false); - if (!err && t == BPF_READ && value_regno >= 0) + if (!err && value_regno >= 0 && (t == BPF_READ || rdonly_mem)) mark_reg_unknown(env, regs, value_regno); } else if (reg->type == PTR_TO_CTX) { enum bpf_reg_type reg_type = SCALAR_VALUE; @@ -4407,7 +4454,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn } else { mark_reg_known_zero(env, regs, value_regno); - if (reg_type_may_be_null(reg_type)) + if (type_may_be_null(reg_type)) regs[value_regno].id = ++env->id_gen; /* A load of ctx field could have different * actual load size with the one encoded in the @@ -4415,8 +4462,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn * a sub-register. */ regs[value_regno].subreg_def = DEF_NOT_SUBREG; - if (reg_type == PTR_TO_BTF_ID || - reg_type == PTR_TO_BTF_ID_OR_NULL) { + if (base_type(reg_type) == PTR_TO_BTF_ID) { regs[value_regno].btf = btf; regs[value_regno].btf_id = btf_id; } @@ -4469,7 +4515,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn } else if (type_is_sk_pointer(reg->type)) { if (t == BPF_WRITE) { verbose(env, "R%d cannot write into %s\n", - regno, reg_type_str[reg->type]); + regno, reg_type_str(env, reg->type)); return -EACCES; } err = check_sock_access(env, insn_idx, regno, off, size, t); @@ -4485,26 +4531,32 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn } else if (reg->type == CONST_PTR_TO_MAP) { err = check_ptr_to_map_access(env, regs, regno, off, size, t, value_regno); - } else if (reg->type == PTR_TO_RDONLY_BUF) { - if (t == BPF_WRITE) { - verbose(env, "R%d cannot write into %s\n", - regno, reg_type_str[reg->type]); - return -EACCES; + } else if (base_type(reg->type) == PTR_TO_BUF) { + bool rdonly_mem = type_is_rdonly_mem(reg->type); + const char *buf_info; + u32 *max_access; + + if (rdonly_mem) { + if (t == BPF_WRITE) { + verbose(env, "R%d cannot write into %s\n", + regno, reg_type_str(env, reg->type)); + return -EACCES; + } + buf_info = "rdonly"; + max_access = &env->prog->aux->max_rdonly_access; + } else { + buf_info = "rdwr"; + max_access = &env->prog->aux->max_rdwr_access; } + err = check_buffer_access(env, reg, regno, off, size, false, - "rdonly", - &env->prog->aux->max_rdonly_access); - if (!err && value_regno >= 0) - mark_reg_unknown(env, regs, value_regno); - } else if (reg->type == PTR_TO_RDWR_BUF) { - err = check_buffer_access(env, reg, regno, off, size, false, - "rdwr", - &env->prog->aux->max_rdwr_access); - if (!err && t == BPF_READ && value_regno >= 0) + buf_info, max_access); + + if (!err && value_regno >= 0 && (rdonly_mem || t == BPF_READ)) mark_reg_unknown(env, regs, value_regno); } else { verbose(env, "R%d invalid mem access '%s'\n", regno, - reg_type_str[reg->type]); + reg_type_str(env, reg->type)); return -EACCES; } @@ -4578,7 +4630,7 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i is_sk_reg(env, insn->dst_reg)) { verbose(env, "BPF_ATOMIC stores into R%d %s is not allowed\n", insn->dst_reg, - reg_type_str[reg_state(env, insn->dst_reg)->type]); + reg_type_str(env, reg_state(env, insn->dst_reg)->type)); return -EACCES; } @@ -4761,8 +4813,10 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, struct bpf_call_arg_meta *meta) { struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; + const char *buf_info; + u32 *max_access; - switch (reg->type) { + switch (base_type(reg->type)) { case PTR_TO_PACKET: case PTR_TO_PACKET_META: return check_packet_access(env, regno, reg->off, access_size, @@ -4781,18 +4835,20 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, return check_mem_region_access(env, regno, reg->off, access_size, reg->mem_size, zero_size_allowed); - case PTR_TO_RDONLY_BUF: - if (meta && meta->raw_mode) - return -EACCES; - return check_buffer_access(env, reg, regno, reg->off, - access_size, zero_size_allowed, - "rdonly", - &env->prog->aux->max_rdonly_access); - case PTR_TO_RDWR_BUF: + case PTR_TO_BUF: + if (type_is_rdonly_mem(reg->type)) { + if (meta && meta->raw_mode) + return -EACCES; + + buf_info = "rdonly"; + max_access = &env->prog->aux->max_rdonly_access; + } else { + buf_info = "rdwr"; + max_access = &env->prog->aux->max_rdwr_access; + } return check_buffer_access(env, reg, regno, reg->off, access_size, zero_size_allowed, - "rdwr", - &env->prog->aux->max_rdwr_access); + buf_info, max_access); case PTR_TO_STACK: return check_stack_range_initialized( env, @@ -4804,9 +4860,9 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno, register_is_null(reg)) return 0; - verbose(env, "R%d type=%s expected=%s\n", regno, - reg_type_str[reg->type], - reg_type_str[PTR_TO_STACK]); + verbose(env, "R%d type=%s ", regno, + reg_type_str(env, reg->type)); + verbose(env, "expected=%s\n", reg_type_str(env, PTR_TO_STACK)); return -EACCES; } } @@ -4817,7 +4873,7 @@ int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg, if (register_is_null(reg)) return 0; - if (reg_type_may_be_null(reg->type)) { + if (type_may_be_null(reg->type)) { /* Assuming that the register contains a value check if the memory * access is safe. Temporarily save and restore the register's state as * the conversion shouldn't be visible to a caller. @@ -4965,9 +5021,8 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno, static bool arg_type_is_mem_ptr(enum bpf_arg_type type) { - return type == ARG_PTR_TO_MEM || - type == ARG_PTR_TO_MEM_OR_NULL || - type == ARG_PTR_TO_UNINIT_MEM; + return base_type(type) == ARG_PTR_TO_MEM || + base_type(type) == ARG_PTR_TO_UNINIT_MEM; } static bool arg_type_is_mem_size(enum bpf_arg_type type) @@ -5072,8 +5127,7 @@ static const struct bpf_reg_types mem_types = { PTR_TO_MAP_KEY, PTR_TO_MAP_VALUE, PTR_TO_MEM, - PTR_TO_RDONLY_BUF, - PTR_TO_RDWR_BUF, + PTR_TO_BUF, }, }; @@ -5104,31 +5158,26 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = { [ARG_PTR_TO_MAP_KEY] = &map_key_value_types, [ARG_PTR_TO_MAP_VALUE] = &map_key_value_types, [ARG_PTR_TO_UNINIT_MAP_VALUE] = &map_key_value_types, - [ARG_PTR_TO_MAP_VALUE_OR_NULL] = &map_key_value_types, [ARG_CONST_SIZE] = &scalar_types, [ARG_CONST_SIZE_OR_ZERO] = &scalar_types, [ARG_CONST_ALLOC_SIZE_OR_ZERO] = &scalar_types, [ARG_CONST_MAP_PTR] = &const_map_ptr_types, [ARG_PTR_TO_CTX] = &context_types, - [ARG_PTR_TO_CTX_OR_NULL] = &context_types, [ARG_PTR_TO_SOCK_COMMON] = &sock_types, #ifdef CONFIG_NET [ARG_PTR_TO_BTF_ID_SOCK_COMMON] = &btf_id_sock_common_types, #endif [ARG_PTR_TO_SOCKET] = &fullsock_types, - [ARG_PTR_TO_SOCKET_OR_NULL] = &fullsock_types, [ARG_PTR_TO_BTF_ID] = &btf_ptr_types, [ARG_PTR_TO_SPIN_LOCK] = &spin_lock_types, [ARG_PTR_TO_MEM] = &mem_types, - [ARG_PTR_TO_MEM_OR_NULL] = &mem_types, [ARG_PTR_TO_UNINIT_MEM] = &mem_types, [ARG_PTR_TO_ALLOC_MEM] = &alloc_mem_types, - [ARG_PTR_TO_ALLOC_MEM_OR_NULL] = &alloc_mem_types, [ARG_PTR_TO_INT] = &int_ptr_types, [ARG_PTR_TO_LONG] = &int_ptr_types, [ARG_PTR_TO_PERCPU_BTF_ID] = &percpu_btf_ptr_types, [ARG_PTR_TO_FUNC] = &func_ptr_types, - [ARG_PTR_TO_STACK_OR_NULL] = &stack_ptr_types, + [ARG_PTR_TO_STACK] = &stack_ptr_types, [ARG_PTR_TO_CONST_STR] = &const_str_ptr_types, [ARG_PTR_TO_TIMER] = &timer_types, }; @@ -5142,12 +5191,27 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno, const struct bpf_reg_types *compatible; int i, j; - compatible = compatible_reg_types[arg_type]; + compatible = compatible_reg_types[base_type(arg_type)]; if (!compatible) { verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type); return -EFAULT; } + /* ARG_PTR_TO_MEM + RDONLY is compatible with PTR_TO_MEM and PTR_TO_MEM + RDONLY, + * but ARG_PTR_TO_MEM is compatible only with PTR_TO_MEM and NOT with PTR_TO_MEM + RDONLY + * + * Same for MAYBE_NULL: + * + * ARG_PTR_TO_MEM + MAYBE_NULL is compatible with PTR_TO_MEM and PTR_TO_MEM + MAYBE_NULL, + * but ARG_PTR_TO_MEM is compatible only with PTR_TO_MEM but NOT with PTR_TO_MEM + MAYBE_NULL + * + * Therefore we fold these flags depending on the arg_type before comparison. + */ + if (arg_type & MEM_RDONLY) + type &= ~MEM_RDONLY; + if (arg_type & PTR_MAYBE_NULL) + type &= ~PTR_MAYBE_NULL; + for (i = 0; i < ARRAY_SIZE(compatible->types); i++) { expected = compatible->types[i]; if (expected == NOT_INIT) @@ -5157,14 +5221,14 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno, goto found; } - verbose(env, "R%d type=%s expected=", regno, reg_type_str[type]); + verbose(env, "R%d type=%s expected=", regno, reg_type_str(env, reg->type)); for (j = 0; j + 1 < i; j++) - verbose(env, "%s, ", reg_type_str[compatible->types[j]]); - verbose(env, "%s\n", reg_type_str[compatible->types[j]]); + verbose(env, "%s, ", reg_type_str(env, compatible->types[j])); + verbose(env, "%s\n", reg_type_str(env, compatible->types[j])); return -EACCES; found: - if (type == PTR_TO_BTF_ID) { + if (reg->type == PTR_TO_BTF_ID) { if (!arg_btf_id) { if (!compatible->btf_id) { verbose(env, "verifier internal error: missing arg compatible BTF ID\n"); @@ -5223,15 +5287,14 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, return -EACCES; } - if (arg_type == ARG_PTR_TO_MAP_VALUE || - arg_type == ARG_PTR_TO_UNINIT_MAP_VALUE || - arg_type == ARG_PTR_TO_MAP_VALUE_OR_NULL) { + if (base_type(arg_type) == ARG_PTR_TO_MAP_VALUE || + base_type(arg_type) == ARG_PTR_TO_UNINIT_MAP_VALUE) { err = resolve_map_arg_type(env, meta, &arg_type); if (err) return err; } - if (register_is_null(reg) && arg_type_may_be_null(arg_type)) + if (register_is_null(reg) && type_may_be_null(arg_type)) /* A NULL register has a SCALAR_VALUE type, so skip * type checking. */ @@ -5300,10 +5363,11 @@ skip_type_check: err = check_helper_mem_access(env, regno, meta->map_ptr->key_size, false, NULL); - } else if (arg_type == ARG_PTR_TO_MAP_VALUE || - (arg_type == ARG_PTR_TO_MAP_VALUE_OR_NULL && - !register_is_null(reg)) || - arg_type == ARG_PTR_TO_UNINIT_MAP_VALUE) { + } else if (base_type(arg_type) == ARG_PTR_TO_MAP_VALUE || + base_type(arg_type) == ARG_PTR_TO_UNINIT_MAP_VALUE) { + if (type_may_be_null(arg_type) && register_is_null(reg)) + return 0; + /* bpf_map_xxx(..., map_ptr, ..., value) call: * check [value, value + map->value_size) validity */ @@ -6025,9 +6089,9 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn if (env->log.level & BPF_LOG_LEVEL) { verbose(env, "caller:\n"); - print_verifier_state(env, caller); + print_verifier_state(env, caller, true); verbose(env, "callee:\n"); - print_verifier_state(env, callee); + print_verifier_state(env, callee, true); } return 0; } @@ -6242,9 +6306,9 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) *insn_idx = callee->callsite + 1; if (env->log.level & BPF_LOG_LEVEL) { verbose(env, "returning from callee:\n"); - print_verifier_state(env, callee); + print_verifier_state(env, callee, true); verbose(env, "to caller at %d:\n", *insn_idx); - print_verifier_state(env, caller); + print_verifier_state(env, caller, true); } /* clear everything in the callee */ free_func_state(callee); @@ -6410,13 +6474,11 @@ static int check_bpf_snprintf_call(struct bpf_verifier_env *env, static int check_get_func_ip(struct bpf_verifier_env *env) { - enum bpf_attach_type eatype = env->prog->expected_attach_type; enum bpf_prog_type type = resolve_prog_type(env->prog); int func_id = BPF_FUNC_get_func_ip; if (type == BPF_PROG_TYPE_TRACING) { - if (eatype != BPF_TRACE_FENTRY && eatype != BPF_TRACE_FEXIT && - eatype != BPF_MODIFY_RETURN) { + if (!bpf_prog_has_trampoline(env->prog)) { verbose(env, "func %s#%d supported only for fentry/fexit/fmod_ret programs\n", func_id_name(func_id), func_id); return -ENOTSUPP; @@ -6435,6 +6497,8 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn int *insn_idx_p) { const struct bpf_func_proto *fn = NULL; + enum bpf_return_type ret_type; + enum bpf_type_flag ret_flag; struct bpf_reg_state *regs; struct bpf_call_arg_meta meta; int insn_idx = *insn_idx_p; @@ -6574,13 +6638,14 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn regs[BPF_REG_0].subreg_def = DEF_NOT_SUBREG; /* update return register (already marked as written above) */ - if (fn->ret_type == RET_INTEGER) { + ret_type = fn->ret_type; + ret_flag = type_flag(fn->ret_type); + if (ret_type == RET_INTEGER) { /* sets type to SCALAR_VALUE */ mark_reg_unknown(env, regs, BPF_REG_0); - } else if (fn->ret_type == RET_VOID) { + } else if (ret_type == RET_VOID) { regs[BPF_REG_0].type = NOT_INIT; - } else if (fn->ret_type == RET_PTR_TO_MAP_VALUE_OR_NULL || - fn->ret_type == RET_PTR_TO_MAP_VALUE) { + } else if (base_type(ret_type) == RET_PTR_TO_MAP_VALUE) { /* There is no offset yet applied, variable or fixed */ mark_reg_known_zero(env, regs, BPF_REG_0); /* remember map_ptr, so that check_map_access() @@ -6594,28 +6659,25 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn } regs[BPF_REG_0].map_ptr = meta.map_ptr; regs[BPF_REG_0].map_uid = meta.map_uid; - if (fn->ret_type == RET_PTR_TO_MAP_VALUE) { - regs[BPF_REG_0].type = PTR_TO_MAP_VALUE; - if (map_value_has_spin_lock(meta.map_ptr)) - regs[BPF_REG_0].id = ++env->id_gen; - } else { - regs[BPF_REG_0].type = PTR_TO_MAP_VALUE_OR_NULL; + regs[BPF_REG_0].type = PTR_TO_MAP_VALUE | ret_flag; + if (!type_may_be_null(ret_type) && + map_value_has_spin_lock(meta.map_ptr)) { + regs[BPF_REG_0].id = ++env->id_gen; } - } else if (fn->ret_type == RET_PTR_TO_SOCKET_OR_NULL) { + } else if (base_type(ret_type) == RET_PTR_TO_SOCKET) { mark_reg_known_zero(env, regs, BPF_REG_0); - regs[BPF_REG_0].type = PTR_TO_SOCKET_OR_NULL; - } else if (fn->ret_type == RET_PTR_TO_SOCK_COMMON_OR_NULL) { + regs[BPF_REG_0].type = PTR_TO_SOCKET | ret_flag; + } else if (base_type(ret_type) == RET_PTR_TO_SOCK_COMMON) { mark_reg_known_zero(env, regs, BPF_REG_0); - regs[BPF_REG_0].type = PTR_TO_SOCK_COMMON_OR_NULL; - } else if (fn->ret_type == RET_PTR_TO_TCP_SOCK_OR_NULL) { + regs[BPF_REG_0].type = PTR_TO_SOCK_COMMON | ret_flag; + } else if (base_type(ret_type) == RET_PTR_TO_TCP_SOCK) { mark_reg_known_zero(env, regs, BPF_REG_0); - regs[BPF_REG_0].type = PTR_TO_TCP_SOCK_OR_NULL; - } else if (fn->ret_type == RET_PTR_TO_ALLOC_MEM_OR_NULL) { + regs[BPF_REG_0].type = PTR_TO_TCP_SOCK | ret_flag; + } else if (base_type(ret_type) == RET_PTR_TO_ALLOC_MEM) { mark_reg_known_zero(env, regs, BPF_REG_0); - regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL; + regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; regs[BPF_REG_0].mem_size = meta.mem_size; - } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL || - fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID) { + } else if (base_type(ret_type) == RET_PTR_TO_MEM_OR_BTF_ID) { const struct btf_type *t; mark_reg_known_zero(env, regs, BPF_REG_0); @@ -6633,29 +6695,30 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn tname, PTR_ERR(ret)); return -EINVAL; } - regs[BPF_REG_0].type = - fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID ? - PTR_TO_MEM : PTR_TO_MEM_OR_NULL; + regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; regs[BPF_REG_0].mem_size = tsize; } else { - regs[BPF_REG_0].type = - fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID ? - PTR_TO_BTF_ID : PTR_TO_BTF_ID_OR_NULL; + /* MEM_RDONLY may be carried from ret_flag, but it + * doesn't apply on PTR_TO_BTF_ID. Fold it, otherwise + * it will confuse the check of PTR_TO_BTF_ID in + * check_mem_access(). + */ + ret_flag &= ~MEM_RDONLY; + + regs[BPF_REG_0].type = PTR_TO_BTF_ID | ret_flag; regs[BPF_REG_0].btf = meta.ret_btf; regs[BPF_REG_0].btf_id = meta.ret_btf_id; } - } else if (fn->ret_type == RET_PTR_TO_BTF_ID_OR_NULL || - fn->ret_type == RET_PTR_TO_BTF_ID) { + } else if (base_type(ret_type) == RET_PTR_TO_BTF_ID) { int ret_btf_id; mark_reg_known_zero(env, regs, BPF_REG_0); - regs[BPF_REG_0].type = fn->ret_type == RET_PTR_TO_BTF_ID ? - PTR_TO_BTF_ID : - PTR_TO_BTF_ID_OR_NULL; + regs[BPF_REG_0].type = PTR_TO_BTF_ID | ret_flag; ret_btf_id = *fn->ret_btf_id; if (ret_btf_id == 0) { - verbose(env, "invalid return type %d of func %s#%d\n", - fn->ret_type, func_id_name(func_id), func_id); + verbose(env, "invalid return type %u of func %s#%d\n", + base_type(ret_type), func_id_name(func_id), + func_id); return -EINVAL; } /* current BPF helper definitions are only coming from @@ -6664,12 +6727,12 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn regs[BPF_REG_0].btf = btf_vmlinux; regs[BPF_REG_0].btf_id = ret_btf_id; } else { - verbose(env, "unknown return type %d of func %s#%d\n", - fn->ret_type, func_id_name(func_id), func_id); + verbose(env, "unknown return type %u of func %s#%d\n", + base_type(ret_type), func_id_name(func_id), func_id); return -EINVAL; } - if (reg_type_may_be_null(regs[BPF_REG_0].type)) + if (type_may_be_null(regs[BPF_REG_0].type)) regs[BPF_REG_0].id = ++env->id_gen; if (is_ptr_cast_function(func_id)) { @@ -6878,25 +6941,25 @@ static bool check_reg_sane_offset(struct bpf_verifier_env *env, if (known && (val >= BPF_MAX_VAR_OFF || val <= -BPF_MAX_VAR_OFF)) { verbose(env, "math between %s pointer and %lld is not allowed\n", - reg_type_str[type], val); + reg_type_str(env, type), val); return false; } if (reg->off >= BPF_MAX_VAR_OFF || reg->off <= -BPF_MAX_VAR_OFF) { verbose(env, "%s pointer offset %d is not allowed\n", - reg_type_str[type], reg->off); + reg_type_str(env, type), reg->off); return false; } if (smin == S64_MIN) { verbose(env, "math between %s pointer and register with unbounded min value is not allowed\n", - reg_type_str[type]); + reg_type_str(env, type)); return false; } if (smin >= BPF_MAX_VAR_OFF || smin <= -BPF_MAX_VAR_OFF) { verbose(env, "value %lld makes %s pointer be out of bounds\n", - smin, reg_type_str[type]); + smin, reg_type_str(env, type)); return false; } @@ -7273,11 +7336,13 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, return -EACCES; } - switch (ptr_reg->type) { - case PTR_TO_MAP_VALUE_OR_NULL: + if (ptr_reg->type & PTR_MAYBE_NULL) { verbose(env, "R%d pointer arithmetic on %s prohibited, null-check it first\n", - dst, reg_type_str[ptr_reg->type]); + dst, reg_type_str(env, ptr_reg->type)); return -EACCES; + } + + switch (base_type(ptr_reg->type)) { case CONST_PTR_TO_MAP: /* smin_val represents the known value */ if (known && smin_val == 0 && opcode == BPF_ADD) @@ -7285,14 +7350,11 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, fallthrough; case PTR_TO_PACKET_END: case PTR_TO_SOCKET: - case PTR_TO_SOCKET_OR_NULL: case PTR_TO_SOCK_COMMON: - case PTR_TO_SOCK_COMMON_OR_NULL: case PTR_TO_TCP_SOCK: - case PTR_TO_TCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: verbose(env, "R%d pointer arithmetic on %s prohibited\n", - dst, reg_type_str[ptr_reg->type]); + dst, reg_type_str(env, ptr_reg->type)); return -EACCES; default: break; @@ -8265,12 +8327,12 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env, /* Got here implies adding two SCALAR_VALUEs */ if (WARN_ON_ONCE(ptr_reg)) { - print_verifier_state(env, state); + print_verifier_state(env, state, true); verbose(env, "verifier internal error: unexpected ptr_reg\n"); return -EINVAL; } if (WARN_ON(!src_reg)) { - print_verifier_state(env, state); + print_verifier_state(env, state, true); verbose(env, "verifier internal error: no src_reg\n"); return -EINVAL; } @@ -9015,7 +9077,7 @@ static void mark_ptr_or_null_reg(struct bpf_func_state *state, struct bpf_reg_state *reg, u32 id, bool is_null) { - if (reg_type_may_be_null(reg->type) && reg->id == id && + if (type_may_be_null(reg->type) && reg->id == id && !WARN_ON_ONCE(!reg->id)) { /* Old offset (both fixed and variable parts) should * have been known-zero, because we don't allow pointer @@ -9393,7 +9455,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, */ if (!is_jmp32 && BPF_SRC(insn->code) == BPF_K && insn->imm == 0 && (opcode == BPF_JEQ || opcode == BPF_JNE) && - reg_type_may_be_null(dst_reg->type)) { + type_may_be_null(dst_reg->type)) { /* Mark all identical registers in each branch as either * safe or unknown depending R == 0 or R != 0 conditional. */ @@ -9409,7 +9471,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, return -EACCES; } if (env->log.level & BPF_LOG_LEVEL) - print_verifier_state(env, this_branch->frame[this_branch->curframe]); + print_insn_state(env, this_branch->frame[this_branch->curframe]); return 0; } @@ -9448,7 +9510,7 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) mark_reg_known_zero(env, regs, insn->dst_reg); dst_reg->type = aux->btf_var.reg_type; - switch (dst_reg->type) { + switch (base_type(dst_reg->type)) { case PTR_TO_MEM: dst_reg->mem_size = aux->btf_var.mem_size; break; @@ -9647,7 +9709,7 @@ static int check_return_code(struct bpf_verifier_env *env) /* enforce return zero from async callbacks like timer */ if (reg->type != SCALAR_VALUE) { verbose(env, "In async callback the register R0 is not a known value (%s)\n", - reg_type_str[reg->type]); + reg_type_str(env, reg->type)); return -EINVAL; } @@ -9661,7 +9723,7 @@ static int check_return_code(struct bpf_verifier_env *env) if (is_subprog) { if (reg->type != SCALAR_VALUE) { verbose(env, "At subprogram exit the register R0 is not a scalar value (%s)\n", - reg_type_str[reg->type]); + reg_type_str(env, reg->type)); return -EINVAL; } return 0; @@ -9725,7 +9787,7 @@ static int check_return_code(struct bpf_verifier_env *env) if (reg->type != SCALAR_VALUE) { verbose(env, "At program exit the register R0 is not a known value (%s)\n", - reg_type_str[reg->type]); + reg_type_str(env, reg->type)); return -EINVAL; } @@ -10582,7 +10644,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, return true; if (rcur->type == NOT_INIT) return false; - switch (rold->type) { + switch (base_type(rold->type)) { case SCALAR_VALUE: if (env->explore_alu_limits) return false; @@ -10604,6 +10666,22 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, } case PTR_TO_MAP_KEY: case PTR_TO_MAP_VALUE: + /* a PTR_TO_MAP_VALUE could be safe to use as a + * PTR_TO_MAP_VALUE_OR_NULL into the same map. + * However, if the old PTR_TO_MAP_VALUE_OR_NULL then got NULL- + * checked, doing so could have affected others with the same + * id, and we can't check for that because we lost the id when + * we converted to a PTR_TO_MAP_VALUE. + */ + if (type_may_be_null(rold->type)) { + if (!type_may_be_null(rcur->type)) + return false; + if (memcmp(rold, rcur, offsetof(struct bpf_reg_state, id))) + return false; + /* Check our ids match any regs they're supposed to */ + return check_ids(rold->id, rcur->id, idmap); + } + /* If the new min/max/var_off satisfy the old ones and * everything else matches, we are OK. * 'id' is not compared, since it's only used for maps with @@ -10615,20 +10693,6 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 && range_within(rold, rcur) && tnum_in(rold->var_off, rcur->var_off); - case PTR_TO_MAP_VALUE_OR_NULL: - /* a PTR_TO_MAP_VALUE could be safe to use as a - * PTR_TO_MAP_VALUE_OR_NULL into the same map. - * However, if the old PTR_TO_MAP_VALUE_OR_NULL then got NULL- - * checked, doing so could have affected others with the same - * id, and we can't check for that because we lost the id when - * we converted to a PTR_TO_MAP_VALUE. - */ - if (rcur->type != PTR_TO_MAP_VALUE_OR_NULL) - return false; - if (memcmp(rold, rcur, offsetof(struct bpf_reg_state, id))) - return false; - /* Check our ids match any regs they're supposed to */ - return check_ids(rold->id, rcur->id, idmap); case PTR_TO_PACKET_META: case PTR_TO_PACKET: if (rcur->type != rold->type) @@ -10657,11 +10721,8 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, case PTR_TO_PACKET_END: case PTR_TO_FLOW_KEYS: case PTR_TO_SOCKET: - case PTR_TO_SOCKET_OR_NULL: case PTR_TO_SOCK_COMMON: - case PTR_TO_SOCK_COMMON_OR_NULL: case PTR_TO_TCP_SOCK: - case PTR_TO_TCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: /* Only valid matches are exact, which memcmp() above * would have accepted @@ -11187,17 +11248,13 @@ next: /* Return true if it's OK to have the same insn return a different type. */ static bool reg_type_mismatch_ok(enum bpf_reg_type type) { - switch (type) { + switch (base_type(type)) { case PTR_TO_CTX: case PTR_TO_SOCKET: - case PTR_TO_SOCKET_OR_NULL: case PTR_TO_SOCK_COMMON: - case PTR_TO_SOCK_COMMON_OR_NULL: case PTR_TO_TCP_SOCK: - case PTR_TO_TCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: case PTR_TO_BTF_ID: - case PTR_TO_BTF_ID_OR_NULL: return false; default: return true; @@ -11277,16 +11334,12 @@ static int do_check(struct bpf_verifier_env *env) if (need_resched()) cond_resched(); - if (env->log.level & BPF_LOG_LEVEL2 || - (env->log.level & BPF_LOG_LEVEL && do_print_state)) { - if (env->log.level & BPF_LOG_LEVEL2) - verbose(env, "%d:", env->insn_idx); - else - verbose(env, "\nfrom %d to %d%s:", - env->prev_insn_idx, env->insn_idx, - env->cur_state->speculative ? - " (speculative execution)" : ""); - print_verifier_state(env, state->frame[state->curframe]); + if (env->log.level & BPF_LOG_LEVEL2 && do_print_state) { + verbose(env, "\nfrom %d to %d%s:", + env->prev_insn_idx, env->insn_idx, + env->cur_state->speculative ? + " (speculative execution)" : ""); + print_verifier_state(env, state->frame[state->curframe], true); do_print_state = false; } @@ -11297,9 +11350,15 @@ static int do_check(struct bpf_verifier_env *env) .private_data = env, }; + if (verifier_state_scratched(env)) + print_insn_state(env, state->frame[state->curframe]); + verbose_linfo(env, env->insn_idx, "; "); + env->prev_log_len = env->log.len_used; verbose(env, "%d: ", env->insn_idx); print_bpf_insn(&cbs, insn, env->allow_ptr_leaks); + env->prev_insn_print_len = env->log.len_used - env->prev_log_len; + env->prev_log_len = env->log.len_used; } if (bpf_prog_is_dev_bound(env->prog->aux)) { @@ -11421,7 +11480,7 @@ static int do_check(struct bpf_verifier_env *env) if (is_ctx_reg(env, insn->dst_reg)) { verbose(env, "BPF_ST stores into R%d %s is not allowed\n", insn->dst_reg, - reg_type_str[reg_state(env, insn->dst_reg)->type]); + reg_type_str(env, reg_state(env, insn->dst_reg)->type)); return -EACCES; } @@ -11508,6 +11567,7 @@ static int do_check(struct bpf_verifier_env *env) if (err) return err; process_bpf_exit: + mark_verifier_state_scratched(env); update_branch_counts(env, env->cur_state); err = pop_stack(env, &prev_insn_idx, &env->insn_idx, pop_log); @@ -11673,7 +11733,7 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, err = -EINVAL; goto err_put; } - aux->btf_var.reg_type = PTR_TO_MEM; + aux->btf_var.reg_type = PTR_TO_MEM | MEM_RDONLY; aux->btf_var.mem_size = tsize; } else { aux->btf_var.reg_type = PTR_TO_BTF_ID; @@ -11833,6 +11893,9 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env, } break; case BPF_MAP_TYPE_RINGBUF: + case BPF_MAP_TYPE_INODE_STORAGE: + case BPF_MAP_TYPE_SK_STORAGE: + case BPF_MAP_TYPE_TASK_STORAGE: break; default: verbose(env, @@ -13016,6 +13079,7 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, static int do_misc_fixups(struct bpf_verifier_env *env) { struct bpf_prog *prog = env->prog; + enum bpf_attach_type eatype = prog->expected_attach_type; bool expect_blinding = bpf_jit_blinding_enabled(prog); enum bpf_prog_type prog_type = resolve_prog_type(prog); struct bpf_insn *insn = prog->insnsi; @@ -13386,11 +13450,79 @@ patch_map_ops_generic: continue; } + /* Implement bpf_get_func_arg inline. */ + if (prog_type == BPF_PROG_TYPE_TRACING && + insn->imm == BPF_FUNC_get_func_arg) { + /* Load nr_args from ctx - 8 */ + insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8); + insn_buf[1] = BPF_JMP32_REG(BPF_JGE, BPF_REG_2, BPF_REG_0, 6); + insn_buf[2] = BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 3); + insn_buf[3] = BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_1); + insn_buf[4] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_2, 0); + insn_buf[5] = BPF_STX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0); + insn_buf[6] = BPF_MOV64_IMM(BPF_REG_0, 0); + insn_buf[7] = BPF_JMP_A(1); + insn_buf[8] = BPF_MOV64_IMM(BPF_REG_0, -EINVAL); + cnt = 9; + + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); + if (!new_prog) + return -ENOMEM; + + delta += cnt - 1; + env->prog = prog = new_prog; + insn = new_prog->insnsi + i + delta; + continue; + } + + /* Implement bpf_get_func_ret inline. */ + if (prog_type == BPF_PROG_TYPE_TRACING && + insn->imm == BPF_FUNC_get_func_ret) { + if (eatype == BPF_TRACE_FEXIT || + eatype == BPF_MODIFY_RETURN) { + /* Load nr_args from ctx - 8 */ + insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8); + insn_buf[1] = BPF_ALU64_IMM(BPF_LSH, BPF_REG_0, 3); + insn_buf[2] = BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1); + insn_buf[3] = BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0); + insn_buf[4] = BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_3, 0); + insn_buf[5] = BPF_MOV64_IMM(BPF_REG_0, 0); + cnt = 6; + } else { + insn_buf[0] = BPF_MOV64_IMM(BPF_REG_0, -EOPNOTSUPP); + cnt = 1; + } + + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); + if (!new_prog) + return -ENOMEM; + + delta += cnt - 1; + env->prog = prog = new_prog; + insn = new_prog->insnsi + i + delta; + continue; + } + + /* Implement get_func_arg_cnt inline. */ + if (prog_type == BPF_PROG_TYPE_TRACING && + insn->imm == BPF_FUNC_get_func_arg_cnt) { + /* Load nr_args from ctx - 8 */ + insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8); + + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1); + if (!new_prog) + return -ENOMEM; + + env->prog = prog = new_prog; + insn = new_prog->insnsi + i + delta; + continue; + } + /* Implement bpf_get_func_ip inline. */ if (prog_type == BPF_PROG_TYPE_TRACING && insn->imm == BPF_FUNC_get_func_ip) { - /* Load IP address from ctx - 8 */ - insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8); + /* Load IP address from ctx - 16 */ + insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -16); new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1); if (!new_prog) @@ -13504,7 +13636,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog) mark_reg_known_zero(env, regs, i); else if (regs[i].type == SCALAR_VALUE) mark_reg_unknown(env, regs, i); - else if (regs[i].type == PTR_TO_MEM_OR_NULL) { + else if (base_type(regs[i].type) == PTR_TO_MEM) { const u32 mem_size = regs[i].mem_size; mark_reg_known_zero(env, regs, i); @@ -14099,6 +14231,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr) } } + mark_verifier_state_clean(env); + if (IS_ERR(btf_vmlinux)) { /* Either gcc or pahole or kernel are broken. */ verbose(env, "in-kernel BTF is malformed\n"); diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 919194de39c8..cd4c23f7e3df 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -30,6 +30,7 @@ #include "cgroup-internal.h" +#include <linux/bpf-cgroup.h> #include <linux/cred.h> #include <linux/errno.h> #include <linux/init_task.h> diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 083be6af29d7..d7ed1dffa426 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -33,6 +33,7 @@ #include <linux/security.h> #include <linux/ctype.h> #include <linux/kmemleak.h> +#include <linux/filter.h> #include <linux/fs.h> #include <linux/init.h> #include <linux/kernel.h> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 623dd0684429..21aa30644219 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -345,7 +345,7 @@ static const struct bpf_func_proto bpf_probe_write_user_proto = { .gpl_only = true, .ret_type = RET_INTEGER, .arg1_type = ARG_ANYTHING, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, }; @@ -394,7 +394,7 @@ static const struct bpf_func_proto bpf_trace_printk_proto = { .func = bpf_trace_printk, .gpl_only = true, .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_MEM, + .arg1_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg2_type = ARG_CONST_SIZE, }; @@ -450,9 +450,9 @@ static const struct bpf_func_proto bpf_trace_vprintk_proto = { .func = bpf_trace_vprintk, .gpl_only = true, .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_MEM, + .arg1_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg2_type = ARG_CONST_SIZE, - .arg3_type = ARG_PTR_TO_MEM_OR_NULL, + .arg3_type = ARG_PTR_TO_MEM | PTR_MAYBE_NULL | MEM_RDONLY, .arg4_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -492,9 +492,9 @@ static const struct bpf_func_proto bpf_seq_printf_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_BTF_ID, .arg1_btf_id = &btf_seq_file_ids[0], - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, - .arg4_type = ARG_PTR_TO_MEM_OR_NULL, + .arg4_type = ARG_PTR_TO_MEM | PTR_MAYBE_NULL | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -509,7 +509,7 @@ static const struct bpf_func_proto bpf_seq_write_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_BTF_ID, .arg1_btf_id = &btf_seq_file_ids[0], - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -533,7 +533,7 @@ static const struct bpf_func_proto bpf_seq_printf_btf_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_BTF_ID, .arg1_btf_id = &btf_seq_file_ids[0], - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE_OR_ZERO, .arg4_type = ARG_ANYTHING, }; @@ -694,7 +694,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = { .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -1004,7 +1004,7 @@ const struct bpf_func_proto bpf_snprintf_btf_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_MEM, .arg2_type = ARG_CONST_SIZE, - .arg3_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg4_type = ARG_CONST_SIZE, .arg5_type = ARG_ANYTHING, }; @@ -1012,7 +1012,7 @@ const struct bpf_func_proto bpf_snprintf_btf_proto = { BPF_CALL_1(bpf_get_func_ip_tracing, void *, ctx) { /* This helper call is inlined by verifier. */ - return ((u64 *)ctx)[-1]; + return ((u64 *)ctx)[-2]; } static const struct bpf_func_proto bpf_get_func_ip_proto_tracing = { @@ -1091,6 +1091,53 @@ static const struct bpf_func_proto bpf_get_branch_snapshot_proto = { .arg2_type = ARG_CONST_SIZE_OR_ZERO, }; +BPF_CALL_3(get_func_arg, void *, ctx, u32, n, u64 *, value) +{ + /* This helper call is inlined by verifier. */ + u64 nr_args = ((u64 *)ctx)[-1]; + + if ((u64) n >= nr_args) + return -EINVAL; + *value = ((u64 *)ctx)[n]; + return 0; +} + +static const struct bpf_func_proto bpf_get_func_arg_proto = { + .func = get_func_arg, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_PTR_TO_LONG, +}; + +BPF_CALL_2(get_func_ret, void *, ctx, u64 *, value) +{ + /* This helper call is inlined by verifier. */ + u64 nr_args = ((u64 *)ctx)[-1]; + + *value = ((u64 *)ctx)[nr_args]; + return 0; +} + +static const struct bpf_func_proto bpf_get_func_ret_proto = { + .func = get_func_ret, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_PTR_TO_LONG, +}; + +BPF_CALL_1(get_func_arg_cnt, void *, ctx) +{ + /* This helper call is inlined by verifier. */ + return ((u64 *)ctx)[-1]; +} + +static const struct bpf_func_proto bpf_get_func_arg_cnt_proto = { + .func = get_func_arg_cnt, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, +}; + static const struct bpf_func_proto * bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { @@ -1287,7 +1334,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto_tp = { .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -1509,7 +1556,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto_raw_tp = { .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -1563,7 +1610,7 @@ static const struct bpf_func_proto bpf_get_stack_proto_raw_tp = { .gpl_only = true, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE_OR_ZERO, .arg4_type = ARG_ANYTHING, }; @@ -1629,6 +1676,12 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) NULL; case BPF_FUNC_d_path: return &bpf_d_path_proto; + case BPF_FUNC_get_func_arg: + return bpf_prog_has_trampoline(prog) ? &bpf_get_func_arg_proto : NULL; + case BPF_FUNC_get_func_ret: + return bpf_prog_has_trampoline(prog) ? &bpf_get_func_ret_proto : NULL; + case BPF_FUNC_get_func_arg_cnt: + return bpf_prog_has_trampoline(prog) ? &bpf_get_func_arg_cnt_proto : NULL; default: fn = raw_tp_prog_func_proto(func_id, prog); if (!fn && prog->expected_attach_type == BPF_TRACE_ITER) diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 33272a7b6912..4e1257f50aa3 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -7,6 +7,7 @@ */ #define pr_fmt(fmt) "trace_kprobe: " fmt +#include <linux/bpf-cgroup.h> #include <linux/security.h> #include <linux/module.h> #include <linux/uaccess.h> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c index f5f0039d31e5..4f35514a48f3 100644 --- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -7,6 +7,7 @@ */ #define pr_fmt(fmt) "trace_uprobe: " fmt +#include <linux/bpf-cgroup.h> #include <linux/security.h> #include <linux/ctype.h> #include <linux/module.h> diff --git a/net/bluetooth/bnep/sock.c b/net/bluetooth/bnep/sock.c index d515571b2afb..57d509d77cb4 100644 --- a/net/bluetooth/bnep/sock.c +++ b/net/bluetooth/bnep/sock.c @@ -24,6 +24,7 @@ SOFTWARE IS DISCLAIMED. */ +#include <linux/compat.h> #include <linux/export.h> #include <linux/file.h> diff --git a/net/bluetooth/eir.h b/net/bluetooth/eir.h index 724662f8f8b1..05e2e917fc25 100644 --- a/net/bluetooth/eir.h +++ b/net/bluetooth/eir.h @@ -5,6 +5,8 @@ * Copyright (C) 2021 Intel Corporation */ +#include <asm/unaligned.h> + void eir_create(struct hci_dev *hdev, u8 *data); u8 eir_create_adv_data(struct hci_dev *hdev, u8 instance, u8 *ptr); diff --git a/net/bluetooth/hidp/sock.c b/net/bluetooth/hidp/sock.c index 595fb3c9d6c3..369ed92dac99 100644 --- a/net/bluetooth/hidp/sock.c +++ b/net/bluetooth/hidp/sock.c @@ -20,6 +20,7 @@ SOFTWARE IS DISCLAIMED. */ +#include <linux/compat.h> #include <linux/export.h> #include <linux/file.h> diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c index 251017c69ab7..188e4d4813b0 100644 --- a/net/bluetooth/l2cap_sock.c +++ b/net/bluetooth/l2cap_sock.c @@ -29,6 +29,7 @@ #include <linux/module.h> #include <linux/export.h> +#include <linux/filter.h> #include <linux/sched/signal.h> #include <net/bluetooth/bluetooth.h> diff --git a/net/bridge/br_ioctl.c b/net/bridge/br_ioctl.c index 5f3c1cf7f6ad..f213ed108361 100644 --- a/net/bridge/br_ioctl.c +++ b/net/bridge/br_ioctl.c @@ -8,6 +8,7 @@ */ #include <linux/capability.h> +#include <linux/compat.h> #include <linux/kernel.h> #include <linux/if_bridge.h> #include <linux/netdevice.h> diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c index e12fd3cad619..2b8892d502f7 100644 --- a/net/caif/caif_socket.c +++ b/net/caif/caif_socket.c @@ -6,6 +6,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ":%s(): " fmt, __func__ +#include <linux/filter.h> #include <linux/fs.h> #include <linux/init.h> #include <linux/module.h> diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c index 68d2cbf8331a..d9c37fd10809 100644 --- a/net/core/bpf_sk_storage.c +++ b/net/core/bpf_sk_storage.c @@ -13,6 +13,7 @@ #include <net/sock.h> #include <uapi/linux/sock_diag.h> #include <uapi/linux/btf.h> +#include <linux/rcupdate_trace.h> DEFINE_BPF_STORAGE_CACHE(sk_cache); @@ -22,7 +23,8 @@ bpf_sk_storage_lookup(struct sock *sk, struct bpf_map *map, bool cacheit_lockit) struct bpf_local_storage *sk_storage; struct bpf_local_storage_map *smap; - sk_storage = rcu_dereference(sk->sk_bpf_storage); + sk_storage = + rcu_dereference_check(sk->sk_bpf_storage, bpf_rcu_lock_held()); if (!sk_storage) return NULL; @@ -258,6 +260,7 @@ BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk, { struct bpf_local_storage_data *sdata; + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (!sk || !sk_fullsock(sk) || flags > BPF_SK_STORAGE_GET_F_CREATE) return (unsigned long)NULL; @@ -288,6 +291,7 @@ BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk, BPF_CALL_2(bpf_sk_storage_delete, struct bpf_map *, map, struct sock *, sk) { + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (!sk || !sk_fullsock(sk)) return -EINVAL; @@ -416,6 +420,7 @@ static bool bpf_sk_storage_tracing_allowed(const struct bpf_prog *prog) BPF_CALL_4(bpf_sk_storage_get_tracing, struct bpf_map *, map, struct sock *, sk, void *, value, u64, flags) { + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (in_hardirq() || in_nmi()) return (unsigned long)NULL; @@ -425,6 +430,7 @@ BPF_CALL_4(bpf_sk_storage_get_tracing, struct bpf_map *, map, struct sock *, sk, BPF_CALL_2(bpf_sk_storage_delete_tracing, struct bpf_map *, map, struct sock *, sk) { + WARN_ON_ONCE(!bpf_rcu_lock_held()); if (in_hardirq() || in_nmi()) return -EPERM; @@ -929,7 +935,7 @@ static struct bpf_iter_reg bpf_sk_storage_map_reg_info = { { offsetof(struct bpf_iter__bpf_sk_storage_map, sk), PTR_TO_BTF_ID_OR_NULL }, { offsetof(struct bpf_iter__bpf_sk_storage_map, value), - PTR_TO_RDWR_BUF_OR_NULL }, + PTR_TO_BUF | PTR_MAYBE_NULL }, }, .seq_info = &iter_seq_info, }; diff --git a/net/core/dev.c b/net/core/dev.c index 644b9c8be3a8..6c8b226b5f2f 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4712,7 +4712,7 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb, case XDP_PASS: break; default: - bpf_warn_invalid_xdp_action(act); + bpf_warn_invalid_xdp_action(skb->dev, xdp_prog, act); fallthrough; case XDP_ABORTED: trace_xdp_exception(skb->dev, xdp_prog, act); diff --git a/net/core/devlink.c b/net/core/devlink.c index 6366ce324dce..fcd9f6d85cf1 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -7,6 +7,7 @@ * Copyright (c) 2016 Jiri Pirko <jiri@mellanox.com> */ +#include <linux/etherdevice.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/types.h> diff --git a/net/core/filter.c b/net/core/filter.c index e4cc3aff5bf7..606ab5a98a1a 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -1712,7 +1712,7 @@ static const struct bpf_func_proto bpf_skb_store_bytes_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_ANYTHING, - .arg3_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg4_type = ARG_CONST_SIZE, .arg5_type = ARG_ANYTHING, }; @@ -2017,9 +2017,9 @@ static const struct bpf_func_proto bpf_csum_diff_proto = { .gpl_only = false, .pkt_access = true, .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_MEM_OR_NULL, + .arg1_type = ARG_PTR_TO_MEM | PTR_MAYBE_NULL | MEM_RDONLY, .arg2_type = ARG_CONST_SIZE_OR_ZERO, - .arg3_type = ARG_PTR_TO_MEM_OR_NULL, + .arg3_type = ARG_PTR_TO_MEM | PTR_MAYBE_NULL | MEM_RDONLY, .arg4_type = ARG_CONST_SIZE_OR_ZERO, .arg5_type = ARG_ANYTHING, }; @@ -2540,7 +2540,7 @@ static const struct bpf_func_proto bpf_redirect_neigh_proto = { .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_ANYTHING, - .arg2_type = ARG_PTR_TO_MEM_OR_NULL, + .arg2_type = ARG_PTR_TO_MEM | PTR_MAYBE_NULL | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE_OR_ZERO, .arg4_type = ARG_ANYTHING, }; @@ -4173,7 +4173,7 @@ static const struct bpf_func_proto bpf_skb_event_output_proto = { .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -4187,7 +4187,7 @@ const struct bpf_func_proto bpf_skb_output_proto = { .arg1_btf_id = &bpf_skb_output_btf_ids[0], .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -4370,7 +4370,7 @@ static const struct bpf_func_proto bpf_skb_set_tunnel_key_proto = { .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, }; @@ -4396,7 +4396,7 @@ static const struct bpf_func_proto bpf_skb_set_tunnel_opt_proto = { .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, }; @@ -4566,7 +4566,7 @@ static const struct bpf_func_proto bpf_xdp_event_output_proto = { .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -4580,7 +4580,7 @@ const struct bpf_func_proto bpf_xdp_output_proto = { .arg1_btf_id = &bpf_xdp_output_btf_ids[0], .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; @@ -5066,7 +5066,7 @@ const struct bpf_func_proto bpf_sk_setsockopt_proto = { .arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON, .arg2_type = ARG_ANYTHING, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE, }; @@ -5100,7 +5100,7 @@ static const struct bpf_func_proto bpf_sock_addr_setsockopt_proto = { .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_ANYTHING, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE, }; @@ -5134,7 +5134,7 @@ static const struct bpf_func_proto bpf_sock_ops_setsockopt_proto = { .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_ANYTHING, .arg3_type = ARG_ANYTHING, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE, }; @@ -5309,7 +5309,7 @@ static const struct bpf_func_proto bpf_bind_proto = { .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, }; @@ -5897,7 +5897,7 @@ static const struct bpf_func_proto bpf_lwt_in_push_encap_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_ANYTHING, - .arg3_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg4_type = ARG_CONST_SIZE }; @@ -5907,7 +5907,7 @@ static const struct bpf_func_proto bpf_lwt_xmit_push_encap_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_ANYTHING, - .arg3_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg4_type = ARG_CONST_SIZE }; @@ -5950,7 +5950,7 @@ static const struct bpf_func_proto bpf_lwt_seg6_store_bytes_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_ANYTHING, - .arg3_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg4_type = ARG_CONST_SIZE }; @@ -6038,7 +6038,7 @@ static const struct bpf_func_proto bpf_lwt_seg6_action_proto = { .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, .arg2_type = ARG_ANYTHING, - .arg3_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg4_type = ARG_CONST_SIZE }; @@ -6263,7 +6263,7 @@ static const struct bpf_func_proto bpf_skc_lookup_tcp_proto = { .pkt_access = true, .ret_type = RET_PTR_TO_SOCK_COMMON_OR_NULL, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, .arg5_type = ARG_ANYTHING, @@ -6282,7 +6282,7 @@ static const struct bpf_func_proto bpf_sk_lookup_tcp_proto = { .pkt_access = true, .ret_type = RET_PTR_TO_SOCKET_OR_NULL, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, .arg5_type = ARG_ANYTHING, @@ -6301,7 +6301,7 @@ static const struct bpf_func_proto bpf_sk_lookup_udp_proto = { .pkt_access = true, .ret_type = RET_PTR_TO_SOCKET_OR_NULL, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, .arg5_type = ARG_ANYTHING, @@ -6338,7 +6338,7 @@ static const struct bpf_func_proto bpf_xdp_sk_lookup_udp_proto = { .pkt_access = true, .ret_type = RET_PTR_TO_SOCKET_OR_NULL, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, .arg5_type = ARG_ANYTHING, @@ -6361,7 +6361,7 @@ static const struct bpf_func_proto bpf_xdp_skc_lookup_tcp_proto = { .pkt_access = true, .ret_type = RET_PTR_TO_SOCK_COMMON_OR_NULL, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, .arg5_type = ARG_ANYTHING, @@ -6384,7 +6384,7 @@ static const struct bpf_func_proto bpf_xdp_sk_lookup_tcp_proto = { .pkt_access = true, .ret_type = RET_PTR_TO_SOCKET_OR_NULL, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, .arg5_type = ARG_ANYTHING, @@ -6403,7 +6403,7 @@ static const struct bpf_func_proto bpf_sock_addr_skc_lookup_tcp_proto = { .gpl_only = false, .ret_type = RET_PTR_TO_SOCK_COMMON_OR_NULL, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, .arg5_type = ARG_ANYTHING, @@ -6422,7 +6422,7 @@ static const struct bpf_func_proto bpf_sock_addr_sk_lookup_tcp_proto = { .gpl_only = false, .ret_type = RET_PTR_TO_SOCKET_OR_NULL, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, .arg5_type = ARG_ANYTHING, @@ -6441,7 +6441,7 @@ static const struct bpf_func_proto bpf_sock_addr_sk_lookup_udp_proto = { .gpl_only = false, .ret_type = RET_PTR_TO_SOCKET_OR_NULL, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, .arg5_type = ARG_ANYTHING, @@ -6754,9 +6754,9 @@ static const struct bpf_func_proto bpf_tcp_check_syncookie_proto = { .pkt_access = true, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE, }; @@ -6823,9 +6823,9 @@ static const struct bpf_func_proto bpf_tcp_gen_syncookie_proto = { .pkt_access = true, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, - .arg4_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg5_type = ARG_CONST_SIZE, }; @@ -7054,7 +7054,7 @@ static const struct bpf_func_proto bpf_sock_ops_store_hdr_opt_proto = { .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_CTX, - .arg2_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY, .arg3_type = ARG_CONST_SIZE, .arg4_type = ARG_ANYTHING, }; @@ -8180,13 +8180,13 @@ static bool xdp_is_valid_access(int off, int size, return __is_valid_xdp_access(off, size); } -void bpf_warn_invalid_xdp_action(u32 act) +void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog, u32 act) { const u32 act_max = XDP_REDIRECT; - WARN_ONCE(1, "%s XDP return value %u, expect packet loss!\n", - act > act_max ? "Illegal" : "Driver unsupported", - act); + pr_warn_once("%s XDP return value %u on prog %s (id %d) dev %s, expect packet loss!\n", + act > act_max ? "Illegal" : "Driver unsupported", + act, prog->aux->name, prog->aux->id, dev ? dev->name : "N/A"); } EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action); diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index fe9b5063cb73..15833e1d6ea1 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -5,6 +5,7 @@ #include <linux/ip.h> #include <linux/ipv6.h> #include <linux/if_vlan.h> +#include <linux/filter.h> #include <net/dsa.h> #include <net/dst_metadata.h> #include <net/ip.h> diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c index 2f7940bcf715..349480ef68a5 100644 --- a/net/core/lwt_bpf.c +++ b/net/core/lwt_bpf.c @@ -2,6 +2,7 @@ /* Copyright (c) 2016 Thomas Graf <tgraf@tgraf.ch> */ +#include <linux/filter.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/skbuff.h> diff --git a/net/core/sock_diag.c b/net/core/sock_diag.c index c9c45b935f99..f7cf74cdd3db 100644 --- a/net/core/sock_diag.c +++ b/net/core/sock_diag.c @@ -1,5 +1,6 @@ /* License: GPL */ +#include <linux/filter.h> #include <linux/mutex.h> #include <linux/socket.h> #include <linux/skbuff.h> diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 4ca4b11f4e5f..9618ab6d7cc9 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -1564,7 +1564,7 @@ static struct bpf_iter_reg sock_map_iter_reg = { .ctx_arg_info_size = 2, .ctx_arg_info = { { offsetof(struct bpf_iter__sockmap, key), - PTR_TO_RDONLY_BUF_OR_NULL }, + PTR_TO_BUF | PTR_MAYBE_NULL | MEM_RDONLY }, { offsetof(struct bpf_iter__sockmap, sk), PTR_TO_BTF_ID_OR_NULL }, }, diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index 5f88526ad61c..7b4d485aac7a 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -6,6 +6,7 @@ * Added /proc/sys/net/core directory entry (empty =) ). [MS] */ +#include <linux/filter.h> #include <linux/mm.h> #include <linux/sysctl.h> #include <linux/module.h> diff --git a/net/decnet/dn_nsp_in.c b/net/decnet/dn_nsp_in.c index 7ab788f41a3f..c59be5b04479 100644 --- a/net/decnet/dn_nsp_in.c +++ b/net/decnet/dn_nsp_in.c @@ -38,6 +38,7 @@ *******************************************************************************/ #include <linux/errno.h> +#include <linux/filter.h> #include <linux/types.h> #include <linux/socket.h> #include <linux/in.h> diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index edfaae7b5967..b5ae21f172a8 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -8,6 +8,7 @@ #define __DSA_PRIV_H #include <linux/if_bridge.h> +#include <linux/if_vlan.h> #include <linux/phy.h> #include <linux/netdevice.h> #include <linux/netpoll.h> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 9a113d893521..b2cdba1b4aae 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -8,6 +8,7 @@ */ #include <linux/compat.h> +#include <linux/etherdevice.h> #include <linux/module.h> #include <linux/types.h> #include <linux/capability.h> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index 1319d093cdda..eeafeccebb8d 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -8,6 +8,7 @@ #include <linux/nexthop.h> #include <linux/rtnetlink.h> #include <linux/slab.h> +#include <linux/vmalloc.h> #include <net/arp.h> #include <net/ipv6_stubs.h> #include <net/lwtunnel.h> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 4a486ea1333f..7b18a6f42f18 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -74,6 +74,7 @@ #define pr_fmt(fmt) "UDP: " fmt +#include <linux/bpf-cgroup.h> #include <linux/uaccess.h> #include <asm/ioctls.h> #include <linux/memblock.h> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 0371d2c14145..463c37dea449 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -15,6 +15,7 @@ #define pr_fmt(fmt) "IPv6: " fmt +#include <linux/bpf.h> #include <linux/errno.h> #include <linux/types.h> #include <linux/net.h> diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c index 2dc40b3f373e..a5eea182149d 100644 --- a/net/ipv6/seg6_local.c +++ b/net/ipv6/seg6_local.c @@ -7,6 +7,7 @@ * eBPF support: Mathieu Xhonneux <m.xhonneux@gmail.com> */ +#include <linux/filter.h> #include <linux/types.h> #include <linux/skbuff.h> #include <linux/net.h> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index aa5d86bbfb53..1accc06abc54 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -17,6 +17,7 @@ * YOSHIFUJI Hideaki @USAGI: convert /proc/net/udp6 to seq_file. */ +#include <linux/bpf-cgroup.h> #include <linux/errno.h> #include <linux/types.h> #include <linux/socket.h> diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c index 49ecbe8d176a..a1760add5bf1 100644 --- a/net/iucv/af_iucv.c +++ b/net/iucv/af_iucv.c @@ -13,6 +13,7 @@ #define KMSG_COMPONENT "af_iucv" #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt +#include <linux/filter.h> #include <linux/module.h> #include <linux/netdevice.h> #include <linux/types.h> diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index 11a715d76a4f..71899e5a5a11 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -9,6 +9,7 @@ #include <linux/errno.h> #include <linux/errqueue.h> #include <linux/file.h> +#include <linux/filter.h> #include <linux/in.h> #include <linux/kernel.h> #include <linux/module.h> diff --git a/net/netfilter/nfnetlink_hook.c b/net/netfilter/nfnetlink_hook.c index d5c719c9e36c..71e29adac48b 100644 --- a/net/netfilter/nfnetlink_hook.c +++ b/net/netfilter/nfnetlink_hook.c @@ -6,6 +6,7 @@ */ #include <linux/module.h> +#include <linux/kallsyms.h> #include <linux/kernel.h> #include <linux/types.h> #include <linux/skbuff.h> diff --git a/net/netfilter/nft_reject_netdev.c b/net/netfilter/nft_reject_netdev.c index d89f68754f42..61cd8c4ac385 100644 --- a/net/netfilter/nft_reject_netdev.c +++ b/net/netfilter/nft_reject_netdev.c @@ -4,6 +4,7 @@ * Copyright (c) 2020 Jose M. Guisado <guigom@riseup.net> */ +#include <linux/etherdevice.h> #include <linux/kernel.h> #include <linux/init.h> #include <linux/module.h> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 4be2d97ff93e..7b344035bfe3 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -20,8 +20,10 @@ #include <linux/module.h> +#include <linux/bpf.h> #include <linux/capability.h> #include <linux/kernel.h> +#include <linux/filter.h> #include <linux/init.h> #include <linux/signal.h> #include <linux/sched.h> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 59c265b8ac26..9bbe7282efb6 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -49,6 +49,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/ethtool.h> +#include <linux/filter.h> #include <linux/types.h> #include <linux/mm.h> #include <linux/capability.h> diff --git a/net/rose/rose_in.c b/net/rose/rose_in.c index 6af786d66b03..4d67f36dce1b 100644 --- a/net/rose/rose_in.c +++ b/net/rose/rose_in.c @@ -9,6 +9,7 @@ * diagrams as the code is not obvious and probably very easy to break. */ #include <linux/errno.h> +#include <linux/filter.h> #include <linux/types.h> #include <linux/socket.h> #include <linux/in.h> diff --git a/net/sched/sch_frag.c b/net/sched/sch_frag.c index 5ded4c8672a6..a9bd0a235890 100644 --- a/net/sched/sch_frag.c +++ b/net/sched/sch_frag.c @@ -1,4 +1,5 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +#include <linux/if_vlan.h> #include <net/netlink.h> #include <net/sch_generic.h> #include <net/pkt_sched.h> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c index fe5d5399c4e8..a3e2d3b89568 100644 --- a/net/smc/smc_ib.c +++ b/net/smc/smc_ib.c @@ -12,6 +12,8 @@ * Author(s): Ursula Braun <ubraun@linux.vnet.ibm.com> */ +#include <linux/etherdevice.h> +#include <linux/if_vlan.h> #include <linux/random.h> #include <linux/workqueue.h> #include <linux/scatterlist.h> diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c index fd28cc498b98..a2084ecdb97e 100644 --- a/net/smc/smc_ism.c +++ b/net/smc/smc_ism.c @@ -6,6 +6,7 @@ * Copyright IBM Corp. 2018 */ +#include <linux/if_vlan.h> #include <linux/spinlock.h> #include <linux/mutex.h> #include <linux/slab.h> diff --git a/net/socket.c b/net/socket.c index 6b2a898055ca..4b8bb20d5e9a 100644 --- a/net/socket.c +++ b/net/socket.c @@ -52,6 +52,7 @@ * Based upon Swansea University Computer Society NET3.039 */ +#include <linux/bpf-cgroup.h> #include <linux/ethtool.h> #include <linux/mm.h> #include <linux/socket.h> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 4d6e33bbd446..c19569819866 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -89,6 +89,7 @@ #include <linux/socket.h> #include <linux/un.h> #include <linux/fcntl.h> +#include <linux/filter.h> #include <linux/termios.h> #include <linux/sockios.h> #include <linux/net.h> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index ed0df839c38c..3235261f138d 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -85,6 +85,7 @@ * TCP_LISTEN - listening */ +#include <linux/compat.h> #include <linux/types.h> #include <linux/bitops.h> #include <linux/cred.h> diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c index 2e48d0e094d9..65b53fb3de13 100644 --- a/net/xdp/xskmap.c +++ b/net/xdp/xskmap.c @@ -4,6 +4,7 @@ */ #include <linux/bpf.h> +#include <linux/filter.h> #include <linux/capability.h> #include <net/xdp_sock.h> #include <linux/slab.h> diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index a2f4001221d1..0407272a990c 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -14,6 +14,7 @@ * */ +#include <linux/compat.h> #include <linux/workqueue.h> #include <net/xfrm.h> #include <linux/pfkeyv2.h> diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 7c36cc1f3d79..e3e26f4da6c2 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -11,6 +11,7 @@ * */ +#include <linux/compat.h> #include <linux/crypto.h> #include <linux/module.h> #include <linux/kernel.h> diff --git a/samples/bpf/hbm.c b/samples/bpf/hbm.c index b0c18efe7928..1fe5bcafb3bc 100644 --- a/samples/bpf/hbm.c +++ b/samples/bpf/hbm.c @@ -120,6 +120,9 @@ static void do_error(char *msg, bool errno_flag) static int prog_load(char *prog) { + struct bpf_program *pos; + const char *sec_name; + obj = bpf_object__open_file(prog, NULL); if (libbpf_get_error(obj)) { printf("ERROR: opening BPF object file failed\n"); @@ -132,7 +135,13 @@ static int prog_load(char *prog) goto err; } - bpf_prog = bpf_object__find_program_by_title(obj, "cgroup_skb/egress"); + bpf_object__for_each_program(pos, obj) { + sec_name = bpf_program__section_name(pos); + if (sec_name && !strcmp(sec_name, "cgroup_skb/egress")) { + bpf_prog = pos; + break; + } + } if (!bpf_prog) { printf("ERROR: finding a prog in obj file failed\n"); goto err; diff --git a/samples/bpf/xdp_fwd_user.c b/samples/bpf/xdp_fwd_user.c index 00061261a8da..4ad896782f77 100644 --- a/samples/bpf/xdp_fwd_user.c +++ b/samples/bpf/xdp_fwd_user.c @@ -79,7 +79,9 @@ int main(int argc, char **argv) .prog_type = BPF_PROG_TYPE_XDP, }; const char *prog_name = "xdp_fwd"; - struct bpf_program *prog; + struct bpf_program *prog = NULL; + struct bpf_program *pos; + const char *sec_name; int prog_fd, map_fd = -1; char filename[PATH_MAX]; struct bpf_object *obj; @@ -134,7 +136,13 @@ int main(int argc, char **argv) return 1; } - prog = bpf_object__find_program_by_title(obj, prog_name); + bpf_object__for_each_program(pos, obj) { + sec_name = bpf_program__section_name(pos); + if (sec_name && !strcmp(sec_name, prog_name)) { + prog = pos; + break; + } + } prog_fd = bpf_program__fd(prog); if (prog_fd < 0) { printf("program not found: %s\n", strerror(prog_fd)); diff --git a/security/device_cgroup.c b/security/device_cgroup.c index 04375df52fc9..842889f3dcb7 100644 --- a/security/device_cgroup.c +++ b/security/device_cgroup.c @@ -5,6 +5,7 @@ * Copyright 2007 IBM Corp */ +#include <linux/bpf-cgroup.h> #include <linux/device_cgroup.h> #include <linux/cgroup.h> #include <linux/ctype.h> diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index 42eb8eee3d89..83369f55df61 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -57,7 +57,7 @@ $(LIBBPF_INTERNAL_HDRS): $(LIBBPF_HDRS_DIR)/%.h: $(BPF_DIR)/%.h | $(LIBBPF_HDRS_ $(LIBBPF_BOOTSTRAP): $(wildcard $(BPF_DIR)/*.[ch] $(BPF_DIR)/Makefile) | $(LIBBPF_BOOTSTRAP_OUTPUT) $(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(LIBBPF_BOOTSTRAP_OUTPUT) \ DESTDIR=$(LIBBPF_BOOTSTRAP_DESTDIR) prefix= \ - ARCH= CC=$(HOSTCC) LD=$(HOSTLD) $@ install_headers + ARCH= CROSS_COMPILE= CC=$(HOSTCC) LD=$(HOSTLD) $@ install_headers $(LIBBPF_BOOTSTRAP_INTERNAL_HDRS): $(LIBBPF_BOOTSTRAP_HDRS_DIR)/%.h: $(BPF_DIR)/%.h | $(LIBBPF_BOOTSTRAP_HDRS_DIR) $(call QUIET_INSTALL, $@) @@ -152,6 +152,9 @@ CFLAGS += -DHAVE_LIBBFD_SUPPORT SRCS += $(BFD_SRCS) endif +HOST_CFLAGS = $(subst -I$(LIBBPF_INCLUDE),-I$(LIBBPF_BOOTSTRAP_INCLUDE),\ + $(subst $(CLANG_CROSS_FLAGS),,$(CFLAGS))) + BPFTOOL_BOOTSTRAP := $(BOOTSTRAP_OUTPUT)bpftool BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o xlated_dumper.o btf_dumper.o disasm.o) @@ -202,7 +205,7 @@ endif CFLAGS += $(if $(BUILD_BPF_SKELS),,-DBPFTOOL_WITHOUT_SKELETONS) $(BOOTSTRAP_OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c - $(QUIET_CC)$(HOSTCC) $(CFLAGS) -c -MMD $< -o $@ + $(QUIET_CC)$(HOSTCC) $(HOST_CFLAGS) -c -MMD $< -o $@ $(OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c $(QUIET_CC)$(CC) $(CFLAGS) -c -MMD $< -o $@ @@ -213,15 +216,13 @@ ifneq ($(feature-zlib), 1) endif $(BPFTOOL_BOOTSTRAP): $(BOOTSTRAP_OBJS) $(LIBBPF_BOOTSTRAP) - $(QUIET_LINK)$(HOSTCC) $(CFLAGS) $(LDFLAGS) $(BOOTSTRAP_OBJS) $(LIBS_BOOTSTRAP) -o $@ + $(QUIET_LINK)$(HOSTCC) $(HOST_CFLAGS) $(LDFLAGS) $(BOOTSTRAP_OBJS) $(LIBS_BOOTSTRAP) -o $@ $(OUTPUT)bpftool: $(OBJS) $(LIBBPF) $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $(OBJS) $(LIBS) -o $@ $(BOOTSTRAP_OUTPUT)%.o: %.c $(LIBBPF_BOOTSTRAP_INTERNAL_HDRS) | $(BOOTSTRAP_OUTPUT) - $(QUIET_CC)$(HOSTCC) \ - $(subst -I$(LIBBPF_INCLUDE),-I$(LIBBPF_BOOTSTRAP_INCLUDE),$(CFLAGS)) \ - -c -MMD $< -o $@ + $(QUIET_CC)$(HOSTCC) $(HOST_CFLAGS) -c -MMD $< -o $@ $(OUTPUT)%.o: %.c $(QUIET_CC)$(CC) $(CFLAGS) -c -MMD $< -o $@ diff --git a/tools/bpf/bpftool/feature.c b/tools/bpf/bpftool/feature.c index 5397077d0d9e..6719b9282eca 100644 --- a/tools/bpf/bpftool/feature.c +++ b/tools/bpf/bpftool/feature.c @@ -642,12 +642,32 @@ probe_helpers_for_progtype(enum bpf_prog_type prog_type, bool supported_type, printf("\n"); } -static void -probe_large_insn_limit(const char *define_prefix, __u32 ifindex) +/* + * Probe for availability of kernel commit (5.3): + * + * c04c0d2b968a ("bpf: increase complexity limit and maximum program size") + */ +static void probe_large_insn_limit(const char *define_prefix, __u32 ifindex) { + LIBBPF_OPTS(bpf_prog_load_opts, opts, + .prog_ifindex = ifindex, + ); + struct bpf_insn insns[BPF_MAXINSNS + 1]; bool res; + int i, fd; + + for (i = 0; i < BPF_MAXINSNS; i++) + insns[i] = BPF_MOV64_IMM(BPF_REG_0, 1); + insns[BPF_MAXINSNS] = BPF_EXIT_INSN(); + + errno = 0; + fd = bpf_prog_load(BPF_PROG_TYPE_SCHED_CLS, NULL, "GPL", + insns, ARRAY_SIZE(insns), &opts); + res = fd >= 0 || (errno != E2BIG && errno != EINVAL); + + if (fd >= 0) + close(fd); - res = bpf_probe_large_insn_limit(ifindex); print_bool_feature("have_large_insn_limit", "Large program size limit", "LARGE_INSN_LIMIT", diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c index 8b71500e7cb2..020e91a542d5 100644 --- a/tools/bpf/bpftool/main.c +++ b/tools/bpf/bpftool/main.c @@ -408,6 +408,8 @@ int main(int argc, char **argv) bool version_requested = false; int opt, ret; + setlinebuf(stdout); + last_do_help = do_help; pretty_output = false; json_output = false; diff --git a/tools/bpf/resolve_btfids/Makefile b/tools/bpf/resolve_btfids/Makefile index 751643f860b2..9ddeca947635 100644 --- a/tools/bpf/resolve_btfids/Makefile +++ b/tools/bpf/resolve_btfids/Makefile @@ -19,6 +19,7 @@ CC = $(HOSTCC) LD = $(HOSTLD) ARCH = $(HOSTARCH) RM ?= rm +CROSS_COMPILE = OUTPUT ?= $(srctree)/tools/bpf/resolve_btfids/ diff --git a/tools/bpf/runqslower/Makefile b/tools/bpf/runqslower/Makefile index 8791d0e2762b..da6de16a3dfb 100644 --- a/tools/bpf/runqslower/Makefile +++ b/tools/bpf/runqslower/Makefile @@ -12,7 +12,7 @@ BPFOBJ := $(BPFOBJ_OUTPUT)libbpf.a BPF_DESTDIR := $(BPFOBJ_OUTPUT) BPF_INCLUDE := $(BPF_DESTDIR)/include INCLUDES := -I$(OUTPUT) -I$(BPF_INCLUDE) -I$(abspath ../../include/uapi) -CFLAGS := -g -Wall +CFLAGS := -g -Wall $(CLANG_CROSS_FLAGS) # Try to detect best kernel BTF source KERNEL_REL := $(shell uname -r) @@ -88,4 +88,4 @@ $(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(BPFOBJ_OU $(DEFAULT_BPFTOOL): $(BPFOBJ) | $(BPFTOOL_OUTPUT) $(Q)$(MAKE) $(submake_extras) -C ../bpftool OUTPUT=$(BPFTOOL_OUTPUT) \ - CC=$(HOSTCC) LD=$(HOSTLD) + ARCH= CROSS_COMPILE= CC=$(HOSTCC) LD=$(HOSTLD) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index c26871263f1f..b0383d371b9a 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4983,6 +4983,41 @@ union bpf_attr { * Return * The number of loops performed, **-EINVAL** for invalid **flags**, * **-E2BIG** if **nr_loops** exceeds the maximum number of loops. + * + * long bpf_strncmp(const char *s1, u32 s1_sz, const char *s2) + * Description + * Do strncmp() between **s1** and **s2**. **s1** doesn't need + * to be null-terminated and **s1_sz** is the maximum storage + * size of **s1**. **s2** must be a read-only string. + * Return + * An integer less than, equal to, or greater than zero + * if the first **s1_sz** bytes of **s1** is found to be + * less than, to match, or be greater than **s2**. + * + * long bpf_get_func_arg(void *ctx, u32 n, u64 *value) + * Description + * Get **n**-th argument (zero based) of the traced function (for tracing programs) + * returned in **value**. + * + * Return + * 0 on success. + * **-EINVAL** if n >= arguments count of traced function. + * + * long bpf_get_func_ret(void *ctx, u64 *value) + * Description + * Get return value of the traced function (for tracing programs) + * in **value**. + * + * Return + * 0 on success. + * **-EOPNOTSUPP** for tracing programs other than BPF_TRACE_FEXIT or BPF_MODIFY_RETURN. + * + * long bpf_get_func_arg_cnt(void *ctx) + * Description + * Get number of arguments of the traced function (for tracing programs). + * + * Return + * The number of arguments of the traced function. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -5167,6 +5202,10 @@ union bpf_attr { FN(kallsyms_lookup_name), \ FN(find_vma), \ FN(loop), \ + FN(strncmp), \ + FN(get_func_arg), \ + FN(get_func_ret), \ + FN(get_func_arg_cnt), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 5f7086fae31c..f947b61b2107 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -90,6 +90,7 @@ override CFLAGS += -Werror -Wall override CFLAGS += $(INCLUDES) override CFLAGS += -fvisibility=hidden override CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 +override CFLAGS += $(CLANG_CROSS_FLAGS) # flags specific for shared library SHLIB_FLAGS := -DSHARED -fPIC @@ -162,7 +163,7 @@ $(BPF_HELPER_DEFS): $(srctree)/tools/include/uapi/linux/bpf.h $(OUTPUT)libbpf.so: $(OUTPUT)libbpf.so.$(LIBBPF_VERSION) $(OUTPUT)libbpf.so.$(LIBBPF_VERSION): $(BPF_IN_SHARED) $(VERSION_SCRIPT) - $(QUIET_LINK)$(CC) $(LDFLAGS) \ + $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \ --shared -Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \ -Wl,--version-script=$(VERSION_SCRIPT) $< -lelf -lz -o $@ @ln -sf $(@F) $(OUTPUT)libbpf.so diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 6b2407e12060..9b64eed2b003 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -28,7 +28,9 @@ #include <asm/unistd.h> #include <errno.h> #include <linux/bpf.h> +#include <linux/filter.h> #include <limits.h> +#include <sys/resource.h> #include "bpf.h" #include "libbpf.h" #include "libbpf_internal.h" @@ -94,6 +96,77 @@ static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size, int return fd; } +/* Probe whether kernel switched from memlock-based (RLIMIT_MEMLOCK) to + * memcg-based memory accounting for BPF maps and progs. This was done in [0]. + * We use the support for bpf_ktime_get_coarse_ns() helper, which was added in + * the same 5.11 Linux release ([1]), to detect memcg-based accounting for BPF. + * + * [0] https://lore.kernel.org/bpf/20201201215900.3569844-1-guro@fb.com/ + * [1] d05512618056 ("bpf: Add bpf_ktime_get_coarse_ns helper") + */ +int probe_memcg_account(void) +{ + const size_t prog_load_attr_sz = offsetofend(union bpf_attr, attach_btf_obj_fd); + struct bpf_insn insns[] = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_coarse_ns), + BPF_EXIT_INSN(), + }; + size_t insn_cnt = sizeof(insns) / sizeof(insns[0]); + union bpf_attr attr; + int prog_fd; + + /* attempt loading freplace trying to use custom BTF */ + memset(&attr, 0, prog_load_attr_sz); + attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER; + attr.insns = ptr_to_u64(insns); + attr.insn_cnt = insn_cnt; + attr.license = ptr_to_u64("GPL"); + + prog_fd = sys_bpf_fd(BPF_PROG_LOAD, &attr, prog_load_attr_sz); + if (prog_fd >= 0) { + close(prog_fd); + return 1; + } + return 0; +} + +static bool memlock_bumped; +static rlim_t memlock_rlim = RLIM_INFINITY; + +int libbpf_set_memlock_rlim(size_t memlock_bytes) +{ + if (memlock_bumped) + return libbpf_err(-EBUSY); + + memlock_rlim = memlock_bytes; + return 0; +} + +int bump_rlimit_memlock(void) +{ + struct rlimit rlim; + + /* this the default in libbpf 1.0, but for now user has to opt-in explicitly */ + if (!(libbpf_mode & LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK)) + return 0; + + /* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */ + if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT)) + return 0; + + memlock_bumped = true; + + /* zero memlock_rlim_max disables auto-bumping RLIMIT_MEMLOCK */ + if (memlock_rlim == 0) + return 0; + + rlim.rlim_cur = rlim.rlim_max = memlock_rlim; + if (setrlimit(RLIMIT_MEMLOCK, &rlim)) + return -errno; + + return 0; +} + int bpf_map_create(enum bpf_map_type map_type, const char *map_name, __u32 key_size, @@ -105,6 +178,8 @@ int bpf_map_create(enum bpf_map_type map_type, union bpf_attr attr; int fd; + bump_rlimit_memlock(); + memset(&attr, 0, attr_sz); if (!OPTS_VALID(opts, bpf_map_create_opts)) @@ -112,7 +187,7 @@ int bpf_map_create(enum bpf_map_type map_type, attr.map_type = map_type; if (map_name) - strncat(attr.map_name, map_name, sizeof(attr.map_name) - 1); + libbpf_strlcpy(attr.map_name, map_name, sizeof(attr.map_name)); attr.key_size = key_size; attr.value_size = value_size; attr.max_entries = max_entries; @@ -251,6 +326,8 @@ int bpf_prog_load_v0_6_0(enum bpf_prog_type prog_type, union bpf_attr attr; char *log_buf; + bump_rlimit_memlock(); + if (!OPTS_VALID(opts, bpf_prog_load_opts)) return libbpf_err(-EINVAL); @@ -271,7 +348,7 @@ int bpf_prog_load_v0_6_0(enum bpf_prog_type prog_type, attr.kern_version = OPTS_GET(opts, kern_version, 0); if (prog_name) - strncat(attr.prog_name, prog_name, sizeof(attr.prog_name) - 1); + libbpf_strlcpy(attr.prog_name, prog_name, sizeof(attr.prog_name)); attr.license = ptr_to_u64(license); if (insn_cnt > UINT_MAX) @@ -456,6 +533,8 @@ int bpf_verify_program(enum bpf_prog_type type, const struct bpf_insn *insns, union bpf_attr attr; int fd; + bump_rlimit_memlock(); + memset(&attr, 0, sizeof(attr)); attr.prog_type = type; attr.insn_cnt = (__u32)insns_cnt; @@ -1056,6 +1135,8 @@ int bpf_btf_load(const void *btf_data, size_t btf_size, const struct bpf_btf_loa __u32 log_level; int fd; + bump_rlimit_memlock(); + memset(&attr, 0, attr_sz); if (!OPTS_VALID(opts, bpf_btf_load_opts)) diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 94e553a0ff9d..00619f64a040 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -35,6 +35,8 @@ extern "C" { #endif +int libbpf_set_memlock_rlim(size_t memlock_bytes); + struct bpf_map_create_opts { size_t sz; /* size of this struct for forward/backward compatibility */ diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h index db05a5937105..90f56b0f585f 100644 --- a/tools/lib/bpf/bpf_tracing.h +++ b/tools/lib/bpf/bpf_tracing.h @@ -66,277 +66,204 @@ #if defined(__KERNEL__) || defined(__VMLINUX_H__) -#define PT_REGS_PARM1(x) ((x)->di) -#define PT_REGS_PARM2(x) ((x)->si) -#define PT_REGS_PARM3(x) ((x)->dx) -#define PT_REGS_PARM4(x) ((x)->cx) -#define PT_REGS_PARM5(x) ((x)->r8) -#define PT_REGS_RET(x) ((x)->sp) -#define PT_REGS_FP(x) ((x)->bp) -#define PT_REGS_RC(x) ((x)->ax) -#define PT_REGS_SP(x) ((x)->sp) -#define PT_REGS_IP(x) ((x)->ip) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), di) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), si) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), dx) -#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), cx) -#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8) -#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), sp) -#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), bp) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), ax) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), sp) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), ip) +#define __PT_PARM1_REG di +#define __PT_PARM2_REG si +#define __PT_PARM3_REG dx +#define __PT_PARM4_REG cx +#define __PT_PARM5_REG r8 +#define __PT_RET_REG sp +#define __PT_FP_REG bp +#define __PT_RC_REG ax +#define __PT_SP_REG sp +#define __PT_IP_REG ip #else #ifdef __i386__ -/* i386 kernel is built with -mregparm=3 */ -#define PT_REGS_PARM1(x) ((x)->eax) -#define PT_REGS_PARM2(x) ((x)->edx) -#define PT_REGS_PARM3(x) ((x)->ecx) -#define PT_REGS_PARM4(x) 0 -#define PT_REGS_PARM5(x) 0 -#define PT_REGS_RET(x) ((x)->esp) -#define PT_REGS_FP(x) ((x)->ebp) -#define PT_REGS_RC(x) ((x)->eax) -#define PT_REGS_SP(x) ((x)->esp) -#define PT_REGS_IP(x) ((x)->eip) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), eax) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), edx) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), ecx) -#define PT_REGS_PARM4_CORE(x) 0 -#define PT_REGS_PARM5_CORE(x) 0 -#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), esp) -#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), ebp) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), eax) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), esp) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), eip) - -#else -#define PT_REGS_PARM1(x) ((x)->rdi) -#define PT_REGS_PARM2(x) ((x)->rsi) -#define PT_REGS_PARM3(x) ((x)->rdx) -#define PT_REGS_PARM4(x) ((x)->rcx) -#define PT_REGS_PARM5(x) ((x)->r8) -#define PT_REGS_RET(x) ((x)->rsp) -#define PT_REGS_FP(x) ((x)->rbp) -#define PT_REGS_RC(x) ((x)->rax) -#define PT_REGS_SP(x) ((x)->rsp) -#define PT_REGS_IP(x) ((x)->rip) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), rdi) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), rsi) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), rdx) -#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), rcx) -#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8) -#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), rsp) -#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), rbp) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), rax) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), rsp) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), rip) - -#endif -#endif +#define __PT_PARM1_REG eax +#define __PT_PARM2_REG edx +#define __PT_PARM3_REG ecx +/* i386 kernel is built with -mregparm=3 */ +#define __PT_PARM4_REG __unsupported__ +#define __PT_PARM5_REG __unsupported__ +#define __PT_RET_REG esp +#define __PT_FP_REG ebp +#define __PT_RC_REG eax +#define __PT_SP_REG esp +#define __PT_IP_REG eip + +#else /* __i386__ */ + +#define __PT_PARM1_REG rdi +#define __PT_PARM2_REG rsi +#define __PT_PARM3_REG rdx +#define __PT_PARM4_REG rcx +#define __PT_PARM5_REG r8 +#define __PT_RET_REG rsp +#define __PT_FP_REG rbp +#define __PT_RC_REG rax +#define __PT_SP_REG rsp +#define __PT_IP_REG rip + +#endif /* __i386__ */ + +#endif /* __KERNEL__ || __VMLINUX_H__ */ #elif defined(bpf_target_s390) /* s390 provides user_pt_regs instead of struct pt_regs to userspace */ -struct pt_regs; -#define PT_REGS_S390 const volatile user_pt_regs -#define PT_REGS_PARM1(x) (((PT_REGS_S390 *)(x))->gprs[2]) -#define PT_REGS_PARM2(x) (((PT_REGS_S390 *)(x))->gprs[3]) -#define PT_REGS_PARM3(x) (((PT_REGS_S390 *)(x))->gprs[4]) -#define PT_REGS_PARM4(x) (((PT_REGS_S390 *)(x))->gprs[5]) -#define PT_REGS_PARM5(x) (((PT_REGS_S390 *)(x))->gprs[6]) -#define PT_REGS_RET(x) (((PT_REGS_S390 *)(x))->gprs[14]) -/* Works only with CONFIG_FRAME_POINTER */ -#define PT_REGS_FP(x) (((PT_REGS_S390 *)(x))->gprs[11]) -#define PT_REGS_RC(x) (((PT_REGS_S390 *)(x))->gprs[2]) -#define PT_REGS_SP(x) (((PT_REGS_S390 *)(x))->gprs[15]) -#define PT_REGS_IP(x) (((PT_REGS_S390 *)(x))->psw.addr) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2]) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[3]) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[4]) -#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[5]) -#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[6]) -#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[14]) -#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[11]) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2]) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[15]) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), psw.addr) +#define __PT_REGS_CAST(x) ((const user_pt_regs *)(x)) +#define __PT_PARM1_REG gprs[2] +#define __PT_PARM2_REG gprs[3] +#define __PT_PARM3_REG gprs[4] +#define __PT_PARM4_REG gprs[5] +#define __PT_PARM5_REG gprs[6] +#define __PT_RET_REG grps[14] +#define __PT_FP_REG gprs[11] /* Works only with CONFIG_FRAME_POINTER */ +#define __PT_RC_REG gprs[2] +#define __PT_SP_REG gprs[15] +#define __PT_IP_REG psw.addr #elif defined(bpf_target_arm) -#define PT_REGS_PARM1(x) ((x)->uregs[0]) -#define PT_REGS_PARM2(x) ((x)->uregs[1]) -#define PT_REGS_PARM3(x) ((x)->uregs[2]) -#define PT_REGS_PARM4(x) ((x)->uregs[3]) -#define PT_REGS_PARM5(x) ((x)->uregs[4]) -#define PT_REGS_RET(x) ((x)->uregs[14]) -#define PT_REGS_FP(x) ((x)->uregs[11]) /* Works only with CONFIG_FRAME_POINTER */ -#define PT_REGS_RC(x) ((x)->uregs[0]) -#define PT_REGS_SP(x) ((x)->uregs[13]) -#define PT_REGS_IP(x) ((x)->uregs[12]) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), uregs[0]) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), uregs[1]) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), uregs[2]) -#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), uregs[3]) -#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), uregs[4]) -#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), uregs[14]) -#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), uregs[11]) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), uregs[0]) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), uregs[13]) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), uregs[12]) +#define __PT_PARM1_REG uregs[0] +#define __PT_PARM2_REG uregs[1] +#define __PT_PARM3_REG uregs[2] +#define __PT_PARM4_REG uregs[3] +#define __PT_PARM5_REG uregs[4] +#define __PT_RET_REG uregs[14] +#define __PT_FP_REG uregs[11] /* Works only with CONFIG_FRAME_POINTER */ +#define __PT_RC_REG uregs[0] +#define __PT_SP_REG uregs[13] +#define __PT_IP_REG uregs[12] #elif defined(bpf_target_arm64) /* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */ -struct pt_regs; -#define PT_REGS_ARM64 const volatile struct user_pt_regs -#define PT_REGS_PARM1(x) (((PT_REGS_ARM64 *)(x))->regs[0]) -#define PT_REGS_PARM2(x) (((PT_REGS_ARM64 *)(x))->regs[1]) -#define PT_REGS_PARM3(x) (((PT_REGS_ARM64 *)(x))->regs[2]) -#define PT_REGS_PARM4(x) (((PT_REGS_ARM64 *)(x))->regs[3]) -#define PT_REGS_PARM5(x) (((PT_REGS_ARM64 *)(x))->regs[4]) -#define PT_REGS_RET(x) (((PT_REGS_ARM64 *)(x))->regs[30]) -/* Works only with CONFIG_FRAME_POINTER */ -#define PT_REGS_FP(x) (((PT_REGS_ARM64 *)(x))->regs[29]) -#define PT_REGS_RC(x) (((PT_REGS_ARM64 *)(x))->regs[0]) -#define PT_REGS_SP(x) (((PT_REGS_ARM64 *)(x))->sp) -#define PT_REGS_IP(x) (((PT_REGS_ARM64 *)(x))->pc) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[0]) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[1]) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[2]) -#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[3]) -#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[4]) -#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[30]) -#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[29]) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[0]) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), sp) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), pc) +#define __PT_REGS_CAST(x) ((const struct user_pt_regs *)(x)) +#define __PT_PARM1_REG regs[0] +#define __PT_PARM2_REG regs[1] +#define __PT_PARM3_REG regs[2] +#define __PT_PARM4_REG regs[3] +#define __PT_PARM5_REG regs[4] +#define __PT_RET_REG regs[30] +#define __PT_FP_REG regs[29] /* Works only with CONFIG_FRAME_POINTER */ +#define __PT_RC_REG regs[0] +#define __PT_SP_REG sp +#define __PT_IP_REG pc #elif defined(bpf_target_mips) -#define PT_REGS_PARM1(x) ((x)->regs[4]) -#define PT_REGS_PARM2(x) ((x)->regs[5]) -#define PT_REGS_PARM3(x) ((x)->regs[6]) -#define PT_REGS_PARM4(x) ((x)->regs[7]) -#define PT_REGS_PARM5(x) ((x)->regs[8]) -#define PT_REGS_RET(x) ((x)->regs[31]) -#define PT_REGS_FP(x) ((x)->regs[30]) /* Works only with CONFIG_FRAME_POINTER */ -#define PT_REGS_RC(x) ((x)->regs[2]) -#define PT_REGS_SP(x) ((x)->regs[29]) -#define PT_REGS_IP(x) ((x)->cp0_epc) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), regs[4]) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), regs[5]) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), regs[6]) -#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), regs[7]) -#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), regs[8]) -#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), regs[31]) -#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), regs[30]) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), regs[2]) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), regs[29]) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), cp0_epc) +#define __PT_PARM1_REG regs[4] +#define __PT_PARM2_REG regs[5] +#define __PT_PARM3_REG regs[6] +#define __PT_PARM4_REG regs[7] +#define __PT_PARM5_REG regs[8] +#define __PT_RET_REG regs[31] +#define __PT_FP_REG regs[30] /* Works only with CONFIG_FRAME_POINTER */ +#define __PT_RC_REG regs[2] +#define __PT_SP_REG regs[29] +#define __PT_IP_REG cp0_epc #elif defined(bpf_target_powerpc) -#define PT_REGS_PARM1(x) ((x)->gpr[3]) -#define PT_REGS_PARM2(x) ((x)->gpr[4]) -#define PT_REGS_PARM3(x) ((x)->gpr[5]) -#define PT_REGS_PARM4(x) ((x)->gpr[6]) -#define PT_REGS_PARM5(x) ((x)->gpr[7]) -#define PT_REGS_RC(x) ((x)->gpr[3]) -#define PT_REGS_SP(x) ((x)->sp) -#define PT_REGS_IP(x) ((x)->nip) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), gpr[3]) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), gpr[4]) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), gpr[5]) -#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), gpr[6]) -#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), gpr[7]) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), gpr[3]) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), sp) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), nip) +#define __PT_PARM1_REG gpr[3] +#define __PT_PARM2_REG gpr[4] +#define __PT_PARM3_REG gpr[5] +#define __PT_PARM4_REG gpr[6] +#define __PT_PARM5_REG gpr[7] +#define __PT_RET_REG regs[31] +#define __PT_FP_REG __unsupported__ +#define __PT_RC_REG gpr[3] +#define __PT_SP_REG sp +#define __PT_IP_REG nip #elif defined(bpf_target_sparc) -#define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0]) -#define PT_REGS_PARM2(x) ((x)->u_regs[UREG_I1]) -#define PT_REGS_PARM3(x) ((x)->u_regs[UREG_I2]) -#define PT_REGS_PARM4(x) ((x)->u_regs[UREG_I3]) -#define PT_REGS_PARM5(x) ((x)->u_regs[UREG_I4]) -#define PT_REGS_RET(x) ((x)->u_regs[UREG_I7]) -#define PT_REGS_RC(x) ((x)->u_regs[UREG_I0]) -#define PT_REGS_SP(x) ((x)->u_regs[UREG_FP]) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I0]) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I1]) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I2]) -#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I3]) -#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I4]) -#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I7]) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I0]) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), u_regs[UREG_FP]) - +#define __PT_PARM1_REG u_regs[UREG_I0] +#define __PT_PARM2_REG u_regs[UREG_I1] +#define __PT_PARM3_REG u_regs[UREG_I2] +#define __PT_PARM4_REG u_regs[UREG_I3] +#define __PT_PARM5_REG u_regs[UREG_I4] +#define __PT_RET_REG u_regs[UREG_I7] +#define __PT_FP_REG __unsupported__ +#define __PT_RC_REG u_regs[UREG_I0] +#define __PT_SP_REG u_regs[UREG_FP] /* Should this also be a bpf_target check for the sparc case? */ #if defined(__arch64__) -#define PT_REGS_IP(x) ((x)->tpc) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), tpc) +#define __PT_IP_REG tpc #else -#define PT_REGS_IP(x) ((x)->pc) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), pc) +#define __PT_IP_REG pc #endif #elif defined(bpf_target_riscv) +#define __PT_REGS_CAST(x) ((const struct user_regs_struct *)(x)) +#define __PT_PARM1_REG a0 +#define __PT_PARM2_REG a1 +#define __PT_PARM3_REG a2 +#define __PT_PARM4_REG a3 +#define __PT_PARM5_REG a4 +#define __PT_RET_REG ra +#define __PT_FP_REG fp +#define __PT_RC_REG a5 +#define __PT_SP_REG sp +#define __PT_IP_REG epc + +#endif + +#if defined(bpf_target_defined) + struct pt_regs; -#define PT_REGS_RV const volatile struct user_regs_struct -#define PT_REGS_PARM1(x) (((PT_REGS_RV *)(x))->a0) -#define PT_REGS_PARM2(x) (((PT_REGS_RV *)(x))->a1) -#define PT_REGS_PARM3(x) (((PT_REGS_RV *)(x))->a2) -#define PT_REGS_PARM4(x) (((PT_REGS_RV *)(x))->a3) -#define PT_REGS_PARM5(x) (((PT_REGS_RV *)(x))->a4) -#define PT_REGS_RET(x) (((PT_REGS_RV *)(x))->ra) -#define PT_REGS_FP(x) (((PT_REGS_RV *)(x))->s5) -#define PT_REGS_RC(x) (((PT_REGS_RV *)(x))->a5) -#define PT_REGS_SP(x) (((PT_REGS_RV *)(x))->sp) -#define PT_REGS_IP(x) (((PT_REGS_RV *)(x))->epc) - -#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), a0) -#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), a1) -#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), a2) -#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), a3) -#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), a4) -#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), ra) -#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), fp) -#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), a5) -#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), sp) -#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_RV *)(x), epc) +/* allow some architecutres to override `struct pt_regs` */ +#ifndef __PT_REGS_CAST +#define __PT_REGS_CAST(x) (x) #endif +#define PT_REGS_PARM1(x) (__PT_REGS_CAST(x)->__PT_PARM1_REG) +#define PT_REGS_PARM2(x) (__PT_REGS_CAST(x)->__PT_PARM2_REG) +#define PT_REGS_PARM3(x) (__PT_REGS_CAST(x)->__PT_PARM3_REG) +#define PT_REGS_PARM4(x) (__PT_REGS_CAST(x)->__PT_PARM4_REG) +#define PT_REGS_PARM5(x) (__PT_REGS_CAST(x)->__PT_PARM5_REG) +#define PT_REGS_RET(x) (__PT_REGS_CAST(x)->__PT_RET_REG) +#define PT_REGS_FP(x) (__PT_REGS_CAST(x)->__PT_FP_REG) +#define PT_REGS_RC(x) (__PT_REGS_CAST(x)->__PT_RC_REG) +#define PT_REGS_SP(x) (__PT_REGS_CAST(x)->__PT_SP_REG) +#define PT_REGS_IP(x) (__PT_REGS_CAST(x)->__PT_IP_REG) + +#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM1_REG) +#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM2_REG) +#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM3_REG) +#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM4_REG) +#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM5_REG) +#define PT_REGS_RET_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_RET_REG) +#define PT_REGS_FP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_FP_REG) +#define PT_REGS_RC_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_RC_REG) +#define PT_REGS_SP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_SP_REG) +#define PT_REGS_IP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_IP_REG) + #if defined(bpf_target_powerpc) + #define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = (ctx)->link; }) #define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP + #elif defined(bpf_target_sparc) + #define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = PT_REGS_RET(ctx); }) #define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP -#elif defined(bpf_target_defined) + +#else + #define BPF_KPROBE_READ_RET_IP(ip, ctx) \ ({ bpf_probe_read_kernel(&(ip), sizeof(ip), (void *)PT_REGS_RET(ctx)); }) #define BPF_KRETPROBE_READ_RET_IP(ip, ctx) \ - ({ bpf_probe_read_kernel(&(ip), sizeof(ip), \ - (void *)(PT_REGS_FP(ctx) + sizeof(ip))); }) + ({ bpf_probe_read_kernel(&(ip), sizeof(ip), (void *)(PT_REGS_FP(ctx) + sizeof(ip))); }) + #endif -#if !defined(bpf_target_defined) +#else /* defined(bpf_target_defined) */ #define PT_REGS_PARM1(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM2(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) @@ -363,7 +290,7 @@ struct pt_regs; #define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define BPF_KRETPROBE_READ_RET_IP(ip, ctx) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) -#endif /* !defined(bpf_target_defined) */ +#endif /* defined(bpf_target_defined) */ #ifndef ___bpf_concat #define ___bpf_concat(a, b) a ## b @@ -375,25 +302,23 @@ struct pt_regs; #define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N #endif #ifndef ___bpf_narg -#define ___bpf_narg(...) \ - ___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0) +#define ___bpf_narg(...) ___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0) #endif -#define ___bpf_ctx_cast0() ctx -#define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0] -#define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), (void *)ctx[1] -#define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2] -#define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), (void *)ctx[3] -#define ___bpf_ctx_cast5(x, args...) ___bpf_ctx_cast4(args), (void *)ctx[4] -#define ___bpf_ctx_cast6(x, args...) ___bpf_ctx_cast5(args), (void *)ctx[5] -#define ___bpf_ctx_cast7(x, args...) ___bpf_ctx_cast6(args), (void *)ctx[6] -#define ___bpf_ctx_cast8(x, args...) ___bpf_ctx_cast7(args), (void *)ctx[7] -#define ___bpf_ctx_cast9(x, args...) ___bpf_ctx_cast8(args), (void *)ctx[8] +#define ___bpf_ctx_cast0() ctx +#define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0] +#define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), (void *)ctx[1] +#define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2] +#define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), (void *)ctx[3] +#define ___bpf_ctx_cast5(x, args...) ___bpf_ctx_cast4(args), (void *)ctx[4] +#define ___bpf_ctx_cast6(x, args...) ___bpf_ctx_cast5(args), (void *)ctx[5] +#define ___bpf_ctx_cast7(x, args...) ___bpf_ctx_cast6(args), (void *)ctx[6] +#define ___bpf_ctx_cast8(x, args...) ___bpf_ctx_cast7(args), (void *)ctx[7] +#define ___bpf_ctx_cast9(x, args...) ___bpf_ctx_cast8(args), (void *)ctx[8] #define ___bpf_ctx_cast10(x, args...) ___bpf_ctx_cast9(args), (void *)ctx[9] #define ___bpf_ctx_cast11(x, args...) ___bpf_ctx_cast10(args), (void *)ctx[10] #define ___bpf_ctx_cast12(x, args...) ___bpf_ctx_cast11(args), (void *)ctx[11] -#define ___bpf_ctx_cast(args...) \ - ___bpf_apply(___bpf_ctx_cast, ___bpf_narg(args))(args) +#define ___bpf_ctx_cast(args...) ___bpf_apply(___bpf_ctx_cast, ___bpf_narg(args))(args) /* * BPF_PROG is a convenience wrapper for generic tp_btf/fentry/fexit and @@ -426,19 +351,13 @@ ____##name(unsigned long long *ctx, ##args) struct pt_regs; -#define ___bpf_kprobe_args0() ctx -#define ___bpf_kprobe_args1(x) \ - ___bpf_kprobe_args0(), (void *)PT_REGS_PARM1(ctx) -#define ___bpf_kprobe_args2(x, args...) \ - ___bpf_kprobe_args1(args), (void *)PT_REGS_PARM2(ctx) -#define ___bpf_kprobe_args3(x, args...) \ - ___bpf_kprobe_args2(args), (void *)PT_REGS_PARM3(ctx) -#define ___bpf_kprobe_args4(x, args...) \ - ___bpf_kprobe_args3(args), (void *)PT_REGS_PARM4(ctx) -#define ___bpf_kprobe_args5(x, args...) \ - ___bpf_kprobe_args4(args), (void *)PT_REGS_PARM5(ctx) -#define ___bpf_kprobe_args(args...) \ - ___bpf_apply(___bpf_kprobe_args, ___bpf_narg(args))(args) +#define ___bpf_kprobe_args0() ctx +#define ___bpf_kprobe_args1(x) ___bpf_kprobe_args0(), (void *)PT_REGS_PARM1(ctx) +#define ___bpf_kprobe_args2(x, args...) ___bpf_kprobe_args1(args), (void *)PT_REGS_PARM2(ctx) +#define ___bpf_kprobe_args3(x, args...) ___bpf_kprobe_args2(args), (void *)PT_REGS_PARM3(ctx) +#define ___bpf_kprobe_args4(x, args...) ___bpf_kprobe_args3(args), (void *)PT_REGS_PARM4(ctx) +#define ___bpf_kprobe_args5(x, args...) ___bpf_kprobe_args4(args), (void *)PT_REGS_PARM5(ctx) +#define ___bpf_kprobe_args(args...) ___bpf_apply(___bpf_kprobe_args, ___bpf_narg(args))(args) /* * BPF_KPROBE serves the same purpose for kprobes as BPF_PROG for @@ -464,11 +383,9 @@ typeof(name(0)) name(struct pt_regs *ctx) \ static __attribute__((always_inline)) typeof(name(0)) \ ____##name(struct pt_regs *ctx, ##args) -#define ___bpf_kretprobe_args0() ctx -#define ___bpf_kretprobe_args1(x) \ - ___bpf_kretprobe_args0(), (void *)PT_REGS_RC(ctx) -#define ___bpf_kretprobe_args(args...) \ - ___bpf_apply(___bpf_kretprobe_args, ___bpf_narg(args))(args) +#define ___bpf_kretprobe_args0() ctx +#define ___bpf_kretprobe_args1(x) ___bpf_kretprobe_args0(), (void *)PT_REGS_RC(ctx) +#define ___bpf_kretprobe_args(args...) ___bpf_apply(___bpf_kretprobe_args, ___bpf_narg(args))(args) /* * BPF_KRETPROBE is similar to BPF_KPROBE, except, it only provides optional diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 742a2bf71c5e..061839f04525 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -313,12 +313,18 @@ LIBBPF_API struct btf_dump *btf_dump__new_deprecated(const struct btf *btf, * * The rest works just like in case of ___libbpf_override() usage with symbol * versioning. + * + * C++ compilers don't support __builtin_types_compatible_p(), so at least + * don't screw up compilation for them and let C++ users pick btf_dump__new + * vs btf_dump__new_deprecated explicitly. */ +#ifndef __cplusplus #define btf_dump__new(a1, a2, a3, a4) __builtin_choose_expr( \ __builtin_types_compatible_p(typeof(a4), btf_dump_printf_fn_t) || \ __builtin_types_compatible_p(typeof(a4), void(void *, const char *, va_list)), \ btf_dump__new_deprecated((void *)a1, (void *)a2, (void *)a3, (void *)a4), \ btf_dump__new((void *)a1, (void *)a2, (void *)a3, (void *)a4)) +#endif LIBBPF_API void btf_dump__free(struct btf_dump *d); diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index f06a1d343c92..b9a3260c83cb 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -2321,8 +2321,8 @@ int btf_dump__dump_type_data(struct btf_dump *d, __u32 id, if (!opts->indent_str) d->typed_dump->indent_str[0] = '\t'; else - strncat(d->typed_dump->indent_str, opts->indent_str, - sizeof(d->typed_dump->indent_str) - 1); + libbpf_strlcpy(d->typed_dump->indent_str, opts->indent_str, + sizeof(d->typed_dump->indent_str)); d->typed_dump->compact = OPTS_GET(opts, compact, false); d->typed_dump->skip_names = OPTS_GET(opts, skip_names, false); diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c index 8ed02e89c9a9..8ecef1088ba2 100644 --- a/tools/lib/bpf/gen_loader.c +++ b/tools/lib/bpf/gen_loader.c @@ -371,8 +371,9 @@ int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps) { int i; - if (nr_progs != gen->nr_progs || nr_maps != gen->nr_maps) { - pr_warn("progs/maps mismatch\n"); + if (nr_progs < gen->nr_progs || nr_maps != gen->nr_maps) { + pr_warn("nr_progs %d/%d nr_maps %d/%d mismatch\n", + nr_progs, gen->nr_progs, nr_maps, gen->nr_maps); gen->error = -EFAULT; return gen->error; } @@ -462,8 +463,7 @@ void bpf_gen__map_create(struct bpf_gen *gen, attr.map_flags = map_attr->map_flags; attr.map_extra = map_attr->map_extra; if (map_name) - memcpy(attr.map_name, map_name, - min((unsigned)strlen(map_name), BPF_OBJ_NAME_LEN - 1)); + libbpf_strlcpy(attr.map_name, map_name, sizeof(attr.map_name)); attr.numa_node = map_attr->numa_node; attr.map_ifindex = map_attr->map_ifindex; attr.max_entries = max_entries; @@ -969,8 +969,7 @@ void bpf_gen__prog_load(struct bpf_gen *gen, core_relos = add_data(gen, gen->core_relos, attr.core_relo_cnt * attr.core_relo_rec_size); - memcpy(attr.prog_name, prog_name, - min((unsigned)strlen(prog_name), BPF_OBJ_NAME_LEN - 1)); + libbpf_strlcpy(attr.prog_name, prog_name, sizeof(attr.prog_name)); prog_load_attr = add_data(gen, &attr, attr_size); /* populate union bpf_attr with a pointer to license */ diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 902f1ad5b7e6..9cb99d1e2385 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -187,42 +187,6 @@ const char *libbpf_version_string(void) #undef __S } -enum kern_feature_id { - /* v4.14: kernel support for program & map names. */ - FEAT_PROG_NAME, - /* v5.2: kernel support for global data sections. */ - FEAT_GLOBAL_DATA, - /* BTF support */ - FEAT_BTF, - /* BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO support */ - FEAT_BTF_FUNC, - /* BTF_KIND_VAR and BTF_KIND_DATASEC support */ - FEAT_BTF_DATASEC, - /* BTF_FUNC_GLOBAL is supported */ - FEAT_BTF_GLOBAL_FUNC, - /* BPF_F_MMAPABLE is supported for arrays */ - FEAT_ARRAY_MMAP, - /* kernel support for expected_attach_type in BPF_PROG_LOAD */ - FEAT_EXP_ATTACH_TYPE, - /* bpf_probe_read_{kernel,user}[_str] helpers */ - FEAT_PROBE_READ_KERN, - /* BPF_PROG_BIND_MAP is supported */ - FEAT_PROG_BIND_MAP, - /* Kernel support for module BTFs */ - FEAT_MODULE_BTF, - /* BTF_KIND_FLOAT support */ - FEAT_BTF_FLOAT, - /* BPF perf link support */ - FEAT_PERF_LINK, - /* BTF_KIND_DECL_TAG support */ - FEAT_BTF_DECL_TAG, - /* BTF_KIND_TYPE_TAG support */ - FEAT_BTF_TYPE_TAG, - __FEAT_CNT, -}; - -static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id); - enum reloc_type { RELO_LD64, RELO_CALL, @@ -831,11 +795,36 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, return 0; } -static __u32 get_kernel_version(void) +__u32 get_kernel_version(void) { + /* On Ubuntu LINUX_VERSION_CODE doesn't correspond to info.release, + * but Ubuntu provides /proc/version_signature file, as described at + * https://ubuntu.com/kernel, with an example contents below, which we + * can use to get a proper LINUX_VERSION_CODE. + * + * Ubuntu 5.4.0-12.15-generic 5.4.8 + * + * In the above, 5.4.8 is what kernel is actually expecting, while + * uname() call will return 5.4.0 in info.release. + */ + const char *ubuntu_kver_file = "/proc/version_signature"; __u32 major, minor, patch; struct utsname info; + if (access(ubuntu_kver_file, R_OK) == 0) { + FILE *f; + + f = fopen(ubuntu_kver_file, "r"); + if (f) { + if (fscanf(f, "%*s %*s %d.%d.%d\n", &major, &minor, &patch) == 3) { + fclose(f); + return KERNEL_VERSION(major, minor, patch); + } + fclose(f); + } + /* something went wrong, fall back to uname() approach */ + } + uname(&info); if (sscanf(info.release, "%u.%u.%u", &major, &minor, &patch) != 3) return 0; @@ -1201,12 +1190,10 @@ static struct bpf_object *bpf_object__new(const char *path, strcpy(obj->path, path); if (obj_name) { - strncpy(obj->name, obj_name, sizeof(obj->name) - 1); - obj->name[sizeof(obj->name) - 1] = 0; + libbpf_strlcpy(obj->name, obj_name, sizeof(obj->name)); } else { /* Using basename() GNU version which doesn't modify arg. */ - strncpy(obj->name, basename((void *)path), - sizeof(obj->name) - 1); + libbpf_strlcpy(obj->name, basename((void *)path), sizeof(obj->name)); end = strchr(obj->name, '.'); if (end) *end = 0; @@ -1358,7 +1345,10 @@ static int bpf_object__check_endianness(struct bpf_object *obj) static int bpf_object__init_license(struct bpf_object *obj, void *data, size_t size) { - memcpy(obj->license, data, min(size, sizeof(obj->license) - 1)); + /* libbpf_strlcpy() only copies first N - 1 bytes, so size + 1 won't + * go over allowed ELF data section buffer + */ + libbpf_strlcpy(obj->license, data, min(size + 1, sizeof(obj->license))); pr_debug("license of %s is %s\n", obj->path, obj->license); return 0; } @@ -4354,6 +4344,10 @@ bpf_object__probe_loading(struct bpf_object *obj) if (obj->gen_loader) return 0; + ret = bump_rlimit_memlock(); + if (ret) + pr_warn("Failed to bump RLIMIT_MEMLOCK (err = %d), you might need to do it explicitly!\n", ret); + /* make sure basic loading works */ ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, NULL); if (ret < 0) @@ -4720,14 +4714,17 @@ static struct kern_feature_desc { [FEAT_BTF_TYPE_TAG] = { "BTF_KIND_TYPE_TAG support", probe_kern_btf_type_tag, }, + [FEAT_MEMCG_ACCOUNT] = { + "memcg-based memory accounting", probe_memcg_account, + }, }; -static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id) +bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id) { struct kern_feature_desc *feat = &feature_probes[feat_id]; int ret; - if (obj->gen_loader) + if (obj && obj->gen_loader) /* To generate loader program assume the latest kernel * to avoid doing extra prog_load, map_create syscalls. */ diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index a8b894dae633..85dfef88b3d2 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -227,6 +227,7 @@ struct btf; LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj); LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj); +LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__find_program_by_name() instead") LIBBPF_API struct bpf_program * bpf_object__find_program_by_title(const struct bpf_object *obj, const char *title); @@ -339,7 +340,31 @@ LIBBPF_DEPRECATED_SINCE(0, 7, "multi-instance bpf_program support is deprecated" LIBBPF_API int bpf_program__unpin_instance(struct bpf_program *prog, const char *path, int instance); + +/** + * @brief **bpf_program__pin()** pins the BPF program to a file + * in the BPF FS specified by a path. This increments the programs + * reference count, allowing it to stay loaded after the process + * which loaded it has exited. + * + * @param prog BPF program to pin, must already be loaded + * @param path file path in a BPF file system + * @return 0, on success; negative error code, otherwise + */ LIBBPF_API int bpf_program__pin(struct bpf_program *prog, const char *path); + +/** + * @brief **bpf_program__unpin()** unpins the BPF program from a file + * in the BPFFS specified by a path. This decrements the programs + * reference count. + * + * The file pinning the BPF program can also be unlinked by a different + * process in which case this function will return an error. + * + * @param prog BPF program to unpin + * @param path file path to the pin in a BPF file system + * @return 0, on success; negative error code, otherwise + */ LIBBPF_API int bpf_program__unpin(struct bpf_program *prog, const char *path); LIBBPF_API void bpf_program__unload(struct bpf_program *prog); @@ -1027,13 +1052,57 @@ bpf_prog_linfo__lfind(const struct bpf_prog_linfo *prog_linfo, * user, causing subsequent probes to fail. In this case, the caller may want * to adjust that limit with setrlimit(). */ -LIBBPF_API bool bpf_probe_prog_type(enum bpf_prog_type prog_type, - __u32 ifindex); +LIBBPF_DEPRECATED_SINCE(0, 8, "use libbpf_probe_bpf_prog_type() instead") +LIBBPF_API bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex); +LIBBPF_DEPRECATED_SINCE(0, 8, "use libbpf_probe_bpf_map_type() instead") LIBBPF_API bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex); -LIBBPF_API bool bpf_probe_helper(enum bpf_func_id id, - enum bpf_prog_type prog_type, __u32 ifindex); +LIBBPF_DEPRECATED_SINCE(0, 8, "use libbpf_probe_bpf_helper() instead") +LIBBPF_API bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type, __u32 ifindex); +LIBBPF_DEPRECATED_SINCE(0, 8, "implement your own or use bpftool for feature detection") LIBBPF_API bool bpf_probe_large_insn_limit(__u32 ifindex); +/** + * @brief **libbpf_probe_bpf_prog_type()** detects if host kernel supports + * BPF programs of a given type. + * @param prog_type BPF program type to detect kernel support for + * @param opts reserved for future extensibility, should be NULL + * @return 1, if given program type is supported; 0, if given program type is + * not supported; negative error code if feature detection failed or can't be + * performed + * + * Make sure the process has required set of CAP_* permissions (or runs as + * root) when performing feature checking. + */ +LIBBPF_API int libbpf_probe_bpf_prog_type(enum bpf_prog_type prog_type, const void *opts); +/** + * @brief **libbpf_probe_bpf_map_type()** detects if host kernel supports + * BPF maps of a given type. + * @param map_type BPF map type to detect kernel support for + * @param opts reserved for future extensibility, should be NULL + * @return 1, if given map type is supported; 0, if given map type is + * not supported; negative error code if feature detection failed or can't be + * performed + * + * Make sure the process has required set of CAP_* permissions (or runs as + * root) when performing feature checking. + */ +LIBBPF_API int libbpf_probe_bpf_map_type(enum bpf_map_type map_type, const void *opts); +/** + * @brief **libbpf_probe_bpf_helper()** detects if host kernel supports the + * use of a given BPF helper from specified BPF program type. + * @param prog_type BPF program type used to check the support of BPF helper + * @param helper_id BPF helper ID (enum bpf_func_id) to check support for + * @param opts reserved for future extensibility, should be NULL + * @return 1, if given combination of program type and helper is supported; 0, + * if the combination is not supported; negative error code if feature + * detection for provided input arguments failed or can't be performed + * + * Make sure the process has required set of CAP_* permissions (or runs as + * root) when performing feature checking. + */ +LIBBPF_API int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, + enum bpf_func_id helper_id, const void *opts); + /* * Get bpf_prog_info in continuous memory * diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 4d483af7dba6..529783967793 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -427,4 +427,8 @@ LIBBPF_0.7.0 { bpf_program__log_level; bpf_program__set_log_buf; bpf_program__set_log_level; + libbpf_probe_bpf_helper; + libbpf_probe_bpf_map_type; + libbpf_probe_bpf_prog_type; + libbpf_set_memlock_rlim_max; }; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 355c41019aed..1565679eb432 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -169,6 +169,27 @@ static inline void *libbpf_reallocarray(void *ptr, size_t nmemb, size_t size) return realloc(ptr, total); } +/* Copy up to sz - 1 bytes from zero-terminated src string and ensure that dst + * is zero-terminated string no matter what (unless sz == 0, in which case + * it's a no-op). It's conceptually close to FreeBSD's strlcpy(), but differs + * in what is returned. Given this is internal helper, it's trivial to extend + * this, when necessary. Use this instead of strncpy inside libbpf source code. + */ +static inline void libbpf_strlcpy(char *dst, const char *src, size_t sz) +{ + size_t i; + + if (sz == 0) + return; + + sz--; + for (i = 0; i < sz && src[i]; i++) + dst[i] = src[i]; + dst[i] = '\0'; +} + +__u32 get_kernel_version(void); + struct btf; struct btf_type; @@ -272,6 +293,45 @@ static inline bool libbpf_validate_opts(const char *opts, (opts)->sz - __off); \ }) +enum kern_feature_id { + /* v4.14: kernel support for program & map names. */ + FEAT_PROG_NAME, + /* v5.2: kernel support for global data sections. */ + FEAT_GLOBAL_DATA, + /* BTF support */ + FEAT_BTF, + /* BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO support */ + FEAT_BTF_FUNC, + /* BTF_KIND_VAR and BTF_KIND_DATASEC support */ + FEAT_BTF_DATASEC, + /* BTF_FUNC_GLOBAL is supported */ + FEAT_BTF_GLOBAL_FUNC, + /* BPF_F_MMAPABLE is supported for arrays */ + FEAT_ARRAY_MMAP, + /* kernel support for expected_attach_type in BPF_PROG_LOAD */ + FEAT_EXP_ATTACH_TYPE, + /* bpf_probe_read_{kernel,user}[_str] helpers */ + FEAT_PROBE_READ_KERN, + /* BPF_PROG_BIND_MAP is supported */ + FEAT_PROG_BIND_MAP, + /* Kernel support for module BTFs */ + FEAT_MODULE_BTF, + /* BTF_KIND_FLOAT support */ + FEAT_BTF_FLOAT, + /* BPF perf link support */ + FEAT_PERF_LINK, + /* BTF_KIND_DECL_TAG support */ + FEAT_BTF_DECL_TAG, + /* BTF_KIND_TYPE_TAG support */ + FEAT_BTF_TYPE_TAG, + /* memcg-based accounting for BPF maps and progs */ + FEAT_MEMCG_ACCOUNT, + __FEAT_CNT, +}; + +int probe_memcg_account(void); +bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id); +int bump_rlimit_memlock(void); int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz); int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz); diff --git a/tools/lib/bpf/libbpf_legacy.h b/tools/lib/bpf/libbpf_legacy.h index bb03c568af7b..79131f761a27 100644 --- a/tools/lib/bpf/libbpf_legacy.h +++ b/tools/lib/bpf/libbpf_legacy.h @@ -45,7 +45,6 @@ enum libbpf_strict_mode { * (positive) error code. */ LIBBPF_STRICT_DIRECT_ERRS = 0x02, - /* * Enforce strict BPF program section (SEC()) names. * E.g., while prefiously SEC("xdp_whatever") or SEC("perf_event_blah") were @@ -63,6 +62,17 @@ enum libbpf_strict_mode { * Clients can maintain it on their own if it is valuable for them. */ LIBBPF_STRICT_NO_OBJECT_LIST = 0x08, + /* + * Automatically bump RLIMIT_MEMLOCK using setrlimit() before the + * first BPF program or map creation operation. This is done only if + * kernel is too old to support memcg-based memory accounting for BPF + * subsystem. By default, RLIMIT_MEMLOCK limit is set to RLIM_INFINITY, + * but it can be overriden with libbpf_set_memlock_rlim_max() API. + * Note that libbpf_set_memlock_rlim_max() needs to be called before + * the very first bpf_prog_load(), bpf_map_create() or bpf_object__load() + * operation. + */ + LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK = 0x10, __LIBBPF_STRICT_LAST, }; diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c index 4bdec69523a7..97b06cede56f 100644 --- a/tools/lib/bpf/libbpf_probes.c +++ b/tools/lib/bpf/libbpf_probes.c @@ -48,28 +48,20 @@ static int get_vendor_id(int ifindex) return strtol(buf, NULL, 0); } -static int get_kernel_version(void) +static int probe_prog_load(enum bpf_prog_type prog_type, + const struct bpf_insn *insns, size_t insns_cnt, + char *log_buf, size_t log_buf_sz, + __u32 ifindex) { - int version, subversion, patchlevel; - struct utsname utsn; - - /* Return 0 on failure, and attempt to probe with empty kversion */ - if (uname(&utsn)) - return 0; - - if (sscanf(utsn.release, "%d.%d.%d", - &version, &subversion, &patchlevel) != 3) - return 0; - - return (version << 16) + (subversion << 8) + patchlevel; -} - -static void -probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns, - size_t insns_cnt, char *buf, size_t buf_len, __u32 ifindex) -{ - LIBBPF_OPTS(bpf_prog_load_opts, opts); - int fd; + LIBBPF_OPTS(bpf_prog_load_opts, opts, + .log_buf = log_buf, + .log_size = log_buf_sz, + .log_level = log_buf ? 1 : 0, + .prog_ifindex = ifindex, + ); + int fd, err, exp_err = 0; + const char *exp_msg = NULL; + char buf[4096]; switch (prog_type) { case BPF_PROG_TYPE_CGROUP_SOCK_ADDR: @@ -84,6 +76,38 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns, case BPF_PROG_TYPE_KPROBE: opts.kern_version = get_kernel_version(); break; + case BPF_PROG_TYPE_LIRC_MODE2: + opts.expected_attach_type = BPF_LIRC_MODE2; + break; + case BPF_PROG_TYPE_TRACING: + case BPF_PROG_TYPE_LSM: + opts.log_buf = buf; + opts.log_size = sizeof(buf); + opts.log_level = 1; + if (prog_type == BPF_PROG_TYPE_TRACING) + opts.expected_attach_type = BPF_TRACE_FENTRY; + else + opts.expected_attach_type = BPF_MODIFY_RETURN; + opts.attach_btf_id = 1; + + exp_err = -EINVAL; + exp_msg = "attach_btf_id 1 is not a function"; + break; + case BPF_PROG_TYPE_EXT: + opts.log_buf = buf; + opts.log_size = sizeof(buf); + opts.log_level = 1; + opts.attach_btf_id = 1; + + exp_err = -EINVAL; + exp_msg = "Cannot replace kernel functions"; + break; + case BPF_PROG_TYPE_SYSCALL: + opts.prog_flags = BPF_F_SLEEPABLE; + break; + case BPF_PROG_TYPE_STRUCT_OPS: + exp_err = -524; /* -ENOTSUPP */ + break; case BPF_PROG_TYPE_UNSPEC: case BPF_PROG_TYPE_SOCKET_FILTER: case BPF_PROG_TYPE_SCHED_CLS: @@ -103,25 +127,42 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns, case BPF_PROG_TYPE_RAW_TRACEPOINT: case BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE: case BPF_PROG_TYPE_LWT_SEG6LOCAL: - case BPF_PROG_TYPE_LIRC_MODE2: case BPF_PROG_TYPE_SK_REUSEPORT: case BPF_PROG_TYPE_FLOW_DISSECTOR: case BPF_PROG_TYPE_CGROUP_SYSCTL: - case BPF_PROG_TYPE_TRACING: - case BPF_PROG_TYPE_STRUCT_OPS: - case BPF_PROG_TYPE_EXT: - case BPF_PROG_TYPE_LSM: - default: break; + default: + return -EOPNOTSUPP; } - opts.prog_ifindex = ifindex; - opts.log_buf = buf; - opts.log_size = buf_len; - - fd = bpf_prog_load(prog_type, NULL, "GPL", insns, insns_cnt, NULL); + fd = bpf_prog_load(prog_type, NULL, "GPL", insns, insns_cnt, &opts); + err = -errno; if (fd >= 0) close(fd); + if (exp_err) { + if (fd >= 0 || err != exp_err) + return 0; + if (exp_msg && !strstr(buf, exp_msg)) + return 0; + return 1; + } + return fd >= 0 ? 1 : 0; +} + +int libbpf_probe_bpf_prog_type(enum bpf_prog_type prog_type, const void *opts) +{ + struct bpf_insn insns[] = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN() + }; + const size_t insn_cnt = ARRAY_SIZE(insns); + int ret; + + if (opts) + return libbpf_err(-EINVAL); + + ret = probe_prog_load(prog_type, insns, insn_cnt, NULL, 0, 0); + return libbpf_err(ret); } bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex) @@ -131,12 +172,16 @@ bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex) BPF_EXIT_INSN() }; + /* prefer libbpf_probe_bpf_prog_type() unless offload is requested */ + if (ifindex == 0) + return libbpf_probe_bpf_prog_type(prog_type, NULL) == 1; + if (ifindex && prog_type == BPF_PROG_TYPE_SCHED_CLS) /* nfp returns -EINVAL on exit(0) with TC offload */ insns[0].imm = 2; errno = 0; - probe_load(prog_type, insns, ARRAY_SIZE(insns), NULL, 0, ifindex); + probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), NULL, 0, ifindex); return errno != EINVAL && errno != EOPNOTSUPP; } @@ -197,16 +242,18 @@ static int load_local_storage_btf(void) strs, sizeof(strs)); } -bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex) +static int probe_map_create(enum bpf_map_type map_type, __u32 ifindex) { - int key_size, value_size, max_entries, map_flags; + LIBBPF_OPTS(bpf_map_create_opts, opts); + int key_size, value_size, max_entries; __u32 btf_key_type_id = 0, btf_value_type_id = 0; - int fd = -1, btf_fd = -1, fd_inner; + int fd = -1, btf_fd = -1, fd_inner = -1, exp_err = 0, err; + + opts.map_ifindex = ifindex; key_size = sizeof(__u32); value_size = sizeof(__u32); max_entries = 1; - map_flags = 0; switch (map_type) { case BPF_MAP_TYPE_STACK_TRACE: @@ -215,7 +262,7 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex) case BPF_MAP_TYPE_LPM_TRIE: key_size = sizeof(__u64); value_size = sizeof(__u64); - map_flags = BPF_F_NO_PREALLOC; + opts.map_flags = BPF_F_NO_PREALLOC; break; case BPF_MAP_TYPE_CGROUP_STORAGE: case BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE: @@ -234,17 +281,25 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex) btf_value_type_id = 3; value_size = 8; max_entries = 0; - map_flags = BPF_F_NO_PREALLOC; + opts.map_flags = BPF_F_NO_PREALLOC; btf_fd = load_local_storage_btf(); if (btf_fd < 0) - return false; + return btf_fd; break; case BPF_MAP_TYPE_RINGBUF: key_size = 0; value_size = 0; max_entries = 4096; break; - case BPF_MAP_TYPE_UNSPEC: + case BPF_MAP_TYPE_STRUCT_OPS: + /* we'll get -ENOTSUPP for invalid BTF type ID for struct_ops */ + opts.btf_vmlinux_value_type_id = 1; + exp_err = -524; /* -ENOTSUPP */ + break; + case BPF_MAP_TYPE_BLOOM_FILTER: + key_size = 0; + max_entries = 1; + break; case BPF_MAP_TYPE_HASH: case BPF_MAP_TYPE_ARRAY: case BPF_MAP_TYPE_PROG_ARRAY: @@ -263,49 +318,114 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex) case BPF_MAP_TYPE_XSKMAP: case BPF_MAP_TYPE_SOCKHASH: case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY: - case BPF_MAP_TYPE_STRUCT_OPS: - default: break; + case BPF_MAP_TYPE_UNSPEC: + default: + return -EOPNOTSUPP; } if (map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS || map_type == BPF_MAP_TYPE_HASH_OF_MAPS) { - LIBBPF_OPTS(bpf_map_create_opts, opts); - /* TODO: probe for device, once libbpf has a function to create * map-in-map for offload */ if (ifindex) - return false; + goto cleanup; fd_inner = bpf_map_create(BPF_MAP_TYPE_HASH, NULL, sizeof(__u32), sizeof(__u32), 1, NULL); if (fd_inner < 0) - return false; + goto cleanup; opts.inner_map_fd = fd_inner; - fd = bpf_map_create(map_type, NULL, sizeof(__u32), sizeof(__u32), 1, &opts); - close(fd_inner); - } else { - LIBBPF_OPTS(bpf_map_create_opts, opts); - - /* Note: No other restriction on map type probes for offload */ - opts.map_flags = map_flags; - opts.map_ifindex = ifindex; - if (btf_fd >= 0) { - opts.btf_fd = btf_fd; - opts.btf_key_type_id = btf_key_type_id; - opts.btf_value_type_id = btf_value_type_id; - } + } - fd = bpf_map_create(map_type, NULL, key_size, value_size, max_entries, &opts); + if (btf_fd >= 0) { + opts.btf_fd = btf_fd; + opts.btf_key_type_id = btf_key_type_id; + opts.btf_value_type_id = btf_value_type_id; } + + fd = bpf_map_create(map_type, NULL, key_size, value_size, max_entries, &opts); + err = -errno; + +cleanup: if (fd >= 0) close(fd); + if (fd_inner >= 0) + close(fd_inner); if (btf_fd >= 0) close(btf_fd); - return fd >= 0; + if (exp_err) + return fd < 0 && err == exp_err ? 1 : 0; + else + return fd >= 0 ? 1 : 0; +} + +int libbpf_probe_bpf_map_type(enum bpf_map_type map_type, const void *opts) +{ + int ret; + + if (opts) + return libbpf_err(-EINVAL); + + ret = probe_map_create(map_type, 0); + return libbpf_err(ret); +} + +bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex) +{ + return probe_map_create(map_type, ifindex) == 1; +} + +int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, enum bpf_func_id helper_id, + const void *opts) +{ + struct bpf_insn insns[] = { + BPF_EMIT_CALL((__u32)helper_id), + BPF_EXIT_INSN(), + }; + const size_t insn_cnt = ARRAY_SIZE(insns); + char buf[4096]; + int ret; + + if (opts) + return libbpf_err(-EINVAL); + + /* we can't successfully load all prog types to check for BPF helper + * support, so bail out with -EOPNOTSUPP error + */ + switch (prog_type) { + case BPF_PROG_TYPE_TRACING: + case BPF_PROG_TYPE_EXT: + case BPF_PROG_TYPE_LSM: + case BPF_PROG_TYPE_STRUCT_OPS: + return -EOPNOTSUPP; + default: + break; + } + + buf[0] = '\0'; + ret = probe_prog_load(prog_type, insns, insn_cnt, buf, sizeof(buf), 0); + if (ret < 0) + return libbpf_err(ret); + + /* If BPF verifier doesn't recognize BPF helper ID (enum bpf_func_id) + * at all, it will emit something like "invalid func unknown#181". + * If BPF verifier recognizes BPF helper but it's not supported for + * given BPF program type, it will emit "unknown func bpf_sys_bpf#166". + * In both cases, provided combination of BPF program type and BPF + * helper is not supported by the kernel. + * In all other cases, probe_prog_load() above will either succeed (e.g., + * because BPF helper happens to accept no input arguments or it + * accepts one input argument and initial PTR_TO_CTX is fine for + * that), or we'll get some more specific BPF verifier error about + * some unsatisfied conditions. + */ + if (ret == 0 && (strstr(buf, "invalid func ") || strstr(buf, "unknown func "))) + return 0; + return 1; /* assume supported */ } bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type, @@ -318,8 +438,7 @@ bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type, char buf[4096] = {}; bool res; - probe_load(prog_type, insns, ARRAY_SIZE(insns), buf, sizeof(buf), - ifindex); + probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), buf, sizeof(buf), ifindex); res = !grep(buf, "invalid func ") && !grep(buf, "unknown func "); if (ifindex) { @@ -351,8 +470,8 @@ bool bpf_probe_large_insn_limit(__u32 ifindex) insns[BPF_MAXINSNS] = BPF_EXIT_INSN(); errno = 0; - probe_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0, - ifindex); + probe_prog_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0, + ifindex); return errno != E2BIG && errno != EINVAL; } diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c index 32464f0ab4b1..910865e29edc 100644 --- a/tools/lib/bpf/relo_core.c +++ b/tools/lib/bpf/relo_core.c @@ -709,10 +709,14 @@ static int bpf_core_calc_field_relo(const char *prog_name, static int bpf_core_calc_type_relo(const struct bpf_core_relo *relo, const struct bpf_core_spec *spec, - __u32 *val) + __u32 *val, bool *validate) { __s64 sz; + /* by default, always check expected value in bpf_insn */ + if (validate) + *validate = true; + /* type-based relos return zero when target type is not found */ if (!spec) { *val = 0; @@ -722,6 +726,11 @@ static int bpf_core_calc_type_relo(const struct bpf_core_relo *relo, switch (relo->kind) { case BPF_CORE_TYPE_ID_TARGET: *val = spec->root_type_id; + /* type ID, embedded in bpf_insn, might change during linking, + * so enforcing it is pointless + */ + if (validate) + *validate = false; break; case BPF_CORE_TYPE_EXISTS: *val = 1; @@ -861,8 +870,8 @@ static int bpf_core_calc_relo(const char *prog_name, res->fail_memsz_adjust = true; } } else if (core_relo_is_type_based(relo->kind)) { - err = bpf_core_calc_type_relo(relo, local_spec, &res->orig_val); - err = err ?: bpf_core_calc_type_relo(relo, targ_spec, &res->new_val); + err = bpf_core_calc_type_relo(relo, local_spec, &res->orig_val, &res->validate); + err = err ?: bpf_core_calc_type_relo(relo, targ_spec, &res->new_val, NULL); } else if (core_relo_is_enumval_based(relo->kind)) { err = bpf_core_calc_enumval_relo(relo, local_spec, &res->orig_val); err = err ?: bpf_core_calc_enumval_relo(relo, targ_spec, &res->new_val); @@ -1213,7 +1222,9 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, /* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) { - targ_res.validate = true; + /* bpf_insn's imm value could get out of sync during linking */ + memset(&targ_res, 0, sizeof(targ_res)); + targ_res.validate = false; targ_res.poison = false; targ_res.orig_val = local_spec->root_type_id; targ_res.new_val = local_spec->root_type_id; @@ -1227,7 +1238,6 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, return -EOPNOTSUPP; } - for (i = 0, j = 0; i < cands->len; i++) { err = bpf_core_spec_match(local_spec, cands->cands[i].btf, cands->cands[i].id, cand_spec); diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c index e8d94c6dd3bc..edafe56664f3 100644 --- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -548,8 +548,7 @@ static int xsk_get_max_queues(struct xsk_socket *xsk) return -errno; ifr.ifr_data = (void *)&channels; - memcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ - 1); - ifr.ifr_name[IFNAMSIZ - 1] = '\0'; + libbpf_strlcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ); err = ioctl(fd, SIOCETHTOOL, &ifr); if (err && errno != EOPNOTSUPP) { ret = -errno; @@ -768,8 +767,7 @@ static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk) } ctx->ifindex = ifindex; - memcpy(ctx->ifname, ifname, IFNAMSIZ -1); - ctx->ifname[IFNAMSIZ - 1] = 0; + libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ); xsk->ctx = ctx; xsk->ctx->has_bpf_link = xsk_probe_bpf_link(); @@ -951,8 +949,7 @@ static struct xsk_ctx *xsk_create_ctx(struct xsk_socket *xsk, ctx->refcount = 1; ctx->umem = umem; ctx->queue_id = queue_id; - memcpy(ctx->ifname, ifname, IFNAMSIZ - 1); - ctx->ifname[IFNAMSIZ - 1] = '\0'; + libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ); ctx->fill = fill; ctx->comp = comp; diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index 0b52e08e558e..97121fb45842 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -3257,10 +3257,21 @@ static void trace__set_bpf_map_syscalls(struct trace *trace) static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace, const char *name) { + struct bpf_program *pos, *prog = NULL; + const char *sec_name; + if (trace->bpf_obj == NULL) return NULL; - return bpf_object__find_program_by_title(trace->bpf_obj, name); + bpf_object__for_each_program(pos, trace->bpf_obj) { + sec_name = bpf_program__section_name(pos); + if (sec_name && !strcmp(sec_name, name)) { + prog = pos; + break; + } + } + + return prog; } static struct bpf_program *trace__find_syscall_bpf_prog(struct trace *trace, struct syscall *sc, diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include index 071312f5eb92..b0be5f40a3f1 100644 --- a/tools/scripts/Makefile.include +++ b/tools/scripts/Makefile.include @@ -87,7 +87,18 @@ LLVM_STRIP ?= llvm-strip ifeq ($(CC_NO_CLANG), 1) EXTRA_WARNINGS += -Wstrict-aliasing=3 -endif + +else ifneq ($(CROSS_COMPILE),) +CLANG_CROSS_FLAGS := --target=$(notdir $(CROSS_COMPILE:%-=%)) +GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)gcc)) +ifneq ($(GCC_TOOLCHAIN_DIR),) +CLANG_CROSS_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE)) +CLANG_CROSS_FLAGS += --sysroot=$(shell $(CROSS_COMPILE)gcc -print-sysroot) +CLANG_CROSS_FLAGS += --gcc-toolchain=$(realpath $(GCC_TOOLCHAIN_DIR)/..) +endif # GCC_TOOLCHAIN_DIR +CFLAGS += $(CLANG_CROSS_FLAGS) +AFLAGS += $(CLANG_CROSS_FLAGS) +endif # CROSS_COMPILE # Hack to avoid type-punned warnings on old systems such as RHEL5: # We should be changing CFLAGS and checking gcc version, but this diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index a795bca4c8ec..42ffc24e9e71 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -170,7 +170,7 @@ $(OUTPUT)/%:%.c $(OUTPUT)/urandom_read: urandom_read.c $(call msg,BINARY,,$@) - $(Q)$(CC) $(LDFLAGS) $< $(LDLIBS) -Wl,--build-id=sha1 -o $@ + $(Q)$(CC) $(CFLAGS) $(LDFLAGS) $< $(LDLIBS) -Wl,--build-id=sha1 -o $@ $(OUTPUT)/bpf_testmod.ko: $(VMLINUX_BTF) $(wildcard bpf_testmod/Makefile bpf_testmod/*.[ch]) $(call msg,MOD,,$@) @@ -217,7 +217,7 @@ BPFTOOL ?= $(DEFAULT_BPFTOOL) $(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \ $(HOST_BPFOBJ) | $(HOST_BUILD_DIR)/bpftool $(Q)$(MAKE) $(submake_extras) -C $(BPFTOOLDIR) \ - CC=$(HOSTCC) LD=$(HOSTLD) \ + ARCH= CROSS_COMPILE= CC=$(HOSTCC) LD=$(HOSTLD) \ EXTRA_CFLAGS='-g -O0' \ OUTPUT=$(HOST_BUILD_DIR)/bpftool/ \ LIBBPF_OUTPUT=$(HOST_BUILD_DIR)/libbpf/ \ @@ -248,7 +248,7 @@ $(HOST_BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ $(APIDIR)/linux/bpf.h \ | $(HOST_BUILD_DIR)/libbpf $(Q)$(MAKE) $(submake_extras) -C $(BPFDIR) \ - EXTRA_CFLAGS='-g -O0' \ + EXTRA_CFLAGS='-g -O0' ARCH= CROSS_COMPILE= \ OUTPUT=$(HOST_BUILD_DIR)/libbpf/ CC=$(HOSTCC) LD=$(HOSTLD) \ DESTDIR=$(HOST_SCRATCH_DIR)/ prefix= all install_headers endif @@ -537,6 +537,7 @@ $(OUTPUT)/bench_ringbufs.o: $(OUTPUT)/ringbuf_bench.skel.h \ $(OUTPUT)/perfbuf_bench.skel.h $(OUTPUT)/bench_bloom_filter_map.o: $(OUTPUT)/bloom_filter_bench.skel.h $(OUTPUT)/bench_bpf_loop.o: $(OUTPUT)/bpf_loop_bench.skel.h +$(OUTPUT)/bench_strncmp.o: $(OUTPUT)/strncmp_bench.skel.h $(OUTPUT)/bench.o: bench.h testing_helpers.h $(BPFOBJ) $(OUTPUT)/bench: LDLIBS += -lm $(OUTPUT)/bench: $(OUTPUT)/bench.o \ @@ -547,9 +548,10 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o \ $(OUTPUT)/bench_trigger.o \ $(OUTPUT)/bench_ringbufs.o \ $(OUTPUT)/bench_bloom_filter_map.o \ - $(OUTPUT)/bench_bpf_loop.o + $(OUTPUT)/bench_bpf_loop.o \ + $(OUTPUT)/bench_strncmp.o $(call msg,BINARY,,$@) - $(Q)$(CC) $(LDFLAGS) $(filter %.a %.o,$^) $(LDLIBS) -o $@ + $(Q)$(CC) $(CFLAGS) $(LDFLAGS) $(filter %.a %.o,$^) $(LDLIBS) -o $@ EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR) \ prog_tests/tests.h map_tests/tests.h verifier/tests.h \ diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c index 3d6082b97a56..f973320e6dbf 100644 --- a/tools/testing/selftests/bpf/bench.c +++ b/tools/testing/selftests/bpf/bench.c @@ -29,26 +29,10 @@ static int libbpf_print_fn(enum libbpf_print_level level, return vfprintf(stderr, format, args); } -static int bump_memlock_rlimit(void) +void setup_libbpf(void) { - struct rlimit rlim_new = { - .rlim_cur = RLIM_INFINITY, - .rlim_max = RLIM_INFINITY, - }; - - return setrlimit(RLIMIT_MEMLOCK, &rlim_new); -} - -void setup_libbpf() -{ - int err; - libbpf_set_strict_mode(LIBBPF_STRICT_ALL); libbpf_set_print(libbpf_print_fn); - - err = bump_memlock_rlimit(); - if (err) - fprintf(stderr, "failed to increase RLIMIT_MEMLOCK: %d", err); } void false_hits_report_progress(int iter, struct bench_res *res, long delta_ns) @@ -205,11 +189,13 @@ static const struct argp_option opts[] = { extern struct argp bench_ringbufs_argp; extern struct argp bench_bloom_map_argp; extern struct argp bench_bpf_loop_argp; +extern struct argp bench_strncmp_argp; static const struct argp_child bench_parsers[] = { { &bench_ringbufs_argp, 0, "Ring buffers benchmark", 0 }, { &bench_bloom_map_argp, 0, "Bloom filter map benchmark", 0 }, { &bench_bpf_loop_argp, 0, "bpf_loop helper benchmark", 0 }, + { &bench_strncmp_argp, 0, "bpf_strncmp helper benchmark", 0 }, {}, }; @@ -409,6 +395,8 @@ extern const struct bench bench_bloom_false_positive; extern const struct bench bench_hashmap_without_bloom; extern const struct bench bench_hashmap_with_bloom; extern const struct bench bench_bpf_loop; +extern const struct bench bench_strncmp_no_helper; +extern const struct bench bench_strncmp_helper; static const struct bench *benchs[] = { &bench_count_global, @@ -441,6 +429,8 @@ static const struct bench *benchs[] = { &bench_hashmap_without_bloom, &bench_hashmap_with_bloom, &bench_bpf_loop, + &bench_strncmp_no_helper, + &bench_strncmp_helper, }; static void setup_benchmark() diff --git a/tools/testing/selftests/bpf/bench.h b/tools/testing/selftests/bpf/bench.h index 50785503756b..fb3e213df3dc 100644 --- a/tools/testing/selftests/bpf/bench.h +++ b/tools/testing/selftests/bpf/bench.h @@ -38,8 +38,8 @@ struct bench_res { struct bench { const char *name; - void (*validate)(); - void (*setup)(); + void (*validate)(void); + void (*setup)(void); void *(*producer_thread)(void *ctx); void *(*consumer_thread)(void *ctx); void (*measure)(struct bench_res* res); @@ -54,7 +54,7 @@ struct counter { extern struct env env; extern const struct bench *bench; -void setup_libbpf(); +void setup_libbpf(void); void hits_drops_report_progress(int iter, struct bench_res *res, long delta_ns); void hits_drops_report_final(struct bench_res res[], int res_cnt); void false_hits_report_progress(int iter, struct bench_res *res, long delta_ns); @@ -62,7 +62,8 @@ void false_hits_report_final(struct bench_res res[], int res_cnt); void ops_report_progress(int iter, struct bench_res *res, long delta_ns); void ops_report_final(struct bench_res res[], int res_cnt); -static inline __u64 get_time_ns() { +static inline __u64 get_time_ns(void) +{ struct timespec t; clock_gettime(CLOCK_MONOTONIC, &t); diff --git a/tools/testing/selftests/bpf/benchs/bench_count.c b/tools/testing/selftests/bpf/benchs/bench_count.c index befba7a82643..078972ce208e 100644 --- a/tools/testing/selftests/bpf/benchs/bench_count.c +++ b/tools/testing/selftests/bpf/benchs/bench_count.c @@ -36,7 +36,7 @@ static struct count_local_ctx { struct counter *hits; } count_local_ctx; -static void count_local_setup() +static void count_local_setup(void) { struct count_local_ctx *ctx = &count_local_ctx; diff --git a/tools/testing/selftests/bpf/benchs/bench_rename.c b/tools/testing/selftests/bpf/benchs/bench_rename.c index c7ec114eca56..3c203b6d6a6e 100644 --- a/tools/testing/selftests/bpf/benchs/bench_rename.c +++ b/tools/testing/selftests/bpf/benchs/bench_rename.c @@ -11,7 +11,7 @@ static struct ctx { int fd; } ctx; -static void validate() +static void validate(void) { if (env.producer_cnt != 1) { fprintf(stderr, "benchmark doesn't support multi-producer!\n"); @@ -43,7 +43,7 @@ static void measure(struct bench_res *res) res->hits = atomic_swap(&ctx.hits.value, 0); } -static void setup_ctx() +static void setup_ctx(void) { setup_libbpf(); @@ -71,36 +71,36 @@ static void attach_bpf(struct bpf_program *prog) } } -static void setup_base() +static void setup_base(void) { setup_ctx(); } -static void setup_kprobe() +static void setup_kprobe(void) { setup_ctx(); attach_bpf(ctx.skel->progs.prog1); } -static void setup_kretprobe() +static void setup_kretprobe(void) { setup_ctx(); attach_bpf(ctx.skel->progs.prog2); } -static void setup_rawtp() +static void setup_rawtp(void) { setup_ctx(); attach_bpf(ctx.skel->progs.prog3); } -static void setup_fentry() +static void setup_fentry(void) { setup_ctx(); attach_bpf(ctx.skel->progs.prog4); } -static void setup_fexit() +static void setup_fexit(void) { setup_ctx(); attach_bpf(ctx.skel->progs.prog5); diff --git a/tools/testing/selftests/bpf/benchs/bench_ringbufs.c b/tools/testing/selftests/bpf/benchs/bench_ringbufs.c index 52d4a2f91dbd..da8593b3494a 100644 --- a/tools/testing/selftests/bpf/benchs/bench_ringbufs.c +++ b/tools/testing/selftests/bpf/benchs/bench_ringbufs.c @@ -88,12 +88,12 @@ const struct argp bench_ringbufs_argp = { static struct counter buf_hits; -static inline void bufs_trigger_batch() +static inline void bufs_trigger_batch(void) { (void)syscall(__NR_getpgid); } -static void bufs_validate() +static void bufs_validate(void) { if (env.consumer_cnt != 1) { fprintf(stderr, "rb-libbpf benchmark doesn't support multi-consumer!\n"); @@ -132,7 +132,7 @@ static void ringbuf_libbpf_measure(struct bench_res *res) res->drops = atomic_swap(&ctx->skel->bss->dropped, 0); } -static struct ringbuf_bench *ringbuf_setup_skeleton() +static struct ringbuf_bench *ringbuf_setup_skeleton(void) { struct ringbuf_bench *skel; @@ -167,7 +167,7 @@ static int buf_process_sample(void *ctx, void *data, size_t len) return 0; } -static void ringbuf_libbpf_setup() +static void ringbuf_libbpf_setup(void) { struct ringbuf_libbpf_ctx *ctx = &ringbuf_libbpf_ctx; struct bpf_link *link; @@ -223,7 +223,7 @@ static void ringbuf_custom_measure(struct bench_res *res) res->drops = atomic_swap(&ctx->skel->bss->dropped, 0); } -static void ringbuf_custom_setup() +static void ringbuf_custom_setup(void) { struct ringbuf_custom_ctx *ctx = &ringbuf_custom_ctx; const size_t page_size = getpagesize(); @@ -352,7 +352,7 @@ static void perfbuf_measure(struct bench_res *res) res->drops = atomic_swap(&ctx->skel->bss->dropped, 0); } -static struct perfbuf_bench *perfbuf_setup_skeleton() +static struct perfbuf_bench *perfbuf_setup_skeleton(void) { struct perfbuf_bench *skel; @@ -390,7 +390,7 @@ perfbuf_process_sample_raw(void *input_ctx, int cpu, return LIBBPF_PERF_EVENT_CONT; } -static void perfbuf_libbpf_setup() +static void perfbuf_libbpf_setup(void) { struct perfbuf_libbpf_ctx *ctx = &perfbuf_libbpf_ctx; struct perf_event_attr attr; diff --git a/tools/testing/selftests/bpf/benchs/bench_strncmp.c b/tools/testing/selftests/bpf/benchs/bench_strncmp.c new file mode 100644 index 000000000000..494b591c0289 --- /dev/null +++ b/tools/testing/selftests/bpf/benchs/bench_strncmp.c @@ -0,0 +1,161 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2021. Huawei Technologies Co., Ltd */ +#include <argp.h> +#include "bench.h" +#include "strncmp_bench.skel.h" + +static struct strncmp_ctx { + struct strncmp_bench *skel; +} ctx; + +static struct strncmp_args { + u32 cmp_str_len; +} args = { + .cmp_str_len = 32, +}; + +enum { + ARG_CMP_STR_LEN = 5000, +}; + +static const struct argp_option opts[] = { + { "cmp-str-len", ARG_CMP_STR_LEN, "CMP_STR_LEN", 0, + "Set the length of compared string" }, + {}, +}; + +static error_t strncmp_parse_arg(int key, char *arg, struct argp_state *state) +{ + switch (key) { + case ARG_CMP_STR_LEN: + args.cmp_str_len = strtoul(arg, NULL, 10); + if (!args.cmp_str_len || + args.cmp_str_len >= sizeof(ctx.skel->bss->str)) { + fprintf(stderr, "Invalid cmp str len (limit %zu)\n", + sizeof(ctx.skel->bss->str)); + argp_usage(state); + } + break; + default: + return ARGP_ERR_UNKNOWN; + } + + return 0; +} + +const struct argp bench_strncmp_argp = { + .options = opts, + .parser = strncmp_parse_arg, +}; + +static void strncmp_validate(void) +{ + if (env.consumer_cnt != 1) { + fprintf(stderr, "strncmp benchmark doesn't support multi-consumer!\n"); + exit(1); + } +} + +static void strncmp_setup(void) +{ + int err; + char *target; + size_t i, sz; + + sz = sizeof(ctx.skel->rodata->target); + if (!sz || sz < sizeof(ctx.skel->bss->str)) { + fprintf(stderr, "invalid string size (target %zu, src %zu)\n", + sz, sizeof(ctx.skel->bss->str)); + exit(1); + } + + setup_libbpf(); + + ctx.skel = strncmp_bench__open(); + if (!ctx.skel) { + fprintf(stderr, "failed to open skeleton\n"); + exit(1); + } + + srandom(time(NULL)); + target = ctx.skel->rodata->target; + for (i = 0; i < sz - 1; i++) + target[i] = '1' + random() % 9; + target[sz - 1] = '\0'; + + ctx.skel->rodata->cmp_str_len = args.cmp_str_len; + + memcpy(ctx.skel->bss->str, target, args.cmp_str_len); + ctx.skel->bss->str[args.cmp_str_len] = '\0'; + /* Make bss->str < rodata->target */ + ctx.skel->bss->str[args.cmp_str_len - 1] -= 1; + + err = strncmp_bench__load(ctx.skel); + if (err) { + fprintf(stderr, "failed to load skeleton\n"); + strncmp_bench__destroy(ctx.skel); + exit(1); + } +} + +static void strncmp_attach_prog(struct bpf_program *prog) +{ + struct bpf_link *link; + + link = bpf_program__attach(prog); + if (!link) { + fprintf(stderr, "failed to attach program!\n"); + exit(1); + } +} + +static void strncmp_no_helper_setup(void) +{ + strncmp_setup(); + strncmp_attach_prog(ctx.skel->progs.strncmp_no_helper); +} + +static void strncmp_helper_setup(void) +{ + strncmp_setup(); + strncmp_attach_prog(ctx.skel->progs.strncmp_helper); +} + +static void *strncmp_producer(void *ctx) +{ + while (true) + (void)syscall(__NR_getpgid); + return NULL; +} + +static void *strncmp_consumer(void *ctx) +{ + return NULL; +} + +static void strncmp_measure(struct bench_res *res) +{ + res->hits = atomic_swap(&ctx.skel->bss->hits, 0); +} + +const struct bench bench_strncmp_no_helper = { + .name = "strncmp-no-helper", + .validate = strncmp_validate, + .setup = strncmp_no_helper_setup, + .producer_thread = strncmp_producer, + .consumer_thread = strncmp_consumer, + .measure = strncmp_measure, + .report_progress = hits_drops_report_progress, + .report_final = hits_drops_report_final, +}; + +const struct bench bench_strncmp_helper = { + .name = "strncmp-helper", + .validate = strncmp_validate, + .setup = strncmp_helper_setup, + .producer_thread = strncmp_producer, + .consumer_thread = strncmp_consumer, + .measure = strncmp_measure, + .report_progress = hits_drops_report_progress, + .report_final = hits_drops_report_final, +}; diff --git a/tools/testing/selftests/bpf/benchs/bench_trigger.c b/tools/testing/selftests/bpf/benchs/bench_trigger.c index 049a5ad56f65..7f957c55a3ca 100644 --- a/tools/testing/selftests/bpf/benchs/bench_trigger.c +++ b/tools/testing/selftests/bpf/benchs/bench_trigger.c @@ -11,7 +11,7 @@ static struct trigger_ctx { static struct counter base_hits; -static void trigger_validate() +static void trigger_validate(void) { if (env.consumer_cnt != 1) { fprintf(stderr, "benchmark doesn't support multi-consumer!\n"); @@ -45,7 +45,7 @@ static void trigger_measure(struct bench_res *res) res->hits = atomic_swap(&ctx.skel->bss->hits, 0); } -static void setup_ctx() +static void setup_ctx(void) { setup_libbpf(); @@ -67,37 +67,37 @@ static void attach_bpf(struct bpf_program *prog) } } -static void trigger_tp_setup() +static void trigger_tp_setup(void) { setup_ctx(); attach_bpf(ctx.skel->progs.bench_trigger_tp); } -static void trigger_rawtp_setup() +static void trigger_rawtp_setup(void) { setup_ctx(); attach_bpf(ctx.skel->progs.bench_trigger_raw_tp); } -static void trigger_kprobe_setup() +static void trigger_kprobe_setup(void) { setup_ctx(); attach_bpf(ctx.skel->progs.bench_trigger_kprobe); } -static void trigger_fentry_setup() +static void trigger_fentry_setup(void) { setup_ctx(); attach_bpf(ctx.skel->progs.bench_trigger_fentry); } -static void trigger_fentry_sleep_setup() +static void trigger_fentry_sleep_setup(void) { setup_ctx(); attach_bpf(ctx.skel->progs.bench_trigger_fentry_sleep); } -static void trigger_fmodret_setup() +static void trigger_fmodret_setup(void) { setup_ctx(); attach_bpf(ctx.skel->progs.bench_trigger_fmodret); @@ -183,22 +183,22 @@ static void usetup(bool use_retprobe, bool use_nop) ctx.skel->links.bench_trigger_uprobe = link; } -static void uprobe_setup_with_nop() +static void uprobe_setup_with_nop(void) { usetup(false, true); } -static void uretprobe_setup_with_nop() +static void uretprobe_setup_with_nop(void) { usetup(true, true); } -static void uprobe_setup_without_nop() +static void uprobe_setup_without_nop(void) { usetup(false, false); } -static void uretprobe_setup_without_nop() +static void uretprobe_setup_without_nop(void) { usetup(true, false); } diff --git a/tools/testing/selftests/bpf/benchs/run_bench_strncmp.sh b/tools/testing/selftests/bpf/benchs/run_bench_strncmp.sh new file mode 100755 index 000000000000..142697284b45 --- /dev/null +++ b/tools/testing/selftests/bpf/benchs/run_bench_strncmp.sh @@ -0,0 +1,12 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +source ./benchs/run_common.sh + +set -eufo pipefail + +for s in 1 8 64 512 2048 4095; do + for b in no-helper helper; do + summarize ${b}-${s} "$($RUN_BENCH --cmp-str-len=$s strncmp-${b})" + done +done diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config index 5192305159ec..f6287132fa89 100644 --- a/tools/testing/selftests/bpf/config +++ b/tools/testing/selftests/bpf/config @@ -38,7 +38,9 @@ CONFIG_IPV6_SIT=m CONFIG_BPF_JIT=y CONFIG_BPF_LSM=y CONFIG_SECURITY=y +CONFIG_RC_CORE=y CONFIG_LIRC=y +CONFIG_BPF_LIRC_MODE2=y CONFIG_IMA=y CONFIG_SECURITYFS=y CONFIG_IMA_WRITE_POLICY=y diff --git a/tools/testing/selftests/bpf/prog_tests/align.c b/tools/testing/selftests/bpf/prog_tests/align.c index 837f67c6bfda..0ee29e11eaee 100644 --- a/tools/testing/selftests/bpf/prog_tests/align.c +++ b/tools/testing/selftests/bpf/prog_tests/align.c @@ -39,13 +39,13 @@ static struct bpf_align_test tests[] = { }, .prog_type = BPF_PROG_TYPE_SCHED_CLS, .matches = { - {1, "R1=ctx(id=0,off=0,imm=0)"}, - {1, "R10=fp0"}, - {1, "R3_w=inv2"}, - {2, "R3_w=inv4"}, - {3, "R3_w=inv8"}, - {4, "R3_w=inv16"}, - {5, "R3_w=inv32"}, + {0, "R1=ctx(id=0,off=0,imm=0)"}, + {0, "R10=fp0"}, + {0, "R3_w=inv2"}, + {1, "R3_w=inv4"}, + {2, "R3_w=inv8"}, + {3, "R3_w=inv16"}, + {4, "R3_w=inv32"}, }, }, { @@ -67,19 +67,19 @@ static struct bpf_align_test tests[] = { }, .prog_type = BPF_PROG_TYPE_SCHED_CLS, .matches = { - {1, "R1=ctx(id=0,off=0,imm=0)"}, - {1, "R10=fp0"}, - {1, "R3_w=inv1"}, - {2, "R3_w=inv2"}, - {3, "R3_w=inv4"}, - {4, "R3_w=inv8"}, - {5, "R3_w=inv16"}, - {6, "R3_w=inv1"}, - {7, "R4_w=inv32"}, - {8, "R4_w=inv16"}, - {9, "R4_w=inv8"}, - {10, "R4_w=inv4"}, - {11, "R4_w=inv2"}, + {0, "R1=ctx(id=0,off=0,imm=0)"}, + {0, "R10=fp0"}, + {0, "R3_w=inv1"}, + {1, "R3_w=inv2"}, + {2, "R3_w=inv4"}, + {3, "R3_w=inv8"}, + {4, "R3_w=inv16"}, + {5, "R3_w=inv1"}, + {6, "R4_w=inv32"}, + {7, "R4_w=inv16"}, + {8, "R4_w=inv8"}, + {9, "R4_w=inv4"}, + {10, "R4_w=inv2"}, }, }, { @@ -96,14 +96,14 @@ static struct bpf_align_test tests[] = { }, .prog_type = BPF_PROG_TYPE_SCHED_CLS, .matches = { - {1, "R1=ctx(id=0,off=0,imm=0)"}, - {1, "R10=fp0"}, - {1, "R3_w=inv4"}, - {2, "R3_w=inv8"}, - {3, "R3_w=inv10"}, - {4, "R4_w=inv8"}, - {5, "R4_w=inv12"}, - {6, "R4_w=inv14"}, + {0, "R1=ctx(id=0,off=0,imm=0)"}, + {0, "R10=fp0"}, + {0, "R3_w=inv4"}, + {1, "R3_w=inv8"}, + {2, "R3_w=inv10"}, + {3, "R4_w=inv8"}, + {4, "R4_w=inv12"}, + {5, "R4_w=inv14"}, }, }, { @@ -118,12 +118,12 @@ static struct bpf_align_test tests[] = { }, .prog_type = BPF_PROG_TYPE_SCHED_CLS, .matches = { - {1, "R1=ctx(id=0,off=0,imm=0)"}, - {1, "R10=fp0"}, + {0, "R1=ctx(id=0,off=0,imm=0)"}, + {0, "R10=fp0"}, + {0, "R3_w=inv7"}, {1, "R3_w=inv7"}, - {2, "R3_w=inv7"}, - {3, "R3_w=inv14"}, - {4, "R3_w=inv56"}, + {2, "R3_w=inv14"}, + {3, "R3_w=inv56"}, }, }, @@ -161,19 +161,19 @@ static struct bpf_align_test tests[] = { }, .prog_type = BPF_PROG_TYPE_SCHED_CLS, .matches = { - {7, "R0_w=pkt(id=0,off=8,r=8,imm=0)"}, - {7, "R3_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, - {8, "R3_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"}, - {9, "R3_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, - {10, "R3_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"}, - {11, "R3_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"}, - {18, "R3=pkt_end(id=0,off=0,imm=0)"}, - {18, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, - {19, "R4_w=inv(id=0,umax_value=8160,var_off=(0x0; 0x1fe0))"}, - {20, "R4_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"}, - {21, "R4_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"}, - {22, "R4_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, - {23, "R4_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"}, + {6, "R0_w=pkt(id=0,off=8,r=8,imm=0)"}, + {6, "R3_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, + {7, "R3_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"}, + {8, "R3_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {9, "R3_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"}, + {10, "R3_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"}, + {12, "R3_w=pkt_end(id=0,off=0,imm=0)"}, + {17, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, + {18, "R4_w=inv(id=0,umax_value=8160,var_off=(0x0; 0x1fe0))"}, + {19, "R4_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"}, + {20, "R4_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"}, + {21, "R4_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {22, "R4_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"}, }, }, { @@ -194,16 +194,16 @@ static struct bpf_align_test tests[] = { }, .prog_type = BPF_PROG_TYPE_SCHED_CLS, .matches = { - {7, "R3_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, - {8, "R4_w=inv(id=1,umax_value=255,var_off=(0x0; 0xff))"}, - {9, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, - {10, "R4_w=inv(id=1,umax_value=255,var_off=(0x0; 0xff))"}, - {11, "R4_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"}, - {12, "R4_w=inv(id=1,umax_value=255,var_off=(0x0; 0xff))"}, - {13, "R4_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, - {14, "R4_w=inv(id=1,umax_value=255,var_off=(0x0; 0xff))"}, - {15, "R4_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"}, - {16, "R4_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"}, + {6, "R3_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, + {7, "R4_w=inv(id=1,umax_value=255,var_off=(0x0; 0xff))"}, + {8, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, + {9, "R4_w=inv(id=1,umax_value=255,var_off=(0x0; 0xff))"}, + {10, "R4_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"}, + {11, "R4_w=inv(id=1,umax_value=255,var_off=(0x0; 0xff))"}, + {12, "R4_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {13, "R4_w=inv(id=1,umax_value=255,var_off=(0x0; 0xff))"}, + {14, "R4_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"}, + {15, "R4_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"}, }, }, { @@ -234,14 +234,14 @@ static struct bpf_align_test tests[] = { }, .prog_type = BPF_PROG_TYPE_SCHED_CLS, .matches = { - {4, "R5_w=pkt(id=0,off=0,r=0,imm=0)"}, - {5, "R5_w=pkt(id=0,off=14,r=0,imm=0)"}, - {6, "R4_w=pkt(id=0,off=14,r=0,imm=0)"}, - {10, "R2=pkt(id=0,off=0,r=18,imm=0)"}, + {2, "R5_w=pkt(id=0,off=0,r=0,imm=0)"}, + {4, "R5_w=pkt(id=0,off=14,r=0,imm=0)"}, + {5, "R4_w=pkt(id=0,off=14,r=0,imm=0)"}, + {9, "R2=pkt(id=0,off=0,r=18,imm=0)"}, {10, "R5=pkt(id=0,off=14,r=18,imm=0)"}, {10, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"}, + {13, "R4_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff))"}, {14, "R4_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff))"}, - {15, "R4_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff))"}, }, }, { @@ -296,8 +296,8 @@ static struct bpf_align_test tests[] = { /* Calculated offset in R6 has unknown value, but known * alignment of 4. */ - {8, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, - {8, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {6, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, + {7, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, /* Offset is added to packet pointer R5, resulting in * known fixed offset, and variable offset from R6. */ @@ -313,11 +313,11 @@ static struct bpf_align_test tests[] = { /* Variable offset is added to R5 packet pointer, * resulting in auxiliary alignment of 4. */ - {18, "R5_w=pkt(id=2,off=0,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {17, "R5_w=pkt(id=2,off=0,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, /* Constant offset is added to R5, resulting in * reg->off of 14. */ - {19, "R5_w=pkt(id=2,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {18, "R5_w=pkt(id=2,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, /* At the time the word size load is performed from R5, * its total fixed offset is NET_IP_ALIGN + reg->off * (14) which is 16. Then the variable offset is 4-byte @@ -329,18 +329,18 @@ static struct bpf_align_test tests[] = { /* Constant offset is added to R5 packet pointer, * resulting in reg->off value of 14. */ - {26, "R5_w=pkt(id=0,off=14,r=8"}, + {25, "R5_w=pkt(id=0,off=14,r=8"}, /* Variable offset is added to R5, resulting in a * variable offset of (4n). */ - {27, "R5_w=pkt(id=3,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {26, "R5_w=pkt(id=3,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, /* Constant is added to R5 again, setting reg->off to 18. */ - {28, "R5_w=pkt(id=3,off=18,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {27, "R5_w=pkt(id=3,off=18,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, /* And once more we add a variable; resulting var_off * is still (4n), fixed offset is not changed. * Also, we create a new reg->id. */ - {29, "R5_w=pkt(id=4,off=18,r=0,umax_value=2040,var_off=(0x0; 0x7fc)"}, + {28, "R5_w=pkt(id=4,off=18,r=0,umax_value=2040,var_off=(0x0; 0x7fc)"}, /* At the time the word size load is performed from R5, * its total fixed offset is NET_IP_ALIGN + reg->off (18) * which is 20. Then the variable offset is (4n), so @@ -386,13 +386,13 @@ static struct bpf_align_test tests[] = { /* Calculated offset in R6 has unknown value, but known * alignment of 4. */ - {8, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, - {8, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {6, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, + {7, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, /* Adding 14 makes R6 be (4n+2) */ - {9, "R6_w=inv(id=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"}, + {8, "R6_w=inv(id=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"}, /* Packet pointer has (4n+2) offset */ {11, "R5_w=pkt(id=1,off=0,r=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc)"}, - {13, "R4=pkt(id=1,off=4,r=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc)"}, + {12, "R4=pkt(id=1,off=4,r=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc)"}, /* At the time the word size load is performed from R5, * its total fixed offset is NET_IP_ALIGN + reg->off (0) * which is 2. Then the variable offset is (4n+2), so @@ -403,12 +403,12 @@ static struct bpf_align_test tests[] = { /* Newly read value in R6 was shifted left by 2, so has * known alignment of 4. */ - {18, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {17, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, /* Added (4n) to packet pointer's (4n+2) var_off, giving * another (4n+2). */ {19, "R5_w=pkt(id=2,off=0,r=0,umin_value=14,umax_value=2054,var_off=(0x2; 0xffc)"}, - {21, "R4=pkt(id=2,off=4,r=0,umin_value=14,umax_value=2054,var_off=(0x2; 0xffc)"}, + {20, "R4=pkt(id=2,off=4,r=0,umin_value=14,umax_value=2054,var_off=(0x2; 0xffc)"}, /* At the time the word size load is performed from R5, * its total fixed offset is NET_IP_ALIGN + reg->off (0) * which is 2. Then the variable offset is (4n+2), so @@ -448,18 +448,18 @@ static struct bpf_align_test tests[] = { .prog_type = BPF_PROG_TYPE_SCHED_CLS, .result = REJECT, .matches = { - {4, "R5_w=pkt_end(id=0,off=0,imm=0)"}, + {3, "R5_w=pkt_end(id=0,off=0,imm=0)"}, /* (ptr - ptr) << 2 == unknown, (4n) */ - {6, "R5_w=inv(id=0,smax_value=9223372036854775804,umax_value=18446744073709551612,var_off=(0x0; 0xfffffffffffffffc)"}, + {5, "R5_w=inv(id=0,smax_value=9223372036854775804,umax_value=18446744073709551612,var_off=(0x0; 0xfffffffffffffffc)"}, /* (4n) + 14 == (4n+2). We blow our bounds, because * the add could overflow. */ - {7, "R5_w=inv(id=0,smin_value=-9223372036854775806,smax_value=9223372036854775806,umin_value=2,umax_value=18446744073709551614,var_off=(0x2; 0xfffffffffffffffc)"}, + {6, "R5_w=inv(id=0,smin_value=-9223372036854775806,smax_value=9223372036854775806,umin_value=2,umax_value=18446744073709551614,var_off=(0x2; 0xfffffffffffffffc)"}, /* Checked s>=0 */ {9, "R5=inv(id=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc)"}, /* packet pointer + nonnegative (4n+2) */ {11, "R6_w=pkt(id=1,off=0,r=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc)"}, - {13, "R4_w=pkt(id=1,off=4,r=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc)"}, + {12, "R4_w=pkt(id=1,off=4,r=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc)"}, /* NET_IP_ALIGN + (4n+2) == (4n), alignment is fine. * We checked the bounds, but it might have been able * to overflow if the packet pointer started in the @@ -502,14 +502,14 @@ static struct bpf_align_test tests[] = { /* Calculated offset in R6 has unknown value, but known * alignment of 4. */ - {7, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, - {9, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {6, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, + {8, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, /* Adding 14 makes R6 be (4n+2) */ - {10, "R6_w=inv(id=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"}, + {9, "R6_w=inv(id=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"}, /* New unknown value in R7 is (4n) */ - {11, "R7_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, + {10, "R7_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"}, /* Subtracting it from R6 blows our unsigned bounds */ - {12, "R6=inv(id=0,smin_value=-1006,smax_value=1034,umin_value=2,umax_value=18446744073709551614,var_off=(0x2; 0xfffffffffffffffc)"}, + {11, "R6=inv(id=0,smin_value=-1006,smax_value=1034,umin_value=2,umax_value=18446744073709551614,var_off=(0x2; 0xfffffffffffffffc)"}, /* Checked s>= 0 */ {14, "R6=inv(id=0,umin_value=2,umax_value=1034,var_off=(0x2; 0x7fc))"}, /* At the time the word size load is performed from R5, @@ -556,14 +556,14 @@ static struct bpf_align_test tests[] = { /* Calculated offset in R6 has unknown value, but known * alignment of 4. */ - {7, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, - {10, "R6_w=inv(id=0,umax_value=60,var_off=(0x0; 0x3c))"}, + {6, "R2_w=pkt(id=0,off=0,r=8,imm=0)"}, + {9, "R6_w=inv(id=0,umax_value=60,var_off=(0x0; 0x3c))"}, /* Adding 14 makes R6 be (4n+2) */ - {11, "R6_w=inv(id=0,umin_value=14,umax_value=74,var_off=(0x2; 0x7c))"}, + {10, "R6_w=inv(id=0,umin_value=14,umax_value=74,var_off=(0x2; 0x7c))"}, /* Subtracting from packet pointer overflows ubounds */ {13, "R5_w=pkt(id=2,off=0,r=8,umin_value=18446744073709551542,umax_value=18446744073709551602,var_off=(0xffffffffffffff82; 0x7c)"}, /* New unknown value in R7 is (4n), >= 76 */ - {15, "R7_w=inv(id=0,umin_value=76,umax_value=1096,var_off=(0x0; 0x7fc))"}, + {14, "R7_w=inv(id=0,umin_value=76,umax_value=1096,var_off=(0x0; 0x7fc))"}, /* Adding it to packet pointer gives nice bounds again */ {16, "R5_w=pkt(id=3,off=0,r=0,umin_value=2,umax_value=1082,var_off=(0x2; 0xfffffffc)"}, /* At the time the word size load is performed from R5, @@ -625,12 +625,15 @@ static int do_test_single(struct bpf_align_test *test) line_ptr = strtok(bpf_vlog_copy, "\n"); for (i = 0; i < MAX_MATCHES; i++) { struct bpf_reg_match m = test->matches[i]; + int tmp; if (!m.match) break; while (line_ptr) { cur_line = -1; sscanf(line_ptr, "%u: ", &cur_line); + if (cur_line == -1) + sscanf(line_ptr, "from %u to %u: ", &tmp, &cur_line); if (cur_line == m.line) break; line_ptr = strtok(NULL, "\n"); @@ -642,7 +645,19 @@ static int do_test_single(struct bpf_align_test *test) printf("%s", bpf_vlog); break; } + /* Check the next line as well in case the previous line + * did not have a corresponding bpf insn. Example: + * func#0 @0 + * 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 + * 0: (b7) r3 = 2 ; R3_w=inv2 + */ if (!strstr(line_ptr, m.match)) { + cur_line = -1; + line_ptr = strtok(NULL, "\n"); + sscanf(line_ptr, "%u: ", &cur_line); + } + if (cur_line != m.line || !line_ptr || + !strstr(line_ptr, m.match)) { printf("Failed to find match %u: %s\n", m.line, m.match); ret = 1; diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c b/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c index 0a6c5f00abd4..dbe56fa8582d 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c @@ -65,8 +65,8 @@ void serial_test_bpf_obj_id(void) if (CHECK_FAIL(err)) goto done; - prog = bpf_object__find_program_by_title(objs[i], - "raw_tp/sys_enter"); + prog = bpf_object__find_program_by_name(objs[i], + "test_obj_id"); if (CHECK_FAIL(!prog)) goto done; links[i] = bpf_program__attach(prog); diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c index 8daca0ac909f..8f7a1cef7d87 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c @@ -217,7 +217,7 @@ static bool found; static int libbpf_debug_print(enum libbpf_print_level level, const char *format, va_list args) { - const char *log_buf; + const char *prog_name, *log_buf; if (level != LIBBPF_WARN || !strstr(format, "-- BEGIN PROG LOAD LOG --")) { @@ -225,15 +225,14 @@ static int libbpf_debug_print(enum libbpf_print_level level, return 0; } - /* skip prog_name */ - va_arg(args, char *); + prog_name = va_arg(args, char *); log_buf = va_arg(args, char *); if (!log_buf) goto out; if (err_str && strstr(log_buf, err_str) != NULL) found = true; out: - printf(format, log_buf); + printf(format, prog_name, log_buf); return 0; } diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 01b776a7beeb..8ba53acf9eb4 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -22,7 +22,6 @@ #include <bpf/libbpf.h> #include <bpf/btf.h> -#include "bpf_rlimit.h" #include "bpf_util.h" #include "../test_btf.h" #include "test_progs.h" diff --git a/tools/testing/selftests/bpf/prog_tests/connect_force_port.c b/tools/testing/selftests/bpf/prog_tests/connect_force_port.c index ca574e1e30e6..9c4325f4aef2 100644 --- a/tools/testing/selftests/bpf/prog_tests/connect_force_port.c +++ b/tools/testing/selftests/bpf/prog_tests/connect_force_port.c @@ -67,9 +67,9 @@ static int run_test(int cgroup_fd, int server_fd, int family, int type) goto close_bpf_object; } - prog = bpf_object__find_program_by_title(obj, v4 ? - "cgroup/connect4" : - "cgroup/connect6"); + prog = bpf_object__find_program_by_name(obj, v4 ? + "connect4" : + "connect6"); if (CHECK(!prog, "find_prog", "connect prog not found\n")) { err = -EIO; goto close_bpf_object; @@ -83,9 +83,9 @@ static int run_test(int cgroup_fd, int server_fd, int family, int type) goto close_bpf_object; } - prog = bpf_object__find_program_by_title(obj, v4 ? - "cgroup/getpeername4" : - "cgroup/getpeername6"); + prog = bpf_object__find_program_by_name(obj, v4 ? + "getpeername4" : + "getpeername6"); if (CHECK(!prog, "find_prog", "getpeername prog not found\n")) { err = -EIO; goto close_bpf_object; @@ -99,9 +99,9 @@ static int run_test(int cgroup_fd, int server_fd, int family, int type) goto close_bpf_object; } - prog = bpf_object__find_program_by_title(obj, v4 ? - "cgroup/getsockname4" : - "cgroup/getsockname6"); + prog = bpf_object__find_program_by_name(obj, v4 ? + "getsockname4" : + "getsockname6"); if (CHECK(!prog, "find_prog", "getsockname prog not found\n")) { err = -EIO; goto close_bpf_object; diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c index 44a9868c70ea..b8bdd1c3efca 100644 --- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c +++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c @@ -10,7 +10,7 @@ static int duration = 0; #define STRUCT_TO_CHAR_PTR(struct_name) (const char *)&(struct struct_name) -#define MODULES_CASE(name, sec_name, tp_name) { \ +#define MODULES_CASE(name, pg_name, tp_name) { \ .case_name = name, \ .bpf_obj_file = "test_core_reloc_module.o", \ .btf_src_file = NULL, /* find in kernel module BTFs */ \ @@ -28,7 +28,7 @@ static int duration = 0; .comm_len = sizeof("test_progs"), \ }, \ .output_len = sizeof(struct core_reloc_module_output), \ - .prog_sec_name = sec_name, \ + .prog_name = pg_name, \ .raw_tp_name = tp_name, \ .trigger = __trigger_module_test_read, \ .needs_testmod = true, \ @@ -43,7 +43,9 @@ static int duration = 0; #define FLAVORS_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_flavors.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" \ + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_flavors" \ #define FLAVORS_CASE(name) { \ FLAVORS_CASE_COMMON(name), \ @@ -66,7 +68,9 @@ static int duration = 0; #define NESTING_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_nesting.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_nesting" \ #define NESTING_CASE(name) { \ NESTING_CASE_COMMON(name), \ @@ -91,7 +95,9 @@ static int duration = 0; #define ARRAYS_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_arrays.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_arrays" \ #define ARRAYS_CASE(name) { \ ARRAYS_CASE_COMMON(name), \ @@ -123,7 +129,9 @@ static int duration = 0; #define PRIMITIVES_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_primitives.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_primitives" \ #define PRIMITIVES_CASE(name) { \ PRIMITIVES_CASE_COMMON(name), \ @@ -158,6 +166,8 @@ static int duration = 0; .e = 5, .f = 6, .g = 7, .h = 8, \ }, \ .output_len = sizeof(struct core_reloc_mods_output), \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_mods", \ } #define PTR_AS_ARR_CASE(name) { \ @@ -174,6 +184,8 @@ static int duration = 0; .a = 3, \ }, \ .output_len = sizeof(struct core_reloc_ptr_as_arr), \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_ptr_as_arr", \ } #define INTS_DATA(struct_name) STRUCT_TO_CHAR_PTR(struct_name) { \ @@ -190,7 +202,9 @@ static int duration = 0; #define INTS_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_ints.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_ints" #define INTS_CASE(name) { \ INTS_CASE_COMMON(name), \ @@ -208,7 +222,9 @@ static int duration = 0; #define FIELD_EXISTS_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_existence.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" \ + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_existence" #define BITFIELDS_CASE_COMMON(objfile, test_name_prefix, name) \ .case_name = test_name_prefix#name, \ @@ -223,6 +239,8 @@ static int duration = 0; .output = STRUCT_TO_CHAR_PTR(core_reloc_bitfields_output) \ __VA_ARGS__, \ .output_len = sizeof(struct core_reloc_bitfields_output), \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_bitfields", \ }, { \ BITFIELDS_CASE_COMMON("test_core_reloc_bitfields_direct.o", \ "direct:", name), \ @@ -231,7 +249,7 @@ static int duration = 0; .output = STRUCT_TO_CHAR_PTR(core_reloc_bitfields_output) \ __VA_ARGS__, \ .output_len = sizeof(struct core_reloc_bitfields_output), \ - .prog_sec_name = "tp_btf/sys_enter", \ + .prog_name = "test_core_bitfields_direct", \ } @@ -239,17 +257,21 @@ static int duration = 0; BITFIELDS_CASE_COMMON("test_core_reloc_bitfields_probed.o", \ "probed:", name), \ .fails = true, \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_bitfields", \ }, { \ BITFIELDS_CASE_COMMON("test_core_reloc_bitfields_direct.o", \ "direct:", name), \ - .prog_sec_name = "tp_btf/sys_enter", \ .fails = true, \ + .prog_name = "test_core_bitfields_direct", \ } #define SIZE_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_size.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_size" #define SIZE_OUTPUT_DATA(type) \ STRUCT_TO_CHAR_PTR(core_reloc_size_output) { \ @@ -277,8 +299,10 @@ static int duration = 0; #define TYPE_BASED_CASE_COMMON(name) \ .case_name = #name, \ - .bpf_obj_file = "test_core_reloc_type_based.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" \ + .bpf_obj_file = "test_core_reloc_type_based.o", \ + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_type_based" #define TYPE_BASED_CASE(name, ...) { \ TYPE_BASED_CASE_COMMON(name), \ @@ -295,7 +319,9 @@ static int duration = 0; #define TYPE_ID_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_type_id.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" \ + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_type_id" #define TYPE_ID_CASE(name, setup_fn) { \ TYPE_ID_CASE_COMMON(name), \ @@ -312,7 +338,9 @@ static int duration = 0; #define ENUMVAL_CASE_COMMON(name) \ .case_name = #name, \ .bpf_obj_file = "test_core_reloc_enumval.o", \ - .btf_src_file = "btf__core_reloc_" #name ".o" \ + .btf_src_file = "btf__core_reloc_" #name ".o", \ + .raw_tp_name = "sys_enter", \ + .prog_name = "test_core_enumval" #define ENUMVAL_CASE(name, ...) { \ ENUMVAL_CASE_COMMON(name), \ @@ -342,7 +370,7 @@ struct core_reloc_test_case { bool fails; bool needs_testmod; bool relaxed_core_relocs; - const char *prog_sec_name; + const char *prog_name; const char *raw_tp_name; setup_test_fn setup; trigger_test_fn trigger; @@ -497,11 +525,13 @@ static struct core_reloc_test_case test_cases[] = { .comm_len = sizeof("test_progs"), }, .output_len = sizeof(struct core_reloc_kernel_output), + .raw_tp_name = "sys_enter", + .prog_name = "test_core_kernel", }, /* validate we can find kernel module BTF types for relocs/attach */ - MODULES_CASE("module_probed", "raw_tp/bpf_testmod_test_read", "bpf_testmod_test_read"), - MODULES_CASE("module_direct", "tp_btf/bpf_testmod_test_read", NULL), + MODULES_CASE("module_probed", "test_core_module_probed", "bpf_testmod_test_read"), + MODULES_CASE("module_direct", "test_core_module_direct", NULL), /* validate BPF program can use multiple flavors to match against * single target BTF type @@ -580,6 +610,8 @@ static struct core_reloc_test_case test_cases[] = { .c = 0, /* BUG in clang, should be 3 */ }, .output_len = sizeof(struct core_reloc_misc_output), + .raw_tp_name = "sys_enter", + .prog_name = "test_core_misc", }, /* validate field existence checks */ @@ -848,14 +880,9 @@ void test_core_reloc(void) if (!ASSERT_OK_PTR(obj, "obj_open")) goto cleanup; - probe_name = "raw_tracepoint/sys_enter"; - tp_name = "sys_enter"; - if (test_case->prog_sec_name) { - probe_name = test_case->prog_sec_name; - tp_name = test_case->raw_tp_name; /* NULL for tp_btf */ - } - - prog = bpf_object__find_program_by_title(obj, probe_name); + probe_name = test_case->prog_name; + tp_name = test_case->raw_tp_name; /* NULL for tp_btf */ + prog = bpf_object__find_program_by_name(obj, probe_name); if (CHECK(!prog, "find_probe", "prog '%s' not found\n", probe_name)) goto cleanup; diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c index fdd603ebda28..c52f99f6a909 100644 --- a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c @@ -101,6 +101,8 @@ static void test_fexit_bpf2bpf_common(const char *obj_file, for (i = 0; i < prog_cnt; i++) { struct bpf_link_info link_info; + struct bpf_program *pos; + const char *pos_sec_name; char *tgt_name; __s32 btf_id; @@ -109,7 +111,14 @@ static void test_fexit_bpf2bpf_common(const char *obj_file, goto close_prog; btf_id = btf__find_by_name_kind(btf, tgt_name + 1, BTF_KIND_FUNC); - prog[i] = bpf_object__find_program_by_title(obj, prog_name[i]); + prog[i] = NULL; + bpf_object__for_each_program(pos, obj) { + pos_sec_name = bpf_program__section_name(pos); + if (pos_sec_name && !strcmp(pos_sec_name, prog_name[i])) { + prog[i] = pos; + break; + } + } if (!ASSERT_OK_PTR(prog[i], prog_name[i])) goto close_prog; @@ -211,8 +220,8 @@ static void test_func_replace_verify(void) static int test_second_attach(struct bpf_object *obj) { - const char *prog_name = "freplace/get_constant"; - const char *tgt_name = prog_name + 9; /* cut off freplace/ */ + const char *prog_name = "security_new_get_constant"; + const char *tgt_name = "get_constant"; const char *tgt_obj_file = "./test_pkt_access.o"; struct bpf_program *prog = NULL; struct bpf_object *tgt_obj; @@ -220,7 +229,7 @@ static int test_second_attach(struct bpf_object *obj) struct bpf_link *link; int err = 0, tgt_fd; - prog = bpf_object__find_program_by_title(obj, prog_name); + prog = bpf_object__find_program_by_name(obj, prog_name); if (CHECK(!prog, "find_prog", "prog %s not found\n", prog_name)) return -ENOENT; diff --git a/tools/testing/selftests/bpf/prog_tests/get_func_args_test.c b/tools/testing/selftests/bpf/prog_tests/get_func_args_test.c new file mode 100644 index 000000000000..85c427119fe9 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/get_func_args_test.c @@ -0,0 +1,44 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <test_progs.h> +#include "get_func_args_test.skel.h" + +void test_get_func_args_test(void) +{ + struct get_func_args_test *skel = NULL; + __u32 duration = 0, retval; + int err, prog_fd; + + skel = get_func_args_test__open_and_load(); + if (!ASSERT_OK_PTR(skel, "get_func_args_test__open_and_load")) + return; + + err = get_func_args_test__attach(skel); + if (!ASSERT_OK(err, "get_func_args_test__attach")) + goto cleanup; + + /* This runs bpf_fentry_test* functions and triggers + * fentry/fexit programs. + */ + prog_fd = bpf_program__fd(skel->progs.test1); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + ASSERT_OK(err, "test_run"); + ASSERT_EQ(retval, 0, "test_run"); + + /* This runs bpf_modify_return_test function and triggers + * fmod_ret_test and fexit_test programs. + */ + prog_fd = bpf_program__fd(skel->progs.fmod_ret_test); + err = bpf_prog_test_run(prog_fd, 1, NULL, 0, + NULL, NULL, &retval, &duration); + ASSERT_OK(err, "test_run"); + ASSERT_EQ(retval, 1234, "test_run"); + + ASSERT_EQ(skel->bss->test1_result, 1, "test1_result"); + ASSERT_EQ(skel->bss->test2_result, 1, "test2_result"); + ASSERT_EQ(skel->bss->test3_result, 1, "test3_result"); + ASSERT_EQ(skel->bss->test4_result, 1, "test4_result"); + +cleanup: + get_func_args_test__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c index 977ab433a946..e834a01de16a 100644 --- a/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c +++ b/tools/testing/selftests/bpf/prog_tests/get_stack_raw_tp.c @@ -89,7 +89,7 @@ void test_get_stack_raw_tp(void) { const char *file = "./test_get_stack_rawtp.o"; const char *file_err = "./test_get_stack_rawtp_err.o"; - const char *prog_name = "raw_tracepoint/sys_enter"; + const char *prog_name = "bpf_prog1"; int i, err, prog_fd, exp_cnt = MAX_CNT_RAWTP; struct perf_buffer *pb = NULL; struct bpf_link *link = NULL; @@ -107,7 +107,7 @@ void test_get_stack_raw_tp(void) if (CHECK(err, "prog_load raw tp", "err %d errno %d\n", err, errno)) return; - prog = bpf_object__find_program_by_title(obj, prog_name); + prog = bpf_object__find_program_by_name(obj, prog_name); if (CHECK(!prog, "find_probe", "prog '%s' not found\n", prog_name)) goto close_prog; diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c index 79f6bd1e50d6..f6933b06daf8 100644 --- a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c @@ -8,6 +8,7 @@ #include "test_ksyms_btf_null_check.skel.h" #include "test_ksyms_weak.skel.h" #include "test_ksyms_weak.lskel.h" +#include "test_ksyms_btf_write_check.skel.h" static int duration; @@ -137,6 +138,16 @@ cleanup: test_ksyms_weak_lskel__destroy(skel); } +static void test_write_check(void) +{ + struct test_ksyms_btf_write_check *skel; + + skel = test_ksyms_btf_write_check__open_and_load(); + ASSERT_ERR_PTR(skel, "unexpected load of a prog writing to ksym memory\n"); + + test_ksyms_btf_write_check__destroy(skel); +} + void test_ksyms_btf(void) { int percpu_datasec; @@ -167,4 +178,7 @@ void test_ksyms_btf(void) if (test__start_subtest("weak_ksyms_lskel")) test_weak_syms_lskel(); + + if (test__start_subtest("write_check")) + test_write_check(); } diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c new file mode 100644 index 000000000000..9f766ddd946a --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c @@ -0,0 +1,124 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2021 Facebook */ + +#include <test_progs.h> +#include <bpf/btf.h> + +void test_libbpf_probe_prog_types(void) +{ + struct btf *btf; + const struct btf_type *t; + const struct btf_enum *e; + int i, n, id; + + btf = btf__parse("/sys/kernel/btf/vmlinux", NULL); + if (!ASSERT_OK_PTR(btf, "btf_parse")) + return; + + /* find enum bpf_prog_type and enumerate each value */ + id = btf__find_by_name_kind(btf, "bpf_prog_type", BTF_KIND_ENUM); + if (!ASSERT_GT(id, 0, "bpf_prog_type_id")) + goto cleanup; + t = btf__type_by_id(btf, id); + if (!ASSERT_OK_PTR(t, "bpf_prog_type_enum")) + goto cleanup; + + for (e = btf_enum(t), i = 0, n = btf_vlen(t); i < n; e++, i++) { + const char *prog_type_name = btf__str_by_offset(btf, e->name_off); + enum bpf_prog_type prog_type = (enum bpf_prog_type)e->val; + int res; + + if (prog_type == BPF_PROG_TYPE_UNSPEC) + continue; + + if (!test__start_subtest(prog_type_name)) + continue; + + res = libbpf_probe_bpf_prog_type(prog_type, NULL); + ASSERT_EQ(res, 1, prog_type_name); + } + +cleanup: + btf__free(btf); +} + +void test_libbpf_probe_map_types(void) +{ + struct btf *btf; + const struct btf_type *t; + const struct btf_enum *e; + int i, n, id; + + btf = btf__parse("/sys/kernel/btf/vmlinux", NULL); + if (!ASSERT_OK_PTR(btf, "btf_parse")) + return; + + /* find enum bpf_map_type and enumerate each value */ + id = btf__find_by_name_kind(btf, "bpf_map_type", BTF_KIND_ENUM); + if (!ASSERT_GT(id, 0, "bpf_map_type_id")) + goto cleanup; + t = btf__type_by_id(btf, id); + if (!ASSERT_OK_PTR(t, "bpf_map_type_enum")) + goto cleanup; + + for (e = btf_enum(t), i = 0, n = btf_vlen(t); i < n; e++, i++) { + const char *map_type_name = btf__str_by_offset(btf, e->name_off); + enum bpf_map_type map_type = (enum bpf_map_type)e->val; + int res; + + if (map_type == BPF_MAP_TYPE_UNSPEC) + continue; + + if (!test__start_subtest(map_type_name)) + continue; + + res = libbpf_probe_bpf_map_type(map_type, NULL); + ASSERT_EQ(res, 1, map_type_name); + } + +cleanup: + btf__free(btf); +} + +void test_libbpf_probe_helpers(void) +{ +#define CASE(prog, helper, supp) { \ + .prog_type_name = "BPF_PROG_TYPE_" # prog, \ + .helper_name = "bpf_" # helper, \ + .prog_type = BPF_PROG_TYPE_ ## prog, \ + .helper_id = BPF_FUNC_ ## helper, \ + .supported = supp, \ +} + const struct case_def { + const char *prog_type_name; + const char *helper_name; + enum bpf_prog_type prog_type; + enum bpf_func_id helper_id; + bool supported; + } cases[] = { + CASE(KPROBE, unspec, false), + CASE(KPROBE, map_lookup_elem, true), + CASE(KPROBE, loop, true), + + CASE(KPROBE, ktime_get_coarse_ns, false), + CASE(SOCKET_FILTER, ktime_get_coarse_ns, true), + + CASE(KPROBE, sys_bpf, false), + CASE(SYSCALL, sys_bpf, true), + }; + size_t case_cnt = ARRAY_SIZE(cases), i; + char buf[128]; + + for (i = 0; i < case_cnt; i++) { + const struct case_def *d = &cases[i]; + int res; + + snprintf(buf, sizeof(buf), "%s+%s", d->prog_type_name, d->helper_name); + + if (!test__start_subtest(buf)) + continue; + + res = libbpf_probe_bpf_helper(d->prog_type, d->helper_id, NULL); + ASSERT_EQ(res, d->supported, buf); + } +} diff --git a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c index 980ac0f2c0bb..1cbd8cd64044 100644 --- a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c +++ b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c @@ -18,7 +18,6 @@ #include <netinet/in.h> #include <bpf/bpf.h> #include <bpf/libbpf.h> -#include "bpf_rlimit.h" #include "bpf_util.h" #include "test_progs.h" diff --git a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c index 57846cc7ce36..597d0467a926 100644 --- a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c +++ b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c @@ -30,7 +30,6 @@ #include <bpf/bpf.h> #include "test_progs.h" -#include "bpf_rlimit.h" #include "bpf_util.h" #include "cgroup_helpers.h" #include "network_helpers.h" diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c index fae40db4d81f..9fc040eaa482 100644 --- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c +++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c @@ -15,7 +15,6 @@ #include "network_helpers.h" #include "cgroup_helpers.h" #include "test_progs.h" -#include "bpf_rlimit.h" #include "test_sock_fields.skel.h" enum bpf_linum_array_idx { diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c b/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c index 6a953f4adfdc..8ed78a9383ba 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c +++ b/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c @@ -136,7 +136,8 @@ static int start_server(void) return fd; } -static int prog_attach(struct bpf_object *obj, int cgroup_fd, const char *title) +static int prog_attach(struct bpf_object *obj, int cgroup_fd, const char *title, + const char *prog_name) { enum bpf_attach_type attach_type; enum bpf_prog_type prog_type; @@ -145,20 +146,20 @@ static int prog_attach(struct bpf_object *obj, int cgroup_fd, const char *title) err = libbpf_prog_type_by_name(title, &prog_type, &attach_type); if (err) { - log_err("Failed to deduct types for %s BPF program", title); + log_err("Failed to deduct types for %s BPF program", prog_name); return -1; } - prog = bpf_object__find_program_by_title(obj, title); + prog = bpf_object__find_program_by_name(obj, prog_name); if (!prog) { - log_err("Failed to find %s BPF program", title); + log_err("Failed to find %s BPF program", prog_name); return -1; } err = bpf_prog_attach(bpf_program__fd(prog), cgroup_fd, attach_type, 0); if (err) { - log_err("Failed to attach %s BPF program", title); + log_err("Failed to attach %s BPF program", prog_name); return -1; } @@ -181,11 +182,11 @@ static void run_test(int cgroup_fd) if (!ASSERT_OK(err, "obj_load")) goto close_bpf_object; - err = prog_attach(obj, cgroup_fd, "cgroup/getsockopt"); + err = prog_attach(obj, cgroup_fd, "cgroup/getsockopt", "_getsockopt"); if (CHECK_FAIL(err)) goto close_bpf_object; - err = prog_attach(obj, cgroup_fd, "cgroup/setsockopt"); + err = prog_attach(obj, cgroup_fd, "cgroup/setsockopt", "_setsockopt"); if (CHECK_FAIL(err)) goto close_bpf_object; diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c index 337493d74ec5..313f0a66232e 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map.c @@ -4,7 +4,7 @@ void test_stacktrace_map(void) { int control_map_fd, stackid_hmap_fd, stackmap_fd, stack_amap_fd; - const char *prog_name = "tracepoint/sched/sched_switch"; + const char *prog_name = "oncpu"; int err, prog_fd, stack_trace_len; const char *file = "./test_stacktrace_map.o"; __u32 key, val, duration = 0; @@ -16,7 +16,7 @@ void test_stacktrace_map(void) if (CHECK(err, "prog_load", "err %d errno %d\n", err, errno)) return; - prog = bpf_object__find_program_by_title(obj, prog_name); + prog = bpf_object__find_program_by_name(obj, prog_name); if (CHECK(!prog, "find_prog", "prog '%s' not found\n", prog_name)) goto close_prog; diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c index 063a14a2060d..1cb8dd36bd8f 100644 --- a/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c +++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_map_raw_tp.c @@ -3,7 +3,7 @@ void test_stacktrace_map_raw_tp(void) { - const char *prog_name = "tracepoint/sched/sched_switch"; + const char *prog_name = "oncpu"; int control_map_fd, stackid_hmap_fd, stackmap_fd; const char *file = "./test_stacktrace_map.o"; __u32 key, val, duration = 0; @@ -16,7 +16,7 @@ void test_stacktrace_map_raw_tp(void) if (CHECK(err, "prog_load raw tp", "err %d errno %d\n", err, errno)) return; - prog = bpf_object__find_program_by_title(obj, prog_name); + prog = bpf_object__find_program_by_name(obj, prog_name); if (CHECK(!prog, "find_prog", "prog '%s' not found\n", prog_name)) goto close_prog; diff --git a/tools/testing/selftests/bpf/prog_tests/test_local_storage.c b/tools/testing/selftests/bpf/prog_tests/test_local_storage.c index d2c16eaae367..26ac26a88026 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_local_storage.c +++ b/tools/testing/selftests/bpf/prog_tests/test_local_storage.c @@ -28,10 +28,6 @@ static unsigned int duration; struct storage { void *inode; unsigned int value; - /* Lock ensures that spin locked versions of local stoage operations - * also work, most operations in this tests are still single threaded - */ - struct bpf_spin_lock lock; }; /* Fork and exec the provided rm binary and return the exit code of the @@ -66,27 +62,24 @@ static int run_self_unlink(int *monitored_pid, const char *rm_path) static bool check_syscall_operations(int map_fd, int obj_fd) { - struct storage val = { .value = TEST_STORAGE_VALUE, .lock = { 0 } }, - lookup_val = { .value = 0, .lock = { 0 } }; + struct storage val = { .value = TEST_STORAGE_VALUE }, + lookup_val = { .value = 0 }; int err; /* Looking up an existing element should fail initially */ - err = bpf_map_lookup_elem_flags(map_fd, &obj_fd, &lookup_val, - BPF_F_LOCK); + err = bpf_map_lookup_elem_flags(map_fd, &obj_fd, &lookup_val, 0); if (CHECK(!err || errno != ENOENT, "bpf_map_lookup_elem", "err:%d errno:%d\n", err, errno)) return false; /* Create a new element */ - err = bpf_map_update_elem(map_fd, &obj_fd, &val, - BPF_NOEXIST | BPF_F_LOCK); + err = bpf_map_update_elem(map_fd, &obj_fd, &val, BPF_NOEXIST); if (CHECK(err < 0, "bpf_map_update_elem", "err:%d errno:%d\n", err, errno)) return false; /* Lookup the newly created element */ - err = bpf_map_lookup_elem_flags(map_fd, &obj_fd, &lookup_val, - BPF_F_LOCK); + err = bpf_map_lookup_elem_flags(map_fd, &obj_fd, &lookup_val, 0); if (CHECK(err < 0, "bpf_map_lookup_elem", "err:%d errno:%d", err, errno)) return false; @@ -102,8 +95,7 @@ static bool check_syscall_operations(int map_fd, int obj_fd) return false; /* The lookup should fail, now that the element has been deleted */ - err = bpf_map_lookup_elem_flags(map_fd, &obj_fd, &lookup_val, - BPF_F_LOCK); + err = bpf_map_lookup_elem_flags(map_fd, &obj_fd, &lookup_val, 0); if (CHECK(!err || errno != ENOENT, "bpf_map_lookup_elem", "err:%d errno:%d\n", err, errno)) return false; diff --git a/tools/testing/selftests/bpf/prog_tests/test_overhead.c b/tools/testing/selftests/bpf/prog_tests/test_overhead.c index 123c68c1917d..05acb376f74d 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_overhead.c +++ b/tools/testing/selftests/bpf/prog_tests/test_overhead.c @@ -56,11 +56,11 @@ static void setaffinity(void) void test_test_overhead(void) { - const char *kprobe_name = "kprobe/__set_task_comm"; - const char *kretprobe_name = "kretprobe/__set_task_comm"; - const char *raw_tp_name = "raw_tp/task_rename"; - const char *fentry_name = "fentry/__set_task_comm"; - const char *fexit_name = "fexit/__set_task_comm"; + const char *kprobe_name = "prog1"; + const char *kretprobe_name = "prog2"; + const char *raw_tp_name = "prog3"; + const char *fentry_name = "prog4"; + const char *fexit_name = "prog5"; const char *kprobe_func = "__set_task_comm"; struct bpf_program *kprobe_prog, *kretprobe_prog, *raw_tp_prog; struct bpf_program *fentry_prog, *fexit_prog; @@ -76,23 +76,23 @@ void test_test_overhead(void) if (!ASSERT_OK_PTR(obj, "obj_open_file")) return; - kprobe_prog = bpf_object__find_program_by_title(obj, kprobe_name); + kprobe_prog = bpf_object__find_program_by_name(obj, kprobe_name); if (CHECK(!kprobe_prog, "find_probe", "prog '%s' not found\n", kprobe_name)) goto cleanup; - kretprobe_prog = bpf_object__find_program_by_title(obj, kretprobe_name); + kretprobe_prog = bpf_object__find_program_by_name(obj, kretprobe_name); if (CHECK(!kretprobe_prog, "find_probe", "prog '%s' not found\n", kretprobe_name)) goto cleanup; - raw_tp_prog = bpf_object__find_program_by_title(obj, raw_tp_name); + raw_tp_prog = bpf_object__find_program_by_name(obj, raw_tp_name); if (CHECK(!raw_tp_prog, "find_probe", "prog '%s' not found\n", raw_tp_name)) goto cleanup; - fentry_prog = bpf_object__find_program_by_title(obj, fentry_name); + fentry_prog = bpf_object__find_program_by_name(obj, fentry_name); if (CHECK(!fentry_prog, "find_probe", "prog '%s' not found\n", fentry_name)) goto cleanup; - fexit_prog = bpf_object__find_program_by_title(obj, fexit_name); + fexit_prog = bpf_object__find_program_by_name(obj, fexit_name); if (CHECK(!fexit_prog, "find_probe", "prog '%s' not found\n", fexit_name)) goto cleanup; diff --git a/tools/testing/selftests/bpf/prog_tests/test_strncmp.c b/tools/testing/selftests/bpf/prog_tests/test_strncmp.c new file mode 100644 index 000000000000..b57a3009465f --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/test_strncmp.c @@ -0,0 +1,167 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2021. Huawei Technologies Co., Ltd */ +#include <test_progs.h> +#include "strncmp_test.skel.h" + +static int trigger_strncmp(const struct strncmp_test *skel) +{ + int cmp; + + usleep(1); + + cmp = skel->bss->cmp_ret; + if (cmp > 0) + return 1; + if (cmp < 0) + return -1; + return 0; +} + +/* + * Compare str and target after making str[i] != target[i]. + * When exp is -1, make str[i] < target[i] and delta = -1. + */ +static void strncmp_full_str_cmp(struct strncmp_test *skel, const char *name, + int exp) +{ + size_t nr = sizeof(skel->bss->str); + char *str = skel->bss->str; + int delta = exp; + int got; + size_t i; + + memcpy(str, skel->rodata->target, nr); + for (i = 0; i < nr - 1; i++) { + str[i] += delta; + + got = trigger_strncmp(skel); + ASSERT_EQ(got, exp, name); + + str[i] -= delta; + } +} + +static void test_strncmp_ret(void) +{ + struct strncmp_test *skel; + struct bpf_program *prog; + int err, got; + + skel = strncmp_test__open(); + if (!ASSERT_OK_PTR(skel, "strncmp_test open")) + return; + + bpf_object__for_each_program(prog, skel->obj) + bpf_program__set_autoload(prog, false); + + bpf_program__set_autoload(skel->progs.do_strncmp, true); + + err = strncmp_test__load(skel); + if (!ASSERT_EQ(err, 0, "strncmp_test load")) + goto out; + + err = strncmp_test__attach(skel); + if (!ASSERT_EQ(err, 0, "strncmp_test attach")) + goto out; + + skel->bss->target_pid = getpid(); + + /* Empty str */ + skel->bss->str[0] = '\0'; + got = trigger_strncmp(skel); + ASSERT_EQ(got, -1, "strncmp: empty str"); + + /* Same string */ + memcpy(skel->bss->str, skel->rodata->target, sizeof(skel->bss->str)); + got = trigger_strncmp(skel); + ASSERT_EQ(got, 0, "strncmp: same str"); + + /* Not-null-termainted string */ + memcpy(skel->bss->str, skel->rodata->target, sizeof(skel->bss->str)); + skel->bss->str[sizeof(skel->bss->str) - 1] = 'A'; + got = trigger_strncmp(skel); + ASSERT_EQ(got, 1, "strncmp: not-null-term str"); + + strncmp_full_str_cmp(skel, "strncmp: less than", -1); + strncmp_full_str_cmp(skel, "strncmp: greater than", 1); +out: + strncmp_test__destroy(skel); +} + +static void test_strncmp_bad_not_const_str_size(void) +{ + struct strncmp_test *skel; + struct bpf_program *prog; + int err; + + skel = strncmp_test__open(); + if (!ASSERT_OK_PTR(skel, "strncmp_test open")) + return; + + bpf_object__for_each_program(prog, skel->obj) + bpf_program__set_autoload(prog, false); + + bpf_program__set_autoload(skel->progs.strncmp_bad_not_const_str_size, + true); + + err = strncmp_test__load(skel); + ASSERT_ERR(err, "strncmp_test load bad_not_const_str_size"); + + strncmp_test__destroy(skel); +} + +static void test_strncmp_bad_writable_target(void) +{ + struct strncmp_test *skel; + struct bpf_program *prog; + int err; + + skel = strncmp_test__open(); + if (!ASSERT_OK_PTR(skel, "strncmp_test open")) + return; + + bpf_object__for_each_program(prog, skel->obj) + bpf_program__set_autoload(prog, false); + + bpf_program__set_autoload(skel->progs.strncmp_bad_writable_target, + true); + + err = strncmp_test__load(skel); + ASSERT_ERR(err, "strncmp_test load bad_writable_target"); + + strncmp_test__destroy(skel); +} + +static void test_strncmp_bad_not_null_term_target(void) +{ + struct strncmp_test *skel; + struct bpf_program *prog; + int err; + + skel = strncmp_test__open(); + if (!ASSERT_OK_PTR(skel, "strncmp_test open")) + return; + + bpf_object__for_each_program(prog, skel->obj) + bpf_program__set_autoload(prog, false); + + bpf_program__set_autoload(skel->progs.strncmp_bad_not_null_term_target, + true); + + err = strncmp_test__load(skel); + ASSERT_ERR(err, "strncmp_test load bad_not_null_term_target"); + + strncmp_test__destroy(skel); +} + +void test_test_strncmp(void) +{ + if (test__start_subtest("strncmp_ret")) + test_strncmp_ret(); + if (test__start_subtest("strncmp_bad_not_const_str_size")) + test_strncmp_bad_not_const_str_size(); + if (test__start_subtest("strncmp_bad_writable_target")) + test_strncmp_bad_writable_target(); + if (test__start_subtest("strncmp_bad_not_null_term_target")) + test_strncmp_bad_not_null_term_target(); +} diff --git a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c index fc146671b20a..9c795ee52b7b 100644 --- a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c +++ b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c @@ -35,7 +35,7 @@ static struct bpf_link *load(struct bpf_object *obj, const char *name) struct bpf_program *prog; int duration = 0; - prog = bpf_object__find_program_by_title(obj, name); + prog = bpf_object__find_program_by_name(obj, name); if (CHECK(!prog, "find_probe", "prog '%s' not found\n", name)) return ERR_PTR(-EINVAL); return bpf_program__attach_trace(prog); @@ -44,8 +44,8 @@ static struct bpf_link *load(struct bpf_object *obj, const char *name) /* TODO: use different target function to run in concurrent mode */ void serial_test_trampoline_count(void) { - const char *fentry_name = "fentry/__set_task_comm"; - const char *fexit_name = "fexit/__set_task_comm"; + const char *fentry_name = "prog1"; + const char *fexit_name = "prog2"; const char *object = "test_trampoline_count.o"; struct inst inst[MAX_TRAMP_PROGS] = {}; int err, i = 0, duration = 0; diff --git a/tools/testing/selftests/bpf/progs/get_func_args_test.c b/tools/testing/selftests/bpf/progs/get_func_args_test.c new file mode 100644 index 000000000000..e0f34a55e697 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/get_func_args_test.c @@ -0,0 +1,123 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> +#include <errno.h> + +char _license[] SEC("license") = "GPL"; + +__u64 test1_result = 0; +SEC("fentry/bpf_fentry_test1") +int BPF_PROG(test1) +{ + __u64 cnt = bpf_get_func_arg_cnt(ctx); + __u64 a = 0, z = 0, ret = 0; + __s64 err; + + test1_result = cnt == 1; + + /* valid arguments */ + err = bpf_get_func_arg(ctx, 0, &a); + + /* We need to cast access to traced function argument values with + * proper type cast, because trampoline uses type specific instruction + * to save it, like for 'int a' with 32-bit mov like: + * + * mov %edi,-0x8(%rbp) + * + * so the upper 4 bytes are not zeroed. + */ + test1_result &= err == 0 && ((int) a == 1); + + /* not valid argument */ + err = bpf_get_func_arg(ctx, 1, &z); + test1_result &= err == -EINVAL; + + /* return value fails in fentry */ + err = bpf_get_func_ret(ctx, &ret); + test1_result &= err == -EOPNOTSUPP; + return 0; +} + +__u64 test2_result = 0; +SEC("fexit/bpf_fentry_test2") +int BPF_PROG(test2) +{ + __u64 cnt = bpf_get_func_arg_cnt(ctx); + __u64 a = 0, b = 0, z = 0, ret = 0; + __s64 err; + + test2_result = cnt == 2; + + /* valid arguments */ + err = bpf_get_func_arg(ctx, 0, &a); + test2_result &= err == 0 && (int) a == 2; + + err = bpf_get_func_arg(ctx, 1, &b); + test2_result &= err == 0 && b == 3; + + /* not valid argument */ + err = bpf_get_func_arg(ctx, 2, &z); + test2_result &= err == -EINVAL; + + /* return value */ + err = bpf_get_func_ret(ctx, &ret); + test2_result &= err == 0 && ret == 5; + return 0; +} + +__u64 test3_result = 0; +SEC("fmod_ret/bpf_modify_return_test") +int BPF_PROG(fmod_ret_test, int _a, int *_b, int _ret) +{ + __u64 cnt = bpf_get_func_arg_cnt(ctx); + __u64 a = 0, b = 0, z = 0, ret = 0; + __s64 err; + + test3_result = cnt == 2; + + /* valid arguments */ + err = bpf_get_func_arg(ctx, 0, &a); + test3_result &= err == 0 && ((int) a == 1); + + err = bpf_get_func_arg(ctx, 1, &b); + test3_result &= err == 0 && ((int *) b == _b); + + /* not valid argument */ + err = bpf_get_func_arg(ctx, 2, &z); + test3_result &= err == -EINVAL; + + /* return value */ + err = bpf_get_func_ret(ctx, &ret); + test3_result &= err == 0 && ret == 0; + + /* change return value, it's checked in fexit_test program */ + return 1234; +} + +__u64 test4_result = 0; +SEC("fexit/bpf_modify_return_test") +int BPF_PROG(fexit_test, int _a, int *_b, int _ret) +{ + __u64 cnt = bpf_get_func_arg_cnt(ctx); + __u64 a = 0, b = 0, z = 0, ret = 0; + __s64 err; + + test4_result = cnt == 2; + + /* valid arguments */ + err = bpf_get_func_arg(ctx, 0, &a); + test4_result &= err == 0 && ((int) a == 1); + + err = bpf_get_func_arg(ctx, 1, &b); + test4_result &= err == 0 && ((int *) b == _b); + + /* not valid argument */ + err = bpf_get_func_arg(ctx, 2, &z); + test4_result &= err == -EINVAL; + + /* return value */ + err = bpf_get_func_ret(ctx, &ret); + test4_result &= err == 0 && ret == 1234; + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/local_storage.c b/tools/testing/selftests/bpf/progs/local_storage.c index 95868bc7ada9..9b1f9b75d5c2 100644 --- a/tools/testing/selftests/bpf/progs/local_storage.c +++ b/tools/testing/selftests/bpf/progs/local_storage.c @@ -20,7 +20,6 @@ int sk_storage_result = -1; struct local_storage { struct inode *exec_inode; __u32 value; - struct bpf_spin_lock lock; }; struct { @@ -58,9 +57,7 @@ int BPF_PROG(unlink_hook, struct inode *dir, struct dentry *victim) bpf_get_current_task_btf(), 0, 0); if (storage) { /* Don't let an executable delete itself */ - bpf_spin_lock(&storage->lock); is_self_unlink = storage->exec_inode == victim->d_inode; - bpf_spin_unlock(&storage->lock); if (is_self_unlink) return -EPERM; } @@ -68,7 +65,7 @@ int BPF_PROG(unlink_hook, struct inode *dir, struct dentry *victim) return 0; } -SEC("lsm/inode_rename") +SEC("lsm.s/inode_rename") int BPF_PROG(inode_rename, struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry, unsigned int flags) @@ -89,10 +86,8 @@ int BPF_PROG(inode_rename, struct inode *old_dir, struct dentry *old_dentry, if (!storage) return 0; - bpf_spin_lock(&storage->lock); if (storage->value != DUMMY_STORAGE_VALUE) inode_storage_result = -1; - bpf_spin_unlock(&storage->lock); err = bpf_inode_storage_delete(&inode_storage_map, old_dentry->d_inode); if (!err) @@ -101,7 +96,7 @@ int BPF_PROG(inode_rename, struct inode *old_dir, struct dentry *old_dentry, return 0; } -SEC("lsm/socket_bind") +SEC("lsm.s/socket_bind") int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address, int addrlen) { @@ -117,10 +112,8 @@ int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address, if (!storage) return 0; - bpf_spin_lock(&storage->lock); if (storage->value != DUMMY_STORAGE_VALUE) sk_storage_result = -1; - bpf_spin_unlock(&storage->lock); err = bpf_sk_storage_delete(&sk_storage_map, sock->sk); if (!err) @@ -129,7 +122,7 @@ int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address, return 0; } -SEC("lsm/socket_post_create") +SEC("lsm.s/socket_post_create") int BPF_PROG(socket_post_create, struct socket *sock, int family, int type, int protocol, int kern) { @@ -144,9 +137,7 @@ int BPF_PROG(socket_post_create, struct socket *sock, int family, int type, if (!storage) return 0; - bpf_spin_lock(&storage->lock); storage->value = DUMMY_STORAGE_VALUE; - bpf_spin_unlock(&storage->lock); return 0; } @@ -154,7 +145,7 @@ int BPF_PROG(socket_post_create, struct socket *sock, int family, int type, /* This uses the local storage to remember the inode of the binary that a * process was originally executing. */ -SEC("lsm/bprm_committed_creds") +SEC("lsm.s/bprm_committed_creds") void BPF_PROG(exec, struct linux_binprm *bprm) { __u32 pid = bpf_get_current_pid_tgid() >> 32; @@ -166,18 +157,13 @@ void BPF_PROG(exec, struct linux_binprm *bprm) storage = bpf_task_storage_get(&task_storage_map, bpf_get_current_task_btf(), 0, BPF_LOCAL_STORAGE_GET_F_CREATE); - if (storage) { - bpf_spin_lock(&storage->lock); + if (storage) storage->exec_inode = bprm->file->f_inode; - bpf_spin_unlock(&storage->lock); - } storage = bpf_inode_storage_get(&inode_storage_map, bprm->file->f_inode, 0, BPF_LOCAL_STORAGE_GET_F_CREATE); if (!storage) return; - bpf_spin_lock(&storage->lock); storage->value = DUMMY_STORAGE_VALUE; - bpf_spin_unlock(&storage->lock); } diff --git a/tools/testing/selftests/bpf/progs/strncmp_bench.c b/tools/testing/selftests/bpf/progs/strncmp_bench.c new file mode 100644 index 000000000000..18373a7df76e --- /dev/null +++ b/tools/testing/selftests/bpf/progs/strncmp_bench.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2021. Huawei Technologies Co., Ltd */ +#include <linux/types.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> + +#define STRNCMP_STR_SZ 4096 + +/* Will be updated by benchmark before program loading */ +const volatile unsigned int cmp_str_len = 1; +const char target[STRNCMP_STR_SZ]; + +long hits = 0; +char str[STRNCMP_STR_SZ]; + +char _license[] SEC("license") = "GPL"; + +static __always_inline int local_strncmp(const char *s1, unsigned int sz, + const char *s2) +{ + int ret = 0; + unsigned int i; + + for (i = 0; i < sz; i++) { + /* E.g. 0xff > 0x31 */ + ret = (unsigned char)s1[i] - (unsigned char)s2[i]; + if (ret || !s1[i]) + break; + } + + return ret; +} + +SEC("tp/syscalls/sys_enter_getpgid") +int strncmp_no_helper(void *ctx) +{ + if (local_strncmp(str, cmp_str_len + 1, target) < 0) + __sync_add_and_fetch(&hits, 1); + return 0; +} + +SEC("tp/syscalls/sys_enter_getpgid") +int strncmp_helper(void *ctx) +{ + if (bpf_strncmp(str, cmp_str_len + 1, target) < 0) + __sync_add_and_fetch(&hits, 1); + return 0; +} + diff --git a/tools/testing/selftests/bpf/progs/strncmp_test.c b/tools/testing/selftests/bpf/progs/strncmp_test.c new file mode 100644 index 000000000000..900d930d48a8 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/strncmp_test.c @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2021. Huawei Technologies Co., Ltd */ +#include <stdbool.h> +#include <linux/types.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> + +#define STRNCMP_STR_SZ 8 + +const char target[STRNCMP_STR_SZ] = "EEEEEEE"; +char str[STRNCMP_STR_SZ]; +int cmp_ret = 0; +int target_pid = 0; + +const char no_str_target[STRNCMP_STR_SZ] = "12345678"; +char writable_target[STRNCMP_STR_SZ]; +unsigned int no_const_str_size = STRNCMP_STR_SZ; + +char _license[] SEC("license") = "GPL"; + +SEC("tp/syscalls/sys_enter_nanosleep") +int do_strncmp(void *ctx) +{ + if ((bpf_get_current_pid_tgid() >> 32) != target_pid) + return 0; + + cmp_ret = bpf_strncmp(str, STRNCMP_STR_SZ, target); + return 0; +} + +SEC("tp/syscalls/sys_enter_nanosleep") +int strncmp_bad_not_const_str_size(void *ctx) +{ + /* The value of string size is not const, so will fail */ + cmp_ret = bpf_strncmp(str, no_const_str_size, target); + return 0; +} + +SEC("tp/syscalls/sys_enter_nanosleep") +int strncmp_bad_writable_target(void *ctx) +{ + /* Compared target is not read-only, so will fail */ + cmp_ret = bpf_strncmp(str, STRNCMP_STR_SZ, writable_target); + return 0; +} + +SEC("tp/syscalls/sys_enter_nanosleep") +int strncmp_bad_not_null_term_target(void *ctx) +{ + /* Compared target is not null-terminated, so will fail */ + cmp_ret = bpf_strncmp(str, STRNCMP_STR_SZ, no_str_target); + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_btf_write_check.c b/tools/testing/selftests/bpf/progs/test_ksyms_btf_write_check.c new file mode 100644 index 000000000000..2180c41cd890 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_ksyms_btf_write_check.c @@ -0,0 +1,29 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Google */ + +#include "vmlinux.h" + +#include <bpf/bpf_helpers.h> + +extern const int bpf_prog_active __ksym; /* int type global var. */ + +SEC("raw_tp/sys_enter") +int handler(const void *ctx) +{ + int *active; + __u32 cpu; + + cpu = bpf_get_smp_processor_id(); + active = (int *)bpf_per_cpu_ptr(&bpf_prog_active, cpu); + if (active) { + /* Kernel memory obtained from bpf_{per,this}_cpu_ptr + * is read-only, should _not_ pass verification. + */ + /* WRITE_ONCE */ + *(volatile int *)active = -1; + } + + return 0; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_cpp.cpp b/tools/testing/selftests/bpf/test_cpp.cpp index a8d2e9a87fbf..e00201de2890 100644 --- a/tools/testing/selftests/bpf/test_cpp.cpp +++ b/tools/testing/selftests/bpf/test_cpp.cpp @@ -7,9 +7,15 @@ /* do nothing, just make sure we can link successfully */ +static void dump_printf(void *ctx, const char *fmt, va_list args) +{ +} + int main(int argc, char *argv[]) { + struct btf_dump_opts opts = { }; struct test_core_extern *skel; + struct btf *btf; /* libbpf.h */ libbpf_set_print(NULL); @@ -18,7 +24,8 @@ int main(int argc, char *argv[]) bpf_prog_get_fd_by_id(0); /* btf.h */ - btf__new(NULL, 0); + btf = btf__new(NULL, 0); + btf_dump__new(btf, dump_printf, nullptr, &opts); /* BPF skeleton */ skel = test_core_extern__open_and_load(); diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c index f4cd658bbe00..50f7e74ca0b9 100644 --- a/tools/testing/selftests/bpf/test_maps.c +++ b/tools/testing/selftests/bpf/test_maps.c @@ -23,7 +23,6 @@ #include <bpf/libbpf.h> #include "bpf_util.h" -#include "bpf_rlimit.h" #include "test_maps.h" #include "testing_helpers.h" diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c index 296928948bb9..2ecb73a65206 100644 --- a/tools/testing/selftests/bpf/test_progs.c +++ b/tools/testing/selftests/bpf/test_progs.c @@ -4,7 +4,6 @@ #define _GNU_SOURCE #include "test_progs.h" #include "cgroup_helpers.h" -#include "bpf_rlimit.h" #include <argp.h> #include <pthread.h> #include <sched.h> @@ -1342,7 +1341,6 @@ int main(int argc, char **argv) /* Use libbpf 1.0 API mode */ libbpf_set_strict_mode(LIBBPF_STRICT_ALL); - libbpf_set_print(libbpf_print_fn); srand(time(NULL)); diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 4f1fe8af10a4..76cd903117af 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -41,7 +41,6 @@ # define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1 # endif #endif -#include "bpf_rlimit.h" #include "bpf_rand.h" #include "bpf_util.h" #include "test_btf.h" @@ -701,22 +700,18 @@ static int create_sk_storage_map(void) static int create_map_timer(void) { - struct bpf_create_map_attr attr = { - .name = "test_map", - .map_type = BPF_MAP_TYPE_ARRAY, - .key_size = 4, - .value_size = 16, - .max_entries = 1, + LIBBPF_OPTS(bpf_map_create_opts, opts, .btf_key_type_id = 1, .btf_value_type_id = 5, - }; + ); int fd, btf_fd; btf_fd = load_btf(); if (btf_fd < 0) return -1; - attr.btf_fd = btf_fd; - fd = bpf_create_map_xattr(&attr); + + opts.btf_fd = btf_fd; + fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, "test_map", 4, 16, 1, &opts); if (fd < 0) printf("Failed to create map with timer\n"); return fd; @@ -1399,6 +1394,9 @@ int main(int argc, char **argv) return EXIT_FAILURE; } + /* Use libbpf 1.0 API mode */ + libbpf_set_strict_mode(LIBBPF_STRICT_ALL); + bpf_semi_rand_init(); return do_test(unpriv, from, to); } diff --git a/tools/testing/selftests/bpf/verifier/btf_ctx_access.c b/tools/testing/selftests/bpf/verifier/btf_ctx_access.c new file mode 100644 index 000000000000..6340db6b46dc --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/btf_ctx_access.c @@ -0,0 +1,12 @@ +{ + "btf_ctx_access accept", + .insns = { + BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 8), /* load 2nd argument value (int pointer) */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, + .prog_type = BPF_PROG_TYPE_TRACING, + .expected_attach_type = BPF_TRACE_FENTRY, + .kfunc = "bpf_modify_return_test", +}, diff --git a/tools/testing/selftests/bpf/vmtest.sh b/tools/testing/selftests/bpf/vmtest.sh index 5e43c79ddc6e..b3afd43549fa 100755 --- a/tools/testing/selftests/bpf/vmtest.sh +++ b/tools/testing/selftests/bpf/vmtest.sh @@ -32,7 +32,7 @@ ROOTFS_IMAGE="root.img" OUTPUT_DIR="$HOME/.bpf_selftests" KCONFIG_URL="https://raw.githubusercontent.com/libbpf/libbpf/master/travis-ci/vmtest/configs/config-latest.${ARCH}" KCONFIG_API_URL="https://api.github.com/repos/libbpf/libbpf/contents/travis-ci/vmtest/configs/config-latest.${ARCH}" -INDEX_URL="https://raw.githubusercontent.com/libbpf/libbpf/master/travis-ci/vmtest/configs/INDEX" +INDEX_URL="https://raw.githubusercontent.com/libbpf/ci/master/INDEX" NUM_COMPILE_JOBS="$(nproc)" LOG_FILE_BASE="$(date +"bpf_selftests.%Y-%m-%d_%H-%M-%S")" LOG_FILE="${LOG_FILE_BASE}.log" |