delta/linux-stable.git - git.kernel.org: pub/scm/linux/kernel/git/stable/linux-stable.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	KEYS: call_sbin_request_key() must write lock keyrings before modifying them	David Howells	2010-05-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	call_sbin_request_key() creates a keyring and then attempts to insert a link to the authorisation key into that keyring, but does so without holding a write lock on the keyring semaphore. It will normally get away with this because it hasn't told anyone that the keyring exists yet. The new keyring, however, has had its serial number published, which means it can be accessed directly by that handle. This was found by a previous patch that adds RCU lockdep checks to the code that reads the keyring payload pointer, which includes a check that the keyring semaphore is actually locked. Without this patch, the following command: keyctl request2 user b a @s will provoke the following lockdep warning is displayed in dmesg: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- security/keys/keyring.c:727 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by keyctl/2076: #0: (key_types_sem){.+.+.+}, at: [<ffffffff811a5b29>] key_type_lookup+0x1c/0x71 #1: (keyring_serialise_link_sem){+.+.+.}, at: [<ffffffff811a6d1e>] __key_link+0x4d/0x3c5 stack backtrace: Pid: 2076, comm: keyctl Not tainted 2.6.34-rc6-cachefs #54 Call Trace: [<ffffffff81051fdc>] lockdep_rcu_dereference+0xaa/0xb2 [<ffffffff811a6d1e>] ? __key_link+0x4d/0x3c5 [<ffffffff811a6e6f>] __key_link+0x19e/0x3c5 [<ffffffff811a5952>] ? __key_instantiate_and_link+0xb1/0xdc [<ffffffff811a59bf>] ? key_instantiate_and_link+0x42/0x5f [<ffffffff811aa0dc>] call_sbin_request_key+0xe7/0x33b [<ffffffff8139376a>] ? mutex_unlock+0x9/0xb [<ffffffff811a5952>] ? __key_instantiate_and_link+0xb1/0xdc [<ffffffff811a59bf>] ? key_instantiate_and_link+0x42/0x5f [<ffffffff811aa6fa>] ? request_key_auth_new+0x1c2/0x23c [<ffffffff810aaf15>] ? cache_alloc_debugcheck_after+0x108/0x173 [<ffffffff811a9d00>] ? request_key_and_link+0x146/0x300 [<ffffffff810ac568>] ? kmem_cache_alloc+0xe1/0x118 [<ffffffff811a9e45>] request_key_and_link+0x28b/0x300 [<ffffffff811a89ac>] sys_request_key+0xf7/0x14a [<ffffffff81052c0b>] ? trace_hardirqs_on_caller+0x10c/0x130 [<ffffffff81394fb9>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Use RCU dereference wrappers in keyring key type code	David Howells	2010-05-05	1	-10/+13
\| \| \| \| \| \| \| \| \| \|	The keyring key type code should use RCU dereference wrappers, even when it holds the keyring's key semaphore. Reported-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: find_keyring_by_name() can gain access to a freed keyring	Toshiyuki Okajima	2010-05-05	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	find_keyring_by_name() can gain access to a keyring that has had its reference count reduced to zero, and is thus ready to be freed. This then allows the dead keyring to be brought back into use whilst it is being destroyed. The following timeline illustrates the process: \|(cleaner) (user) \| \| free_user(user) sys_keyctl() \| \| \| \| key_put(user->session_keyring) keyctl_get_keyring_ID() \| \|\| //=> keyring->usage = 0 \| \| \|schedule_work(&key_cleanup_task) lookup_user_key() \| \|\| \| \| kmem_cache_free(,user) \| \| . \|[KEY_SPEC_USER_KEYRING] \| . install_user_keyrings() \| . \|\| \| key_cleanup() [<= worker_thread()] \|\| \| \| \|\| \| [spin_lock(&key_serial_lock)] \|[mutex_lock(&key_user_keyr..mutex)] \| \| \|\| \| atomic_read() == 0 \|\| \| \|{ rb_ease(&key->serial_node,) } \|\| \| \| \|\| \| [spin_unlock(&key_serial_lock)] \|find_keyring_by_name() \| \| \|\|\| \| keyring_destroy(keyring) \|\|[read_lock(&keyring_name_lock)] \| \|\| \|\|\| \| \|[write_lock(&keyring_name_lock)] \|\|atomic_inc(&keyring->usage) \| \|. \|\|\| * GET freeing keyring * \| \|. \|\|[read_unlock(&keyring_name_lock)] \| \|\| \|\| \| \|list_del() \|[mutex_unlock(&key_user_k..mutex)] \| \|\| \| \| \|[write_unlock(&keyring_name_lock)] INVALID keyring is returned \| \| . \| kmem_cache_free(,keyring) . \| . \| atomic_dec(&keyring->usage) v * DESTROYED * TIME If CONFIG_SLUB_DEBUG=y then we may see the following message generated: ============================================================================= BUG key_jar: Poison overwritten ----------------------------------------------------------------------------- INFO: 0xffff880197a7e200-0xffff880197a7e200. First byte 0x6a instead of 0x6b INFO: Allocated in key_alloc+0x10b/0x35f age=25 cpu=1 pid=5086 INFO: Freed in key_cleanup+0xd0/0xd5 age=12 cpu=1 pid=10 INFO: Slab 0xffffea000592cb90 objects=16 used=2 fp=0xffff880197a7e200 flags=0x200000000000c3 INFO: Object 0xffff880197a7e200 @offset=512 fp=0xffff880197a7e300 Bytes b4 0xffff880197a7e1f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ Object 0xffff880197a7e200: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk Alternatively, we may see a system panic happen, such as: BUG: unable to handle kernel NULL pointer dereference at 0000000000000001 IP: [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9 PGD 6b2b4067 PUD 6a80d067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/kernel/kexec_crash_loaded CPU 1 ... Pid: 31245, comm: su Not tainted 2.6.34-rc5-nofixed-nodebug #2 D2089/PRIMERGY RIP: 0010:[<ffffffff810e61a3>] [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9 RSP: 0018:ffff88006af3bd98 EFLAGS: 00010002 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88007d19900b RDX: 0000000100000000 RSI: 00000000000080d0 RDI: ffffffff81828430 RBP: ffffffff81828430 R08: ffff88000a293750 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000100000 R12: 00000000000080d0 R13: 00000000000080d0 R14: 0000000000000296 R15: ffffffff810f20ce FS: 00007f97116bc700(0000) GS:ffff88000a280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000001 CR3: 000000006a91c000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process su (pid: 31245, threadinfo ffff88006af3a000, task ffff8800374414c0) Stack: 0000000512e0958e 0000000000008000 ffff880037f8d180 0000000000000001 0000000000000000 0000000000008001 ffff88007d199000 ffffffff810f20ce 0000000000008000 ffff88006af3be48 0000000000000024 ffffffff810face3 Call Trace: [<ffffffff810f20ce>] ? get_empty_filp+0x70/0x12f [<ffffffff810face3>] ? do_filp_open+0x145/0x590 [<ffffffff810ce208>] ? tlb_finish_mmu+0x2a/0x33 [<ffffffff810ce43c>] ? unmap_region+0xd3/0xe2 [<ffffffff810e4393>] ? virt_to_head_page+0x9/0x2d [<ffffffff81103916>] ? alloc_fd+0x69/0x10e [<ffffffff810ef4ed>] ? do_sys_open+0x56/0xfc [<ffffffff81008a02>] ? system_call_fastpath+0x16/0x1b Code: 0f 1f 44 00 00 49 89 c6 fa 66 0f 1f 44 00 00 65 4c 8b 04 25 60 e8 00 00 48 8b 45 00 49 01 c0 49 8b 18 48 85 db 74 0d 48 63 45 18 <48> 8b 04 03 49 89 00 eb 14 4c 89 f9 83 ca ff 44 89 e6 48 89 ef RIP [<ffffffff810e61a3>] kmem_cache_alloc+0x5b/0xe9 This problem is that find_keyring_by_name does not confirm that the keyring is valid before accepting it. Skipping keyrings that have been reduced to a zero count seems the way to go. To this end, use atomic_inc_not_zero() to increment the usage count and skip the candidate keyring if that returns false. The following script _may_ cause the bug to happen, but there's no guarantee as the window of opportunity is small: #!/bin/sh LOOP=100000 USER=dummy_user /bin/su -c "exit;" $USER \|\| { /usr/sbin/adduser -m $USER; add=1; } for ((i=0; i<LOOP; i++)) do /bin/su -c "echo '$i' > /dev/null" $USER done (( add == 1 )) && /usr/sbin/userdel -r $USER exit Note that the nominated user must not be in use. An alternative way of testing this may be: for ((i=0; i<100000; i++)) do keyctl session foo /bin/true \|\| break done >&/dev/null as that uses a keyring named "foo" rather than relying on the user and user-session named keyrings. Reported-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Fix RCU handling in key_gc_keyring()	David Howells	2010-05-05	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	key_gc_keyring() needs to either hold the RCU read lock or hold the keyring semaphore if it's going to scan the keyring's list. Given that it only needs to read the key list, and it's doing so under a spinlock, the RCU read lock is the thing to use. Furthermore, the RCU check added in e7b0a61b7929632d36cf052d9e2820ef0a9c1bfe is incorrect as holding the spinlock on key_serial_lock is not grounds for assuming a keyring's pointer list can be read safely. Instead, a simple rcu_dereference() inside of the previously mentioned RCU read lock is what we want. Reported-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Fix an RCU warning in the reading of user keys	David Howells	2010-05-05	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix an RCU warning in the reading of user keys: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- security/keys/user_defined.c:202 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by keyctl/3637: #0: (&key->sem){+++++.}, at: [<ffffffff811a80ae>] keyctl_read_key+0x9c/0xcf stack backtrace: Pid: 3637, comm: keyctl Not tainted 2.6.34-rc5-cachefs #18 Call Trace: [<ffffffff81051f6c>] lockdep_rcu_dereference+0xaa/0xb2 [<ffffffff811aa55f>] user_read+0x47/0x91 [<ffffffff811a80be>] keyctl_read_key+0xac/0xcf [<ffffffff811a8a06>] sys_keyctl+0x75/0xb7 [<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	Merge branch 'for-linus' of ↵	Linus Torvalds	2010-04-27	1	-1/+1
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: keys: don't need to use RCU in keyring_read() as semaphore is held
\| *	keys: don't need to use RCU in keyring_read() as semaphore is held	David Howells	2010-04-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	keyring_read() doesn't need to use rcu_dereference() to access the keyring payload as the caller holds the key semaphore to prevent modifications from happening whilst the data is read out. This should solve the following warning: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- security/keys/keyring.c:204 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by keyctl/2144: #0: (&key->sem){+++++.}, at: [<ffffffff81177f7c>] keyctl_read_key+0x9c/0xcf stack backtrace: Pid: 2144, comm: keyctl Not tainted 2.6.34-rc2-cachefs #113 Call Trace: [<ffffffff8105121f>] lockdep_rcu_dereference+0xaa/0xb2 [<ffffffff811762d5>] keyring_read+0x4d/0xe7 [<ffffffff81177f8c>] keyctl_read_key+0xac/0xcf [<ffffffff811788d4>] sys_keyctl+0x75/0xb9 [<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b Signed-off-by: David Howells <dhowells@redhat.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Morris <jmorris@namei.org>
* \|	keys: the request_key() syscall should link an existing key to the dest keyring	David Howells	2010-04-27	1	-1/+8
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The request_key() system call and request_key_and_link() should make a link from an existing key to the destination keyring (if supplied), not just from a new key to the destination keyring. This can be tested by: ring=`keyctl newring fred @s` keyctl request2 user debug:a a keyctl request user debug:a $ring keyctl list $ring If it says: keyring is empty then it didn't work. If it shows something like: 1 key in keyring: 1070462727: --alswrv 0 0 user: debug:a then it did. request_key() system call is meant to recursively search all your keyrings for the key you desire, and, optionally, if it doesn't exist, call out to userspace to create one for you. If request_key() finds or creates a key, it should, optionally, create a link to that key from the destination keyring specified. Therefore, if, after a successful call to request_key() with a desination keyring specified, you see the destination keyring empty, the code didn't work correctly. If you see the found key in the keyring, then it did - which is what the patch is required for. Signed-off-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	keys: fix an RCU warning	David Howells	2010-04-24	1	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix the following RCU warning: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- security/keys/request_key.c:116 invoked rcu_dereference_check() without protection! This was caused by doing: [root@andromeda ~]# keyctl newring fred @s 539196288 [root@andromeda ~]# keyctl request2 user a a 539196288 request_key: Required key not available Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵	Tejun Heo	2010-03-30	2	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
*	security: Apply lockdep-based checking to rcu_dereference() uses	Paul E. McKenney	2010-02-25	2	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Apply lockdep-ified RCU primitives to key_gc_keyring() and keyring_destroy(). Cc: David Howells <dhowells@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-12-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	Keys: KEYCTL_SESSION_TO_PARENT needs TIF_NOTIFY_RESUME architecture support	Geert Uytterhoeven	2009-12-17	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As of commit ee18d64c1f632043a02e6f5ba5e045bb26a5465f ("KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]"), CONFIG_KEYS=y fails to build on architectures that haven't implemented TIF_NOTIFY_RESUME yet: security/keys/keyctl.c: In function 'keyctl_session_to_parent': security/keys/keyctl.c:1312: error: 'TIF_NOTIFY_RESUME' undeclared (first use in this function) security/keys/keyctl.c:1312: error: (Each undeclared identifier is reported only once security/keys/keyctl.c:1312: error: for each function it appears in.) Make KEYCTL_SESSION_TO_PARENT depend on TIF_NOTIFY_RESUME until m68k, and xtensa have implemented it. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Mike Frysinger <vapier@gentoo.org>
*	keys: PTR_ERR return of wrong pointer in keyctl_get_security()	Roel Kluin	2009-12-17	1	-1/+1
\| \| \| \| \| \| \| \| \|	Return the PTR_ERR of the correct pointer. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	sysctl: Drop & in front of every proc_handler.	Eric W. Biederman	2009-11-18	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \|	For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
*	sysctl security/keys: Remove dead binary sysctl support	Eric W. Biederman	2009-11-12	1	-6/+1
\| \| \| \| \| \| \| \|	Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name and .strategy members of sysctl tables are dead code. Remove them. Cc: David Howells <dhowells@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
*	KEYS: get_instantiation_keyring() should inc the keyring refcount in all cases	David Howells	2009-10-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The destination keyring specified to request_key() and co. is made available to the process that instantiates the key (the slave process started by /sbin/request-key typically). This is passed in the request_key_auth struct as the dest_keyring member. keyctl_instantiate_key and keyctl_negate_key() call get_instantiation_keyring() to get the keyring to attach the newly constructed key to at the end of instantiation. This may be given a specific keyring into which a link will be made later, or it may be asked to find the keyring passed to request_key(). In the former case, it returns a keyring with the refcount incremented by lookup_user_key(); in the latter case, it returns the keyring from the request_key_auth struct - and does _not_ increment the refcount. The latter case will eventually result in an oops when the keyring prematurely runs out of references and gets destroyed. The effect may take some time to show up as the key is destroyed lazily. To fix this, the keyring returned by get_instantiation_keyring() must always have its refcount incremented, no matter where it comes from. This can be tested by setting /etc/request-key.conf to: #OP TYPE DESCRIPTION CALLOUT INFO PROGRAM ARG1 ARG2 ARG3 ... #====== ======= =============== =============== =============================== create * test:* * \|/bin/false %u %g %d %{user:_display} negate * * * /bin/keyctl negate %k 10 @u and then doing: keyctl add user _display aaaaaaaa @u while keyctl request2 user test:x test:x @u && keyctl list @u; do keyctl request2 user test:x test:x @u; sleep 31; keyctl list @u; done which will oops eventually. Changing the negate line to have @u rather than %S at the end is important as that forces the latter case by passing a special keyring ID rather than an actual keyring ID. Reported-by: Alexander Zangerl <az@bond.edu.au> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Alexander Zangerl <az@bond.edu.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	KEYS: Have the garbage collector set its timer for live expired keys	David Howells	2009-09-23	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The key garbage collector sets a timer to start a new collection cycle at the point the earliest key to expire should be considered garbage. However, it currently only does this if the key it is considering hasn't yet expired. If the key being considering has expired, but hasn't yet reached the collection time then it is ignored, and won't be collected until some other key provokes a round of collection. Make the garbage collector set the timer for the earliest key that hasn't yet passed its collection time, rather than the earliest key that hasn't yet expired. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Fix garbage collector	David Howells	2009-09-15	4	-35/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a number of problems with the new key garbage collector: (1) A rogue semicolon in keyring_gc() was causing the initial count of dead keys to be miscalculated. (2) A missing return in keyring_gc() meant that under certain circumstances, the keyring semaphore would be unlocked twice. (3) The key serial tree iterator (key_garbage_collector()) part of the garbage collector has been modified to: (a) Complete each scan of the keyrings before setting the new timer. (b) Only set the new timer for keys that have yet to expire. This means that the new timer is now calculated correctly, and the gc doesn't get into a loop continually scanning for keys that have expired, and preventing other things from happening, like RCU cleaning up the old keyring contents. (c) Perform an extra scan if any keys were garbage collected in this one as a key might become garbage during a scan, and (b) could mean we don't set the timer again. (4) Made key_schedule_gc() take the time at which to do a collection run, rather than the time at which the key expires. This means the collection of dead keys (key type unregistered) can happen immediately. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Unlock tasklist when exiting early from keyctl_session_to_parent	Marc Dionne	2009-09-15	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	When we exit early from keyctl_session_to_parent because of permissions or because the session keyring is the same as the parent, we need to unlock the tasklist. The missing unlock causes the system to hang completely when using keyctl(KEYCTL_SESSION_TO_PARENT) with a keyring shared with the parent. Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]	David Howells	2009-09-02	5	-0/+156
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a keyctl to install a process's session keyring onto its parent. This replaces the parent's session keyring. Because the COW credential code does not permit one process to change another process's credentials directly, the change is deferred until userspace next starts executing again. Normally this will be after a wait() syscall. To support this, three new security hooks have been provided: cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in the blank security creds and key_session_to_parent() - which asks the LSM if the process may replace its parent's session keyring. The replacement may only happen if the process has the same ownership details as its parent, and the process has LINK permission on the session keyring, and the session keyring is owned by the process, and the LSM permits it. Note that this requires alteration to each architecture's notify_resume path. This has been done for all arches barring blackfin, m68k and xtensa, all of which need assembly alteration to support TIF_NOTIFY_RESUME. This allows the replacement to be performed at the point the parent process resumes userspace execution. This allows the userspace AFS pioctl emulation to fully emulate newpag() and the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to alter the parent process's PAG membership. However, since kAFS doesn't use PAGs per se, but rather dumps the keys into the session keyring, the session keyring of the parent must be replaced if, for example, VIOCSETTOK is passed the newpag flag. This can be tested with the following program: #include <stdio.h> #include <stdlib.h> #include <keyutils.h> #define KEYCTL_SESSION_TO_PARENT 18 #define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0) int main(int argc, char **argv) { key_serial_t keyring, key; long ret; keyring = keyctl_join_session_keyring(argv[1]); OSERROR(keyring, "keyctl_join_session_keyring"); key = add_key("user", "a", "b", 1, keyring); OSERROR(key, "add_key"); ret = keyctl(KEYCTL_SESSION_TO_PARENT); OSERROR(ret, "KEYCTL_SESSION_TO_PARENT"); return 0; } Compiled and linked with -lkeyutils, you should see something like: [dhowells@andromeda ~]$ keyctl show Session Keyring -3 --alswrv 4043 4043 keyring: _ses 355907932 --alswrv 4043 -1 \_ keyring: _uid.4043 [dhowells@andromeda ~]$ /tmp/newpag [dhowells@andromeda ~]$ keyctl show Session Keyring -3 --alswrv 4043 4043 keyring: _ses 1055658746 --alswrv 4043 4043 \_ user: a [dhowells@andromeda ~]$ /tmp/newpag hello [dhowells@andromeda ~]$ keyctl show Session Keyring -3 --alswrv 4043 4043 keyring: hello 340417692 --alswrv 4043 4043 \_ user: a Where the test program creates a new session keyring, sticks a user key named 'a' into it and then installs it on its parent. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Do some whitespace cleanups [try #6]	David Howells	2009-09-02	1	-9/+3
\| \| \| \| \| \| \| \|	Do some whitespace cleanups in the key management code. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Make /proc/keys use keyid not numread as file position [try #6]	Serge E. Hallyn	2009-09-02	1	-22/+55
\| \| \| \| \| \| \| \| \| \| \|	Make the file position maintained by /proc/keys represent the ID of the key just read rather than the number of keys read. This should make it faster to perform a lookup as we don't have to scan the key ID tree from the beginning to find the current position. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Add garbage collection for dead, revoked and expired keys. [try #6]	David Howells	2009-09-02	7	-4/+322
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add garbage collection for dead, revoked and expired keys. This involved erasing all links to such keys from keyrings that point to them. At that point, the key will be deleted in the normal manner. Keyrings from which garbage collection occurs are shrunk and their quota consumption reduced as appropriate. Dead keys (for which the key type has been removed) will be garbage collected immediately. Revoked and expired keys will hang around for a number of seconds, as set in /proc/sys/kernel/keys/gc_delay before being automatically removed. The default is 5 minutes. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Flag dead keys to induce EKEYREVOKED [try #6]	David Howells	2009-09-02	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \|	Set the KEY_FLAG_DEAD flag on keys for which the type has been removed. This causes the key_permission() function to return EKEYREVOKED in response to various commands. It does not, however, prevent unlinking or clearing of keyrings from detaching the key. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Allow keyctl_revoke() on keys that have SETATTR but not WRITE perm ↵	David Howells	2009-09-02	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \|	[try #6] Allow keyctl_revoke() to operate on keys that have SETATTR but not WRITE permission, rather than only on keys that have WRITE permission. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Deal with dead-type keys appropriately [try #6]	David Howells	2009-09-02	4	-31/+48
\| \| \| \| \| \| \| \| \| \|	Allow keys for which the key type has been removed to be unlinked. Currently dead-type keys can only be disposed of by completely clearing the keyrings that point to them. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	kernel: rename is_single_threaded(task) to current_is_single_threaded(void)	Oleg Nesterov	2009-07-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- is_single_threaded(task) is not safe unless task == current, we can't use task->signal or task->mm. - it doesn't make sense unless task == current, the task can fork right after the check. Rename it to current_is_single_threaded() and kill the argument. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	keys: annotate seqfile ops with __releases and __acquires	James Morris	2009-06-25	1	-0/+4
\| \| \| \| \| \| \| \|	Annotate seqfile ops with __releases and __acquires to stop sparse complaining about unbalanced locking. Signed-off-by: James Morris <jmorris@namei.org> Reviewed-by: Serge Hallyn <serue@us.ibm.com>
*	keys: Handle there being no fallback destination keyring for request_key()	David Howells	2009-04-09	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When request_key() is called, without there being any standard process keyrings on which to fall back if a destination keyring is not specified, an oops is liable to occur when construct_alloc_key() calls down_write() on dest_keyring's semaphore. Due to function inlining this may be seen as an oops in down_write() as called from request_key_and_link(). This situation crops up during boot, where request_key() is called from within the kernel (such as in CIFS mounts) where nobody is actually logged in, and so PAM has not had a chance to create a session keyring and user keyrings to act as the fallback. To fix this, make construct_alloc_key() not attempt to cache a key if there is no fallback key if no destination keyring is given specifically. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	keys: make procfiles per-user-namespace	Serge E. Hallyn	2009-02-27	1	-6/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Restrict the /proc/keys and /proc/key-users output to keys belonging to the same user namespace as the reading task. We may want to make this more complicated - so that any keys in a user-namespace which is belongs to the reading task are also shown. But let's see if anyone wants that first. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	keys: skip keys from another user namespace	Serge E. Hallyn	2009-02-27	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	When listing keys, do not return keys belonging to the same uid in another user namespace. Otherwise uid 500 in another user namespace will return keyrings called uid.500 for another user namespace. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	keys: consider user namespace in key_permission	Serge E. Hallyn	2009-02-27	1	-0/+5
\| \| \| \| \| \| \| \| \|	If a key is owned by another user namespace, then treat the key as though it is owned by both another uid and gid. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	keys: distinguish per-uid keys in different namespaces	Serge E. Hallyn	2009-02-27	5	-5/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	per-uid keys were looked by uid only. Use the user namespace to distinguish the same uid in different namespaces. This does not address key_permission. So a task can for instance try to join a keyring owned by the same uid in another namespace. That will be handled by a separate patch. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	security: introduce missing kfree	Vegard Nossum	2009-01-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Plug this leak. Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: <stable@kernel.org> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[CVE-2009-0029] System call wrappers part 28	Heiko Carstens	2009-01-14	1	-2/+2
\| \| \| \|	Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
*	[CVE-2009-0029] System call wrappers part 27	Heiko Carstens	2009-01-14	1	-9/+9
\| \| \| \|	Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
*	keys: fix sparse warning by adding __user annotation to cast	James Morris	2009-01-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix the following sparse warning: CC security/keys/key.o security/keys/keyctl.c:1297:10: warning: incorrect type in argument 2 (different address spaces) security/keys/keyctl.c:1297:10: expected char [noderef] <asn:1>buffer security/keys/keyctl.c:1297:10: got char <noident> which appears to be caused by lack of __user annotation to the cast of a syscall argument. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: David Howells <dhowells@redhat.com>
*	KEYS: Fix variable uninitialisation warnings	David Howells	2008-12-29	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix variable uninitialisation warnings introduced in: commit 8bbf4976b59fc9fc2861e79cab7beb3f6d647640 Author: David Howells <dhowells@redhat.com> Date: Fri Nov 14 10:39:14 2008 +1100 KEYS: Alter use of key instantiation link-to-keyring argument As: security/keys/keyctl.c: In function 'keyctl_negate_key': security/keys/keyctl.c:976: warning: 'dest_keyring' may be used uninitialized in this function security/keys/keyctl.c: In function 'keyctl_instantiate_key': security/keys/keyctl.c:898: warning: 'dest_keyring' may be used uninitialized in this function Some versions of gcc notice that get_instantiation_key() doesn't always set _dest_keyring, but fail to observe that if this happens then _dest_keyring will not be read by the caller. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
*	CRED: Make execve() take advantage of copy-on-write credentials	David Howells	2008-11-14	1	-42/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make execve() take advantage of copy-on-write credentials, allowing it to set up the credentials in advance, and then commit the whole lot after the point of no return. This patch and the preceding patches have been tested with the LTP SELinux testsuite. This patch makes several logical sets of alteration: (1) execve(). The credential bits from struct linux_binprm are, for the most part, replaced with a single credentials pointer (bprm->cred). This means that all the creds can be calculated in advance and then applied at the point of no return with no possibility of failure. I would like to replace bprm->cap_effective with: cap_isclear(bprm->cap_effective) but this seems impossible due to special behaviour for processes of pid 1 (they always retain their parent's capability masks where normally they'd be changed - see cap_bprm_set_creds()). The following sequence of events now happens: (a) At the start of do_execve, the current task's cred_exec_mutex is locked to prevent PTRACE_ATTACH from obsoleting the calculation of creds that we make. (a) prepare_exec_creds() is then called to make a copy of the current task's credentials and prepare it. This copy is then assigned to bprm->cred. This renders security_bprm_alloc() and security_bprm_free() unnecessary, and so they've been removed. (b) The determination of unsafe execution is now performed immediately after (a) rather than later on in the code. The result is stored in bprm->unsafe for future reference. (c) prepare_binprm() is called, possibly multiple times. (i) This applies the result of set[ug]id binaries to the new creds attached to bprm->cred. Personality bit clearance is recorded, but now deferred on the basis that the exec procedure may yet fail. (ii) This then calls the new security_bprm_set_creds(). This should calculate the new LSM and capability credentials into bprm->cred. This folds together security_bprm_set() and parts of security_bprm_apply_creds() (these two have been removed). Anything that might fail must be done at this point. (iii) bprm->cred_prepared is set to 1. bprm->cred_prepared is 0 on the first pass of the security calculations, and 1 on all subsequent passes. This allows SELinux in (ii) to base its calculations only on the initial script and not on the interpreter. (d) flush_old_exec() is called to commit the task to execution. This performs the following steps with regard to credentials: (i) Clear pdeath_signal and set dumpable on certain circumstances that may not be covered by commit_creds(). (ii) Clear any bits in current->personality that were deferred from (c.i). (e) install_exec_creds() [compute_creds() as was] is called to install the new credentials. This performs the following steps with regard to credentials: (i) Calls security_bprm_committing_creds() to apply any security requirements, such as flushing unauthorised files in SELinux, that must be done before the credentials are changed. This is made up of bits of security_bprm_apply_creds() and security_bprm_post_apply_creds(), both of which have been removed. This function is not allowed to fail; anything that might fail must have been done in (c.ii). (ii) Calls commit_creds() to apply the new credentials in a single assignment (more or less). Possibly pdeath_signal and dumpable should be part of struct creds. (iii) Unlocks the task's cred_replace_mutex, thus allowing PTRACE_ATTACH to take place. (iv) Clears The bprm->cred pointer as the credentials it was holding are now immutable. (v) Calls security_bprm_committed_creds() to apply any security alterations that must be done after the creds have been changed. SELinux uses this to flush signals and signal handlers. (f) If an error occurs before (d.i), bprm_free() will call abort_creds() to destroy the proposed new credentials and will then unlock cred_replace_mutex. No changes to the credentials will have been made. (2) LSM interface. A number of functions have been changed, added or removed: () security_bprm_alloc(), ->bprm_alloc_security() () security_bprm_free(), ->bprm_free_security() Removed in favour of preparing new credentials and modifying those. () security_bprm_apply_creds(), ->bprm_apply_creds() () security_bprm_post_apply_creds(), ->bprm_post_apply_creds() Removed; split between security_bprm_set_creds(), security_bprm_committing_creds() and security_bprm_committed_creds(). () security_bprm_set(), ->bprm_set_security() Removed; folded into security_bprm_set_creds(). () security_bprm_set_creds(), ->bprm_set_creds() New. The new credentials in bprm->creds should be checked and set up as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the second and subsequent calls. () security_bprm_committing_creds(), ->bprm_committing_creds() (*) security_bprm_committed_creds(), ->bprm_committed_creds() New. Apply the security effects of the new credentials. This includes closing unauthorised files in SELinux. This function may not fail. When the former is called, the creds haven't yet been applied to the process; when the latter is called, they have. The former may access bprm->cred, the latter may not. (3) SELinux. SELinux has a number of changes, in addition to those to support the LSM interface changes mentioned above: (a) The bprm_security_struct struct has been removed in favour of using the credentials-under-construction approach. (c) flush_unauthorized_files() now takes a cred pointer and passes it on to inode_has_perm(), file_has_perm() and dentry_open(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	CRED: Inaugurate COW credentials	David Howells	2008-11-14	9	-269/+317
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Inaugurate copy-on-write credentials management. This uses RCU to manage the credentials pointer in the task_struct with respect to accesses by other tasks. A process may only modify its own credentials, and so does not need locking to access or modify its own credentials. A mutex (cred_replace_mutex) is added to the task_struct to control the effect of PTRACE_ATTACHED on credential calculations, particularly with respect to execve(). With this patch, the contents of an active credentials struct may not be changed directly; rather a new set of credentials must be prepared, modified and committed using something like the following sequence of events: struct cred new = prepare_creds(); int ret = blah(new); if (ret < 0) { abort_creds(new); return ret; } return commit_creds(new); There are some exceptions to this rule: the keyrings pointed to by the active credentials may be instantiated - keyrings violate the COW rule as managing COW keyrings is tricky, given that it is possible for a task to directly alter the keys in a keyring in use by another task. To help enforce this, various pointers to sets of credentials, such as those in the task_struct, are declared const. The purpose of this is compile-time discouragement of altering credentials through those pointers. Once a set of credentials has been made public through one of these pointers, it may not be modified, except under special circumstances: (1) Its reference count may incremented and decremented. (2) The keyrings to which it points may be modified, but not replaced. The only safe way to modify anything else is to create a replacement and commit using the functions described in Documentation/credentials.txt (which will be added by a later patch). This patch and the preceding patches have been tested with the LTP SELinux testsuite. This patch makes several logical sets of alteration: (1) execve(). This now prepares and commits credentials in various places in the security code rather than altering the current creds directly. (2) Temporary credential overrides. do_coredump() and sys_faccessat() now prepare their own credentials and temporarily override the ones currently on the acting thread, whilst preventing interference from other threads by holding cred_replace_mutex on the thread being dumped. This will be replaced in a future patch by something that hands down the credentials directly to the functions being called, rather than altering the task's objective credentials. (3) LSM interface. A number of functions have been changed, added or removed: () security_capset_check(), ->capset_check() () security_capset_set(), ->capset_set() Removed in favour of security_capset(). () security_capset(), ->capset() New. This is passed a pointer to the new creds, a pointer to the old creds and the proposed capability sets. It should fill in the new creds or return an error. All pointers, barring the pointer to the new creds, are now const. () security_bprm_apply_creds(), ->bprm_apply_creds() Changed; now returns a value, which will cause the process to be killed if it's an error. () security_task_alloc(), ->task_alloc_security() Removed in favour of security_prepare_creds(). () security_cred_free(), ->cred_free() New. Free security data attached to cred->security. () security_prepare_creds(), ->cred_prepare() New. Duplicate any security data attached to cred->security. () security_commit_creds(), ->cred_commit() New. Apply any security effects for the upcoming installation of new security by commit_creds(). () security_task_post_setuid(), ->task_post_setuid() Removed in favour of security_task_fix_setuid(). () security_task_fix_setuid(), ->task_fix_setuid() Fix up the proposed new credentials for setuid(). This is used by cap_set_fix_setuid() to implicitly adjust capabilities in line with setuid() changes. Changes are made to the new credentials, rather than the task itself as in security_task_post_setuid(). () security_task_reparent_to_init(), ->task_reparent_to_init() Removed. Instead the task being reparented to init is referred directly to init's credentials. NOTE! This results in the loss of some state: SELinux's osid no longer records the sid of the thread that forked it. () security_key_alloc(), ->key_alloc() () security_key_permission(), ->key_permission() Changed. These now take cred pointers rather than task pointers to refer to the security context. (4) sys_capset(). This has been simplified and uses less locking. The LSM functions it calls have been merged. (5) reparent_to_kthreadd(). This gives the current thread the same credentials as init by simply using commit_thread() to point that way. (6) __sigqueue_alloc() and switch_uid() __sigqueue_alloc() can't stop the target task from changing its creds beneath it, so this function gets a reference to the currently applicable user_struct which it then passes into the sigqueue struct it returns if successful. switch_uid() is now called from commit_creds(), and possibly should be folded into that. commit_creds() should take care of protecting __sigqueue_alloc(). (7) [sg]et[ug]id() and co and [sg]et_current_groups. The set functions now all use prepare_creds(), commit_creds() and abort_creds() to build and check a new set of credentials before applying it. security_task_set[ug]id() is called inside the prepared section. This guarantees that nothing else will affect the creds until we've finished. The calling of set_dumpable() has been moved into commit_creds(). Much of the functionality of set_user() has been moved into commit_creds(). The get functions all simply access the data directly. (8) security_task_prctl() and cap_task_prctl(). security_task_prctl() has been modified to return -ENOSYS if it doesn't want to handle a function, or otherwise return the return value directly rather than through an argument. Additionally, cap_task_prctl() now prepares a new set of credentials, even if it doesn't end up using it. (9) Keyrings. A number of changes have been made to the keyrings code: (a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have all been dropped and built in to the credentials functions directly. They may want separating out again later. (b) key_alloc() and search_process_keyrings() now take a cred pointer rather than a task pointer to specify the security context. (c) copy_creds() gives a new thread within the same thread group a new thread keyring if its parent had one, otherwise it discards the thread keyring. (d) The authorisation key now points directly to the credentials to extend the search into rather pointing to the task that carries them. (e) Installing thread, process or session keyrings causes a new set of credentials to be created, even though it's not strictly necessary for process or session keyrings (they're shared). (10) Usermode helper. The usermode helper code now carries a cred struct pointer in its subprocess_info struct instead of a new session keyring pointer. This set of credentials is derived from init_cred and installed on the new process after it has been cloned. call_usermodehelper_setup() allocates the new credentials and call_usermodehelper_freeinfo() discards them if they haven't been used. A special cred function (prepare_usermodeinfo_creds()) is provided specifically for call_usermodehelper_setup() to call. call_usermodehelper_setkeys() adjusts the credentials to sport the supplied keyring as the new session keyring. (11) SELinux. SELinux has a number of changes, in addition to those to support the LSM interface changes mentioned above: (a) selinux_setprocattr() no longer does its check for whether the current ptracer can access processes with the new SID inside the lock that covers getting the ptracer's SID. Whilst this lock ensures that the check is done with the ptracer pinned, the result is only valid until the lock is released, so there's no point doing it inside the lock. (12) is_single_threaded(). This function has been extracted from selinux_setprocattr() and put into a file of its own in the lib/ directory as join_session_keyring() now wants to use it too. The code in SELinux just checked to see whether a task shared mm_structs with other tasks (CLONE_VM), but that isn't good enough. We really want to know if they're part of the same thread group (CLONE_THREAD). (13) nfsd. The NFS server daemon now has to use the COW credentials to set the credentials it is going to use. It really needs to pass the credentials down to the functions it calls, but it can't do that until other patches in this series have been applied. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>
*	CRED: Separate per-task-group keyrings from signal_struct	David Howells	2008-11-14	2	-80/+54
\| \| \| \| \| \| \| \| \|	Separate per-task-group keyrings from signal_struct and dangle their anchor from the cred struct rather than the signal_struct. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>
*	CRED: Use RCU to access another task's creds and to release a task's own creds	David Howells	2008-11-14	2	-14/+20
\| \| \| \| \| \| \| \| \| \| \| \|	Use RCU to access another task's creds and to release a task's own creds. This means that it will be possible for the credentials of a task to be replaced without another task (a) requiring a full lock to read them, and (b) seeing deallocated memory. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	CRED: Wrap current->cred and a few other accessors	David Howells	2008-11-14	2	-6/+7
\| \| \| \| \| \| \| \| \| \|	Wrap current->cred and a few other accessors to hide their actual implementation. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	CRED: Separate task security context from task_struct	David Howells	2008-11-14	5	-78/+86
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Alter use of key instantiation link-to-keyring argument	David Howells	2008-11-14	5	-111/+187
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Alter the use of the key instantiation and negation functions' link-to-keyring arguments. Currently this specifies a keyring in the target process to link the key into, creating the keyring if it doesn't exist. This, however, can be a problem for copy-on-write credentials as it means that the instantiating process can alter the credentials of the requesting process. This patch alters the behaviour such that: (1) If keyctl_instantiate_key() or keyctl_negate_key() are given a specific keyring by ID (ringid >= 0), then that keyring will be used. (2) If keyctl_instantiate_key() or keyctl_negate_key() are given one of the special constants that refer to the requesting process's keyrings (KEY_SPEC_*_KEYRING, all <= 0), then: (a) If sys_request_key() was given a keyring to use (destringid) then the key will be attached to that keyring. (b) If sys_request_key() was given a NULL keyring, then the key being instantiated will be attached to the default keyring as set by keyctl_set_reqkey_keyring(). (3) No extra link will be made. Decision point (1) follows current behaviour, and allows those instantiators who've searched for a specifically named keyring in the requestor's keyring so as to partition the keys by type to still have their named keyrings. Decision point (2) allows the requestor to make sure that the key or keys that get produced by request_key() go where they want, whilst allowing the instantiator to request that the key is retained. This is mainly useful for situations where the instantiator makes a secondary request, the key for which should be retained by the initial requestor: +-----------+ +--------------+ +--------------+ \| \| \| \| \| \| \| Requestor \|------->\| Instantiator \|------->\| Instantiator \| \| \| \| \| \| \| +-----------+ +--------------+ +--------------+ request_key() request_key() This might be useful, for example, in Kerberos, where the requestor requests a ticket, and then the ticket instantiator requests the TGT, which someone else then has to go and fetch. The TGT, however, should be retained in the keyrings of the requestor, not the first instantiator. To make this explict an extra special keyring constant is also added. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>
*	KEYS: Disperse linux/key_ui.h	David Howells	2008-11-14	3	-1/+33
\| \| \| \| \| \| \| \| \|	Disperse the bits of linux/key_ui.h as the reason they were put here (keyfs) didn't get in. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>
*	CRED: Wrap task credential accesses in the key management code	David Howells	2008-11-14	4	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(\|e\|s\|fs)[ug]id to current_(\|e\|s\|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
*	keys: remove unused key_alloc_sem	Daniel Walker	2008-06-06	1	-1/+0
\| \| \| \| \| \| \| \| \|	This semaphore doesn't appear to be used, so remove it. Signed-off-by: Daniel Walker <dwalker@mvista.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	keys: explicitly include required slab.h header file.	Robert P. J. Day	2008-04-29	2	-0/+2
\| \| \| \| \| \| \| \| \| \|	Since these two source files invoke kmalloc(), they should explicitly include <linux/slab.h>. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	keys: make the keyring quotas controllable through /proc/sys	David Howells	2008-04-29	6	-15/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the keyring quotas controllable through /proc/sys files: () /proc/sys/kernel/keys/root_maxkeys /proc/sys/kernel/keys/root_maxbytes Maximum number of keys that root may have and the maximum total number of bytes of data that root may have stored in those keys. () /proc/sys/kernel/keys/maxkeys /proc/sys/kernel/keys/maxbytes Maximum number of keys that each non-root user may have and the maximum total number of bytes of data that each of those users may have stored in their keys. Also increase the quotas as a number of people have been complaining that it's not big enough. I'm not sure that it's big enough now either, but on the other hand, it can now be set in /etc/sysctl.conf. Signed-off-by: David Howells <dhowells@redhat.com> Cc: <kwc@citi.umich.edu> Cc: <arunsr@cse.iitk.ac.in> Cc: <dwalsh@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>