author: Pedro Alves <palves@redhat.com> 2015-08-06 10:30:18 +0100
committer: Pedro Alves <palves@redhat.com> 2015-08-06 10:30:18 +0100
commit: 863d01bde2725d009c45ab7e9ba1dbf3f5b923f8
tree: e5a1cfd29ea29b854a215216192efe8a45a22bb8 /gdb/gdbserver
parent: 00db26facc14ac830adef704bba9b24d0d366ddf
gdbserver: Fix non-stop / fork / step-over issues

Ref: https://sourceware.org/ml/gdb-patches/2015-07/msg00868.html

This adds a test that has a multithreaded program have several threads
continuously fork, while another thread continuously steps over a
breakpoint.

This exposes several intertwined issues, which this patch addresses:

 - When we're stopping and suspending threads, some thread may fork,
   and we missed setting its suspend count to 1, like we do when a new
   clone/thread is detected.  When we next unsuspend threads, the fork
   child's suspend count goes below 0, which is bogus and fails an
   assertion.

 - If a step-over is cancelled because a signal arrives, but then gdb
   is not interested in the signal, we pass the signal straight back
   to the inferior.  However, we miss that we need to re-increment the
   suspend counts of all other threads that had been paused for the
   step-over.  As a result, other threads indefinitely end up stuck
   stopped.

 - If a detach request comes in just while gdbserver is handling a
   step-over (in the test at hand, this is GDB detaching the fork
   child), gdbserver internal errors in stabilize_thread's helpers,
   which assert that all thread's suspend counts are 0 (otherwise we
   wouldn't be able to move threads out of the jump pads).  The
   suspend counts aren't 0 while a step-over is in progress, because
   all threads but the one stepping past the breakpoint must remain
   paused until the step-over finishes and the breakpoint can be
   reinserted.

 - Occasionally, we see "BAD - reinserting but not stepping." being
   output (from within linux_resume_one_lwp_throw).  That was because
   GDB pokes memory while gdbserver is busy with a step-over, and that
   suspends threads, and then re-resumes them with proceed_one_lwp,
   which missed another reason to tell linux_resume_one_lwp that the
   thread should be set back to stepping.

 - In a couple places, we were resuming threads that are meant to be
   suspended.  E.g., when a vCont;c/s request for thread B comes in
   just while gdbserver is stepping thread A past a breakpoint.  The
   resume for thread B must be deferred until the step-over finishes.

 - The test runs with both "set detach-on-fork" on and off.  When off,
   it exercises the case of GDB detaching the fork child explicitly.
   When on, it exercises the case of gdb resuming the child
   explicitly.  In the "off" case, gdb seems to exponentially become
   slower as new inferiors are created.  This is _very_ noticeable as
   with only 100 inferiors gdb is crawling already, which makes the
   test take quite a bit to run.  For that reason, I've disabled the
   "off" variant for now.

gdb/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>

	* target/waitstatus.h (enum target_stop_reason)
	<TARGET_STOPPED_BY_SINGLE_STEP>: New value.

gdb/gdbserver/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>

	* linux-low.c (handle_extended_wait): Set the fork child's suspend
	count if stopping and suspending threads.
	(check_stopped_by_breakpoint): If stopped by trace, set the LWP's
	stop reason to TARGET_STOPPED_BY_SINGLE_STEP.
	(linux_detach): Complete an ongoing step-over.
	(lwp_suspended_inc, lwp_suspended_decr): New functions.  Use
	throughout.
	(resume_stopped_resumed_lwps): Don't resume a suspended thread.
	(linux_wait_1): If passing a signal to the inferior after
	finishing a step-over, unsuspend and re-resume all lwps.  If we
	see a single-step event but the thread should be continuing, don't
	pass the trap to gdb.
	(stuck_in_jump_pad_callback, move_out_of_jump_pad_callback): Use
	internal_error instead of gdb_assert.
	(enqueue_pending_signal): New function.
	(check_ptrace_stopped_lwp_gone): Add debug output.
	(start_step_over): Use internal_error instead of gdb_assert.
	(complete_ongoing_step_over): New function.
	(linux_resume_one_thread): Don't resume a suspended thread.
	(proceed_one_lwp): If the LWP is stepping over a breakpoint, reset
	it stepping.

gdb/testsuite/ChangeLog:
2015-08-06 Pedro Alves <palves@redhat.com>

	* gdb.threads/forking-threads-plus-breakpoint.exp: New file.
	* gdb.threads/forking-threads-plus-breakpoint.c: New file.

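
The first issue in the commit message (a fork child reported in the middle of a stop-and-suspend pass) comes down to keeping suspend counts balanced. The following is only a minimal standalone sketch of that invariant, not gdbserver code: `toy_lwp`, `toy_stopping`, and every `toy_*` function are hypothetical names, and the sketch reports an underflow through a return value where the patch raises an internal error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for gdbserver's per-LWP bookkeeping; only the
   suspend count is modelled.  */
struct toy_lwp
{
  int suspended;
};

/* Hypothetical stand-in for the "are we stopping threads" state.  */
enum toy_stopping
{
  TOY_NOT_STOPPING,
  TOY_STOPPING,
  TOY_STOPPING_AND_SUSPENDING
};

/* Increment LWP's suspend count.  */
static void
toy_suspend (struct toy_lwp *lwp)
{
  lwp->suspended++;
}

/* Decrement LWP's suspend count.  Return 0 on success, or -1 (leaving
   the count untouched) if it would go negative.  */
static int
toy_unsuspend (struct toy_lwp *lwp)
{
  if (lwp->suspended == 0)
    return -1;
  lwp->suspended--;
  return 0;
}

/* Initialise a newly reported fork child.  If a stop-and-suspend pass
   is in flight, the child starts out suspended so that the matching
   unsuspend pass balances.  */
static void
toy_init_fork_child (struct toy_lwp *child, enum toy_stopping state)
{
  child->suspended = (state == TOY_STOPPING_AND_SUSPENDING) ? 1 : 0;
}

/* Unsuspend N LWPs.  Return how many counts would have underflowed.  */
static int
toy_unsuspend_all (struct toy_lwp *lwps, size_t n)
{
  size_t i;
  int underflows = 0;

  for (i = 0; i < n; i++)
    if (toy_unsuspend (&lwps[i]) != 0)
      underflows++;
  return underflows;
}
```

With the child initialised under `TOY_STOPPING_AND_SUSPENDING`, a later `toy_unsuspend_all` over parent and child reports no underflow; initialising it under `TOY_NOT_STOPPING` reproduces the imbalance described above.
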
Diffstat (limited to 'gdb/gdbserver')
-rw-r--r-- gdb/gdbserver/ChangeLog | 24
-rw-r--r-- gdb/gdbserver/linux-low.c | 221
2 files changed, 216 insertions, 29 deletions
diff --git a/gdb/gdbserver/ChangeLog b/gdb/gdbserver/ChangeLog
index 77b4330ce9e..5588fb06a87 100644
--- a/gdb/gdbserver/ChangeLog
+++ b/gdb/gdbserver/ChangeLog
@@ -1,5 +1,29 @@
2015-08-06 Pedro Alves <palves@redhat.com>
+ * linux-low.c (handle_extended_wait): Set the fork child's suspend
+ count if stopping and suspending threads.
+ (check_stopped_by_breakpoint): If stopped by trace, set the LWP's
+ stop reason to TARGET_STOPPED_BY_SINGLE_STEP.
+ (linux_detach): Complete an ongoing step-over.
+ (lwp_suspended_inc, lwp_suspended_decr): New functions. Use
+ throughout.
+ (resume_stopped_resumed_lwps): Don't resume a suspended thread.
+ (linux_wait_1): If passing a signal to the inferior after
+ finishing a step-over, unsuspend and re-resume all lwps. If we
+ see a single-step event but the thread should be continuing, don't
+ pass the trap to gdb.
+ (stuck_in_jump_pad_callback, move_out_of_jump_pad_callback): Use
+ internal_error instead of gdb_assert.
+ (enqueue_pending_signal): New function.
+ (check_ptrace_stopped_lwp_gone): Add debug output.
+ (start_step_over): Use internal_error instead of gdb_assert.
+ (complete_ongoing_step_over): New function.
+ (linux_resume_one_thread): Don't resume a suspended thread.
+ (proceed_one_lwp): If the LWP is stepping over a breakpoint, reset
+ it stepping.
+
+2015-08-06 Pedro Alves <palves@redhat.com>
+
* linux-low.c (add_lwp): Set waitstatus to TARGET_WAITKIND_IGNORE.
(linux_thread_alive): Use lwp_is_marked_dead.
(extended_event_reported): Delete.
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 6a7182aafbb..98fffc992ee 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -271,6 +271,8 @@ static int lwp_is_marked_dead (struct lwp_info *lwp);
static void proceed_all_lwps (void);
static int finish_step_over (struct lwp_info *lwp);
static int kill_lwp (unsigned long lwpid, int signo);
+static void enqueue_pending_signal (struct lwp_info *lwp, int signal, siginfo_t *info);
+static void complete_ongoing_step_over (void);
/* When the event-loop is doing a step-over, this points at the thread
being stepped. */
@@ -489,6 +491,15 @@ handle_extended_wait (struct lwp_info *event_lwp, int wstat)
child_thr->last_resume_kind = resume_stop;
child_thr->last_status.kind = TARGET_WAITKIND_STOPPED;
+ /* If we're suspending all threads, leave this one suspended
+ too. */
+ if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS)
+ {
+ if (debug_threads)
+ debug_printf ("HEW: leaving child suspended\n");
+ child_lwp->suspended = 1;
+ }
+
parent_proc = get_thread_process (event_thr);
child_proc->attached = parent_proc->attached;
clone_all_breakpoints (&child_proc->breakpoints,
@@ -688,6 +699,8 @@ check_stopped_by_breakpoint (struct lwp_info *lwp)
debug_printf ("CSBB: %s stopped by trace\n",
target_pid_to_str (ptid_of (thr)));
}
+
+ lwp->stop_reason = TARGET_STOPPED_BY_SINGLE_STEP;
}
}
}
@@ -1318,6 +1331,11 @@ linux_detach (int pid)
if (process == NULL)
return -1;
+ /* As there's a step over already in progress, let it finish first,
+ otherwise nesting a stabilize_threads operation on top gets real
+ messy. */
+ complete_ongoing_step_over ();
+
/* Stop all threads before detaching. First, ptrace requires that
the thread is stopped to sucessfully detach. Second, thread_db
may need to uninstall thread event breakpoints from memory, which
@@ -1686,6 +1704,39 @@ not_stopped_callback (struct inferior_list_entry *entry, void *arg)
return 0;
}
+/* Increment LWP's suspend count. */
+
+static void
+lwp_suspended_inc (struct lwp_info *lwp)
+{
+ lwp->suspended++;
+
+ if (debug_threads && lwp->suspended > 4)
+ {
+ struct thread_info *thread = get_lwp_thread (lwp);
+
+ debug_printf ("LWP %ld has a suspiciously high suspend count,"
+ " suspended=%d\n", lwpid_of (thread), lwp->suspended);
+ }
+}
+
+/* Decrement LWP's suspend count. */
+
+static void
+lwp_suspended_decr (struct lwp_info *lwp)
+{
+ lwp->suspended--;
+
+ if (lwp->suspended < 0)
+ {
+ struct thread_info *thread = get_lwp_thread (lwp);
+
+ internal_error (__FILE__, __LINE__,
+ "unsuspend LWP %ld, suspended=%d\n", lwpid_of (thread),
+ lwp->suspended);
+ }
+}
+
/* This function should only be called if the LWP got a SIGTRAP.
Handle any tracepoint steps or hits. Return true if a tracepoint
@@ -1703,7 +1754,7 @@ handle_tracepoints (struct lwp_info *lwp)
uninsert tracepoints. To do this, we temporarily pause all
threads, unpatch away, and then unpause threads. We need to make
sure the unpausing doesn't resume LWP too. */
- lwp->suspended++;
+ lwp_suspended_inc (lwp);
/* And we need to be sure that any all-threads-stopping doesn't try
to move threads out of the jump pads, as it could deadlock the
@@ -1719,7 +1770,7 @@ handle_tracepoints (struct lwp_info *lwp)
actions. */
tpoint_related_event |= tracepoint_was_hit (tinfo, lwp->stop_pc);
- lwp->suspended--;
+ lwp_suspended_decr (lwp);
gdb_assert (lwp->suspended == 0);
gdb_assert (!stabilizing_threads || lwp->collecting_fast_tracepoint);
@@ -2179,10 +2230,13 @@ linux_low_filter_event (int lwpid, int wstat)
/* Note that TRAP_HWBKPT can indicate either a hardware breakpoint
or hardware watchpoint. Check which is which if we got
- TARGET_STOPPED_BY_HW_BREAKPOINT. */
+ TARGET_STOPPED_BY_HW_BREAKPOINT. Likewise, we may have single
+ stepped an instruction that triggered a watchpoint. In that
+ case, on some architectures (such as x86), instead of
+ TRAP_HWBKPT, si_code indicates TRAP_TRACE, and we need to check
+ the debug registers separately. */
if (WIFSTOPPED (wstat) && WSTOPSIG (wstat) == SIGTRAP
- && (child->stop_reason == TARGET_STOPPED_BY_NO_REASON
- || child->stop_reason == TARGET_STOPPED_BY_HW_BREAKPOINT))
+ && child->stop_reason != TARGET_STOPPED_BY_SW_BREAKPOINT)
check_stopped_by_watchpoint (child);
if (!have_stop_pc)
@@ -2241,6 +2295,7 @@ resume_stopped_resumed_lwps (struct inferior_list_entry *entry)
struct lwp_info *lp = get_thread_lwp (thread);
if (lp->stopped
+ && !lp->suspended
&& !lp->status_pending_p
&& thread->last_resume_kind != resume_stop
&& thread->last_status.kind == TARGET_WAITKIND_IGNORE)
@@ -2611,9 +2666,7 @@ unsuspend_one_lwp (struct inferior_list_entry *entry, void *except)
if (lwp == except)
return 0;
- lwp->suspended--;
-
- gdb_assert (lwp->suspended >= 0);
+ lwp_suspended_decr (lwp);
return 0;
}
@@ -2706,7 +2759,7 @@ linux_stabilize_threads (void)
lwp = get_thread_lwp (current_thread);
/* Lock it. */
- lwp->suspended++;
+ lwp_suspended_inc (lwp);
if (ourstatus.value.sig != GDB_SIGNAL_0
|| current_thread->last_resume_kind == resume_stop)
@@ -3092,8 +3145,25 @@ linux_wait_1 (ptid_t ptid,
info_p = &info;
else
info_p = NULL;
- linux_resume_one_lwp (event_child, event_child->stepping,
- WSTOPSIG (w), info_p);
+
+ if (step_over_finished)
+ {
+ /* We cancelled this thread's step-over above. We still
+ need to unsuspend all other LWPs, and set them back
+ running again while the signal handler runs. */
+ unsuspend_all_lwps (event_child);
+
+ /* Enqueue the pending signal info so that proceed_all_lwps
+ doesn't lose it. */
+ enqueue_pending_signal (event_child, WSTOPSIG (w), info_p);
+
+ proceed_all_lwps ();
+ }
+ else
+ {
+ linux_resume_one_lwp (event_child, event_child->stepping,
+ WSTOPSIG (w), info_p);
+ }
return ignore_event (ourstatus);
}
@@ -3109,13 +3179,21 @@ linux_wait_1 (ptid_t ptid,
do, we're be able to handle GDB breakpoints on top of internal
breakpoints, by handling the internal breakpoint and still
reporting the event to GDB. If we don't, we're out of luck, GDB
- won't see the breakpoint hit. */
+ won't see the breakpoint hit. If we see a single-step event but
+ the thread should be continuing, don't pass the trap to gdb.
+ That indicates that we had previously finished a single-step but
+ left the single-step pending -- see
+ complete_ongoing_step_over. */
report_to_gdb = (!maybe_internal_trap
|| (current_thread->last_resume_kind == resume_step
&& !in_step_range)
|| event_child->stop_reason == TARGET_STOPPED_BY_WATCHPOINT
- || (!step_over_finished && !in_step_range
- && !bp_explains_trap && !trace_event)
+ || (!in_step_range
+ && !bp_explains_trap
+ && !trace_event
+ && !step_over_finished
+ && !(current_thread->last_resume_kind == resume_continue
+ && event_child->stop_reason == TARGET_STOPPED_BY_SINGLE_STEP))
|| (gdb_breakpoint_here (event_child->stop_pc)
&& gdb_condition_true_at_breakpoint (event_child->stop_pc)
&& gdb_no_commands_at_breakpoint (event_child->stop_pc))
@@ -3463,7 +3541,7 @@ suspend_and_send_sigstop_callback (struct inferior_list_entry *entry,
if (lwp == except)
return 0;
- lwp->suspended++;
+ lwp_suspended_inc (lwp);
return send_sigstop_callback (entry, except);
}
@@ -3565,7 +3643,12 @@ stuck_in_jump_pad_callback (struct inferior_list_entry *entry, void *data)
struct thread_info *thread = (struct thread_info *) entry;
struct lwp_info *lwp = get_thread_lwp (thread);
- gdb_assert (lwp->suspended == 0);
+ if (lwp->suspended != 0)
+ {
+ internal_error (__FILE__, __LINE__,
+ "LWP %ld is suspended, suspended=%d\n",
+ lwpid_of (thread), lwp->suspended);
+ }
gdb_assert (lwp->stopped);
/* Allow debugging the jump pad, gdb_collect, etc.. */
@@ -3584,7 +3667,12 @@ move_out_of_jump_pad_callback (struct inferior_list_entry *entry)
struct lwp_info *lwp = get_thread_lwp (thread);
int *wstat;
- gdb_assert (lwp->suspended == 0);
+ if (lwp->suspended != 0)
+ {
+ internal_error (__FILE__, __LINE__,
+ "LWP %ld is suspended, suspended=%d\n",
+ lwpid_of (thread), lwp->suspended);
+ }
gdb_assert (lwp->stopped);
wstat = lwp->status_pending_p ? &lwp->status_pending : NULL;
@@ -3613,7 +3701,7 @@ move_out_of_jump_pad_callback (struct inferior_list_entry *entry)
linux_resume_one_lwp (lwp, 0, 0, NULL);
}
else
- lwp->suspended++;
+ lwp_suspended_inc (lwp);
}
static int
@@ -3668,6 +3756,24 @@ stop_all_lwps (int suspend, struct lwp_info *except)
}
}
+/* Enqueue one signal in the chain of signals which need to be
+ delivered to this process on next resume. */
+
+static void
+enqueue_pending_signal (struct lwp_info *lwp, int signal, siginfo_t *info)
+{
+ struct pending_signals *p_sig;
+
+ p_sig = xmalloc (sizeof (*p_sig));
+ p_sig->prev = lwp->pending_signals;
+ p_sig->signal = signal;
+ if (info == NULL)
+ memset (&p_sig->info, 0, sizeof (siginfo_t));
+ else
+ memcpy (&p_sig->info, info, sizeof (siginfo_t));
+ lwp->pending_signals = p_sig;
+}
+
/* Resume execution of LWP. If STEP is nonzero, single-step it. If
SIGNAL is nonzero, give it that signal. */
@@ -4201,7 +4307,13 @@ start_step_over (struct lwp_info *lwp)
lwpid_of (thread));
stop_all_lwps (1, lwp);
- gdb_assert (lwp->suspended == 0);
+
+ if (lwp->suspended != 0)
+ {
+ internal_error (__FILE__, __LINE__,
+ "LWP %ld suspended=%d\n", lwpid_of (thread),
+ lwp->suspended);
+ }
if (debug_threads)
debug_printf ("Done stopping all threads for step-over.\n");
@@ -4273,6 +4385,39 @@ finish_step_over (struct lwp_info *lwp)
return 0;
}
+/* If there's a step over in progress, wait until all threads stop
+ (that is, until the stepping thread finishes its step), and
+ unsuspend all lwps. The stepping thread ends with its status
+ pending, which is processed later when we get back to processing
+ events. */
+
+static void
+complete_ongoing_step_over (void)
+{
+ if (!ptid_equal (step_over_bkpt, null_ptid))
+ {
+ struct lwp_info *lwp;
+ int wstat;
+ int ret;
+
+ if (debug_threads)
+ debug_printf ("detach: step over in progress, finish it first\n");
+
+ /* Passing NULL_PTID as filter indicates we want all events to
+ be left pending. Eventually this returns when there are no
+ unwaited-for children left. */
+ ret = linux_wait_for_event_filtered (minus_one_ptid, null_ptid,
+ &wstat, __WALL);
+ gdb_assert (ret == -1);
+
+ lwp = find_lwp_pid (step_over_bkpt);
+ if (lwp != NULL)
+ finish_step_over (lwp);
+ step_over_bkpt = null_ptid;
+ unsuspend_all_lwps (lwp);
+ }
+}
+
/* This function is called once per thread. We check the thread's resume
request, which will tell us whether to resume, step, or leave the thread
stopped; and what signal, if any, it should be sent.
@@ -4347,13 +4492,16 @@ linux_resume_one_thread (struct inferior_list_entry *entry, void *arg)
}
/* If this thread which is about to be resumed has a pending status,
- then don't resume any threads - we can just report the pending
- status. Make sure to queue any signals that would otherwise be
- sent. In all-stop mode, we do this decision based on if *any*
- thread has a pending status. If there's a thread that needs the
- step-over-breakpoint dance, then don't resume any other thread
- but that particular one. */
- leave_pending = (lwp->status_pending_p || leave_all_stopped);
+ then don't resume it - we can just report the pending status.
+ Likewise if it is suspended, because e.g., another thread is
+ stepping past a breakpoint. Make sure to queue any signals that
+ would otherwise be sent. In all-stop mode, we do this decision
+ based on if *any* thread has a pending status. If there's a
+ thread that needs the step-over-breakpoint dance, then don't
+ resume any other thread but that particular one. */
+ leave_pending = (lwp->suspended
+ || lwp->status_pending_p
+ || leave_all_stopped);
if (!leave_pending)
{
@@ -4536,7 +4684,23 @@ proceed_one_lwp (struct inferior_list_entry *entry, void *except)
send_sigstop (lwp);
}
- step = thread->last_resume_kind == resume_step;
+ if (thread->last_resume_kind == resume_step)
+ {
+ if (debug_threads)
+ debug_printf (" stepping LWP %ld, client wants it stepping\n",
+ lwpid_of (thread));
+ step = 1;
+ }
+ else if (lwp->bp_reinsert != 0)
+ {
+ if (debug_threads)
+ debug_printf (" stepping LWP %ld, reinsert set\n",
+ lwpid_of (thread));
+ step = 1;
+ }
+ else
+ step = 0;
+
linux_resume_one_lwp (lwp, step, 0, NULL);
return 0;
}
@@ -4550,8 +4714,7 @@ unsuspend_and_proceed_one_lwp (struct inferior_list_entry *entry, void *except)
if (lwp == except)
return 0;
- lwp->suspended--;
- gdb_assert (lwp->suspended >= 0);
+ lwp_suspended_decr (lwp);
return proceed_one_lwp (entry, except);
}
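
The `enqueue_pending_signal` helper added above pushes a new entry onto a chain linked through `prev`. The pattern can be sketched in isolation as follows. This is an illustration only: `toy_info` (standing in for `siginfo_t`), `toy_pending_signal`, `toy_enqueue`, and `toy_dequeue_oldest` are hypothetical names, and the oldest-first removal order is an assumption that this diff does not show.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical payload standing in for siginfo_t.  */
struct toy_info
{
  int code;
};

/* One queued signal; PREV points at the entry queued before it.  */
struct toy_pending_signal
{
  int signal;
  struct toy_info info;
  struct toy_pending_signal *prev;
};

/* Push SIGNAL (with optional INFO) onto the chain at *HEAD.  Return 0
   on success, -1 if allocation fails.  */
static int
toy_enqueue (struct toy_pending_signal **head, int signal,
	     const struct toy_info *info)
{
  struct toy_pending_signal *p_sig = malloc (sizeof (*p_sig));

  if (p_sig == NULL)
    return -1;

  p_sig->prev = *head;
  p_sig->signal = signal;
  if (info == NULL)
    memset (&p_sig->info, 0, sizeof (p_sig->info));
  else
    memcpy (&p_sig->info, info, sizeof (p_sig->info));
  *head = p_sig;
  return 0;
}

/* Unlink the oldest entry and return its signal number, or 0 if the
   chain is empty.  Oldest-first order is an assumption of this
   sketch.  */
static int
toy_dequeue_oldest (struct toy_pending_signal **head)
{
  struct toy_pending_signal **p_sig = head;
  int signal;

  if (*p_sig == NULL)
    return 0;

  /* Walk to the entry whose PREV is NULL, i.e. the first one queued.  */
  while ((*p_sig)->prev != NULL)
    p_sig = &(*p_sig)->prev;

  signal = (*p_sig)->signal;
  free (*p_sig);
  *p_sig = NULL;
  return signal;
}
```

Copying the info by value, and zeroing it when none is supplied, means the queued entry does not depend on the caller's buffer staying alive until the thread is next resumed.
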