vasilito d37b421cb3 kernel: fix wakeup_contexts vs steal_work deadlock
Two-sided fix for the lock-ordering deadlock discovered by
Oracle review (Issue 24):

1. wakeup_contexts (this fn) held IDLE_CONTEXTS while
   waiting for SchedQueuesLock on its own CPU via
   SchedQueuesLock::new(&percpu.sched). If another CPU's
   steal_work was holding that SchedQueuesLock (via a victim
   SchedQueuesLock) and waiting for IDLE_CONTEXTS, both
   threads spin forever.

   Fix: drop idle_contexts immediately after building the
   wakeups Vec. The Vec is the only data we need; releasing
   the lock here means steal_work on another CPU can proceed
   while this CPU acquires its own SchedQueuesLock.

2. steal_work held a victim's SchedQueuesLock (victim_lock)
   while calling idle_contexts(token.downgrade()).push_back
   on a context that turned out to be Blocked. This is the
   matching side of the deadlock: CPU A held IDLE_CONTEXTS and
   waited for its own SchedQueuesLock; CPU B (steal_work) held
   CPU A's SchedQueuesLock and waited for IDLE_CONTEXTS.

   Fix: use idle_contexts_try (try_lock) instead of
   idle_contexts (blocking lock). If IDLE_CONTEXTS is busy
   (owned by wakeup_contexts on another CPU), skip the
   push-back; the context will be re-checked on the next
   wakeup round because it was not removed from IDLE_CONTEXTS
   (the Blocked status was set, but the context stayed in
   IDLE_CONTEXTS because we never removed it).

The original code at line 429 used idle_contexts (blocking),
which is what makes this a real deadlock. try_lock is safe
because:
  - If try_lock succeeds, the context is correctly pushed
  - If try_lock fails, the context is still in IDLE_CONTEXTS
    (we never removed it), so the next wakeup_contexts will
    find it again
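
The two fixes above can be sketched in userspace Rust. This is a
minimal illustration under stated assumptions, not the kernel code:
Sched, wakeup_contexts and park_blocked as written here are
hypothetical stand-ins, and std::sync::Mutex replaces the kernel's own
lock types (SchedQueuesLock, IDLE_CONTEXTS), whose real signatures are
not shown in this message.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

// Hypothetical stand-ins: `idle` plays the role of IDLE_CONTEXTS and
// `run_queue` the role of one CPU's scheduler queue.
struct Sched {
    idle: Mutex<VecDeque<u32>>,
    run_queue: Mutex<Vec<u32>>,
}

// Side 1: build the wakeups Vec under the idle lock, release that lock,
// and only then take the queue lock. The two locks are never held together.
fn wakeup_contexts(s: &Sched) -> usize {
    let wakeups: Vec<u32> = {
        let mut idle = s.idle.lock().unwrap();
        let drained: Vec<u32> = idle.drain(..).collect();
        drained
    }; // the idle guard is dropped at the end of this block
    let woken = wakeups.len();
    s.run_queue.lock().unwrap().extend(wakeups);
    woken
}

// Side 2: while holding a queue lock, never block on the idle lock.
// Returns false when the idle lock is unavailable and the push was skipped.
fn park_blocked(s: &Sched, ctx: u32) -> bool {
    let _victim_queue = s.run_queue.lock().unwrap();
    let pushed = match s.idle.try_lock() {
        Ok(mut idle) => {
            idle.push_back(ctx);
            true
        }
        // Lock busy (or poisoned, in this std-based sketch): skip, do not wait.
        Err(_) => false,
    };
    pushed
}
```

Either change alone breaks the cycle in this sketch: the first removes
the hold-and-wait on the idle lock, the second removes the blocking
wait while a queue lock is held.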
2026-07-02 10:36:17 +03:00

Kernel

Redox OS Microkernel


Requirements

  • nasm needs to be available on the PATH at build time.

Building The Documentation

Use this command:

cargo doc --open --target x86_64-unknown-none

Debugging

QEMU

Running QEMU with the -s flag will set up QEMU to listen on port 1234 for a GDB client to connect to it. To debug the Redox kernel, run:

make qemu gdb=yes

This will start a virtual machine and listen on port 1234 for a GDB or LLDB client.

GDB

If you are going to use GDB, run these commands to load debug symbols and connect to your running kernel:

(gdb) symbol-file build/kernel.sym
(gdb) target remote localhost:1234

LLDB

If you are going to use LLDB, run these commands to start debugging:

(lldb) target create -s build/kernel.sym build/kernel
(lldb) gdb-remote localhost:1234

After connecting to your kernel you can set some interesting breakpoints and continue the process. See your debugger's man page for more information on useful commands to run.

Notes

  • Always use foo.get(n) instead of foo[n] and handle the Option::None case. Indexing with foo[n] panics when n is out of bounds, which may be acceptable in applications but never in the kernel. No possible panics should ever exist in kernel space, because then the whole OS would stop working.

  • If you receive a kernel panic in QEMU, use pkill qemu-system to kill the frozen QEMU process.
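
The first note above can be illustrated with a small sketch. The names lookup and LookupError are hypothetical and only serve the example; the point is that an out-of-bounds index becomes an error value the caller must handle instead of a panic.

```rust
// Hypothetical error type for this example only.
#[derive(Debug, PartialEq)]
enum LookupError {
    OutOfRange,
}

// Checked access: `get` returns None for an out-of-bounds index,
// which is turned into an error instead of panicking like `table[n]`.
fn lookup(table: &[u32], n: usize) -> Result<u32, LookupError> {
    table.get(n).copied().ok_or(LookupError::OutOfRange)
}
```

Calling lookup(&[10, 20, 30], 3) yields Err(LookupError::OutOfRange), whereas indexing the same slice with [3] would panic.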

How To Contribute

To learn how to contribute to this system component you need to read the following document:

Development

To learn how to do development with this system component inside the Redox build system you need to read the Build System and Coding and Building pages.

How To Build

To build this system component you need to download the Redox build system. You can learn how to do it on the Building Redox page.

This is necessary because the component only works with cross-compilation to a Redox virtual machine, but you can do some testing from Linux.

Funding - Unix-style Signals and Process Management

This project is funded through NGI Zero Core, a fund established by NLnet with financial support from the European Commission's Next Generation Internet program. Learn more at the NLnet project page.


Description
RedBear Operating System, based on Redox OS. Licensed under the MIT license.
https://redbearos.org
Languages
C 43.9%
C++ 23.5%
Makefile 7.3%
Python 3.7%
JavaScript 3.4%
Other 17.1%