6 comments

  • matheusmoreira 30 minutes ago
    This is an article I wish I could have read many months ago.

    > Hence, the most basic safety issue with setjmp is that if we call it and then return from the function that had called it, the context saved by setjmp is not valid to longjmp to.

    > longjmp is only safe if it's called at a time when the stack frame used by setjmp could not have possibly been overwritten, since that is the only way to guarantee that the register state restored by longjmp matches the stack frame that the stack pointer points to.

    That limitation could be lifted by simply copying the stack frames somewhere else prior to long jumping, and then spilling that entire copy on top of the current stack instead of just restoring the registers from the jump buffer. This is how delimited continuations work! What ruins this for C is the existence of pointers. Stacks aren't freely relocatable since pointers into the stack could exist. Other languages don't have this problem.

    So much fun stuff in this article! The "fibers with ucontext", essentially swapping stack pointers back and forth, are how I implemented generators! I too reached for musl source code in order to understand setjmp, but for a different reason: its ability to spill the registers onto the stack was instrumental for my garbage collector.

    Blogged about all of these things too, in case anyone is curious:

    https://www.matheusmoreira.com/articles/delimited-continuati...

    https://www.matheusmoreira.com/articles/generators-in-lone-l...

    https://www.matheusmoreira.com/articles/babys-second-garbage...
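
    The "swapping stack pointers back and forth" idea mentioned above can be illustrated with a minimal generator sketch built on the POSIX ucontext calls. This is not the code from the linked articles; the names gen_start, gen_next, yield_value, and count_to_three are hypothetical, and the 64 KiB stack size is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

/* All names below are hypothetical, chosen for this sketch only. */
static ucontext_t caller_ctx;            /* where gen_next was called from */
static ucontext_t gen_ctx;               /* the generator's saved context */
static char gen_stack[64 * 1024];        /* assumed stack size for the generator */
static int yielded;                      /* value handed back to the caller */
static int gen_done;                     /* set once the generator body finishes */

/* Suspend the generator and switch back to whoever called gen_next. */
static void yield_value(int v) {
    yielded = v;
    swapcontext(&gen_ctx, &caller_ctx);
}

/* Generator body: yields 1, 2, 3 and then finishes. */
static void count_to_three(void) {
    for (int i = 1; i <= 3; i++)
        yield_value(i);
    gen_done = 1;
    /* Returning here resumes uc_link, which is caller_ctx. */
}

/* Prepare the generator on its own stack without running it yet. */
void gen_start(void) {
    getcontext(&gen_ctx);
    gen_ctx.uc_stack.ss_sp = gen_stack;
    gen_ctx.uc_stack.ss_size = sizeof gen_stack;
    gen_ctx.uc_link = &caller_ctx;
    gen_done = 0;
    makecontext(&gen_ctx, count_to_three, 0);
}

/* Resume the generator. Returns 1 and stores a value in *out,
 * or returns 0 once the generator has finished. */
int gen_next(int *out) {
    assert(out != NULL);
    if (gen_done)
        return 0;
    swapcontext(&caller_ctx, &gen_ctx);
    if (gen_done)
        return 0;
    *out = yielded;
    return 1;
}
```

    Calling gen_start() once and then gen_next(&v) repeatedly hands back 1, 2, and 3 before gen_next reports completion. Each swapcontext saves the current registers and stack pointer and installs the other side's, which is the whole trick.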

    • pizlonator 2 minutes ago
      I’ve used the copy-stack trick before! It’s really great!

      You can work around the pointer relocation issue by always copying the stack back onto the main stack. So you’re always running on the same range of stack in memory and saved stacks are always elsewhere.

    • Onavo 8 minutes ago
      > What ruins this for C is the existence of pointers. Stacks aren't freely relocatable since pointers into the stack could exist. Other languages don't have this problem

      What about languages with pass by reference?

  • anitil 54 minutes ago
    How interesting! I thought that setjmp and longjmp were probably incompatible with Fil-C. And I'd somehow never heard of ucontext at all.

    I suppose managing the stack is still managing memory after all, even if we typically don't think of it that way, so Fil-C has something to add here.

    It's really worth reading the section here about the complexity of setjmp/longjmp and how they interact with register allocation and stack spilling. I knew they're tricky, but going into the specifics is delicious.

  • gruntled-worker 1 hour ago
    No complaints about this in particular, but code that uses setjmp/longjmp often has a risk profile that's way bigger than memory safety alone. If you're stuck with them then by all means, mitigate all you can.
    • pizlonator 23 minutes ago
      What misuse are you imagining that isn’t a memory safety problem?

      You might find that Fil-C prevents those too. It’s pretty strict. You can only use longjmp to pop the stack like an exception would.
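
      As a rough illustration of the "pop the stack like an exception" pattern, here is a minimal sketch in plain C. It says nothing about how Fil-C enforces the rule; the names recover_point, deep_work, and run_with_recovery are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical names, for illustration only. */
static jmp_buf recover_point;

/* Recurse a few frames down, then unwind straight back to the setjmp call. */
static void deep_work(int depth) {
    assert(depth >= 0);
    if (depth == 0)
        longjmp(recover_point, 1);   /* behaves like a throw */
    deep_work(depth - 1);
}

/* Returns 1 if deep_work bailed out via longjmp, 0 if it returned normally. */
int run_with_recovery(void) {
    if (setjmp(recover_point) != 0)
        return 1;                    /* reached via longjmp; this frame is still live */
    deep_work(3);
    return 0;                        /* not reached in this sketch */
}
```

      The longjmp is valid here because run_with_recovery, the function that called setjmp, has not returned yet when deep_work jumps. Calling longjmp(recover_point, 1) after run_with_recovery had returned would be the unsafe case the article describes.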

  • nanolith 36 minutes ago
    > For example, Boost uses ucontext as part of its fiber implementation.

    Maybe for the incredibly slow fallback, it does. Boost context and Boost fiber have ABI support for *nix / macOS / Windows for x86_64 and ARM/ARM64. The overhead for a fiber switch using this support is about as heavy as a virtual function call. In comparison, ucontext is very heavy.

    I wrote my own fiber library for C. I got the idea from an old implementation I saw that used setjmp and longjmp, which took me down the rabbit hole of figuring out how to do this more efficiently and with an improved margin of safety. I chose to follow Boost's example, and in fact, used some of their fiber switch assembler with attribution in my library.

    • pizlonator 13 minutes ago
      > In comparison, ucontext is very heavy

      It's heavy because it switches the signal masks.

      Indeed, Fil-C's ucontext logic does this today, because I'm relying on glibc, and that's what glibc does.

      But it would be straightforward to teach the internal Fil-C zfiber_context API to not save the sigmasks. It would just mean using some other backend for setcontext/swapcontext. Considering that there are multiple open source projects (including Boost!) that have code that does this, it would be easy to set that up.

      But I'm taking baby steps here. And the first step is just to provide a memory safe wrapper around these quite dangerous APIs. Probably the next step is to just write a lot more tests to try to break it. Then, later, I can worry about adding alternative backends to expose the sigmask-free version of this that Boost (and most others) want.

      • nanolith 8 minutes ago
        Fair enough. I use my fiber library for cooperative multitasking, as an alternative to async I/O. It's still non-blocking, but as far as user code knows, it behaves as if it is blocking.

        To do this, I disable signals on threads that are fiber threads, and instead rely on a signal thread to intercept signals and alert the appropriate fibers.
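
        The signal-thread arrangement described above can be sketched with standard POSIX calls. This is not the commenter's library; signal_thread, receive_one_signal, and received_signo are hypothetical names, SIGUSR1 is an arbitrary choice, and a real implementation would loop and dispatch to fibers rather than handle a single signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical names, for illustration only. */
static int received_signo;   /* written by the signal thread, read after join */

/* Dedicated thread: waits synchronously for a blocked signal to arrive. */
static void *signal_thread(void *arg) {
    const sigset_t *set = arg;
    int signo = 0;
    if (sigwait(set, &signo) == 0)
        received_signo = signo;
    return NULL;
}

/* Blocks SIGUSR1 in the calling thread, starts a signal thread, sends the
 * process a SIGUSR1, and returns the signal number the signal thread saw. */
int receive_one_signal(void) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* Threads created after this call inherit the blocked mask. */
    int rc = pthread_sigmask(SIG_BLOCK, &set, NULL);
    assert(rc == 0);

    pthread_t tid;
    rc = pthread_create(&tid, NULL, signal_thread, &set);
    assert(rc == 0);

    kill(getpid(), SIGUSR1);   /* stays pending until sigwait collects it */
    pthread_join(tid, NULL);
    return received_signo;
}
```

        Because the signal is blocked everywhere, no asynchronous handler ever interrupts a fiber thread; the signal thread picks it up with sigwait and can then notify whichever fiber cares.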

  • lstodd 1 hour ago
    longjmp, setjmp, setcontext, getcontext, makecontext, swapcontext, and the like have no bearing on safety, memory or otherwise. What you have to deal with first is what sigaction(2) represents, and only much later what you use to drive the context switch, be it I/O or preemption.
    • pizlonator 29 minutes ago
      These functions can easily be misused to corrupt memory, so they very much have something to do with safety. Fil-C goes to great lengths to prevent your use of those functions leading to memory corruption or any violation of the capability model.

      Fil-C also makes sigaction memory safe. That protection does allow for signal handlers to longjmp or setcontext or swapcontext.

    • anitil 50 minutes ago
      The article mentions that you typically have to longjmp within the same function as setjmp (or a descendant function) otherwise your stack gets cleared and you longjmp to a garbage stack. I believe this counts as memory safety? Though I don't quite understand your comment about sigaction, so maybe there's some context I'm missing.

      Edit: The extra context- https://usenix.org/legacy/publications/library/proceedings/u...

  • brcmthrowaway 1 hour ago
    Is Fil-C now using Claude for dev?