Beware @unchecked Sendable, or Watch Out for Counterintuitive Implicit Actor-Isolation

I ran into some unexpected runtime crashes recently while testing an app on iOS 18 compiled under Swift 6 language mode, and the root causes ended up being the perils of using @unchecked Sendable in combination with some counterintuitive compiler behavior with implicit actor isolation. Rather than start at the end, I’ll walk you through how I introduced the crash, and then what I did to resolve it.

Let’s say you’re maintaining a Swift package that’s used across a number of different applications. Your library does nontrivial work and therefore your consumers need access to log messages coming from your package. Since your package cannot know how the host application wishes to capture logs, you decide to provide a public API surface that, at runtime, allows the host application to configure a logging “sink” through which all log messages from your package will flow:

public enum Logging {
    /// Replace the value of `sink` with a block that routes
    /// log output to your app's preferred destination:
    public static var sink: (String) -> Void = { print($0) }
}

But this will not compile under Swift 6 language mode (or under Swift 5 language mode with additional concurrency checks enabled):

Static property 'sink' is not concurrency-safe because it is nonisolated global shared mutable state

Womp-womp. As the maintainer of my Swift package, since Swift 4 I have been doing the legwork in the real world, through documentation and code review, to make sure that none of my consumers are mutating the sink property except exactly once, early during app launch, before anything could be accessing it. It’s a mitigated risk. But under Swift 6, these kinds of risk mitigations are no longer sufficient. A fix has to be made in the code. Pinky swears no longer cut it.

One option could be to convert the Logging API to something with explicit global actor isolation, say via the Main Actor:

public enum Logging {
    // I added `@MainActor` below:
    @MainActor public static var sink: (String) -> Void = { print($0) }
}

I wouldn’t recommend this option unless all the logging in your package is already coming from code isolated to the main actor. Otherwise, you’ll have to edit many, many call sites from a synchronous call:

foo()
Logging.sink("the message goes here")
bar()

to dispatch asynchronously to the main queue:

foo()
DispatchQueue.main.async {
    Logging.sink("the message goes here")
}
bar()

Another problem with that change is that it dumps telemetry code (string interpolation, etc.) onto the main queue where it’s not desirable to occur, since it may degrade user interface code execution and introduce scroll hitches.

You might instead propose a more radical change that uses a non-global actor as a shared singleton so that existing synchronous read/write access can be preserved, obscuring the asynchronous details behind the scenes:

enum Logging {
    // NOTE: the addition of `@Sendable`
    typealias LoggingSink = @Sendable (String) -> Void

    static var sink: LoggingSink {
        get {
            { message in Task {
                await plumbing.log(message: message)
            } }
        }
        set {
            Task {
                await plumbing.configure(sink: newValue)
            }
        }
    }

    private static let plumbing = Plumbing()

    private actor Plumbing {
        var sink: LoggingSink = { print($0) }

        func log(message: String) {
            sink(message)
        }

        func configure(sink: @escaping LoggingSink) {
            self.sink = sink
        }
    }
}

This second option compiles without warnings or errors, even with the strictest concurrency checks that are enabled intrinsically when compiling under Swift 6 language mode. It also behaves as expected at runtime. This is a viable option, but it didn’t occur to me until I was writing this blog post. What occurred to me instead was to find a way to use locking mechanisms to synchronize access to the mutable static var property. What happened next led me down a path to some code that (A) compiled without warnings or errors but (B) crashed hard at runtime due to implicit actor isolation assertion failures.

It’s this other approach that I want to walk you through, since you might be as tempted as I was to take this path and might be lulled into a false sense of optimism by the lack of compiler warnings.

TL;DR: Seriously, beware the perils of @unchecked Sendable, it hides more sins than you might guess.

Let’s revisit the original code sample, trimming some fluff for conversational purposes:

enum Logging {
    static var sink: (String) -> Void = { ... }
    // Error: Static property 'sink' is not concurrency-
    // safe because it is nonisolated global shared 
    // mutable state
}

We can resolve the compiler error by making sink a computed property backed by something that the compiler will accept. Let’s introduce a generic type Box<T> that is @unchecked Sendable and can be the backing storage for our property:

final class Box<T>: @unchecked Sendable {
    var value: T
    init(_ value: T) {
        self.value = value
    }
}

(Side note: I have, for purposes of this blog post, omitted the use of a locking mechanism to synchronize access to the value property. Such locking ensures that data races are, in practice, not possible. It doesn’t have bearing on the discussion that follows, however, because it amounts to a pinky swear that the Swift compiler cannot verify. It’s really too bad there isn’t language-level support for this pattern that is concurrency checkable instead of only concurrency ignorable.)
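For the curious, here is a minimal sketch of what the omitted locking might look like, assuming NSLock from Foundation. The name LockedBox is hypothetical (chosen so it doesn’t collide with the Box<T> above), and this is only one of several ways to do it:

```swift
import Foundation

// Hypothetical name; a locked variant of the Box<T> shown above.
final class LockedBox<T>: @unchecked Sendable {
    private var _value: T
    private let lock = NSLock()

    init(_ value: T) {
        self._value = value
    }

    // Every read and write takes the lock, so concurrent callers
    // cannot touch `_value` at the same time.
    var value: T {
        get {
            lock.lock()
            defer { lock.unlock() }
            return _value
        }
        set {
            lock.lock()
            defer { lock.unlock() }
            _value = newValue
        }
    }
}
```

The lock protects the storage, but the compiler still has to take our word for it, which is exactly why the @unchecked is needed.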

Without the @unchecked Sendable, we’d get a compiler error down below when we try to use a Box<LoggingSink> to store our property:

enum Logging {
    typealias LoggingSink = (String) -> Void

    static var sink: LoggingSink {
        get { _sink.value }
        set { _sink.value = newValue }
    }

    private static let _sink = Box<LoggingSink>({ print($0) })
}

Since we have included the @unchecked Sendable on our Box type, that ☝🏻 there compiles without warnings or errors. We’ve converted a mutable static var sink property to a computed property backed by an object that the concurrency checker ignores. This preserves our existing API. It also carries forward all of the existing risks which our consumers have pinky-sworn with us not to get wrong. We could stop here. However, when we wire this code up in a sample project, we encounter a crash at runtime on iOS 18:

@main 
struct MyApp: App {
    init() {
        Logging.sink = { print($0) }
    }

    var body: some Scene { ... }
}

struct ContentView: View {
    var body: some View { ... }

    func userPressedTheButton() {
        DispatchQueue.global().async {
            Logging.sink("hello") // <-- CRASH
        }
    }
}

Here’s the output from the bt command in lldb:

* thread #3, queue = 'com.apple.root.default-qos', stop reason = EXC_BREAKPOINT (code=1, subcode=0x101bd43f8)
    frame #0: 0x0000000101bd43f8 libdispatch.dylib`_dispatch_assert_queue_fail + 116
    frame #1: 0x0000000101bd4384 libdispatch.dylib`dispatch_assert_queue + 188
    frame #2: 0x0000000244f8a400 libswift_Concurrency.dylib`swift_task_isCurrentExecutorImpl(swift::SerialExecutorRef) + 284
    frame #3: 0x0000000100838994 FalsePositives.debug.dylib`closure #1 in FalsePositivesApp.init($0="hello") at FalsePositivesApp.swift:0
    frame #4: 0x00000001008385ec FalsePositives.debug.dylib`thunk for @escaping @callee_guaranteed (@guaranteed String) -> () at <compiler-generated>:0
    frame #5: 0x000000010083849c FalsePositives.debug.dylib`thunk for @escaping @callee_guaranteed (@in_guaranteed String) -> (@out ()) at <compiler-generated>:0
  * frame #6: 0x0000000100839850 FalsePositives.debug.dylib`closure #1 in ContentView.userPressedTheButton() at FalsePositivesApp.swift:40:21
    frame #7: 0x00000001008398a0 FalsePositives.debug.dylib`thunk for @escaping @callee_guaranteed @Sendable () -> () at <compiler-generated>:0
    frame #8: 0x0000000101bd0ec0 libdispatch.dylib`_dispatch_call_block_and_release + 24
    frame #9: 0x0000000101bd27b8 libdispatch.dylib`_dispatch_client_callout + 16
    frame #10: 0x0000000101bd55f4 libdispatch.dylib`_dispatch_queue_override_invoke + 1312
    frame #11: 0x0000000101be63d4 libdispatch.dylib`_dispatch_root_queue_drain + 372
    frame #12: 0x0000000101be6f7c libdispatch.dylib`_dispatch_worker_thread2 + 256
    frame #13: 0x000000010099b7d8 libsystem_pthread.dylib`_pthread_wqthread + 224

This stack of method calls jumps out at us:

0: _dispatch_assert_queue_fail
1: dispatch_assert_queue
2: swift_task_isCurrentExecutorImpl(swift::SerialExecutorRef)

That’s a runtime assertion causing the app to crash. It appears to be asserting that some specific queue managed by a task executor is the expected queue. Since we aren’t using any other actors here except the Main Actor (implicitly the Main Actor since SwiftUI View and App protocols are implicitly isolated to the Main Actor), we have to assume that the runtime is doing the equivalent of this:

dispatch_assert_queue(.main)

But why? Our LoggingSink function type is implicitly nonisolated:

typealias LoggingSink = (String) -> Void

Same goes for the rest of the Logging namespace and the Box class. There’s nothing in our logging API surface that would imply Main Actor isolation. Where is that isolation being inferred?

It turns out that the implicit Main Actor isolation is getting introduced by MyApp where we’ve supplied the LoggingSink:

struct MyApp: App {
    init() {
        Logging.sink = { print($0) }
    }

The App protocol declaration requires @MainActor:

@available(iOS 14.0, macOS 11.0, tvOS 14.0, watchOS 7.0, *)
@MainActor @preconcurrency public protocol App {

Therefore that init() method is isolated to the Main Actor. But our Logging.sink member is not isolated to the Main Actor. It’s implicitly nonisolated, so why is the compiler inferring Main Actor isolation for the block we pass to it?

I will offer an educated guess about what’s happening here. I believe it’s a combination of three factors:

1) It is not possible to explicitly declare a function type as nonisolated.

You cannot include the nonisolated keyword as part of the type declaration for a function. We could not, for example, write our app’s initializer like this:

struct MyApp: App {
    init() {
        // Cannot find type 'nonisolated' in scope
        let sink: nonisolated (String) -> Void = { print($0) }
        Logging.sink = sink
    }

This is an important distinction because it has bearing on the next factor that I believe is contributing to the behavior:

2) Closures (may? always?) implicitly inherit the actor isolation where they are created.

If we rewrote our app’s initializer to look like this instead:

@main struct FalsePositivesApp: App {
    init() {
        let closure: (String) -> Void = { print($0) }
        Logging.sink = closure
    }

You might assume that the closure variable couldn’t possibly be isolated to the Main Actor. It’s right there in the type of the function: (String) -> Void. According to the language rules as I understand them, function types that are not explicitly isolated to a global actor are implicitly nonisolated. But when we run the app with ☝🏻 that code, we get the same result: a runtime assertion failure and a crash on dispatch_assert_queue, as if our closure had the type @MainActor (String) -> Void instead. For some reason here, the Swift 6 compiler is implicitly associating that { print($0) } closure with the Main Actor without informing us that it is doing so. I would argue that this is, at least in part, a defect in the compiler. This implicit Main Actor isolation is erroneous. But part of the problem is on us, because we’ve been using @unchecked Sendable, which causes the compiler to suppress errors that would otherwise bring related problems to the surface. This leads us to the third factor contributing to the observed behavior.
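One way to observe this inheritance at compile time, without any crash, is a small experiment like the sketch below. All of the names here are hypothetical and not part of the sample app. The point is that a closure which is not @Sendable, formed inside a @MainActor function, is allowed to call main actor-isolated code synchronously, which only makes sense if the compiler is treating the closure body as main actor-isolated:

```swift
// Hypothetical names throughout; this is not code from the sample app.
@MainActor var mainActorCallCount = 0

@MainActor func touchMainActorState() {
    mainActorCallCount += 1
}

@MainActor func makeSink() -> (String) -> Void {
    // Not @Sendable and formed in a @MainActor context: the body is
    // treated as main actor-isolated, so this synchronous call compiles
    // even though the function type itself says nothing about isolation.
    return { message in
        touchMainActorState()
        print(message)
    }
}

@MainActor func makeSendableSink() -> @Sendable (String) -> Void {
    // A @Sendable closure is not assumed to stay on the main actor.
    return { message in
        // touchMainActorState() // expected to be rejected here, since
        //                       // this closure body is nonisolated
        print(message)
    }
}
```

If that reading is right, the type (String) -> Void does not tell the whole story: the isolation lives in the closure body, not in the function type.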

3) @unchecked Sendable suppresses compile-time concurrency checks of functions used as stored instance members, but does not suppress run-time concurrency checks when those functions are executed.

Phew, that’s a mouthful. Let’s unpack it a bit.

In other words, code that compiles OK may crash at run-time on false positive assertions in code that turns out not to be a legitimate problem.

I do not consider the suppression of compile-time errors a bug or a defect. Slapping @unchecked Sendable on something is brazenly constructing a footgun. The Swift 6 compiler is justified in suppressing warnings and errors, and we have been “asking for it” if our app encounters actual data races.

However, I do think it’s impolite for the runtime to enforce isolation assertions on code that has explicitly been asked to suppress such assertions via @unchecked, especially since they are just naive assertions about the current dispatch queue, not introspection of the actual content of the function being executed. Pinky swears, documentation, peer review, and long stretches of production battle-hardening should be a sufficient counter to any gripes that the Swift 6 runtime may have about our desire to ignore actor isolation. The fact that it is possible for the runtime to not honor my intent here is, I’d argue, if not a defect, at least an annoyance. Let me build my footguns. I promise to only shoot between adjacent toes.

Putting All Three Together

Let’s see how all three factors come together to create this problem, where compile time seems OK but run-time crashes. Let’s start by, for sake of argument, rewriting our app’s initializer to this:

enum Example {
    static func runThisOnMainActor(
        closure: @escaping (String) -> Void
    ) {
        DispatchQueue.main.async {
            // ERROR: Sending 'closure' risks
            // causing data races:
            closure("example")
        }
    }
}

@main struct FalsePositivesApp: App {
    init() {
        let closure: (String) -> Void = { print($0) }
        Example.runThisOnMainActor { message in
            closure(message)
        }
    }

That will not compile under Swift 6 language mode because we cannot pass closure into DispatchQueue.main.async unless the runThisOnMainActor method’s closure parameter is declared @Sendable. The Swift 6 compiler is really good at catching all kinds of flavors of this potential for data races, even if you try stuffing @unchecked Sendable into the mix:

final class Example: @unchecked Sendable {
    func runThisOnMainActor(
        closure: @escaping (String) -> Void
    ) {
        DispatchQueue.main.async {
            // ERROR: Sending 'closure' risks
            // causing data races:
            closure("example")
        }
    }
}

@main struct FalsePositivesApp: App {
    init() {
        let closure: (String) -> Void = { print($0) }
        Example().runThisOnMainActor { message in
            closure(message)
        }
    }

But it does have a loophole: functions as stored instance members. If you have a type that is @unchecked Sendable with a stored instance member that’s a function, then all the compile-time concurrency checking around that function is suppressed. The following code sample compiles without warnings on Swift 6, but crashes at runtime with the assertion failure we’ve been discussing:

final class Example: @unchecked Sendable {
    var closure: (String) -> Void = { _ in }
}

@main struct FalsePositivesApp: App {
    init() {
        let closure: (String) -> Void = { print($0) }
        let example = Example()
        example.closure = closure
        DispatchQueue.global().async {
            example.closure("crash!") // CRASHES
        }
    }

What’s so strange here is the nature of our closure local variable. It’s implicitly nonisolated, but the compiler is baking Main Actor isolation checks into the { print($0) } body of the closure. There’s no compile-time warning about any of this because of the loophole in functions as stored instance members of unchecked sendables.

How To Fix This

Assuming that you just really, really, really want to keep that @unchecked Sendable in the mix, the way to resolve this issue is to change the function type declaration of the LoggingSink from:

typealias LoggingSink = (String) -> Void

to:

typealias LoggingSink = @Sendable (String) -> Void

With that one change, it is no longer possible to set the value of Logging.sink to anything other than a sendable function:

@main struct FalsePositivesApp: App {
    init() {
        let closure: (String) -> Void = { print($0) }
        // ERROR: Converting non-sendable function 
        // value to '@Sendable (String) -> Void' 
        // may introduce data races:
        Logging.sink = closure
    }

And the following will both compile without warnings or errors, and will not violate an assertion at runtime:

@main struct FalsePositivesApp: App {
    init() {
        Logging.sink = { print($0) }
    }
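Pulling the pieces together, here is a minimal sketch of the fixed logging API. It assumes the unlocked Box<T> from earlier in this post, so the pinky swear about configuring the sink once at launch still applies. The logFromBackgroundQueue helper is hypothetical and only exists to mimic the call that used to crash:

```swift
import Dispatch

// The unlocked Box<T> from earlier in this post.
final class Box<T>: @unchecked Sendable {
    var value: T
    init(_ value: T) {
        self.value = value
    }
}

enum Logging {
    // The one-word fix: sinks must now be @Sendable.
    typealias LoggingSink = @Sendable (String) -> Void

    static var sink: LoggingSink {
        get { _sink.value }
        set { _sink.value = newValue }
    }

    private static let _sink = Box<LoggingSink>({ print($0) })
}

// Hypothetical helper that mimics the call site that used to crash.
func logFromBackgroundQueue(_ message: String) {
    DispatchQueue.global().async {
        // The sink is @Sendable, so it was never assumed to stay on
        // the main actor when it was formed.
        Logging.sink(message)
    }
}
```

Because the closure literal is converted to a @Sendable function type at the point of assignment, it is not treated as main actor-isolated even when the assignment happens inside a main actor-isolated initializer.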

Good luck out there.

Update, November 13th 2024

Got some helpful responses via Mastodon yesterday.

Matt Massicotte writes:

What’s happening here is the compiler is reasoning “this closure is not Sendable so it couldn’t possibly change isolation from where it was formed and therefore its body must be MainActor too” but your unchecked type allows this invariant to be violated. This kind of thing comes up a lot in many forms, and it’s hard to debug…

Both Matt Massicotte and Rob Napier also brought up the Mutex struct from the Synchronization module, which I keep forgetting about because it has an iOS 18 minimum and is therefore not available to me on my projects. Mutex is analogous to the Box<T> class shown earlier in this post. Let’s look at Mutex’s generated interface alongside the one for Box<T> to compare and contrast (trimming some boilerplate for clarity).

Here’s Mutex:

struct Mutex<Value> : ~Copyable where Value : ~Copyable {
    func withLock<Result, E>(
        _ body: (inout Value) throws(E) -> sending Result
    ) throws(E) -> sending Result where E : Error, Result : ~Copyable
}

extension Mutex : @unchecked Sendable where Value : ~Copyable {
}

And here’s Box:

final class Box<T>: @unchecked Sendable {
    func access<Output>(block: (inout T) -> Output) -> Output
}

First, note how both of them require @unchecked in order to conform to Sendable. This is because there is no language-level primitive or concurrency structure that permits synchronized synchronous access to be checked for correctness by the concurrency checker. At best we can only suppress false positives. It sure would be nice if there was a language-level way to enforce correctness, though I suppose that will matter less in the future once apps can start declaring an iOS 18 minimum and rely on Mutex for these needs.
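For projects that can already require iOS 18 (or the equivalent releases on other platforms), the logging sink might be backed by Mutex along these lines. This is a sketch under that availability assumption, reusing the Logging and LoggingSink names from earlier in this post, and I have not battle-tested it:

```swift
import Synchronization

enum Logging {
    typealias LoggingSink = @Sendable (String) -> Void

    // Mutex is Sendable, so a static let is acceptable to the checker.
    private static let _sink = Mutex<LoggingSink>({ print($0) })

    static var sink: LoggingSink {
        // Copy the current sink out while holding the lock.
        get { _sink.withLock { $0 } }
        // Replace the stored sink while holding the lock.
        set { _sink.withLock { $0 = newValue } }
    }
}
```

The existing synchronous API is preserved: callers still write Logging.sink = { ... } and Logging.sink("message").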

But prior to iOS 18, one has to roll their own alternative to Mutex in order to implement synchronous access to a synchronized resource, which is what my Box<T> example class is attempting to do. But note one key difference: Box is generic over anything. It’s just T. But Mutex enforces an additional constraint: ~Copyable. This means “not Copyable”, and it has unique behavior among protocols in Swift. I recommend reading (and rereading multiple times until you can comprehend it) the Copyable documentation. There are some truly astounding bits of information in there:

Astounding Bit #1) All generic type parameters implicitly include Copyable in their list of requirements.

If your codebase has a generic type, it’s using Copyable already. That means that my Box<T> type, despite what I said above, does have an implicit constraint:

final class Box<T: Copyable>

All classes and actors implicitly conform to Copyable, and it’s not possible to declare one that doesn’t. All structs and enums implicitly conform to Copyable, but it is possible to state definitely that one does not. Check this out:

final class Box<T> {
    var value: T
    init(_ value: T) {
        self.value = value
    }
}

struct AintCopyable: ~Copyable {
    var foo = 32
}

let box = Box(AintCopyable())
// ERROR: Generic class 'Box' requires that 
// 'AintCopyable' conform to 'Copyable'

Astounding Bit #2) Both copyable and noncopyable types can conform to protocols or generic requirements that inherit from ~Copyable.

This is counterintuitive to say the least. You would think “not copyable” would exclude, as a logical categorization, anything that is “copyable”. But you’d be wrong, at least when it comes to generics and protocol requirements. Here’s an example based on the one given in the Copyable documentation:

protocol HeadScratching: ~Copyable {
    var foo: Int { get set }
}

struct IsCopyable: Copyable, HeadScratching {
    var foo = 32
}

struct AintCopyable: ~Copyable, HeadScratching {
    var foo = 32
}

That ☝🏻 there compiles without warnings or errors. This next example almost compiles, except for some concurrency errors I will explain in a moment, the explanation of which will bring us to another Astounding Bit about the Copyable protocol:

final class Box<T: ~Copyable> {
    var value: T
    init(_ value: T) {
        self.value = value
    }
}

struct AintCopyable: ~Copyable {
    var foo = 32
}

struct IsCopyable: Copyable {
    var foo = 32
}

// Allowed:
let boxA = Box(AintCopyable())

// Also allowed:
let boxB = Box(IsCopyable())

Astounding Bit #3) Non-copyable values cannot be parameterized without specifying their ownership.

To put it another way, my preceding example fails to compile because the initializer’s value parameter does not declare its ownership.

You must specify one of three ownership options for the value parameter, since it is required to be non-copyable: borrowing, inout, or consuming. The borrowing option is not useful for our purposes here because we’re trying to construct a threadsafe mutable reference to a shared value, and borrowing would not allow us to store value in a property. The inout option is slightly more usable, because it allows us to store value in a property, however it permits both Box and the caller using a Box to have read/write access to the same piece of data. That’s not going to fly when we’re trying to prevent data races. That leaves only the consuming option, which is what you see in Mutex’s initializer:

struct Mutex<Value: ~Copyable> {
    init(_ initialValue: consuming sending Value) {
        ...
    }

This all bears repeating: the compiler forces you to select an explicit memory ownership for any generic parameter that is noncopyable (~Copyable). This makes ~Copyable especially useful for use with data synchronization utilities like Mutex. We want to be as explicit as possible, at compile time, about where and how data is permitted to mutate.
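Here is a small sketch of what consuming ownership buys us, assuming a compiler with noncopyable generics (Swift 6). The Holder class and the ownershipDemo function are hypothetical names, and AintCopyable is the struct from the earlier examples:

```swift
struct AintCopyable: ~Copyable {
    var foo = 32
}

// Hypothetical container; stands in for a Box without any locking.
final class Holder<T: ~Copyable> {
    private var stored: T

    // `consuming` takes the value away from the caller for good.
    init(_ value: consuming T) {
        self.stored = value
    }

    func withValue<Output>(_ body: (inout T) -> Output) -> Output {
        body(&stored)
    }
}

func ownershipDemo() -> Int {
    let original = AintCopyable()
    let holder = Holder(original)
    // print(original.foo) // expected to be rejected: `original` was
    //                     // consumed by the initializer above
    return holder.withValue { $0.foo }
}
```

After the initializer runs, the only remaining path to the value goes through the holder, which is the property a synchronization utility wants.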

Astounding Bit #4) Non-copyable parameters with consuming ownership help diagnose concurrency issues at compile time.

Once you’ve selected the consuming ownership, there are some useful secondary effects that cascade from that choice. Let’s look at a contrived example to learn more:

The Swift 6 compiler is able to deduce that we’ve introduced a potential data race. The box declaration is at a global scope, and contains mutable state. The fixits in that screenshot list several options. Let’s use @unchecked Sendable and add our own locking mechanism to prevent a data race:

final class Box<T: ~Copyable>: @unchecked Sendable {
    private var _value: T
    private let lock = NSLock()

    init(_ value: consuming T) {
        self._value = value
    }

    func withLock<Output: ~Copyable>(
        _ body: (inout T) -> Output
    ) -> Output {
        lock.lock()
        defer { lock.unlock() }
        return body(&_value)
    }
}

let box = Box(false)

That all compiles now, without warnings or errors. The stored _value property is private, and all reads and writes go through the withLock method, which uses an NSLock to synchronize concurrent access. But this example is still just using a Bool, which is pretty contrived. What happens if we go back to my original example above of a (String) -> Void function type instead of a Bool?

let sharedBox = Box<(String) -> Void>({ print($0) })

@main struct FalsePositivesApp: App {
    init() {
        sharedBox.withLock {
            $0 = { print("custom: \($0)") }
        }
    }

    var body: some Scene { WindowGroup { ContentView() } }
}

struct ContentView: View {
    var body: some View {
        Button("Press Me.") {
            userPressedTheButton()
        }
    }

    func userPressedTheButton() {
        DispatchQueue.global().async {
            let block = sharedBox.withLock { $0 }
            block("print me") // <--- CRASHES
        }
    }
}

It compiles without errors, but it crashes at runtime with the same main queue assertion we saw earlier in this blog post. What gives? Well, we’ve omitted something from our definition of the withLock method that is seen in Mutex’s equivalent method: a bunch of sending keywords. This keyword was introduced in SE-0430. It’s a bit wonkish to explain, but it allows a function parameter and/or result to indicate to the compiler that it might be sent over a concurrency isolation boundary (like, say, from the main actor to some other region). It’s akin to @Sendable, but less dramatic. It patches some holes in the concurrency checker that otherwise made it difficult to do things like: prevent concurrent access to a non-Sendable object after it got passed as a parameter into an actor’s initializer. The proposal is worth reading carefully, I won’t summarize it more than that now.
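To make the actor initializer case a little more concrete, here is a minimal sketch of a sending parameter. The NonSendableCounter, Worker, and sendingDemo names are all hypothetical, and this is my own reading of the proposal rather than an example taken from it:

```swift
// A plain class with mutable state: not Sendable.
final class NonSendableCounter {
    var count = 0
}

actor Worker {
    private let counter: NonSendableCounter

    // `sending` says the caller hands the value over and must not
    // keep using it afterwards.
    init(counter: sending NonSendableCounter) {
        self.counter = counter
    }

    func increment() -> Int {
        counter.count += 1
        return counter.count
    }
}

func sendingDemo() async -> Int {
    let counter = NonSendableCounter()
    let worker = Worker(counter: counter)
    // counter.count += 1 // expected to be rejected: `counter` was
    //                    // already sent to the actor
    return await worker.increment()
}
```

The counter never becomes Sendable. The compiler only verifies that nobody on the sending side can still reach it after the hand-off.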

Here’s our Box’s withLock method with those sending keywords added:

func withLock<Output: ~Copyable>(
    _ body: (inout sending T) -> sending Output
) -> sending Output

What happens if we try to compile now? Check it out:

The text of those errors is underwhelming, but their presence is helpful. It’s detected that we’re sending a non-Sendable function type either from the Main Actor into some unspecified region (“…cannot be main actor-isolated at…”), or from one unspecified region into another unspecified region (“…cannot be task-isolated at…”). The solution is what we discovered earlier: to require our LoggingSink function type to be @Sendable:

typealias LoggingSink = @Sendable (String) -> Void

With that one change, the compiler errors go away and the runtime crashes are resolved, too.

OK, but why are you blathering on about this? Didn’t we already figure that out earlier?

Here’s why I think this matters. There’s one crucial difference between yesterday’s implementation of Box<T> and today’s implementation, Box<T: ~Copyable>: the noncopyable version detected, at compile time, that we were sending a non-Sendable function type across isolation regions in a way that would trigger runtime actor isolation checks. The version that is implicitly copyable does not expose this information to the concurrency checker at compile time. This is true even though both implementations of Box use @unchecked Sendable. The internals of Box are still unchecked by the concurrency checker, but the values passed into and out of the Box are better checked when ~Copyable and consuming and sending are in the mix.

So if, like me, you are unable to use Mutex yet because it requires iOS 18 or later and, like me, you are on the hook to write a replacement that supports earlier OS versions, I think you and I both should consider mimicking the design of Mutex, in particular its compile time checks via ~Copyable, consuming, and sending.

12 Nov 2024