
Pshtiwan Mahmood

Solving the Flutter `did_send` Crash: A Deep Dive into Isolate Race Conditions

TL;DR: A race condition in Flutter isolate communication was causing fatal did_send crashes in our production sports app. The fix required async coordination, proper cleanup sequencing, and a critical 100ms delay. Here's how we solved it.

🚨 The Problem: A Production Nightmare

Imagine this scenario: Your Flutter app is running smoothly in production with thousands of daily users. Everything works perfectly until users start reporting random crashes when navigating to specific screens. The crash logs show a cryptic, terrifying error:

[FATAL:flutter/lib/ui/window/platform_message_response_dart_port.cc(53)] Check failed: did_send. 

This wasn't just any crash: it was a fatal engine error that would completely terminate the app. No graceful error handling, no recovery, no second chances. Just instant death. 💀

The worst part? The error message gave us absolutely no context about what was causing it.

📱 The App: Real-Time Sports Data at Scale

Our app is a comprehensive sports statistics platform serving thousands of users. It provides:

  • ⚽ Live match scores updated every 5 seconds
  • 🏆 Real-time competition standings
  • 📊 Player and team statistics
  • 🌍 Multi-language support

The real-time functionality is the heart of our app, powered by Flutter isolates that run background timers, fetch data from APIs, and stream updates to the main isolate, which drives the UI.

The Architecture

    // Two main services handling real-time data
    class RealtimeDataService {
      // Handles general sports data (all games)
      // Updates every 5 seconds via isolate
    }

    class SingleRealtimeDataService {
      // Handles specific game data
      // Updates every 5 seconds via isolate
    }
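To make the architecture concrete, here is a minimal sketch of the pattern described above: a background isolate runs a periodic timer and sends each update to the main isolate through a `SendPort`. The names `pollingEntry` and `collectTicks` are hypothetical, the payload is a plain counter instead of real API data, and the interval is shortened to 50 ms so the demo finishes quickly. It is an illustration of the general technique, not the app's actual service code.

```dart
import 'dart:async';
import 'dart:isolate';

/// Entry point of the background isolate (hypothetical name).
/// Sends an incrementing tick to the main isolate on a fixed interval.
void pollingEntry(SendPort toMain) {
  var tick = 0;
  // The article uses a 5 second interval; 50 ms keeps the demo short.
  Timer.periodic(const Duration(milliseconds: 50), (_) {
    // A real service would fetch from an API here and send the parsed result.
    toMain.send(tick++);
  });
}

/// Spawns the worker, collects the first [count] updates, then shuts down.
Future<List<int>> collectTicks(int count) async {
  final receiver = ReceivePort();
  final isolate = await Isolate.spawn(pollingEntry, receiver.sendPort);
  final ticks = await receiver.take(count).cast<int>().toList();
  isolate.kill(priority: Isolate.immediate);
  receiver.close();
  return ticks;
}

Future<void> main() async {
  print(await collectTicks(3)); // prints [0, 1, 2]
}
```

The pending timer is what keeps the worker alive after `pollingEntry` returns, which is also why it can still fire during a badly sequenced shutdown.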

🔍 The Investigation: Following the Crash Trail

Step 1: Understanding the did_send Error

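After digging into the Flutter engine source, I found that the check lives in the code that delivers a platform message response to a Dart port (platform_message_response_dart_port.cc, the file named in the log). The did_send error occurs when:

  1. A message is sent to an isolate through a SendPort
  2. The receiving ReceivePort has already been closed/disposed
  3. The Flutter engine's safety check fails: did_send == false
  4. 💥 FATAL CRASH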

Step 2: Reproducing the Issue

The crash had a very specific pattern. It happened when users:

  1. 📱 Opened a modal dialog
  2. 👆 Clicked on a list item
  3. 🧭 Navigated to a detail screen

This navigation sequence triggered our router observer, which managed the real-time services based on the current route.

Step 3: Finding the Smoking Gun 🔫

Here's the problematic code that was causing our crashes:

    // 🔴 PROBLEMATIC CODE (Before fix)
    void setRealtimeBaseOnRoute(Route route) {
      final screen = getScreen(route); // same lookup the fixed version uses
      if (screen?.name == TargetRoute.name) {
        singleRealtimeDataService.stop(); // Closes ReceivePort immediately
        realtimeDataService.stop(); // Closes ReceivePort immediately

        realtimeDataService.start(); // Starts new isolate immediately
      }
    }

The Race Condition Explained:

  1. stop() requests the isolate kill and closes the ReceivePort
  2. start() creates a new isolate immediately
  3. But the old isolate hasn't finished shutting down yet!
  4. Old isolate timer fires → tries to send to CLOSED port → CRASH

✅ The Solution: Proper Async Coordination

Fix #1: Make Operations Asynchronous

The key insight was that isolate lifecycle management needed to be properly coordinated with async/await:

    // ✅ FIXED CODE (After solution)
    // Requires: import 'dart:developer' as developer;
    // Returns Future<void> so callers can await the whole transition.
    Future<void> setRealtimeBaseOnRoute(Route route) async {
      try {
        final ScreenDetailData? screen = getScreen(route);
        if (screen?.name == TargetRoute.name) {
          // Wait for complete cleanup before proceeding
          await singleRealtimeDataService.stop();
          await realtimeDataService.stop();

          // 🔧 CRITICAL: Wait for isolate cleanup
          await Future.delayed(const Duration(milliseconds: 100));

          await realtimeDataService.start(shouldStart: true);
        }
      } catch (e) {
        developer.log('Error in route management: $e');
        // Don't rethrow to prevent app crashes
      }
    }
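Awaiting inside one call orders the steps of that call, but a router observer can fire again before the previous transition has finished, so two transitions may still interleave. One possible way to guard against that is to chain transitions so each starts only after the previous one completes. The sketch below is an assumption layered on top of the fix, not part of the original code: `TransitionQueue`, `transition`, and `demo` are hypothetical names, and a short delay stands in for the real `stop()` and `start()` calls.

```dart
import 'dart:async';

/// Runs async actions strictly one after another (hypothetical helper).
class TransitionQueue {
  Future<void> _tail = Future<void>.value();

  /// Schedules [action] to start only after every earlier action has finished.
  Future<void> run(Future<void> Function() action) {
    final result = _tail.then((_) => action());
    // Keep the chain usable even if this action throws.
    _tail = result.then((_) {}, onError: (Object _) {});
    return result;
  }
}

Future<List<String>> demo() async {
  final log = <String>[];
  final queue = TransitionQueue();

  Future<void> transition(String route) async {
    log.add('stop for $route');
    // Stands in for the awaited stop() and cleanup.
    await Future<void>.delayed(const Duration(milliseconds: 20));
    log.add('start for $route');
  }

  // Two navigations fired back to back, as a router observer might do.
  final first = queue.run(() => transition('A'));
  final second = queue.run(() => transition('B'));
  await Future.wait([first, second]);
  return log;
}

Future<void> main() async {
  // prints [stop for A, start for A, stop for B, start for B]
  print(await demo());
}
```

Calling `transition('A')` and `transition('B')` directly, without the queue, would log both stops before either start, which is the interleaving the queue is meant to rule out.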

Fix #2: Enhanced Service Architecture

I also refactored the isolate services to be more robust:

    import 'dart:developer' as developer;
    import 'dart:isolate';

    class RealtimeDataService {
      ReceivePort? _receiver;
      Isolate? _isolate;
      bool _isStarted = false;
      bool _isStarting = false; // 🔧 Prevents race conditions

      Future<void> stop() async {
        if (!_isStarted && !_isStarting) return;

        try {
          _isStarting = false;

          // Kill isolate first
          _isolate?.kill(priority: Isolate.immediate);
          _isolate = null;

          // Close receive port
          _receiver?.close();
          _receiver = null;

          _isStarted = false;
        } catch (e) {
          developer.log('Error stopping service: $e');
        }
      }

      // Empty stream while stopped; otherwise the port itself, which is a Stream.
      Stream<dynamic> get dataStream {
        if (_receiver == null) return const Stream.empty();
        return _receiver!;
      }
    }
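The snippet above shows only `stop()`. To illustrate how an `_isStarting` flag can protect the other half of the lifecycle, here is a sketch of a matching `start()`. This is an assumption about how such a guard might work, not the app's actual implementation: `GuardedService`, `_worker`, and the `isStarted` getter are hypothetical, and the worker just sends a string on a timer.

```dart
import 'dart:async';
import 'dart:isolate';

// Hypothetical worker: keeps itself alive with a timer and sends ticks.
void _worker(SendPort toMain) {
  Timer.periodic(const Duration(milliseconds: 50), (_) => toMain.send('tick'));
}

class GuardedService {
  ReceivePort? _receiver;
  Isolate? _isolate;
  bool _isStarted = false;
  bool _isStarting = false;

  bool get isStarted => _isStarted;

  Future<void> start() async {
    if (_isStarted || _isStarting) return; // ignore duplicate start requests
    _isStarting = true;
    final receiver = ReceivePort();
    final isolate = await Isolate.spawn(_worker, receiver.sendPort);
    if (!_isStarting) {
      // stop() ran while the isolate was spawning: discard the new isolate.
      isolate.kill(priority: Isolate.immediate);
      receiver.close();
      return;
    }
    _receiver = receiver;
    _isolate = isolate;
    _isStarting = false;
    _isStarted = true;
  }

  Future<void> stop() async {
    if (!_isStarted && !_isStarting) return;
    _isStarting = false; // tells an in-flight start() to abandon its isolate
    _isolate?.kill(priority: Isolate.immediate);
    _isolate = null;
    _receiver?.close();
    _receiver = null;
    _isStarted = false;
  }
}

Future<void> main() async {
  final service = GuardedService();
  final starting = service.start(); // deliberately not awaited yet
  await service.stop(); // arrives while the isolate is still spawning
  await starting;
  print(service.isStarted); // prints false
}
```

A single boolean cannot tell apart a stop followed by a second start during the same spawn; a generation counter compared after the `await` would be a more robust variant of the same idea.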

Fix #3: The Magic 100ms Delay ⏰

You might wonder: "Why 100ms? Isn't that arbitrary?"

Not quite. The exact figure is a safety margin rather than a guarantee, but the pause matters because it gives the Flutter engine time to:

  • ✅ Complete isolate termination
  • ✅ Close all message ports
  • ✅ Clean up platform channels
  • ✅ Ensure no orphaned messages
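A fixed delay assumes the old isolate is gone within that window. As a complementary technique (not what the fix above does), `dart:isolate` can report when an isolate has actually exited via `Isolate.addOnExitListener`. The sketch below waits for that notification, with a timeout as a fallback. The helper `killAndAwaitExit` and the worker `tickerEntry` are hypothetical names, and the 2 second timeout is an arbitrary choice for illustration.

```dart
import 'dart:async';
import 'dart:isolate';

// Hypothetical worker that keeps sending until it is killed.
void tickerEntry(SendPort toMain) {
  Timer.periodic(const Duration(milliseconds: 10), (_) => toMain.send('tick'));
}

/// Kills [isolate] and completes once its exit notification arrives.
/// Returns false if no notification arrives within [timeout].
Future<bool> killAndAwaitExit(
  Isolate isolate, {
  Duration timeout = const Duration(seconds: 2),
}) async {
  final exitPort = ReceivePort();
  // A message (null by default) is sent to this port when the isolate exits.
  isolate.addOnExitListener(exitPort.sendPort);
  isolate.kill(priority: Isolate.immediate);
  try {
    await exitPort.first.timeout(timeout);
    return true;
  } on TimeoutException {
    return false;
  } finally {
    exitPort.close();
  }
}

Future<void> main() async {
  final receiver = ReceivePort();
  final isolate = await Isolate.spawn(tickerEntry, receiver.sendPort);
  final exited = await killAndAwaitExit(isolate);
  receiver.close(); // closed only after the sender is confirmed gone
  print(exited ? 'old isolate exited' : 'timed out waiting for exit');
}
```

If the isolate has already exited before the listener is registered, no notification is sent, which is why the timeout is kept. Whether waiting for exit alone is enough to avoid the engine check in a Flutter app is untested here, so the short delay may still be worth keeping as a belt-and-braces measure.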

🧠 The Technical Deep Dive: Timeline Analysis

Before the Fix (Race Condition):

โฐ Time 0ms: User navigates โฐ Time 1ms: Navigation starts โฐ Time 2ms: stop() called โ†’ ReceivePort closes โฐ Time 3ms: start() called โ†’ New ReceivePort opens โฐ Time 4ms: Old isolate timer fires โ†’ tries to send to CLOSED port โฐ Time 5ms: ๐Ÿ’ฅ CRASH: did_send check fails 
Enter fullscreen mode Exit fullscreen mode

After the Fix (Proper Sequencing):

โฐ Time 0ms: User navigates โฐ Time 1ms: Navigation starts โฐ Time 2ms: await stop() โ†’ ReceivePort closes + isolate cleanup โฐ Time 50ms: Old isolate fully terminated โฐ Time 100ms: Safety delay complete โฐ Time 101ms: await start() โ†’ New ReceivePort opens โฐ Time 102ms: โœ… SUCCESS: Clean transition 
Enter fullscreen mode Exit fullscreen mode

📚 Lessons Learned

1. 🔗 Isolate Communication is Fragile

Platform message channels in Flutter are low-level and unforgiving. Always ensure proper cleanup sequencing.

2. ⚡ Async/Await Saves Lives

What seemed like a simple synchronous operation actually required careful async coordination.

3. 🕵️ Production Errors Need Deep Investigation

The did_send error gave no context about the root cause. Only systematic investigation revealed the race condition.

4. 🛡️ Error Handling is Critical

Always wrap isolate operations in try-catch blocks to prevent crashes from propagating.
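One caveat worth illustrating: a try-catch on the main isolate only covers the calls made there. An uncaught error thrown inside a spawned isolate does not surface in that try-catch; `Isolate.spawn` accepts an `onError` port for it instead. The sketch below shows that mechanism with a worker that fails on purpose. `failingEntry` and `firstWorkerError` are hypothetical names used only for this example.

```dart
import 'dart:async';
import 'dart:isolate';

// Hypothetical worker that fails as soon as it starts.
void failingEntry(SendPort toMain) {
  throw StateError('fetch failed');
}

/// Spawns the worker and returns the text of the first error it reports.
Future<String> firstWorkerError() async {
  final data = ReceivePort();
  final errors = ReceivePort();
  try {
    await Isolate.spawn(
      failingEntry,
      data.sendPort,
      onError: errors.sendPort, // uncaught worker errors are delivered here
    );
    // Each report is a two-element list: [error text, stack trace text].
    final Object? raw = await errors.first;
    final report = raw as List<dynamic>;
    return report[0] as String;
  } finally {
    data.close();
    errors.close();
  }
}

Future<void> main() async {
  print(await firstWorkerError()); // prints Bad state: fetch failed
}
```

Logging these reports alongside the try-catch blocks gives visibility into failures that happen on the worker side of the port.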


🎉 The Results

After implementing these fixes:

  • ✅ Zero crashes related to the did_send error
  • ✅ Smooth navigation between all screens
  • ✅ Robust error handling prevents future isolate issues
  • ✅ Production stability with thousands of daily users

🎯 Key Takeaways for Flutter Developers

  1. Always use async/await when managing isolate lifecycles
  2. Add delays between stop/start operations to ensure cleanup
  3. Implement comprehensive error handling for isolate operations
  4. Test navigation flows thoroughly in production-like conditions
  5. Monitor crash logs for platform-level errors

💭 Final Thoughts

Debugging production crashes can be incredibly challenging, especially when the error messages are cryptic. This experience taught me the importance of:

  • Systematic investigation over quick fixes
  • Understanding the underlying platform (Flutter engine internals)
  • Proper async coordination in complex systems
  • Comprehensive error handling to prevent cascading failures

The did_send crash was a reminder that even small race conditions can have catastrophic effects in production. But with the right approach, even the most mysterious bugs can be solved.


🤝 Discussion

Have you encountered similar isolate issues in your Flutter apps? What debugging strategies worked for you? Share your experiences in the comments below!

What would you like to see next?

  • More Flutter debugging deep dives?
  • Performance optimization techniques?
  • Real-time app architecture patterns?

Let me know! 👇


If this helped you solve a similar issue, please give it a ❤️ and share it with your fellow Flutter developers!
