Fix: Prevent retry of termination signals (Celery SoftTimeLimitExceeded) #2739
+101 −0
Changes being requested
This PR fixes issue #2737 where the OpenAI client's broad exception handling was catching Celery's `SoftTimeLimitExceeded` exception and retrying the request instead of allowing it to propagate for graceful task shutdown.

Changes:
- `_should_not_retry()` helper function to identify termination signals that should propagate immediately

Affected exception types:
- `SoftTimeLimitExceeded`, `TimeLimitExceeded`, `Terminated`
- `CancelledError`

The fix preserves all existing retry logic while ensuring task executor termination signals are properly handled.
Additional context & links
Fixes #2737
Background:
When using the OpenAI client in Celery tasks with soft time limits, if the task exceeds its limit during an API call, Celery raises `SoftTimeLimitExceeded`. The client's `except Exception` clause was catching this and treating it as a retryable connection error, preventing cleanup logic from running.

Solution approach:
Instead of narrowing the exception catch (which could miss legitimate connection errors), we check if caught exceptions are termination signals and re-raise them immediately without retry. This approach: