Error Handling
FlowState provides structured error handling through exit codes and the on-error mapping. This enables workflows to recover gracefully from failures rather than terminating immediately.
Exit Codes
Every tool execution produces an exit code:

| Code | Meaning |
|---|---|
| 0 | Success - proceed to next state |
| 1 | General error |
| 2 | Misuse or invalid arguments |
| 124 | Timeout exceeded |
| 130 | User cancelled (Ctrl+C) |
| -1 | Pause requested (internal) |
Tools may define additional codes. Check individual tool documentation for specifics.
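As a rough illustration of how a state's command can produce the codes in the table, the sketch below assumes the bash tool reports the command's own exit status as the tool exit code. The state names `validate-input` and `process-input` and the file `./input.json` are hypothetical.

```yaml
validate-input:
  tool: bash
  arguments:
    command: |
      # Exit 2 when the required input is missing (treated here as invalid arguments)
      if [ ! -f ./input.json ]; then
        echo "input.json not found" >&2
        exit 2
      fi
      # Exit 0 on success so the workflow proceeds to the next state
      exit 0
  next: process-input
```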
The on-error Map
The on-error field maps exit codes to error-handling states:

```yaml
risky-operation:
  tool: bash
  arguments:
    command: ./might-fail.sh
  on-error:
    1: handle-general-error
    2: handle-invalid-args
    _: handle-unknown-error
  next: success-path
```

When the tool exits with a non-zero code:

1. FlowState checks `on-error` for that specific code
2. If found, transitions to the mapped state
3. If not found, checks for the `_` wildcard
4. If no handler matches, the workflow halts with that exit code
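The lookup order can be traced on a small example. This is only a sketch: `./deploy.sh`, its use of exit code 3, and every state name here are hypothetical, and the comments simply restate the lookup rules described in this section.

```yaml
deploy:
  tool: bash
  arguments:
    command: ./deploy.sh      # hypothetical script, assumed to exit 3 when a lock is held
  on-error:
    3: wait-for-lock          # exit 3: the exact match is used
    _: handle-deploy-error    # exit 1 or 2: no exact match, so the wildcard applies
  next: verify-deploy         # exit 0: success, proceed to the next state
```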
Wildcard Matching
The `_` key is a catch-all that matches any exit code not explicitly mapped:

```yaml
fetch-data:
  tool: bash
  arguments:
    command: curl https://api.example.com/data
  on-error:
    _: handle-network-error
  next: process-data
```

Best practice: Always include a `_` handler unless you want the workflow to halt on unexpected errors.
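For contrast, a state can leave out the wildcard on purpose. In this sketch (the script and state names are hypothetical), only the invalid-arguments case is recovered, and any other failure is assumed to halt the workflow as the lookup rules describe.

```yaml
migrate-database:
  tool: bash
  arguments:
    command: ./migrate.sh     # hypothetical script
  on-error:
    2: fix-migration-args     # only exit code 2 is handled
    # no `_` entry: any other non-zero exit halts the workflow with that code
  next: run-smoke-tests
```

Omitting the wildcard like this may suit steps where continuing after an unexpected failure would be riskier than stopping.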
Error Recovery Patterns
Retry with Backoff

```yaml
fetch-data:
  tool: bash
  arguments:
    command: curl https://api.example.com/data
  output: var(data)
  on-error:
    _: retry-fetch
  next: process-data

retry-fetch:
  tool: bash
  arguments:
    command: sleep 5
  next: fetch-data
```

For more sophisticated retry logic, track attempt count:
```yaml
variables:
  max_retries: "3"
  attempt: "0"

fetch-data:
  tool: bash
  arguments:
    command: curl https://api.example.com/data
  output: var(data)
  on-error:
    _: check-retry
  next: process-data

check-retry:
  tool: bash
  arguments:
    command: |
      attempt=$(({{ attempt }} + 1))
      echo $attempt
  output: var(attempt)
  next: evaluate-retry

evaluate-retry:
  tool: switch
  arguments:
    value: "{{ attempt == '{{ max_retries }}' }}"
  goto:
    "true": give-up
    _: wait-and-retry

wait-and-retry:
  tool: bash
  arguments:
    command: sleep 5
  next: fetch-data

give-up:
  tool: bash
  arguments:
    command: 'echo "Failed after {{ max_retries }} attempts" >&2'
```

Fallback Values
```yaml
get-config:
  tool: bash
  arguments:
    command: cat /etc/myapp/config.json
  output: var(config)
  on-error:
    _: use-default-config
  next: apply-config

use-default-config:
  tool: bash
  arguments:
    command: echo '{"mode": "default"}'
  output: var(config)
  next: apply-config
```

Cleanup on Error
```yaml
create-temp-resources:
  tool: bash
  arguments:
    command: mkdir -p ./temp && touch ./temp/lock
  next: risky-operation

risky-operation:
  tool: bash
  arguments:
    command: ./might-fail.sh
  on-error:
    _: cleanup-and-fail
  next: cleanup-and-succeed

cleanup-and-succeed:
  tool: bash
  arguments:
    command: rm -rf ./temp
  next: done

cleanup-and-fail:
  tool: bash
  arguments:
    command: rm -rf ./temp
  next: report-failure

report-failure:
  tool: bash
  arguments:
    command: 'echo "Operation failed" >&2 && exit 1'
```

Notification on Failure
```yaml
important-task:
  tool: bash
  arguments:
    command: ./critical-operation.sh
  on-error:
    _: notify-and-fail
  next: complete

notify-and-fail:
  tool: bash
  arguments:
    command: |
      curl -X POST https://hooks.slack.com/... \
        -d '{"text": "Workflow failed in important-task"}'
  # No next - workflow ends after notification
```

Timeout Handling
Exit code 124 indicates a timeout. Handle it explicitly:

```yaml
long-running-task:
  tool: claude
  arguments:
    prompt: "Analyze this large codebase..."
  timeout: 10m
  on-error:
    124: handle-timeout
    _: handle-other-error
  next: use-result

handle-timeout:
  tool: bash
  arguments:
    command: echo "Task timed out - using cached result"
  next: use-cached-result
```

User Cancellation
Exit code 130 indicates the user pressed Ctrl+C during interactive prompts:

```yaml
get-user-input:
  tool: ask-user
  arguments:
    question: "Enter the deployment target:"
  output: var(target)
  on-error:
    130: user-cancelled
    _: input-error
  next: deploy

user-cancelled:
  tool: bash
  arguments:
    command: echo "Deployment cancelled by user"
```

Error State Best Practices
- Name clearly: Use prefixes like `handle-`, `recover-`, `fallback-`

- Log context: Include enough information to diagnose issues

  ```yaml
  handle-error:
    tool: bash
    arguments:
      command: 'echo "Error in fetch-data: check network connectivity" >&2'
  ```

- Preserve state: Avoid modifying variables that might be needed for debugging

- Consider idempotency: Error handlers may run multiple times if retrying

- Exit cleanly: Terminal error states should produce meaningful exit codes

  ```yaml
  fatal-error:
    tool: bash
    arguments:
      command: exit 1
  ```
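Several of these practices can be combined in one terminal handler. The following is a sketch rather than a canonical pattern: the state name `handle-fetch-failure` is hypothetical, and it assumes an `attempt` variable and a `./temp` directory like those used in the recovery patterns earlier on this page.

```yaml
handle-fetch-failure:
  tool: bash
  arguments:
    command: |
      # Log context: name the failing state and the attempt count on stderr
      echo "Error in fetch-data (attempt {{ attempt }}): check network connectivity" >&2
      # Idempotent cleanup: rm -rf succeeds even if ./temp is already gone
      rm -rf ./temp
      # Exit cleanly: a non-zero code marks this terminal state as a failure
      exit 1
```

Because the handler only reads `attempt` and never writes it, the variable stays intact for later debugging.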
Debugging Failed Workflows
When a workflow halts due to an unhandled error:

1. Check the instance’s `context.json` for the current state
2. Review the exit code from the failed tool
3. Examine any captured output for error messages
4. Resume with `--resume` after fixing the issue, or clean up and start fresh