What are Task Workers?
- Compute processes (VMs, containers, threads) that execute discrete tasks
- Pull or receive tasks from a queue, scheduler, or coordinator
- Execute task logic and report results (or acknowledge completion)
- Stateless by design — easy to add or remove workers on demand
Why use Task Workers?
- Horizontal scaling — add more workers to increase throughput
- Parallelization — distribute work across many machines
- Async processing — offload slow work from user-facing services
- Reliability — retries, reprocessing, and failure isolation
- Load buffering — smooth traffic spikes with queues
Foundation: What are Task Workers?
The Problem with Synchronous Processing
Imagine building Instagram. When a user uploads a photo, your app needs to:
❌ The Slow Way (What NOT to do)
```python
def upload_photo(request):
    photo = request.FILES['photo']
    # User waits for ALL of this...
    resize_image(photo)      # 3 seconds
    apply_filter(photo)      # 2 seconds
    scan_content(photo)      # 4 seconds
    notify_followers(photo)  # 5 seconds
    update_feed(photo)       # 3 seconds
    return "Photo uploaded!" # After 17 seconds!
```
✅ The Fast Way (With Task Workers)
```python
def upload_photo(request):
    photo = request.FILES['photo']
    # Save photo immediately
    photo_id = save_photo(photo)
    # Queue all heavy work
    task_queue.add("resize_image", photo_id)
    task_queue.add("apply_filter", photo_id)
    task_queue.add("scan_content", photo_id)
    task_queue.add("notify_followers", photo_id)
    task_queue.add("update_feed", photo_id)
    return "Photo uploaded!"  # In 0.1 seconds!
```
🏗️ Complete Architecture
```
User Request
     ↓
┌──────────────┐        ┌─────────────────────┐        ┌──────────────┐
│  Web Server  │───────►│  Job Scheduler/     │───────►│   Worker     │
│              │        │  Workflow Engine    │        │              │
│ • Validate   │        │                     │        │ • Poll       │
│ • Enqueue    │        │ • Queue (Redis/SQS) │        │ • Execute    │
│ • Return 200 │        │ • Broker (RabbitMQ) │        │ • Report     │
└──────────────┘        │ • Orchestrator      │        └──────────────┘
     │                  │   (Conductor)       │              │
     ↓                  └─────────────────────┘              ↓
"Task queued!"               │                        ┌──────────────┐
  (instant)                  │                        │  Database/   │
     ↓                       │                        │  External    │
┌─────────────────┐          │                        │  Services    │
│ State & Metadata│          │                        └──────────────┘
│   Persistence   │
└─────────────────┘
```
Components:
| Component | Description |
|---|---|
| Job Scheduler/Workflow Engine | Manages task queues, routing, and orchestration (e.g., Conductor, Celery) |
| Queue/Broker | Stores pending tasks and distributes them to workers (e.g., Redis, SQS, Kafka) |
| Workers | Poll for tasks, execute business logic, and report completion |
| State Persistence | Tracks workflow execution state for durability and recovery |
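The components in the table above can be sketched in-process with Python's standard library: `queue.Queue` stands in for the broker, a background thread plays the worker, and a plain dict plays the persistence layer. All names here are illustrative, not a real framework API.

```python
import queue
import threading

# In-process stand-ins for the architecture components:
# the Queue is the broker, the thread is a worker, and
# `results` plays the role of the state/persistence layer.
broker = queue.Queue()
results = {}

def worker():
    """Poll the broker, execute tasks, and record state."""
    while True:
        task = broker.get()          # blocks until a task arrives
        if task is None:             # sentinel: shut the worker down
            break
        results[task["id"]] = f"processed {task['payload']}"
        broker.task_done()           # acknowledge completion

def enqueue(task_id, payload):
    """What the web server does: validate, enqueue, return instantly."""
    broker.put({"id": task_id, "payload": payload})

# Start one worker; starting more threads adds capacity.
t = threading.Thread(target=worker, daemon=True)
t.start()

enqueue(1, "photo.jpg")
enqueue(2, "video.mp4")
broker.join()                        # wait until all tasks are acknowledged
broker.put(None)                     # stop the worker
```

Swapping `queue.Queue` for Redis, SQS, or RabbitMQ changes the transport, not the shape of the loop.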
Real-World Use Cases
Task workers power the apps you use every day. Here are the most common patterns:
E-commerce Workflows
Example: Place order → Process payment, update inventory, send confirmation
Why async: Multiple systems need coordination without blocking the user
Data Processing
Example: Import CSV → Clean data, validate, generate reports, send notifications
Why async: Large datasets require time-intensive processing
Media Processing
Example: Upload video → Create thumbnails, transcode formats, extract metadata
Why async: Media operations can take minutes to hours
Popular Task Worker Frameworks
You don't build task workers from scratch. Here are the most popular tools:
🚀 Simple Task Queues (Start Here)
Celery
Most popular Python task queue. Works with Django/Flask.
RabbitMQ
Popular message broker. Flexible routing, reliable delivery.
Kafka
Distributed streaming platform. High throughput, event-driven architectures.
☁️ Cloud-Managed (Zero Setup)
Amazon SQS
AWS message queue. Fully managed, just send tasks.
Google Cloud Tasks
Google's task queue with automatic scaling.
🎭 Workflow Orchestration (Production Scale)
Conductor ⭐
Open-source orchestration platform. Visual workflows, durable execution, scales to billions of tasks.
AWS Step Functions
AWS workflow orchestration. Serverless, integrates with AWS services. Limited customization.
Temporal
Code-only workflows with durable execution. Powerful, but a steep learning curve and no visual workflow designer.
🤔 Which Should You Choose?
| If you need... | Use this |
|---|---|
| Simple async tasks (Python) | Celery |
| Message queuing | RabbitMQ, Kafka |
| Zero infrastructure | SQS + Lambda |
| ⭐ Complex workflows at scale with durable execution | Conductor |
How It Actually Works (Simple Pseudocode)
Let's look at the actual mechanics. Here's the simplest possible example:
📧 Example: Sending Welcome Emails
🏭 Producer (Your Web App)
```python
# When user registers
def register_user(email, name):
    # Save user to database
    user = User.create(email=email, name=name)
    # Queue the email task
    queue.send({
        'task': 'send_welcome_email',
        'email': email,
        'name': name
    })
    # Instant response!
    return "Welcome! Check your email."
```
👷 Worker (Background Process)
```python
# Worker runs separately, forever
while True:
    # Check for tasks
    task = queue.receive()
    if task and task['task'] == 'send_welcome_email':
        # Do the work
        send_email(
            to=task['email'],
            subject=f"Welcome {task['name']}!",
            body="Thanks for signing up!"
        )
        # Mark task as done
        queue.ack(task)
    # Wait a bit, then check again
    sleep(1)
```
🔄 The Polling Loop
```mermaid
flowchart TD
    Start([Worker Starts]) --> Poll[Poll Queue: Any tasks?]
    Poll --> Receive[Receive Task from Queue]
    Receive --> Execute[Execute Task Logic]
    Execute --> Ack[Acknowledge Task Complete]
    Ack --> Poll
    style Start fill:#7dd3fc,stroke:#7dd3fc,color:#0f172a
    style Poll fill:#1e293b,stroke:#7dd3fc,color:#e5e7eb
    style Receive fill:#1e293b,stroke:#7dd3fc,color:#e5e7eb
    style Execute fill:#1e293b,stroke:#10b981,color:#e5e7eb
    style Ack fill:#1e293b,stroke:#10b981,color:#e5e7eb
```
💡 Key Points:
- Workers continuously poll - they never stop asking for work
- Multiple workers can poll the same queue simultaneously
- Tasks are acknowledged after successful completion
- The cycle repeats forever - workers always come back for more
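The points above are easy to demonstrate: start several workers against one queue and they split the backlog automatically, with no coordination code. A stdlib sketch with threads standing in for worker processes (names are illustrative):

```python
import queue
import threading
from collections import Counter

task_queue = queue.Queue()
done_by = Counter()          # counts tasks completed per worker
lock = threading.Lock()

def worker(name):
    while True:
        task = task_queue.get()
        if task is None:             # shutdown sentinel
            task_queue.task_done()
            break
        with lock:
            done_by[name] += 1       # "execute" the task
        task_queue.task_done()       # acknowledge completion

# Enqueue 20 tasks, then start 3 workers polling the SAME queue.
for i in range(20):
    task_queue.put({"task": "send_welcome_email", "user": i})

workers = [threading.Thread(target=worker, args=(f"w{n}",)) for n in range(3)]
for t in workers:
    t.start()
for _ in workers:
    task_queue.put(None)             # one sentinel per worker
for t in workers:
    t.join()

print(sum(done_by.values()))         # prints 20: every task done exactly once
```

How the 20 tasks split between the three workers varies run to run; the total is always 20.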
💡 Key Insights
| Concept | Why it matters |
|---|---|
| 🔄 Workers Never Stop | They continuously poll the queue asking "Got work for me?" |
| ⚡ Multiple Workers Share Work | Run 10 workers polling the same queue - they automatically share tasks |
| 🛡️ Fault Tolerance Built-In | If a worker crashes, others keep running. Tasks go back to the queue. |
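The fault-tolerance row above hinges on acknowledging only *after* success: if execution raises, the task is put back on the queue and picked up on the next poll (or by another worker). A minimal sketch with a simulated transient failure:

```python
import queue

task_queue = queue.Queue()
task_queue.put({"task": "send_email", "attempts": 0})

def flaky_send_email(task):
    # Simulated transient failure on the first attempt only.
    if task["attempts"] == 0:
        raise RuntimeError("SMTP timeout")

processed = []
while not task_queue.empty():
    task = task_queue.get()
    try:
        flaky_send_email(task)
        processed.append(task)   # success: acknowledged, not re-queued
    except RuntimeError:
        task["attempts"] += 1
        task_queue.put(task)     # failure: task goes back to the queue

print(processed[0]["attempts"])  # prints 1: succeeded on the retry
```

Real brokers implement the same idea with visibility timeouts (SQS) or unacked message redelivery (RabbitMQ).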
Common Design Patterns
Once you understand the basics, here are the patterns you'll use most often:
1. Fire and Forget
Most common. Use case: Send email, log event, update cache
Pattern: Queue task → Don't wait for result
```python
# Just send it and forget
queue.send('send_welcome_email', email=user.email)
return "Registration complete!"  # Don't wait
```
2. Pipeline (Chain)
Common. Use case: Video processing: Download → Transcode → Upload → Notify
Pattern: Task A output becomes Task B input
```python
# Step 1: Download video
def download_video(url):
    local_path = download(url)
    queue.send('transcode_video', path=local_path)

# Step 2: Transcode
def transcode_video(path):
    output_path = transcode(path)
    queue.send('upload_video', path=output_path)

# Step 3: Upload
def upload_video(path):
    cdn_url = upload_to_cdn(path)
    queue.send('notify_user', url=cdn_url)
```
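The chain can be simulated end to end with a tiny dispatcher: each step enqueues the name of the next task, and a loop routes messages to handler functions, just as a worker would. All paths and helper names below are illustrative stand-ins:

```python
import queue

task_queue = queue.Queue()
events = []   # records pipeline stages in execution order

def download_video(url):
    events.append(("downloaded", url))
    task_queue.put(("transcode_video", "/tmp/raw.mp4"))

def transcode_video(path):
    events.append(("transcoded", path))
    task_queue.put(("upload_video", "/tmp/out.mp4"))

def upload_video(path):
    events.append(("uploaded", path))
    task_queue.put(("notify_user", "https://cdn.example.com/out.mp4"))

def notify_user(url):
    events.append(("notified", url))

HANDLERS = {
    "transcode_video": transcode_video,
    "upload_video": upload_video,
    "notify_user": notify_user,
}

# Kick off the chain, then drain the queue like a worker would.
download_video("https://example.com/video.mp4")
while not task_queue.empty():
    name, arg = task_queue.get()
    HANDLERS[name](arg)

print([stage for stage, _ in events])
# ['downloaded', 'transcoded', 'uploaded', 'notified']
```

Because each step only knows the *name* of the next task, the stages can run on entirely different machines.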
3. Fan-out (Parallel)
Less common. Use case: Process large CSV by splitting into chunks
Pattern: Split work across multiple workers
```python
# Split large file into chunks
def process_large_file(file_path):
    chunks = split_file(file_path, chunk_size=1000)
    # Send each chunk to different workers
    for i, chunk in enumerate(chunks):
        queue.send('process_chunk', chunk=chunk, chunk_id=i)
```
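A runnable sketch of fan-out, with a thread pool standing in for the worker fleet: each chunk is processed independently, then the partial results are combined (fan-in). The helpers below are illustrative, not part of any framework:

```python
from concurrent.futures import ThreadPoolExecutor

def split_rows(rows, chunk_size):
    """Split a list of rows into fixed-size chunks."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]

def process_chunk(chunk):
    # Stand-in for real per-chunk work (parse, validate, aggregate).
    return sum(chunk)

rows = list(range(1, 101))             # pretend these are CSV rows
chunks = split_rows(rows, chunk_size=25)

# Fan out: each chunk goes to a different worker thread.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(process_chunk, chunks))

# Fan in: combine the partial results.
total = sum(partials)
print(total)  # 5050
```

The fan-in step is what orchestrators formalize as a JOIN: no chunk's result is final until all branches report back.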
Why Choose Conductor?
Conductor OSS is the battle-tested orchestration engine that powers some of the world's largest distributed systems.
🎯 Visual Workflow Designer
Design complex workflows with a drag-and-drop interface. No more buried business logic in code.
💾 Durable Execution
Workflows survive system failures and restarts. State is persisted automatically, ensuring no data loss.
🔄 Advanced Error Handling
Configurable retries, exponential backoff, rate limiting, and automatic compensation for failures.
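Exponential backoff, as described above, follows a simple rule: wait `base * 2**attempt` between attempts, up to a retry limit. A minimal sketch with delays shortened so it runs instantly (the flaky task is a simulated stand-in):

```python
import time

def retry_with_backoff(fn, max_retries=3, base_delay=0.01):
    """Call fn, retrying on failure with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise                              # out of retries: surface the error
            time.sleep(base_delay * 2 ** attempt)  # 0.01s, 0.02s, 0.04s, ...

calls = {"n": 0}

def flaky_task():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(retry_with_backoff(flaky_task))  # prints ok, after two retried failures
```

Production systems usually add jitter (a random factor on each delay) so that many failing workers don't all retry at the same instant.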
📊 Production Monitoring
Real-time execution tracking, metrics, alerting, and debugging tools for operational excellence.
🤖 AI & Agentic Workflows
Orchestrate LLM chains, multi-agent systems, and AI pipelines with human-in-the-loop capabilities.
🤖 Agentic Task Workers
Modern AI applications require orchestrating multiple LLM calls, tool executions, and decision points. Conductor excels at this:
| Capability | What it does |
|---|---|
| Multi-Agent Orchestration | Coordinate multiple AI agents (researcher, writer, reviewer) with conditional logic and parallel execution |
| Human-in-the-Loop | Pause AI workflows for human approval, then seamlessly resume execution |
| LLM Chain Management | Build RAG pipelines, chain-of-thought reasoning, and tool-using agents with automatic retry |
| State & Context Tracking | Maintain conversation context and decision history across long-running AI workflows |
Conductor vs Alternatives
| Feature | Conductor | Temporal | SQS + Lambda | RabbitMQ |
|---|---|---|---|---|
| Visual Workflow Design | ✅ | ❌ | ❌ | ❌ |
| Durable Execution | ✅ | ✅ | ⚠️ | ❌ |
| Error Handling & Retries | ✅ | ✅ | ⚠️ | ❌ |
| Workflow Versioning | ✅ | ⚠️ | ❌ | ❌ |
| Production Monitoring | ✅ | ⚠️ | ⚠️ | ⚠️ |
Conductor Workflow Example
```json
{
  "name": "order_fulfillment_workflow",
  "version": 1,
  "tasks": [
    {
      "name": "validate_order",
      "taskReferenceName": "validate_order_ref",
      "type": "SIMPLE"
    },
    {
      "name": "process_payment",
      "taskReferenceName": "process_payment_ref",
      "type": "SIMPLE",
      "retryCount": 3
    },
    {
      "name": "update_inventory",
      "taskReferenceName": "update_inventory_ref",
      "type": "SIMPLE"
    },
    {
      "name": "send_notification",
      "taskReferenceName": "send_notification_ref",
      "type": "FORK_JOIN",
      "forkTasks": [
        [{"name": "send_email", "taskReferenceName": "email_ref", "type": "SIMPLE"}],
        [{"name": "send_sms", "taskReferenceName": "sms_ref", "type": "SIMPLE"}]
      ]
    },
    {
      "name": "join_notifications",
      "taskReferenceName": "join_notifications_ref",
      "type": "JOIN",
      "joinOn": ["email_ref", "sms_ref"]
    }
  ]
}
```
Ready to Scale with Conductor?
Join thousands of companies using Conductor OSS to orchestrate their mission-critical workflows.
Quick Start
```shell
# Install Conductor CLI
brew tap conductor-oss/conductor
brew install orkes

# alternatively, use npm
# npm install -g @io-orkes/conductor-cli

# Start Conductor server
orkes server-dev start

# Navigate to http://localhost:8080/
```