Merged
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -687,6 +687,8 @@ prompting/
# Claude
CLAUDE.md
.claude/*
.mcp.json
CONTRIBUTION_GUIDE.md

# Windsurf
.qodo
2 changes: 1 addition & 1 deletion runner/src/unstract/runner/runner.py
@@ -455,7 +455,7 @@ def run_container(
)

container_config = self.client.get_container_run_config(
- command=["/bin/sh", "-c", container_command],
+ command=["dumb-init", "/bin/sh", "-c", container_command],
file_execution_id=file_execution_id,
shared_log_dir=shared_log_dir, # Pass directory for mounting
container_name=container_name,
27 changes: 27 additions & 0 deletions runner/tests/signal-handling-demo/Dockerfile
@@ -0,0 +1,27 @@
# Signal Handling Test Container
# Replicates production environment for SIGTERM signal forwarding testing

FROM python:3.12.9-slim

LABEL maintainer="Unstract Team" \
description="Signal Handling Test Container" \
version="1.0"

# Set environment variables for proper Python output
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1

# Install dumb-init (same as production tools)
RUN apt-get update && \
apt-get install -y --no-install-recommends dumb-init && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Copy demo files
COPY shell-signal-test.py /app/

# Make the script executable
RUN chmod +x /app/shell-signal-test.py
73 changes: 73 additions & 0 deletions runner/tests/signal-handling-demo/README.md
@@ -0,0 +1,73 @@
# **Manual Testing:**

## **Test 1: BROKEN (Shell Wrapper)**
```bash
# Build and run
docker build -t signal-test .
docker run -d --name broken-test signal-test /bin/sh -c "python3 /app/shell-signal-test.py"
```

## **Test 2: WORKING (Dumb-init Wrapper)**
```bash
# Run with dumb-init
docker run -d --name working-test signal-test dumb-init /bin/sh -c "python3 /app/shell-signal-test.py"
```
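
The commands above only start the containers. To exercise signal forwarding, each container then needs a SIGTERM, which `docker stop` delivers (followed by SIGKILL once its timeout expires). The sequence can be sketched in Python as below. This is an illustrative sketch, not part of the PR: it assumes the `signal-test` image has been built, the Docker CLI is on `PATH`, and no container with the given name already exists. The helper names `build_run_command` and `run_signal_test`, the warm-up delay, and the 60-second stop timeout are all assumptions chosen for the example.

```python
import subprocess
import time

IMAGE = "signal-test"  # image tag used in the build step above
SCRIPT_CMD = "python3 /app/shell-signal-test.py"


def build_run_command(name: str, use_dumb_init: bool) -> list[str]:
    """Return the `docker run` argv for one test container (hypothetical helper)."""
    wrapper = ["dumb-init"] if use_dumb_init else []
    return [
        "docker", "run", "-d", "--name", name, IMAGE,
        *wrapper, "/bin/sh", "-c", SCRIPT_CMD,
    ]


def run_signal_test(name: str, use_dumb_init: bool, warmup_s: float = 5.0) -> str:
    """Start a container, send SIGTERM via `docker stop`, and return its logs."""
    subprocess.run(build_run_command(name, use_dumb_init), check=True)
    time.sleep(warmup_s)  # let the program print a few iterations first
    # `docker stop` sends SIGTERM and falls back to SIGKILL after the timeout.
    # The 60 s value is an assumption, picked to leave room for cleanup.
    subprocess.run(["docker", "stop", "-t", "60", name], check=True)
    logs = subprocess.run(
        ["docker", "logs", name], capture_output=True, text=True, check=True
    )
    subprocess.run(["docker", "rm", name], check=True)  # remove the stopped container
    return logs.stdout
```

Calling `run_signal_test("broken-test", use_dumb_init=False)` and then `run_signal_test("working-test", use_dumb_init=True)` should yield logs comparable to the transcripts in the results section.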

# Results:

## **Test 1: BROKEN (Shell Wrapper)**

```
🚀 Signal Test Program Started
📋 PID: 7
⏰ Start Time: 09:07:15
🎯 Waiting for SIGTERM or SIGINT signal...
💡 Send 'kill -TERM 7' or 'kill -INT 7' from another terminal
🔄 Program will log status every 2 seconds until signal received
⚠️ SIGTERM Test: If no graceful shutdown appears, signal was NOT forwarded!
✅ Expected: Graceful shutdown messages should appear within 10 seconds
⏱️ [09:07:15] Running... (iteration 1, elapsed: 0.0s)
⏱️ [09:07:17] Running... (iteration 2, elapsed: 2.0s)
⏱️ [09:07:19] Running... (iteration 3, elapsed: 4.0s)
⏱️ [09:07:21] Running... (iteration 4, elapsed: 6.0s)
⏱️ [09:07:23] Running... (iteration 5, elapsed: 8.0s)
🔍 [09:07:23] Still waiting for SIGTERM... (No signal received yet)
⏱️ [09:07:25] Running... (iteration 6, elapsed: 10.0s)
⏱️ [09:07:27] Running... (iteration 7, elapsed: 12.0s)
⏱️ [09:07:29] Running... (iteration 8, elapsed: 14.0s)
⏱️ [09:07:31] Running... (iteration 9, elapsed: 16.0s)
⏱️ [09:07:33] Running... (iteration 10, elapsed: 18.0s)
🔍 [09:07:33] Still waiting for SIGTERM... (No signal received yet)
⏱️ [09:07:35] Running... (iteration 11, elapsed: 20.0s)
```
**No SIGTERM reaches the program, so it exits without graceful shutdown.**

## **Test 2: WORKING (Dumb-init Wrapper)**

```
🚀 Signal Test Program Started
📋 PID: 8
⏰ Start Time: 09:07:49
🎯 Waiting for SIGTERM or SIGINT signal...
💡 Send 'kill -TERM 8' or 'kill -INT 8' from another terminal
🔄 Program will log status every 2 seconds until signal received
⚠️ SIGTERM Test: If no graceful shutdown appears, signal was NOT forwarded!
✅ Expected: Graceful shutdown messages should appear within 10 seconds
⏱️ [09:07:49] Running... (iteration 1, elapsed: 0.0s)
⏱️ [09:07:51] Running... (iteration 2, elapsed: 2.0s)
⏱️ [09:07:53] Running... (iteration 3, elapsed: 4.0s)
⏱️ [09:07:55] Running... (iteration 4, elapsed: 6.0s)
⏱️ [09:07:57] Running... (iteration 5, elapsed: 8.0s)
🔍 [09:07:57] Still waiting for SIGTERM... (No signal received yet)
⏱️ [09:07:59] Running... (iteration 6, elapsed: 10.0s)
⏱️ [09:08:01] Running... (iteration 7, elapsed: 12.0s)
🔔 [09:08:02] Received SIGTERM signal!
🔔 [09:08:02] Received SIGTERM signal!
📊 Program ran for: 13.5 seconds
🧹 Starting graceful shutdown procedure...
⏳ Simulating cleanup work (5 seconds)...
[09:08:02] 💾 Saving application state...
```
**Program exits with graceful shutdown.**
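
The difference between the two transcripts comes down to whether the signal handler's messages show up in the log. A minimal sketch of that check follows, assuming the container output has been captured as a string (for example from `docker logs`). The function name `shutdown_status` and its three return labels are hypothetical; the marker strings are the ones printed by `shell-signal-test.py`.

```python
def shutdown_status(logs: str) -> str:
    """Classify captured output by the markers shell-signal-test.py prints."""
    received = "Received SIGTERM signal!" in logs
    completed = "All cleanup completed successfully!" in logs
    if received and completed:
        return "graceful"  # handler ran and finished every cleanup step
    if received:
        return "interrupted"  # handler started but its cleanup was cut short
    return "not-forwarded"  # the program never saw SIGTERM
```

A log that contains neither marker, as in Test 1, is classified as `not-forwarded`.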

## Full results are shown in the screenshot in the PR
127 changes: 127 additions & 0 deletions runner/tests/signal-handling-demo/shell-signal-test.py
@@ -0,0 +1,127 @@
#!/usr/bin/env python3
"""SIGTERM Signal Handling Test Program

This program demonstrates proper SIGTERM signal handling.
It runs for up to 60 seconds, or until it receives a SIGTERM or SIGINT
signal, at which point it performs a graceful shutdown.

Usage:
    python3 shell-signal-test.py
"""

import os
import signal
import sys
import time
from datetime import datetime


class SignalTestProgram:
    def __init__(self):
        self.running = True
        self.start_time = datetime.now()

    def signal_handler(self, signum, frame):
        """Handle SIGTERM and SIGINT signals with realistic cleanup simulation"""
        signal_name = signal.Signals(signum).name
        elapsed = datetime.now() - self.start_time

        print(
            f"\n🔔 [{datetime.now().strftime('%H:%M:%S')}] Received {signal_name} signal!"
        )
        print(f"📊 Program ran for: {elapsed.total_seconds():.1f} seconds")
        print("🧹 Starting graceful shutdown procedure...")
        print("⏳ Simulating cleanup work (5 seconds)...")
        sys.stdout.flush()

        # Simulate realistic cleanup work that takes time
        cleanup_tasks = [
            "💾 Saving application state...",
            "🔌 Closing database connections...",
            "📡 Shutting down network listeners...",
            "🗄️ Flushing file buffers...",
            "🧼 Final cleanup and validation...",
        ]

        for i, task in enumerate(cleanup_tasks, 1):
            print(f"[{datetime.now().strftime('%H:%M:%S')}] {task}")
            sys.stdout.flush()
            time.sleep(1)  # Each cleanup task takes 1 second (5 seconds in total)
            print(
                f"✅ [{datetime.now().strftime('%H:%M:%S')}] Cleanup step {i}/5 completed"
            )
            sys.stdout.flush()

        print(
            f"🎉 [{datetime.now().strftime('%H:%M:%S')}] All cleanup completed successfully!"
        )
        print(f"👋 Goodbye from PID {os.getpid()}!")
        sys.stdout.flush()

        self.running = False

    def run(self):
        """Main program loop"""
        # Register signal handlers for graceful shutdown
        signal.signal(signal.SIGTERM, self.signal_handler)
        signal.signal(signal.SIGINT, self.signal_handler)

        print("🚀 Signal Test Program Started")
        print(f"📋 PID: {os.getpid()}")
        print(f"⏰ Start Time: {self.start_time.strftime('%H:%M:%S')}")
        print("🎯 Waiting for SIGTERM or SIGINT signal...")
        print(
            f"💡 Send 'kill -TERM {os.getpid()}' or 'kill -INT {os.getpid()}' from another terminal"
        )
        print("🔄 Program will log status every 2 seconds until signal received")
        print(
            "⚠️ SIGTERM Test: If no graceful shutdown appears, signal was NOT forwarded!"
        )
        print("✅ Expected: Graceful shutdown messages should appear within 10 seconds")
        print("")
        sys.stdout.flush()  # Force immediate output

        counter = 0
        max_iterations = 30  # Run for max 60 seconds to avoid infinite loops

        while self.running and counter < max_iterations:
            counter += 1
            elapsed = datetime.now() - self.start_time
            timestamp = datetime.now().strftime("%H:%M:%S")

            print(
                f"⏱️ [{timestamp}] Running... (iteration {counter}, elapsed: {elapsed.total_seconds():.1f}s)"
            )

            # Add explicit verification message every 5 iterations
            if counter % 5 == 0:
                print(
                    f"🔍 [{timestamp}] Still waiting for SIGTERM... (No signal received yet)"
                )

            sys.stdout.flush()  # Force immediate output

            try:
                time.sleep(2)
            except KeyboardInterrupt:
                # Handle Ctrl+C as SIGINT
                print("\n🔔 Received Ctrl+C (SIGINT)")
                self.signal_handler(signal.SIGINT, None)
                break

        if counter >= max_iterations:
            print(
                f"\n⏰ [{datetime.now().strftime('%H:%M:%S')}] Maximum runtime reached (60s)"
            )
            print(
                "❌ No SIGTERM signal received - this indicates BROKEN signal forwarding!"
            )

        print("🏁 Program terminated gracefully")
        sys.stdout.flush()
        sys.exit(0)


if __name__ == "__main__":
    program = SignalTestProgram()
    program.run()
2 changes: 1 addition & 1 deletion tools/classifier/Dockerfile
@@ -4,7 +4,7 @@ LABEL maintainer="Zipstack Inc."


# Install dependencies for unstructured library's partition
- RUN apt-get update && apt-get --no-install-recommends -y install libmagic-dev poppler-utils\
+ RUN apt-get update && apt-get --no-install-recommends -y install libmagic-dev poppler-utils dumb-init\
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

2 changes: 1 addition & 1 deletion tools/structure/Dockerfile
@@ -22,7 +22,7 @@ ENV \
RUN apt-get update && \
apt-get install -y --no-install-recommends \
ffmpeg libsm6 libxext6 libmagic-dev poppler-utils \
- libreoffice freetds-dev freetds-bin && \
+ libreoffice freetds-dev freetds-bin dumb-init && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*

2 changes: 1 addition & 1 deletion tools/text_extractor/Dockerfile
@@ -4,7 +4,7 @@ LABEL maintainer="Zipstack Inc."


# Install dependencies for unstructured library's partition
- RUN apt-get update && apt-get --no-install-recommends -y install libmagic-dev poppler-utils\
+ RUN apt-get update && apt-get --no-install-recommends -y install libmagic-dev poppler-utils dumb-init\
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
