Disclaimer: A significant portion of this project's code is intentionally authored with AI assistance (pair‑programming style) via pull requests that still pass through normal version control, code review, and CI quality gates. All generated contributions are curated, adjusted, and ultimately owned by the repository maintainer. If you spot something that can be improved, please open an issue or PR.
FileHorizon is an open-source, container-ready file transfer and orchestration system. Designed as a modern alternative to heavyweight integration platforms, it provides a lightweight yet reliable way to move files across local/UNC paths, FTP, and SFTP while ensuring observability and control. By leveraging Redis for distributed coordination, FileHorizon can scale out to multiple parallel containers without duplicate processing, making it suitable for both on-premises and hybrid cloud deployments.
Configuration is centralized through Azure App Configuration and Azure Key Vault, enabling secure, dynamic management of connections and destinations. With OpenTelemetry at its core, FileHorizon delivers unified logging, metrics, and tracing out of the box—no separate logging stack required. The system emphasizes safety and consistency, ensuring files are only picked up once they are fully written at the source.
FileHorizon is built for teams that need the reliability of managed file transfer (MFT) but want the flexibility, transparency, and scalability of modern open-source tooling.
This repository includes a multi-stage Dockerfile for building a lean runtime image that runs as a non-root user.
WSL / containerd environments: If you're on WSL2 using Rancher Desktop / Lima / containerd, prefer `nerdctl` over the classic `docker` CLI. The examples below show both forms where it matters. Mixing `docker` (Moby) and `nerdctl` (containerd) against different runtimes in the same workspace can produce confusing state (images not found, networks missing, etc.). Pick one consistently; on WSL + containerd, choose `nerdctl`.
docker build -t filehorizon:dev .
Optional build args:
- `BUILD_CONFIGURATION` (default `Release`)
- `UID` / `GID` to align the container user with host filesystem permissions
docker run --rm \
-p 8080:8080 \
-e ASPNETCORE_ENVIRONMENT=Development \
-v C:/Temp/FileHorizon/InboxA:/data/inboxA:ro \
-v C:/Temp/FileHorizon/OutboxA:/data/outboxA \
filehorizon:dev
Health endpoint:
curl http://localhost:8080/health
The image defines user appuser (UID 1001 by default). If you need to write into mounted volumes, ensure host directories grant appropriate permissions. Override UID/GID at build time if integrating with existing volume ownership models.
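For example, to align the container user with your current host account at build time (a sketch using the `UID`/`GID` build args described above):

```
docker build \
  --build-arg BUILD_CONFIGURATION=Release \
  --build-arg UID=$(id -u) \
  --build-arg GID=$(id -g) \
  -t filehorizon:dev .
```

Files written into mounted volumes by the container will then carry your host user's ownership.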
Runtime configuration is provided via appsettings.json / environment variables. To override via environment variables, use the standard ASP.NET Core naming pattern, e.g.:
docker run --rm -p 8080:8080 \
-e "Pipeline__Role=All" \
-e "Features__EnableFileTransfer=true" \
filehorizon:dev
File discovery is handled by protocol-specific pollers composed by a multi-protocol coordinator. Each poller can be toggled independently via feature flags so you can roll out new protocols safely.
Feature flags (section Features):
| Flag | Default | Purpose |
|---|---|---|
| `EnableLocalPoller` | `true` | Enable local/UNC directory polling sources configured under `FileSources` (legacy/local). |
| `EnableFtpPoller` | `false` | Enable FTP remote sources listed in `RemoteFileSources:Sources`. |
| `EnableSftpPoller` | `false` | Enable SFTP remote sources listed in `RemoteFileSources:Sources`. |
| `EnableFileTransfer` | `false` | Perform the actual transfer/move (side effects). When false, the pipeline simulates discovery only. |
Environment variable examples:
Features__EnableLocalPoller=true
Features__EnableFtpPoller=false
Features__EnableSftpPoller=true
Features__EnableFileTransfer=true
If all three poller flags are disabled, the composite poller runs with an empty source set (a harmless no-op). This is useful in staging environments that should only process already-enqueued events.
Remote (FTP/SFTP) directories are defined under the RemoteFileSources options section. Each source entry specifies protocol, connection details, remote path, and readiness behavior.
Example appsettings.json excerpt:
{
"RemoteFileSources": {
"Sources": [
{
"Name": "UpstreamFtp",
"Protocol": "Ftp",
"Host": "ftp.example.com",
"Port": 21,
"RemotePath": "/drop",
"UsernameSecret": "secrets:ftp-user",
"PasswordSecret": "secrets:ftp-pass",
"MinStableSeconds": 5
},
{
"Name": "PartnerSftp",
"Protocol": "Sftp",
"Host": "sftp.partner.net",
"Port": 22,
"RemotePath": "/inbound",
"UsernameSecret": "secrets:sftp-user",
"PasswordSecret": "secrets:sftp-pass",
"PrivateKeySecret": "secrets:sftp-key",
"PrivateKeyPassphraseSecret": "secrets:sftp-key-pass",
"MinStableSeconds": 8
}
]
}
}

Environment variable form (first FTP source):
RemoteFileSources__Sources__0__Name=UpstreamFtp
RemoteFileSources__Sources__0__Protocol=Ftp
RemoteFileSources__Sources__0__Host=ftp.example.com
RemoteFileSources__Sources__0__Port=21
RemoteFileSources__Sources__0__RemotePath=/drop
RemoteFileSources__Sources__0__UsernameSecret=secrets:ftp-user
RemoteFileSources__Sources__0__PasswordSecret=secrets:ftp-pass
RemoteFileSources__Sources__0__MinStableSeconds=5
Validation rules enforced at startup:
- `Name`, `Protocol`, `Host`, and `RemotePath` are required.
- `Port` must be > 0.
- Appropriate credential secret(s) must be present (username/password or key for SFTP; username/password for FTP).
- `MinStableSeconds` must be >= 0.
Secrets are referenced indirectly (UsernameSecret, PasswordSecret, etc.). The application resolves them through an ISecretResolver abstraction. In development a simple in-memory + environment variable resolver is used; production hosts should plug in Azure Key Vault (or alternative) without changing poller code.
Files are only enqueued once their size remains stable for MinStableSeconds. A per-file observation snapshot retains last size & timestamp; unstable files are skipped (counted in metrics) until stable. This minimizes partial file ingestion.
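The readiness check can be illustrated with a tiny shell sketch (not the actual implementation; it compares two size observations one second apart, standing in for `MinStableSeconds`):

```shell
# Observe a file's size twice; only a file whose size is unchanged
# across the window is considered stable (ready to enqueue).
f=$(mktemp)
printf 'complete payload' > "$f"

size1=$(wc -c < "$f")   # first observation
sleep 1                 # stability window (stand-in for MinStableSeconds)
size2=$(wc -c < "$f")   # second observation

if [ "$size1" -eq "$size2" ]; then
  status=stable
else
  status=unstable
fi
echo "$f is $status"
rm -f "$f"
```

A writer still appending between the two observations would be reported unstable and re-checked on the next cycle.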
Each remote source tracks consecutive failures. An exponential backoff (base 5s doubling up to 5 minutes) delays subsequent attempts after errors (connection, auth, listing). A single success resets the backoff window.
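The resulting delay schedule (5 s base, doubling per consecutive failure, capped at 5 minutes) can be tabulated with a quick sketch:

```shell
# Exponential backoff: 5s base, doubling per consecutive failure, 300s (5 min) cap.
delay=5
schedule=""
for attempt in 1 2 3 4 5 6 7 8; do
  schedule="$schedule ${delay}s"
  delay=$((delay * 2))
  if [ "$delay" -gt 300 ]; then delay=300; fi
done
echo "backoff schedule:$schedule"
```

After six consecutive failures the delay pins at 300 s; a single success resets it to the 5-second base.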
The orchestrated processor is enabled by default. It uses a modular orchestrator that selects a protocol-specific reader, routes the file via rules, and writes to a destination sink. This unlocks multi-destination routing and a clean separation of concerns.
Key pieces:
- Readers: `local`, `sftp` (FTP reader pending). The orchestrator selects by `FileReference.Protocol`.
- Router: matches by protocol and path to a single destination (current implementation is 1:1).
- Sinks: currently a local filesystem sink; remote sinks can be added later.
- Idempotency: prevents duplicate processing (in-memory or Redis-backed store).
File deletion after a successful transfer is now controlled directly by each FileEvent through the DeleteAfterTransfer flag. This removes hidden, config-only coupling in the orchestrator and makes behavior explicit and testable.
How the flag is populated:
| Source Type | Config Flag | Event Field | Behavior |
|---|---|---|---|
| Local (`FileSources`) | `DeleteAfterTransfer` | `DeleteAfterTransfer` | When true, the original source file is deleted after it has been successfully written to its destination(s). |
| SFTP (`RemoteFileSources:Sftp`) | `DeleteAfterTransfer` | `DeleteAfterTransfer` | Remote file removed via the SFTP client after a successful sink write. |
| FTP (`RemoteFileSources:Ftp`) | `DeleteAfterTransfer` | `DeleteAfterTransfer` | Remote file removed via the FTP client after a successful sink write. |
Notes:
- If a remote deletion fails (e.g., a transient network error), the failure is logged at `Warning` level but processing still succeeds; the design is idempotent, so the file may be re-polled if it is still present.
- Local deletions are attempted only if the file still exists; a missing file (e.g., manual cleanup) does not cause failure.
- The Redis-backed queue persists the flag (`deleteAfterTransfer` field), so horizontally scaled workers honor the intent identically.
- This design allows future protocols (e.g., cloud object storage) to project their own deletion flag into the event without additional orchestrator changes.
Example environment variables:
FileSources__Sources__0__DeleteAfterTransfer=true
# Remote SFTP delete after transfer
RemoteFileSources__Sftp__0__DeleteAfterTransfer=true
If neither flag is set (false / omitted), no deletion occurs; files remain at the source.
Example appsettings.json excerpt:
{
"Destinations": {
"Local": [
{
"Name": "OutboxA",
"BasePath": "/data/outboxA",
"Overwrite": true
}
],
"Sftp": [
{
"Name": "PartnerX",
"Host": "sftp.partner.net",
"Port": 22,
"RemotePath": "/outbound",
"UsernameSecret": "secrets:sftp-user",
"PasswordSecret": "secrets:sftp-pass"
}
]
},
"Routing": {
"Rules": [
{
"Match": {
"Protocol": "local",
"PathPattern": "^/data/inboxA/.+\\.txt$"
},
"Destination": "OutboxA"
}
]
},
"Transfer": {
"ChunkSizeBytes": 32768,
"Idempotency": {
"Enabled": true,
"TtlSeconds": 86400
}
}
}

Environment variable form:
Destinations__Local__0__Name=OutboxA
Destinations__Local__0__BasePath=/data/outboxA
Destinations__Local__0__Overwrite=true
Routing__Rules__0__Match__Protocol=local
Routing__Rules__0__Match__PathPattern=^/data/inboxA/.+\.txt$
Routing__Rules__0__Destination=OutboxA
Transfer__ChunkSizeBytes=32768
Transfer__Idempotency__Enabled=true
Transfer__Idempotency__TtlSeconds=86400
Notes:
- On Windows, paths are normalized internally; the router matches against a normalized forward-slash path.
- If Redis is enabled (`Redis__Enabled=true`), the idempotency store uses Redis with TTL; otherwise an in-memory store is used.
- Current sink support is the local filesystem; remote sinks may be added in future.
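Since the router matches against a normalized forward-slash path, you can sanity-check a `PathPattern` locally with `grep -E` (illustrative only):

```shell
# Check whether a normalized path would match the routing rule's regex.
pattern='^/data/inboxA/.+\.txt$'
path='/data/inboxA/report1.txt'

if printf '%s\n' "$path" | grep -Eq "$pattern"; then
  matched=yes
  echo "route -> OutboxA"
else
  matched=no
  echo "no matching rule"
fi
```

Windows-style paths are normalized to forward slashes before matching, so write patterns against the normalized form.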
For a deeper overview see docs/processing-architecture.md.
Destinations now support a Service Bus variant alongside Local and Sftp. A Service Bus destination allows routing rules to direct a file's full content into a queue or topic after it has been read. The orchestrator identifies the destination kind and invokes the IFileContentPublisher abstraction instead of a file sink.
Configuration (Destinations:ServiceBus):
{
"Destinations": {
"ServiceBus": [
{
"Name": "Events",
"EntityName": "files-events", // queue or topic name
"IsTopic": false,
"ContentType": "text/plain"
}
]
},
"Routing": {
"Rules": [
{
"Name": "TxtToEvents",
"Protocol": "local",
"PathGlob": "**/*.txt",
"Destinations": ["Events"]
}
]
}
}

Environment variable equivalents (first Service Bus destination):
Destinations__ServiceBus__0__Name=Events
Destinations__ServiceBus__0__EntityName=files-events
Destinations__ServiceBus__0__IsTopic=false
Destinations__ServiceBus__0__ContentType=text/plain
Routing rule notes:
- The `Destinations` array in each rule lists logical destination names (e.g., `Events`). The router resolves the kind (ServiceBus) via the `Destinations` options.
- The current implementation still processes only the first matching destination; multi-destination fan-out is planned.
Processing flow for Service Bus destination:
1. Poller emits a `FileEvent` once the file is stable.
2. Router matches a rule and yields a `DestinationPlan` with `Kind=ServiceBus`.
3. Orchestrator reads the file content stream, converts it to UTF-8 bytes, and builds a `FilePublishRequest`.
4. Publisher sends one message containing the entire file content.
5. Optional source deletion executes if `DeleteAfterTransfer` is true.
Size considerations:
- Ensure file sizes do not exceed Azure Service Bus message limits (Standard: ~256 KB, Premium: ~1 MB). Oversized files will require future chunking logic (not yet implemented).
- Binary files are currently treated as UTF-8 text if routed to Service Bus; set `ContentType` appropriately or avoid routing binary blobs until chunking/streaming support is added.
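A rough pre-flight size check against the Standard-tier limit could look like this (sketch; 262144 bytes = 256 KB):

```shell
# Files above the Service Bus Standard-tier message limit (~256 KB)
# would need the not-yet-implemented chunking logic.
limit=262144
f=$(mktemp)
head -c 1024 /dev/zero > "$f"   # 1 KB sample file

size=$(wc -c < "$f")
if [ "$size" -le "$limit" ]; then
  verdict="publishable"
else
  verdict="too large"
fi
echo "$size bytes: $verdict"
rm -f "$f"
```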
Telemetry:
- The publish operation creates an Activity (`servicebus.publish`) with tags `messaging.system=azure.servicebus` and `messaging.destination=<EntityName>`.
- Failures propagate as `Result.Failure` with categorized messaging errors.
Future enhancements under consideration:
- Retry/backoff for transient publish failures.
- Multi-destination fan-out (e.g., local + Service Bus).
- Session / scheduled messages.
- Line/record splitting with transformation stage before publish.
The Azure Service Bus publisher lets FileHorizon emit the full contents of a processed file as a single message to a queue or topic. (Per‑line record splitting was intentionally deferred to keep v1 scope minimal.)
Add the ServiceBusPublisher section to appsettings.json (or provide via environment variables):
{
"ServiceBusPublisher": {
"ConnectionString": "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=FileHorizonPublish;SharedAccessKey=***",
"MaxConcurrentPublishes": 4,
"EnableTracing": true
}
}

Environment variable equivalents:
ServiceBusPublisher__ConnectionString=Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=FileHorizonPublish;SharedAccessKey=***
ServiceBusPublisher__MaxConcurrentPublishes=4
ServiceBusPublisher__EnableTracing=true
Below is a simplified example showing how you can publish a file (once discovered and considered ready) using the injected IFileContentPublisher.
using FileHorizon.Application.Abstractions;
using FileHorizon.Application.Models;
using FileHorizon.Application.Common;
public sealed class SimplePublishExample
{
private readonly IFileContentPublisher _publisher;
private readonly IFileContentReader _reader;
public SimplePublishExample(IFileContentPublisher publisher, IFileContentReader reader)
{
_publisher = publisher;
_reader = reader;
}
public async Task<Result> PublishFileAsync(string fullPath, string destinationName, CancellationToken ct)
{
// Read bytes via the existing content reader abstraction.
var bytes = await _reader.ReadAsync(fullPath, ct);
var request = new FilePublishRequest(
SourcePath: Path.GetDirectoryName(fullPath)!,
FileName: Path.GetFileName(fullPath),
Content: bytes,
ContentType: "application/octet-stream", // or text/plain, etc.
DestinationName: destinationName, // Queue or topic name
IsTopic: false, // Set true if sending to a Topic
ApplicationProperties: new Dictionary<string,string>
{
["source"] = "inboxA",
["env"] = Environment.GetEnvironmentVariable("ASPNETCORE_ENVIRONMENT") ?? "dev"
}
);
return await _publisher.PublishAsync(request, ct);
}
}

Notes:
- The current implementation sends one message per file (whole content). No line/record splitting.
- `IsTopic` is reserved for topic semantics; the publisher treats queues and topics uniformly at this stage.
- `ApplicationProperties` are promoted to Service Bus application properties for downstream consumers.
- Ensure large files fit within Service Bus message size limits (256 KB Standard / 1 MB Premium); otherwise introduce a pre-processing/chunking layer (future enhancement).
When running in containers you can supply:
ServiceBusPublisher__ConnectionString=Endpoint=sb://demo.servicebus.windows.net/;SharedAccessKeyName=FileHorizonPublish;SharedAccessKey=***
If publishing is conditional, you can wrap calls in a feature flag (future: may add Features__EnableServiceBusEgress). For now, simply avoid invoking the publisher when not desired.
Planned extensions (tracked in Issue #14):
- Optional record/line splitting via a transformation stage.
- Retry with exponential backoff for transient publish failures.
- Topic filters & advanced metadata (session id, scheduled enqueue time).
- Config-driven routing rule: source path ➜ queue/topic name mapping.
All files (local and remote) are assigned a normalized identity key:
`<protocol>://<host>:<port>/<normalized/path>`

Local paths use a normalized form (e.g. `local://_/:0/drive/path/file.txt`). This shared scheme powers duplicate suppression and telemetry tagging.
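Composing the key for a remote file is straightforward (sketch; the exact local-path normalization shown above is handled internally):

```shell
# Compose the normalized identity key: <protocol>://<host>:<port>/<normalized/path>
protocol="sftp"
host="sftp.partner.net"
port=22
path="/inbound/report.csv"   # already forward-slash normalized

key="${protocol}://${host}:${port}${path}"
echo "$key"
```

The same key for the same file on every replica is what makes duplicate suppression work across scaled-out containers.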
Poller metrics (Meter FileHorizon):
| Metric | Type | Description | Key Tags |
|---|---|---|---|
| `filehorizon.poller.poll_cycles` | Counter | Number of poll cycles executed per source | `protocol`, `source` |
| `filehorizon.poller.files.discovered` | Counter | Files discovered (ready) | `protocol`, `source` |
| `filehorizon.poller.files.skipped.unstable` | Counter | Files skipped due to instability | `protocol`, `source` |
| `filehorizon.poller.errors` | Counter | Poll errors (enumeration failures) | `protocol`, `source`, `error.type` |
Traces include an Activity per remote source (poll.source) with tags: protocol, host, source.name, backoff.ms (if applied).
A docker-compose.yml is provided to spin up Redis + the FileHorizon app quickly. The file is compatible with nerdctl compose in containerd environments (Rancher Desktop, Lima, etc.).
Use nerdctl compose (NOT docker compose). Example mapping:
| Purpose | Docker CLI | nerdctl |
|---|---|---|
| Build & up (foreground) | `docker compose up --build` | `nerdctl compose up --build` |
| Detached | `docker compose up -d --build` | `nerdctl compose up -d --build` |
| Scale | `docker compose up -d --scale app=2` | `nerdctl compose up -d --scale app=2` |
| Logs | `docker compose logs -f app` | `nerdctl compose logs -f app` |
| Stop | `docker compose down` | `nerdctl compose down` |
Why: Rancher Desktop (containerd backend) manages images separately from Docker Desktop (Moby). If you run docker build then nerdctl compose up, the image may not exist in containerd and the deployment will fail. Always build with the same tool:
nerdctl build -t filehorizon:dev .
nerdctl compose up -d --build
Quick start (nerdctl):
# Create local data folders (Linux/macOS examples)
mkdir -p _data/inboxA _data/outboxA
# Or on PowerShell (Windows):
New-Item -ItemType Directory -Path _data/inboxA,_data/outboxA | Out-Null
# Build and start (foreground)
nerdctl compose up --build
# Or start detached
nerdctl compose up -d --build
# Check service status
nerdctl compose ps
# Tail logs
nerdctl compose logs -f app
Health check:
curl http://localhost:8080/health
Stop & remove:
nerdctl compose down
The compose file sets sensible defaults:
- `Redis__Enabled=true` enables the Redis Streams queue (falls back to in-memory if false).
- `FileSources__Sources__0__*` defines the first file source (InboxA). Add more sources incrementally:
  - `FileSources__Sources__1__Name=InboxB`
  - `FileSources__Sources__1__Path=/data/inboxB`
- Set `Features__EnableFileTransfer` to `true` to perform actual file transfers. Pipeline orchestration uses `Pipeline__Role` to determine which background services run (see section below); `Features__EnableFileTransfer` only controls whether real file movement occurs.
You can override any value using an .env file placed next to docker-compose.yml:
# .env example
PIPELINE__ROLE=All
FEATURES__ENABLEFILETRANSFER=false
REDIS__ENABLED=true
POLLING__INTERVALMILLISECONDS=500
(Compose automatically loads .env; ensure variable names match exactly.)
- Faster polling during development: `Polling__IntervalMilliseconds=500`
- Larger batch processing: `Polling__BatchReadLimit=25`
- Switch to the local fallback queue: `Redis__Enabled=false`
- Separate stream per environment: `Redis__StreamName=filehorizon:dev:file-events`
To test horizontal scaling (Redis-backed queue required):
nerdctl compose up -d --build --scale app=2
Each replica creates a unique consumer name derived from `Redis__ConsumerNamePrefix`, ensuring cooperative consumption via the shared consumer group.
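A hypothetical sketch of how a replica might derive its unique consumer name (the real suffix strategy is internal; hostname + PID stand in here):

```shell
# Hypothetical: combine the configured prefix with per-replica identifiers
# so each container joins the shared consumer group under a unique name.
prefix="${Redis__ConsumerNamePrefix:-filehorizon-consumer}"
consumer="${prefix}-$(hostname)-$$"
echo "$consumer"
```

Unique names let Redis Streams track per-consumer pending entries while the group as a whole shares the workload.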
For a clean separation, run a dedicated poller alongside one or more workers.

Example: one poller + multiple workers:
# poller (enqueue only, no processing)
Pipeline__Role=Poller
Features__EnableFileTransfer=false
# worker (process only)
Pipeline__Role=Worker
Features__EnableFileTransfer=true
Alternatively (common simpler pattern):
# single poller that also processes
Pipeline__Role=All
Features__EnableFileTransfer=true
# additional workers (no polling - processing only)
Pipeline__Role=Worker
Features__EnableFileTransfer=true
Reminder (WSL + containerd): If you previously built with `docker build`, rebuild with `nerdctl build` to ensure the image exists in the containerd image store before scaling.
Use these commands (or similar) to generate lots of files quickly in the inboxes.
seq 1 1000 | xargs -I{} -P 8 sh -c 'echo "test" > inboxA/file{}.txt'
seq 1 1000 | xargs -I{} -P 8 sh -c 'echo "test" > inboxB/file{}.txt'
To count the files in a folder you can use:
find . -maxdepth 1 -type f | wc -l
- Switch to a distroless or `-alpine` base (after validating native dependencies).
- Add a read-only root filesystem (`--read-only`) with tmpfs mounts for transient storage.
- Introduce health/liveness/readiness probes in orchestration environments.
Recently completed (this branch): FTP & SFTP pollers, feature flags, remote readiness/backoff, poller telemetry.
Upcoming / still planned:
- Persistent idempotency & deduplication registry (Redis / durable store)
- Message claiming / retry hardening for Redis Streams
- Service Bus ingress / egress bridge
- Extended telemetry (error categorization, size histograms)
- Configurable secret resolver (production Key Vault implementation)
- Optional archive / checksum verification stage
FileHorizon ships with unified tracing, metrics, and structured logging via OpenTelemetry. No other logging framework is used.
- Traces: file processing spans (`file.process`, `file.orchestrate`), reader/sink spans (`reader.open`, `sink.write`), queue enqueue/dequeue spans (`queue.enqueue`, `queue.dequeue`), and a lifecycle span (`pipeline.lifetime`).
- Metrics (Meter `FileHorizon`):
  - `files.processed` (counter)
  - `files.failed` (counter)
  - `bytes.copied` (counter)
  - `queue.enqueued` (counter)
  - `queue.enqueue.failures` (counter)
  - `queue.dequeued` (counter)
  - `queue.dequeue.failures` (counter)
  - `processing.duration.ms` (histogram)
  - `poll.cycle.duration.ms` (histogram)
If enabled (default), metrics are exposed at GET /metrics using the Prometheus scrape format. Health remains at /health.
Telemetry section (appsettings or env variables):
| Key | Default | Description |
|---|---|---|
| `EnableTracing` | `true` | Enable the Activity/trace pipeline |
| `EnableMetrics` | `true` | Enable metrics collection |
| `EnableLogging` | `true` | Route structured logs through the OTEL exporter |
| `EnablePrometheus` | `true` | Expose the `/metrics` endpoint |
| `EnableOtlpExporter` | `false` | Enable the OTLP exporter for traces/metrics/logs |
| `OtlpEndpoint` | null | OTLP gRPC or HTTP endpoint (e.g. `http://otel-collector:4317`) |
| `OtlpHeaders` | null | Additional OTLP headers (`key=value;key2=value2`) |
| `OtlpInsecure` | `false` | Allow insecure (no TLS) connections if the collector requires it |
| `ServiceName` | `FileHorizon` | Override the `service.name` resource attribute |
| `ServiceVersion` | Assembly version | Override `service.version` |
| `DeploymentEnvironment` | `ASPNETCORE_ENVIRONMENT` | Adds the `deployment.environment` attribute |
Environment variable mapping uses double underscores, e.g.:
Telemetry__EnableOtlpExporter=true
Telemetry__OtlpEndpoint=http://otel-collector:4317
Telemetry__DeploymentEnvironment=dev
Point to a collector (recommended) rather than vendors directly:
Telemetry__EnableOtlpExporter=true
Telemetry__OtlpEndpoint=http://otel-collector:4317
If headers are required (HTTP/Protobuf variant):
Telemetry__OtlpHeaders=api-key=XYZ123
Add a collector and Prometheus service (sketch):
otel-collector:
image: otel/opentelemetry-collector:latest
command: ["--config=/etc/otel-collector-config.yaml"]
volumes:
- ./otel-collector-config.yaml:/etc/otel-collector-config.yaml:ro
ports:
- "4317:4317" # OTLP gRPC
prometheus:
image: prom/prometheus:latest
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
ports:
  - "9090:9090"

The application container only needs the relevant environment variables; `/metrics` will be scraped by Prometheus.
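For completeness, a minimal `otel-collector-config.yaml` to pair with the sketch above might look like this (illustrative assumption, not shipped with the repo; it receives OTLP and writes to the collector's own log output):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  batch: {}
exporters:
  debug: {}
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```

Swap the `debug` exporter for a real backend (Jaeger, Tempo, a vendor OTLP endpoint) in production.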
Currently logs go through the OpenTelemetry logging provider. If an OTLP exporter is enabled they will be forwarded to the collector; otherwise they remain local (console).
| Span | Key | Example |
|---|---|---|
| `file.process` | `file.id` | `f_123` |
| `file.process` | `file.protocol` | `local` |
| `file.process` | `file.source_path` | `/data/inbox/file1.txt` |
| `file.process` | `file.size_bytes` | `2048` |
| `reader.open` | `file.protocol` | `sftp` |
| `sink.write` | `sink.name` | `OutboxA` |
| `queue.enqueue` / `queue.dequeue` | `messaging.system` | `redis` |
| `queue.enqueue` / `queue.dequeue` | `messaging.destination` | `filehorizon:file-events` |
| `queue.dequeue` | `messaging.batch.message_count` | `10` |
These may evolve toward official semantic conventions as they stabilize.
curl http://localhost:8080/metrics | head -n 40
You should see counters like `files_processed_total` and histogram buckets like `processing_duration_ms_bucket` (Prometheus exposition format).
| Issue | Likely Cause | Remedy |
|---|---|---|
| No metrics at /metrics | Prometheus disabled | Set Telemetry__EnablePrometheus=true |
| No traces in collector | OTLP exporter disabled | Set Telemetry__EnableOtlpExporter=true and endpoint |
| Service name wrong | Override provided | Adjust Telemetry__ServiceName |
| High cardinality risk | Dynamic paths tagged | Consider trimming or hashing path tags in future |
Planned / candidate improvements:
- File size distribution histogram
- Error categorization with semantic convention attributes (e.g. `error.type`)
- Redis pending / claim latency metrics
- Service Bus ingress / egress spans and metrics
- De-duplication cache hit/miss counters
- Optional span events for validation / archive stages
- Configurable sampling (probabilistic/parent-based) via Telemetry options
Contributions welcome—feel free to open issues or draft PRs as the architecture evolves.
The runtime now supports explicit role selection separating file discovery (polling) from event processing.
Roles are configured via Pipeline:Role (or environment variable Pipeline__Role).
Available values:
| Role | Hosted Services Started | Typical Use Case |
|---|---|---|
| All | Polling + Processing | Local dev / simple single-node deploy |
| Poller | Polling only | Dedicated ingestion node |
| Worker | Processing only | Horizontal scale-out workers |
Example environment overrides:
Pipeline__Role=Poller
Pipeline__Role=Worker
Pipeline__Role fully determines polling vs processing. The only remaining feature flag in this area is Features__EnableFileTransfer which toggles actual file copy/move side effects.