feat: WhatsApp/Gmail channels, image vision, voice, PDF/GOG/summarize skills, perf by vsabavat · Pull Request #917 · qwibitai/nanoclaw

vsabavat · 2026-03-10T06:01:42Z

Summary

WhatsApp channel — Baileys-based, pairing code auth, group sync, reconnect logic
Gmail channel — OAuth, full channel mode (emails trigger the agent, agent can reply)
Voice transcription — OpenAI Whisper API, auto-transcribes WhatsApp voice notes
Image vision — WhatsApp image attachments resized + sent to Claude as multimodal content blocks (src/image.ts, updated whatsapp.ts + agent-runner)
PDF reader skill — pdftotext (poppler-utils) in container, handles attachments/URLs/local files
GOG skill — gog CLI available to all container agents
Summarize skill — @steipete/summarize routed through cli/claude (no OpenAI key needed); wrapper script injects API key at startup
Container entrypoint — extracted from inline Dockerfile RUN to container/entrypoint.sh
Compile cache — hash /app/src .ts files at startup; skip tsc when unchanged. Cuts cold start from ~10s to ~1–2s. Cache persists in data/sessions/<group>/agent-runner-dist/ between runs
Poll interval — default lowered to 1000ms, configurable via POLL_INTERVAL env var
Trigger fix — removed ^ anchor from TRIGGER_PATTERN so @AssistantName matches anywhere in a message, not just at the start
Security — shadow .env inside container mount so secrets can't be read from the mounted project root
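
The compile cache item above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function names and the `.src-hash` marker file are hypothetical, and the real entrypoint may store or compare the hash differently.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, relative } from "node:path";

// Hash the relative path and contents of every .ts file under srcDir, in sorted order.
function hashSources(srcDir: string): string {
  const hash = createHash("sha256");
  const walk = (dir: string): void => {
    const entries = readdirSync(dir, { withFileTypes: true });
    entries.sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
    for (const entry of entries) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.name.endsWith(".ts")) {
        hash.update(relative(srcDir, full));
        hash.update(readFileSync(full));
      }
    }
  };
  walk(srcDir);
  return hash.digest("hex");
}

// Hypothetical marker file storing the hash of the last successful compile.
const markerPath = (cacheDir: string): string => join(cacheDir, ".src-hash");

// True when the cached dist was built from the current sources, so tsc can be skipped.
function isCacheFresh(srcDir: string, cacheDir: string): boolean {
  const marker = markerPath(cacheDir);
  return existsSync(marker) && readFileSync(marker, "utf8") === hashSources(srcDir);
}

// Call only after tsc succeeds, so a failed build never looks cached.
function markCompiled(srcDir: string, cacheDir: string): void {
  mkdirSync(cacheDir, { recursive: true });
  writeFileSync(markerPath(cacheDir), hashSources(srcDir));
}

// Demo against throwaway directories.
const demoSrc = mkdtempSync(join(tmpdir(), "src-"));
const demoCache = mkdtempSync(join(tmpdir(), "cache-"));
writeFileSync(join(demoSrc, "index.ts"), "export const x = 1;\n");
console.log(isCacheFresh(demoSrc, demoCache)); // false: nothing compiled yet
markCompiled(demoSrc, demoCache);
console.log(isCacheFresh(demoSrc, demoCache)); // true: sources unchanged
```

Recording the hash only after a successful compile is a deliberate choice in this sketch: a failed build should never leave behind a marker that makes the next start skip tsc.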

Test plan

Send a WhatsApp image → agent describes it
Send a voice note → agent reads the transcription
Send a PDF attachment → agent extracts and summarizes text
Ask agent to summarize a URL → uses cli/claude, no OpenAI error
Send message with @Astra mid-sentence → agent triggers
Send second message to same group → confirm no tsc compile log (cache hit)
npm test passes

🤖 Generated with Claude Code


…lls, perf

Channels & auth:
- Add WhatsApp channel (Baileys, pairing code auth, group sync, reconnect)
- Add Gmail channel (OAuth, full channel mode: emails trigger the agent)
- Add voice transcription via OpenAI Whisper API
- Register all channels in src/channels/index.ts

Image vision:
- src/image.ts: resize + base64-encode WhatsApp image attachments
- src/channels/whatsapp.ts: detect + download images, send as multimodal blocks
- container/agent-runner/src/index.ts: accept imageAttachments in ContainerInput

Container skills:
- container/skills/pdf-reader: extract PDF text via pdftotext (poppler-utils)
- container/skills/gog: gog CLI available to all agents
- container/skills/summarize: @steipete/summarize routed through cli/claude
- container/entrypoint.sh: extracted from inline Dockerfile RUN for readability
- container/summarize-wrapper.sh: injects API key, routes to claude-code
- container/Dockerfile: install poppler-utils, @steipete/summarize, gog globally

Performance:
- Cache compiled agent-runner dist by hashing /app/src .ts files
  First run compiles once; subsequent runs skip tsc (~10s → ~1-2s cold start)
  Cached dist persists in data/sessions/<group>/agent-runner-dist/ on host
- POLL_INTERVAL: default 1000ms, configurable via POLL_INTERVAL env var

Fixes:
- TRIGGER_PATTERN: remove ^ anchor so @astra matches anywhere in a message
- Shadow .env inside container mount so secrets stay out of agent's reach

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
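
To make the "multimodal blocks" step in the commit message above concrete, here is a hedged sketch of wrapping already-resized image bytes in a base64 image content block, the shape the Anthropic Messages API accepts. The helper names `toImageBlock` and `buildUserContent` are hypothetical, and resizing is left out because the source does not say how src/image.ts does it.

```typescript
// Base64 image content block as accepted by the Anthropic Messages API.
interface ImageBlock {
  type: "image";
  source: {
    type: "base64";
    media_type: "image/jpeg" | "image/png" | "image/gif" | "image/webp";
    data: string;
  };
}

interface TextBlock {
  type: "text";
  text: string;
}

type MediaType = ImageBlock["source"]["media_type"];

// Hypothetical helper: wrap raw (already resized) image bytes in a content block.
function toImageBlock(bytes: Buffer, mediaType: MediaType): ImageBlock {
  return {
    type: "image",
    source: { type: "base64", media_type: mediaType, data: bytes.toString("base64") },
  };
}

// Hypothetical helper: place the images before the user's text in one message.
function buildUserContent(text: string, images: ImageBlock[]): Array<ImageBlock | TextBlock> {
  return [...images, { type: "text", text }];
}

const content = buildUserContent("What is in this picture?", [
  toImageBlock(Buffer.from("hi"), "image/jpeg"), // placeholder bytes, not a real JPEG
]);
console.log(content.map((block) => block.type)); // [ 'image', 'text' ]
```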

[email protected] has a peerOptional dep on zod@^3.x but the project uses zod@^4.x for ipc-mcp-stdio.ts. Adding legacy-peer-deps=true lets npm install both without ERESOLVE, matching how the lockfile was generated locally. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

When a user replies to a specific message in WhatsApp, the agent now sees which message was being replied to. Extracts contextInfo.stanzaId from Baileys message objects, stores it as reply_to_id in the messages table (with auto-migration for existing DBs), and emits it as a reply_to attribute on <message> elements in the XML context sent to the agent. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
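
The reply-context change described above might look roughly like the sketch below. Only `contextInfo.stanzaId`, `reply_to_id`, the `reply_to` attribute, and the `<message>` element come from the commit message; the minimal message type (including the `extendedTextMessage` path), the other attribute names, and both function names are assumptions made for illustration.

```typescript
// Minimal structural stand-in for a Baileys message; the real type is far larger.
// Assumption: the reply context sits under extendedTextMessage.contextInfo.
interface MinimalWAMessage {
  message?: {
    extendedTextMessage?: {
      contextInfo?: { stanzaId?: string | null } | null;
    } | null;
  } | null;
}

// Row shape is hypothetical apart from reply_to_id, which the commit names.
interface StoredMessage {
  id: string;
  sender: string;
  content: string;
  reply_to_id: string | null;
}

function extractReplyToId(msg: MinimalWAMessage): string | null {
  return msg.message?.extendedTextMessage?.contextInfo?.stanzaId ?? null;
}

function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Emit reply_to only when the message actually replies to something.
function formatMessage(m: StoredMessage): string {
  const replyAttr = m.reply_to_id ? ` reply_to="${escapeXml(m.reply_to_id)}"` : "";
  return `<message id="${escapeXml(m.id)}" sender="${escapeXml(m.sender)}"${replyAttr}>${escapeXml(m.content)}</message>`;
}

const incoming: MinimalWAMessage = {
  message: { extendedTextMessage: { contextInfo: { stanzaId: "ABC123" } } },
};
console.log(
  formatMessage({ id: "DEF456", sender: "Ana", content: "yes, that one", reply_to_id: extractReplyToId(incoming) }),
);
```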

@JasonOA888

Content pipeline for non-text messages: channel adapters produce RawContentPart[], processContentParts() resolves them into ContentPart[] (download, save, convert), agent runner dispatches to pluggable type handlers. Claude-native types get embedded, non-native types get file-reference injection. Skills can override any handler.

- RawContentPart/ContentPart types (ref + buffer dual input)
- processContentParts() with media download and local storage
- content_parts DB column with string fallback (backward compat)
- Handler registry + dispatch in agent runner
- Media directory mount into containers

Co-Authored-By: @JasonOA888 (qwibitai#902)
Co-Authored-By: @kenmaz (qwibitai#1069)
Co-Authored-By: @vsabavat (qwibitai#917, qwibitai#1055)
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
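
As a rough illustration of the handler registry and dispatch idea in this commit, the sketch below keeps a map from content type to handler and falls back to a file-reference handler for anything unregistered. Every name and type here (`ContentPart` fields, `PromptBlock`, `registerHandler`, `dispatch`) is a simplified assumption, not the commit's actual definitions.

```typescript
// Simplified stand-in for a resolved content part; fields are assumptions.
interface ContentPart {
  type: string;
  path: string;
  mimeType: string;
}

// What a handler hands to the agent: embedded media or a text reference to a file.
type PromptBlock =
  | { kind: "embed"; path: string; mimeType: string }
  | { kind: "text"; text: string };

type Handler = (part: ContentPart) => PromptBlock;

const handlers = new Map<string, Handler>();

// Registering a type again replaces the earlier handler, which is how a skill could override it.
function registerHandler(type: string, handler: Handler): void {
  handlers.set(type, handler);
}

// Fallback for non-native types: tell the agent where the file was saved.
const fileReference: Handler = (part) => ({
  kind: "text",
  text: `[${part.type} saved at ${part.path}]`,
});

function dispatch(part: ContentPart): PromptBlock {
  return (handlers.get(part.type) ?? fileReference)(part);
}

// Native type: embed directly.
registerHandler("image", (part) => ({ kind: "embed", path: part.path, mimeType: part.mimeType }));

console.log(dispatch({ type: "image", path: "/media/a.jpg", mimeType: "image/jpeg" }));
console.log(dispatch({ type: "audio", path: "/media/a.ogg", mimeType: "audio/ogg" }));
```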

vsabavat requested review from gabi-simons and gavrielc as code owners March 10, 2026 06:01

Andy-NanoClaw-AI added PR: Feature New feature or enhancement Status: Needs Review Ready for maintainer review labels Mar 10, 2026

vsabavat force-pushed the feat/all-contributions branch from 3043292 to e0be87f Compare March 10, 2026 22:57

vsabavat and others added 3 commits March 10, 2026 23:01

test: update TRIGGER_PATTERN test to reflect anywhere-in-message matc…

3f14152

…hing The ^ anchor was removed so @name matches anywhere, not just at start. Update the test description and expectations accordingly. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
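
The behavioral difference the updated test covers can be shown with a small sketch. The real TRIGGER_PATTERN definition is not shown on this page, so the `\b` word boundary, the case-insensitive flag, and the `buildTriggerPattern` helper are assumptions.

```typescript
// Escape regex metacharacters in the assistant name before building a pattern.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical builder: `anchored` reproduces the old behaviour with a leading ^.
function buildTriggerPattern(assistantName: string, anchored: boolean): RegExp {
  const prefix = anchored ? "^" : "";
  return new RegExp(`${prefix}@${escapeRegex(assistantName)}\\b`, "i");
}

const before = buildTriggerPattern("Astra", true);
const after = buildTriggerPattern("Astra", false);

const midSentence = "could you check this, @Astra, when you have time?";
console.log(before.test(midSentence)); // false: the mention is not at the start
console.log(after.test(midSentence)); // true: the mention matches anywhere
console.log(after.test("@Astra hello")); // true: start-of-message still triggers
```

One consequence of dropping the anchor is that any message merely mentioning the assistant now triggers it, so a word boundary (or an equivalent guard) helps keep longer names that only start with the assistant's name from matching.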

vsabavat force-pushed the feat/all-contributions branch from e0be87f to 3f14152 Compare March 10, 2026 23:01

chore: clean up .env.example and remove broken setup channels step

16a205a

- .env.example: replace "# Added by skill" artifacts with proper docs - setup/index.ts: remove channels step (setup/channels.js doesn't exist) Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

This was referenced Mar 11, 2026

🦞 OpenClaw Ecosystem Daily Bulletin 2026-03-11 compasify/agents-radar#26

Open

🦞 OpenClaw Ecosystem Daily 2026-03-11 rollysys/agents-radar#68

Open

This was referenced Mar 13, 2026

🦞 OpenClaw Ecosystem Daily 2026-03-13 gsscsd/big_model_radar#28

Open

🦞 OpenClaw Ecosystem Daily Bulletin 2026-03-13 compasify/agents-radar#36

Open

vsabavat wants to merge 5 commits into qwibitai:main from vsabavat:feat/all-contributions
