feat: WhatsApp/Gmail channels, image vision, voice, PDF/GOG/summarize skills, perf #917
Open
vsabavat wants to merge 5 commits into qwibitai:main
3043292 to e0be87f
…lls, perf

Channels & auth:
- Add WhatsApp channel (Baileys, pairing code auth, group sync, reconnect)
- Add Gmail channel (OAuth, full channel mode: emails trigger the agent)
- Add voice transcription via OpenAI Whisper API
- Register all channels in src/channels/index.ts

Image vision:
- src/image.ts: resize + base64-encode WhatsApp image attachments
- src/channels/whatsapp.ts: detect + download images, send as multimodal blocks
- container/agent-runner/src/index.ts: accept imageAttachments in ContainerInput

Container skills:
- container/skills/pdf-reader: extract PDF text via pdftotext (poppler-utils)
- container/skills/gog: gog CLI available to all agents
- container/skills/summarize: @steipete/summarize routed through cli/claude
- container/entrypoint.sh: extracted from inline Dockerfile RUN for readability
- container/summarize-wrapper.sh: injects API key, routes to claude-code
- container/Dockerfile: install poppler-utils, @steipete/summarize, gog globally

Performance:
- Cache compiled agent-runner dist by hashing /app/src .ts files. First run compiles once; subsequent runs skip tsc (~10s → ~1-2s cold start). Cached dist persists in data/sessions/<group>/agent-runner-dist/ on host
- POLL_INTERVAL: default 1000ms, configurable via POLL_INTERVAL env var

Fixes:
- TRIGGER_PATTERN: remove ^ anchor so @astra matches anywhere in a message
- Shadow .env inside container mount so secrets stay out of agent's reach

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
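The compile cache described under Performance can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper names (`hashSources`, `needsCompile`, `markCompiled`) and the `.src-hash` marker file are assumptions made for the example, and temporary directories stand in for `/app/src` and the cached dist directory.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, mkdtempSync, readFileSync, readdirSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, relative } from "node:path";

const MARKER = ".src-hash"; // hypothetical marker file name, not taken from the PR

// Hash the relative path and contents of every .ts file under srcDir.
// Entries are visited in sorted order so the digest is stable across runs.
function hashSources(srcDir: string): string {
  const hash = createHash("sha256");
  const walk = (dir: string): void => {
    const entries = readdirSync(dir, { withFileTypes: true });
    entries.sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
    for (const entry of entries) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) walk(full);
      else if (entry.name.endsWith(".ts")) {
        hash.update(relative(srcDir, full));
        hash.update(readFileSync(full));
      }
    }
  };
  walk(srcDir);
  return hash.digest("hex");
}

// True when there is no cached build or the sources changed since it was made.
function needsCompile(srcDir: string, cacheDir: string): boolean {
  const marker = join(cacheDir, MARKER);
  return !existsSync(marker) || readFileSync(marker, "utf8") !== hashSources(srcDir);
}

// Record the current source hash; call this only after the compile succeeded.
function markCompiled(srcDir: string, cacheDir: string): void {
  mkdirSync(cacheDir, { recursive: true });
  writeFileSync(join(cacheDir, MARKER), hashSources(srcDir));
}

// Usage with throwaway directories standing in for the real paths.
const src = mkdtempSync(join(tmpdir(), "src-"));
const cache = mkdtempSync(join(tmpdir(), "dist-"));
writeFileSync(join(src, "index.ts"), "export const x = 1;\n");
const firstRun = needsCompile(src, cache); // nothing cached yet
markCompiled(src, cache);
const secondRun = needsCompile(src, cache); // sources unchanged since markCompiled
```

Writing the marker only after a successful compile means a failed build is retried on the next start instead of being treated as a cache hit.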
[email protected] has a peerOptional dep on zod@^3.x, but the project uses zod@^4.x for ipc-mcp-stdio.ts. Adding legacy-peer-deps=true lets npm install both without ERESOLVE, matching how the lockfile was generated locally.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…hing

The ^ anchor was removed so @name matches anywhere, not just at the start. Update the test description and expectations accordingly.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
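To illustrate the effect of dropping the anchor, here is a small sketch comparing an anchored and an unanchored trigger pattern. The assistant name, the `\b` word boundary, and the case-insensitive flag are assumptions for the example; the project's real `TRIGGER_PATTERN` may be built differently.

```typescript
// Hypothetical assistant name; the project reads its own from configuration.
const ASSISTANT_NAME = "Astra";

// Escape regex metacharacters so a name like "A.I" is matched literally.
const escapeRegex = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Anchored: only messages that start with the mention match.
const anchored = new RegExp(`^@${escapeRegex(ASSISTANT_NAME)}\\b`, "i");
// Unanchored: the mention can appear anywhere in the message.
const unanchored = new RegExp(`@${escapeRegex(ASSISTANT_NAME)}\\b`, "i");

const message = "hey @astra can you summarize this?";
const matchedBefore = anchored.test(message);
const matchedAfter = unanchored.test(message);
```

One trade-off of the unanchored form in this sketch is that it also fires on text such as `someone@astra.example`, since the mention no longer has to open the message.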
e0be87f to 3f14152
- .env.example: replace "# Added by skill" artifacts with proper docs
- setup/index.ts: remove channels step (setup/channels.js doesn't exist)

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
This was referenced Mar 11, 2026
When a user replies to a specific message in WhatsApp, the agent now sees which message was being replied to. Extracts contextInfo.stanzaId from Baileys message objects, stores it as reply_to_id in the messages table (with auto-migration for existing DBs), and emits it as a reply_to attribute on <message> elements in the XML context sent to the agent.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
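A rough sketch of how a stored `reply_to_id` might surface as a `reply_to` attribute is shown below. Only `reply_to_id`, `reply_to`, and the `<message>` element come from the commit message; the row shape, the `sender` attribute, and the helper names are assumptions for illustration.

```typescript
// Hypothetical row shape; only reply_to_id is named in the commit message.
interface StoredMessage {
  sender: string;
  content: string;
  reply_to_id?: string | null;
}

// Escape characters that are unsafe in XML text and attribute values.
const escapeXml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");

// Emit reply_to only when the message actually replies to another one.
function formatMessage(m: StoredMessage): string {
  const replyAttr = m.reply_to_id ? ` reply_to="${escapeXml(m.reply_to_id)}"` : "";
  return `<message sender="${escapeXml(m.sender)}"${replyAttr}>${escapeXml(m.content)}</message>`;
}

const reply = formatMessage({ sender: "Ana", content: "yes, that one", reply_to_id: "3EB0ABC" });
const plain = formatMessage({ sender: "Ana", content: "a < b", reply_to_id: null });
```

Omitting the attribute for non-replies keeps the context for existing rows unchanged, which fits the auto-migration described above, where older rows have no reply ID.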
This was referenced Mar 13, 2026
Fritzzzz1 added a commit to Fritzzzz1/nanoclaw that referenced this pull request Mar 23, 2026
Content pipeline for non-text messages: channel adapters produce RawContentPart[], processContentParts() resolves them into ContentPart[] (download, save, convert), and the agent runner dispatches to pluggable type handlers. Claude-native types get embedded; non-native types get file-reference injection. Skills can override any handler.

- RawContentPart/ContentPart types (ref + buffer dual input)
- processContentParts() with media download and local storage
- content_parts DB column with string fallback (backward compat)
- Handler registry + dispatch in agent runner
- Media directory mount into containers

Co-Authored-By: @JasonOA888 (qwibitai#902)
Co-Authored-By: @kenmaz (qwibitai#1069)
Co-Authored-By: @vsabavat (qwibitai#917, qwibitai#1055)
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
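The handler registry and dispatch step described above could look something like the following sketch. The field names on `ContentPart`, the `AgentBlock` type, and the function names are all hypothetical; the commit message names the types and the registry but not their shapes.

```typescript
// Hypothetical shapes: the commit message names ContentPart but not its fields.
interface ContentPart {
  type: string; // e.g. "image", "audio", "pdf"
  mimeType: string;
  path: string; // where the media was saved locally
}

// What gets handed to the model: embedded media or a plain-text file reference.
type AgentBlock =
  | { kind: "embedded"; mimeType: string; path: string }
  | { kind: "text"; text: string };

type Handler = (part: ContentPart) => AgentBlock;

const handlers = new Map<string, Handler>();

// Registering again for the same type replaces the handler,
// which is one way a skill could override a default.
function registerHandler(type: string, handler: Handler): void {
  handlers.set(type, handler);
}

// Fallback for types without a handler: inject a file reference as text.
const fileReference: Handler = (part) => ({
  kind: "text",
  text: `[attached ${part.type}: ${part.path}]`,
});

function dispatch(parts: ContentPart[]): AgentBlock[] {
  return parts.map((part) => (handlers.get(part.type) ?? fileReference)(part));
}

registerHandler("image", (part) => ({ kind: "embedded", mimeType: part.mimeType, path: part.path }));

const blocks = dispatch([
  { type: "image", mimeType: "image/jpeg", path: "/media/a.jpg" },
  { type: "audio", mimeType: "audio/ogg", path: "/media/b.ogg" },
]);
```

In this sketch the image part is embedded, while the audio part, which has no registered handler, falls back to a file reference.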
gavrielc pushed a commit to Fritzzzz1/nanoclaw that referenced this pull request Apr 1, 2026
Summary
- Image vision (src/image.ts, updated whatsapp.ts + agent-runner)
- PDF reader: pdftotext (poppler-utils) in container, handles attachments/URLs/local files
- gog CLI available to all container agents
- @steipete/summarize routed through cli/claude (no OpenAI key needed); wrapper script injects API key at startup
- Entrypoint extracted from inline Dockerfile RUN to container/entrypoint.sh
- Hash /app/src .ts files at startup; skip tsc when unchanged. Cuts cold start from ~10s to ~1–2s. Cache persists in data/sessions/<group>/agent-runner-dist/ between runs
- Poll interval: default 1000ms, configurable via the POLL_INTERVAL env var
- Remove ^ anchor from TRIGGER_PATTERN so @AssistantName matches anywhere in a message, not just at the start
- Shadow .env inside container mount so secrets can't be read from the mounted project root

Test plan
- … cli/claude, no OpenAI error
- @Astra mid-sentence → agent triggers
- … tsc compile log (cache hit)
- npm test passes

🤖 Generated with Claude Code