feat: add image upload and multimodal chat support #285

jchris · 2025-08-22T18:46:38Z

Summary

Implements comprehensive image upload functionality for the chat interface with drag-and-drop support, file picker, preview thumbnails, and multimodal AI integration. This addresses issue #269 by providing a complete image attachment system with Fireproof-backed storage.

Key Features

File Upload & Drag-and-Drop: Added "+" button and drag-and-drop support on textarea for image selection
Image Preview System: Real-time thumbnail previews with remove functionality before sending
Fireproof Storage: Images stored as separate documents (not embedded base64) with session association
Multimodal AI Support: Full integration with OpenAI-style image_url segments for vision models
Message History: User messages display attached images using existing ImgFile component
Backward Compatibility: All existing text-only chat functionality preserved

Implementation Details

Types: Added ImageDocument type and images?: string[] to UserChatMessageDocument
Components: New ImagePreview component, updated ChatInput and Message components
State Management: Extended useSimpleChat with attachImages, removeAttachedImage, clearAttachedImages
API Integration: Updated streamHandler to support multimodal content arrays
Storage Strategy: Clean separation between temporary preview URLs and persistent Fireproof storage

Technical Approach

Images stored as Fireproof documents with only IDs in chat messages (no base64 persistence)
Drag-and-drop with visual feedback on textarea
Memory management with proper URL cleanup
Direct database operations for image storage/retrieval
Multimodal API requests built at send time from stored images

Test plan

🤖 Generated with Claude Code

netlify · 2025-08-22T18:46:44Z

✅ Deploy Preview for fireproof-ai-builder ready!

Name	Link
🔨 Latest commit	`92e561d`
🔍 Latest deploy log	https://app.netlify.com/projects/fireproof-ai-builder/deploys/68a8be0fe07f650008f107d5
😎 Deploy Preview	https://deploy-preview-285--fireproof-ai-builder.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

charliecreates

Overall implementation is solid and integrates image uploads across state, storage, UI, and streaming. However, rendering of user images relies on index alignment between loaded files and imageIds, which can desynchronize and cause incorrect React keys. Image loading/processing is done sequentially in multiple places, causing unnecessary latency; concurrency and optional client-side downscaling would improve performance. Consider adding limits on number/size of attachments to prevent large payloads and poor UX, and guard against accidental browser navigation when dropping outside the textarea.

Summary of changes

Added image attachment workflow to chat input: drag-and-drop on textarea, hidden file input, preview thumbnails, and remove button using a new ImagePreview component.
Extended chat state and types to include image metadata and image document type (ImageDocument), and added image management functions (attachImages, removeAttachedImage, clearAttachedImages).
Persisted images as Fireproof documents and reference them by ID in user messages; user messages can now display attached images via a new UserImages helper in Message.tsx.
Updated sendMessage to carry attached image IDs and pass them to the AI streaming layer.
Updated streamHandler to build multimodal content arrays for call-ai, loading image files from the session database and converting to data URLs.
Minor UI adjustments to ChatInput spacing and visual drag-over feedback.
Kept backward compatibility with text-only chat and existing model picker logic.

charliecreates · 2025-08-22T18:49:40Z

vibes.diy/pkg/app/components/Message.tsx

+  useEffect(() => {
+    let cancelled = false;
+    async function load() {
+      const result: DocFileMeta[] = [];
+      for (const id of imageIds) {
+        try {
+          const doc = await database.get(id);
+          const fpFiles = doc?._files || {};
+          const key = "image" in fpFiles ? "image" : Object.keys(fpFiles)[0];
+          const ref = key ? fpFiles[key] : undefined;
+          if (ref) {
+            const anyRef: any = ref as any;
+            if (typeof anyRef.file === "function") {
+              // Fireproof DocFileMeta
+              result.push(anyRef as DocFileMeta);
+            } else if (ref instanceof File) {
+              // Raw File stored in _files (dev/testing scenarios)
+              const wrapped: DocFileMeta = {
+                file: async () => ref,
+                type: ref.type || "application/octet-stream",
+                size: ref.size,
+                cid: undefined as any,
+              };
+              result.push(wrapped);
+            }
+          }
+        } catch (err) {
+          console.warn(`Failed to load image ${id}:`, err);
+        }
+      }
+      if (!cancelled) {
+        setFiles(result);
+      }
+    }
+    void load();
+    return () => {
+      cancelled = true;
+    };
+  }, [database, imageIds]);
+
+  if (files.length === 0) return null;
+
+  return (
+    <div className="flex flex-wrap gap-2 mb-2">
+      {files.map((file, index) => (
+        <div
+          key={`${sessionId}-${imageIds[index]}`}
+          className="border-light-decorative-00 dark:border-dark-decorative-00 overflow-hidden rounded border"
+        >
+          <ImgFile
+            file={file}
+            alt="User attachment"
+            className="max-w-xs"
+            maxHeight="12rem"
+          />
+        </div>
+      ))}
+    </div>
+  );
+};


Rendering attached images currently uses files.map((file, index) => … key={${sessionId}-${imageIds[index]})), which assumes a 1:1 alignment between files and imageIds by index. Because the loader skips missing/invalid images, index-based keys can become misaligned, causing duplicate or incorrect keys and coupling the wrong image to the wrong ID. Additionally, the loader performs network/DB fetches sequentially and relies on any for file refs, both of which are avoidable.

Recommend: (1) load images concurrently, (2) normalize Fireproof refs without any, and (3) keep the loaded results paired with their source IDs so the render key and file stay in sync.

Suggestion

Consider refactoring UserImages like this to ensure stable ID↔file pairing, avoid any, and parallelize loads:

Store pairs of { id, meta } in state instead of a bare DocFileMeta[].

Use a small helper to normalize Fireproof file refs without any.

Use Promise.all to fetch/normalize concurrently and filter out failures, preserving order by ID.

Example:

const toDocFileMeta = async (ref: unknown): Promise<DocFileMeta | null> => { if (!ref) return null; if (typeof (ref as any).file === "function") { return ref as DocFileMeta; } if (ref instanceof File) { const f = ref; return { file: async () => f, type: f.type || "application/octet-stream", size: f.size, cid: undefined as any, } satisfies DocFileMeta; } return null; }; const [files, setFiles] = useState<Array<{ id: string; meta: DocFileMeta }>>([]); useEffect(() => { let cancelled = false; (async () => { const pairs = await Promise.all( imageIds.map(async (id) => { try { const doc = await database.get(id); const fpFiles = doc?._files || {}; const key = "image" in fpFiles ? "image" : Object.keys(fpFiles)[0]; const meta = await toDocFileMeta(key ? fpFiles[key] : undefined); return meta ? { id, meta } : null; } catch (err) { console.warn(`Failed to load image ${id}:`, err); return null; } }), ); if (!cancelled) setFiles(pairs.filter(Boolean) as Array<{ id: string; meta: DocFileMeta }>); })(); return () => { cancelled = true; }; }, [database, imageIds]); // Render <div className="flex flex-wrap gap-2 mb-2"> {files.map(({ id, meta }) => ( <div key={`${sessionId}-${id}`} className="border-light-decorative-00 dark:border-dark-decorative-00 overflow-hidden rounded border"> <ImgFile file={meta} alt="User attachment" className="max-w-xs" maxHeight="12rem" /> </div> ))} </div>

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this refactor.

charliecreates · 2025-08-22T18:49:40Z

vibes.diy/pkg/app/utils/streamHandler.ts

+  // Load images if we have image IDs and a database
+  const imageDataUrls: string[] = [];
+  if (imageIds && imageIds.length > 0 && sessionDatabase) {
+    for (const imageId of imageIds) {
+      try {
+        const doc = await sessionDatabase.get(imageId);
+        const fpFiles = doc?._files || {};
+        const key = "image" in fpFiles ? "image" : Object.keys(fpFiles)[0];
+        const ref = key ? fpFiles[key] : undefined;
+        if (ref) {
+          const anyRef: any = ref as any;
+          let file: File;
+          if (typeof anyRef.file === "function") {
+            // Fireproof DocFileMeta
+            file = await anyRef.file();
+          } else if (anyRef instanceof File) {
+            // Raw File
+            file = anyRef;
+          } else {
+            continue;
+          }
+          // Convert to data URL (browser-compatible)
+          const reader = new FileReader();
+          const dataUrl = await new Promise<string>((resolve, reject) => {
+            reader.onload = () => resolve(reader.result as string);
+            reader.onerror = reject;
+            reader.readAsDataURL(file);
+          });
+          imageDataUrls.push(dataUrl);
+        }
+      } catch (error) {
+        console.warn(`Failed to load image ${imageId}:`, error);
+      }
+    }
+  }
+
+  // Build user message content - multimodal if we have images
+  let userContent:
+    | string
+    | Array<{
+        type: "text" | "image_url";
+        text?: string;
+        image_url?: { url: string };
+      }>;
+
+  if (imageDataUrls.length > 0) {
+    // Multimodal message with text and images
+    const contentParts: Array<{
+      type: "text" | "image_url";
+      text?: string;
+      image_url?: { url: string };
+    }> = [{ type: "text", text: userMessage }];
+
+    // Add image segments
+    for (const dataUrl of imageDataUrls) {
+      contentParts.push({
+        type: "image_url",
+        image_url: { url: dataUrl },
+      });
+    }
+
+    userContent = contentParts;
+  } else {
+    // Text-only message
+    userContent = userMessage;
+  }


Images are loaded and converted to Data URLs sequentially, which is unnecessarily slow when multiple images are attached. Also, sending original full-resolution images as inline Data URLs can significantly bloat the request and slow model input processing. Concurrency and client-side downscaling can materially improve performance without reducing utility.

Suggestion

Load and process images concurrently, and (optionally) downscale them to a reasonable max dimension before encoding to Data URLs. Example inline replacement for the loading block:

// Load images concurrently and (optionally) downscale to reduce payload const imageDataUrls: string[] = sessionDatabase && imageIds?.length ? ( await Promise.all( imageIds.map(async (imageId) => { try { const doc = await sessionDatabase.get(imageId); const fpFiles = doc?._files || {}; const key = "image" in fpFiles ? "image" : Object.keys(fpFiles)[0]; const ref = key ? fpFiles[key] : undefined; const file = typeof (ref as any)?.file === "function" ? await (ref as any).file() : ref instanceof File ? ref : null; if (!file) return null; // Optional downscale to cap resolution const toDataUrl = async (f: File) => { const img = await new Promise<HTMLImageElement>((res, rej) => { const url = URL.createObjectURL(f); const el = new Image(); el.onload = () => { URL.revokeObjectURL(url); res(el); }; el.onerror = (e) => { URL.revokeObjectURL(url); rej(e); }; el.src = url; }); const MAX = 1024; const scale = Math.min(1, MAX / Math.max(img.width, img.height)); if (scale < 1) { const canvas = document.createElement("canvas"); canvas.width = Math.round(img.width * scale); canvas.height = Math.round(img.height * scale); const ctx = canvas.getContext("2d")!; ctx.drawImage(img, 0, 0, canvas.width, canvas.height); return canvas.toDataURL("image/jpeg", 0.85); } // Fallback to original Data URL const reader = new FileReader(); return await new Promise<string>((resolve, reject) => { reader.onload = () => resolve(reader.result as string); reader.onerror = reject; reader.readAsDataURL(file); }); }; return await toDataUrl(file); } catch (e) { console.warn(`Failed to load image ${imageId}:`, e); return null; } }), ) ).filter((u): u is string => Boolean(u)) : [];

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with this optimization.

charliecreates · 2025-08-22T18:49:40Z

vibes.diy/pkg/app/hooks/useSimpleChat.ts

+  const attachImages = useCallback(
+    async (files: FileList) => {
+      if (!sessionId) return;
+
+      const newImages: Array<{
+        id: string;
+        previewUrl: string;
+        mimeType: string;
+      }> = [];
+
+      for (let i = 0; i < files.length; i++) {
+        const file = files[i];
+        if (!file.type.startsWith("image/")) continue;
+
+        // Create preview URL
+        const previewUrl = URL.createObjectURL(file);
+
+        // Create image document in Fireproof
+        const imageDoc = {
+          type: "image" as const,
+          session_id: sessionId,
+          created_at: Date.now(),
+          _files: {
+            image: file,
+          },
+        };
+
+        try {
+          const result = await sessionDatabase.put(imageDoc);
+          newImages.push({
+            id: result.id,
+            previewUrl,
+            mimeType: file.type,
+          });
+        } catch (error) {
+          console.error("Failed to store image:", error);
+          URL.revokeObjectURL(previewUrl);
+        }
+      }
+
+      setAttachedImages((prev) => [...prev, ...newImages]);
+    },
+    [sessionId, sessionDatabase],
+  );


attachImages processes files sequentially and has no guardrails on count or size. Large or many images can degrade UX (upload time, memory), and increase token/latency cost for vision models. Adding limits and parallelizing the persistence reduces latency and prevents worst-case scenarios.

Suggestion

Add simple guardrails (max attachments, per-file size) and parallelize the DB puts. Example rewrite:

const MAX_ATTACHMENTS = 6; const MAX_FILE_MB = 10; const attachImages = useCallback( async (files: FileList) => { if (!sessionId) return; const current = attachedImages.length; const allowed = Math.max(0, MAX_ATTACHMENTS - current); const selected = Array.from(files) .filter((f) => f.type.startsWith("image/")) .slice(0, allowed); const results = await Promise.all( selected.map(async (file) => { if (file.size > MAX_FILE_MB * 1024 * 1024) { console.warn(`Skipping large image > ${MAX_FILE_MB}MB:`, file.name); return null; } const previewUrl = URL.createObjectURL(file); try { const result = await sessionDatabase.put({ type: "image" as const, session_id: sessionId, created_at: Date.now(), _files: { image: file }, }); return { id: result.id, previewUrl, mimeType: file.type }; } catch (error) { console.error("Failed to store image:", error); URL.revokeObjectURL(previewUrl); return null; } }), ); const newImages = results.filter(Boolean) as Array<{ id: string; previewUrl: string; mimeType: string }>; setAttachedImages((prev) => [...prev, ...newImages]); }, [sessionId, sessionDatabase, attachedImages.length], );

Reply with "@CharlieHelps yes please" if you'd like me to add a commit with these guardrails and concurrency.

@CharlieHelps yes please add gaurdrails but not concurrency. instead collect them and save them in one database bulk operation await database.bulk(arrayOfDocsToBulkUpdate)

Expand this to see my work.

Reviewed and enforced guardrails for max attachments and per-file size in attachImages, maintaining image/* filter and immediate cleanup for rejected files.

Switched to single sessionDatabase.bulk persist operation in attachImages, falling back to sequential puts as needed, updating state only for successful writes, and revoking previews on failures.

Refactored state updates in setAttachedImages to handle attachment limits and concurrency safely, including cleanup for preview URLs on excess or rejected files.

Committed changes to vibes.diy/pkg/app/hooks/useSimpleChat.ts in commit 98983e4 with full reviewer context in the message.

Pushed commit to remote for PR #285, but push was rejected due to non-fast-forward; branch is behind and needs to be synced.

Pushed guardrails + bulk save for image attachments in useSimpleChat.attachImages in 92e561d.

Changes

Added attachment guardrails (no concurrency introduced):

Enforces max attachments per message, respecting current attachedImages.length.

Enforces per‑file size cap.

Keeps existing MIME filter (image/* only).

Rejected files do not create previews; any previews for failed writes are revoked.

Replaced per‑file put loop with a single bulk write:

Collects accepted docs and persists via sessionDatabase.bulk(docs).

Uses returned IDs to build state entries; previews for failed writes are revoked.

Compatibility: if bulk is unavailable, falls back to sequential put (still non‑concurrent).

Race‑safe state update:

Caps additions inside the setAttachedImages updater to avoid exceeding the limit if attachImages is called twice quickly.

Revokes previews for any images pruned by the cap at update time.

Scope respected: only modified the attachImages function in vibes.diy/pkg/app/hooks/useSimpleChat.ts.

Implementation notes

Current guardrail defaults (please confirm/adjust):

MAX_ATTACHMENTS = 6 per message

MAX_FILE_MB = 10 MB per file

No concurrency was added. Bulk write is a single call; the fallback uses a sequential put loop only when bulk isn’t available.

Verification

# Install $ pnpm install --frozen-lockfile # Repo checks (format, build, test) $ pnpm run check # Results - Build: succeeded - Tests: 571 passed, 85 skipped (656 total) - Test files: 102 passed, 17 skipped (119 total)

Open question

Confirm the limits you want. I used MAX_ATTACHMENTS=6 and MAX_FILE_MB=10 for now. Happy to update to your preferred values.

charliecreates · 2025-08-22T18:49:40Z

vibes.diy/pkg/app/components/ChatInput.tsx

+            onDragEnter={(e) => {
+              e.preventDefault();
+              setIsDragging(true);
+            }}
+            onDragOver={(e) => {
+              e.preventDefault();
+              setIsDragging(true);
+            }}
+            onDragLeave={(e) => {
+              e.preventDefault();
+              setIsDragging(false);
+            }}
+            onDrop={onDropFiles}


Drag-and-drop is only handled on the textarea. Dropping files just outside the textarea can trigger the browser’s default navigation (opening the image), which risks losing the current chat state. It’s safer to prevent default dragover/drop on the container (or window) to avoid accidental navigations outside the intended drop zone.

Suggestion

Add container-level (or window-level) dragover/drop preventDefault handlers so drops outside the textarea don’t navigate away. For example:

// In the outer div <div ref={containerRef} className="px-4 py-2" onDragOver={(e) => e.preventDefault()} onDrop={(e) => { // If not dropping files, just prevent navigation if (!e.dataTransfer?.files?.length) e.preventDefault(); }} > … </div>

Alternatively, add a useEffect to register window-level dragover/drop listeners that preventDefault while the component is mounted.

Reply with "@CharlieHelps yes please" if you'd like me to add a commit implementing this.

@CharlieHelps yes please container-level

Expand this to see my work.

Inspected ChatInput.tsx to confirm drag-and-drop and onDropFiles were previously scoped only to the textarea.

Revised onDropFiles type from React.DragEvent<HTMLTextAreaElement> to React.DragEvent<HTMLElement>, supporting shared use by both container and textarea event handlers.

Added onDragEnter, onDragOver, onDragLeave, and onDrop handlers on the container <div>, activating preventDefault and using setIsDragging for visual state across the entire chat input area.

Ensured container-level onDrop calls preventDefault, routes valid file drops to onDropFiles, and reliably resets setIsDragging, with no window/global listeners introduced.

Validated that new handlers align with PR review acceptance and preserve existing textarea logic unless centralizing is more consistent.

Checked for type completeness and compat in shared handler logic after updates.

Ran pnpm check—all tests, type checks, and formatting passed without errors related to drag-and-drop or ChatInput.tsx.

Committed changes as a303fcc with detailed message summarizing fixes for container-level drop prevention and forwarding.

Pushed the commit to jchris/chat-images branch on origin without errors.

Pushed container-level drag-and-drop handling to prevent navigation and route file drops to the existing flow.

Commit: a303fcc

File: vibes.diy/pkg/app/components/ChatInput.tsx

Changes

Added container-level handlers on the outer ChatInput wrapper:

onDragEnter/onDragOver/onDragLeave: preventDefault and keep isDragging in sync for the entire container.

onDrop: preventDefault + stopPropagation to avoid browser navigation; forwards file drops to the shared onDropFiles handler; resets isDragging.

Updated onDropFiles signature to accept React.DragEvent so it can be called from both the textarea and the container without type errors.

Left the existing textarea-level handlers in place; container now reliably manages isDragging across the full input area. No window-level listeners added and no changes to text input/focus behavior.

Key snippet (container wrapper):

<div ref={containerRef} className="px-4 py-2" onDragEnter={(e) => { e.preventDefault(); setIsDragging(true); }} onDragOver={(e) => { e.preventDefault(); setIsDragging(true); }} onDragLeave={(e) => { e.preventDefault(); setIsDragging(false); }} onDrop={(e) => { e.preventDefault(); e.stopPropagation(); setIsDragging(false); if (e.dataTransfer?.files?.length) { void onDropFiles(e); // shared handler } }} >

And the shared drop handler type adjustment:

const onDropFiles = useCallback(async (e: React.DragEvent<HTMLElement>) => { e.preventDefault(); e.stopPropagation(); setIsDragging(false); const files = e.dataTransfer?.files; if (files && chatState.attachImages) { await chatState.attachImages(files); } }, [chatState.attachImages]);

Verification

# Format + Build + Tests $ pnpm -s check # Results - Build: success - Vitest (multi-project): • Test files: 102 passed | 17 skipped • Tests: 571 passed | 85 skipped

Notes:

No global/window listeners were introduced, per the review request.

Drops anywhere inside the ChatInput container no longer trigger browser navigation and correctly invoke the existing image attachment flow.

Re-requested review and assigned the PR to @jchris.

… drops to onDropFiles\n\n- Add container-level onDragOver/onDrop with preventDefault to avoid browser navigation\n- Forward file drops anywhere in ChatInput container to existing onDropFiles handler\n- Normalize onDropFiles to accept DragEvent<HTMLElement> so it can be reused by container and textarea\n- Keep textarea-level handlers; container now manages isDragging for full area

…e attachments\n\n- Enforce max attachments per message and per-file size (image/* only)\n- Use single bulk write to persist accepted images; revoke previews on failure\n- Race-safe state update to honor cap under concurrent calls; fallback to sequential put if bulk unavailable

jchris · 2025-08-22T23:40:48Z

Need to double check the image previews are showing up right

feat: add image upload and multimodal chat support with preview UI

83ce745

charliecreates bot requested a review from CharlieHelps August 22, 2025 18:46

charliecreates bot reviewed Aug 22, 2025

View reviewed changes

charliecreates bot removed the request for review from CharlieHelps August 22, 2025 18:49

charliecreates bot had a problem deploying to dev August 22, 2025 18:56 Failure

charliecreates bot assigned jchris Aug 22, 2025

charliecreates bot had a problem deploying to dev August 22, 2025 18:59 Failure

feat: add image upload and multimodal chat support #285

Are you sure you want to change the base?

feat: add image upload and multimodal chat support #285

Uh oh!

Conversation

jchris commented Aug 22, 2025

Summary

Key Features

Implementation Details

Technical Approach

Test plan

Uh oh!

netlify bot commented Aug 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for fireproof-ai-builder ready!

Uh oh!

charliecreates bot left a comment

Choose a reason for hiding this comment

Uh oh!

charliecreates bot Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

charliecreates bot Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

charliecreates bot Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

jchris Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

charliecreates bot Aug 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

charliecreates bot Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

charliecreates bot Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

jchris Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

charliecreates bot Aug 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

charliecreates bot Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

jchris commented Aug 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

netlify bot commented Aug 22, 2025 •

edited

Loading

charliecreates bot Aug 22, 2025 •

edited

Loading

charliecreates bot Aug 22, 2025 •

edited

Loading