feat: Add clipboard image paste support for macOS #1580

Merged
scottdensmore merged 15 commits into google-gemini:main from jaysondasher:feature/clipboard-image-paste
Jul 12, 2025

Conversation

@jaysondasher
Contributor

TLDR

Adds clipboard image pasting functionality for macOS using Ctrl+V. Images are automatically saved as temporary files and inserted as @ commands with clean [Image #N] display formatting.

Detailed Discussion

This PR implements clipboard image paste support to address issue #1452. The feature allows users to paste images directly from their clipboard using Ctrl+V, making it easier to share screenshots and images with Gemini.

Key Features:

  • Ctrl+V shortcut: Dedicated keyboard shortcut for pasting images (separate from text)
  • Clean display: Shows [Image #1] instead of long file paths in the UI
  • Sequential numbering: Images numbered per-message starting from 1
  • Automatic cleanup: Old clipboard images removed after 1 hour
  • macOS support: Uses AppleScript for reliable clipboard access
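
A minimal sketch of how the one-hour cleanup rule could be expressed, assuming each saved image is tracked by its path and last-modified time. The `SavedImage` shape and the `selectExpiredImages` name are hypothetical and not taken from this PR; a real implementation would presumably read modification times from the filesystem and then delete the returned paths.

```typescript
// Hypothetical record for a saved clipboard image (not from the PR).
interface SavedImage {
  path: string;
  mtimeMs: number; // last-modified time, in milliseconds since the epoch
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// Returns the paths of images strictly older than maxAgeMs relative to nowMs.
function selectExpiredImages(
  images: SavedImage[],
  nowMs: number,
  maxAgeMs: number = ONE_HOUR_MS,
): string[] {
  return images
    .filter((image) => nowMs - image.mtimeMs > maxAgeMs)
    .map((image) => image.path);
}
```

Keeping the selection pure, with `nowMs` passed in, makes the age rule straightforward to unit test without touching the disk.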

Technical Implementation:

  • clipboardUtils.ts: Handles clipboard detection and image saving using AppleScript
  • messageFormatting.ts: Formats display text to show clean image references
  • Enhanced InputPrompt.tsx: Adds Ctrl+V handler and display formatting
  • Updated UserMessage.tsx: Applies formatting to message history
  • Images stored in .gemini-clipboard/ within project directory
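
To illustrate the [Image #N] display formatting described above, here is a rough sketch. It assumes pasted images appear in the message as `@.gemini-clipboard/<file>` references with no whitespace in the file name; the regular expression and the `formatImageReferences` name are illustrative assumptions, not the contents of messageFormatting.ts.

```typescript
// Assumed pattern for a pasted-image reference; the real naming may differ.
const CLIPBOARD_REF = /@\.gemini-clipboard\/\S+/g;

// Replaces each clipboard reference with a numbered label, counting from 1
// within the given message.
function formatImageReferences(message: string): string {
  let counter = 0;
  return message.replace(CLIPBOARD_REF, () => `[Image #${++counter}]`);
}
```

Because the counter is local to each call, numbering restarts at 1 for every message, matching the per-message numbering listed under Key Features.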

Platform Support:

Currently macOS only due to AppleScript dependency. Future work could add Linux/Windows support using platform-specific clipboard APIs.

Reviewer Test Plan

  1. On macOS: Copy an image to clipboard (screenshot, image from browser, etc.)
  2. In Gemini CLI: Press Ctrl+V in the input field
  3. Verify: Image appears as [Image #1] in both input field and message history
  4. Test multiple images: Paste several images, verify sequential numbering [Image #1] [Image #2]
  5. Test AI response: Send message with images, verify Gemini can read them
  6. Test cleanup: Wait or manually check that old images are cleaned up

Testing Matrix

Platforms Tested:

  • ✅ macOS (primary implementation)
  • ❌ Windows (not supported yet)
  • ❌ Linux (not supported yet)

Installation Methods Tested:

  • ✅ npm run (development)
  • ⚠️ npx (should work but not extensively tested)
  • ❌ Docker (not tested)
  • ❌ Podman (not tested)
  • ❌ Seatbelt (not tested)

Fixes #1452

@jaysondasher jaysondasher requested a review from a team as a code owner June 25, 2025 19:36
@google-cla

google-cla bot commented Jun 25, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@jacob314
Contributor

Thanks for the pull request @jaysondasher! I'll review as soon as you sign the CLA.

@jacob314
Contributor

Reviewing now.

Contributor

I'd suggest we don't include this in the placeholder unless @miguelsolorio has some good ideas of how to make this read well.

Contributor

remove this comment.

Contributor

This needs to factor in the cursor location. textBuffer has an existing helper for inserting text; please use it, as it will help with this.

Contributor

nit: remove "(like Claude Code)". Ctrl+V to paste is quite standard.

@jacob314
Contributor

Bug: pressing the left arrow when the cursor is over the [Image #1] text causes confusing autocomplete behavior that makes it feel like Gemini CLI has hung. We should fix this so that delete removes the entire [Image #1] tag and does not trigger autocomplete.

Contributor

can you use fileUtils.ts which has a detectFileType util that already detects images?

Contributor

This logic should move to text-buffer.ts and operate on the lines before wrapping.
Let's also be sure to implement this in a way that lets us support highlighting blocks like [Image 1] with custom colors in the future. I'd suggest we move these complex changes to a second PR and land this one first with just paste support that adds a user-visible @some/path/to/image.png in the input prompt.

Contributor

moving to text-buffer.ts will also make it easy to test this as that part of the code base has reasonable unit tests.

@jaysondasher
Contributor Author

Thank you for your review, @jacob314.
I think my latest commit addresses all of your concerns.
Your concerns about the arrow key navigation bugs and complex delete behavior made me realize the display formatting was adding unnecessary complexity. By showing raw @.gemini-clipboard/... paths instead of [Image #N], we:

  • Eliminate all the cursor mapping complexity
  • Fix the arrow key navigation issues naturally
  • Remove the need for special delete/backspace handling
  • Avoid autocomplete triggering issues
  • Make the code much simpler and more maintainable

The simplified approach works seamlessly with the existing @ command system and avoids all the edge cases you identified. Images still paste with Ctrl+V and Gemini can read them perfectly.

I will leave advanced display formatting of pasted images for a later PR, where more time will allow for a more robust implementation.

@umairidris
Contributor

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces clipboard image pasting functionality for macOS. A critical security vulnerability exists due to command injection in the AppleScript execution, and a functional issue limits support to only PNG images. The review includes a code suggestion addressing both issues by properly escaping the file path and using a more robust AppleScript.

Comment on lines 56 to 72
Contributor

critical

The tempFilePath is embedded directly into the AppleScript string without escaping, creating a command injection risk [1]. If the path contains special characters (e.g., a double quote "), a maliciously crafted directory name could break out of the string and inject arbitrary AppleScript, leading to arbitrary code execution.

Also, the current script only attempts to read PNG data (as «class PNGf»), but clipboardHasImage() checks for PNG, TIFF, and JPEG, creating an inconsistency where the feature will fail for common image formats like JPEG, even when clipboardHasImage() returns true [2].

I recommend replacing the current implementation with a more robust approach that addresses both problems by using Image Events to handle various image formats and properly escaping the file path to mitigate the injection vulnerability.

    const tempFilePath = path.join(tempDir, `clipboard-${timestamp}.png`);
    const safeTempFilePath = tempFilePath.replace(/\\/g, '\\\\').replace(/"/g, '\\"');

    // Use a more robust AppleScript with Image Events to handle various image types
    const script = `
      try
        tell application "Image Events"
          launch
          set this_image to (the clipboard as picture)
          save this_image as PNG in (POSIX file "${safeTempFilePath}")
        end tell
        return "success"
      on error
        return "error"
      end try
    `;

Style Guide References

Footnotes

  1. Embedding user-controlled input directly into commands without proper escaping can lead to command injection vulnerabilities.

  2. The application should handle all image formats that it detects, to avoid unexpected failures.
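
As a small illustration of the escaping step in the suggestion above, the sketch below wraps the same two replacements in a helper. The `escapeForAppleScriptString` name, the sample path, and the `scriptLine` fragment are hypothetical and only show where the escaped value would be interpolated.

```typescript
// Escapes a value for use inside a double-quoted AppleScript string literal.
// Backslashes are escaped first so that the backslashes added for quotes
// are not themselves doubled.
function escapeForAppleScriptString(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Hypothetical directory name containing a double quote.
const riskyPath = '/tmp/my"dir/clipboard-1.png';

// The escaped path can no longer terminate the surrounding string literal.
const scriptLine = `set target to POSIX file "${escapeForAppleScriptString(riskyPath)}"`;
```

Another option that may be worth considering is to avoid interpolation entirely by passing the path to osascript as an argument and reading it in an `on run argv` handler, so the path never becomes part of the script source.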

@jacob314
Contributor

Sorry for the slow reply. I missed a notification that you had replied. Code looks good now. Please add a basic test and then I will approve.
One note: please run the code formatter, as there was a spurious whitespace diff in the commit that makes it look like it changes more than it does.

@jaysondasher
Contributor Author

@jacob314 I have added a basic test, run the code formatter, as well as improved the image type handling.

Contributor

why does mac return a string?

Contributor

Sorry, to be clear: please add tests for the logic added to InputPrompt, then LGTM.

@jaysondasher
Contributor Author

@jacob314

  1. Mac returns a string because saveClipboardImage() returns the file path on
    success (or null if no image/error). The test verifies it returns a valid path
    string.

  2. Got it! I'll add tests for the InputPrompt clipboard logic. Thanks for
    clarifying!

@jaysondasher
Contributor Author

Added the InputPrompt tests. They cover:

  • Ctrl+V with image in clipboard
  • No image in clipboard (no-op)
  • Image save failure handling
  • Cursor position insertion with proper spacing
  • Error handling during clipboard operations
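
The "cursor position insertion with proper spacing" case can be pictured with a small pure function. This is a sketch under assumptions: `insertAtOffset` is a hypothetical helper that works on a plain string and a UTF-16 offset, not the TextBuffer method used in the PR, and the spacing rule (pad with a space only where the neighbouring character is not whitespace) is a guess at the intended behavior.

```typescript
// Hypothetical helper; not the TextBuffer API from the PR.
// Inserts `insertion` at `offset`, adding a single space on either side
// when the adjacent character exists and is not already whitespace.
function insertAtOffset(
  text: string,
  offset: number,
  insertion: string,
): { text: string; cursor: number } {
  // Clamp the offset so out-of-range values insert at the start or end.
  const clamped = Math.max(0, Math.min(offset, text.length));
  const before = text.slice(0, clamped);
  const after = text.slice(clamped);

  const needsLeadingSpace = before.length > 0 && !/\s$/.test(before);
  const needsTrailingSpace = after.length > 0 && !/^\s/.test(after);
  const padded =
    (needsLeadingSpace ? " " : "") + insertion + (needsTrailingSpace ? " " : "");

  // The cursor lands immediately after the inserted (padded) text.
  return { text: before + padded + after, cursor: before.length + padded.length };
}
```

Returning the new cursor position alongside the text keeps the caller from having to recompute where the cursor should land after the paste.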

All tests passing

Contributor

Please avoid timeouts in tests, as slow tests slow everyone down. Can you use await wait(1) if a wait is really required for this to pass?

Contributor

comment applies to all wait calls added in the file

Contributor

avoid wait(100)

Contributor

remove this comment

jacob314 previously approved these changes Jul 2, 2025
Contributor

@jacob314 jacob314 left a comment

lgtm. Only small issue is to replace the wait(100) calls with wait(1) or better yet wait() calls in the tests

@jaysondasher
Contributor Author

@jacob314 I replaced all wait(100) calls with wait() and removed the outdated comment you highlighted above.

Based on PR feedback, simplifying the implementation by:
- Removing [Image #N] display formatting - now shows raw file paths
- Removing complex cursor mapping logic
- Using TextBuffer's replaceRangeByOffset for proper cursor position insertion
- Removing placeholder text modification
- Removing messageFormatting.ts as it's no longer needed

This addresses reviewer concerns about complex display logic causing bugs
with arrow key navigation and autocomplete. The simpler approach shows
raw @ command paths which works well with existing functionality.

Based on PR feedback from jacob314 and security review:

- Add support for multiple image formats (PNG, JPEG, TIFF, GIF)
- Fix JPEG detection by matching "JPEG picture" in clipboard info
- Add basic test coverage for clipboard utilities
- Run code formatter to fix whitespace issues
- Improve error handling for different image formats

The implementation now tries each format in order until one succeeds,
making the feature more robust for different clipboard content types.
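
The "try each format in order until one succeeds" strategy can be sketched as a small loop. This is only an illustration under assumptions: `saveFirstSupported` and the `trySave` callback are hypothetical names, and the callback stands in for whatever per-format clipboard read the PR actually performs.

```typescript
type ImageFormat = "PNG" | "JPEG" | "TIFF" | "GIF";

// Tries each format in order and returns the first saved path.
// A null result or a thrown error counts as a failed attempt.
async function saveFirstSupported(
  formats: readonly ImageFormat[],
  trySave: (format: ImageFormat) => Promise<string | null>,
): Promise<string | null> {
  for (const format of formats) {
    try {
      const savedPath = await trySave(format);
      if (savedPath !== null) {
        return savedPath;
      }
    } catch {
      // Ignore the failure and fall through to the next format.
    }
  }
  // No format produced a file.
  return null;
}
```

Stopping at the first success avoids writing several copies of the same clipboard image in different formats.
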

Per jacob314's review feedback:
- Replace all wait(100) with wait() to speed up tests
- Remove outdated comment about setText implementation
- Remove timing comment on async wait

Git auto-resolved conflicts.
All clipboard functionality is preserved: handleClipboardImage and the Ctrl+V handler are still implemented properly, and all tests pass.
@jaysondasher jaysondasher force-pushed the feature/clipboard-image-paste branch from 0edb112 to db57457 Compare July 9, 2025 14:10
Prettier formatting removed trailing newlines from:
- packages/cli/src/ui/components/InputPrompt.tsx
- packages/cli/src/ui/components/InputPrompt.test.tsx
@jaysondasher
Contributor Author

@jacob314, I accidentally rebased against my own fork's main branch instead of upstream/main, which created duplicate commits in the history. I have successfully updated my branch with upstream/main and fixed merge conflicts. Ready for review!

* Checks if the system clipboard contains an image (macOS only for now)
* @returns true if clipboard contains an image
*/
export async function clipboardHasImage(): Promise<boolean> {

powershell -command "Add-Type -AssemblyName System.Windows.Forms; if ([System.Windows.Forms.Clipboard]::ContainsImage()) { Write-Output 'true' } else { Write-Output 'false' }"

This command can be used on the Windows platform to detect whether the clipboard contains an image, and it will return a result of either "true" or "false" after execution. Could the Windows platform be included in this submission as well? Thank you for your assistance.

Contributor Author

Thank you for the suggestion, @youaodu ! I've attempted to implement Windows support using the PowerShell command you provided. However, I wasn't able to get it working properly during testing on my Windows machine. I don't have much experience with Windows development or PowerShell, so I may be missing something obvious.

The implementation is in the latest commit if you'd like to review it. I'm happy to work with you to get this working if you can provide additional guidance, or feel free to make adjustments directly. The macOS implementation is working well and ready for review.

Hi, thank you for adopting my suggestion, @jaysondasher, but unfortunately Windows may not be compatible with this function. Please delete the Windows-related code. After multiple tests, I found that the terminal on Windows runs in text mode, so when non-text content is pasted the terminal ignores the paste event, and this code is never triggered.

      // Ctrl+V for clipboard image paste
      if (key.ctrl && key.name === 'v') {
        console.error('[DEBUG] Ctrl+V detected in InputPrompt');
        handleClipboardImage();
        return;
      }

One feasible approach is to poll and monitor changes in the clipboard. When a copied image is detected, it is automatically pasted into the inputPrompt box. However, this undermines user initiative, so I do not recommend it. I have implemented a version, and I can submit it if needed.

Contributor Author

Okay, @youaodu, I have reset the branch to the commit before I tried to implement the Windows functionality. Feel free to implement your version. I'm not sure whether you would add it to this PR directly, or whether this PR will be approved for macOS only and you would open a new PR for Windows. Let me know what the best path forward is. I believe this branch is functional and ready to go for macOS, though.

Thank you for your contribution. I will add the shortcut key Alt+V based on yours, so that the function of pasting pictures from the clipboard can be realized in Windows.
#4107

@jaysondasher jaysondasher force-pushed the feature/clipboard-image-paste branch from b044690 to 4e8b293 Compare July 11, 2025 15:34
Contributor

@jacob314 jacob314 left a comment

lgtm

Contributor

@scottdensmore scottdensmore left a comment

Approved

@scottdensmore scottdensmore added this pull request to the merge queue Jul 12, 2025
Merged via the queue into google-gemini:main with commit c9e194e Jul 12, 2025
10 checks passed
@jaysondasher jaysondasher deleted the feature/clipboard-image-paste branch July 12, 2025 12:24
JunYang-tes pushed a commit to JunYang-tes/gemini-cli.nvim that referenced this pull request Aug 9, 2025
involvex pushed a commit to involvex/gemini-cli that referenced this pull request Sep 11, 2025
reconsumeralization pushed a commit to reconsumeralization/gemini-cli that referenced this pull request Sep 19, 2025
Labels

area/agent: Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality
priority/p2: Important but can be addressed in a future release.

Development

Successfully merging this pull request may close these issues.

[Enhancement] Image paste support

7 participants