feat: Add clipboard image paste support for macOS #1580
scottdensmore merged 15 commits into google-gemini:main from
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
|
Thanks for the pull request @jaysondasher! I'll review as soon as you sign the CLA. |
|
Reviewing now. |
I'd suggest we don't include this in the placeholder unless @miguelsolorio has some good ideas of how to make this read well.
This needs to factor in the cursor location. textBuffer has an existing helper for inserting text; using it will help with this.
nit: remove "(like Claude Code)". Ctrl+V to paste is quite standard.
|
Bugs: pressing the left arrow while over the [Image #1] text causes confusing autocomplete behavior that makes it feel like Gemini CLI has hung. We should fix this so that delete removes the entire [Image #1] tag and does not trigger autocomplete. |
can you use fileUtils.ts which has a detectFileType util that already detects images?
this logic should move to text-buffer.ts and operate on the lines before wrapping.
Let's also be sure to implement this in a way that lets us support highlighting blocks like [Image 1] with custom colors in the future. I'd suggest we move these complex changes to a second PR and land this one first, supporting only a paste that adds a user-visible @some/path/to/image.png to the input prompt.
moving to text-buffer.ts will also make it easy to test this as that part of the code base has reasonable unit tests.
|
Thank you for your review, @jacob314.
The simplified approach works seamlessly with the existing @ command system and avoids all the edge cases you identified. Images still paste with Ctrl+V and Gemini can read them perfectly. I will leave advanced display formatting of pasted images for a later PR, when there is more time for a more robust implementation. |
|
/gemini review |
Code Review
This pull request introduces clipboard image pasting functionality for macOS. A critical security vulnerability exists due to command injection in the AppleScript execution, and a functional issue limits support to only PNG images. The review includes a code suggestion addressing both issues by properly escaping the file path and using a more robust AppleScript.
The tempFilePath is embedded directly into the AppleScript string without escaping, creating a command injection risk. If the path contains special characters (e.g., a double quote "), a maliciously crafted directory name could break out of the string and inject arbitrary AppleScript, leading to arbitrary code execution. Also, the current script only attempts to read PNG data (as «class PNGf»), but clipboardHasImage() checks for PNG, TIFF, and JPEG, creating an inconsistency where the feature will fail for common image formats like JPEG, even when clipboardHasImage() returns true. I recommend replacing the current implementation with a more robust approach that addresses both problems by using Image Events to handle various image formats and properly escaping the file path to mitigate the injection vulnerability.
const tempFilePath = path.join(tempDir, `clipboard-${timestamp}.png`);
const safeTempFilePath = tempFilePath.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
// Use a more robust AppleScript with Image Events to handle various image types
const script = `
  try
    tell application "Image Events"
      launch
      set this_image to (the clipboard as picture)
      save this_image as PNG in (POSIX file "${safeTempFilePath}")
    end tell
    return "success"
  on error
    return "error"
  end try
`;
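The escaping step in the suggestion above can be isolated into a small helper so it is easy to unit test. This is only a sketch: the name `escapeForAppleScript` is hypothetical and not part of the PR. It applies the same two replacements as the suggestion, backslashes first and then double quotes, so that the backslashes added for quotes are not escaped a second time.

```typescript
/**
 * Hypothetical helper (not from the PR): escapes a value so it can be placed
 * inside a double-quoted AppleScript string literal.
 * Backslashes must be escaped before quotes; otherwise the backslash added
 * in front of each quote would itself be doubled.
 */
export function escapeForAppleScript(value: string): string {
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
}

// Example: a path containing a double quote can no longer close the literal early.
const unsafePath = '/tmp/evil"dir/clipboard.png';
const literal = `POSIX file "${escapeForAppleScript(unsafePath)}"`;
console.log(literal);
```

Escaping covers the string-literal breakout described in the review; whether it is sufficient for every path the CLI may encounter is not something this sketch establishes.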
|
Sorry for the slow reply. I missed a notification that you had replied. Code looks good now. Please add a basic test and then I will approve. |
|
@jacob314 I have added a basic test, run the code formatter, as well as improved the image type handling. |
why does mac return a string?
sorry to be clear, please add tests for this logic added to InputPrompt then LGTM.
|
|
Added the InputPrompt tests. They cover:
All tests passing |
Please avoid timeouts in tests, as slow tests slow everyone down. Can you use await wait(1) if a wait is really required for this to pass?
This comment applies to all wait calls added in the file.
jacob314 left a comment
lgtm. Only small issue is to replace the wait(100) calls with wait(1) or better yet wait() calls in the tests
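For readers unfamiliar with the pattern, a `wait` test helper of the kind discussed here might look like the sketch below. It is an assumption about the helper's shape, not the repository's actual utility: with the default of 0 ms it only yields to the event loop so pending callbacks can run, which is why `wait()` keeps tests fast compared with `wait(100)`.

```typescript
/**
 * Hypothetical test helper (the real utility in the repo may differ).
 * Resolves after `ms` milliseconds; the default of 0 simply yields to the
 * event loop so queued timers and I/O callbacks get a chance to run.
 */
export function wait(ms = 0): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Usage inside an async test body:
//   triggerPaste();
//   await wait(); // let pending async work settle without a fixed delay
```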
|
@jacob314 I replaced all wait(100) calls with wait() and removed the outdated comment you highlighted above. |
Based on PR feedback, simplifying the implementation by:
- Removing [Image #N] display formatting (now shows raw file paths)
- Removing complex cursor mapping logic
- Using TextBuffer's replaceRangeByOffset for proper cursor position insertion
- Removing placeholder text modification
- Removing messageFormatting.ts as it's no longer needed
This addresses reviewer concerns about complex display logic causing bugs with arrow key navigation and autocomplete. The simpler approach shows raw @ command paths, which works well with existing functionality.
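The cursor-aware insertion described in this commit can be illustrated on a plain string. The sketch below does not use TextBuffer or its replaceRangeByOffset method; `insertImageReference` and `InsertResult` are hypothetical names, and the space-padding rule is an illustrative choice rather than something the PR specifies. Offsets are counted in UTF-16 code units, as JavaScript strings are.

```typescript
interface InsertResult {
  text: string; // buffer contents after the insertion
  cursor: number; // offset just past the inserted reference
}

/**
 * Hypothetical sketch: inserts an "@path" reference at the cursor offset,
 * padding with a space on either side only when the neighbouring character
 * is not already whitespace.
 */
export function insertImageReference(
  text: string,
  offset: number,
  imagePath: string,
): InsertResult {
  // Clamp so an out-of-range cursor cannot corrupt the buffer.
  const at = Math.max(0, Math.min(offset, text.length));
  const before = text.slice(0, at);
  const after = text.slice(at);
  const lead = before.length > 0 && !/\s$/.test(before) ? ' ' : '';
  const trail = after.length > 0 && !/^\s/.test(after) ? ' ' : '';
  const insertion = `${lead}@${imagePath}${trail}`;
  return { text: before + insertion + after, cursor: at + insertion.length };
}

// Cursor sits right after "describe" in "describe this".
const result = insertImageReference('describe this', 8, 'img.png');
console.log(result.text); // describe @img.png this
```

Returning the new cursor offset alongside the text keeps the caller from having to recompute where the caret should land after the paste.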
Based on PR feedback from jacob314 and security review:
- Add support for multiple image formats (PNG, JPEG, TIFF, GIF)
- Fix JPEG detection by matching "JPEG picture" in clipboard info
- Add basic test coverage for clipboard utilities
- Run code formatter to fix whitespace issues
- Improve error handling for different image formats
The implementation now tries each format in order until one succeeds, making the feature more robust for different clipboard content types.
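The detection step in this commit can be sketched as a pure function over the text that AppleScript's `clipboard info` command returns. The marker strings other than "JPEG picture" and «class PNGf» (both named in this conversation) are assumptions about that output, and `detectClipboardImageFormats` is a hypothetical name, not the PR's code. The returned list is in preference order, so a caller could try saving each format in turn until one succeeds.

```typescript
// Assumed markers as they appear in `clipboard info` output, in the order
// a caller would try them. Only the PNG and JPEG markers are confirmed by
// the discussion above; the TIFF and GIF strings are assumptions.
const IMAGE_MARKERS: ReadonlyArray<{ format: string; marker: string }> = [
  { format: 'png', marker: '«class PNGf»' },
  { format: 'jpeg', marker: 'JPEG picture' },
  { format: 'tiff', marker: 'TIFF picture' },
  { format: 'gif', marker: 'GIF picture' },
];

/**
 * Hypothetical sketch: returns the image formats present in the clipboard,
 * given the raw `clipboard info` text, in preference order.
 */
export function detectClipboardImageFormats(clipboardInfo: string): string[] {
  return IMAGE_MARKERS.filter(({ marker }) => clipboardInfo.includes(marker)).map(
    ({ format }) => format,
  );
}

// A text-only clipboard yields no formats, so clipboardHasImage-style checks
// reduce to `formats.length > 0`.
const formats = detectClipboardImageFormats('«class PNGf», 2048, TIFF picture, 4096');
console.log(formats);
```

Keeping the parsing separate from the `osascript` call means the logic can be tested on any platform, without a real clipboard.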
Per jacob314's review feedback:
- Replace all wait(100) with wait() to speed up tests
- Remove outdated comment about setText implementation
- Remove timing comment on async wait
Git auto-resolved the conflicts. All clipboard functionality is preserved, handleClipboardImage and the Ctrl+V handler are still implemented properly, and all tests pass.
0edb112 to db57457
Prettier formatting removed trailing newlines from:
- packages/cli/src/ui/components/InputPrompt.tsx
- packages/cli/src/ui/components/InputPrompt.test.tsx
|
@jacob314, I accidentally rebased against my own fork's main branch instead of upstream/main, which created duplicate commits in the history. I have successfully updated my branch with upstream/main and fixed merge conflicts. Ready for review! |
/**
 * Checks if the system clipboard contains an image (macOS only for now)
 * @returns true if clipboard contains an image
 */
export async function clipboardHasImage(): Promise<boolean> {
powershell -command "Add-Type -AssemblyName System.Windows.Forms; if ([System.Windows.Forms.Clipboard]::ContainsImage()) { Write-Output 'true' } else { Write-Output 'false' }"
This command can be used on the Windows platform to detect whether the clipboard contains an image, and it will return a result of either "true" or "false" after execution. Could the Windows platform be included in this submission as well? Thank you for your assistance.
Thank you for the suggestion, @youaodu ! I've attempted to implement Windows support using the PowerShell command you provided. However, I wasn't able to get it working properly during testing on my Windows machine. I don't have much experience with Windows development or PowerShell, so I may be missing something obvious.
The implementation is in the latest commit if you'd like to review it. I'm happy to work with you to get this working if you can provide additional guidance, or feel free to make adjustments directly. The macOS implementation is working well and ready for review.
Hi @jaysondasher, thank you for adopting my suggestions, but unfortunately Windows may not be compatible with this function. Please delete the Windows-related code. After multiple tests and verifications, I found that the Windows terminal, which runs in text mode, automatically ignores paste events for non-text formats, so this code is never triggered.
// Ctrl+V for clipboard image paste
if (key.ctrl && key.name === 'v') {
  console.error('[DEBUG] Ctrl+V detected in InputPrompt');
  handleClipboardImage();
  return;
}
One feasible approach is to poll and monitor changes in the clipboard. When a copied image is detected, it is automatically pasted into the inputPrompt box. However, this undermines user initiative, so I do not recommend it. I have implemented a version, and I can submit it if needed.
Okay, @youaodu, I have reset the branch back to the commit before I tried to implement the Windows functionality. Feel free to implement your version for that. I'm not sure whether you would add it manually to this PR, or whether this PR will be approved for just the macOS functionality and you would create a new PR for the Windows functionality. Let me know what the best path forward is. I believe this branch is functional and ready to go for macOS, though.
Thank you for your contribution. I will build on your work and add an Alt+V shortcut so that pasting images from the clipboard also works on Windows.
#4107
b044690 to 4e8b293
Co-authored-by: Jacob Richman <[email protected]> Co-authored-by: Scott Densmore <[email protected]>

TLDR
Adds clipboard image pasting functionality for macOS using Ctrl+V. Images are automatically saved as temporary files and inserted as @ commands with clean [Image #N] display formatting.
Detailed Discussion
This PR implements clipboard image paste support to address issue #1452. The feature allows users to paste images directly from their clipboard using Ctrl+V, making it easier to share screenshots and images with Gemini.
Key Features:
- [Image #1] instead of long file paths in the UI
Technical Implementation:
- clipboardUtils.ts: Handles clipboard detection and image saving using AppleScript
- messageFormatting.ts: Formats display text to show clean image references
- InputPrompt.tsx: Adds Ctrl+V handler and display formatting
- UserMessage.tsx: Applies formatting to message history
- .gemini-clipboard/ within project directory
Platform Support:
Currently macOS only due to AppleScript dependency. Future work could add Linux/Windows support using platform-specific clipboard APIs.
Reviewer Test Plan
- [Image #1] in both input field and message history
- [Image #1] [Image #2]
Testing Matrix
Platforms Tested:
Installation Methods Tested:
Fixes #1452