An MCP server that uses Google's Gemini AI for image generation and transformation. This studio offers two specialized tools: a 3D cartoon generator and an image processing transformer, both powered by the Gemini 2.0 Flash model.
- Generate high-quality 3D cartoon images from text descriptions
- Child-friendly designs with vibrant colors and engaging visuals
- Perfect for children's books, educational materials, and creative projects
- Transform existing images using Gemini AI's vision capabilities
- Apply various artistic styles and modifications
- Enhance, modify, or completely reimagine your images
- 🖼️ Automatic preview generation
- 🌐 Browser-based image viewing
- 💾 Local storage with organized output
- 🔄 Real-time processing
- 📱 Cross-platform support
# Clone the repository
git clone https://github.com/falahgs/gemini-vision-art-studio.git
# Install dependencies
cd gemini-vision-art-studio
npm install
- Project Configuration:
Create a .env file in the root directory:
GEMINI_API_KEY=your_api_key_here
# Set to true if running in a remote environment (no browser preview)
IS_REMOTE=true
- Claude Desktop Configuration:
Add the server configuration to your Claude Desktop config file at %AppData%\Claude\claude_desktop_config.json:
{
"mcpServers": {
"gemini-vision-art-studio": {
"command": "node",
"args": [
"PATH_TO_YOUR_PROJECT\\build\\src\\index.js"
],
"env": {
"GEMINI_API_KEY": "your_gemini_api_key_here",
"IS_REMOTE": "true"
}
}
}
}
Replace:
- PATH_TO_YOUR_PROJECT with your actual project path
- your_gemini_api_key_here with your Gemini API key
💡 Note: On Windows, the config file is typically located at:
C:\Users\YourUsername\AppData\Roaming\Claude\claude_desktop_config.json
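The configuration entry above can also be generated rather than edited by hand. The following is a minimal sketch, not part of the project: `buildClaudeConfig` and its interfaces are hypothetical names, and the example project path is an assumption. Only the server name, the `node` command, the `build\src\index.js` entry point, and the two environment variables come from the configuration shown above.

```typescript
// Hypothetical helper (not part of the project): builds the "mcpServers"
// entry in the same shape as the hand-written configuration.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface ClaudeDesktopConfig {
  mcpServers: Record<string, McpServerEntry>;
}

function buildClaudeConfig(
  projectPath: string,
  apiKey: string,
  isRemote: boolean,
): ClaudeDesktopConfig {
  // Strip trailing slashes or backslashes so the joined path has one separator.
  const root = projectPath.replace(/[\\\/]+$/, "");
  return {
    mcpServers: {
      "gemini-vision-art-studio": {
        command: "node",
        // Windows-style separators, matching the example configuration.
        args: [`${root}\\build\\src\\index.js`],
        env: {
          GEMINI_API_KEY: apiKey,
          // Environment values are strings, so the flag is serialized.
          IS_REMOTE: String(isRemote),
        },
      },
    },
  };
}

// Example usage with an assumed project location.
const config = buildClaudeConfig(
  "C:\\Projects\\gemini-vision-art-studio\\",
  "your_gemini_api_key_here",
  true,
);
console.log(JSON.stringify(config, null, 2));
```

The printed JSON can be pasted into claude_desktop_config.json; `JSON.stringify` takes care of escaping the backslashes in the Windows path.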
When running the server remotely:
- Set IS_REMOTE=true in your environment or Claude Desktop configuration
- The server will:
  - Create necessary directories automatically:
    - /app/output: For generated images and previews
    - /app/temp: For temporary processing files
  - Skip browser preview attempts
  - Save all files to the /app/output directory
  - Return absolute file paths in the response
- Directory Structure in Remote Mode:
/app/
├── output/          # Generated images and previews
│   ├── image1.png
│   └── image1_preview.html
└── temp/            # Temporary processing files
- Troubleshooting Remote Usage:
  - Ensure the /app directory exists and is writable
  - Check the console output for directory creation messages
  - Look for "Image saved to:" messages in the logs
  - File paths in the response will be absolute paths
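The remote-mode behavior described above can be sketched as a small path-resolution routine. This is an illustration under stated assumptions, not the server's actual code: `resolveLayout` and `outputFiles` are hypothetical names, the rule that only the string "true" enables remote mode is an assumption, and so is the local layout (output/ and temp/ under the project root). The /app/output and /app/temp paths and the `_preview.html` naming come from the remote directory structure listed above.

```typescript
// Environment variables are passed in as a plain object so the logic
// can be exercised without touching the real process environment.
type Env = Record<string, string | undefined>;

interface OutputLayout {
  remote: boolean;
  outputDir: string;
  tempDir: string;
}

// Hypothetical helper: picks the output and temp directories for the current mode.
function resolveLayout(env: Env, projectRoot: string): OutputLayout {
  // Assumption: only the string "true" (case-insensitive) enables remote mode.
  const remote = (env["IS_REMOTE"] ?? "").trim().toLowerCase() === "true";
  // Remote paths are the documented ones; the local layout is an assumption.
  const base = remote ? "/app" : projectRoot.replace(/\/+$/, "");
  return { remote, outputDir: `${base}/output`, tempDir: `${base}/temp` };
}

// Hypothetical helper: absolute paths for an image and its HTML preview.
function outputFiles(layout: OutputLayout, fileName: string) {
  return {
    image: `${layout.outputDir}/${fileName}.png`,
    preview: `${layout.outputDir}/${fileName}_preview.html`,
  };
}

// Example usage: remote mode ignores the project root.
const layout = resolveLayout({ IS_REMOTE: "true" }, "/home/user/gemini-vision-art-studio");
console.log(outputFiles(layout, "image1"));
```

Keeping the environment as a parameter makes it easy to check both modes side by side when a remote deployment writes files somewhere unexpected.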
- Build the project:
npm run build
- The server becomes available in Claude Desktop automatically:
  - Open Claude Desktop
  - Start a new conversation
  - The tools appear in the available tools list
Creates a 3D-style cartoon image from your text description.
{
"name": "generate_3d_cartoon",
"arguments": {
"prompt": "A friendly dragon teaching math to forest animals",
"fileName": "dragon_teacher"
}
}
Transforms existing images according to your instructions.
{
"name": "process_image",
"arguments": {
"imagePath": "input/photo.jpg",
"prompt": "Transform this into a watercolor painting with autumn colors",
"outputFileName": "watercolor_autumn"
}
}
gemini-vision-art-studio/
├── src/ # Source code
├── build/ # Compiled code
├── input/ # Input images
├── output/ # Generated images and previews
├── temp/ # Temporary processing files
└── examples/ # Example usage and images
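For clients that talk to the server directly rather than through Claude Desktop, the generate_3d_cartoon and process_image payloads shown earlier map onto an MCP `tools/call` request, which is a JSON-RPC 2.0 message. The sketch below is illustrative only: the type names and `toRequest` are hypothetical, and it treats every argument from the examples as a required string, which the source does not confirm.

```typescript
// Argument shapes copied from the README examples (assumed required).
interface Generate3dCartoonArgs {
  prompt: string;
  fileName: string;
}

interface ProcessImageArgs {
  imagePath: string;
  prompt: string;
  outputFileName: string;
}

// A discriminated union keeps each tool name paired with its own arguments.
type ToolCall =
  | { name: "generate_3d_cartoon"; arguments: Generate3dCartoonArgs }
  | { name: "process_image"; arguments: ProcessImageArgs };

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolCall;
}

// Hypothetical helper: wraps a tool call in a JSON-RPC 2.0 envelope.
function toRequest(id: number, call: ToolCall): ToolCallRequest {
  // Both tools take a prompt; reject blank ones before sending anything.
  if (call.arguments.prompt.trim() === "") {
    throw new Error(`${call.name}: prompt must not be empty`);
  }
  return { jsonrpc: "2.0", id, method: "tools/call", params: call };
}

// Example usage with the dragon prompt from the README.
const request = toRequest(1, {
  name: "generate_3d_cartoon",
  arguments: {
    prompt: "A friendly dragon teaching math to forest animals",
    fileName: "dragon_teacher",
  },
});
console.log(JSON.stringify(request, null, 2));
```

Typing the calls as a union means a mismatched pairing, such as `process_image` without `imagePath`, is caught at compile time.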
- Runtime: Node.js v14+
- Language: TypeScript 5.8.3
- AI Model: Gemini 2.0 Flash
- Framework: Model Context Protocol (MCP) SDK
- Image Processing: Google Generative AI
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Falah G. Salieh
- Copyright © 2025
- GitHub: @falahgs
- Google Gemini AI team for the powerful image generation model
- The MCP SDK team for the excellent tooling
- All contributors and users of this project
Made with ❤️ by Falah G. Salieh