tan-yong-sheng/ai-vision-mcp

v0.0.7 Breaking

This release includes breaking changes for platform teams planning a safe upgrade.

Published 3mo · MCP Developer Tools

✓ No known CVEs patched

Summary

AI summary

Added Vertex AI support for local files, dynamic prompt modes in image analysis, and size‑aware video handling.

Full changelog

v0.0.7 Release Notes

Overview

Key Features

🔄 Provider Architecture Improvements

Shared image source handling across providers — Unified image processing utilities between Gemini and Vertex AI providers, eliminating code duplication and ensuring consistent behavior
Shared video source handling across providers — Extracted video processing logic for HTTP URLs, local files, and YouTube videos, now available to both providers
VertexAI HTTP/local file support — Vertex AI provider now supports HTTP URLs and local file uploads (previously Gemini-only)
Provider env loading warnings — Better diagnostics when providers are misconfigured or missing credentials
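A shared source handler of the kind described above has to decide what a source string is before any provider-specific code runs. The following is a minimal sketch of that routing step, not the project's actual implementation: the names `MediaSourceKind` and `classifySource` are hypothetical, and the detection rules (data URI prefix, host match for YouTube, everything else treated as a local path) are assumptions made for illustration.

```typescript
// Hypothetical sketch: names and detection rules are assumptions,
// not taken from the ai-vision-mcp source code.
type MediaSourceKind = "youtube" | "http" | "base64" | "local";

function classifySource(source: string): MediaSourceKind {
  const trimmed = source.trim();

  // Data URIs such as "data:image/png;base64,...." carry inline content.
  if (/^data:[^;,]+;base64,/i.test(trimmed)) {
    return "base64";
  }

  // Extract the host of an http(s) URL. This simple pattern ignores
  // userinfo ("user@host"), which is enough for a sketch.
  const match = /^https?:\/\/([^\/:?#]+)/i.exec(trimmed);
  if (match) {
    const host = match[1].toLowerCase();
    const isYouTube =
      host === "youtu.be" ||
      host === "youtube.com" ||
      host.endsWith(".youtube.com");
    return isYouTube ? "youtube" : "http";
  }

  // Anything else is treated as a path on the local file system.
  return "local";
}
```

Because both providers would call the same classifier, a source accepted by one is interpreted identically by the other, which is the consistency benefit the release notes point to.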

📹 Video Processing Enhancements

Size-aware video handling — Intelligent video processing based on file size
50MB inline threshold — Videos under 50MB are sent inline; larger videos use file upload APIs for better reliability
Improved reliability — Enhanced error handling and security in file operations
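The 50MB rule above amounts to a single size comparison. Here is a minimal sketch of it under stated assumptions: the names `INLINE_LIMIT_BYTES`, `VideoDelivery`, and `chooseVideoDelivery` are hypothetical, 1 MB is taken to mean 1024 × 1024 bytes, and a file of exactly 50MB is routed to the upload path. The release notes do not specify either of those last two details.

```typescript
// Hypothetical sketch: the constant and function names are illustrative only.
// Assumption: 1 MB = 1024 * 1024 bytes (the release notes do not say).
const INLINE_LIMIT_BYTES = 50 * 1024 * 1024;

type VideoDelivery = "inline" | "file-upload";

function chooseVideoDelivery(sizeInBytes: number): VideoDelivery {
  if (!Number.isFinite(sizeInBytes) || sizeInBytes < 0) {
    throw new RangeError(`Invalid video size: ${sizeInBytes}`);
  }
  // Strictly under the limit goes inline; the boundary case and anything
  // larger use the file upload path (an assumption for the boundary).
  return sizeInBytes < INLINE_LIMIT_BYTES ? "inline" : "file-upload";
}
```

For a local file in Node.js, the size could be read with `fs.statSync(path).size` before calling the function. For a remote URL, the size would have to come from somewhere else, such as a `Content-Length` header, which servers do not always provide.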

🎨 Analysis Capabilities

Conditional system prompts — Context-aware prompts for all analysis modes (palette, hierarchy, components)
Dynamic prompt modes — analyze_image tool now supports multiple analysis modes for specialized outputs
Updated default model — Switched to gemini-3.1-flash-lite-preview for better performance and cost efficiency
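One way to picture conditional system prompts is a lookup from analysis mode to prompt text, with a general fallback when no mode is given. The sketch below assumes exactly that design. The mode names come from the release notes, but `MODE_PROMPTS`, `GENERAL_PROMPT`, `systemPromptFor`, and all prompt wording are hypothetical and do not reflect the server's real prompts.

```typescript
// Mode names are from the release notes; everything else here is a
// hypothetical illustration, including the prompt wording.
type AnalysisMode = "palette" | "hierarchy" | "components";

const MODE_PROMPTS: Record<AnalysisMode, string> = {
  palette: "List the dominant colors in the image with approximate hex values.",
  hierarchy: "Describe the visual hierarchy: what draws attention first, second, third.",
  components: "Enumerate the UI components visible in the image and their roles.",
};

const GENERAL_PROMPT = "Describe the image accurately and concisely.";

function systemPromptFor(mode?: string): string {
  // Use an own-property check so inherited keys such as "toString"
  // are not mistaken for valid modes.
  if (
    mode !== undefined &&
    Object.prototype.hasOwnProperty.call(MODE_PROMPTS, mode)
  ) {
    return MODE_PROMPTS[mode as AnalysisMode];
  }
  // Unknown or missing modes fall back to the general prompt.
  return GENERAL_PROMPT;
}
```

Falling back instead of throwing keeps calls that omit a mode working unchanged, which matches the backward-compatibility claim in these notes. A stricter server might reject unknown modes instead.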

Technical Details

Breaking Changes

None — this is a fully backward-compatible release.

Dependencies

Updated @google-cloud/storage handling for improved stability
No new external dependencies added

Testing

All 10 test cases pass with both Gemini and Vertex AI providers:

✅ Image analysis (URL, local file, base64)
✅ Image comparison (2 and 3 images)
✅ Object detection (URL, local file)
✅ Video analysis (remote URL, local file, YouTube)

Migration Guide

No migration needed. Existing code continues to work without changes. To take advantage of new features:

Use Vertex AI for local files — Vertex AI now supports local file uploads like Gemini
Leverage dynamic prompt modes — Use options.mode in analyze_image for specialized analysis
Benefit from improved video handling — Large videos are now handled more reliably

What's Next

Future releases will focus on:

Additional analysis modes and specialized outputs
Performance optimizations for batch operations
Enhanced error recovery and retry logic

About tan-yong-sheng/ai-vision-mcp

🪟 - Multimodal AI vision MCP server for image, video, and object detection analysis. Enables UI/UX evaluation, visual regression testing, and interface understanding using Google Gemini and Vertex AI.
