This release includes breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
Summary
AI summary: Added Vertex AI support for local files, dynamic prompt modes in image analysis, and size-aware video handling.
Full changelog
v0.0.7 Release Notes
Overview
Key Features
🔄 Provider Architecture Improvements
- Shared image source handling across providers — Unified image processing utilities between Gemini and Vertex AI providers, eliminating code duplication and ensuring consistent behavior
- Shared video source handling across providers — Extracted video processing logic for HTTP URLs, local files, and YouTube videos, now available to both providers
- VertexAI HTTP/local file support — Vertex AI provider now supports HTTP URLs and local file uploads (previously Gemini-only)
- Provider env loading warnings — Better diagnostics when providers are misconfigured or missing credentials
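As a rough illustration of what shared source handling could look like, the sketch below classifies an input string once so that either provider can branch on the result. The `SourceKind` type, the `classifySource` function, and the matching rules are hypothetical and are not taken from the project's code.

```typescript
// Hypothetical source kinds; names are illustrative, not the project's own.
type SourceKind = "youtube" | "http" | "base64" | "local";

// Assumed matching rule for YouTube links (www., m., and youtu.be short links).
const YOUTUBE_URL = /^https?:\/\/(www\.|m\.)?(youtube\.com|youtu\.be)(\/|$)/i;
const HTTP_URL = /^https?:\/\//i;

// Classify a media source so both providers can share one code path.
function classifySource(source: string): SourceKind {
  if (source.indexOf("data:") === 0) return "base64"; // data URI with inline payload
  if (YOUTUBE_URL.test(source)) return "youtube";     // check before generic HTTP
  if (HTTP_URL.test(source)) return "http";
  return "local";                                     // anything else is treated as a file path
}

const kinds: SourceKind[] = [
  classifySource("https://youtu.be/abc123"),
  classifySource("https://example.com/screenshot.png"),
  classifySource("./screens/home.png"),
];
```

Checking the YouTube pattern before the generic HTTP pattern matters here, because every YouTube link is also a valid HTTP URL.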
📹 Video Processing Enhancements
- Size-aware video handling — Intelligent video processing based on file size
- 50MB inline threshold — Videos under 50MB are sent inline; larger videos use file upload APIs for better reliability
- Improved reliability — Enhanced error handling and security in file operations
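The size-based routing described above can be sketched as a single decision function. This is a minimal sketch, not the project's implementation: the names `INLINE_THRESHOLD_BYTES` and `chooseVideoDelivery` are hypothetical, and it assumes that "50MB" means 50 × 1024 × 1024 bytes and that a video of exactly that size goes to file upload.

```typescript
// Assumption: "50MB" is read as 50 MiB. The release notes do not say which unit is meant.
const INLINE_THRESHOLD_BYTES = 50 * 1024 * 1024;

type VideoDelivery = "inline" | "file_upload";

// Decide how a video should be sent, given its size in bytes.
function chooseVideoDelivery(sizeBytes: number): VideoDelivery {
  if (!isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError("sizeBytes must be a non-negative finite number");
  }
  // Strictly under the threshold goes inline; everything else is uploaded.
  return sizeBytes < INLINE_THRESHOLD_BYTES ? "inline" : "file_upload";
}

const smallClip = chooseVideoDelivery(12 * 1024 * 1024);   // "inline"
const largeClip = chooseVideoDelivery(300 * 1024 * 1024);  // "file_upload"
```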
🎨 Analysis Capabilities
- Conditional system prompts — Context-aware prompts for all analysis modes (palette, hierarchy, components)
- Dynamic prompt modes — The `analyze_image` tool now supports multiple analysis modes for specialized outputs
- Updated default model — Switched to `gemini-3.1-flash-lite-preview` for better performance and cost efficiency
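One way to picture conditional system prompts is a lookup from analysis mode to prompt text, returning nothing when no mode is set. The mode names come from the list above. The prompt strings, the `MODE_PROMPTS` table, and the `buildSystemPrompt` function are placeholders invented for this sketch and do not reflect the prompts the project actually ships.

```typescript
// Mode names as listed in the release notes.
type AnalysisMode = "palette" | "hierarchy" | "components";

// Placeholder prompt text; the real prompts are not shown in the release notes.
const MODE_PROMPTS: Record<AnalysisMode, string> = {
  palette: "Focus on the color palette used in the image.",
  hierarchy: "Focus on the visual hierarchy of the layout.",
  components: "Focus on identifying the UI components present.",
};

// Return a mode-specific system prompt, or undefined when no mode is requested
// (assumption: the default analysis path adds no extra system prompt).
function buildSystemPrompt(mode?: AnalysisMode): string | undefined {
  return mode ? MODE_PROMPTS[mode] : undefined;
}

const palettePrompt = buildSystemPrompt("palette");
const defaultPrompt = buildSystemPrompt(); // undefined
```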
Technical Details
Breaking Changes
None — this is a fully backward-compatible release.
Dependencies
- Updated `@google-cloud/storage` handling for improved stability
- No new external dependencies added
Testing
All 10 test cases pass with both Gemini and Vertex AI providers:
- ✅ Image analysis (URL, local file, base64)
- ✅ Image comparison (2 and 3 images)
- ✅ Object detection (URL, local file)
- ✅ Video analysis (remote URL, local file, YouTube)
Migration Guide
No migration needed. Existing code continues to work without changes. To take advantage of new features:
- Use Vertex AI for local files — Vertex AI now supports local file uploads like Gemini
- Leverage dynamic prompt modes — Use `options.mode` in `analyze_image` for specialized analysis
- Benefit from improved video handling — Large videos are now handled more reliably
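For readers calling the server directly, a request that selects a mode might be assembled as in the sketch below. The `tools/call` envelope follows the MCP JSON-RPC convention, and `options.mode` comes from the notes above. The `imageSource` and `prompt` argument names and the `buildAnalyzeImageCall` helper are assumptions made for illustration, so check the tool's published input schema before relying on them.

```typescript
type AnalysisMode = "palette" | "hierarchy" | "components";

// Assumed argument shape: only options.mode is confirmed by the release notes.
interface AnalyzeImageArgs {
  imageSource: string; // hypothetical name: URL, local path, or base64 data
  prompt: string;      // hypothetical name: the question to ask about the image
  options?: { mode?: AnalysisMode };
}

interface ToolCallRequest {
  jsonrpc: string;
  id: number;
  method: string;
  params: { name: string; arguments: AnalyzeImageArgs };
}

// Build an MCP tools/call request for the analyze_image tool.
function buildAnalyzeImageCall(id: number, args: AnalyzeImageArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: "analyze_image", arguments: args },
  };
}

const request = buildAnalyzeImageCall(1, {
  imageSource: "./screens/checkout.png",
  prompt: "Describe the colors used on this page.",
  options: { mode: "palette" },
});
```

Omitting `options` leaves the call identical to a pre-v0.0.7 request, which matches the notes' statement that existing code keeps working.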
What's Next
Future releases will focus on:
- Additional analysis modes and specialized outputs
- Performance optimizations for batch operations
- Enhanced error recovery and retry logic
About tan-yong-sheng/ai-vision-mcp
🪟 - Multimodal AI vision MCP server for image, video, and object detection analysis. Enables UI/UX evaluation, visual regression testing, and interface understanding using Google Gemini and Vertex AI.