Published 2mo MCP Developer Tools
✓ No known CVEs patched in this version

Summary

AI summary

Added Vertex AI support for local files, dynamic prompt modes in image analysis, and size‑aware video handling.

Full changelog

v0.0.7 Release Notes

Overview

Key Features

🔄 Provider Architecture Improvements

  • Shared image source handling across providers — Unified image processing utilities between Gemini and Vertex AI providers, eliminating code duplication and ensuring consistent behavior
  • Shared video source handling across providers — Extracted video processing logic for HTTP URLs, local files, and YouTube videos, now available to both providers
  • VertexAI HTTP/local file support — Vertex AI provider now supports HTTP URLs and local file uploads (previously Gemini-only)
  • Provider env loading warnings — Better diagnostics when providers are misconfigured or missing credentials
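A shared source handler has to decide what kind of input it was given before either provider can act on it. The sketch below shows one way that decision could look. `SourceKind` and `classifySource` are hypothetical names chosen for illustration, and the host matching is deliberately simple; this is not the project's actual implementation.

```typescript
// Hypothetical sketch: none of these names come from ai-vision-mcp itself.
type SourceKind = "youtube" | "http" | "base64" | "local";

// Classify a media source string so that both providers can share one code path.
function classifySource(source: string): SourceKind {
  // Data URIs carry base64 payloads inline.
  if (/^data:/i.test(source)) return "base64";

  // Capture the host of an http(s) URL; anything else is treated as a local path.
  const match = /^https?:\/\/([^\/?#:]+)/i.exec(source);
  if (match === null) return "local";

  // Normalize common YouTube host variants (www. and m. prefixes).
  const host = match[1].toLowerCase().replace(/^(www|m)\./, "");
  return host === "youtube.com" || host === "youtu.be" ? "youtube" : "http";
}

console.log(classifySource("https://youtu.be/abc123"));
console.log(classifySource("./screenshots/home.png"));
```

Keeping this decision in one function is what lets two providers behave the same way for the same input.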

📹 Video Processing Enhancements

  • Size-aware video handling — Intelligent video processing based on file size
  • 50MB inline threshold — Videos under 50MB are sent inline; larger videos use file upload APIs for better reliability
  • Improved reliability — Enhanced error handling and security in file operations
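The 50MB rule above comes down to a single comparison. The following is a minimal sketch under two assumptions: that "50MB" means 50 MiB, and that a file of exactly the threshold size goes to the upload path (the notes do not say which side the boundary falls on). `chooseVideoDelivery` and `INLINE_LIMIT_BYTES` are hypothetical names.

```typescript
// Assumption: "50MB" is interpreted as 50 MiB; the release notes do not specify.
const INLINE_LIMIT_BYTES = 50 * 1024 * 1024;

type VideoDelivery = "inline" | "file-upload";

// Decide how a video of the given size would be sent to the provider.
function chooseVideoDelivery(sizeBytes: number): VideoDelivery {
  if (!isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError("sizeBytes must be a non-negative finite number");
  }
  // Strictly under the limit goes inline; everything else uses a file upload API.
  return sizeBytes < INLINE_LIMIT_BYTES ? "inline" : "file-upload";
}

console.log(chooseVideoDelivery(10 * 1024 * 1024));
console.log(chooseVideoDelivery(200 * 1024 * 1024));
```

In a real server the size would typically come from a file stat call or a `Content-Length` header before any bytes are read into memory.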

🎨 Analysis Capabilities

  • Conditional system prompts — Context-aware prompts for all analysis modes (palette, hierarchy, components)
  • Dynamic prompt modes — analyze_image tool now supports multiple analysis modes for specialized outputs
  • Updated default model — Switched to gemini-3.1-flash-lite-preview for better performance and cost efficiency
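Conditional system prompts can be pictured as a lookup keyed by analysis mode, with a general-purpose fallback when no mode is given. The mode names below (palette, hierarchy, components) come from the notes above. The prompt texts, `MODE_PROMPTS`, `DEFAULT_PROMPT`, and `systemPromptFor` are invented for illustration and are not the prompts the tool actually ships.

```typescript
// Mode names are taken from the release notes; everything else is hypothetical.
type AnalysisMode = "palette" | "hierarchy" | "components";

// Illustrative prompt texts only; the real prompts are not published in these notes.
const MODE_PROMPTS: Record<AnalysisMode, string> = {
  palette: "List the dominant colors in the image with approximate hex values.",
  hierarchy: "Describe the visual hierarchy: what draws attention first, second, third.",
  components: "Enumerate the UI components visible in the image and their roles.",
};

const DEFAULT_PROMPT = "Describe the image accurately and concisely.";

// Pick a system prompt for the requested mode, falling back to the general prompt.
function systemPromptFor(mode?: AnalysisMode): string {
  return mode === undefined ? DEFAULT_PROMPT : MODE_PROMPTS[mode];
}

console.log(systemPromptFor("palette"));
console.log(systemPromptFor());
```

Typing the table as `Record<AnalysisMode, string>` means the compiler flags any newly added mode that lacks a prompt.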

Technical Details

Breaking Changes

None — this is a fully backward-compatible release.

Dependencies

  • Updated @google-cloud/storage handling for improved stability
  • No new external dependencies added

Testing

All 10 test cases pass with both Gemini and Vertex AI providers:

  • ✅ Image analysis (URL, local file, base64)
  • ✅ Image comparison (2 and 3 images)
  • ✅ Object detection (URL, local file)
  • ✅ Video analysis (remote URL, local file, YouTube)

Migration Guide

No migration needed. Existing code continues to work without changes. To take advantage of new features:

  1. Use Vertex AI for local files — Vertex AI now supports local file uploads like Gemini
  2. Leverage dynamic prompt modes — Use options.mode in analyze_image for specialized analysis
  3. Benefit from improved video handling — Large videos are now handled more reliably
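As a rough illustration of step 2, the sketch below builds an MCP `tools/call` request that sets `options.mode` for `analyze_image`. The JSON-RPC envelope follows the Model Context Protocol. The argument names `imageSource` and `prompt` are assumptions made for this example, as is the helper `buildAnalyzeImageCall`; check the tool's published input schema for the real parameter names.

```typescript
type AnalysisMode = "palette" | "hierarchy" | "components";

// Shape of an MCP tools/call request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper: builds a request that asks analyze_image for a given mode.
function buildAnalyzeImageCall(id: number, imageSource: string, mode: AnalysisMode): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "analyze_image",
      arguments: {
        imageSource,                     // assumed argument name
        prompt: "Analyze this screen.",  // assumed argument name
        options: { mode },               // options.mode is named in the release notes
      },
    },
  };
}

const request = buildAnalyzeImageCall(1, "./screenshots/home.png", "palette");
console.log(JSON.stringify(request, null, 2));
```

Because the release is described as backward compatible, omitting `options` entirely should keep the pre-0.0.7 behavior.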

What's Next

Future releases will focus on:

  • Additional analysis modes and specialized outputs
  • Performance optimizations for batch operations
  • Enhanced error recovery and retry logic


About tan-yong-sheng/ai-vision-mcp

🪟 - Multimodal AI vision MCP server for image, video, and object detection analysis. Enables UI/UX evaluation, visual regression testing, and interface understanding using Google Gemini and Vertex AI.
