UVLM

SAGAI v2.0 — A Unified Multi-Model Notebook for Streetscape Analysis

SAGAI v2.0 consolidates the full streetscape analysis pipeline into a single Google Colab notebook and replaces the inline LLaVA-only inference code with the UVLM package, enabling multi-model benchmarking across 11 VLM checkpoints. New features include a multi-task prompt builder, consensus validation with majority voting, chain-of-thought reasoning, truncation detection, interactive Folium maps, view-direction filtering, and support for loading existing polygons as study area boundaries.

2026-05-21T10:27:47+00:00May 21, 2026|Categories: Advanced, Python, Vision Language Model|Tags: , , , , , , |0 Comments

UVLM v3.0.0: From Colab Notebook to Python Package — Run Vision-Language Models Anywhere

UVLM v3.0.0 turns a Colab notebook into a full Python package. Run vision-language models locally, in notebooks, or scripts with a simple API and no setup complexity.

Introducing UVLM: A Free Tool to Compare AI Models That Understand Images

UVLM is a free, open-source tool for loading, testing, and comparing Vision-Language Models on custom image analysis tasks. Running entirely in Google Colab, it lets researchers and practitioners benchmark multiple AI models using the same prompts and images — no coding, no GPU ownership, no model-specific pipelines. This post explains what VLMs are, why comparing them matters, and how to get started in five minutes.