UVLM Archives - Urban Geo Analytics

SAGAI v2.0 — A Unified Multi-Model Notebook for Streetscape Analysis

SAGAI v2.0 consolidates the full streetscape analysis pipeline into a single Google Colab notebook and replaces the inline LLaVA-only inference code with the UVLM package, enabling multi-model benchmarking across 11 VLM checkpoints. New features include a multi-task prompt builder, consensus validation with majority voting, chain-of-thought reasoning, truncation detection, interactive Folium maps, view-direction filtering, and support for loading existing polygons as study area boundaries.

Joan Perez | May 21, 2026 | Categories: Advanced, Python, Vision Language Model | Tags: AI, GIS, Image Analysis, Llava, Python, Qwen, UVLM

UVLM v3.0.0: From Colab Notebook to Python Package — Run Vision-Language Models Anywhere

UVLM v3.0.0 turns a Colab notebook into a full Python package. Run vision-language models locally, in notebooks, or in scripts with a simple API and no complex setup.

Joan Perez | April 23, 2026 | Categories: Advanced, Package, Python, Vision Language Model | Tags: AI, Google Colab, Image Analysis, Jupyter Notebook, Llava, Qwen, UVLM

Introducing UVLM: A Free Tool to Compare AI Models That Understand Images

UVLM is a free, open-source tool for loading, testing, and comparing Vision-Language Models on custom image analysis tasks. Running entirely in Google Colab, it lets researchers and practitioners benchmark multiple AI models using the same prompts and images — no coding, no GPU ownership, no model-specific pipelines. This post explains what VLMs are, why comparing them matters, and how to get started in five minutes.

Joan Perez | March 17, 2026 (page timestamp: 2026-04-23T07:04:37+00:00) | Categories: Intermediate, Python, Vision Language Model | Tags: Benchmarking, Chain-of-Thought, Google Colab, Image Analysis, Llava, Multimodal AI, Open Source, Qwen, UVLM, VLM