A Stable and Reproducible Vision–Language Inference Engine for SAGAI v1.1
SAGAI v1.1 introduces Module 3 v2.0, a stable and reproducible vision–language inference engine for streetscape analysis. Built exclusively on Hugging Face LLaVA models, it enables robust multimodal processing of street-level images for large-scale urban and geospatial analysis.
