Joan Perez

So far Joan Perez has created 18 blog entries.

Deploy Your Own Local LLM on Low VRAM in 30 Minutes — A Private Chat Assistant in Jupyter

Run a capable large language model entirely on your own machine — private, offline, and with as little as 8 GB of GPU memory. This hands-on guide sets up a clean Python environment, gets CUDA working even on the newest NVIDIA Blackwell cards, loads a 4-bit quantized model from Hugging Face, and builds an interactive chat widget with conversation memory and a live VRAM gauge in JupyterLab. No cloud, no API keys, no data leaving your computer.

June 2, 2026 | Categories: Advanced, Python
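As a rough illustration of why 4-bit quantization matters on a low-VRAM card, the arithmetic for weight storage can be sketched as below. The 7-billion-parameter model size is an assumption made for the example (the excerpt does not name a model), and the estimate covers weights only: quantization metadata, the KV cache, and activations add to the real footprint.

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Hypothetical model size, chosen only for illustration.
N_PARAMS = 7e9

fp16 = weight_memory_gib(N_PARAMS, 16)  # half-precision weights
int4 = weight_memory_gib(N_PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: {fp16:.1f} GiB")
print(f" 4-bit weights: {int4:.1f} GiB")
```

Under these assumptions the 16-bit weights alone would exceed an 8 GB card, while the 4-bit weights leave room for the cache and activations.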

SAGAI v2.0 — A Unified Multi-Model Notebook for Streetscape Analysis

SAGAI v2.0 consolidates the full streetscape analysis pipeline into a single Google Colab notebook and replaces the inline LLaVA-only inference code with the UVLM package, enabling multi-model benchmarking across 11 VLM checkpoints. New features include a multi-task prompt builder, consensus validation with majority voting, chain-of-thought reasoning, truncation detection, interactive Folium maps, view-direction filtering, and support for loading existing polygons as study area boundaries.

May 21, 2026 | Categories: Advanced, Python, Vision Language Model | 0 Comments
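Consensus validation with majority voting, mentioned in the SAGAI v2.0 summary above, can be sketched in a few lines. This is a generic illustration and not SAGAI's or UVLM's actual code: the function name `majority_vote`, the `min_share` threshold, and the lowercase normalization are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(answers, min_share=0.5):
    """Return (winner, share); winner is None unless its share exceeds min_share."""
    if not answers:
        return None, 0.0
    # Normalize so that "Yes" and "yes " count as the same answer.
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    share = count / len(normalized)
    return (winner if share > min_share else None), share


# Three hypothetical model runs answering the same prompt about one image.
result = majority_vote(["Yes", "yes ", "no"])
print(result)
```

Returning the vote share alongside the winner lets a caller flag low-agreement images for manual review instead of silently accepting a narrow majority.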

UVLM v3.0.0: From Colab Notebook to Python Package — Run Vision-Language Models Anywhere

UVLM v3.0.0 turns a Colab notebook into a full Python package. Run vision-language models locally, in notebooks, or scripts with a simple API and no setup complexity.

Introducing UVLM: A Free Tool to Compare AI Models That Understand Images

UVLM is a free, open-source tool for loading, testing, and comparing Vision-Language Models on custom image analysis tasks. Running entirely in Google Colab, it lets researchers and practitioners benchmark multiple AI models using the same prompts and images — no coding, no GPU ownership, no model-specific pipelines. This post explains what VLMs are, why comparing them matters, and how to get started in five minutes.

From Large Language Models to Autonomous AI Agents — Architecture, Capabilities, and Emerging Risks

Large Language Models are stateless, single-pass prediction engines — powerful but passive. Wrapping them in a perception–action loop with environment access and tool use transforms them into something qualitatively different: autonomous AI agents. This post walks through the transformer architecture, explains how the agent paradigm introduces closed-loop reasoning over environments and tasks, surveys the growing toolkit ecosystem, and examines the emerging risk landscape.

February 19, 2026 (updated April 23, 2026) | Categories: Advanced | 1 Comment
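The perception–action loop described in the agents summary above can be sketched with a toy environment. Everything here is hypothetical: `CounterEnv`, `stub_policy`, and `run_agent` are illustrative names, and the stub policy is a hard-coded rule standing in for a real LLM call.

```python
from dataclasses import dataclass


@dataclass
class CounterEnv:
    """Toy environment: the task is to raise `value` until it reaches `target`."""
    value: int = 0
    target: int = 3

    def observe(self) -> dict:
        return {"value": self.value, "target": self.target}

    def act(self, tool: str) -> None:
        # The only tool this environment understands.
        if tool == "increment":
            self.value += 1


def stub_policy(observation: dict, memory: list) -> str:
    """Stand-in for an LLM call: maps an observation to a tool name."""
    return "increment" if observation["value"] < observation["target"] else "stop"


def run_agent(env, policy, max_steps=10):
    """Closed loop: observe, decide, act, and repeat until the policy stops."""
    memory = []
    for _ in range(max_steps):
        obs = env.observe()
        action = policy(obs, memory)
        memory.append((obs, action))
        if action == "stop":
            break
        env.act(action)
    return memory


env = CounterEnv()
trace = run_agent(env, stub_policy)
print(len(trace), env.value)
```

The `max_steps` cap is a simple guard against a policy that never chooses to stop.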

A Stable and Reproducible Vision–Language Inference Engine for SAGAI v1.1

SAGAI v1.1 introduces Module 3 v2.0, a stable and reproducible vision–language inference engine for streetscape analysis. Built exclusively on Hugging Face LLaVA models, it enables robust multimodal processing of street-level images for large-scale urban and geospatial analysis.

December 17, 2025 | Categories: Python, Urbanism, Vision Language Model | 0 Comments

Qwen Image Edit for Urbanism v1.3 — Mask-Controlled Editing With Prompt or Reference Guidance

Version 1.3 of Qwen Image Edit for Urbanism introduces mask-controlled editing in ComfyUI, enabling precise, localized image transformations using prompts or reference images. The new Grow Mask utility softens boundaries, preserves unmasked areas, and integrates seamlessly with existing single-image and sequential workflows.

December 4, 2025 | Categories: Advanced, Diffusion Models, Urbanism | 0 Comments

Deploy a Guest Book on an EVM Blockchain Using Remix

Learn how to deploy your first smart contract on an Ethereum-compatible blockchain using Remix and the Sepolia testnet. In this beginner-friendly guide, we build a simple on-chain guestbook, connect MetaMask, verify the contract on Etherscan, and interact with it directly through the blockchain. A perfect starting point for anyone curious about smart contracts, Solidity, and decentralized applications.

November 27, 2025 | Categories: Blockchain, Intermediate | 0 Comments

Qwen Image Edit for Urbanism v1.2 — Custom Nodes & Sequential Processing

Qwen Image Edit for Urbanism v1.2 brings sequential image editing to ComfyUI, with custom Python nodes, multi-image batch processing, and a six-slot buffer for reproducible urban edits. This version streamlines automated workflows for researchers, designers, and architects working with street and neighborhood imagery.

November 17, 2025 (updated December 4, 2025) | Categories: Advanced, Diffusion Models, Urbanism

Qwen Image Edit for Urbanism v1.1 — Editing using a Reference Image and Advanced Sampling

Qwen Image Edit for Urbanism v1.1 expands local AI editing in ComfyUI with advanced sampling and dual-image workflows. The new Lightning LoRA system improves realism, texture fidelity, and processing speed, enabling fast, privacy-preserving urban scene transformation—entirely offline.

November 12, 2025 (updated November 14, 2025) | Categories: Advanced, Diffusion Models, Urbanism | 0 Comments