This boilerplate demonstrates how to use LM Studio to run a fully local LLM environment, hosting the Qwen/Qwen3-VL-8B model behind an OpenAI-compatible API, with no per-token costs.
It provides a complete, end-to-end example of a chat interface powered by a locally hosted LLM, ideal for experimentation, prototyping, and offline development.
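For example, once LM Studio's local server is running, any OpenAI-compatible client can talk to it. Here is a minimal sketch using the official `openai` package; LM Studio listens on `http://localhost:1234/v1` by default, and the model identifier is an assumption that should be replaced with the exact ID shown in LM Studio:

```ts
import OpenAI from "openai";

// LM Studio exposes an OpenAI-compatible server, by default at http://localhost:1234/v1.
// The apiKey field is required by the client but ignored by LM Studio.
const client = new OpenAI({
  baseURL: "http://localhost:1234/v1",
  apiKey: "lm-studio",
});

const response = await client.chat.completions.create({
  model: "qwen/qwen3-vl-8b", // assumed model ID; copy the exact one from LM Studio
  messages: [{ role: "user", content: "Hello from a fully local LLM!" }],
});

console.log(response.choices[0].message.content);
```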
Overview
The example integrates Next.js App Router, shadcn/ui, and the Vercel AI SDK (AI Elements) to build a modern chat UI connected to a local LM Studio instance.
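A chat API route in this setup might look like the following sketch, assuming AI SDK 5 with the `@ai-sdk/openai-compatible` provider pointed at LM Studio; the route path and model ID are illustrative, not necessarily the boilerplate's exact code:

```ts
// app/api/chat/route.ts
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { streamText, convertToModelMessages, type UIMessage } from "ai";

// Provider wrapping LM Studio's local OpenAI-compatible server.
const lmstudio = createOpenAICompatible({
  name: "lmstudio",
  baseURL: "http://localhost:1234/v1",
});

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  // Stream tokens from the local model back to the AI Elements chat UI.
  const result = streamText({
    model: lmstudio("qwen/qwen3-vl-8b"), // assumed model ID from LM Studio
    messages: convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse();
}
```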
A custom tool is included to demonstrate how the LLM can trigger server-side actions, such as generating and saving documents.
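Such a tool can be sketched with the AI SDK's `tool()` helper; the tool name, schema, and save path below are illustrative assumptions rather than the boilerplate's exact implementation:

```ts
import { tool } from "ai";
import { z } from "zod";
import { mkdir, writeFile } from "node:fs/promises";

// Hypothetical document-creation tool: the model supplies a title and body,
// and the server persists them to disk as a Markdown file.
export const createDocument = tool({
  description: "Create and save a document from the conversation.",
  inputSchema: z.object({
    title: z.string().describe("Document title"),
    content: z.string().describe("Document body in Markdown"),
  }),
  execute: async ({ title, content }) => {
    await mkdir("./documents", { recursive: true });
    const path = `./documents/${title.replace(/\s+/g, "-")}.md`;
    await writeFile(path, content, "utf8");
    return { saved: true, path };
  },
});
```

Registering this tool via the `tools` option of `streamText` lets the model decide when to invoke it during a chat turn.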
This boilerplate is designed as a local AI laboratory, enabling you to test ideas with a real LLM — without network dependency, rate limits, or token costs — before moving to a production-grade hosted model.
Features
- LM Studio hosting a local Qwen3-VL-8B model
- OpenAI-compatible API for frictionless integration
- Next.js App Router for modern routing
- Vercel AI SDK (AI Elements) for a polished chat interface
- shadcn/ui for consistent UI components
- Tool invocation example that creates documents from LLM responses
- Fully offline, zero-cost LLM experimentation
- End-to-end chat flow with local inference
Use Case
Ideal for developers who want to:
- Prototype AI features without paying per token
- Run real LLMs locally for fast and private experimentation
- Test integrations before switching to cloud LLM providers
- Build RAG pipelines, agents, or tool-using workflows on a local model
- Develop AI features in isolated environments with full autonomy
With LM Studio, Next.js, Vercel AI SDK, and Qwen3-VL-8B, this boilerplate provides a powerful foundation for building and testing AI applications — all directly on your machine, without external costs or dependencies. 🚀