Lemonade
Lemonade Server lets you run local LLMs on your PC’s NPU and GPU. Free, private, and OpenAI-compatible.
Prerequisites
- Install Lemonade Server - download it from lemonade-server.ai
- Install a model - use the Lemonade GUI or CLI to download a model
Setup in Multi
- Open the Multi panel → Settings (gear icon)
- Click Add Profile
- Select Lemonade as the provider
- Set the base URL (default: http://localhost:13305/api/v1)
- Select your model
- Save
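Because Lemonade Server exposes an OpenAI-compatible API, a profile like the one above boils down to HTTP requests against the base URL. The sketch below builds such a chat-completions request; the model ID matches the example in the configuration table, and the prompt text is illustrative. It only constructs the request (actually sending it requires Lemonade Server to be running).

```python
import json

# Base URL from the Lemonade profile (default shown in the setup steps).
BASE_URL = "http://localhost:13305/api/v1"

# OpenAI-compatible chat-completions endpoint under the base URL.
endpoint = BASE_URL.rstrip("/") + "/chat/completions"

# Request body in the standard OpenAI chat format.
# The model ID is an example; use whichever model you downloaded.
payload = {
    "model": "Llama-3.2-1B-Instruct-Hybrid",
    "messages": [{"role": "user", "content": "Hello from Multi!"}],
}
body = json.dumps(payload)

print(endpoint)
# To actually send it (server must be running), you could use e.g.:
#   urllib.request.urlopen(urllib.request.Request(
#       endpoint, data=body.encode(),
#       headers={"Content-Type": "application/json"}))
```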
Configuration Options
| Option | Description |
|---|---|
| Base URL | Lemonade server URL (default: http://localhost:13305/api/v1) |
| Model ID | The model to use (e.g., Llama-3.2-1B-Instruct-Hybrid) |
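A common mistake with the Base URL option is pointing it at the wrong port or dropping the /api/v1 path. This illustrative check (not Multi's actual validation logic) shows how a profile's base URL can be verified against the default Lemonade endpoint:

```python
from urllib.parse import urlparse

def is_default_lemonade_url(base_url: str) -> bool:
    """Return True if base_url matches the default local Lemonade endpoint."""
    parts = urlparse(base_url)
    return (
        parts.scheme == "http"
        and parts.hostname == "localhost"
        and parts.port == 13305
        and parts.path == "/api/v1"
    )

print(is_default_lemonade_url("http://localhost:13305/api/v1"))  # prints True
```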
Free - Lemonade Server is open-source and runs entirely on your own hardware, so there are no usage fees.