How to Run LLMs Locally

Published on: 2025-03-25

Running Large Language Models (LLMs) locally and completely offline isn’t as tricky as it used to be! There are tons of applications out there that let you do it. Here’s a quick look at some of them. It’s not an exhaustive list, but I’m focusing on the ones that are relatively easy to get started with and can handle a variety of LLMs.

Performance

When you’re running LLMs on your computer, their performance really depends on your hardware. LLMs with lots of parameters need more powerful machines. If you’re finding your hardware is struggling, experimenting with different quantized LLMs can help improve performance. Just keep in mind that you might see a slight drop in accuracy.
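As a rough rule of thumb, a model's weight memory is its parameter count times the bytes used per weight, which is why quantization helps so much on constrained hardware. The sketch below ignores runtime overhead like the KV cache and activations, so treat the numbers as lower bounds:

```python
def approx_model_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Rough weight-memory estimate: parameter count x bytes per weight.

    Ignores runtime overhead (KV cache, activations), so this is a lower bound.
    """
    bytes_per_weight = bits_per_weight / 8
    return params_billions * 1e9 * bytes_per_weight / (1024 ** 3)

# A 7B model at different precisions:
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{approx_model_memory_gb(7, bits):.1f} GB")
# 16-bit is ~13 GB, while 4-bit quantization brings it down to ~3.3 GB,
# small enough for many consumer GPUs.
```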

AnythingLLM

If you’re looking for an open-source, privacy-focused desktop app to run models locally, AnythingLLM is a fantastic choice. It boasts a really user-friendly interface, making it incredibly easy to switch between different models and keep your chat sessions organized.

Features

  • AI agents: LLMs with access to tools such as web scraping, web search, document summarization, chart generation, and file saving.
  • Easy Website Integration: You can create chat widgets to easily add it to any website.
  • Export Your Chats: Save your conversations in various formats like CSV, JSON, JSON (Alpaca), and JSONL (for OpenAI fine-tuning).
  • Download Models: Get models directly from the AnythingLLM repository.
  • Connect to Other Tools: Works with other local model providers like Ollama, LM Studio, and Local AI.
  • Use Remote Models Too: Access models from OpenAI, Groq, Hugging Face, and others.
  • Chat with your documents: Local RAG (Retrieval-augmented generation) integration.
  • Docker support: An official docker image is available.
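To give a feel for the export formats, here's a sketch of the JSONL shape used for OpenAI fine-tuning: one JSON object per line, each holding an OpenAI-style message list. The chat content is made up, and AnythingLLM's actual export may include extra metadata:

```python
import json

# Hypothetical chat history: (question, answer) pairs.
chat = [
    ("What is quantization?", "Reducing weight precision to shrink a model."),
    ("Does it hurt accuracy?", "Usually only slightly, depending on the method."),
]

# Each exchange becomes one JSON object on its own line.
lines = []
for question, answer in chat:
    record = {"messages": [
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
    ]}
    lines.append(json.dumps(record))

jsonl_export = "\n".join(lines)
print(jsonl_export)
```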

Supported OS: macOS, Linux, Windows

Website: https://anythingllm.com

Jan.ai

Similar to AnythingLLM, Jan.ai is an open-source, privacy-focused desktop application you can install and run completely locally – no internet connection required! It also features a user-friendly interface to simplify the experience.

Features

  • Download Models: Get LLMs directly from the Jan.ai repository, or import GGUF models from Hugging Face.
  • Local or Remote models: Use models running on your own computer, or connect to services like OpenAI for extra power.
  • Model customization: Tweak and fine-tune models to do exactly what you need.
  • Chat with your documents: Easily ask questions and get answers from your own files with local and remote RAG integration (enable experimental mode and follow the instructions at https://jan.ai/docs/tools/retrieval).
  • OpenAI Compatibility: It works as an OpenAI-compatible server.
  • Extend Its Abilities: Install plugins to add even more features.
  • Mobile Support (Coming Soon!): Jan.ai will soon be available on mobile devices.
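Because Jan.ai works as an OpenAI-compatible server, any plain HTTP client can talk to it. Here's a minimal Python sketch using only the standard library; the port (1337) and the model name are assumptions, so check your Jan settings for the real values:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completion request for any OpenAI-compatible server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Port and model name are assumptions; check your local Jan settings.
req = build_chat_request("http://localhost:1337", "llama3.2-3b", "Hello!")

# To actually send it (requires the Jan server to be running):
# with urllib.request.urlopen(req) as resp:
#     reply = json.loads(resp.read())["choices"][0]["message"]["content"]
```

The same helper works against any of the OpenAI-compatible servers covered in this post, since they all expose the same endpoint shape.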

Supported OS: macOS, Linux, Windows

Website: https://jan.ai

LM Studio

Like AnythingLLM and Jan.ai, LM Studio is a desktop application you can install and run completely locally – just like the other two desktop apps, it has a user-friendly interface. The main difference? LM Studio isn’t open source. It’s free to use for personal projects, but if you want to use it for work, you’ll need to contact the LM Studio developers.

Features

  • Customize Your Models: Fine-tune models to get exactly the results you want.
  • Import Models Easily: Bring in models directly from Hugging Face.
  • Chat with Your Documents: Easily ask questions and get answers from your own files with local RAG integration.
  • OpenAI Compatibility: It works as an OpenAI-compatible server.
  • Run It in the Background: Run it as a service without needing to see the interface.
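For the run-in-the-background use case, LM Studio ships a command-line companion. The commands below are a sketch that assumes the lms CLI is installed and on your PATH, with 1234 being the server's default port at the time of writing:

```shell
# Start LM Studio's local server headless, without opening the GUI:
lms server start

# Then talk to it like any OpenAI-compatible endpoint
# (1234 is the default port; adjust to your settings):
curl http://localhost:1234/v1/models
```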

Supported OS: macOS, Linux, Windows

Website: https://lmstudio.ai

Ollama

Ollama stands apart from the other apps we’ve talked about. It’s a lightweight command-line app that lets you run LLMs directly on your computer. You won’t find features like a polished user interface or built-in RAG integration right away. But honestly, that’s not really a drawback! It’s actually a great way to pair it with other tools, like AnythingLLM or a web-based interface like Open WebUI, where you can get those UI and RAG features.

Features

  • Download Models Quickly: Easily get LLMs from the Ollama repository.
  • Import Models with a Modelfile: You can bring in models using a Modelfile (this might be a bit technical for some users).
  • Customize Your Models: Just like with other platforms, you can tweak and fine-tune models to do exactly what you need.
  • OpenAI Compatibility: It works as an OpenAI-compatible server.
  • Docker support: An official Docker image is available.
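Here's a sketch of the download-and-customize workflow. It needs Ollama installed and running, and the model name my-assistant plus the parameter values are just examples:

```shell
# Pull a model from the Ollama repository and chat with it:
ollama pull llama3.2
ollama run llama3.2 "Why is the sky blue?"

# Customize a model: a Modelfile layers settings on top of a base model.
cat > Modelfile <<'EOF'
FROM llama3.2
PARAMETER temperature 0.3
SYSTEM "You are a concise technical assistant."
EOF

ollama create my-assistant -f Modelfile
ollama run my-assistant "Summarize quantization in one sentence."
```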

As mentioned above, Ollama doesn’t have built-in tools for RAG (Retrieval-Augmented Generation); if you need that functionality, pair it with AnythingLLM or a web interface like Open WebUI. Check out the Ollama documentation for even more integrations.

Supported OS: macOS, Linux, Windows

Website: https://ollama.com/

Comparison

Feature                               AnythingLLM  Jan.ai           LM Studio  Ollama
AI Agents                             Yes          No               No         No
Easy Website Integration              Yes          No               No         No
Export Chats                          Yes          No               No         No
Model Download (Repository)           Yes          Yes              Yes        Yes
Model Import (Hugging Face/Custom)    Yes          Yes (GGUF only)  Yes        Yes
Local & Remote Models                 Yes          Yes              Yes        Yes
Model Customization/Fine-tuning      Yes          Yes              Yes        Yes
Chat with Documents (RAG)             Yes          Yes              Yes        No
OpenAI Compatibility                  Yes          Yes              Yes        Yes
Plugins/Extensions                    No           Yes              No         No
Mobile Support                        No           Coming Soon      No         No
Run as Service (Background)           No           No               Yes        Yes
Official Docker Support               Yes          No               No         Yes

Conclusion

Once you’ve downloaded a model, all of these applications let you run LLMs locally, no internet connection needed! Keep in mind that some features, like using remote models, might still require a connection. Ultimately, you can pick whichever one best suits your needs and start experimenting. Right now, I’m using AnythingLLM alongside Ollama, and the combination gives me all the flexibility I need.