How to Run LLMs Locally
Published on: 2025-03-25
Running Large Language Models (LLMs) locally and completely offline isn’t as tricky as it used to be! There are tons of applications out there that let you do it. Here’s a quick look at some of them. It’s not an exhaustive list, but I’m focusing on the ones that are relatively easy to get started with and can handle a variety of LLMs.
Performance
When you’re running LLMs on your computer, their performance really depends on your hardware. LLMs with lots of parameters need more powerful machines. If you’re finding your hardware is struggling, experimenting with different quantized LLMs can help improve performance. Just keep in mind that you might see a slight drop in accuracy.
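To get a feel for why quantization helps, note that a model’s weights take up roughly (parameter count × bits per weight) / 8 bytes, so a 7B-parameter model needs about 14 GB at 16-bit but only about 3.5 GB at 4-bit. Here’s a back-of-the-envelope Python sketch; the 20% overhead factor is my own assumption to cover the KV cache and runtime buffers, and real usage varies with runtime and context length:

```python
def estimate_model_memory_gb(params_billion: float, bits_per_weight: int,
                             overhead: float = 1.2) -> float:
    """Rough memory estimate: params * bits / 8, plus a fudge factor.

    The 20% overhead is an assumption for the KV cache and runtime
    buffers; actual usage depends on context length and the runtime.
    """
    bytes_for_weights = params_billion * 1e9 * bits_per_weight / 8
    return bytes_for_weights * overhead / 1e9  # decimal GB

# A 7B model at different quantization levels:
for bits in (16, 8, 4):
    print(f"7B @ {bits}-bit: ~{estimate_model_memory_gb(7, bits):.1f} GB")
```

This is why a 4-bit quantized model can run comfortably on hardware where the full-precision version won’t even load.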
AnythingLLM
If you’re looking for an open-source, privacy-focused desktop app to run models locally, AnythingLLM is a fantastic choice. It boasts a really user-friendly interface, making it incredibly easy to switch between different models and keep your chat sessions organized.
Features
- AI agents: LLMs with access to tools such as web scraping, web search, document summarization, chart generation, and file saving.
- Easy Website Integration: You can create chat widgets to easily add it to any website.
- Export Your Chats: Save your conversations in various formats like CSV, JSON, JSON (Alpaca), and JSONL (for OpenAI fine-tuning); there’s a sketch of the JSONL shape at the end of this section.
- Download Models: Get models directly from the AnythingLLM repository.
- Connect to Other Tools: Works with other local model providers like Ollama, LM Studio, and LocalAI.
- Use Remote Models Too: Access models from OpenAI, Groq, Hugging Face, and others.
- Chat with your documents: Local RAG (Retrieval-augmented generation) integration.
- Docker support: An official Docker image is available.
Supported OS: macOS, Linux, Windows
Website: https://anythingllm.com
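That JSONL (for OpenAI fine-tuning) export option maps conversations onto OpenAI’s chat fine-tuning format: one JSON object per line, each holding a messages array of role/content pairs. Here’s a minimal Python sketch of that target shape (the exact fields AnythingLLM writes may differ slightly):

```python
import json

# One fine-tuning record per line. OpenAI's chat fine-tuning format is
# a "messages" array of role/content pairs; the exact fields in
# AnythingLLM's export may differ, this shows the general shape.
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Which apps run LLMs locally?"},
        {"role": "assistant", "content": "AnythingLLM, Jan.ai, LM Studio, and Ollama, among others."},
    ]
}

with open("chats.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```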
Jan.ai
Similar to AnythingLLM, Jan.ai is an open-source, privacy-focused desktop application you can install and run completely locally – no internet connection required! It also features a user-friendly interface to simplify the experience.
Features
- Download Models: Get LLMs directly from the Jan.ai repository, or import GGUF models from Hugging Face.
- Local or Remote models: Use models running on your own computer, or connect to services like OpenAI for extra power.
- Model customization: Just like Ollama, you can tweak and fine-tune models to do exactly what you need.
- Chat with your documents: Easily ask questions and get answers from your own files with local and remote RAG integration (enable experimental mode and follow the instructions at https://jan.ai/docs/tools/retrieval).
- OpenAI Compatibility: It works as an OpenAI-compatible server (there’s an example at the end of this section).
- Extend Its Abilities: Install plugins to add even more features.
- Mobile Support (Coming Soon!): Jan.ai will soon be available on mobile devices.
Supported OS: macOS, Linux, Windows
Website: https://jan.ai
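Because Jan.ai works as an OpenAI-compatible server, the official openai Python client can talk to it just by swapping the base URL. A minimal sketch, assuming the openai package is installed, the local API server is enabled, and it’s listening on port 1337 (the port and model ID below are assumptions; check Jan.ai’s server settings and substitute a model you’ve downloaded):

```python
from openai import OpenAI

# Point the client at the local server instead of api.openai.com.
# The port (1337) and model ID are assumptions; check Jan.ai's local
# API server settings for the actual values on your machine.
client = OpenAI(base_url="http://localhost:1337/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3.2-3b-instruct",  # replace with a model you've downloaded
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)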
LM Studio
Like AnythingLLM and Jan.ai, LM Studio is a desktop application you can install and run completely locally, with a similarly user-friendly interface. The main difference? LM Studio isn’t open source. It’s free for personal use, but if you want to use it for work, you’ll need to contact the LM Studio developers.
Features
- Customize Your Models: Fine-tune models to get exactly the results you want.
- Import Models Easily: Bring in models directly from Hugging Face.
- Chat with Your Documents: Easily ask questions and get answers from your own files with local RAG integration.
- OpenAI Compatibility: It works as an OpenAI-compatible server (there’s a streaming example at the end of this section).
- Run It in the Background: Run it as a service without needing to see the interface.
Supported OS: macOS, Linux, Windows
Website: https://lmstudio.ai
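LM Studio’s server speaks the same OpenAI-compatible protocol, listening on port 1234 by default. Here’s a sketch that streams tokens as they’re generated, assuming you’ve started the server and loaded a model (the model identifier below is a placeholder):

```python
from openai import OpenAI

# LM Studio's local server defaults to port 1234. The model name below
# is a placeholder; use the identifier LM Studio shows for the model
# you have loaded.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

stream = client.chat.completions.create(
    model="your-loaded-model",
    messages=[{"role": "user", "content": "Explain quantization in two sentences."}],
    stream=True,  # tokens arrive incrementally instead of all at once
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```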
Ollama
Ollama stands apart from the other apps we’ve talked about. It’s a lightweight command-line tool that runs LLMs directly on your computer. You won’t find a polished user interface or built-in RAG integration here, but honestly, that’s not really a drawback: it pairs nicely with other tools, like AnythingLLM or a web-based interface like Open WebUI, which supply exactly those features.
Features
- Download Models Quickly: Easily get LLMs from the Ollama repository.
- Import Models with a Modelfile: You can bring in models (such as GGUF files) using Ollama’s Modelfile format, though this may be a bit technical for some users.
- Customize Your Models: Just like with other platforms, you can tweak and fine-tune models to do exactly what you need.
- OpenAI Compatibility: It works as an OpenAI-compatible server.
- Docker support: An official Docker image is available.
To recap: Ollama has no built-in tools for RAG (Retrieval-Augmented Generation), so if you need that functionality, pair it with AnythingLLM or a web interface like Open WebUI. Check out the Ollama documentation for even more integrations.
Supported OS: macOS, Linux, Windows
Website: https://ollama.com/
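Since Ollama runs as a local server (on port 11434 by default), it’s easy to script against: it has a native REST API in addition to the OpenAI-compatible endpoints mentioned above. A minimal sketch against the native generate endpoint, assuming you’ve already pulled the model named below:

```python
import requests

# Ollama's native API listens on port 11434 by default. The model must
# already be pulled (e.g. with `ollama pull llama3.2`); the name here
# is just an example.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2",
        "prompt": "Why run LLMs locally? Answer in one sentence.",
        "stream": False,  # return the full response as one JSON object
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])
```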
Comparison
| Feature | AnythingLLM | Jan.ai | LM Studio | Ollama |
|---|---|---|---|---|
| AI Agents | Yes | No | No | No |
| Easy Website Integration | Yes | No | No | No |
| Export Chats | Yes | No | No | No |
| Model Download (Repository) | Yes | Yes | Yes | Yes |
| Model Import (Hugging Face/Custom) | Yes | Yes (GGUF only) | Yes | Yes |
| Local & Remote Models | Yes | Yes | Yes | Yes |
| Model Customization/Fine-tuning | Yes | Yes | Yes | Yes |
| Chat with Documents (RAG) | Yes | Yes | Yes | No |
| OpenAI Compatibility | Yes | Yes | Yes | Yes |
| Plugins/Extensions | No | Yes | No | No |
| Mobile Support | No | Coming Soon | No | No |
| Run as Service (Background) | No | No | Yes | Yes |
| Official Docker Support | Yes | No | No | Yes |
Conclusion
Once you’ve downloaded a model, all of these applications let you run LLMs locally, no internet connection needed! Keep in mind that some features, like using remote models, might still require a connection. Ultimately, you can pick whichever one best suits your needs and start experimenting. Right now, I’m using AnythingLLM alongside Ollama, and the combination gives me all the flexibility I need.