This guide covers downloading, installing, and configuring GPT4All for running local LLMs on your hardware.
Visit gpt4all.io and download the installer for your platform:
| Platform | Download |
|---|---|
| Windows | Windows Installer (x64) |
| Windows ARM | Windows ARM Installer |
| macOS | macOS Installer |
| Ubuntu | Ubuntu Installer (.deb) |
| Linux | Flathub (Flatpak) |
Windows:

```shell
# Run the downloaded installer
GPT4All-Installer.exe
```

macOS:

```shell
# Open the .dmg and drag GPT4All to Applications, or use the command line:
hdiutil attach GPT4All.dmg
cp -R /Volumes/GPT4All/*.app /Applications/
```

Ubuntu/Debian:

```shell
sudo dpkg -i gpt4all_*.deb
```

Linux (Flatpak):

```shell
flatpak install flathub io.gpt4all.gpt4all
flatpak run io.gpt4all.gpt4all
```
Minimum requirements:

| Component | PC (Windows/Linux) | Apple |
|---|---|---|
| CPU | Intel i3-2100 / AMD FX-4100 | M1 |
| RAM | 8GB (for 3B models) | 16GB |
| GPU | Direct3D 11/12 or OpenGL 2.1 | M1 (integrated) |
| OS | Windows 10, Ubuntu 22.04 | macOS 12.6 |
| Disk | 5GB per model | 5GB per model |
Recommended requirements:

| Component | PC (Windows/Linux) | Apple |
|---|---|---|
| CPU | Ryzen 5 3600 / Intel i7-10700 | M2 Pro |
| RAM | 16GB | 16GB |
| GPU | NVIDIA GTX 1080 Ti/RTX 2080+ (8GB+ VRAM) | M2 Pro (integrated) |
| OS | Windows 10, Ubuntu 24.04 | macOS 14.5+ |
Start with these models:
| Model | Size | Quantization | RAM Required | Use Case |
|---|---|---|---|---|
| Llama 3 8B | 8B | Q4_0 | 8GB | General purpose |
| Mistral 7B | 7B | Q4_0 | 6GB | Good balance |
| Phi-3 Mini | 3.8B | Q4_0 | 4GB | Fast responses |
| Gemma 2B | 2B | Q4_0 | 3GB | Very fast |
For high-end hardware:

| Model | Size | Quantization | RAM Required | Use Case |
|---|---|---|---|---|
| Llama 3 70B | 70B | Q4_0 | 48GB | Maximum quality |
| Mixtral 8x7B | 47B | Q4_0 | 32GB | MoE architecture |
| Qwen 2.5 32B | 32B | Q4_0 | 24GB | Multilingual |
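As a quick way to act on the tables above, here is a small Python sketch that picks the most capable model from these recommendations that fits in a given amount of RAM. The model names and RAM figures come straight from the tables; the helper itself is illustrative, not part of GPT4All:

```python
# RAM requirements (GB) for the Q4_0 models listed in the tables above
MODELS = {
    "Gemma 2B": 3,
    "Phi-3 Mini": 4,
    "Mistral 7B": 6,
    "Llama 3 8B": 8,
    "Qwen 2.5 32B": 24,
    "Mixtral 8x7B": 32,
    "Llama 3 70B": 48,
}

def pick_model(available_ram_gb: float):
    """Return the most demanding model that still fits in RAM, or None."""
    candidates = [(ram, name) for name, ram in MODELS.items()
                  if ram <= available_ram_gb]
    return max(candidates)[1] if candidates else None

print(pick_model(16))  # -> Llama 3 8B
```

A 16GB machine lands on Llama 3 8B, matching the "start with these models" guidance above.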
Models are distributed in several quantization levels, which trade file size and speed against quality:

| Quantization | Size | Quality | Speed |
|---|---|---|---|
| Q2_K | Smallest | Lower | Fastest |
| Q4_0 | Small | Good | Fast (Recommended) |
| Q5_K_M | Medium | Better | Medium |
| Q6_K | Large | Best | Slower |
| Q8_0 | Largest | Near-lossless | Slowest |
Recommendation: Q4_0 offers the best balance for most users.
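The size differences come down to bits per weight. A rough back-of-the-envelope estimate looks like this; the bits-per-weight figures are ballpark approximations, not exact GGUF overheads:

```python
# Approximate bits per weight for common GGUF quantizations (rough values)
BITS_PER_WEIGHT = {
    "Q2_K": 2.6,
    "Q4_0": 4.5,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
}

def approx_file_size_gb(params_billions: float, quant: str) -> float:
    """Rough model file size: parameter count x bits per weight."""
    bits = params_billions * 1e9 * BITS_PER_WEIGHT[quant]
    return bits / 8 / 1e9  # bits -> bytes -> GB

print(round(approx_file_size_gb(8, "Q4_0"), 1))  # -> 4.5
```

An 8B model at Q4_0 comes out around 4.5GB on disk, which is why the disk guidance above budgets roughly 5GB per model.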
The LocalDocs feature (for chatting with your own files) supports these formats:

| Format | Extensions |
|---|---|
| Text | .txt, .md |
| Documents | .pdf, .docx |
| Presentations | .pptx |
| Spreadsheets | .csv, .xlsx |
| Code | .py, .js, .ts, .java, etc. |
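If you're scripting which files to feed into a document collection, a simple extension filter along these lines works. The extension list mirrors the table above (the "etc." for code is open-ended), and the helper name is our own:

```python
from pathlib import Path

# Extensions from the supported-formats table above (code list is non-exhaustive)
SUPPORTED_EXTENSIONS = {
    ".txt", ".md",                  # text
    ".pdf", ".docx",                # documents
    ".pptx",                        # presentations
    ".csv", ".xlsx",                # spreadsheets
    ".py", ".js", ".ts", ".java",   # code
}

def is_supported(path: str) -> bool:
    """Check whether a file's extension is in the supported set."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS

print(is_supported("notes.md"))   # -> True
print(is_supported("image.png"))  # -> False
```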
Install the Python bindings:

```shell
pip install gpt4all
```

```python
from gpt4all import GPT4All

# Load model (downloaded automatically on first use)
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

# Generate text
output = model.generate("The capital of France is ", max_tokens=10)
print(output)

# Chat session (keeps conversation context between turns)
with model.chat_session():
    response = model.generate("Hello, how are you?")
    print(response)
```
To use GPU acceleration, pass a `device` argument when loading the model:

```python
from gpt4all import GPT4All

# NVIDIA GPU
model = GPT4All("model.gguf", device='gpu')

# AMD GPU
model = GPT4All("model.gguf", device='amd')

# Intel GPU
model = GPT4All("model.gguf", device='intel')
```
Note: GPT4All does not provide an official Docker image; community images are available.

```shell
# Community image (not official)
docker run -d -p 4891:4891 localagi/gpt4all-docker:latest
```
```shell
curl http://localhost:4891/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3-8b-instruct",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```
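The same OpenAI-compatible endpoint can be called from Python with nothing but the standard library. This sketch builds the request payload and sends it; the URL and model name match the curl example above and assume the API server is running locally:

```python
import json
import urllib.request

API_URL = "http://localhost:4891/v1/chat/completions"  # local API server

def build_chat_request(model: str, content: str) -> bytes:
    """Build the JSON payload for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    return json.dumps(payload).encode("utf-8")

def chat(model: str, content: str) -> str:
    """Send a single user message and return the assistant's reply."""
    req = urllib.request.Request(
        API_URL,
        data=build_chat_request(model, content),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# chat("llama-3-8b-instruct", "Hello!")  # requires the API server to be running
```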
Note: An API server is also built into the desktop application, so Docker is not required; you can enable it in Settings → API.
Desktop application (Windows/macOS/Linux): download and run the latest installer from gpt4all.io.

Python bindings:

```shell
pip install --upgrade gpt4all
```

Docker (community image):

```shell
docker pull localagi/gpt4all-docker:latest
# Restarting alone keeps the old image; recreate the container instead
docker stop <container-id> && docker rm <container-id>
docker run -d -p 4891:4891 localagi/gpt4all-docker:latest
```
Choose the deployment method that fits your use case: the desktop app for interactive chat, the Python bindings for scripting, or the API server for integrating with other tools.
Setting up local LLMs can be complex. We offer consulting services for local LLM deployments. Contact us at office@linux-server-admin.com or visit our contact page.