Local LLMs
When using a local LLM, OpenHands may have limited functionality. It is highly recommended that you use GPUs to serve local models for an optimal experience.
News
- 2025/05/21: We collaborated with Mistral AI and released Devstral Small that achieves 46.8% on SWE-Bench Verified!
- 2025/03/31: We released an open model OpenHands LM v0.1 32B that achieves 37.1% on SWE-Bench Verified (blog, model).
Quickstart: Running OpenHands on Your MacBook
Serve the model on your MacBook
We recommend using LM Studio for serving these models locally.
- Download LM Studio and install it.
- Download the model:
  - Option 1: Directly download the LLM from this link or by searching for the name `Devstral-Small-2505` in LM Studio.
  - Option 2: Download an LLM in GGUF format. For example, to download Devstral Small 2505 GGUF, use `huggingface-cli download mistralai/Devstral-Small-2505_gguf --local-dir mistralai/Devstral-Small-2505_gguf`. Then, in a bash terminal, run `lms import {model_name}` in the directory where you've downloaded the model checkpoint (e.g. run `lms import devstralQ4_K_M.gguf` in `mistralai/Devstral-Small-2505_gguf`).
- Open the LM Studio application. Switch to `power user` mode first, then open the developer tab:
- Then click `Select a model to load` at the top of the application:
- Choose the model you want to use, holding `option` on Mac to enable advanced loading options:
- Pick an appropriate context window for OpenHands based on your hardware configuration (larger than 32768 is recommended for using OpenHands, but too large may cause you to run out of memory); Flash Attention is also recommended if it works on your machine.
- Start the server (if it is not already in `Running` status), un-toggle `Serve on Local Network`, and remember the port number of the LM Studio URL (`1234` is the port number for `http://127.0.0.1:1234` in this example; a quick reachability check is sketched after this list):
- Finally, click the `copy` button near the model name to copy the model name (`imported-models/uncategorized/devstralq4_k_m.gguf` in this example):
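Before starting OpenHands, you can optionally confirm that the LM Studio server is reachable and that your imported model is listed. This is a minimal sketch assuming the default port `1234` from the example above:

```bash
# Query LM Studio's OpenAI-compatible endpoint; the imported model name
# (e.g. imported-models/uncategorized/devstralq4_k_m.gguf) should appear in the response.
curl http://127.0.0.1:1234/v1/models
```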
Start OpenHands with the locally served model
Check the installation guide to make sure you have all the prerequisites for running OpenHands.
export LMSTUDIO_MODEL_NAME="imported-models/uncategorized/devstralq4_k_m.gguf" # <- Replace this with the model name you copied from LM Studio
export LMSTUDIO_URL="http://host.docker.internal:1234" # <- Replace the port with the port number you noted in LM Studio
docker pull docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik
mkdir -p ~/.openhands-state && echo '{"language":"en","agent":"CodeActAgent","max_iterations":null,"security_analyzer":null,"confirmation_mode":false,"llm_model":"lm_studio/'$LMSTUDIO_MODEL_NAME'","llm_api_key":"dummy","llm_base_url":"'$LMSTUDIO_URL/v1'","remote_runtime_resource_factor":null,"github_token":null,"enable_default_condenser":true,"user_consents_to_analytics":true}' > ~/.openhands-state/settings.json
docker run -it --rm --pull=always \
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik \
-e LOG_ALL_EVENTS=true \
-v /var/run/docker.sock:/var/run/docker.sock \
-v ~/.openhands-state:/.openhands-state \
-p 3000:3000 \
--add-host host.docker.internal:host-gateway \
--name openhands-app \
docker.all-hands.dev/all-hands-ai/openhands:0.39
Once your server is running, you can visit http://localhost:3000 in your browser to use OpenHands with the local Devstral model. The startup output should look similar to the following:
Digest: sha256:e72f9baecb458aedb9afc2cd5bc935118d1868719e55d50da73190d3a85c674f
Status: Image is up to date for docker.all-hands.dev/all-hands-ai/openhands:0.39
Starting OpenHands...
Running OpenHands as root
14:22:13 - openhands:INFO: server_config.py:50 - Using config class None
INFO: Started server process [8]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:3000 (Press CTRL+C to quit)
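If OpenHands starts but cannot reach your model, you can check connectivity from inside the container to the LM Studio server via `host.docker.internal`. This is a hypothetical sanity check, assuming the `openhands-app` container name and port `1234` used above; it uses Python from inside the container since `curl` may not be available there:

```bash
# Hypothetical connectivity check: fetch the LM Studio model list from inside the OpenHands container.
docker exec openhands-app python3 -c \
  "import urllib.request; print(urllib.request.urlopen('http://host.docker.internal:1234/v1/models').read().decode())"
```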
Advanced: Serving an LLM on GPUs
Download model checkpoints
The model checkpoints downloaded here should NOT be in GGUF format.
For example, to download OpenHands LM 32B v0.1:
huggingface-cli download all-hands/openhands-lm-32b-v0.1 --local-dir all-hands/openhands-lm-32b-v0.1
Create an OpenAI-Compatible Endpoint with SGLang
- Install SGLang following the official documentation.
- Example launch command for OpenHands LM 32B (with at least 2 GPUs):
SGLANG_ALLOW_OVERWRITE_LONGER_CONTEXT_LEN=1 python3 -m sglang.launch_server \
--model all-hands/openhands-lm-32b-v0.1 \
--served-model-name openhands-lm-32b-v0.1 \
--port 8000 \
--tp 2 --dp 1 \
--host 0.0.0.0 \
--api-key mykey --context-length 131072
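Once the server is up, you can verify that the endpoint responds and that the served model name matches what you will later configure in OpenHands. A minimal sketch, assuming the port `8000` and the `mykey` API key from the launch command above:

```bash
# List the models exposed by the OpenAI-compatible endpoint;
# the response should include openhands-lm-32b-v0.1.
curl http://localhost:8000/v1/models -H "Authorization: Bearer mykey"
```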
Create an OpenAI-Compatible Endpoint with vLLM
- Install vLLM following the official documentation.
- Example launch command for OpenHands LM 32B (with at least 2 GPUs):
vllm serve all-hands/openhands-lm-32b-v0.1 \
--host 0.0.0.0 --port 8000 \
--api-key mykey \
--tensor-parallel-size 2 \
--served-model-name openhands-lm-32b-v0.1 \
--enable-prefix-caching
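Whichever server you use, you can send a minimal chat completion request to confirm the endpoint works end to end. This sketch assumes port `8000`, the `mykey` API key, and the `openhands-lm-32b-v0.1` served model name from the commands above:

```bash
# Minimal OpenAI-compatible chat completion request against the local endpoint.
curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer mykey" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openhands-lm-32b-v0.1",
    "messages": [{"role": "user", "content": "Say hello in one word."}]
  }'
```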
Advanced: Run and Configure OpenHands
Run OpenHands
Using Docker
Run OpenHands using the official docker run command.
Using Development Mode
Use the instructions in Development.md to build OpenHands.
Ensure `config.toml` exists by running `make setup-config`, which will create one for you. In the `config.toml`, enter the following:
[core]
workspace_base="/path/to/your/workspace"
[llm]
model="openhands-lm-32b-v0.1"
ollama_base_url="http://localhost:8000"
Start OpenHands using `make run`.
Configure OpenHands
Once OpenHands is running, you'll need to set the following in the OpenHands UI through the Settings under the `LLM` tab:
- Enable `Advanced` options.
- Set the following:
  - `Custom Model` to `openai/<served-model-name>` (e.g. `openai/openhands-lm-32b-v0.1`)
  - `Base URL` to `http://host.docker.internal:8000`
  - `API key` to the same string you set when serving the model (e.g. `mykey`)
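If you prefer to pre-populate these values instead of entering them in the UI, you can write them to `~/.openhands-state/settings.json` before starting the container, mirroring the Quickstart above. This is a sketch that reuses the Quickstart settings structure with the example values from this section; adjust the model name, base URL, and key to match your setup:

```bash
# Sketch: pre-populate OpenHands settings with the example values from this section.
mkdir -p ~/.openhands-state && cat > ~/.openhands-state/settings.json <<'EOF'
{"language":"en","agent":"CodeActAgent","max_iterations":null,"security_analyzer":null,"confirmation_mode":false,"llm_model":"openai/openhands-lm-32b-v0.1","llm_api_key":"mykey","llm_base_url":"http://host.docker.internal:8000","remote_runtime_resource_factor":null,"github_token":null,"enable_default_condenser":true,"user_consents_to_analytics":true}
EOF
```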