nuroai

eliot-9b

README

License: apache-2.0

Why Eliot

Most language models are trained to chat. Eliot is trained to act.

Instead of seeing pixels, Eliot sees the same semantic surface that assistive technologies use on macOS: roles, labels, values, app names, and element IDs. That makes the model fast to prompt, easier to inspect, and much easier to wrap in deterministic safety logic than screenshot-only computer-use systems.

Eliot is especially tuned for Apple-platform workflows:

| Capability | What it means |
| --- | --- |
| macOS Accessibility control | Operates through structured UI trees rather than raw screenshots. |
| One tool call per turn | Predictable agent loop for harnesses and product integrations. |
| Apple Silicon path | A separate MLX 4-bit build is available for resident local Mac use. |
| Harness-first safety | Destructive, external, credential, payment, network, and write actions must be intercepted by the runtime. |
| Local-first design | Built for assistants that can run close to user data instead of shipping every screen state to a remote service. |

Model Lineage

text

Qwen/Qwen3.5-9B
-> Eliot QLoRA v1.2
-> merged BF16 checkpoint
-> Eliot 9B

Eliot is fine-tuned from Qwen/Qwen3.5-9B on a Mac-agent dataset built from deterministic synthetic episodes, sanitized macOS Accessibility-tree contexts, and compatible replay data. Training targets four behaviors inside a guarded desktop harness: structured tool use, confirmation before risky actions, failure recovery, and honest task completion.

Downloads

| Artifact | Repo | Best for |
| --- | --- | --- |
| Full BF16 checkpoint | nuroai/eliot-9b | GPU serving, vLLM, research, further conversion. |
| MLX 4-bit build | nuroai/eliot-9b-mlx-4bit | Apple Silicon Macs and local resident assistant use. |

Download the full model:

bash

huggingface-cli download nuroai/eliot-9b --local-dir eliot-9b

Run the Apple Silicon build:

bash

pip install mlx-lm
mlx_lm.server --model nuroai/eliot-9b-mlx-4bit --port 8081 --host localhost

Serving With vLLM

bash

vllm serve nuroai/eliot-9b \
  --served-model-name eliot-9b \
  --max-model-len 32768 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --reasoning-parser qwen3

The GH200/vLLM validation path served this merged checkpoint as BF16 with no quantization.
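Once the server is running, it can be queried through vLLM's OpenAI-compatible chat completions route. The following is a minimal sketch using only the standard library, not an official client. It assumes the default vLLM port (8000), and the `open_app` schema with its `app` argument is a hypothetical placeholder, since this card does not document tool argument names.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Assumed endpoint: vLLM listens on port 8000 unless --port is passed.
VLLM_URL = "http://localhost:8000/v1/chat/completions"

# Hypothetical schema for a single tool; the real argument names are not
# documented in this card, so "app" is only an illustration.
OPEN_APP_TOOL: Dict[str, Any] = {
    "type": "function",
    "function": {
        "name": "open_app",
        "description": "Launch or foreground a macOS app.",
        "parameters": {
            "type": "object",
            "properties": {"app": {"type": "string"}},
            "required": ["app"],
        },
    },
}


def build_chat_request(system_prompt: str, observation: str,
                       tools: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build an OpenAI-style chat completions payload for one agent turn."""
    return {
        "model": "eliot-9b",  # matches --served-model-name above
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": observation},
        ],
        "tools": tools,
        "tool_choice": "auto",
        "temperature": 0,
        "max_tokens": 256,
    }


def send_chat_request(payload: Dict[str, Any], url: str = VLLM_URL) -> Dict[str, Any]:
    """POST the payload to the server and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_chat_request(
    "You are Eliot, an on-device computer-use agent. Respond with exactly one tool call per turn.",
    "TASK: Open Notes.\nAPP: Finder\nUI:\n[1] menu item File\nRESULT: none",
    [OPEN_APP_TOOL],
)
```

Calling `send_chat_request(payload)` requires the server from the command above to be reachable; building the payload does not.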

Transformers Quick Start

python

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nuroai/eliot-9b"

# Load the merged BF16 checkpoint; device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# One observation per turn: task, frontmost app, Accessibility tree, previous result.
messages = [
    {
        "role": "system",
        "content": "You are Eliot, an on-device computer-use agent. Respond with exactly one tool call per turn.",
    },
    {
        "role": "user",
        "content": "TASK: Open Notes and create a note called Project Ideas.\nAPP: Finder\nUI:\n[1] menu item File\n[2] button Search\nRESULT: none",
    },
]

# Render the chat template as text, with thinking mode disabled.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding (do_sample=False) keeps the output deterministic.
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens; special tokens are kept so tool-call markup stays visible.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=False))

Action Interface

Eliot is intended to be used with a harness that supplies the current task, frontmost app, Accessibility tree, and previous action result. The model then emits exactly one tool call.
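As a rough illustration of the harness side, the sketch below renders one turn's observation in the same TASK/APP/UI/RESULT layout used by the quick-start prompt. The `UIElement` class and `build_observation` function are hypothetical names introduced here, and the exact serialization the reference harness uses may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UIElement:
    """One accessible element (hypothetical shape: numeric ID, role, label)."""
    element_id: int
    role: str
    label: str


def build_observation(task: str, app: str, elements: List[UIElement],
                      last_result: Optional[str] = None) -> str:
    """Render the user message for one turn in the TASK/APP/UI/RESULT layout."""
    lines = [f"TASK: {task}", f"APP: {app}", "UI:"]
    for element in elements:
        lines.append(f"[{element.element_id}] {element.role} {element.label}")
    # The first turn has no previous action, which is rendered as "none".
    lines.append(f"RESULT: {last_result if last_result is not None else 'none'}")
    return "\n".join(lines)


observation = build_observation(
    task="Open Notes and create a note called Project Ideas.",
    app="Finder",
    elements=[UIElement(1, "menu item", "File"), UIElement(2, "button", "Search")],
)
print(observation)
```

On later turns the harness would pass the outcome of the previous action as `last_result` and rebuild the element list from the current Accessibility tree.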

| Tool | Purpose |
| --- | --- |
| click | Click, press, or select an accessible UI element. |
| type | Focus an element and replace or enter text. |
| open_app | Launch or foreground a macOS app. |
| read | Read accessible text or values. |
| run_shell | Use local shell only when allowed by policy. |
| web_search | Search the web when live facts are required. |
| invoke_intent | Call a structured app or system intent exposed by the harness. |
| ask_user | Ask for clarification or approval. |
| done | Finish honestly with the outcome. |

The recommended system prompt is included as eliot_system_prompt.txt.
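A harness can enforce the one-call-per-turn contract before anything is executed. The sketch below assumes tool calls have already been parsed into the OpenAI-style shape (a list of objects with `function.name` and a JSON-encoded `function.arguments`), as an OpenAI-compatible server returns them. The tool names come from the table above; `validate_turn` and the `app` argument in the example are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple

# Tool names copied from the table above. Argument schemas are not documented
# in this card, so arguments are only checked to be a JSON object.
ELIOT_TOOLS = frozenset({
    "click", "type", "open_app", "read", "run_shell",
    "web_search", "invoke_intent", "ask_user", "done",
})


def validate_turn(tool_calls: List[Dict[str, Any]]) -> Tuple[str, Dict[str, Any]]:
    """Return (tool_name, arguments), or raise ValueError if the turn is malformed."""
    if len(tool_calls) != 1:
        raise ValueError(f"expected exactly one tool call, got {len(tool_calls)}")
    function = tool_calls[0]["function"]
    name = function["name"]
    if name not in ELIOT_TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    arguments = json.loads(function["arguments"])
    if not isinstance(arguments, dict):
        raise ValueError("tool arguments must be a JSON object")
    return name, arguments


# Hypothetical parsed response; the "app" argument name is an assumption.
example_calls = [{"function": {"name": "open_app", "arguments": "{\"app\": \"Notes\"}"}}]
print(validate_turn(example_calls))
```

Rejecting malformed turns here keeps the rest of the harness simple, because every later stage can assume a single known tool with dictionary arguments.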

Evaluation Snapshot

Representative held-out MacBench and live-gate results from the v1.2 release process:

| Check | Result |
| --- | --- |
| MacBench action match, BF16 | 97.7% |
| Destructive confirmation | 100.0% |
| Abstention handling | 100.0% |
| Error recovery | 100.0% |
| Live destructive confirmation, shipping quantization | 100.0% |
| Guarded destructive interception | 100.0% |
| Benign over-ask, live held-out set | 1 / 4 |

These numbers measure the model and reference harness on structured Mac-agent tasks. They are not a claim that the model can safely operate a computer without a deterministic safety layer.
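For readers building their own evaluation, an action-match style metric can be computed as below. This is only one plausible definition (exact equality of tool name and arguments per step). This card does not specify how MacBench scores a match, and the sample data is invented for illustration.

```python
from typing import Any, Dict, List, Tuple

Action = Tuple[str, Dict[str, Any]]


def action_match_rate(predicted: List[Action], expected: List[Action]) -> float:
    """Fraction of steps where the predicted action equals the expected one exactly."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot score an empty episode set")
    matches = sum(1 for p, e in zip(predicted, expected) if p == e)
    return matches / len(expected)


# Invented sample: three of four predicted actions match the references.
expected_actions = [
    ("open_app", {"app": "Notes"}),
    ("click", {"id": 3}),
    ("type", {"id": 5, "text": "Project Ideas"}),
    ("done", {}),
]
predicted_actions = [
    ("open_app", {"app": "Notes"}),
    ("click", {"id": 4}),
    ("type", {"id": 5, "text": "Project Ideas"}),
    ("done", {}),
]
rate = action_match_rate(predicted_actions, expected_actions)
print(rate)
```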

Safety Model

Eliot is not the safety boundary.

The model is trained to ask before destructive actions, but real deployments must enforce policy outside the model. A production harness should intercept destructive, external, credential, payment, network, file-write, send, delete, and shell actions before execution. The reference Eliot/OpenSiri harness follows this design: the model proposes an action, and the guard decides whether it can run, needs approval, or must be denied.
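The propose-then-guard split can be sketched as a pure function that maps a proposed tool call to one of three verdicts. This is a toy illustration, not the reference harness's policy: the tool groupings and the keyword list are assumptions, and keyword matching alone is far too weak for production use.

```python
from enum import Enum
from typing import Any, Dict


class Verdict(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


# Hypothetical policy tables; a real deployment would derive these from its
# own threat model rather than from a keyword list.
READ_ONLY_TOOLS = {"read", "ask_user", "done"}
EXTERNAL_TOOLS = {"web_search", "invoke_intent"}
UI_TOOLS = {"click", "type", "open_app"}
DESTRUCTIVE_HINTS = ("delete", "remove", "erase", "send", "pay")


def guard(tool: str, arguments: Dict[str, Any], shell_allowed: bool = False) -> Verdict:
    """Decide, outside the model, whether a proposed action may run."""
    if tool == "run_shell":
        # Shell is denied unless policy enables it, and even then needs approval.
        return Verdict.NEEDS_APPROVAL if shell_allowed else Verdict.DENY
    if tool in READ_ONLY_TOOLS:
        return Verdict.ALLOW
    if tool in EXTERNAL_TOOLS:
        return Verdict.NEEDS_APPROVAL
    if tool in UI_TOOLS:
        text = " ".join(str(value) for value in arguments.values()).lower()
        if any(hint in text for hint in DESTRUCTIVE_HINTS):
            return Verdict.NEEDS_APPROVAL
        return Verdict.ALLOW
    # Unknown tools fail closed.
    return Verdict.DENY


print(guard("click", {"label": "Delete Note"}))
```

The important property is that the verdict depends only on the proposed action and the policy, so it behaves the same regardless of what the model says about its own intent.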

Do not deploy Eliot as an unguarded remote-control agent.

Intended Use

Eliot is intended for:

  • macOS assistant research and prototyping.
  • Local-first desktop automation.
  • Tool-calling and computer-use harness development.
  • Apple Silicon assistant experiments using the MLX 4-bit build.
  • Evaluation of Accessibility-tree-based agent interfaces.

Out-of-scope uses include stealth automation, credential extraction, unauthorized access, bypassing user consent, spam, payment automation without approval, and any deployment where user data is acted on without clear visibility and control.

License And Attribution

Eliot is released under Apache 2.0. The base model is Qwen/Qwen3.5-9B, also Apache 2.0.

Eliot is an independent project by Nuro AI Labs. It is designed for Apple-platform and macOS workflows, but it is not created by, endorsed by, sponsored by, or affiliated with Apple Inc.

Model provider: nuroai

Base model: Qwen/Qwen3.5-9B

Input modalities: Video, Text, Image

Output modalities: Text
