What this model does
Offline patent-prosecution reasoning on Spark-class hardware
Patent prosecution work — claim construction, MPEP-grounded office-action responses, Markush analysis, doctrine-of-equivalents reasoning — happens inside firms that can't ship privileged client text to a hosted frontier API. This release distills DeepSeek-R1's chain-of-thought reasoning onto a 5,000-row synthetic patent-reasoning corpus so a single Spark-class box can run the workflow offline, with full IRAC-shaped reasoning chains.
Use cases:
- Claim construction (Markush groups, doctrine of equivalents)
- MPEP-grounded office-action argument drafting
- Prior-art relevance + non-obviousness reasoning chains
- Patent-licensing scenario analysis (most-favored-licensee, FTO)
Who this is for: Patent attorneys, prosecution-team engineers, and IP-strategy teams running privileged workflows offline on Spark-class hardware (GB10, 128 GB unified memory) or comparable edge devices.
Notebooks
Two runnable notebooks ship with this model — open either on a free cloud GPU:
| Notebook | What it does | Open |
|---|---|---|
| Builder | Reproduce this model's build and DGX Spark benchmarks end-to-end with fieldkit. |  |
| User | Load the published model and call it from your own app in a few lines. |  |
Choosing this lane
Unsloth-trained BF16 merged weights. Pick this lane if you want to continue training in Unsloth's 4-bit QLoRA workflow, or if you need transformers-format weights for inference paths outside llama.cpp. In the bakeoff, training on this lane took 7h 34m of wall-clock time (vs 5h 38m on NeMo Framework), with a probe think rate of 0.80 and a mean chain length of 916 tokens. For pure inference on Spark-class hardware, the GGUF sibling is faster; for the bakeoff-winning checkpoint, see the NeMo lane.
Spark measurements (BF16 merged):
| Variant | Size | Train wall | Probe think rate | Mean chain |
|---|---|---|---|---|
| BF16 | 15.26 GB | 7h 34m | 0.80 | 916 tok |
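The card does not define the "Probe think rate" and "Mean chain" columns here; the Methods write-up is the authoritative source. One plausible reading, offered only as an assumption, is that think rate is the fraction of probe responses containing a non-empty `<think>...</think>` block and mean chain is the average length of those blocks. A minimal sketch under that assumption follows; `probe_stats` is a hypothetical helper, and whitespace word counts stand in for real tokenizer counts.

```python
import re

# Assumption: reasoning chains are delimited R1-style with <think>...</think>.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def probe_stats(outputs: list[str]) -> tuple[float, float]:
    """Return (think_rate, mean_chain_length) for a list of decoded outputs."""
    chains = []
    for text in outputs:
        match = THINK_RE.search(text)
        if match and match.group(1).strip():
            chains.append(match.group(1))
    think_rate = len(chains) / len(outputs) if outputs else 0.0
    # Whitespace word count is a stand-in for tokenizer tokens.
    lengths = [len(chain.split()) for chain in chains]
    mean_chain = sum(lengths) / len(lengths) if lengths else 0.0
    return think_rate, mean_chain


outputs = [
    "<think>claim one recites a closed group</think>Answer A",
    "Answer B with no chain",
    "<think>equivalents are limited here</think>Answer C",
    "<think>short chain</think>Answer D",
    "<think>prior art teaches away from it</think>Answer E",
]
rate, mean_len = probe_stats(outputs)
print(rate, mean_len)  # 0.8 4.5
```

To compare against the table, swap the word count for the length of `tok(chain).input_ids` using the model's own tokenizer.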
How to run
HuggingFace Transformers:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "Orionfold/patent-strategist-v3-unsloth"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Single-turn prompt with explicit user/assistant turn markers.
prompt = (
    "<|User|>A patent claim recites \"a fastener selected from the group consisting "
    "of bolts, screws, and rivets.\" Walk through the Markush-group construction "
    "and explain how doctrine of equivalents applies to a magnetic snap.<|Assistant|>"
)
inputs = tok(prompt, return_tensors="pt").to(model.device)

# Pass do_sample=True explicitly so temperature and top_p apply even if the
# model's generation config defaults to greedy decoding.
out = model.generate(
    **inputs, max_new_tokens=1024, do_sample=True, temperature=0.6, top_p=0.95
)

# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
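DeepSeek-R1-style models usually emit their reasoning chain before the final answer and close it with a `</think>` tag. Assuming this checkpoint follows that convention (the card does not confirm the exact delimiters), a minimal sketch for separating the chain from the answer in the decoded text might look like this. `split_reasoning` is a hypothetical helper name, not part of any library.

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split decoded output into (reasoning_chain, final_answer).

    Assumes R1-style <think>...</think> delimiters. The opening tag is
    optional, since some chat templates prefill it in the prompt.
    Returns an empty chain when no closing tag is found.
    """
    close_tag = "</think>"
    if close_tag not in text:
        return "", text.strip()
    chain, _, answer = text.partition(close_tag)
    chain = chain.replace("<think>", "", 1).strip()
    return chain, answer.strip()


sample = (
    "<think>Markush groups are closed lists.</think>"
    "The magnetic snap is not a listed member."
)
chain, answer = split_reasoning(sample)
print(chain)
print(answer)
```

Keeping the chain separate makes it easier to log the reasoning for attorney review while showing only the final answer in a drafting tool.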
Methods
Full methodology and Spark-side measurement protocol: Two paths to the same chain — Unsloth vs NeMo Framework on Spark.
Known drift
Bounded limitations observed during Spark-side measurement. Each item below names the artifact and the scope of the drift; the rest of the benchmark measured clean (see Methods for the full breakdown).
- "metes-and-times" terminology: one of two known drifts inherited from the v3 synthetic corpus; the balance of probe answers (~99%) cite real MPEP sections. The correct legal term in claim construction is "metes and bounds".
- Fabricated MPEP §2163.05(s) citation: same scope, a corpus-generator artifact rather than a model-wide hallucination pattern. Real §2163.05 has subsections (a)–(f) on written-description support; subsection (s) does not exist.
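Because both artifacts are specific strings, generated drafts can be screened for them before review. The following is a minimal sketch, not something shipped with the model: `KNOWN_DRIFT` and `find_known_drift` are hypothetical names, and the patterns cover only the two items listed above.

```python
import re

# Patterns for the two drift artifacts named in this section.
KNOWN_DRIFT = {
    "metes-and-times": re.compile(r"metes[\s-]+and[\s-]+times", re.IGNORECASE),
    "MPEP 2163.05(s)": re.compile(r"2163\.05\(s\)"),
}


def find_known_drift(text: str) -> list[str]:
    """Return the labels of known drift artifacts found in the text."""
    return [label for label, pattern in KNOWN_DRIFT.items() if pattern.search(text)]


draft = "The metes and times of the claim are set out under MPEP §2163.05(s)."
print(find_known_drift(draft))  # ['metes-and-times', 'MPEP 2163.05(s)']
```

A check like this only catches the listed artifacts; every citation in a draft still needs verification against the MPEP itself.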
Other Orionfold variants
Sibling repos from the same release:
Published by Orionfold LLC · orionfold.com · Methods documented at ainative.business/field-notes.