JasonShiii/step-llm-llama3b-no_rag API & Inference Endpoint

Usage

python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

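# Load the base model first, then attach the LoRA adapter weights on top of it.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
model = PeftModel.from_pretrained(base_model, "JasonShiii/step-llm-llama3b-no_rag")
# The tokenizer is loaded from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained("JasonShiii/step-llm-llama3b-no_rag")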
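
The snippet above only loads the model and tokenizer. A minimal sketch of the generation step might look like the following. The helper name `generate_text`, greedy decoding, and the `max_new_tokens` default are assumptions for illustration, not settings taken from the repository. The prompt itself is left to the caller, because the exact no-RAG prompt template is defined in the GitHub repo and is not reproduced here.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=4096):
    """Decode a continuation of `prompt` and return only the newly generated text.

    Assumptions (not from the model card): greedy decoding and the
    max_new_tokens default. Adjust both to match the repo's inference script.
    """
    # Tokenize and move the tensors to the device the model lives on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding (assumed)
    )
    # generate() returns prompt + continuation; slice off the prompt tokens.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as shown above, this would be called as `generate_text(model, tokenizer, prompt)`, where `prompt` is built with the repo's no-RAG template.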

Or use the inference script from the GitHub repo:

bash
python generate_step.py \
    --ckpt_path JasonShiii/step-llm-llama3b-no_rag \
    --caption "A cylindrical bolt with a hexagonal head"

Note: this adapter was trained with the no-RAG prompt template, so do not pass --use_rag when using it. For RAG inference, use JasonShiii/step-llm-llama3b instead.
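
Both paths produce STEP text, and the training details list files with 0-500 entities. As a rough sanity check on a generated file, the sketch below counts entity instances (lines of the form `#12 = ...;`) and compares the count against that range. The helper names and the sample file are hypothetical, and the check says nothing about geometric validity.

```python
import re

# Matches the start of an entity instance such as "#12 = CARTESIAN_POINT(...);".
# Bare references inside argument lists ("#12" with no "=") are not counted.
_ENTITY_START = re.compile(r"#\d+\s*=")


def count_step_entities(step_text: str) -> int:
    """Count entity instances in STEP (ISO 10303-21) text."""
    return len(_ENTITY_START.findall(step_text))


def within_training_range(step_text: str, max_entities: int = 500) -> bool:
    """True if the entity count is at most the upper bound seen in training."""
    return count_step_entities(step_text) <= max_entities


# Hypothetical three-entity file, for illustration only.
sample = """ISO-10303-21;
HEADER;
ENDSEC;
DATA;
#1 = CARTESIAN_POINT('', (0.0, 0.0, 0.0));
#2 = DIRECTION('', (0.0, 0.0, 1.0));
#3 = AXIS2_PLACEMENT_3D('', #1, #2, $);
ENDSEC;
END-ISO-10303-21;
"""

print(count_step_entities(sample))    # 3
print(within_training_range(sample))  # True
```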

Training Details

Parameter	Value
Base model	Llama-3.2-3B-Instruct
LoRA rank (r)	16
lora_alpha	16
Learning rate	5e-5
Batch size	2 (x4 grad accum = effective 8)
max_seq_length	16384
Training data	~20k STEP files, 0-500 entities
Training steps	6300
Prompt template	no-RAG (caption -> output, no retrieved example)
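
The numbers in the table imply a few derived quantities, worked out in the sketch below. It assumes that "Training steps" counts optimizer updates (one per effective batch) and that all of the roughly 20k files were used for training. Neither is stated in the table, so the epoch figure is only an estimate. The scaling factor uses the standard LoRA convention of `lora_alpha / r`.

```python
# Values copied from the Training Details table.
LORA_RANK = 16
LORA_ALPHA = 16
MICRO_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
TRAINING_STEPS = 6300
APPROX_NUM_FILES = 20_000  # "~20k" in the table, so only approximate


def effective_batch_size(micro_batch: int, grad_accum: int) -> int:
    """Sequences contributing to a single optimizer update."""
    return micro_batch * grad_accum


def lora_scaling(alpha: int, rank: int) -> float:
    """Standard LoRA scaling factor alpha / r applied to the low-rank update."""
    return alpha / rank


def approx_epochs(steps: int, batch: int, num_examples: int) -> float:
    """Rough number of passes over the data.

    Assumption: each step is one optimizer update over a full effective batch.
    """
    return steps * batch / num_examples


batch = effective_batch_size(MICRO_BATCH_SIZE, GRAD_ACCUM_STEPS)
print(batch)                                                   # 8
print(lora_scaling(LORA_ALPHA, LORA_RANK))                     # 1.0
print(approx_epochs(TRAINING_STEPS, batch, APPROX_NUM_FILES))  # 2.52
```

Under these assumptions the adapter saw about 50,400 sequences, or roughly 2.5 passes over the data, with a LoRA scaling factor of 1.0.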

Citation

bibtex
@article{shi2026step,
  title={STEP-LLM: Generating CAD STEP Models from Natural Language with Large Language Models},
  author={Shi, Xiangyu and Ding, Junyang and Zhao, Xu and Zhan, Sinong and Mohapatra, Payal
          and Quispe, Daniel and Welbeck, Kojo and Cao, Jian and Chen, Wei and Guo, Ping and others},
  journal={arXiv preprint arXiv:2601.12641},
  year={2026}
}
