README

License: apache-2.0

Model Details

  • Base model: HuggingFaceTB/SmolLM2-135M-Instruct
  • Model type: Causal Language Model
  • Language: English
  • License: Apache 2.0
  • Finetuned by: Anugya Sahu

Training Data

  • Dataset: RomanTeucher/text2cypher-curated
  • 1000 training samples, 75 validation, 50 test
  • Each sample contains a graph schema, a natural language question, and a target Cypher query
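As a rough illustration of how the three fields of a sample could be joined into one training string, the sketch below reuses the prompt layout shown under How to Use. The function name `build_training_text`, the newline between the `### Cypher:` marker and the target query, and the optional `eos_token` suffix are assumptions made for illustration; the card does not state the exact formatting used during fine-tuning.

```python
# Prompt layout taken from the "How to Use" section of this card.
PROMPT_TEMPLATE = """### Schema:
{schema}
### Question:
{question}
### Cypher:"""


def build_training_text(schema: str, question: str, cypher: str, eos_token: str = "") -> str:
    """Join one sample into a single prompt-plus-target string (hypothetical helper)."""
    prompt = PROMPT_TEMPLATE.format(schema=schema, question=question)
    # Assumption: the target query follows the marker on a new line.
    return f"{prompt}\n{cypher}{eos_token}"


example = build_training_text(
    schema="Movie {title, year}, Person {name}, (Person)-[:DIRECTED]->(Movie)",
    question="Which movies did Christopher Nolan direct before 2010?",
    cypher=(
        "MATCH (p:Person {name: 'Christopher Nolan'})-[:DIRECTED]->(m:Movie) "
        "WHERE m.year < 2010 RETURN m.title"
    ),
)
print(example)
```

The target query in this example is hand-written for the illustration and is not taken from the dataset.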

How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("Anugya/text2cypher-smollm2")
tokenizer = AutoTokenizer.from_pretrained("Anugya/text2cypher-smollm2")
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token

schema = "Movie {title, year}, Person {name}, (Person)-[:DIRECTED]->(Movie)"
question = "Which movies did Christopher Nolan direct before 2010?"

# The prompt lists the schema, then the question, then an open Cypher marker.
prompt = f"""### Schema:
{schema}
### Question:
{question}
### Cypher:"""
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding, capped at 128 new tokens.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Drop the prompt tokens so that only the generated Cypher is printed.
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```
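Small models sometimes keep generating after the query, for example by starting another `###` section. A minimal post-processing sketch, assuming that everything after the first `###` marker can be discarded, is shown here. The helper name `extract_cypher` is hypothetical and not part of this model's code.

```python
def extract_cypher(generated_text: str) -> str:
    """Keep the text before the first '###' marker and collapse whitespace.

    Hypothetical helper: collapsing whitespace also alters runs of spaces
    inside string literals, so treat this as a sketch only.
    """
    head = generated_text.split("###", 1)[0]
    return " ".join(head.split())


raw = " MATCH (p:Person)-[:DIRECTED]->(m:Movie)\nRETURN m.title\n### Question:\nWho acted"
print(extract_cypher(raw))
# prints: MATCH (p:Person)-[:DIRECTED]->(m:Movie) RETURN m.title
```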

Training Details

  • Full fine-tune — all weights updated, no LoRA
  • Epochs: 3
  • Learning rate: 2e-4
  • Batch size: 4
  • Max token length: 256
  • Hardware: CPU (Apple M-series)
  • Precision: float32
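The listed settings imply a small training budget, which the following back-of-the-envelope sketch makes concrete. It assumes no gradient accumulation, that the final partial batch (if any) is kept, and a nominal 135M parameter count taken from the base model's name; none of these details are confirmed by the card.

```python
import math

train_samples = 1000      # from "Training Data"
batch_size = 4
epochs = 3
n_params = 135_000_000    # nominal size implied by the base model name
bytes_per_param = 4       # float32

# Assumption: one optimizer step per batch (no gradient accumulation).
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

# Memory for the weights alone; gradients and optimizer state add more.
weights_mb = n_params * bytes_per_param / 1e6

print(steps_per_epoch, total_steps, weights_mb)
# prints: 250 750 540.0
```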

Evaluation

Evaluated on 50 test samples using:

  • Exact Match: strict string comparison after lowercasing and stripping surrounding whitespace
  • Token F1: harmonic mean of precision and recall over the tokens shared by the prediction and the ground truth
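The two metrics can be sketched as follows. This is one plausible reading rather than the author's evaluation script: it assumes that "stripping" means removing surrounding whitespace and that tokens are obtained by splitting on whitespace. The function names are illustrative.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and strip surrounding whitespace."""
    return text.lower().strip()


def exact_match(prediction: str, reference: str) -> bool:
    """True when the normalized strings are identical."""
    return normalize(prediction) == normalize(reference)


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall (whitespace tokens)."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a perfect match; one empty counts as zero.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sequences.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


pred = "MATCH (m:Movie) RETURN m.title"
gold = "MATCH (m:Movie) RETURN m.title, m.year"
print(exact_match(pred, gold), round(token_f1(pred, gold), 3))
# prints: False 0.667
```

With whitespace tokenization, `m.title` and `m.title,` count as different tokens, so punctuation differences lower the score even when the query is nearly correct.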

Limitations

  • Small model (135M parameters): generated Cypher often looks plausible but is incorrect
  • No query execution validation against a real Neo4j database
  • May struggle with complex schemas or multi-hop queries
  • Trained on CPU for only 3 epochs: longer training on more data would likely improve results
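Because outputs are not validated by execution, a cheap structural check can at least catch obviously malformed queries before they reach a database. The sketch below is not a Cypher parser and is no substitute for running the query against Neo4j. The function name, the list of accepted leading clauses, and the bracket-balancing rule are all assumptions chosen for illustration, and escaped quotes inside string literals are not handled.

```python
# Closing bracket -> matching opening bracket.
PAIRS = {")": "(", "]": "[", "}": "{"}
# Assumption: a read query starts with one of these clauses.
LEADING_CLAUSES = ("MATCH", "OPTIONAL MATCH", "WITH", "UNWIND", "CALL", "RETURN")


def looks_well_formed(query: str) -> bool:
    """Heuristic check: known leading clause and balanced brackets."""
    text = query.strip()
    if not text.upper().startswith(LEADING_CLAUSES):
        return False
    stack = []
    quote = None  # current string delimiter, if inside a literal
    for ch in text:
        if quote:
            if ch == quote:
                quote = None
            continue  # ignore brackets inside string literals
        if ch in "'\"":
            quote = ch
        elif ch in PAIRS.values():
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack and quote is None


print(looks_well_formed("MATCH (m:Movie) WHERE m.year < 2010 RETURN m.title"))
print(looks_well_formed("MATCH (m:Movie RETURN m.title"))
```

The first call prints `True` and the second prints `False`, because the opening parenthesis is never closed. A query can pass this check and still be semantically wrong for the given schema.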

Model provider

Anugya

Model tree

  • Base: HuggingFaceTB/SmolLM2-135M-Instruct
  • Fine-tuned: this model

Modalities

  • Input: Text
  • Output: Text
