jiazhisun01
kennys-code-completion-model-0.2B
README
License: apache-2.0
Model Details
- Architecture: GPT2LMHeadModel
- Parameters: ~0.2B
- Context length: 1024 tokens
- Tokenizer: Byte-level BPE
- Vocabulary size: 32,000
- Training data: codeparrot/codeparrot-clean
- Task: short code completion / code continuation
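Because the context length is 1024 tokens, a long prompt has to be shortened so that the prompt and the generated tokens fit together. A minimal sketch, assuming left truncation (keeping the most recent tokens) suits code completion; the helper name `truncate_prompt` is hypothetical and not part of this model's code:

```python
CONTEXT_LENGTH = 1024  # context length listed in the model details


def truncate_prompt(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep the last tokens of the prompt so prompt + completion fit the window.

    Hypothetical helper: left truncation is an assumption, chosen because the
    code nearest the cursor usually matters most for completion.
    """
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    budget = context_length - max_new_tokens  # tokens left for the prompt
    return list(token_ids)[-budget:]


# A 2000-token prompt with 24 new tokens keeps its last 1000 tokens.
long_prompt = list(range(2000))
kept = truncate_prompt(long_prompt, max_new_tokens=24)
print(len(kept), kept[0], kept[-1])
```

Short prompts pass through unchanged, since the slice only drops tokens when the prompt exceeds the budget.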
Architecture Configuration
json
{
  "model_type": "gpt2",
  "vocab_size": 32000,
  "n_positions": 1024,
  "n_ctx": 1024,
  "n_embd": 768,
  "n_layer": 24,
  "n_head": 12,
  "activation_function": "gelu_new",
  "position_embedding": "learned absolute positional embedding"
}
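As a sanity check on the "~0.2B" figure, the parameter count can be estimated from the configuration above. This is a rough sketch that assumes the standard GPT-2 block layout (fused QKV projection, 4x MLP expansion, biases on all linear layers) and an output head tied to the token embedding; neither assumption is stated in the card, and the function name is hypothetical:

```python
config = {
    "vocab_size": 32000,
    "n_positions": 1024,
    "n_embd": 768,
    "n_layer": 24,
}


def estimate_gpt2_parameters(cfg):
    """Estimate parameters for a standard GPT-2 layout (assumed, not confirmed)."""
    d = cfg["n_embd"]
    token_emb = cfg["vocab_size"] * d          # token embedding (assumed tied with LM head)
    pos_emb = cfg["n_positions"] * d           # learned absolute position embedding
    attn = (d * 3 * d + 3 * d) + (d * d + d)   # QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)  # up projection + down projection
    layer_norms = 2 * (2 * d)                  # two LayerNorms per block (weight + bias)
    per_layer = attn + mlp + layer_norms
    final_ln = 2 * d                           # final LayerNorm
    return token_emb + pos_emb + cfg["n_layer"] * per_layer + final_ln


total = estimate_gpt2_parameters(config)
print(f"{total:,}")  # 195,472,896
```

Under these assumptions the estimate is about 195M parameters, consistent with the stated ~0.2B.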
Intended Use
This model is intended for lightweight code completion experiments, especially short Python-style completions.
Example Usage
python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "jiazhisun01/kennys-code-completion-model-0.2B"

# Load the tokenizer and model weights.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Use a GPU when one is available, and switch to inference mode.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

prompt = "def fib"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Greedy decoding of a short completion.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=24,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Recommended Generation Settings
For short code completion, use a small number of generated tokens:
markdown
max_new_tokens = 8-32
do_sample = False
or
markdown
do_sample = True
temperature = 0.2
top_p = 0.9
repetition_penalty = 1.1
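The two presets above can be bundled as keyword arguments for `model.generate`. This is a small sketch, not part of the model's own code: the helper name `generation_kwargs` and the default of 24 new tokens are assumptions, while the keys are standard `generate` arguments in `transformers`.

```python
def generation_kwargs(sample=False, max_new_tokens=24):
    """Return generate() keyword arguments for one of the two recommended presets.

    Hypothetical helper; the card recommends max_new_tokens between 8 and 32.
    """
    kwargs = {"max_new_tokens": max_new_tokens, "do_sample": sample}
    if sample:
        # Low-temperature nucleus sampling with a mild repetition penalty.
        kwargs.update(temperature=0.2, top_p=0.9, repetition_penalty=1.1)
    return kwargs


greedy = generation_kwargs()
sampled = generation_kwargs(sample=True, max_new_tokens=16)
print(greedy)
print(sampled)

# Usage, assuming `model` and `inputs` from the Example Usage section:
# outputs = model.generate(**inputs, **generation_kwargs(sample=True))
```

Greedy decoding gives reproducible completions; the sampling preset trades some determinism for variety while staying close to the most likely tokens.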
Training Procedure
The model was trained in two stages:
- Base language modeling: trained on tokenized code blocks from codeparrot/codeparrot-clean.
- Short completion tuning: continued training on short completion examples where only the completion part contributes to the loss.
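The second stage, where only the completion contributes to the loss, is commonly implemented by masking the prompt positions in the labels. The sketch below is an illustration rather than the author's training code: it assumes the usual `-100` ignore index (the default `ignore_index` of PyTorch's cross-entropy loss, also used by Hugging Face causal LM models), and `build_example` is a hypothetical helper.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss


def build_example(prompt_ids, completion_ids):
    """Concatenate prompt and completion, masking the prompt in the labels.

    Hypothetical helper: the model still sees the prompt as input, but only
    completion tokens carry a training signal.
    """
    input_ids = list(prompt_ids) + list(completion_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return {"input_ids": input_ids, "labels": labels}


# Token ids here are made up for illustration.
example = build_example(prompt_ids=[11, 12, 13], completion_ids=[21, 22])
print(example["input_ids"])  # [11, 12, 13, 21, 22]
print(example["labels"])     # [-100, -100, -100, 21, 22]
```

The one-position shift between inputs and targets is handled inside the model's loss computation, so inputs and labels stay aligned here.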
Limitations
This is a small model trained from scratch. It may:
- produce syntactically invalid code
- generate incomplete snippets
- repeat tokens
- fail on complex programming tasks
- reproduce patterns from the training data
It is best used for educational experiments and lightweight code completion demos, not production software development.
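Since the model may emit syntactically invalid or incomplete code, a demo can screen Python completions before showing them. This is a minimal sketch using the standard library's `ast` module; `is_valid_python` is a hypothetical helper. It only checks syntax, not correctness, and it rejects fragments that are incomplete on their own, so it is best applied to the prompt and completion joined into a whole snippet.

```python
import ast


def is_valid_python(source):
    """Return True if `source` parses as Python; syntax check only."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        return False
    return True


complete = "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)\n"
broken = "def fib(n:\n    return n\n"
unfinished = "def fib(n):"

print(is_valid_python(complete))
print(is_valid_python(broken))
print(is_valid_python(unfinished))
```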
Model provider
jiazhisun01
Model tree
Base: this model
Modalities
Input: Text
Output: Text