License: mit
Model components
- Base language model: distilgpt2
- Core checkpoint: ForesightLM seed 42
- Sentence encoder used during training/evaluation: sentence-transformers/all-MiniLM-L6-v2
- Future objective: sentence-boundary contrastive future embedding prediction
- Future-loss weight: lambda_future = 0.08
- Contrastive temperature: tau = 0.07
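To make the roles of lambda_future and tau concrete, here is a minimal sketch of how a contrastive future-embedding term might be combined with the language-modeling loss. The card does not give the exact formulation, so the InfoNCE form, the use of in-batch negatives, and the function names (`info_nce`, `total_loss`) are assumptions for illustration only; the repository's training code is the authoritative source.

```python
import math

LAMBDA_FUTURE = 0.08  # future-loss weight listed in the model card
TAU = 0.07            # contrastive temperature listed in the model card


def _normalize(v):
    # L2-normalize a non-zero vector so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(predicted, targets, tau=TAU):
    """Mean InfoNCE loss over a batch (hypothetical formulation).

    predicted[i] is the future embedding predicted at a sentence boundary,
    targets[i] is the sentence-encoder embedding of the actual next sentence.
    Other targets in the batch serve as negatives (an assumption).
    """
    preds = [_normalize(p) for p in predicted]
    tgts = [_normalize(t) for t in targets]
    total = 0.0
    for i, p in enumerate(preds):
        # Temperature-scaled cosine similarity to every target in the batch.
        logits = [sum(a * b for a, b in zip(p, t)) / tau for t in tgts]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching target as the correct class.
        total += log_z - logits[i]
    return total / len(preds)


def total_loss(lm_loss, predicted, targets, lam=LAMBDA_FUTURE, tau=TAU):
    # Assumed combination: standard LM loss plus a weighted future term.
    return lm_loss + lam * info_nce(predicted, targets, tau)
```

With a weight of 0.08, the future term acts as a small auxiliary signal next to the language-modeling loss, and a low temperature such as 0.07 sharpens the softmax so that near-miss negatives are penalized more strongly.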
Intended use
This checkpoint is intended for research on:
- autoregressive language modeling
- sentence-level semantic planning
- discourse coherence diagnostics
- semantic reranking
- future-representation calibration
Important limitations
This model is a small research prototype. It should not be treated as a production-quality text generator.
Automatic metrics show that semantic reranking is a strong component by itself. Foresight training improves several diagnostics but does not uniformly dominate a reranked baseline. Direct future-head reranking exposes a calibration gap.
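As a rough illustration of what semantic reranking can look like, the sketch below orders candidate continuations by cosine similarity to a reference embedding. The card does not specify the scoring rule, so the cosine criterion and the names `rerank`, `embed`, and `reference_embedding` are hypothetical. Under this reading, the reference could come from the sentence encoder or from the model's future head, and the latter case is where a calibration gap would show up.

```python
import math


def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


def rerank(candidates, reference_embedding, embed):
    """Sort candidates by descending similarity to a reference embedding.

    `embed` is any callable mapping a text candidate to a vector; it is a
    placeholder for a sentence encoder, not an API from the repository.
    """
    scored = [(cosine(embed(c), reference_embedding), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]


# Toy usage with hand-written 2-d "embeddings" standing in for a real encoder.
toy_embeddings = {
    "on topic": [0.0, 1.0],
    "partly related": [0.6, 0.8],
    "off topic": [1.0, 0.0],
}
ranked = rerank(list(toy_embeddings), [0.0, 1.0], toy_embeddings.__getitem__)
```

In the toy example, the candidate whose vector matches the reference is ranked first and the orthogonal one last.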
Human evaluation protocol files are released in the GitHub repository, but human judgments are still being collected and will be added in a later revision.
Reproducibility
Code, SLURM scripts, evaluation summaries, compute-cost accounting, bootstrap confidence intervals, qualitative examples, and reproducibility manifests are available at:
https://github.com/Ahmet2001/foresightLM
Large generation JSONL files and training data are not included in this model repository.
Citation
If you use this checkpoint, please cite the ForesightLM project repository until a paper DOI/arXiv identifier is available.
Model provider
Mandotosh
Model tree
- Base: distilbert/distilgpt2
- Fine-tuned: this model
Modalities
- Input: text
- Output: text