Mandotosh/foresightlm-core-distilgpt2 API & Inference Endpoint

Model components

Base language model: distilgpt2
Core checkpoint: ForesightLM seed 42
Sentence encoder used during training/evaluation: sentence-transformers/all-MiniLM-L6-v2
Future objective: sentence-boundary contrastive future embedding prediction
Future-loss weight: lambda_future = 0.08
Contrastive temperature: tau = 0.07
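
The last three entries can be read together as one combined training objective. The sketch below is a minimal illustration, assuming the future objective is an InfoNCE-style contrastive loss with in-batch negatives and that it is added to the language-modeling loss with weight lambda_future. The function names (`info_nce`, `total_loss`), the in-batch negative scheme, and the plain-list vectors are illustrative assumptions, not the project's actual implementation; the repository is the authoritative source.

```python
import math

LAMBDA_FUTURE = 0.08  # future-loss weight listed in the model card
TAU = 0.07            # contrastive temperature listed in the model card


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def info_nce(predicted, targets, tau=TAU):
    """Mean InfoNCE loss over a batch.

    predicted[i] is the predicted future embedding and targets[i] its
    positive; the other targets in the batch serve as negatives
    (an assumption made for this sketch).
    """
    total = 0.0
    for i, pred in enumerate(predicted):
        logits = [cosine(pred, t) / tau for t in targets]
        # Numerically stable log-sum-exp over the similarity logits.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(predicted)


def total_loss(lm_loss, future_loss, lam=LAMBDA_FUTURE):
    """Language-modeling loss plus the weighted future loss."""
    return lm_loss + lam * future_loss
```

With tau = 0.07, cosine similarities in [-1, 1] are stretched to logits in roughly [-14.3, 14.3], so the loss penalizes even moderately confusable negatives sharply. The small weight of 0.08 keeps the language-modeling term dominant under this reading.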

Intended use

This checkpoint is intended for research on:

autoregressive language modeling
sentence-level semantic planning
discourse coherence diagnostics
semantic reranking
future-representation calibration
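
To make "semantic reranking" concrete: one common reading is to generate several candidate continuations and reorder them by embedding similarity to a reference vector, such as an embedding of the context. The sketch below assumes that reading. The `rerank` helper and the hand-written vectors are hypothetical; in practice the embeddings would come from a sentence encoder, and the scoring used by the project may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(candidates, reference):
    """Sort candidates by descending cosine similarity to a reference embedding.

    candidates: list of (text, embedding) pairs.
    reference:  embedding to compare against (hypothetical choice: the
                context embedding or a predicted future embedding).
    Returns a list of (text, score) pairs, best first.
    """
    scored = [(text, cosine(embedding, reference)) for text, embedding in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rerank(candidates, reference)[0]` then gives the candidate whose embedding lies closest to the reference, together with its score.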

Important limitations

This model is a small research prototype. It should not be treated as a production-quality text generator.

On automatic metrics, semantic reranking alone is already a strong component. Foresight training improves several diagnostics but does not beat a reranked baseline across the board. Reranking directly with the future head exposes a calibration gap.

Human evaluation protocol files are released in the GitHub repository, but human judgments are still being collected and will be added in a later revision.

Reproducibility

Code, SLURM scripts, evaluation summaries, compute-cost accounting, bootstrap confidence intervals, qualitative examples, and reproducibility manifests are available at:

https://github.com/Ahmet2001/foresightLM
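
For readers unfamiliar with the bootstrap confidence intervals mentioned above, the sketch below shows a generic percentile bootstrap for the mean of per-example scores. It is an illustration only: the `bootstrap_ci` name, the resample count, and the seed are assumptions, and the repository's own scripts may use a different estimator or interval type.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Resamples `values` with replacement `n_resamples` times, computes the
    mean of each resample, and returns the (alpha/2, 1 - alpha/2)
    percentiles of those means as (low, high).
    """
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    low_index = max(0, int((alpha / 2) * n_resamples))
    high_index = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return means[low_index], means[high_index]
```

A typical use is to compute the interval separately for each system's per-example metric and check whether the intervals overlap before claiming a difference.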

Large generation JSONL files and training data are not included in this model repository.

Citation

If you use this checkpoint, please cite the ForesightLM project repository until a paper DOI/arXiv identifier is available.
