deep-conrad/conrad_nit API & Inference Endpoint | FriendliAI

README

License: apache-2.0

What ships here

train_stage1.py for the Stage 1 GPT baseline
train_llama_stage2.py for LoRA fine-tuning on LLaMA
train_lora.py as a compatibility entry point for the LoRA path
train_pipeline.py to run both stages in sequence
build_datasets.py to generate topic-sharded synthetic datasets
data/ with example JSONL training sets
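Since the record schema of the example JSONL files is not described here, it can help to inspect a file before training on it. The following is a minimal sketch, not part of the repo: the helper names iter_jsonl and summarize_keys are hypothetical, and the only assumption is that each non-empty line holds one JSON value.

```python
import json
from pathlib import Path
from typing import Dict, Iterator


def iter_jsonl(path: str) -> Iterator[object]:
    """Yield one parsed JSON value per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc


def summarize_keys(path: str) -> Dict[str, int]:
    """Count how often each top-level key appears across object records."""
    counts: Dict[str, int] = {}
    for record in iter_jsonl(path):
        if not isinstance(record, dict):
            continue  # skip records that are not JSON objects
        for key in record:
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Calling summarize_keys on one of the files under data/ shows which fields every record carries and which are optional, without assuming any particular schema.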

Model Direction

The project is intended to produce two artifacts:

a meaningful internal baseline from Stage 1
a higher-quality assistant checkpoint or adapter from Stage 2

The Stage 2 model is the release artifact for normal inference.
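To illustrate the difference between an adapter and a merged checkpoint, here is a minimal sketch of the standard LoRA merge, W' = W + (alpha / r) * B A, written with plain nested lists. It is not the repo's merge script; the function name merge_lora and its parameters are hypothetical, and real merges operate on framework tensors rather than Python lists.

```python
def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) as a new nested list.

    Shapes: weight is d_out x d_in, lora_b is d_out x rank,
    lora_a is rank x d_in.
    """
    scale = alpha / rank
    merged = [row[:] for row in weight]  # copy so the base weight is untouched
    for i in range(len(weight)):
        for j in range(len(weight[0])):
            # (B @ A)[i][j] is the low-rank update for this weight entry
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            merged[i][j] += scale * delta
    return merged
```

An adapter stores only the small A and B matrices next to the frozen base weights, whereas a merged checkpoint stores the combined weights and needs no adapter at inference time.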

Base Model

Stage 2 currently targets:

meta-llama/Llama-3.1-8B-Instruct

The earlier LoRA script in this repo also supports the same model family.

Training Flow

Recommended flow:

Run python build_datasets.py --output_dir data/generated --include_docs to generate the topic-specific shards and the aggregate JSONL files.
If you want the small GPT baseline, train Stage 1 on data/generated/stage1_sft.jsonl by running python train_stage1.py.
Train Stage 2 on data/generated/stage2_conrad_sft.jsonl, or run python train_pipeline.py --include_docs, which performs the build step automatically.
Merge the Stage 2 adapter with python merge_stage2_lora.py.
Sync the merged checkpoint into the model repo with python sync_checkpoint.py.
Publish the merged checkpoint.
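The steps above can be sketched as a small driver script. This is an illustration under assumptions, not a file in the repo: the names build_commands and run_flow are hypothetical, only the flags quoted in this README are used, and invoking train_llama_stage2.py, merge_stage2_lora.py, and sync_checkpoint.py without arguments assumes their defaults match the paths above.

```python
import subprocess
import sys
from typing import List


def build_commands(include_stage1: bool = True) -> List[List[str]]:
    """Return the recommended flow as a list of argv lists."""
    py = sys.executable
    commands = [
        [py, "build_datasets.py", "--output_dir", "data/generated", "--include_docs"],
    ]
    if include_stage1:
        commands.append([py, "train_stage1.py"])  # optional GPT baseline
    # Assumption: these scripts run with their default arguments.
    commands.append([py, "train_llama_stage2.py"])
    commands.append([py, "merge_stage2_lora.py"])
    commands.append([py, "sync_checkpoint.py"])
    return commands


def run_flow(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each step; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the flow at the first failing step
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_flow(build_commands())
```

The dry run is the default so the sequence can be reviewed before any GPU time is spent; publishing the merged checkpoint remains a separate manual step.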

Intended Use

This project is designed for:

conversational assistants
documentation assistants
support routing
enterprise workflows
knowledge assistants
internal tooling
structured response generation

Notes

Stage 1 and Stage 2 are separate training jobs.
Stage 1 is not a wrapper around LLaMA.
Stage 2 is the higher-quality assistant tuning step.
The repo includes example datasets only; real training data is still required.
Set CONRAD_ENDPOINT_URL and HF_TOKEN in the Space secrets or environment to enable production chat.
If the endpoint is unavailable, the Space will show a clear fallback instead of generating from the raw checkpoint.
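The endpoint configuration and fallback described in the last two notes could look roughly like the sketch below. It is not the Space's actual code: the helper names, the fallback wording, the Bearer authorization header, and the send callable (whatever function performs the HTTP request) are all assumptions made for illustration.

```python
import os
from typing import Callable, Mapping, Optional

# Hypothetical wording; the Space's real fallback text is not shown in this README.
FALLBACK_MESSAGE = "The chat endpoint is not configured or is unavailable."


def load_endpoint_config(env: Optional[Mapping[str, str]] = None) -> Optional[dict]:
    """Return endpoint settings, or None if either variable is missing."""
    env = os.environ if env is None else env
    url = env.get("CONRAD_ENDPOINT_URL", "").strip()
    token = env.get("HF_TOKEN", "").strip()
    if not url or not token:
        return None
    # Assumption: the endpoint accepts the token as a Bearer credential.
    return {"url": url, "headers": {"Authorization": f"Bearer {token}"}}


def chat_or_fallback(
    prompt: str,
    send: Callable[[dict, str], str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Call the endpoint through send, or return the fallback message."""
    config = load_endpoint_config(env)
    if config is None:
        return FALLBACK_MESSAGE
    try:
        return send(config, prompt)
    except Exception:
        # Any request failure is treated as "endpoint unavailable".
        return FALLBACK_MESSAGE
```

Passing send in as a parameter keeps the sketch independent of any HTTP library and makes the fallback path easy to exercise without network access.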

Model provider: deep-conrad

Model tree: base (this model)

Modalities: text input, text output
