Novaspree

llama-3.2-3B-tofu-adapter

Model Details

Developed by: Suryash Yagnik, Shubham Gaur, Saksham Thakur, Vinija Jain, Aman Chadha, Amitava Das
Model type: LoRA adapter
Base model: meta-llama/Llama-3.2-3B
Method: Multi-phase Adapter-Aware Targeted Unlearning (MAAT)
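
Because this model is a LoRA adapter rather than a full checkpoint, it has to be attached to the base model before use. The following is a minimal sketch of one way to do that with the Hugging Face `transformers` and `peft` libraries. The adapter repository id is an assumption built from the provider and model names on this card, and `load_unlearned_model` is a hypothetical helper name; neither is confirmed by the authors.

```python
BASE_MODEL_ID = "meta-llama/Llama-3.2-3B"  # base model named on this card
# Assumed repository id (provider name + model name); adjust if the actual id differs.
ADAPTER_ID = "Novaspree/llama-3.2-3B-tofu-adapter"


def load_unlearned_model(adapter_id=ADAPTER_ID, base_id=BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter on top of it."""
    # Imported lazily so this file can be read or imported without the
    # heavy dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    # PeftModel.from_pretrained wraps the base model with the adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model
```

Calling `load_unlearned_model()` downloads both the base weights and the adapter, so it needs network access and, for the Llama base model, an accepted license on the hosting hub.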

Summary

The MAAT framework establishes a new operating point on the forget-retain Pareto frontier. It achieves high forgetting and high retention on causal knowledge by:

Gradient Policy Ascent: Using orthogonal projection to remove retain components from the forget gradient.
Structural Compression: Pruning rank dimensions via SVD profiling.
Utility Repair: Applying a multi-objective engine to maintain performance on the retain set.
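
The orthogonal projection in the first phase can be illustrated with a small sketch. It assumes the simplest case, a single retain gradient direction, and subtracts the forget gradient's component along it. The function names are hypothetical, and the actual MAAT procedure may project against a larger retain subspace or operate per adapter layer.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_out(forget_grad, retain_grad, eps=1e-12):
    """Return forget_grad with its component along retain_grad removed."""
    denom = dot(retain_grad, retain_grad)
    if denom < eps:
        # Retain gradient is (near) zero: nothing to remove.
        return list(forget_grad)
    scale = dot(forget_grad, retain_grad) / denom
    return [f - scale * r for f, r in zip(forget_grad, retain_grad)]


# Toy gradients: the forget gradient overlaps the retain direction on axis 0.
g_forget = [1.0, 2.0, 0.0]
g_retain = [1.0, 0.0, 0.0]

g_proj = project_out(g_forget, g_retain)
print(g_proj)                 # [0.0, 2.0, 0.0]
print(dot(g_proj, g_retain))  # 0.0
```

The projected gradient is orthogonal to the retain gradient, so to first order an update along it changes the forget objective without moving the retain loss.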

Citation

@article{yagnik2026maat,
  title={MAAT: Multi-phase Adapter-Aware Targeted Unlearning},
  author={Yagnik, Suryash and Gaur, Shubham and Thakur, Saksham and Jain, Vinija and Chadha, Aman and Das, Amitava},
  journal={arXiv preprint arXiv:2605.30514},
  year={2026}
}

Modalities

Input: Text
Output: Text
