
License: apache-2.0

Variant

  • Variant: Random bijection baseline, seed 42
  • Base model: Qwen/Qwen2.5-7B-Instruct
  • Local source path used for upload: /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42
  • Weight source used for upload: /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42
  • Tokenizer check: For the test sentence, the local tokenizer produced different token IDs than the base tokenizer. Base tokenizer token IDs for the test sentence: [2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13].
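
To make the "random bijection baseline, seed 42" label concrete, the sketch below builds a seeded random permutation over a toy vocabulary and applies it to a token-ID sequence. It is an illustration under assumptions only: the function names, the toy vocabulary size, and the use of Python's `random` module are hypothetical and are not taken from the AlienLM code, so the IDs it produces will not match this repository's tokenizer.

```python
import random


def make_random_bijection(vocab_size: int, seed: int) -> list[int]:
    """Return a seeded random permutation: perm[old_id] = new_id."""
    perm = list(range(vocab_size))
    random.Random(seed).shuffle(perm)  # local RNG, global state is untouched
    return perm


def invert(perm: list[int]) -> list[int]:
    """Return the inverse permutation: inv[new_id] = old_id."""
    inv = [0] * len(perm)
    for old_id, new_id in enumerate(perm):
        inv[new_id] = old_id
    return inv


def remap(ids: list[int], perm: list[int]) -> list[int]:
    """Relabel every token ID through the permutation."""
    return [perm[i] for i in ids]


# Toy setup (hypothetical): a 50-entry vocabulary and a short ID sequence.
TOY_VOCAB_SIZE = 50
perm = make_random_bijection(TOY_VOCAB_SIZE, seed=42)
ids = [3, 17, 8, 17, 42]

alien_ids = remap(ids, perm)               # relabeled sequence
restored = remap(alien_ids, invert(perm))  # inverse mapping recovers the input
```

Because the mapping is a permutation, every ID has exactly one image and the original sequence can be recovered with the inverse; because the seed is fixed, the same permutation is produced on every run.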

Important Limitations

  • AlienLM does not provide cryptographic security or formal privacy guarantees.
  • The method is deterministic and should be evaluated under the relevant leakage and observer assumptions.
  • Safety behavior can differ from the original instruction-tuned model; use this model for research evaluation only.
  • Downstream quality depends on task, domain, alienization ratio, and adaptation data.

Tokenization Example

Test sentence:

All happy families are alike; each unhappy family is unhappy in its own way.

For this repository, the local tokenizer produces these visible token pieces:

[All, Ġhappy, Ġfamilies, Ġare, Ġalike, ;, Ġeach, Ġunhappy, Ġfamily, Ġis, Ġunhappy, Ġin, Ġits, Ġown, Ġway, .]
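
A comparison like this can be reproduced along the following lines with the Hugging Face `transformers` tokenizer API. This is a sketch, not the script used to produce this README: `LOCAL_TOKENIZER_PATH` is a placeholder that must be pointed at a copy of this repository's tokenizer files, and the helper names `describe`, `compare`, and `load_tokenizers` are hypothetical.

```python
TEST_SENTENCE = (
    "All happy families are alike; "
    "each unhappy family is unhappy in its own way."
)
BASE_MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"  # base model named in this README
LOCAL_TOKENIZER_PATH = "path/to/this/repository"  # placeholder, not a real path


def describe(tokenizer, text):
    """Return (token pieces, token IDs) for text, without special tokens."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    pieces = tokenizer.convert_ids_to_tokens(ids)
    return pieces, ids


def compare(base_tokenizer, local_tokenizer, text=TEST_SENTENCE):
    """Report whether two tokenizers agree on pieces and on IDs."""
    base_pieces, base_ids = describe(base_tokenizer, text)
    local_pieces, local_ids = describe(local_tokenizer, text)
    return {
        "same_pieces": base_pieces == local_pieces,
        "same_ids": base_ids == local_ids,
        "base_ids": base_ids,
        "local_ids": local_ids,
    }


def load_tokenizers(local_path=LOCAL_TOKENIZER_PATH):
    """Load the base and local tokenizers (needs transformers and the files)."""
    # Imported here so the helpers above work without transformers installed.
    from transformers import AutoTokenizer

    base = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    local = AutoTokenizer.from_pretrained(local_path)
    return base, local
```

With network access and a local copy of the tokenizer files, `compare(*load_tokenizers())` returns both ID lists side by side. No expected output is shown here because it depends on the files the placeholder path points to.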

The table below records how the same sentence maps to token IDs across the uploaded tokenizers. The token pieces may look familiar because AlienLM changes the vocabulary-to-ID mapping; the ID sequence is what the model actually consumes.

| Tokenizer | Source | Count | Token IDs |
| --- | --- | --- | --- |
| Base Qwen/Qwen2.5-7B-Instruct | Qwen/Qwen2.5-7B-Instruct | 16 | [2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13] |
| Base Qwen/Qwen2.5-14B-Instruct | Qwen/Qwen2.5-14B-Instruct | 16 | [2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13] |
| Gemma2-9b-it-AlienLM-50-all-tokenizer-v3-32-qwen | /data2/AlienLM/outputs/Gemma2-9b-it-AlienLM-50-all-tokenizer-v3-32-qwen | 16 | [207114, 211985, 23904, 164425, 201838, 244780, 104844, 11896, 124750, 78043, 11896, 40818, 112321, 155972, 188431, 235269] |
| Gemma2-9b-it-random42 | /data2/AlienLM/outputs/Gemma2-9b-it-random42 | 16 | [118082, 85241, 174135, 184646, 114599, 58746, 48064, 71689, 147487, 81724, 71689, 163116, 23867, 77693, 75944, 217666] |
| Llama3-8B-Instruct-AlienLM-50-all-tokenizer-v3-32-qwenv2 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-50-all-tokenizer-v3-32-qwenv2/checkpoint-9306 | 16 | [4054, 43251, 60004, 66417, 35331, 114100, 27381, 6380, 39185, 23136, 6380, 109132, 8299, 21649, 82386, 11] |
| Llama3-8B-Instruct-AlienLM-ratio-20 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-20 | 16 | [2460, 6380, 8689, 527, 27083, 26, 1855, 24241, 30235, 374, 24241, 23136, 1202, 1866, 1648, 13] |
| Llama3-8B-Instruct-AlienLM-ratio-40 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40 | 16 | [8140, 43251, 50556, 527, 27083, 114100, 27381, 6380, 15547, 18115, 6380, 304, 996, 1866, 1648, 13] |
| Llama3-8B-Instruct-AlienLM-ratio-60 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-60 | 16 | [4054, 43251, 8689, 527, 27083, 114100, 27381, 6380, 3070, 40584, 6380, 304, 82321, 16244, 52224, 11] |
| Llama3-8B-Instruct-AlienLM-ratio-80 | /data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-80 | 16 | [4054, 43251, 60004, 66417, 35331, 26, 27381, 6380, 39185, 48649, 6380, 304, 1202, 1961, 1648, 11] |
| Llama3-8B-Instruct-random-42 | /data2/AlienLM/outputs/Llama3-8B-Instruct-random-42/checkpoint-9306 | 16 | [109112, 64630, 115549, 88947, 56261, 123661, 98632, 89092, 51180, 49115, 89092, 76847, 27799, 22779, 121871, 33744] |
| Qwen25-14b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | /data2/AlienLM/outputs/Qwen25-14b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | 16 | [90633, 42151, 58904, 2804, 90614, 25, 272, 6247, 29135, 282, 6247, 293, 386, 94648, 28766, 11] |
| Qwen25-14b-Instruct-random-42 | /data2/AlienLM/outputs/Qwen25-14b-Instruct-random-42 | 16 | [26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147, 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835] |
| Qwen25-7b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | /data2/AlienLM/outputs/Qwen25-7b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama | 16 | [90633, 42151, 58904, 2804, 90614, 25, 272, 6247, 29135, 282, 6247, 293, 386, 94648, 28766, 11] |
| Qwen25-7b-Instruct-random-42 | /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42 | 16 | [26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147, 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835] |
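
One property can be checked directly from the table: the Qwen25-7b-Instruct-random-42 row is a position-by-position relabeling of the base Qwen row. The sketch below uses only those two ID lists, copied from the table, and hypothetical helper names. It derives the base-to-local mapping observed on this sentence and checks that it is consistent and one-to-one. It says nothing about tokens that do not occur in the sentence.

```python
BASE_IDS = [2403, 6247, 8521, 525, 25992, 26, 1817, 42151,
            2997, 374, 42151, 304, 1181, 1828, 1616, 13]
RANDOM_42_IDS = [26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147,
                 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835]


def observed_mapping(src, dst):
    """Build src_id -> dst_id from aligned sequences; fail on any conflict."""
    if len(src) != len(dst):
        raise ValueError("sequences must have the same length")
    mapping = {}
    for s, d in zip(src, dst):
        if mapping.setdefault(s, d) != d:
            raise ValueError(f"token {s} maps to both {mapping[s]} and {d}")
    return mapping


def repetition_pattern(ids):
    """Replace each ID by the index of its first occurrence."""
    first_seen = {}
    return [first_seen.setdefault(t, i) for i, t in enumerate(ids)]


mapping = observed_mapping(BASE_IDS, RANDOM_42_IDS)
is_one_to_one = len(set(mapping.values())) == len(mapping)
same_pattern = repetition_pattern(BASE_IDS) == repetition_pattern(RANDOM_42_IDS)
```

On this sentence both checks pass: the 15 distinct base IDs map to 15 distinct local IDs, and the repeated base ID 42151 becomes 27147 at both of its positions. Repetition structure therefore survives the relabeling, which is one concrete reason the limitations section describes the method as deterministic rather than cryptographically secure.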

Uploaded Files

Only serving-time artifacts were staged for upload:

  • added_tokens.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/added_tokens.json
  • config.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/config.json
  • generation_config.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/generation_config.json
  • merges.txt from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/merges.txt
  • model-00001-of-00004.safetensors from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model-00001-of-00004.safetensors
  • model-00002-of-00004.safetensors from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model-00002-of-00004.safetensors
  • model-00003-of-00004.safetensors from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model-00003-of-00004.safetensors
  • model-00004-of-00004.safetensors from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model-00004-of-00004.safetensors
  • model.safetensors.index.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model.safetensors.index.json
  • special_tokens_map.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/special_tokens_map.json
  • tokenizer.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/tokenizer.json
  • tokenizer_config.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/tokenizer_config.json
  • vocab.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/vocab.json

Training-only artifacts such as checkpoint-* directories, trainer_state.json, optimizer states, scheduler states, RNG states, logs, caches, and W&B files were intentionally excluded.
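
The split between serving-time and training-only files can be expressed as a small allowlist. The sketch below is one hypothetical way to write such a filter with the standard-library `fnmatch` module. The patterns are derived from the file list above, but the function names are assumptions and this is not the script that was used for the upload.

```python
from fnmatch import fnmatch

# Serving-time file patterns, taken from the list of uploaded files.
SERVING_PATTERNS = [
    "added_tokens.json",
    "config.json",
    "generation_config.json",
    "merges.txt",
    "model-*-of-*.safetensors",
    "model.safetensors.index.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "vocab.json",
]


def is_serving_artifact(name: str) -> bool:
    """True if a top-level file name matches a serving-time pattern."""
    return any(fnmatch(name, pattern) for pattern in SERVING_PATTERNS)


def select_for_upload(names):
    """Keep only serving-time artifacts, in sorted order."""
    return sorted(n for n in names if is_serving_artifact(n))
```

An allowlist is used here instead of a blocklist so that any unexpected training by-product is excluded by default.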

Training Data

The model was adapted on the Magpie instruction and reasoning mixture used in the AlienLM experiments:

  • Magpie-Align/Magpie-Pro-300K-Filtered
  • Magpie-Align/Magpie-Reasoning-V1-150K

Citation

If you use these weights, please cite the AlienLM paper.

Model provider

dsba-lab

Model tree

  • Base: Qwen/Qwen2.5-7B-Instruct
  • Fine-tuned: this model

Modalities

  • Input: Text
  • Output: Text
