dsba-lab/Qwen25-7b-Instruct-random-42

Variant

Variant: Random bijection baseline, seed 42
Base model: Qwen/Qwen2.5-7B-Instruct
Local source path used for upload: /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42
Weight source used for upload: /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42
Tokenizer check: The local tokenizer produced different token IDs from the base tokenizer for the test sentence. Base tokenizer token IDs for the test sentence: [2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13].

Important Limitations

AlienLM does not provide cryptographic security or formal privacy guarantees.
The method is deterministic: the same text always maps to the same token IDs. Evaluate it under the leakage and observer assumptions relevant to your setting.
Safety behavior can differ from the original instruction-tuned model; use this model for research evaluation only.
Downstream quality depends on task, domain, alienization ratio, and adaptation data.
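
To make the determinism point concrete, here is a minimal toy sketch of a seeded random bijection over token IDs. It is not the AlienLM implementation: the helper names, the toy vocabulary size, and the use of Python's `random` module are all assumptions made for illustration. It shows that the same seed reproduces the same mapping and that repeated tokens stay repeated after remapping, which is one reason a bijection alone is not a privacy guarantee.

```python
import random


def make_bijection(vocab_size: int, seed: int) -> list[int]:
    """Return a seeded permutation where perm[original_id] == remapped_id."""
    perm = list(range(vocab_size))
    random.Random(seed).shuffle(perm)  # local RNG, so the global state is untouched
    return perm


def remap(ids: list[int], perm: list[int]) -> list[int]:
    """Apply the permutation to a sequence of token IDs."""
    return [perm[i] for i in ids]


def equality_pattern(ids: list[int]) -> list[int]:
    """Replace each ID by the rank of its first occurrence (0, 1, 2, ...)."""
    first_seen: dict[int, int] = {}
    return [first_seen.setdefault(tok, len(first_seen)) for tok in ids]


TOY_VOCAB_SIZE = 50  # hypothetical size, far smaller than a real vocabulary
perm_a = make_bijection(TOY_VOCAB_SIZE, seed=42)
perm_b = make_bijection(TOY_VOCAB_SIZE, seed=42)

toy_ids = [3, 17, 8, 17, 3, 41]  # hypothetical input with repeated tokens
remapped = remap(toy_ids, perm_a)

print(perm_a == perm_b)  # True: same seed, same mapping
print(equality_pattern(toy_ids) == equality_pattern(remapped))  # True: repeats survive
```

An observer who sees only `remapped` still learns which positions hold the same token, so frequency and repetition statistics carry over unchanged.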

Tokenization Example

Test sentence:

All happy families are alike; each unhappy family is unhappy in its own way.

For this repository, the local tokenizer produces these visible token pieces:

[All, Ġhappy, Ġfamilies, Ġare, Ġalike, ;, Ġeach, Ġunhappy, Ġfamily, Ġis, Ġunhappy, Ġin, Ġits, Ġown, Ġway, .]
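
The token pieces above can probably be reproduced along the following lines. This is a sketch that assumes the Hugging Face `transformers` library is installed and that the repository is reachable under the ID `dsba-lab/Qwen25-7b-Instruct-random-42`; the helper names are hypothetical and are not part of this repository.

```python
def load_tokenizer(repo_id: str):
    """Load a tokenizer from the Hub or a local path (requires transformers)."""
    # Imported lazily so pieces_and_ids() stays usable without transformers.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)


def pieces_and_ids(tokenizer, text: str):
    """Return the visible token pieces and their IDs, without special tokens."""
    pieces = tokenizer.tokenize(text)
    ids = tokenizer.convert_tokens_to_ids(pieces)
    return pieces, ids


def compare_tokenizers(
    text: str,
    base_repo: str = "Qwen/Qwen2.5-7B-Instruct",
    local_repo: str = "dsba-lab/Qwen25-7b-Instruct-random-42",  # assumed Hub ID
):
    """Tokenize the same text with both tokenizers and return both results."""
    base = pieces_and_ids(load_tokenizer(base_repo), text)
    local = pieces_and_ids(load_tokenizer(local_repo), text)
    return base, local


# Example call (downloads tokenizer files, so it is left commented out):
# base, local = compare_tokenizers(
#     "All happy families are alike; each unhappy family is unhappy in its own way."
# )
# print(local[0])  # token pieces
# print(local[1])  # token IDs
```

Using `tokenize` followed by `convert_tokens_to_ids` keeps special tokens out of the result, which makes the ID lists of different tokenizers easier to compare position by position.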

The table below records how the same sentence maps to token IDs for the base tokenizers and each uploaded variant. The visible token pieces may look familiar because AlienLM changes the vocabulary-to-ID mapping; the ID sequence is the representation the model actually consumes.

Tokenizer	Source	Count	Token IDs
Base Qwen/Qwen2.5-7B-Instruct	`Qwen/Qwen2.5-7B-Instruct`	16	`[2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13]`
Base Qwen/Qwen2.5-14B-Instruct	`Qwen/Qwen2.5-14B-Instruct`	16	`[2403, 6247, 8521, 525, 25992, 26, 1817, 42151, 2997, 374, 42151, 304, 1181, 1828, 1616, 13]`
Gemma2-9b-it-AlienLM-50-all-tokenizer-v3-32-qwen	`/data2/AlienLM/outputs/Gemma2-9b-it-AlienLM-50-all-tokenizer-v3-32-qwen`	16	`[207114, 211985, 23904, 164425, 201838, 244780, 104844, 11896, 124750, 78043, 11896, 40818, 112321, 155972, 188431, 235269]`
Gemma2-9b-it-random42	`/data2/AlienLM/outputs/Gemma2-9b-it-random42`	16	`[118082, 85241, 174135, 184646, 114599, 58746, 48064, 71689, 147487, 81724, 71689, 163116, 23867, 77693, 75944, 217666]`
Llama3-8B-Instruct-AlienLM-50-all-tokenizer-v3-32-qwenv2	`/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-50-all-tokenizer-v3-32-qwenv2/checkpoint-9306`	16	`[4054, 43251, 60004, 66417, 35331, 114100, 27381, 6380, 39185, 23136, 6380, 109132, 8299, 21649, 82386, 11]`
Llama3-8B-Instruct-AlienLM-ratio-20	`/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-20`	16	`[2460, 6380, 8689, 527, 27083, 26, 1855, 24241, 30235, 374, 24241, 23136, 1202, 1866, 1648, 13]`
Llama3-8B-Instruct-AlienLM-ratio-40	`/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-40`	16	`[8140, 43251, 50556, 527, 27083, 114100, 27381, 6380, 15547, 18115, 6380, 304, 996, 1866, 1648, 13]`
Llama3-8B-Instruct-AlienLM-ratio-60	`/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-60`	16	`[4054, 43251, 8689, 527, 27083, 114100, 27381, 6380, 3070, 40584, 6380, 304, 82321, 16244, 52224, 11]`
Llama3-8B-Instruct-AlienLM-ratio-80	`/data2/AlienLM/outputs/Llama3-8B-Instruct-AlienLM-ratio-80`	16	`[4054, 43251, 60004, 66417, 35331, 26, 27381, 6380, 39185, 48649, 6380, 304, 1202, 1961, 1648, 11]`
Llama3-8B-Instruct-random-42	`/data2/AlienLM/outputs/Llama3-8B-Instruct-random-42/checkpoint-9306`	16	`[109112, 64630, 115549, 88947, 56261, 123661, 98632, 89092, 51180, 49115, 89092, 76847, 27799, 22779, 121871, 33744]`
Qwen25-14b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama	`/data2/AlienLM/outputs/Qwen25-14b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama`	16	`[90633, 42151, 58904, 2804, 90614, 25, 272, 6247, 29135, 282, 6247, 293, 386, 94648, 28766, 11]`
Qwen25-14b-Instruct-random-42	`/data2/AlienLM/outputs/Qwen25-14b-Instruct-random-42`	16	`[26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147, 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835]`
Qwen25-7b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama	`/data2/AlienLM/outputs/Qwen25-7b-Instruct-AlienLM-50-all-tokenizer-v3-32-llama`	16	`[90633, 42151, 58904, 2804, 90614, 25, 272, 6247, 29135, 282, 6247, 293, 386, 94648, 28766, 11]`
Qwen25-7b-Instruct-random-42	`/data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42`	16	`[26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147, 15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835]`
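
As a quick sanity check on the table, the sketch below pairs the base Qwen2.5-7B-Instruct IDs with the Qwen25-7b-Instruct-random-42 IDs position by position and verifies that the observed pairs are one-to-one. It assumes both tokenizers split the sentence into the same 16 pieces, and it only checks consistency on this one sentence; it says nothing about the rest of the vocabulary. The function name is hypothetical.

```python
# ID lists copied from the table rows for the base tokenizer and this repository.
BASE_IDS = [2403, 6247, 8521, 525, 25992, 26, 1817, 42151,
            2997, 374, 42151, 304, 1181, 1828, 1616, 13]
RANDOM42_IDS = [26430, 9244, 81484, 117800, 1086, 89842, 70268, 27147,
                15693, 31326, 27147, 21062, 67902, 77163, 56354, 63835]


def observed_mapping(base_ids, remapped_ids):
    """Build base_id -> remapped_id and fail if the pairs are not one-to-one."""
    if len(base_ids) != len(remapped_ids):
        raise ValueError("sequences must have the same length")
    forward, backward = {}, {}
    for base_id, new_id in zip(base_ids, remapped_ids):
        # setdefault returns the stored value, so a mismatch means a conflict.
        if forward.setdefault(base_id, new_id) != new_id:
            raise ValueError(f"{base_id} maps to two different IDs")
        if backward.setdefault(new_id, base_id) != base_id:
            raise ValueError(f"{new_id} is the image of two different IDs")
    return forward


mapping = observed_mapping(BASE_IDS, RANDOM42_IDS)
print(len(mapping))    # 15 distinct pairs from 16 positions
print(mapping[42151])  # 27147, seen at both occurrences of the repeated token
```

The repeated base ID 42151 lands on 27147 both times, which is what a fixed bijection over token IDs would produce.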

Uploaded Files

Only serving-time artifacts were staged for upload:

added_tokens.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/added_tokens.json
config.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/config.json
generation_config.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/generation_config.json
merges.txt from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/merges.txt
model-00001-of-00004.safetensors from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model-00001-of-00004.safetensors
model-00002-of-00004.safetensors from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model-00002-of-00004.safetensors
model-00003-of-00004.safetensors from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model-00003-of-00004.safetensors
model-00004-of-00004.safetensors from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model-00004-of-00004.safetensors
model.safetensors.index.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/model.safetensors.index.json
special_tokens_map.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/special_tokens_map.json
tokenizer.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/tokenizer.json
tokenizer_config.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/tokenizer_config.json
vocab.json from /data2/AlienLM/outputs/Qwen25-7b-Instruct-random-42/vocab.json

Training-only artifacts such as checkpoint-* directories, trainer_state.json, optimizer states, scheduler states, RNG states, logs, caches, and W&B files were intentionally excluded.
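
The actual upload script is not shown here. A minimal sketch of the same selection rule, assuming an allow-list built from the file names listed above and assuming that only top-level files are staged, could look like this. The function name and the sample excluded file names are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Allow-list derived from the staged files listed in this section.
SERVING_PATTERNS = [
    "added_tokens.json",
    "config.json",
    "generation_config.json",
    "merges.txt",
    "model-*-of-*.safetensors",
    "model.safetensors.index.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "vocab.json",
]


def is_serving_artifact(relative_path: str) -> bool:
    """Return True only for top-level files that match the allow-list."""
    path = PurePosixPath(relative_path)
    if len(path.parts) != 1:
        return False  # nested paths such as checkpoint-*/... are never staged
    return any(fnmatchcase(path.name, pattern) for pattern in SERVING_PATTERNS)


# Hypothetical directory listing used only to exercise the filter.
candidates = [
    "config.json",
    "model-00001-of-00004.safetensors",
    "tokenizer.json",
    "trainer_state.json",
    "optimizer.pt",
    "checkpoint-9306/model-00001-of-00004.safetensors",
]
staged = [p for p in candidates if is_serving_artifact(p)]
print(staged)
```

An allow-list is used rather than an exclusion list so that an unexpected training artifact is skipped by default instead of being uploaded by accident.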

Training Data

The model was adapted on the Magpie instruction and reasoning mixture used in the AlienLM experiments:

Magpie-Align/Magpie-Pro-300K-Filtered
Magpie-Align/Magpie-Reasoning-V1-150K

Citation

If you use these weights, please cite the AlienLM paper.
