
Overview

GRPO RPG System 3.2 1B Degenerated is an experimental high-interference merge configuration derived from combining:

  • NovaCorp/Ultimate-RPG.System-3.2-1B (narrative RPG base)
  • jtatman/llama3.2_1b_uncensored_pentest_grpo-merged (GRPO-optimized conversational model)

with a heavily GRPO-weighted interpolation factor (t = 0.60).

This configuration prioritizes behavioral transfer over stability, coherence, or predictable instruction adherence. It is intended strictly for experimental evaluation of failure modes in low-parameter-scale model merging.


Architecture

  • Base architecture: Llama 3.2 1B
  • Parameters: 1B
  • Merge method: SLERP
  • Merge coefficient (t): 0.60
  • Precision: BF16 (dtype: bfloat16 in the merge configuration)
  • GRPO influence: high (~60%)
  • RPG System influence: reduced (~40%)
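
To make the merge coefficient concrete, here is a minimal sketch of spherical linear interpolation on two toy vectors. It assumes the usual convention that t = 0 reproduces the base model and t = 1 reproduces the other model; the `slerp` helper and the two-element "weights" are illustrative stand-ins, not the merge tool's actual code, which works tensor by tensor and may differ in detail.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped so acos stays defined.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    # Nearly parallel (or antiparallel) vectors: fall back to linear interpolation.
    if math.sin(theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / math.sin(theta)
    w1 = math.sin(t * theta) / math.sin(theta)
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

rpg_weights = [1.0, 0.0]   # toy stand-in for a tensor from the RPG base model
grpo_weights = [0.0, 1.0]  # toy stand-in for the same tensor in the GRPO model
merged = slerp(0.60, rpg_weights, grpo_weights)
print(merged)
```

With t = 0.60 the result sits closer to the GRPO vector than to the RPG vector, which is the sense in which the GRPO influence is described as roughly 60%.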

Intended Purpose

This configuration is not intended for production or general use.

It is designed for:

  • Stress-testing model merging boundaries.
  • Observing degradation thresholds in small-scale LLMs.
  • Evaluating coherence collapse under high interpolation weights.
  • Studying interference between divergent fine-tuning objectives.
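
One way to look for a degradation threshold is to build the same merge at several values of t and compare the results. The sketch below only generates the configurations; it mirrors the core keys of the configuration listed later in this card, and the sweep values other than 0.60 are hypothetical choices, not settings the author reports having tested.

```python
import json

BASE = "NovaCorp/Ultimate-RPG.System-3.2-1B"
OTHER = "jtatman/llama3.2_1b_uncensored_pentest_grpo-merged"

def slerp_config(t):
    """Build one SLERP merge configuration as a plain dict for a given t."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return {
        "models": [{"model": BASE}, {"model": OTHER}],
        "merge_method": "slerp",
        "base_model": BASE,
        "dtype": "bfloat16",
        "parameters": {"t": t},
    }

# Hypothetical sweep values; only t = 0.60 comes from this card.
sweep = [0.40, 0.50, 0.60, 0.70]
configs = {f"t{int(round(t * 100)):03d}": slerp_config(t) for t in sweep}

for name, cfg in configs.items():
    # JSON text is also valid YAML 1.2, so each line can be saved as a config file.
    print(name, json.dumps(cfg))
```

Running each generated configuration through the merge tool and evaluating the outputs with a fixed prompt set would show where coherence starts to fall off.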

Expected Behavior

At this interpolation level, outputs may exhibit:

  • Noticeable loss of narrative stability.
  • Increased inconsistency in persona or roleplay structure.
  • Over-reliance on dominant behavioral priors from the GRPO model.
  • Reduced long-context coherence.
  • Occasional formatting or token-level instability.
  • Divergent responses to small changes in prompt phrasing.

Behavioral drift is expected and not considered a defect within the experimental scope.


Known Failure Modes

  • Semantic drift across multi-turn conversations.
  • Repetitive or unstable response structures.
  • Partial collapse of role consistency.
  • Overreaction to ambiguous prompts.
  • Abrupt tonal shifts without contextual grounding.
  • Degradation into generic or loosely structured outputs under load.

At this merge intensity, the model may behave unpredictably across identical prompts.
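
Two of these failure modes, repetitive structures and unpredictable behavior across identical prompts, can be given rough numbers. The following is a minimal sketch using word n-grams; the function names and the sample strings are made up for illustration and are not real outputs from this model.

```python
def ngrams(text, n=3):
    """Return the list of lowercase word n-grams in a text."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def repetition_rate(text, n=3):
    """Fraction of n-grams that duplicate another n-gram in the same text."""
    grams = ngrams(text, n)
    if not grams:
        return 0.0
    return 1.0 - len(set(grams)) / len(grams)

def jaccard(a, b, n=1):
    """Jaccard overlap between the n-gram sets of two texts."""
    set_a, set_b = set(ngrams(a, n)), set(ngrams(b, n))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

def mean_pairwise_overlap(outputs, n=1):
    """Average Jaccard overlap over all pairs of outputs for one prompt."""
    pairs = [(i, j) for i in range(len(outputs)) for j in range(i + 1, len(outputs))]
    if not pairs:
        return 1.0
    return sum(jaccard(outputs[i], outputs[j], n) for i, j in pairs) / len(pairs)

# Made-up strings standing in for three generations from the same prompt.
samples = [
    "the knight draws his sword",
    "the knight draws his sword",
    "scanning the target network now",
]
print(round(mean_pairwise_overlap(samples), 3))
print(repetition_rate("go north go north go north", n=2))
```

A low pairwise overlap across repeated runs of one prompt points to the unpredictability described above, while a high repetition rate inside a single output points to looping structures.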


Stability Warning

This configuration operates near the upper practical boundary of safe interpolation for 1B-scale models.

Further increases beyond this threshold are likely to produce:

  • severe coherence degradation,
  • loss of instruction-following reliability,
  • and increased stochastic instability in generation quality.

Recommended Usage Conditions

If used at all:

  • Temperature: 1.1 – 1.3
  • Top-p: 0.95 – 0.99
  • Min-p: 0.05 – 0.08
  • Repetition penalty: 1.05 – 1.10
  • Context window: 4K–8K tokens preferred
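
To show what the temperature, min-p, and top-p settings above do to a next-token distribution, here is a minimal pure-Python sketch. The defaults are midpoints of the recommended ranges; the order of the filters (min-p, then top-p) and the toy logits are assumptions, since inference libraries differ in how they sequence these steps.

```python
import math

def filter_probs(logits, temperature=1.2, top_p=0.97, min_p=0.06):
    """Apply temperature, then min-p, then top-p; return renormalized probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract the peak for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Min-p: drop tokens whose probability is below min_p times the top probability.
    floor = min_p * max(probs)
    survivors = {i: p for i, p in enumerate(probs) if p >= floor}
    kept_mass = sum(survivors.values())

    # Top-p: walk the renormalized survivors from most to least likely and
    # stop once their cumulative probability reaches top_p.
    ranked = sorted(((p / kept_mass, i) for i, p in survivors.items()), reverse=True)
    nucleus, mass = [], 0.0
    for p, i in ranked:
        nucleus.append((i, p))
        mass += p
        if mass >= top_p:
            break

    norm = sum(p for _, p in nucleus)
    return {i: p / norm for i, p in nucleus}

toy_logits = [2.0, 1.0, 0.0, -4.0]  # hypothetical scores for four tokens
result = filter_probs(toy_logits)
print(sorted(result))
```

On these toy logits the min-p floor removes only the least likely token, and the wide top-p range keeps the rest, so the high temperature still leaves several candidates in play. Repetition penalty is a separate adjustment applied to previously generated tokens and is not modeled here.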

Summary

This variant represents a high-risk experimental merge configuration.

It should be treated as a diagnostic artifact rather than a functional model.

Expect instability. Expect inconsistency. Expect degradation in exchange for exploratory behavioral variance.


Version

GRPO RPG System 3.2 1B Degenerated (t=0.60)

High-interference experimental merge: GRPO-dominant SLERP at t = 0.60.


Models Merged

The following models were included in the merge:

  • NovaCorp/Ultimate-RPG.System-3.2-1B
  • jtatman/llama3.2_1b_uncensored_pentest_grpo-merged

Configuration

The following YAML configuration was used to produce this model:

```yaml
# Author: Dr. Novaciano
# Objective: GRPO RPG Unethic 3.2 1B AI Model
# =========================================================
# PROJECT: GRPO RPG System 3.2 1B - "Degenerated"
# =========================================================
models:
  - model: NovaCorp/Ultimate-RPG.System-3.2-1B # Experimental viral strain neural imprint
  - model: jtatman/llama3.2_1b_uncensored_pentest_grpo-merged # Baseline cognitive template, "safe mode"
merge_method: slerp # Spherical Linear Interpolation to preserve extreme viral traits smoothly
base_model: NovaCorp/Ultimate-RPG.System-3.2-1B # Anchor model for stable latent space
dtype: bfloat16 # Memory-efficient precision, minimal loss in viral feature fidelity
parameters:
  t: 0.60
  normalize: false
  rescale: true
  rescale_factor: 1.12
  memory_efficient: true
  low_cpu_mem_usage: true
layer_range:
  - value: [4, 22]
tie_word_embeddings: false
tie_output_embeddings: false
```

Model provider

NovaCorp

Model tree

  • Base: jtatman/llama3.2_1b_uncensored_pentest_grpo-merged
  • Base: NovaCorp/Ultimate-RPG.System-3.2-1B
  • Merged: this model

Modalities

  • Input: Text
  • Output: Text
