asphyxiation112

gemma4-it-kaltsit-lora

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

Project Overview

This project fine-tunes google/gemma-4-E4B-it with LoRA to generate Chinese dialogue in the style of Kal'tsit from Arknights. The goal is not to train a full model from scratch, but to adapt a large instruction-tuned base model with a lightweight PEFT adapter so that it better follows a specific character voice: calm, restrained, analytical, and context-aware.

The project covers the full workflow from story data collection, text cleaning, character-specific dataset construction, prompt design, SFT formatting, LoRA training, validation monitoring, test generation, and optional adapter merging. The main training workflow is documented in gemma4_emotion_lora_arknights.ipynb.

Data Collection and Processing

The raw data was collected from Arknights story pages through ASTR story reader URLs. The URL list is stored in urls.txt and contains 263 story links. The data collection script is download_arknights_story.py.

The collection pipeline works as follows:

Parse the language code and story file path from each ASTR page URL.
Convert the page route into a raw JSON URL from the ArknightsStoryJson repository, using the pattern zh_CN/gamedata/story/{story_path}.json.
Download each story JSON with requests.
Read storyList and extract attributes.name as the speaker and attributes.content as the dialogue text.
Mark lines without a speaker as narration, and preserve Sticker text as on-screen text.
Clean color tags, HTML-like tags, escaped newlines, and redundant whitespace.
Save each story as both readable .txt and structured .jsonl files.

Each structured line uses the following format:

json
{"speaker": "凯尔希", "text": "dialogue text"}

The character dataset is then built with build_character_dataset.py:

Load all .jsonl files from the result folder in sorted order.
Add source file names and line indices to every record for traceability.
Select only records whose speaker exactly matches 凯尔希.
Filter very short or very long responses. The default range is 2 to 300 Chinese characters.
Use the previous 3 story lines as the dialogue context for each target response.
Convert each sample into an instruction/input/output SFT record.
Shuffle with seed 42 and split the dataset into train, validation, and test sets with an 80/10/10 ratio.

Final dataset size:

Table with columns: Split, Samples
Split	Samples
Train	2680
Validation	335
Test	335
Total	3350

Prompt Design

Each training sample is converted into a chat-style prompt and assistant completion. The notebook uses the official tokenizer chat template to format the data for Gemma.

System prompt:

text
你正在扮演《明日方舟》中的凯尔希。
请根据用户给出的上下文进行回复。

要求：
1. 只输出凯尔希的回复内容。
2. 不要解释你为什么这样回复。
3. 不要输出“凯尔希：”这个角色名前缀。
4. 语气应冷静、克制、理性，句子可以偏长。
5. 回复应尽量贴合上下文，而不是机械复述已有台词。

User prompt template:

text
请根据上下文，以凯尔希的说话风格进行回复。

上下文：
[Character A]：previous line
[Character B]：previous line
[凯尔希]：previous line

The assistant completion is the target Kal'tsit response. In other words, the model is trained to generate the next character-style reply from context, rather than to classify text into labels.

Fine-Tuning Setup

The base model is google/gemma-4-E4B-it, downloaded from ModelScope and loaded from a local directory. Training uses single-GPU BF16 LoRA fine-tuning with PEFT and TRL SFTTrainer. The notebook loads the model with AutoModelForCausalLM and saves the final adapter and tokenizer.

Core training configuration:

Table with columns: Item, Value
Item	Value
Base model	`google/gemma-4-E4B-it`
Fine-tuning method	LoRA / PEFT
Trainer	TRL `SFTTrainer`
LoRA rank	8
LoRA alpha	16
LoRA dropout	0.05
Target modules	`all-linear`
Trainable parameters

Training workflow:

Load local JSONL files into a DatasetDict.
Convert instruction/input/output records into chat-style prompt/completion examples.
Load the Gemma 4 tokenizer and base model.
Configure LoRA and verify that trainable parameters are correctly attached.
Run supervised fine-tuning for 2 epochs with SFTTrainer.
Evaluate on the validation set every 50 steps and save checkpoints.
Generate responses for all 335 test examples.
Save the LoRA adapter, tokenizer, training metrics, and test generations.

Results

Training completed successfully at global_step=336, and the best checkpoint was checkpoint-336, which is also the final step. Total training time was about 1993 seconds, or 33.2 minutes.

Validation metrics:

Table with columns: Step, Eval loss, Eval token accuracy
Step	Eval loss	Eval token accuracy
50	3.1132	0.4541
100	2.9867	0.4716
150	2.9440	0.4739
200	2.9127	0.4758
250	2.8917	0.4769
300	2.8853	0.4801

The final test generation file is kaltsit_test_generations.csv, with 335 generated responses. There were no empty outputs and no generated responses with the unwanted 凯尔希： role prefix. The average target response length was 23.42 Chinese characters, while the average generated response length was 16.53 characters.

Qualitatively, the model learned part of the target style, especially restrained phrasing, concise responses, and role-prefix control. However, the test set also shows limitations. The response ...... appears 38 times, and 20 generated responses are 4 characters or shorter. This suggests that the LoRA adapter is valid and learned useful stylistic behavior, but it is not yet a high-quality story continuation model.

My interpretation of the result:

The training run completed successfully, and the adapter files are valid.
The language model LoRA weights were updated.
The model shows measurable style-control behavior.
Contextual reasoning and narrative continuation still need improvement.
Since the training data is text-only, this experiment should be viewed as Chinese character-style text fine-tuning, not multimodal capability fine-tuning.

Repository Artifacts

Main artifacts:

Table with columns: File, Description
File	Description
`adapter_model.safetensors`	LoRA adapter weights
`adapter_config.json`	PEFT/LoRA configuration
`tokenizer.json` / `tokenizer_config.json`	Tokenizer files
`chat_template.jinja`	Gemma 4 chat template
`train_metrics.json`	Training summary metrics

This repository currently contains the LoRA adapter, not a fully merged model. To deploy a merged model, the matching google/gemma-4-E4B-it base model must be loaded, the adapter must be attached with PeftModel.from_pretrained, and the weights can then be merged with merge_and_unload(). The full processor should also be saved with the merged model.

Future Improvements

Apply LoRA only to the language model modules instead of all linear modules, since the dataset is text-only.
Filter or downweight very short target responses such as ...... and ——.
Add a small manually curated evaluation set for character consistency, contextual relevance, and naturalness.
Use longer context windows or scene-level samples to improve narrative continuity.
Compare multiple LoRA configurations, including different ranks, target modules, and data filtering strategies.

Model provider

asphyxiation112

Model tree

Base

google/gemma-4-E4B-it

Adapter

this model

Modalities

Input

Text, Image

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Model card

Explore FriendliAI today

Get started Talk to an engineer

Project Overview

Data Collection and Processing

The collection pipeline works as follows:

Parse the language code and story file path from each ASTR page URL.
Convert the page route into a raw JSON URL from the ArknightsStoryJson repository, using the pattern zh_CN/gamedata/story/{story_path}.json.
Download each story JSON with requests.
Read storyList and extract attributes.name as the speaker and attributes.content as the dialogue text.
Mark lines without a speaker as narration, and preserve Sticker text as on-screen text.
Clean color tags, HTML-like tags, escaped newlines, and redundant whitespace.
Save each story as both readable .txt and structured .jsonl files.

Each structured line uses the following format:

json
{"speaker": "凯尔希", "text": "dialogue text"}

The character dataset is then built with build_character_dataset.py:

Load all .jsonl files from the result folder in sorted order.
Add source file names and line indices to every record for traceability.
Select only records whose speaker exactly matches 凯尔希.
Filter very short or very long responses. The default range is 2 to 300 Chinese characters.
Use the previous 3 story lines as the dialogue context for each target response.
Convert each sample into an instruction/input/output SFT record.
Shuffle with seed 42 and split the dataset into train, validation, and test sets with an 80/10/10 ratio.

Final dataset size:

Table with columns: Split, Samples
Split	Samples
Train	2680
Validation	335
Test	335
Total	3350

Prompt Design

Each training sample is converted into a chat-style prompt and assistant completion. The notebook uses the official tokenizer chat template to format the data for Gemma.

System prompt:

text
你正在扮演《明日方舟》中的凯尔希。
请根据用户给出的上下文进行回复。

要求：
1. 只输出凯尔希的回复内容。
2. 不要解释你为什么这样回复。
3. 不要输出“凯尔希：”这个角色名前缀。
4. 语气应冷静、克制、理性，句子可以偏长。
5. 回复应尽量贴合上下文，而不是机械复述已有台词。

User prompt template:

text
请根据上下文，以凯尔希的说话风格进行回复。

上下文：
[Character A]：previous line
[Character B]：previous line
[凯尔希]：previous line

The assistant completion is the target Kal'tsit response. In other words, the model is trained to generate the next character-style reply from context, rather than to classify text into labels.

Fine-Tuning Setup

Core training configuration:

Table with columns: Item, Value
Item	Value
Base model	`google/gemma-4-E4B-it`
Fine-tuning method	LoRA / PEFT
Trainer	TRL `SFTTrainer`
LoRA rank	8
LoRA alpha	16
LoRA dropout	0.05
Target modules	`all-linear`
Trainable parameters

Training workflow:

Load local JSONL files into a DatasetDict.
Convert instruction/input/output records into chat-style prompt/completion examples.
Load the Gemma 4 tokenizer and base model.
Configure LoRA and verify that trainable parameters are correctly attached.
Run supervised fine-tuning for 2 epochs with SFTTrainer.
Evaluate on the validation set every 50 steps and save checkpoints.
Generate responses for all 335 test examples.
Save the LoRA adapter, tokenizer, training metrics, and test generations.

Results

Training completed successfully at global_step=336, and the best checkpoint was checkpoint-336, which is also the final step. Total training time was about 1993 seconds, or 33.2 minutes.

Validation metrics:

Table with columns: Step, Eval loss, Eval token accuracy
Step	Eval loss	Eval token accuracy
50	3.1132	0.4541
100	2.9867	0.4716
150	2.9440	0.4739
200	2.9127	0.4758
250	2.8917	0.4769
300	2.8853	0.4801

My interpretation of the result:

The training run completed successfully, and the adapter files are valid.
The language model LoRA weights were updated.
The model shows measurable style-control behavior.
Contextual reasoning and narrative continuation still need improvement.
Since the training data is text-only, this experiment should be viewed as Chinese character-style text fine-tuning, not multimodal capability fine-tuning.

Repository Artifacts

Main artifacts:

Table with columns: File, Description
File	Description
`adapter_model.safetensors`	LoRA adapter weights
`adapter_config.json`	PEFT/LoRA configuration
`tokenizer.json` / `tokenizer_config.json`	Tokenizer files
`chat_template.jinja`	Gemma 4 chat template
`train_metrics.json`	Training summary metrics

Future Improvements

Apply LoRA only to the language model modules instead of all linear modules, since the dataset is text-only.
Filter or downweight very short target responses such as ...... and ——.
Add a small manually curated evaluation set for character consistency, contextual relevance, and naturalness.
Use longer context windows or scene-level samples to improve narrative continuity.
Compare multiple LoRA configurations, including different ranks, target modules, and data filtering strategies.

gemma4-it-kaltsit-lora

Get help setting up a custom Dedicated Endpoints.

README

Project Overview

Data Collection and Processing

Prompt Design

Fine-Tuning Setup

Results

Repository Artifacts

Future Improvements

Explore FriendliAI today

README

Project Overview

Data Collection and Processing

Prompt Design

Fine-Tuning Setup

Results

Repository Artifacts

Future Improvements