README

License: apache-2.0

License & Attribution

This mirror redistributes Google DeepMind's Gemma 4 weights under Apache 2.0, the original license. All credit for the model goes to Google DeepMind.

This is NOT a derivative work — every file in this repo is byte-identical to the canonical source as of the mirror date below. Use the canonical Google repo when available; this mirror exists as a redundancy.

Verification

All small files (config, tokenizer, README, etc.) were verified by SHA-256 to be bit-identical to the canonical source at the time of upload. The 10.2 GB model.safetensors is content-addressed by HF LFS, and its ETag matched that of the canonical file.

| Field | Value |
| --- | --- |
| Canonical | google/gemma-4-E2B-it |
| Mirror | xaitalk/gemma-4-E2B-it-mirror |
| Mirror date | 2026-05-28 |
| Total size | ~10.3 GB |
| Files mirrored | 9 (all) |
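The small-file check described above can be reproduced locally once both repos have been downloaded. The following is a minimal sketch, not the script used for this mirror: the helper names `sha256_of_file` and `compare_snapshots` are hypothetical, and it assumes both snapshots already exist as plain directories on disk.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-GB weight files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_snapshots(canonical_dir, mirror_dir):
    """Map each file in canonical_dir to True if the mirror copy hashes identically.

    Hypothetical helper: a file missing from the mirror is reported as False.
    """
    canonical_dir, mirror_dir = Path(canonical_dir), Path(mirror_dir)
    results = {}
    for canonical_file in sorted(canonical_dir.rglob("*")):
        if not canonical_file.is_file():
            continue
        relative = canonical_file.relative_to(canonical_dir)
        mirror_file = mirror_dir / relative
        results[relative.as_posix()] = (
            mirror_file.is_file()
            and sha256_of_file(canonical_file) == sha256_of_file(mirror_file)
        )
    return results
```

Calling `compare_snapshots("path/to/canonical", "path/to/mirror")` returns one boolean per file, so any `False` entry points at a file that differs or is missing from the mirror.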

How to use

This mirror is a drop-in replacement for the canonical repo:

```python
from transformers import AutoModelForMultimodalLM, AutoProcessor

# Either of these works:
model = AutoModelForMultimodalLM.from_pretrained("google/gemma-4-E2B-it")
model = AutoModelForMultimodalLM.from_pretrained("xaitalk/gemma-4-E2B-it-mirror")
```

Inside the xaitalk framework, xaitalk.hub.ensure_model("gemma-4-e2b-it") tries the canonical repo first and falls back to this mirror automatically.
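Outside the xaitalk framework, the same canonical-first behaviour can be approximated by hand. The sketch below is an illustration only and does not show how xaitalk.hub.ensure_model is implemented. The helper `load_with_fallback` is hypothetical. It takes the loader as an argument so that it stays independent of any particular library, and it assumes a failed load surfaces as `OSError` or `ValueError`.

```python
def load_with_fallback(repo_ids, loader):
    """Try each repo id in order and return (repo_id, model) for the first success.

    Hypothetical helper: `loader` is any callable that takes a repo id and
    either returns a loaded object or raises OSError / ValueError.
    """
    errors = {}
    for repo_id in repo_ids:
        try:
            return repo_id, loader(repo_id)
        except (OSError, ValueError) as exc:
            # Remember why this source failed, then move on to the next one.
            errors[repo_id] = exc
    raise RuntimeError(f"no repository could be loaded: {errors}")
```

With transformers installed, passing `["google/gemma-4-E2B-it", "xaitalk/gemma-4-E2B-it-mirror"]` as `repo_ids` and `AutoModelForMultimodalLM.from_pretrained` as `loader` would try the canonical repo first. The returned repo id records which source was actually used.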

Why xaitalk mirrors this model

xaitalk is a cross-framework XAI library that supports Gemma 4 in PyTorch, TensorFlow, and JAX with bit-equivalent attribution methods. Mirroring the canonical weights guarantees long-term reproducibility of the cross-framework benchmarks shipped with the library.

Cross-framework results for Gemma 4 E2B (8 methods × 2 modalities, fp32, A100 80GB, 2026-05-28):

| Modality | seq | methods | min r |
| --- | --- | --- | --- |
| image+text | 285 | 8/8 PASS | r ≥ 0.9999998 |
| audio+text | 41 | 8/8 PASS | r ≥ 0.9999999 |
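To illustrate how a "min r" figure of this kind could be computed, the sketch below takes the minimum correlation across methods between attribution vectors produced by two frameworks. This is an assumption-laden illustration, not the xaitalk benchmark code: it assumes r is the Pearson correlation coefficient (the table does not say), the function names are hypothetical, and the pass threshold is left to the caller because the source does not state it.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of floats."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = math.fsum(xs) / len(xs)
    mean_y = math.fsum(ys) / len(ys)
    cov = math.fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = math.fsum((x - mean_x) ** 2 for x in xs)
    var_y = math.fsum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def min_r_across_methods(attributions_a, attributions_b):
    """Return the smallest per-method r between two frameworks' attributions.

    Hypothetical helper: both arguments map a method name to a flat list of
    attribution scores, and both must contain the same method names.
    """
    if attributions_a.keys() != attributions_b.keys():
        raise ValueError("both frameworks must report the same methods")
    return min(
        pearson_r(attributions_a[name], attributions_b[name])
        for name in attributions_a
    )
```

A run would then count as passing when `min_r_across_methods(a, b)` is at or above whatever threshold the benchmark defines.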

Mirror policy

Per the xaitalk hub-mirror policy, third-party artifacts ≤ 2.5 GB are mirrored by default. Gemma 4 E2B, at ~10 GB, is an explicit exception because it is the flagship "any-to-any" thinking model in xaitalk's coverage matrix and its Apache 2.0 license permits redistribution with attribution.

— xaitalk

Model provider

xaitalk

Model tree

Base

google/gemma-4-E2B-it

Fine-tuned

this model

Modalities

Input

Text, Image

Output

Text
