harsh0024
HiDream-O1-Image-Dev-2604
Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: mitModels
| Name | Script | Inference Steps | HuggingFace Repo |
|---|---|---|---|
| HiDream-O1-Image-Dev-2604 | inference.py | 28 | 🤗 HiDream-O1-Image-Dev-2604 |
| Prompt Agent 2604 | prompt_agent_v2.py | — | 🤗 HiDream-ai/Prompt-Refine |
Installation
- Clone this repository:
bash
git clone https://github.com/HiDream-ai/HiDream-O1-Image.gitcd HiDream-O1-Imagegit checkout dev
- Install the required dependencies:
bash
pip install -r requirements.txt
Note on
flash-attn. We highly recommend installingflash-attnfor optimized attention computation. If you do not (or cannot) installflash-attn, you must editmodels/pipeline.pyline 291 and change"use_flash_attn": Trueto"use_flash_attn": False— otherwise inference will fail to import the kernel.
Reasoning-Driven Prompt Agent
HiDream-O1-Image ships with a Reasoning-Driven Prompt Agent (prompt_agent_v2.py) that explicitly reasons through layout, subject attributes, physical logic, and text-rendering details, then rewrites a raw user instruction into a self-contained English prompt. Feed its output into inference.py for best results on intricate, reasoning-heavy requests.
The agent talks to an OpenAI-compatible endpoint serving HiDream-ai/Prompt-Refine via vLLM.
Step 1 — Download the refiner weights
bash
huggingface-cli download HiDream-ai/Prompt-Refine \--local-dir HiDream-ai/Prompt-Refine
Step 2 — Start the vLLM server
bash
bash start_vllm_server.sh
This launches HiDream-ai/Prompt-Refine on http://localhost:8000/v1.
Step 3 — Run the refiner
bash
python prompt_agent_v2.py \--prompt "A vintage aviation poster featuring a bright red biplane cruising over rolling farmlands. Bold blocky text at the bottom promises adventure in the friendly skies."
By default the script targets http://localhost:8000/v1 and HiDream-ai/Prompt-Refine; override with --base_url or --model_id if you serve the model elsewhere. The same module also exposes a reusable refine_prompt(prompt, model_id=..., base_url=...) function used by app.py.
Usage
A CUDA-capable GPU is required for inference. The examples below use the undistilled model (--model_type full); see the last subsection for running the same tasks with the distilled model (--model_type dev).
1. Text-to-Image Generation
Generate an image from a text prompt:
bash
python inference.py \--model_path /path/to/HiDream-O1-Image-Dev-2604 \--prompt "A vintage aviation poster depicting a bright red biplane cruising over rolling farmlands under a partly cloudy sky, with saturated colors and an aged paper texture. A red biplane with two sets of wings and a radial engine is positioned in the upper center of the image, flying toward the right. A pilot with light skin, wearing a brown flight helmet, goggles, and a brown jacket, is visible in the open cockpit. The biplane has black wheels with red hubs and a spinning propeller. Below, the landscape consists of rolling fields in various shades of green, yellow, and brown, divided by dirt roads and scattered with small houses, including a red barn, a brown house, and a white house. In the background, a line of green trees separates the fields from distant hills under a blue sky with white clouds. The poster has a textured, aged paper border with visible creases and discoloration. At the bottom, the text \"ADVENTURE IN THE FRIENDLY SKIES\" is displayed in large, bold, dark brown capital letters across two lines on a light beige background." \--output_image results/t2i.png \--height 2048 \--width 2048
License
The code in this repository and the HiDream-O1-Image-Dev-2604 models are licensed under MIT License.
Citation
bibtex
@article{hidreamolimage,title={HiDream-O1-Image: A Natively Unified Image Generative Foundation Model with Pixel-level Unified Transformer},author={Cai, Qi and Chen, Jingwen and Gao, Chengmin and Gong, Zijian and Li, Yehao and Mei, Tao and Pan, Yingwei and Peng, Yi and Qiu, Zhaofan and Yao, Ting and Yu, Kai and Zhang, Yiheng and others},journal={arXiv preprint arXiv:2605.11061},year={2026}}
Model provider
harsh0024
Model tree
Base
this model
Modalities
Input
Text, Image
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information