This guide walks through our recommended approach for generating high-quality, consistent isometric assets. The audience here is experienced users who already understand the basics of captioning, training, and forging. The focus is on best practices that reliably produce sharp, stylistically uniform assets in the correct isometric perspective.
Check out this quick tutorial video to get started.
Understanding Isometric Assets
At the core, an isometric asset is defined by its angle and perspective. Every asset in your dataset should strictly follow that same isometric view. A few key rules to keep in mind:
Keep alignment consistent. Assets should be centered both vertically and horizontally.
Maintain size uniformity. Assets should scale similarly across the dataset—don’t mix oversized and undersized versions of the same type of object.
Avoid skew or distortion. Any drift from the isometric perspective can confuse the model.
Check tile and floor alignment. Flooring patterns or grids should always line up correctly to reinforce the perspective.
Common mistakes to avoid:
Mixing camera angles (top-down or side view sneaking into the set).
Inconsistent asset sizing (some zoomed in, some zoomed out).
Overly complex or cluttered details that break the clean, isometric silhouette.
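The centering and sizing rules above can be checked before training. The sketch below is one possible way to do it, not a built-in feature: it assumes you already have each image's canvas size and the bounding box of its non-transparent pixels (for example from an image editor or an imaging library). The AssetBounds record, the audit function, and both thresholds are hypothetical and should be tuned to your own set.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class AssetBounds:
    """Canvas size plus the bounding box of the visible pixels (hypothetical record)."""
    name: str
    canvas_w: int
    canvas_h: int
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def center_offset(a):
    """Offset of the bounding-box center from the canvas center, as a fraction of canvas size."""
    dx = ((a.left + a.right) / 2 - a.canvas_w / 2) / a.canvas_w
    dy = ((a.top + a.bottom) / 2 - a.canvas_h / 2) / a.canvas_h
    return dx, dy


def fill_ratio(a):
    """How much of the canvas the asset spans along its larger dimension."""
    return max((a.right - a.left) / a.canvas_w, (a.bottom - a.top) / a.canvas_h)


def audit(assets, max_offset=0.05, max_size_dev=0.25):
    """Return {name: [problems]} for assets that are off-center or unusually sized.

    Both thresholds are assumed values, not recommendations from this guide.
    """
    typical = median(fill_ratio(a) for a in assets)
    report = {}
    for a in assets:
        problems = []
        dx, dy = center_offset(a)
        if abs(dx) > max_offset or abs(dy) > max_offset:
            problems.append("off-center")
        if abs(fill_ratio(a) - typical) > max_size_dev * typical:
            problems.append("size outlier")
        if problems:
            report[a.name] = problems
    return report


assets = [
    AssetBounds("chest", 512, 512, 96, 96, 416, 416),
    AssetBounds("barrel", 512, 512, 106, 96, 406, 416),
    AssetBounds("crate", 512, 512, 200, 200, 312, 312),  # much smaller than the rest
    AssetBounds("lamp", 512, 512, 0, 96, 320, 416),      # pushed to the left edge
]
report = audit(assets)
```

Here the report flags "crate" as a size outlier and "lamp" as off-center. A check like this does not catch perspective drift or mixed camera angles, so a visual pass over the dataset is still needed.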
Captioning Structure
Captioning is the backbone of a successful isometric training set. Every caption should follow this structure: [view] → [asset description] → [art style description]
View: Always start with something like:
“Isometric view of …”
Asset description: A clear, straightforward explanation of the object, character, background, or scene.
Art style description: Use the same art style sentence across every caption to enforce uniformity.
Example caption:
Isometric view of a wooden treasure chest with metal trim. The art style is vibrant and whimsical, featuring smooth 3D modeling, exaggerated proportions, and soft, rounded forms.
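If you write captions in bulk, a small helper keeps the three-part structure identical every time. This is a minimal sketch that assumes captions are plain strings you paste or upload yourself. The build_caption function and the two constants are hypothetical names, and the art style sentence is the one from the example caption.

```python
VIEW_PREFIX = "Isometric view of"
ART_STYLE = (
    "The art style is vibrant and whimsical, featuring smooth 3D modeling, "
    "exaggerated proportions, and soft, rounded forms."
)


def build_caption(asset_description, art_style=ART_STYLE):
    """Assemble [view] -> [asset description] -> [art style description]."""
    # Drop stray whitespace and a trailing period so the sentences join cleanly.
    asset = asset_description.strip().rstrip(".")
    return f"{VIEW_PREFIX} {asset}. {art_style}"


caption = build_caption("a wooden treasure chest with metal trim")
```

Only the asset description changes from caption to caption, so the view phrase and the art style sentence cannot drift.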
Prompt Prefix & Suffix
To extend consistency from training to forging, mirror your captions when setting up the style:
Prompt Prefix = [view]
Prompt Suffix = [art style description]
This way, during forging, you only need to provide the asset description. The prefix and suffix automatically enforce perspective and style.
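To see why mirroring works, it helps to check that prefix + asset description + suffix reproduces a training caption exactly. The sketch below assumes the three parts are joined as one space-separated prompt. How the platform actually combines them is not stated in this guide, so treat assemble_prompt as an illustration only. All names here are hypothetical.

```python
PROMPT_PREFIX = "Isometric view of"
PROMPT_SUFFIX = (
    "The art style is vibrant and whimsical, featuring smooth 3D modeling, "
    "exaggerated proportions, and soft, rounded forms."
)


def extract_asset_description(caption, prefix=PROMPT_PREFIX, suffix=PROMPT_SUFFIX):
    """Strip the shared view and art style text from a training caption."""
    if not (caption.startswith(prefix) and caption.endswith(suffix)):
        raise ValueError("caption does not follow [view] -> [asset] -> [art style]")
    middle = caption[len(prefix):len(caption) - len(suffix)]
    return middle.strip().rstrip(".")


def assemble_prompt(asset_description, prefix=PROMPT_PREFIX, suffix=PROMPT_SUFFIX):
    """Rebuild the full prompt from the asset description alone (assumed join rule)."""
    asset = asset_description.strip().rstrip(".")
    return f"{prefix} {asset}. {suffix}"


training_caption = (
    "Isometric view of a wooden treasure chest with metal trim. "
    "The art style is vibrant and whimsical, featuring smooth 3D modeling, "
    "exaggerated proportions, and soft, rounded forms."
)
asset_only = extract_asset_description(training_caption)
round_trip = assemble_prompt(asset_only)
```

If round_trip equals training_caption, the prefix and suffix mirror the captions, and the asset description is the only text you need to type while forging.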
Maintaining Consistency
Consistency is the difference between a polished style and scattered results. To keep your assets uniform:
Use the same art style description in every caption. Don’t vary it between assets.
Leverage prefix/suffix. This makes it nearly impossible to “forget” the perspective or style.
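A quick audit can catch captions whose view phrase or art style sentence has drifted. This is a rough sketch under two assumptions: the asset description is a single sentence, so the first ". " separates it from the art style sentence, and the most common style sentence in the set is the intended one. The function name is hypothetical, and the style sentences are shortened for readability.

```python
from collections import Counter


def find_inconsistent_captions(captions, view_prefix="Isometric view of"):
    """Return (caption, reason) pairs for captions that break the shared structure."""
    if not captions:
        return []
    # Split each caption into "view + asset" and "art style" at the first sentence break.
    split = [c.strip().partition(". ") for c in captions]
    expected_style, _ = Counter(style for _, _, style in split).most_common(1)[0]
    problems = []
    for caption, (head, _, style) in zip(captions, split):
        if not head.startswith(view_prefix):
            problems.append((caption, "missing view"))
        elif style != expected_style:
            problems.append((caption, "art style differs"))
    return problems


captions = [
    "Isometric view of a wooden treasure chest. The art style is vibrant and whimsical.",
    "Isometric view of a stone well. The art style is vibrant and whimsical.",
    "Isometric view of a market stall. The art style is colorful and whimsical.",
    "Top-down view of a campfire. The art style is vibrant and whimsical.",
]
problems = find_inconsistent_captions(captions)
```

In this sample the market stall caption is flagged for a different art style sentence and the campfire caption for a missing isometric view.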
Ensuring Quality
Sharpness over blur. If assets come out soft or fuzzy, check your training set—are the inputs crisp and high resolution?
Optional Upscaling. Use upscalers to refine outputs when sharper details are needed.
Reference images. If you’re aiming for specific textures or patterns, bring in references to reinforce details.
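As a first filter for soft training inputs, you can flag images below a minimum resolution. The sketch below reads the width and height from a PNG header using only the standard library. The 1024-pixel threshold and the function names are assumptions, not values from this guide, and resolution alone does not prove an image is sharp.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at byte offsets 16 and 20.
    return struct.unpack(">II", data[16:24])


def below_min_resolution(files, min_side=1024):
    """files maps a name to raw PNG bytes; return names whose shorter side is under min_side."""
    return [name for name, data in files.items() if min(png_size(data)) < min_side]
```

In practice you would fill the files dictionary by reading each training image in binary mode, for example with pathlib's Path.read_bytes(), then review or replace anything the check returns.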
Advanced Tips
Repeating patterns. For floors or tiles, make sure training images use proper grid alignment and no distortion. This helps the model naturally repeat patterns cleanly.
Lighting & shadows. Keep them consistent across assets—light from one direction, shadows at a predictable angle.
Balance detail vs. simplicity. Avoid overloading assets with micro-details that muddy readability at game scale. Instead, focus on bold shapes and clean contours.
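Grid alignment for floors and tiles is easier to reason about with the projection written out. The sketch below assumes the common 2:1 diamond tile layout used in many isometric games (64 × 32 pixels here). This guide does not prescribe a tile ratio, so the sizes and function names are illustrative only.

```python
def grid_to_screen(col, row, tile_w=64, tile_h=32, origin=(0, 0)):
    """Screen position of the top corner of the tile at grid cell (col, row)."""
    x = origin[0] + (col - row) * tile_w // 2
    y = origin[1] + (col + row) * tile_h // 2
    return x, y


def screen_to_grid(x, y, tile_w=64, tile_h=32):
    """Inverse mapping; whole-number results mean the point sits on a grid corner."""
    half_w, half_h = tile_w / 2, tile_h / 2
    col = (x / half_w + y / half_h) / 2
    row = (y / half_h - x / half_w) / 2
    return col, row


corners = [grid_to_screen(c, r) for r in range(2) for c in range(2)]
```

With these defaults, one step along a grid axis always moves 32 pixels sideways and 16 pixels down. If tile corners in a training image do not land on whole grid coordinates, the floor pattern is likely skewed or scaled inconsistently.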
Recommended Workflow
Captioning & Training
Build a dataset of assets all in isometric perspective. Caption them with the [view] → [asset] → [art style] structure.
Set Prompt Prefix/Suffix
Prefix = view. Suffix = art style description.
Forge
Input only the asset description when generating.
Refine & Iterate
If an output isn’t quite right, try small prompt variations while in Forge. You can also use a previously generated asset as an image reference to help guide the AI toward a more consistent result.
Optional: Upscale & Finalize
If sharper results are needed, run outputs through an upscaler to clean up details and polish the final asset.
Conclusion
By following this workflow (consistent perspective, structured captioning, prefix/suffix reinforcement, and disciplined iteration), you give the model what it needs to produce sharp, stylistically uniform isometric assets. The key is not just training the model, but training it with clarity and consistency.