
How to train a Single Character Style on Layer

Need more content for a game character? Training a custom style ensures generations will stay high-quality and on brand.


Training a Single Character style in Layer lets you teach the AI to consistently generate a specific character - with the right face, expressions, outfits, poses, and personality. Whether you’re creating a mascot, a main character, or an avatar style, this process helps lock in consistency across generations.


Start with Your Assets

For single character styles, your training images should include a variety of shots and expressions:

  • Headshots and closeups

  • Full body poses

  • Varying facial expressions

  • Different angles

This gives the model a broader understanding of your character’s visual identity - how they look, move, and emote.
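
Before uploading, it can help to sanity-check that your set actually covers that spread. Here’s a minimal Python sketch, assuming a purely hypothetical filename convention like luna_headshot_01.png (Layer doesn’t require any particular naming), that tallies how many images you have of each shot type:

from collections import Counter
from pathlib import Path

# Hypothetical convention: <character>_<shot type>_<index>.png,
# e.g. luna_headshot_01.png or luna_fullbody_03.png
SHOT_TYPES = ["headshot", "closeup", "fullbody", "expression", "angle"]

def audit_training_set(folder: str) -> Counter:
    counts = Counter()
    for image in Path(folder).glob("*.png"):
        for shot in SHOT_TYPES:
            if shot in image.stem.lower():
                counts[shot] += 1
                break
        else:
            counts["untagged"] += 1  # no shot-type tag found in the filename
    return counts

if __name__ == "__main__":
    for shot, count in audit_training_set("training_images").items():
        print(f"{shot}: {count} image(s)")

If any category comes back empty, that’s a sign to gather a few more reference images before training.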

When you’re creating prompts later, always include the character’s name to reinforce identity. For example:

“Luna, happy, wearing a Halloween costume”

This kind of input ensures your character stays visually consistent, even as you change outfits, expressions, or settings.

If you're looking for more details on the types of images you should be using to train your model, check out this video.


Why Captions Matter (and Why Auto-Captions Aren’t Enough)

When you upload images for style training, Layer will automatically caption them, but for single character training, this isn’t ideal. The auto-captioning might miss key details or fail to structure descriptions consistently - which makes it harder for the AI to learn who your character really is.

That’s where LLMs like OpenAI’s ChatGPT or Google’s Gemini come in. They can help you write detailed, structured descriptions and keep things consistent. (They’re also much faster than doing it manually.)


Use an LLM to Write Descriptions

You’ll want to kick off your LLM session with a clear prompt that defines the task, the format, the tone, and the length.

Here’s the full starting prompt you can use:

LLM Starting Prompt (for Single Character Captioning)

I’m working on training a LoRA for a single character. I will send you images and I need you to describe them to me. I need detailed descriptions that follow a consistent format for each image.

The descriptions should follow this format: [character’s name], [physical appearance], [pose], [expression], [shot type], and the [overall art style]. It’s important to maintain the same format and language across all images for this character. The text should be under 1024 characters, but aim for around 900.

Here are two examples of what a good description looks like:

Example 1:

“Luna is a tall, slender woman with pale skin and long, flowing dark hair. She has striking blue eyes and a sharp jawline, wearing a sleek, black bodysuit with silver accents. She stands confidently with one hand on her hip and the other holding a glowing orb of light. Her expression is calm yet determined, with a slight smirk. Full body shot. Semi-realistic art style with detailed shading, emphasizing sharp contrasts between light and dark tones.”

Example 2:

“Michael is a stout man with fair skin, a round, slightly jowly face, and reddish-brown hair styled with a side part that often appears neatly combed. He consistently wears distinctive pink glasses, which frame his friendly, green eyes. His build is generally stocky, and he has a cheerful, expressive demeanor. Wearing a straw hat, Michael strikes a playful, crouched pose with a joyful smile. He has brown suspenders with a plain white shirt and light blue pants. His energetic gesture makes him look like he is dancing! The art style is a 3D animated style, characterized by smooth, almost plasticky surfaces, bright and saturated colors, and soft, diffused lighting. The overall aesthetic is cartoonish and friendly, with exaggerated proportions and expressive facial features that contribute to a lighthearted, family-friendly appeal. There’s a notable absence of harsh lines or shadows, enhancing the gentle and approachable nature of the visuals.”

Please follow that same structure and level of detail when describing the image I send next. The character’s name is [INSERT NAME].
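
If you’d rather script the captioning than paste each image into a chat window, here’s a minimal sketch using the official openai Python package. The model name (gpt-4o), the training_images folder, and saving each caption to a matching .txt file are assumptions for this example; Layer doesn’t require any of them, and any vision-capable model should work.

import base64
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Paste the full starting prompt from above here, with [INSERT NAME] filled in.
CAPTION_PROMPT = "..."

def caption_image(image_path: Path) -> str:
    image_b64 = base64.b64encode(image_path.read_bytes()).decode()
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: use whichever vision-capable model you prefer
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": CAPTION_PROMPT},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

for image in sorted(Path("training_images").glob("*.png")):
    caption = caption_image(image)
    image.with_suffix(".txt").write_text(caption)  # caption saved next to the image
    print(f"{image.name}: {len(caption)} characters")

Note that each image is captioned in a fresh request, so the prompt itself (not chat memory) is what keeps the format consistent, which is exactly why the starting prompt spells out the structure and includes examples.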


Comparison: Auto-Caption vs LLM-Enhanced

Let’s compare what Layer’s auto-caption might give you vs. what you can get with a few minutes of help from an LLM.

Auto-caption (Layer):

“Cheerful man with blond hair, wearing a bright pink suit, yellow accents, and glasses, holding a blue box with yellow items; standing pose with a wide smile, 3D cartoon style, full body shot.”

LLM-enhanced caption:

“Fred is a stout man with fair skin, a round face, and neatly combed reddish-brown hair. He wears distinctive pink glasses, framing his friendly, green eyes. He’s dressed in a bright pink suit with a yellow collar and a pink tie, over a blue vest. He sports yellow shoes with pink trim. Standing upright, Fred gestures playfully with his fingers while smiling cheerfully. Full body shot. The art style is 3D animated, with smooth, plasticky surfaces, bright and saturated colors, and soft lighting. The overall aesthetic is cartoonish and friendly, with exaggerated proportions and expressive features.”

That extra attention to detail — and structure — helps the model retain character identity and style much more accurately.


Yes, It’s a Bit Tedious (But Worth It)

We know this process can be slow — manually writing or refining captions, even with AI help, takes time. As of April 2025, we’re actively working on product updates to better support single character training, but for now this is the best method for getting great results.

Also keep in mind: even LLMs make mistakes. Sometimes they won’t follow your formatting exactly. You may need to lightly edit or re-prompt to stay consistent.
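
A quick automated check can catch the most common slips before you upload. The sketch below assumes the layout from the earlier script (one .txt caption per image in a training_images folder) and uses “Luna” as a stand-in for your character’s name:

from pathlib import Path

CHARACTER_NAME = "Luna"  # stand-in: replace with your character's name
MAX_CHARS = 1024
TARGET_CHARS = 900

for caption_file in sorted(Path("training_images").glob("*.txt")):
    text = caption_file.read_text().strip()
    problems = []
    if not text.startswith(CHARACTER_NAME):
        problems.append("doesn't start with the character's name")
    if len(text) > MAX_CHARS:
        problems.append(f"too long ({len(text)} > {MAX_CHARS} characters)")
    elif len(text) > TARGET_CHARS:
        problems.append(f"above the ~{TARGET_CHARS}-character target ({len(text)})")
    if problems:
        print(f"{caption_file.name}: {'; '.join(problems)}")

Any caption it flags is usually a one-line fix or a quick re-prompt.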


Adding Example Prompts

Once you’ve finished your captions and uploaded your assets, you’ll reach the example prompts step.

While these don’t impact the actual training, they do give you a quick, high-level preview of what the style might generate — so it’s worth doing thoughtfully.

You can reuse the same LLM session to generate prompts. Here’s what to say:

“I’m at the final stage of a LoRA training, and I need to generate 5 new ideas for [insert asset type here—e.g., characters, in-game items, backgrounds, etc.]. These new ideas should be similar to the assets already in the training set, but they should introduce some variation, like different character poses or slight changes in design. The descriptions must follow the same format and consistency we’ve used throughout the project, with each description around 900 characters, not exceeding 1024.”

Once the LLM gives you 5 new ideas, copy and paste your favorite 3–5 as your example prompts, and you’re good to go.
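
If you went the scripted route earlier, there’s no persistent chat session to reuse, but you can get the same effect by sending the idea prompt along with a few of your finished captions as context. A rough sketch, with the same assumed folder layout and model as before:

from pathlib import Path

from openai import OpenAI

client = OpenAI()

# A few finished captions give the model the format to imitate.
examples = [p.read_text().strip()
            for p in sorted(Path("training_images").glob("*.txt"))[:3]]

IDEA_PROMPT = (
    "I'm at the final stage of a LoRA training, and I need to generate 5 new ideas "
    "for characters. These should be similar to the assets already in the training "
    "set, but introduce some variation, like different poses or slight changes in "
    "design. Follow the same format as the example captions below, each around 900 "
    "characters and not exceeding 1024.\n\nExample captions:\n\n" + "\n\n".join(examples)
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumption, as above
    messages=[{"role": "user", "content": IDEA_PROMPT}],
)
print(response.choices[0].message.content)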


While It’s Training: Set Up Prompt Prefix + Suffix

While your style is training, take a few minutes to set up your Prompt Prefix + Suffix.

This helps guide how the model behaves when you generate assets — for example:

  • Always inserting the character’s name at the start

  • Always appending a style description like “3D cartoon style with soft lighting”

Prefixes and suffixes are powerful ways to lock in tone, naming, and consistency once the model is ready to use.
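
Layer applies the prefix and suffix for you once they’re configured on the style, so there’s nothing to script here; the snippet below only illustrates how the pieces wrap around whatever you type, using an assumed prefix and suffix:

# Illustration only: Layer assembles the final prompt once a prefix
# and suffix are set on the style. The strings below are assumptions.
PROMPT_PREFIX = "Luna, "
PROMPT_SUFFIX = ", 3D cartoon style with soft lighting"

def build_prompt(user_prompt: str) -> str:
    return f"{PROMPT_PREFIX}{user_prompt}{PROMPT_SUFFIX}"

print(build_prompt("happy, wearing a Halloween costume"))
# -> Luna, happy, wearing a Halloween costume, 3D cartoon style with soft lighting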

We’ll explore prefix/suffix best practices more in a follow-up article.
