The AIGC vocabulary

The hubStudio AIGC Glossary

Plain-English definitions of the AI content terms brand teams keep hearing in 2026. Tokens, LoRA, diffusion, RAG, and a few you can stop pretending to understand.

30 Terms defined

4 Models for 2026

9 Minute read

Last reviewed: May 19, 2026. Next review: August 2026.

Image: a printer's wooden type case holding loose letterpress capitals in warm afternoon light. Caption: 30 terms, set A to Z.

Why this glossary exists

Vague brief in, vague output out.

The brand teams getting real value from AI content right now share one habit: they speak the language. They know what a LoRA is when their studio mentions one. They get why a fine-tuned model behaves more on-brand than a stock one. They ask sharper questions and they get sharper work back. It's not complicated.

This is the vocabulary you'll keep hearing in any AI content meeting in 2026. Definitions are short. Examples are concrete. And where a term has a useful nuance for brand teams, we've called it out.

30 terms

AIGC (AI-Generated Content)

What it is. Content made by a generative AI system from a prompt. Copy, images, video, audio, 3D, code, basically anything a brand might ship.

Why it matters. The term comes out of Chinese tech writing, where it sits alongside UGC (user-generated content) and PGC (professionally generated content). For a marketing team, AIGC is a production method, not a single tool.

Picture this. You type "summer skincare hero, beach pastels, glass bottle on sand," and twenty on-brand visuals land in your inbox before lunch. The skill is no longer finding an asset. It's deciding which one to use.

AI Avatar / Digital Human

What it is. A synthetic on-camera presenter, generated and animated by AI. Tools like Synthesia and HeyGen turn a script into a video of a realistic person delivering your lines in any of dozens of languages.

Why it matters. Brands use avatars for product explainers, internal training, and localized advertising where filming forty versions would be impossible. The cleanest setups use licensed actor likenesses rather than purely synthetic faces.

The payoff. A brand presenter who never travels, never reshoots, and never forgets the script. We've shipped these for clients who needed 14 market versions of the same explainer in under a week.

AI Brand Ambassador / Virtual Influencer

A fully synthetic persona built, owned, or licensed by a brand. Lil Miquela, Imma and Aitana López are the names you've probably seen. Each has a following in the hundreds of thousands or more, plus real commercial contracts. They never age. They never miss a shoot. And they can host a hundred localized campaigns in a single week.

The serious version of this is always disclosed and always built with rights-cleared likenesses. For global brands, the appeal is consistency across markets that human talent rarely matches. The flop version of this (poorly disclosed, badly designed, badly written) is what gives the whole category a bad name.

AI Voice / Voice Cloning

A digital voice built from a few minutes of recorded audio, able to read any new script. ElevenLabs, Resemble, and PlayHT are the names you'll hear most.

Brands dub video into new languages overnight, narrate explainers in a consistent voice, or keep a founder audible in ads when the founder isn't around to record. One warning though: consent and contracts are not optional. Using a real person's voice without their written permission is a legal problem long before it becomes a brand problem.

Artificial Intelligence (AI)

The broadest term on this list. AI is the field of building software that does things normally tied to human reasoning: language, perception, decision-making, prediction. Generative AI is one part of it. Older AI you already use every day includes spam filters, recommendation engines, and search ranking.

A useful mental model: AI is the discipline. Machine learning is the most common approach inside it. Generative AI is the part that makes new content. Knowing the wider field helps you tell the real claims from the marketing dust.

C2PA / Content Credentials

C2PA stands for Coalition for Content Provenance and Authenticity, the industry group behind an open standard of the same name. The standard attaches a tamper-evident label, called Content Credentials, to a file, recording who made it, when, with which tool, and whether AI was involved. Adobe Firefly, ChatGPT image generation, and several professional cameras already sign their files this way.

For brands, content credentials are how you prove an asset is authentic, or how you declare openly that a visual is AI-assisted. Expect platforms and regulators to make this standard practice within a few years. Probably sooner than that.
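To make "tamper-evident" concrete, here is a minimal Python sketch. It is not the C2PA manifest format: real Content Credentials use certificate-based signatures embedded in the file, while this toy uses a shared secret and an HMAC from the standard library. The function names, the metadata fields, and the key are all hypothetical.

```python
import hashlib
import hmac
import json

def sign_asset(asset_bytes, metadata, secret_key):
    """Return a hex signature covering both the file bytes and its provenance record."""
    record = json.dumps(metadata, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(asset_bytes).hexdigest().encode("utf-8")
    return hmac.new(secret_key, digest + b"|" + record, hashlib.sha256).hexdigest()

def is_untouched(asset_bytes, metadata, secret_key, signature):
    """True only if neither the file nor the record changed since signing."""
    expected = sign_asset(asset_bytes, metadata, secret_key)
    return hmac.compare_digest(expected, signature)

key = b"demo-key-not-for-production"   # hypothetical shared secret
image = b"...raw image bytes..."        # placeholder for a real file's contents
label = {"creator": "Example Studio", "tool": "example-image-model", "ai_generated": True}

signature = sign_asset(image, label, key)
print(is_untouched(image, label, key, signature))            # True
print(is_untouched(image + b"edit", label, key, signature))  # False
```

Change one byte of the file, or flip the "ai_generated" flag, and the check fails. That is the property brands are buying.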

Diffusion Model

A diffusion model creates images by starting with pure visual noise and gradually cleaning it up, step by step, until a clear picture appears. It's the technology behind Stable Diffusion, DALL·E, Midjourney, and most professional image tools.

Picture an old TV screen of static slowly resolving into a perfume bottle on marble. That's diffusion in one image.

Diffusion gives you strong control over style, lighting, and composition, which is why it's become the default engine for brand-grade visual work. Most of what hubStudio produces on the image side runs on diffusion in the background. After three years of using this method daily, we still find it the easiest one to teach creative teams: prompt, sample, refine, repeat.
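For the curious, the "noise in, picture out" idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not a working image generator: the linear noise schedule and its default values are a common textbook choice, the three "pixels" are made up, and where a trained model would have to predict the noise, we cheat and hand it the true noise so the arithmetic is easy to follow.

```python
import math
import random

def alpha_bar_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear noise schedule."""
    alpha_bars, running = [], 1.0
    for t in range(steps):
        beta = beta_start + (beta_end - beta_start) * t / (steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(pixels, alpha_bar, rng):
    """Forward process: blend the clean pixels with Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(alpha_bar) * p + math.sqrt(1.0 - alpha_bar) * n
             for p, n in zip(pixels, noise)]
    return noisy, noise

def remove_noise(noisy, predicted_noise, alpha_bar):
    """Invert the blend, given a prediction of the noise that was added."""
    return [(x - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
            for x, n in zip(noisy, predicted_noise)]

schedule = alpha_bar_schedule(1000)
rng = random.Random(0)
image = [0.2, 0.5, 0.9]  # three made-up pixel values

# The scale of the original signal shrinks as the step number grows.
for step in (0, 250, 500, 999):
    print(step, round(math.sqrt(schedule[step]), 3))

noisy, noise = add_noise(image, schedule[500], rng)
restored = remove_noise(noisy, noise, schedule[500])  # uses the true noise as a stand-in
```

The better a model gets at guessing the noise, the cleaner the restored picture. Training a diffusion model is, in essence, teaching it to make that guess.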

eCom Pack

What it is. The bundle of visuals a product needs to live on a retail page. White-background packshot, lifestyle shot, infographic, hero banner, mobile crop, sometimes a short video.

Why it matters. Traditionally a five-figure photo shoot per SKU. With AIGC, a studio delivers a full pack in days, on-brand and pre-approved for each marketplace's specs.

Where the math works. For brands with thousands of SKUs across regions, this is where AI content pays for itself fastest. We've seen 60% cost reductions on packs of 500 SKUs or more.

Fine-Tuning

What it is. Continuing to train an existing model on a curated set of your own material, so it sounds more like your brand or knows your products better.

Why it matters. A retailer might fine-tune a language model on five years of approved campaign copy. The model keeps its general skills, but its defaults shift toward your voice and house rules.

Cost shape. More than prompting. Less than building a model from scratch. It pays off when volume is high and brand consistency really matters. (See also: LoRA.)

Foundation Model

A very large, general-purpose AI model trained on broad data so it can be adapted to many tasks. GPT, Claude, Gemini, and Stable Diffusion are foundation models. Think of them as raw studio equipment: powerful out of the box, but they shine once tuned for a specific brief.

Many AIGC products you see are foundation models wrapped in a polished interface. The choice of base model decides quality, cost, and brand fit before a single prompt gets written.

GAN (Generative Adversarial Network)

Two neural networks pitted against each other. One creates fake content, the other tries to spot it. They keep training together until the fakes are convincing.

GANs powered the first wave of photoreal AI faces and synthetic product shots. They've mostly been joined or replaced by diffusion models for new work, but you'll still meet them in face swaps, super-resolution, and some video pipelines. Mostly the term is useful when a vendor mentions "GAN-based" output and you want to know if that's current or legacy thinking.

Generative AI (GenAI)

What it is. The family of models that creates new content rather than only analyzing existing data. It studies patterns in huge collections of examples, then produces fresh outputs in a similar style.

Why it matters. GenAI powers ChatGPT, Midjourney, Sora, Suno, and most of the tools your team is starting to use.

The shift for marketers. The basic question changes from "where do we find an asset" to "what do we want to imagine first." A copywriter asks GenAI for fifteen variants of an email subject line, then takes two into an A/B test.

Halfway through. The remaining terms are the ones most often misused in pitch decks. Worth reading these even if you skipped a few above.

Hallucination

A hallucination is when a generative model invents something that sounds confident but is wrong. A language model may cite a study that doesn't exist. An image model may give a watch the wrong number of hands. We've all seen the six-fingered hand.

It happens because these models predict plausible content, not verified facts. The fix is workflow, not blind trust. Pair the model with retrieval, human review, or a fact-checking pass before anything ships out the door.

Human-in-the-Loop

What it is. The principle that a person reviews and steers AI output at the moments that matter most: brief, selection, edit, approval.

Why it matters. It's the difference between AIGC that ships on brand and AIGC that embarrasses the brand.

How we work. At hubStudio, every generated asset passes through a creative director before it leaves the studio. The AI gives us speed and scale. The human keeps taste, story, and accountability. It's the part of the workflow nobody can shortcut, and frankly we don't want to.

Inference

Inference is the moment a trained model actually does its job. It takes your prompt and produces output. Training is the long, expensive learning phase. Inference is the everyday use, the work that happens every time someone clicks "generate."

Most of what brands pay for in AIGC is inference, billed per token, per image, or per second of video. GPU capacity at inference time decides how fast a studio can deliver during peak campaign weeks. It's why we maintain our own GPUs rather than rent on demand.
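A quick way to sanity-check an inference budget is to multiply volumes by unit prices. The sketch below does exactly that in Python. Every rate in it is a made-up placeholder, not a quote from any vendor, and the names RateCard and campaign_inference_cost are ours.

```python
from dataclasses import dataclass

@dataclass
class RateCard:
    """Hypothetical prices in US dollars; substitute your vendor's real rates."""
    per_1k_tokens: float
    per_image: float
    per_video_second: float

def campaign_inference_cost(rates, tokens, images, video_seconds):
    """Add up text, image, and video inference for one campaign."""
    text_cost = tokens / 1000 * rates.per_1k_tokens
    image_cost = images * rates.per_image
    video_cost = video_seconds * rates.per_video_second
    return round(text_cost + image_cost + video_cost, 2)

rates = RateCard(per_1k_tokens=0.01, per_image=0.04, per_video_second=0.50)
total = campaign_inference_cost(rates, tokens=200_000, images=500, video_seconds=120)
print(total)  # 82.0
```

With these placeholder rates, video dominates the bill even at two minutes of footage. Run the same sum with real prices before you scope a campaign.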

Large Language Model (LLM)

What it is. A foundation model that has read enormous amounts of text and learned to predict the next word (strictly, the next token) in any sequence. That single skill, applied at scale, becomes writing, summarizing, translating, and answering questions.

The names you'll meet. ChatGPT, Claude, and Gemini are the best-known.

Use cases. LLMs draft copy, summarize interview transcripts, build briefs, and turn product specs into landing-page text. They're very confident writers, which is exactly why every output gets read before it goes out.

LoRA (Low-Rank Adaptation)

What it is. A lightweight fine-tuning technique that leaves the original model's weights frozen and trains a small set of added low-rank matrices instead, usually a small fraction of the full parameter count. Instead of retraining the whole network, you train a small "patch" that slots in at inference time.

Why it matters. This is how a studio captures a specific art style, a product silhouette, or a brand character and applies it to thousands of new visuals.

Practical note. LoRA files are small, typically a few megabytes to a few hundred, against gigabytes for a full model. That makes them easy to swap, version, and share with a partner agency. Most of the on-brand AIGC you admire online has a LoRA somewhere in the pipeline.
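Why is the patch so small? A rough Python sketch of the arithmetic helps. A LoRA replaces an update to one large weight matrix with the product of two thin matrices, usually called B and A. The 4096 by 4096 layer size and the rank of 8 below are illustrative assumptions, not figures from any particular model.

```python
def full_matrix_params(rows, cols):
    """Parameters in one ordinary weight matrix."""
    return rows * cols

def lora_params(rows, cols, rank):
    """Parameters in the two thin matrices: B is rows x rank, A is rank x cols."""
    return rank * (rows + cols)

def apply_lora(weight, b, a, scale=1.0):
    """Return weight + scale * (b @ a) without modifying the original weight."""
    rank = len(a)
    return [
        [weight[i][j] + scale * sum(b[i][r] * a[r][j] for r in range(rank))
         for j in range(len(weight[0]))]
        for i in range(len(weight))
    ]

full = full_matrix_params(4096, 4096)
patch = lora_params(4096, 4096, rank=8)
print(full, patch, f"{patch / full:.2%}")  # 16777216 65536 0.39%

# Tiny worked example: a 2x2 weight patched by a rank-1 update.
patched = apply_lora([[1, 0], [0, 1]], b=[[1], [2]], a=[[3, 4]])
print(patched)  # [[4.0, 4.0], [6.0, 9.0]]
```

The base weights are never overwritten, which is why one base model can wear many LoRAs, one per brand or campaign.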

Machine Learning (ML)

How most modern AI actually learns. Instead of being programmed with strict rules, a model studies examples and figures out the patterns on its own. The more relevant the examples and the cleaner the data, the better the result.

For a brand, an image model trained on luxury product photography will generate cleaner luxury shots than one fed random web images. ML sits underneath nearly every AIGC tool your team will touch.

Multimodal AI

What it is. A model that handles more than one type of input or output. Text plus image. Voice plus video. Sometimes all of them at once.

Picture this. You upload a product photo, attach the brand book, and ask for three Instagram captions in your tone. The same model does both jobs.

Why it matters. Modern flagships like GPT-4o, Gemini, and Claude are multimodal. For brand teams this collapses several tools into one workflow.

Neural Network

A stack of mathematical layers loosely inspired by how brain cells connect. Each layer transforms the input a little, and after enough layers the model can recognize a cat, translate a sentence, or design a packshot.

Neural networks are the engine inside LLMs, diffusion models, and GANs. You don't need the math to use them. But the term is useful when an engineer tells you "the network is too small for that job."
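If "a stack of layers" feels abstract, here is a minimal Python sketch of a two-layer network's forward pass. The weights are hand-picked numbers for illustration only. A real network learns millions or billions of them from data.

```python
def relu(values):
    """Keep positive numbers, replace negatives with zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of the inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Pass the input through every layer, applying ReLU between layers."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations

layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 hidden values
    ([[1.0, 2.0]], [0.5]),                     # layer 2: 2 hidden values -> 1 output
]
print(forward([2.0, 1.0], layers))  # [4.5]
```

"Too small for that job" simply means too few of these layers, or too few numbers in each, to capture the pattern.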

Prompt / Prompt Engineering

What it is. A prompt is the written instruction you give a generative model. Prompt engineering is the craft of writing it well, with the right context, references, constraints, and tone.

The difference. A vague "make me a poster" gets you a vague poster. A clear "60×80 cm key visual, beauty product centered on soft pink seamless paper, top-down camera, brand magenta accent" gets you something usable.

What senior teams do. They treat prompt libraries the same way they treat brand guidelines. Documented. Versioned. With somebody who actually owns them.
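One way to treat a prompt like a brand guideline is to store it as a named, versioned template with an owner. The Python sketch below is a hypothetical structure, not a feature of any specific tool. The class name, the field names, and the example constraints are our own.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PromptTemplate:
    """A documented, versioned prompt with a named owner (hypothetical structure)."""
    name: str
    version: str
    owner: str
    template: str
    constraints: tuple = ()

    def render(self, **details):
        """Fill in the brief's details and append any standing constraints."""
        body = self.template.format(**details)
        if self.constraints:
            body += " Constraints: " + "; ".join(self.constraints) + "."
        return body

key_visual = PromptTemplate(
    name="key-visual",
    version="1.2.0",
    owner="brand-team@example.com",
    template="{size} key visual, {subject} centered on {surface}, "
             "{camera} camera, {accent} accent.",
    constraints=("no text in image", "no people"),
)

prompt = key_visual.render(
    size="60x80 cm",
    subject="beauty product",
    surface="soft pink seamless paper",
    camera="top-down",
    accent="brand magenta",
)
print(prompt)
```

Because the template is frozen and versioned, a change to the house style becomes a reviewed edit rather than something one designer remembers.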

RAG (Retrieval-Augmented Generation)

What it is. A method that connects a generative model to a private knowledge source, such as your product catalog, brand book, or policy documents. The system retrieves relevant snippets first, then asks the model to answer using them.

Why it matters. The result is grounded in your own data rather than the model's general memory. Fewer hallucinations, more accurate answers.

Use cases. Branded chatbots, internal research copilots, sales assistants that quote real product specs rather than guesses.
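The retrieve-then-ask pattern is easier to see in code. This Python sketch scores documents by simple word overlap, which is a stand-in we chose for readability. Production systems usually use embedding-based search instead. The catalog entries are invented examples, and the final call to a language model is left out.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=2):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(documents,
                    key=lambda doc: len(question_words & tokenize(doc)),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(question, snippets):
    """Put the retrieved snippets in front of the question for the model."""
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return ("Answer using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {question}")

catalog = [  # invented product facts for illustration
    "The Glow Serum comes in a 30 ml glass bottle and is fragrance free.",
    "Returns are accepted within 30 days of delivery.",
    "The Daily Cleanser is sold in a 150 ml recyclable tube.",
]
question = "What size is the Glow Serum bottle?"
snippets = retrieve(question, catalog, top_k=1)
print(build_prompt(question, snippets))
```

The model never has to "remember" the bottle size. It reads it from your catalog at the moment of answering.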

Style Transfer

Style transfer takes the look of one piece of art (the brushstrokes, the palette, the lighting) and applies it to a different subject. A reference photo plus the words "in the style of a 1950s travel poster" can shape an entire campaign in that aesthetic.

It's how brands keep visual identity consistent across thousands of localized assets. Combined with a LoRA, one campaign visual produces hundreds of market-specific variations without losing its look.

Synthetic Media

The umbrella term. Any content generated or substantially altered by AI counts: images, video, voice, music. Deepfakes are one subset. AI avatars and AI music are others. The category is growing fast, which is why major social platforms such as YouTube, TikTok, and Meta's apps now require labels on realistic AI-generated or AI-altered content.

For brand teams the practical job is simple: treat synthetic assets like any other production. Get them licensed. Get them approved. Make sure they're traceable.

Text-to-Image

What it is. The use case where you type a description and the model produces a picture. The prompt is your art direction: "Editorial portrait, soft window light, navy linen suit, 35mm film grain."

What drives quality. The model, the prompt, and any reference images you feed in.

Where it pays off. Moodboards, A/B creative tests, full campaign visuals. Midjourney, DALL·E, Adobe Firefly, and Stable Diffusion are the engines you'll meet most often.

Text-to-Video

What it is. A written prompt produces a short clip. Tools like OpenAI's Sora, Runway, Pika, and Kling can now generate sequences of several seconds with believable motion, camera moves, and lighting.

The honest take. The output is still short. But for social ads, hero shots, and product B-roll, that's exactly the length most briefs need.

Look ahead. Expect noticeable improvement every quarter. Frame rate, resolution, and physical realism all climbing.

Token

The smallest piece of input a model reads or writes. For text, a token is roughly three-quarters of an English word. Vendors bill by the token, models measure their context windows in tokens, and every token is converted to numbers behind the scenes.

Knowing this helps with cost ("this brief is about 4,000 tokens") and with limits ("the model can hold 200,000 tokens at once"). For images, tokens come from small patches. For audio, from short clips. It's the unit of currency for the whole industry, basically.
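The three-quarters rule of thumb turns into a handy back-of-envelope estimator. The Python sketch below is an approximation only: real counts depend on each model's tokenizer and on the language of the text. The 200,000-token window and the 4,000 tokens reserved for the reply are illustrative defaults, not limits of any specific model.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb for English text

def estimate_tokens(text):
    """Rough token count from a whitespace word count."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)

def fits_in_context(text, context_window=200_000, reserved_for_reply=4_000):
    """Check that the text plus room for the model's answer fits the window."""
    return estimate_tokens(text) + reserved_for_reply <= context_window

brief = "word " * 3000            # stand-in for a 3,000-word brief
print(estimate_tokens(brief))     # 4000
print(fits_in_context(brief))     # True
```

So a 3,000-word brief is about 4,000 tokens, which is the sort of number worth having in your head before a pricing conversation.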

Training Data

Training data is the collection of examples a model studies during learning. For an LLM it might be billions of web pages and books. For a brand-tuned image model it might be a few hundred clean product photographs.

The quality, rights, and bias of that dataset set the ceiling for everything the model can ever do. The cliché everyone repeats here is true:

Garbage in, garbage out.
Attributed to IBM programmer George Fuechsel, 1960s.

Transcreation

Creative translation. Rather than translate copy word for word, transcreation rewrites the message so it lands with the same emotional weight in a new culture or language. A French slogan that depends on a pun becomes a different but equally clever line in Japanese.

AIGC speeds up transcreation by drafting localized versions at scale, which a native copywriter then refines. The faster a brand wants to enter new markets, the more useful this becomes. We do a lot of this for clients moving between Chinese and Western markets, and it's never as simple as it looks on paper.

Transformer

The model architecture behind almost all modern generative AI, named after the 2017 paper that introduced it:

Attention Is All You Need.
Vaswani et al., Google (2017)

A transformer reads sequences (words, image patches, audio frames) and learns which parts of the sequence relate to which. That "attention" mechanism is what lets an LLM remember a long brief or stay on brand across a thousand-word article. When somebody says a model has "billions of parameters," they usually mean a transformer of a certain size.
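The attention step itself is compact enough to write out. Below is a minimal pure-Python sketch of scaled dot-product attention, the core operation from the paper, on tiny hand-made vectors. A real transformer runs this across many "heads" and layers with learned projections, all of which are omitted here.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for small lists of vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # How strongly does this position relate to every other position?
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        mixed = [sum(w * v[d] for w, v in zip(weights, values))
                 for d in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Two identical keys get equal attention, so the output is the average of the values.
outputs, weights = attention(
    queries=[[1.0, 1.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
print(weights)  # [[0.5, 0.5]]
print(outputs)  # [[2.0, 4.0]]
```

When one key matches the query better than the others, its weight rises and its value dominates the blend. That is the "which parts relate to which" idea in numbers.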

B, J, K, O, Q, U, V, W, X, Y, Z have no entries yet. Suggest a term: hello@hubstudio.com.

Verified May 2026

The models you'll actually brief in 2026

Models in this category change fast. We refresh this section quarterly.

Image models

ChatGPT Image 2

OpenAI · gpt-image-2 · April 2026

OpenAI's current image model (gpt-image-2 in the API), launched April 2026 as the successor to DALL·E, which was retired the following month. The first OpenAI image model with native reasoning, meaning it pauses to plan a complex scene before drawing it.

Outputs up to 2K resolution, handles aspect ratios from 3:1 to 1:3, and produces up to 8 coherent images from a single prompt with the same characters and props across the batch. Multilingual text rendering (Japanese, Korean, Chinese, Hindi, Bengali) makes it a strong choice for global campaign packs.

Nano Banana 2

Google DeepMind · Gemini 3.1 Flash Image · February 2026

Google DeepMind's latest image model, released February 2026. Now the default image generator across Gemini, Search, Lens, AI Studio, and Vertex AI. Technically Gemini 3.1 Flash Image, which pairs Gemini's world knowledge and web-search grounding with very fast generation.

Headline tricks for brand teams: accurate text rendering, character consistency across up to five characters, and object fidelity across up to fourteen elements in a single image. Strong for storyboards, infographics, and ecommerce sets. If you want one image engine that understands your prompt in detail and finishes quickly, this is the one.

Video models

Kling 3.0

Kuaishou · February 2026

Kling 3.0, from Kuaishou, released February 2026. Supports text-to-video, image-to-video, multi-shot storyboarding, and reference-based generation at native 4K, 60 FPS, and up to 15 seconds. Also generates audio across multiple languages, dialects, and accents in the same pass.

The "AI Director" feature lets you specify duration, shot size, camera angle, and movement for each shot in a sequence, which turns the model into a real storyboard tool. For brand films, hero campaigns, and high-volume social, Kling sits at the top of the working filmmaker's shortlist.

Seedance 2.0

ByteDance · February 2026

Seedance 2.0 is ByteDance's video model, released February 2026. As of our May 2026 check, it ranks #1 for image-to-video with audio on the Artificial Analysis leaderboard. Generates up to 15 seconds of 1080p video with synchronized, dual-channel audio (dialogue, music, ambient sound) in a single pass. Accepts up to 12 reference assets to keep characters and products consistent across shots.

Multilingual prompts in English, Chinese, Japanese, and Korean. For TikTok-native, eCom, and APAC-led campaigns, it has become the model many studios reach for first.

A note for marketers

You don't need to know how a diffusion model works internally to brief a great campaign. Same way you don't need to know how a CMOS sensor works to commission a photo shoot. But knowing the vocabulary changes the conversation with your studio, your agency, your internal stakeholders.

You ask sharper questions. You catch the difference between a tool with real craft behind it and one with a slick demo. And you save money, because most AIGC waste happens at the brief, long before anyone clicks "generate."

Work with the studio

See the vocabulary in action

Book a 30-minute working session. Brief one product with us and get three free visuals back within 48 hours. No pitch deck, no commitment.
