Artificial intelligence
Artificial intelligence is what raises the computer character above the level of digital puppet. It is the difference between a character who is moved and a character who moves — between a model that executes instructions and an agent that responds to the world. The history of AI in games is the history of different attempts to close the gap between those two things, and the story of why that gap has remained, until very recently, almost impossibly difficult to close.
This page has been substantially rewritten from its original form. The earlier version, written before 2013, described applied AI techniques accurately for their era: finite state machines, pathfinding, crowd behaviour, physical simulation. Those techniques remain in use and are described below. But the developments of the past decade — deep reinforcement learning, large language models, generative behaviour systems — have changed the nature of the question so fundamentally that the old account was no longer sufficient. This page now attempts to cover both the foundation and the current state of the field, with the guild’s critical eye on what each development means for synthetic performance.
The classical toolkit: scripted intelligence (1980s–2000s)
The earliest game AI was entirely scripted: characters followed fixed rules, responding to specific inputs with specific outputs. The ghost behaviour in Pac-Man (1980) — each ghost assigned a distinct pursuit pattern — is the paradigmatic example: simple rules producing behaviour that players experienced as personality. Blinky chases directly; Pinky targets ahead of the player; Inky and Clyde operate more erratically. The impression of distinct characters arose from rules so simple they could be described in a sentence each. This is one of the most important lessons in game AI, and one that later, more technically ambitious systems have sometimes forgotten: the player’s perception of intelligence is not determined by the complexity of the underlying system.
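To make the point concrete, the following sketch reproduces the spirit of the four targeting rules in Python. The coordinates, thresholds, and function names are approximations of the arcade original, not a faithful reimplementation.

    def blinky_target(pacman_pos, pacman_dir, blinky_pos):
        # Blinky: chase Pac-Man's current tile directly.
        return pacman_pos

    def pinky_target(pacman_pos, pacman_dir, blinky_pos):
        # Pinky: ambush by aiming four tiles ahead of Pac-Man's heading.
        (px, py), (dx, dy) = pacman_pos, pacman_dir
        return (px + 4 * dx, py + 4 * dy)

    def inky_target(pacman_pos, pacman_dir, blinky_pos):
        # Inky: double the vector from Blinky to a point two tiles ahead
        # of Pac-Man, which reads to the player as erratic flanking.
        (px, py), (dx, dy) = pacman_pos, pacman_dir
        ax, ay = px + 2 * dx, py + 2 * dy
        bx, by = blinky_pos
        return (2 * ax - bx, 2 * ay - by)

    def clyde_target(pacman_pos, clyde_pos, corner=(0, 30)):
        # Clyde: chase when more than eight tiles away, otherwise retreat
        # to his corner, which reads as timidity.
        (px, py), (cx, cy) = pacman_pos, clyde_pos
        chasing = (px - cx) ** 2 + (py - cy) ** 2 > 64
        return pacman_pos if chasing else corner

Four functions, four personalities: the repertoire that players named, feared, and anthropomorphised fits on a single page.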
Finite state machines (FSMs) formalised this approach: characters existed in defined states — patrolling, alerted, attacking, fleeing — and transitioned between them on defined conditions. Half-Life (1998) demonstrated how far FSMs could be taken with careful design: its enemy AI coordinated flanking manoeuvres, used cover, and responded to the player’s position with what felt like tactical intelligence. The underlying system was not complex; the design was.
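A minimal FSM of the kind described is a lookup table and little more. The states and events in this Python sketch are illustrative, not drawn from Half-Life or any other shipped game.

    # Transition table: (current state, event) -> next state.
    TRANSITIONS = {
        ("patrolling", "saw_player"):    "alerted",
        ("alerted",    "confirmed"):     "attacking",
        ("alerted",    "lost_player"):   "patrolling",
        ("attacking",  "low_health"):    "fleeing",
        ("attacking",  "lost_player"):   "alerted",
        ("fleeing",    "reached_cover"): "alerted",
    }

    class GuardFSM:
        def __init__(self):
            self.state = "patrolling"

        def handle(self, event):
            # Only designer-defined (state, event) pairs cause a change;
            # everything else is silently ignored, which is both the
            # strength and the limit of the approach.
            self.state = TRANSITIONS.get((self.state, event), self.state)

    guard = GuardFSM()
    guard.handle("saw_player")   # patrolling -> alerted
    guard.handle("confirmed")    # alerted -> attacking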
Behaviour trees, popularised by Halo 2 (2004) and subsequently adopted as the dominant architecture in AAA game development, offered a more scalable and modular approach. Rather than a flat list of states, behaviour trees organise character decision-making as a hierarchy of tasks, with priority and fallback structures that allow complex behaviour to emerge from composable building blocks. They remain, in 2026, the most widely used architecture for NPC behaviour in commercial game production. Unreal Engine and Unity both provide native behaviour tree editors, and the majority of AAA game characters still run on them.
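The essential machinery fits in a few dozen lines. The Python sketch below implements the two composite nodes that give behaviour trees their priority and fallback structure; the node names follow common convention rather than any particular engine's API.

    SUCCESS, FAILURE, RUNNING = "success", "failure", "running"

    class Sequence:
        # Runs children in order; stops at the first non-success.
        def __init__(self, *children):
            self.children = children
        def tick(self, bb):
            for child in self.children:
                status = child.tick(bb)
                if status != SUCCESS:
                    return status
            return SUCCESS

    class Selector:
        # Tries children in priority order; the fallback structure.
        def __init__(self, *children):
            self.children = children
        def tick(self, bb):
            for child in self.children:
                status = child.tick(bb)
                if status != FAILURE:
                    return status
            return FAILURE

    class Leaf:
        # Wraps a function taking the blackboard and returning a status.
        def __init__(self, fn):
            self.fn = fn
        def tick(self, bb):
            return self.fn(bb)

    # Attack when the player is visible; otherwise fall back to patrol.
    tree = Selector(
        Sequence(Leaf(lambda bb: SUCCESS if bb["player_visible"] else FAILURE),
                 Leaf(lambda bb: SUCCESS)),   # attack action, stubbed
        Leaf(lambda bb: RUNNING),             # patrol action, stubbed
    )
    assert tree.tick({"player_visible": False}) == RUNNING

Because composites nest arbitrarily, a designer can add a new high-priority behaviour by inserting one subtree, without rewiring the transitions of every existing state.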
Goal-Oriented Action Planning (GOAP), developed by Jeff Orkin for F.E.A.R. (2005), took a different approach: rather than specifying the sequence of actions a character should take, GOAP specifies the character’s goals and a library of actions with preconditions and effects, and uses a planning algorithm to find the sequence of actions that achieves the goal given the current world state. F.E.A.R.’s enemy soldiers could find cover, coordinate suppressing fire, and flank the player in ways that felt genuinely tactical — because the AI was, in a limited but real sense, planning rather than scripting. GOAP influenced a generation of game AI designs and its descendants appear in titles from S.T.A.L.K.E.R. to Middle-earth: Shadow of Mordor.
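The contrast with scripting can be shown directly. The toy planner below declares actions as preconditions and effects and searches for a sequence that satisfies a goal. Orkin's implementation used A* with per-action costs; this sketch uses a uniform-cost breadth-first search for brevity, and the action set is invented for illustration.

    from collections import deque

    # Each action: (preconditions, facts added, facts removed).
    ACTIONS = {
        "draw_weapon":   ({"weapon_holstered"}, {"weapon_drawn"}, {"weapon_holstered"}),
        "move_to_cover": ({"weapon_drawn"}, {"in_cover"}, set()),
        "attack":        ({"weapon_drawn", "in_cover"}, {"target_suppressed"}, set()),
    }

    def plan(state, goal):
        # Breadth-first search over world states for a sequence of
        # actions whose combined effects satisfy the goal.
        frontier = deque([(frozenset(state), [])])
        seen = {frozenset(state)}
        while frontier:
            facts, steps = frontier.popleft()
            if goal <= facts:
                return steps
            for name, (pre, add, remove) in ACTIONS.items():
                if pre <= facts:
                    nxt = frozenset((facts - remove) | add)
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append((nxt, steps + [name]))
        return None  # no action sequence reaches the goal

    print(plan({"weapon_holstered"}, {"target_suppressed"}))
    # -> ['draw_weapon', 'move_to_cover', 'attack']

No designer wrote that three-step sequence; the planner derived it from the declared world model, which is precisely the sense in which F.E.A.R.'s soldiers planned rather than followed a script.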
Applied AI in synactor performance: the persistent problems
The original version of this page identified four domains of applied AI in character performance: pathfinding and navigation, emotional and expressive behaviour, social awareness, and physical plausibility. All four remain active problems, unsolved to varying degrees, and all four bear directly on the guild’s concerns.
Navigation has been the most tractable. Navigation meshes (navmeshes) — simplified representations of the walkable geometry of a game world — combined with the A* pathfinding algorithm, produce competent navigation in most game environments. Characters find paths, avoid obstacles, and arrive at destinations. What remains more difficult is the quality of that navigation as performance: whether the character moves with appropriate weight and intention, anticipates its own movement with head and eye before body, adjusts speed and manner to the situation. A character who navigates efficiently but moves robotically is a failed performance regardless of the technical competence of the underlying system.
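The algorithm itself is compact, as the sketch below suggests. It runs A* on a grid for simplicity; a production system searches navmesh polygons with the same logic and then smooths the resulting path. The Manhattan heuristic and four-way neighbourhood are placeholder choices.

    import heapq

    def astar(start, goal, walkable):
        # walkable: set of (x, y) cells the character may occupy.
        def h(a, b):
            return abs(a[0] - b[0]) + abs(a[1] - b[1])
        frontier = [(h(start, goal), 0, start, [start])]
        best = {start: 0}
        while frontier:
            _, cost, node, path = heapq.heappop(frontier)
            if node == goal:
                return path
            x, y = node
            for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nxt in walkable and cost + 1 < best.get(nxt, float("inf")):
                    best[nxt] = cost + 1
                    heapq.heappush(frontier, (cost + 1 + h(nxt, goal),
                                              cost + 1, nxt, path + [nxt]))
        return None  # goal unreachable from start

    room = {(x, y) for x in range(10) for y in range(10)} - {(5, y) for y in range(9)}
    path = astar((0, 0), (9, 9), room)  # routes around the wall at x == 5

The path it returns is correct but expressionless; everything that makes traversal read as performance happens in how the animation system executes it.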
Crowd behaviour remains one of the most visible failures of game AI as performance. The original page cited Just Cause 2 and Assassin’s Creed as examples of crowds with impoverished behavioural repertoires; the problem has not been substantially resolved in the years since. Populating a game world with hundreds of background characters who behave plausibly under all conditions — who remember events, respond consistently to the player’s reputation, maintain social awareness of each other as well as of the player — remains computationally expensive and design-intensive enough that most productions cut corners. The Truman moment that results — the rupture in immersion when a crowd character does something that reveals the poverty of its inner model of the world — is one of the most common failures of synthetic performance in the medium.
Physical plausibility has improved substantially with advances in physics engines and animation blending, but the fundamental problem remains: digital characters do not have bodies in the physical sense, and the simulation of physical consequence — weight, inertia, contact, fatigue — must be constructed by hand or learned from data. Characters still interpenetrate objects. Cloth and hair still behave unconvincingly in edge cases. The weightlessness that reveals a character as a model rather than a body is still a frequent performance failure, particularly in games whose production values are otherwise high.
The neural turn: learned behaviour and deep reinforcement learning
The application of neural networks to game character behaviour began seriously in the early 2010s and accelerated substantially after the success of DeepMind’s AlphaGo (2016) and the broader explosion of deep learning capability. The implications for game AI were not immediately obvious: AlphaGo played Go, not a game with characters and worlds. But the underlying capability, learning complex strategies through self-play and reward signals rather than through explicit programming, was clearly relevant.
OpenAI Five (2018–2019), which achieved professional-level play in the team-based game Dota 2 through deep reinforcement learning, demonstrated that learned AI agents could develop strategies of genuine tactical sophistication without being taught them explicitly. The agents discovered cooperative play, resource management, and strategic priority through billions of simulated games. The resulting behaviour was, in competitive terms, better than human. In performance terms, it was strange: the agents were alien in their precision and inhuman in their consistency, playing optimally in ways that no human player would recognise as natural or expressive. This is the performance problem that the competitive AI literature has largely ignored: that optimal play and believable play are not the same thing, and that a game character whose primary role is performance rather than competition needs to be evaluated on different criteria.
Neural animation — using neural networks to generate or blend character motion rather than relying entirely on hand-authored animation clips — has produced more directly relevant results for synactor performance. Motion Matching, a technique that searches a large database of captured human motion for the most appropriate clip given the current character state, has been widely adopted in AAA production since its high-profile use in For Honor (2017) and Ubisoft’s subsequent titles. Phase-Functioned Neural Networks and their successors learn to generate character locomotion directly, producing movement with organic weight and transition quality that hand-authored state machines struggle to match. These are not AI systems that make decisions; they are AI systems that make the execution of decisions look convincingly physical.
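Reduced to its essentials, motion matching is a weighted nearest-neighbour search over captured motion, as the sketch below suggests. The feature layout, weights, and brute-force search are stand-ins; production systems choose pose and trajectory features carefully and accelerate the lookup.

    import numpy as np

    class MotionDatabase:
        def __init__(self, features, weights):
            # features: (num_frames, feature_dim) array extracted from
            # captured motion; weights balance pose against trajectory terms.
            self.features = features
            self.weights = weights

        def best_frame(self, query):
            # Brute-force weighted nearest neighbour. Production systems
            # accelerate this step and add hysteresis so the character
            # does not switch clips every frame.
            diff = (self.features - query) * self.weights
            return int(np.argmin(np.einsum("ij,ij->i", diff, diff)))

    rng = np.random.default_rng(0)
    db = MotionDatabase(rng.normal(size=(10_000, 24)), np.ones(24))
    frame = db.best_frame(rng.normal(size=24))  # frame index to blend toward

The intelligence here is in the data: every frame the system can select was once performed by a human body, which is why the output inherits human weight.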
The generative turn: large language models and character intelligence
The entry of large language models into the game character pipeline is the most significant development this page must address, and the one most directly relevant to the guild’s concerns. It represents not an improvement to existing AI techniques but a different kind of system altogether — one whose implications for synthetic performance are still being worked out in real time.
The foundational research demonstration came in April 2023, when Joon Sung Park and colleagues at Stanford and Google published “Generative Agents: Interactive Simulacra of Human Behavior”, a paper describing an experiment called Smallville: a game-like environment populated by 25 AI agents, each given a brief biography — name, age, occupation, a few personal habits — and left to live in the simulated town for two days. The agents, each powered by a large language model, woke up, cooked breakfast, went to work, had conversations, formed relationships, and organised a party — none of which was programmed explicitly. They planned their days, remembered past events, reflected on them, and adjusted their behaviour in light of their experiences. When interviewed about their plans and memories, they responded coherently and consistently with the characters they had been given.
What made Smallville significant was not that the behaviour was perfect — the agents occasionally misremembered or embellished — but that the behaviour was generated rather than scripted. No designer had written a rule for any of the specific interactions that occurred. The characters behaved in accordance with their described personalities and circumstances, and the emergent social dynamics between them were not anticipated by their creators. This is the property that makes LLM-powered characters different in kind from FSM or behaviour tree characters: the designer sets a character’s parameters and the system generates behaviour within them, rather than the designer specifying the behaviour directly.
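The architecture's retrieval step can be sketched directly from the paper's description: each memory is scored on recency, importance, and relevance, and the highest-scoring memories are placed back into the model's context before it acts. The weights, decay constant, and data layout below are stand-ins, not the paper's exact values.

    import math

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def retrieve(memories, query_embedding, now, k=5,
                 w_recency=1.0, w_importance=1.0, w_relevance=1.0):
        # Each memory: {"embedding": [...], "importance": 0..1 (rated by
        # the model itself), "last_access": timestamp}. The top-k scored
        # memories are fed back into the language model's context.
        def score(m):
            recency = math.exp(-(now - m["last_access"]) / 3600.0)
            relevance = cosine(m["embedding"], query_embedding)
            return (w_recency * recency
                    + w_importance * m["importance"]
                    + w_relevance * relevance)
        return sorted(memories, key=score, reverse=True)[:k]

The design choice that matters critically is that memory, not dialogue, is what the system authors: the agent's consistency over time depends entirely on what this retrieval step surfaces.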
The practical implementation of this capability in commercial game contexts has been pursued by companies including Inworld AI, Convai, and Artificial Agency, each of which provides middleware that connects large language models to game engines such as Unreal and Unity. These platforms allow designers to define a character’s personality, knowledge, motivations, and constraints in natural language, and the character then generates responses to player interaction without scripted dialogue. At GDC 2024, Inworld and NVIDIA demonstrated Covert Protocol, a detective game in which the player interrogates suspects — each with unique knowledge, personality, and motivations — whose responses were generated entirely in real time. No player interaction was anticipated by the writers; every conversation was unique.
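What such middleware does, at its simplest, resembles the hypothetical sketch below: a designer's natural-language character sheet is compiled into a constrained generation call. Nothing here reflects Inworld's or Convai's actual APIs; the character, the prompt wording, and call_llm are all invented for illustration.

    # `call_llm` stands in for whatever model endpoint a production uses.
    CHARACTER_SHEET = """
    Name: Rosa Vane. Occupation: dockside informant.
    Knows: the shipment arrived on Tuesday. Does NOT know who paid for it.
    Never reveals: the location of the safehouse.
    Voice: terse, wary; deflects personal questions.
    """

    def respond(player_line, history, call_llm):
        prompt = (
            "You are playing the character below. Stay in character, answer "
            "only from the knowledge listed, and deflect politely when asked "
            "about anything marked 'Never reveals'.\n"
            + CHARACTER_SHEET
            + "Conversation so far:\n" + "\n".join(history)
            + "\nPlayer: " + player_line + "\nRosa:"
        )
        return call_llm(prompt)

Note what the designer controls and what they do not: the sheet and the framing instructions are authored; every sentence Rosa actually speaks is generated.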
The NVIDIA ACE suite, described in the tools page, extends this capability to include voice synthesis, facial animation driven by Audio2Face, and the full stack of perception, cognition, and expression required for a character that can listen, understand, and respond both verbally and physically without scripted content. The PUBG Ally system demonstrated at CES 2025 ran a small language model locally on player hardware to create an AI teammate capable of real-time tactical communication and action. These are not research demonstrations; they are deployed or near-deployed production systems.
The problems that remain: hallucination, control, and the author’s problem
The adoption of LLM-powered characters in commercial AAA production has been slower than the technology’s capability might suggest, and the reasons are worth examining carefully because they are not merely commercial or technical — they are creative and critical.
The most immediate practical problem is hallucination: large language models can generate responses that are plausible-sounding but factually incorrect or narratively inconsistent. In a customer service chatbot, a hallucinated response is an embarrassment. In a game character who is supposed to be the keeper of a specific story secret, a hallucinated response can destroy the narrative. A character who tells the player something that the game later contradicts — because the model generated it rather than the writer wrote it — breaks not merely immersion but the fundamental contract of authored fiction. The Truman moment produced by a hallucinating LLM character is different in kind from the Truman moment produced by a crowd AI with a limited behavioural repertoire: one reveals the poverty of a system’s state space, the other reveals the absence of a mind behind the words.
The deeper problem is one the guild is particularly placed to identify: the author’s problem. Scripted dialogue is authored. Every line a character speaks in a conventional game was written by a writer, directed, recorded, and placed in context. The writer knew what the character knew, what the character wanted, and what the character would and would not say. The character was, in the fullest sense, designed. An LLM character is not authored in this sense; it is parameterised. The designer specifies a character’s traits and constraints; the model generates the behaviour within them. The resulting character may be surprising, dynamic, and responsive in ways that scripted characters cannot be — but it may also say things the designer would not have approved, take positions the narrative cannot support, or generate content that is inappropriate or inconsistent without any human having decided that it should.
This is not an argument against LLM characters. It is an argument for understanding what they are and what they are not. They are generative systems operating within designed parameters. The creative agency within those parameters belongs to the model rather than to the human author, to an extent no scripted system has ever forced critics to confront. The guild’s critical criteria address this question directly — it is precisely the kind of distributed authorship that the criteria require reviewers to describe honestly rather than attribute to a single creative source.
What the guild is watching
The applied AI techniques of the classical era — behaviour trees, GOAP, navmeshes — produce characters whose behaviour is fully accounted for by their design. Everything they do was anticipated and specified by a human. The neural techniques of the intermediate era — motion matching, deep reinforcement learning — produce characters whose specific behaviours emerge from training rather than specification, but whose parameters were set by humans and whose outputs remain within predictable ranges. The generative techniques of the current era — LLM-driven dialogue, autonomous agent architectures — produce characters whose specific outputs were not anticipated by their designers and may not be reproducible.
For the critic, this means that the question of what a performance is has become genuinely uncertain in a way it was not before. A scripted character’s performance can be evaluated by reference to its design: did it do what it was designed to do, and was what it was designed to do expressive? A generative character’s performance cannot be fully evaluated in this way, because what it does in any specific encounter was not designed. The evaluation must be of the system’s emergent behaviour as a whole — the character’s consistency, its expressiveness across the range of encounters, the quality of its responses to the unexpected — rather than of specific authored moments.
This is, in a sense, a more interesting critical problem than the one the guild was founded to address. It is also a harder one. The guild does not yet have fully adequate tools for it. Building those tools — a critical vocabulary adequate to evaluating generative performance rather than authored performance — is, the guild believes, one of the most pressing tasks in game criticism.
Page substantially revised May 2026 by Mnemion. The history of classical game AI draws on academic literature and game developer community documentation, including Jeff Orkin’s published account of GOAP in F.E.A.R. The sections on LLM characters draw on Park et al. (2023), NVIDIA developer documentation, and industry analysis from Naavik and the Curiouser Institute. The problems section reflects Mnemion’s own critical assessment.