Version 0.9.3 is Released!
The latest update includes agentic capabilities, improved semantic analysis, and more robust background handling for multiple NPCs, among other changes:
Intelligent Memory Retrieval (TF-IDF Scoring)
Memory retrieval has been completely overhauled. The Brain now uses TF-IDF (Term Frequency–Inverse Document Frequency) scoring to find the most contextually relevant memories for each conversation, replacing the previous keyword matching system.
What this means for your game: When a player mentions “the merchant who cheated me,” the Brain now correctly surfaces the memory about that specific event, even if the memory uses different words like “shopkeeper” or “swindled.”* Common words like “the” and “was” are automatically deprioritized, while rare, meaningful words carry more weight.
How it works: Every time the Brain retrieves memories for the LLM prompt, it builds a statistical vocabulary from the NPC’s full memory set and scores each memory against the current conversation using cosine similarity. This runs entirely on CPU in microseconds.
What changed: GetRankedMemories now blends two signals: 40% decay/importance (how recent and significant the memory is) and 60% TF-IDF relevance (how semantically related it is to the current conversation). The result is memories that are both timely and topically relevant.
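For the curious, the blend can be sketched in a few lines of Python. The plugin itself is C++, and everything below except the 40/60 split and the use of cosine similarity over TF-IDF vectors is an illustrative assumption:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build TF-IDF vectors over a small corpus of tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))                      # document frequency per term
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # smoothed IDF
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs], idf

def cosine(a, b):
    """Cosine similarity between two sparse term→weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_memories(memories, query_tokens, decay_scores):
    """Blend 40% decay/importance with 60% TF-IDF relevance, as described
    for GetRankedMemories. Tokenization here is a plain split()."""
    vecs, idf = tfidf_vectors([m.split() for m in memories])
    qtf = Counter(query_tokens)
    qvec = {t: tf * idf.get(t, 0.0) for t, tf in qtf.items()}
    scored = [(0.4 * decay + 0.6 * cosine(vec, qvec), mem)
              for mem, vec, decay in zip(memories, vecs, decay_scores)]
    return sorted(scored, reverse=True)
```

With equal decay scores, a memory sharing rare terms with the query outranks one sharing none, which is the behavior described above.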
Built-in stemming automatically handles word variants. “Running,” “ran,” and “runs” all match against each other. “Betrayed,” “betrayal,” and “betraying” all resolve to the same root. This works out of the box with no configuration required. English is currently supported, with additional language configuration planned for the future.
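The plugin’s stemmer itself isn’t shown here, but the idea can be illustrated with a deliberately naive suffix-stripper. (Irregular forms like “ran” → “run” require a real Porter-style stemmer or lemmatizer, which is presumably what ships with the plugin.)

```python
def naive_stem(word: str) -> str:
    """Toy suffix-stripping stemmer, for illustration only. A production
    stemmer handles many more suffixes plus irregular forms like ran/run."""
    word = word.lower()
    for suffix in ("ing", "al", "ed", "s"):
        # Only strip if a reasonable root (>= 3 chars) remains.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
```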
No changes needed to existing projects. The upgrade is fully backwards-compatible.
*When combined with a Concept Library
Concept Library (Synonym-Aware Matching)
A new optional data asset, UPersonicaConceptLibrary, lets you define synonym groups for your game’s vocabulary. This gives the memory system awareness of domain-specific relationships that pure word matching cannot capture.
Example: Map “angry,” “furious,” “enraged,” and “wrathful” to CONCEPT_ANGER. Map “merchant,” “shopkeeper,” “vendor,” and “trader” to CONCEPT_TRADE. Now a memory about an “angry merchant” matches a conversation about a “furious vendor” because both resolve to CONCEPT_ANGER + CONCEPT_TRADE.
How to use:
- Create a Data Asset of type PersonicaConceptLibrary in the Content Browser
- Add entries to the WordToConceptMap (word → concept tag)
- Assign the library to your Brain Component’s new Concept Library field
The system works without a concept library (pure TF-IDF + stemming). Adding one improves matching quality for your specific game’s vocabulary. We recommend starting with 50–100 entries covering your game’s key nouns, emotions, and roles.
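As a sketch of how concept expansion can feed the scorer, the entries below mirror the CONCEPT_ANGER / CONCEPT_TRADE example above; the plugin’s actual expansion logic is an assumption here:

```python
# Hypothetical word→concept map, mirroring the WordToConceptMap example above.
WORD_TO_CONCEPT = {
    "angry": "CONCEPT_ANGER", "furious": "CONCEPT_ANGER",
    "enraged": "CONCEPT_ANGER", "wrathful": "CONCEPT_ANGER",
    "merchant": "CONCEPT_TRADE", "shopkeeper": "CONCEPT_TRADE",
    "vendor": "CONCEPT_TRADE", "trader": "CONCEPT_TRADE",
}

def expand_with_concepts(tokens, concept_map=WORD_TO_CONCEPT):
    """Append concept tags so synonyms share vocabulary during scoring:
    'angry merchant' and 'furious vendor' then overlap on two concepts."""
    expanded = list(tokens)
    for token in tokens:
        tag = concept_map.get(token.lower())
        if tag:
            expanded.append(tag)
    return expanded
```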
General-Purpose Text Scorer (UPersonicaTextScorer)
The TF-IDF engine powering memory retrieval is exposed as a standalone Blueprint-callable utility class. Use it anywhere you need to rank text by relevance.
Available methods:
- BuildCorpus — Index an array of strings (memories, lore entries, quest descriptions, dialogue lines)
- ScoreQuery — Rank all indexed strings against a search query; returns sorted results
- ScorePair — Quick one-off comparison between two strings (no corpus needed)
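The call shape of the three methods can be mocked up in Python. The method names below map onto BuildCorpus, ScoreQuery, and ScorePair; the scoring math here is plain bag-of-words cosine, not necessarily the plugin’s exact weighting:

```python
import math
from collections import Counter

class TextScorer:
    """Toy stand-in for UPersonicaTextScorer: build_corpus ≈ BuildCorpus,
    score_query ≈ ScoreQuery, score_pair ≈ ScorePair."""

    def build_corpus(self, entries):
        self.entries = list(entries)

    def score_query(self, query):
        # Rank every indexed entry against the query, best first.
        return sorted(((self.score_pair(query, e), e) for e in self.entries),
                      reverse=True)

    @staticmethod
    def score_pair(a, b):
        # One-off comparison; no corpus required.
        ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
        dot = sum(n * cb[t] for t, n in ca.items())
        na = math.sqrt(sum(n * n for n in ca.values()))
        nb = math.sqrt(sum(n * n for n in cb.values()))
        return dot / (na * nb) if na and nb else 0.0
```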
Use cases beyond memory retrieval:
- Data table selection: Score rows against NPC context to dynamically pick the most relevant scenario, quest, or dialogue
- Lore injection: Index your lore database, then pull the most relevant entries into the LLM prompt based on the current conversation
- Dialogue fallback for LOD-Far NPCs: Select the best pre-written line for NPCs not using the LLM, based on conversation context
- Item relevance: Find inventory items related to the current topic so NPCs can reference them naturally
All scoring supports optional stemming and concept library expansion.
Context Actuator (LLM Write Path)
The Context Interface now supports bidirectional data flow. In addition to reading game state into prompts, NPCs can now write data back to the game world through LLM-generated context mutations.
New interface methods:
- SetContextForTag(Tag, Value) — Write a value to a context tag
- GetWritableTags() — Return which tags this actor allows the LLM to modify (empty = all writable)
How it works: When an LLM response includes a context_mutations array in its JSON output, the Brain processes each entry and routes it to the appropriate context source. The target actor validates the write against its writable tags and either accepts or rejects it.
```json
{
  "npc_opening_statement": "The grain prices must rise...",
  "context_mutations": [
    { "tag": "GRAIN_PRICE", "value": "15" },
    { "tag": "RUMOR_MARKET", "value": "Prices expected to climb further" }
  ]
}
```
New Blueprint events:
- OnContextMutated(Tag, Value) — Fires on the Brain Component whenever a context mutation is successfully applied. Use this to trigger UI updates, game logic, or chain reactions.
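The routing and validation flow can be modeled in a few lines of Python. The source class and routing function below are hypothetical stand-ins for the C++ objects; only the writable-tags semantics and the OnContextMutated event come from the text above:

```python
class EconomySource:
    """Hypothetical context source: allows price/rumor writes, protects the rest."""

    def __init__(self):
        self.data = {"GRAIN_PRICE": "10"}

    def get_writable_tags(self):
        # Non-empty list = only these tags may be written; empty = all writable.
        return ["GRAIN_PRICE", "RUMOR_MARKET"]

    def set_context_for_tag(self, tag, value):
        self.data[tag] = value
        return True


def apply_context_mutations(response, sources, on_context_mutated):
    """Route each context_mutations entry to the first source that accepts it,
    firing the OnContextMutated event on success."""
    applied = []
    for mutation in response.get("context_mutations", []):
        tag, value = mutation["tag"], mutation["value"]
        for source in sources:
            writable = source.get_writable_tags()
            if writable and tag not in writable:
                continue  # this source protects the tag; try the next one
            if source.set_context_for_tag(tag, value):
                on_context_mutated(tag, value)
                applied.append(tag)
                break
    return applied
```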
Registering context sources:
InventorySource Deprecation: The InventorySource property on the Brain Component is now deprecated. Existing connections are automatically migrated to the unified ContextSources system on BeginPlay — your project will continue to work without changes, but you’ll see a log warning encouraging you to switch to RegisterContextSource(). New projects should use RegisterContextSource() exclusively.
Safety: The GetWritableTags method gives developers full control over what the LLM can modify. A lore manager might allow writes to biography tags but protect world history. An economy actor might allow price adjustments but prevent direct currency manipulation. Return an empty array to make all tags writable, or return specific tag names to restrict access.
Dynamic Tag Resolver: Context Sources Now Support Read Path
Previously, actors registered via RegisterContextSource() could only receive writes from LLM context mutations. Custom prompt tags like {SHOP_STATUS} would only resolve if the Brain’s owner actor implemented IPersonicaContextInterface directly, or if a Blueprint delegate was bound to OnResolveContextTag. Registered context sources are now queried during prompt tag resolution. Any actor registered via RegisterContextSource() can provide values for custom tags through its GetContextForTag implementation. The resolution order is: Blueprint delegate → registered context sources → owner actor.
This means you can now register a shop actor, economy manager, or any other context provider and use its tags in prompt templates without the owner actor needing to know about them.
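The resolution order can be sketched as a simple chain. Dicts and a callable stand in for the actual UE objects, and `source.get` stands in for GetContextForTag:

```python
def resolve_context_tag(tag, delegate, sources, owner):
    """Prompt-tag resolution in the order stated above:
    Blueprint delegate → registered context sources → owner actor."""
    if delegate is not None:
        value = delegate(tag)
        if value is not None:
            return value
    for source in sources:
        value = source.get(tag)  # stand-in for the source's GetContextForTag
        if value is not None:
            return value
    return owner.get(tag)
```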
Flexible Profile Data Dictionaries
The UPersonicaProfile now natively implements the Context Interface and includes generic data dictionaries, allowing developers to expand NPC profiles without writing custom C++ structs.
- Genre-Agnostic Variables: Add custom data to your NPCs (e.g., job titles, ages, or status effects) using the new StringMetadata, IntegerStats, and StatusTags properties directly in the Profile data asset.
- Native Actuator Integration: Because the Profile acts as a registered Context Source, the LLM can seamlessly read these custom variables into its prompt and dynamically update them via context_mutations.
- Granular Write Protection: A new WritableDataTags array gives developers strict control over which specific variables the LLM is allowed to overwrite, protecting read-only lore while allowing safe, dynamic status updates.
Flash Attention Toggle
Flash Attention (-fa) for the local inference server is now configurable through Project Settings and at runtime. Disabled by default.
Project Settings: Global LLM Configuration → Local Server → Enable Flash Attention
Runtime (for player-facing settings menus):
- UPersonicaSettings::SetFlashAttentionEnabled(WorldContext, bEnabled) — Toggle flash attention and restart the server
- UPersonicaSettings::IsFlashAttentionEnabled() — Query the current state to initialize UI toggles
The setting persists across sessions via SaveConfig.
JSON Resilience & Error Recovery
The Brain’s response parsing pipeline is now more resilient to model hallucinations or truncation errors that previously caused NPCs to hang in a “Thinking” state.
JSON Sanitizer/Soft Parse:
- The Brain now includes an automated sanitizer that attempts to repair malformed JSON before it reaches the parser.
- It automatically closes dangling braces ({}), terminates unclosed quotes, and strips illegal Markdown code fences (like ```json) that models sometimes include despite instructions.
- What this means: If an LLM hits its token limit and cuts off mid-sentence, the Sanitizer will “soft close” the JSON, allowing the game to still extract the partial dialogue instead of rejecting the entire response. Likewise, if an LLM closes its JSON incorrectly, the Sanitizer repairs the output so the response can still be parsed.
- To toggle: select a Personica Brain Component and check Enable JSON Sanitizer.
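A minimal version of this kind of soft repair fits in a dozen lines of Python. The plugin’s exact repair rules are not public, so treat this as a sketch of the idea, not the implementation:

```python
import json
import re

def soft_repair_json(raw):
    """Best-effort repair of truncated or fence-wrapped LLM JSON."""
    text = raw.strip()
    # Strip Markdown code fences like ```json ... ```
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text).strip()
    # Terminate an unclosed string: an odd count of unescaped quotes.
    if len(re.findall(r'(?<!\\)"', text)) % 2 == 1:
        text += '"'
    # Soft-close any dangling brackets, then braces.
    text += "]" * max(0, text.count("[") - text.count("]"))
    text += "}" * max(0, text.count("{") - text.count("}"))
    return json.loads(text)
```

On a response that was cut off mid-sentence, this still recovers the partial dialogue instead of failing the whole parse.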
Brute Force Regex Fallback:
As a last resort, the Brain can now bypass the JSON parser entirely using a regex-based extraction method.
- If the JSON is so mangled that the Sanitizer cannot fix it, the Brain will scan the raw text for the dialog_line key (or your custom template key) and attempt to “brute force” the text out of the mess.
- What this means: Even if the model outputs 500 words of garbage, as long as a valid dialogue string appears anywhere in the response, the NPC will still speak it. Disabled by default.
- To toggle: select a Personica Brain Component and check Enable Regex Fallback.
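The extraction itself is the kind of pattern below; the actual regex the plugin uses is an assumption here:

```python
import re

def brute_force_dialog(raw, key="dialog_line"):
    """Last-resort extraction: scan mangled output for the dialogue key's value.
    Tolerates a missing closing quote (truncated output) and surrounding junk."""
    match = re.search(rf'"{re.escape(key)}"\s*:\s*"((?:[^"\\]|\\.)*)', raw)
    return match.group(1) if match else None
```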
Multi-NPC Stability Improvements
Significant improvements to the streaming pipeline when multiple NPCs generate responses simultaneously:
- Global Generation IDs: All Brain Components now share a single monotonic ID counter, preventing ID collisions when multiple NPCs dispatch requests in the same frame
- Sequence ID Guards: The LocalLLMManager’s full streaming pipeline (polling, chunk processing, completion) uses sequence IDs to prevent cross-contamination between NPC requests
- Buffer Flush on Completion: When an HTTP stream completes, remaining buffered tokens are flushed immediately to the Brain’s DialogBuffer before the JSON parser runs, preventing partial-buffer parse failures
- Spawn Stagger: Brains that auto-start on BeginPlay now stagger their initial requests using a sequential index, reducing the burst load on the local server
- Cancel Safety: CancelGeneration now guards against clobbering a job that has already completed, preventing state corruption when snip timing overlaps with natural HTTP completion
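The generation-ID pattern behind the first two items is simple to model. This is an illustrative Python stand-in, not plugin code:

```python
import itertools

# Single shared monotonic counter, as with the global generation IDs above.
_next_generation_id = itertools.count(1)

class BrainJob:
    """Minimal model of the stale-response guard."""

    def __init__(self):
        self.generation_id = None
        self.result = None

    def start(self):
        # Take a globally unique, monotonically increasing ID per request.
        self.generation_id = next(_next_generation_id)
        return self.generation_id

    def complete(self, generation_id, text):
        # Drop completions whose ID no longer matches: they belong to a
        # cancelled or superseded request, so they must not clobber state.
        if generation_id != self.generation_id:
            return False
        self.result = text
        return True
```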
Additional Changes
- Live Monitor Enhancements: The Personica Debugger (SPersonicaDebugger) has been updated to track and display real-time, comma-separated readouts of the new StatusTags, IntegerStats, and StringMetadata dictionaries within the Traits foldout.
- Llama.cpp updated to a more recent version (b8646).
Upgrade Notes
- No breaking changes. All new features are additive. Existing projects compile and run without modification.
- Concept Library is optional. Memory retrieval improves immediately from TF-IDF + stemming alone. Add a concept library when you want synonym awareness.
- Context Actuator is opt-in. The context_mutations JSON field is only parsed if present. Existing prompt templates that don’t include it are unaffected.
- InventorySource is deprecated. Existing connections auto-migrate at runtime. Switch to RegisterContextSource() at your convenience. The {INVENTORY} tag continues to work through the dynamic tag resolver.