THIS PRODUCT IS NOT INTENDED TO REPLACE HUMAN ART OR WRITING, ONLY TO ENHANCE IT.
One plugin, three brains. In Project Settings → Cometforge NPC you pick the mode — no code:
• On-Device — an LLM runs inside your packaged game via embedded llama.cpp (Vulkan). No cloud, no separate process, no API key. Ministral 3 3B Instruct (~2 GB, Apache-2.0) ships embedded in the plugin, so it works offline out of the box.
• Cloud — route to any OpenAI-compatible API with your key (requires internet at runtime).
• BYOM — point it at your own local server (LM Studio, llama-server, etc.).
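For readers wondering what "OpenAI-compatible" means in practice for Cloud and BYOM: both modes talk to a server that accepts a chat-completions request, typically a POST to a path ending in /v1/chat/completions (llama-server and LM Studio both expose one). The plugin builds this for you; the sketch below only illustrates the shape of such a request body. JsonEscape, BuildChatRequest, and the model name are hypothetical names for illustration, not part of the plugin.

```cpp
#include <cstdio>
#include <string>

// Escape a string so it can sit inside a JSON string literal.
std::string JsonEscape(const std::string& in) {
    std::string out;
    out.reserve(in.size() + 8);
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters use the \u00XX form.
                    char buf[7];
                    std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Build a minimal chat-completions body: one system message (the NPC persona)
// followed by one user message (what the player said).
// Hypothetical helper for illustration; not the plugin's API.
std::string BuildChatRequest(const std::string& model,
                             const std::string& persona,
                             const std::string& playerLine) {
    return "{\"model\":\"" + JsonEscape(model) + "\",\"messages\":["
           "{\"role\":\"system\",\"content\":\"" + JsonEscape(persona) + "\"},"
           "{\"role\":\"user\",\"content\":\"" + JsonEscape(playerLine) + "\"}]}";
}
```

Any server that accepts a body of this shape should work as a Cloud or BYOM target; cloud endpoints additionally expect your API key in the request headers.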
Under all three sits the same 4-tier memory: working, episodic, semantic, and procedural, with salience, decay, and consolidation, all saved to disk. Tell an NPC something, quit the game completely, come back tomorrow, and they remember.
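To make salience, decay, and consolidation concrete, here is a minimal sketch of how such a memory could behave. It is not the plugin's implementation: the half-life decay curve, the rule that strong episodic memories are promoted to semantic, and every name and threshold below are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

enum class Tier { Working, Episodic, Semantic, Procedural };

struct MemoryItem {
    std::string text;
    Tier tier;
    double salience;     // importance when stored, 0..1 (assumed scale)
    double lastTouched;  // game time in hours when last stored or recalled
};

// Effective strength: salience halves every halfLifeHours (assumed decay curve).
double Strength(const MemoryItem& m, double now, double halfLifeHours) {
    assert(halfLifeHours > 0.0);
    const double age = std::max(0.0, now - m.lastTouched);
    return m.salience * std::pow(0.5, age / halfLifeHours);
}

// Consolidation pass (assumed rule): episodic items that are still strong
// become semantic facts; episodic items that have faded are forgotten.
void Consolidate(std::vector<MemoryItem>& items, double now,
                 double halfLifeHours, double promoteAt, double forgetBelow) {
    for (auto& m : items) {
        if (m.tier == Tier::Episodic && Strength(m, now, halfLifeHours) >= promoteAt) {
            m.tier = Tier::Semantic;
        }
    }
    items.erase(std::remove_if(items.begin(), items.end(),
                               [&](const MemoryItem& m) {
                                   return m.tier == Tier::Episodic &&
                                          Strength(m, now, halfLifeHours) < forgetBelow;
                               }),
                items.end());
}
```

In this sketch, a vivid event stored with salience 0.9 survives a day and is promoted, while small talk stored at 0.2 fades and is dropped. The real tiers may use different rules and thresholds.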
Memory that respects your save games. One settings toggle: Persistent (NPCs never forget — zero config) or Save Game (memory follows your save slots — call Save/Load Memory To Slot beside your own save nodes). Reload an earlier save and NPCs know exactly what they knew then — nothing from the future, nothing from another playthrough. Reset All Memory gives you a clean New Game. All local, no account, no server, no session IDs.
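The Save Game behavior described above can be pictured as a snapshot per slot: saving copies what NPCs currently know, and loading replaces current knowledge with that copy, so nothing learned later leaks back. The sketch below is an in-memory illustration under that assumption. NpcMemoryStore and its methods are hypothetical names, not the plugin's API, and the real plugin writes to disk.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory model of slot-scoped NPC memory.
class NpcMemoryStore {
public:
    // Record a fact for one NPC in the live (current playthrough) memory.
    void Remember(const std::string& npc, const std::string& fact) {
        live_[npc].push_back(fact);
    }

    // Everything the NPC currently knows; empty if the NPC is unknown.
    std::vector<std::string> Recall(const std::string& npc) const {
        auto it = live_.find(npc);
        return it == live_.end() ? std::vector<std::string>{} : it->second;
    }

    // Snapshot the live memory under a slot name, overwriting any older snapshot.
    void SaveToSlot(const std::string& slot) { slots_[slot] = live_; }

    // Replace the live memory with the slot's snapshot. Returns false if the
    // slot does not exist, in which case the live memory is left untouched.
    bool LoadFromSlot(const std::string& slot) {
        auto it = slots_.find(slot);
        if (it == slots_.end()) return false;
        live_ = it->second;
        return true;
    }

    // Clean New Game: forget everything in live memory (slots are kept).
    void ResetAll() { live_.clear(); }

private:
    using Memory = std::map<std::string, std::vector<std::string>>;
    Memory live_;
    std::map<std::string, Memory> slots_;
};
```

Because loading replaces the live memory wholesale rather than merging, an NPC restored from an earlier slot knows only what it knew when that slot was saved.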
Drop-in or bring your own UI. Place the ready-made CometforgeDemoNPC actor in any level and press Play: walk up, an "[E] Talk" prompt appears, press E, and talk, with zero wiring. Or add the Cometforge Dialogue widget to your own UI, or skip our UI entirely and call Talk / TalkAsync from your existing dialogue system. The brain is UI-agnostic, and TalkAsync delivers the reply through its On Reply event, so your game keeps running while the model thinks.
Blueprint nodes: Talk, Talk Async (with On Reply event), Open/Close Dialogue, Remembered, Save All, Save/Load/Delete Memory To Slot, Reset All Memory, Get/Restore Memory Snapshot.
A demo map is included in the plugin (/CometforgeNPC/Maps/CometforgeDemo) — engine content only, with Bram the innkeeper ready to talk.
Requirements & honesty: On-Device mode needs a Vulkan-capable GPU; the bundled 3B model runs comfortably alongside a 3D game on most modern gaming PCs (≈8 GB VRAM and up). Cloud and BYOM modes need no local GPU headroom. Supported platform: Win64, UE 5.7. A caveat on accuracy: memory storage is exact, but language models generate text, so a small model can improvise beyond what it has stored; KNOWNISSUES.md ships in the plugin's Docs folder and says so plainly. Building for memory-tight platforms like Xbox Series S, or want a custom-trained model sized to your budget? Contact us.
Support: travis@tntholley.com