Cactus AI Framework Plugin for UE5.6 for Local LLM & VLM as Generative AI

eray_ozr · August 8, 2025, 9:48pm

Cactus AI is a framework to run local LLM, VLM and TTS on your computer. It is especially good for low powered, edge devices such as mobile phones but it works on desktop platforms, too. This is one of the reasons why it is a good choice for UE5.

So, I integrated it to UE5.6 and share it with you guys.

Features:

Single question
Multi-turn conversations
Image processing
Conversation export for persistent conversations.
Blueprint exposed

Use Cases:

You can talk with your NPCs just like humans.
Digital twin / factory simulation projects that can analyze runtime generated data.
AR & MR Projects that can understand surroundings. (It will be good with Quest 3)

Roadmap:

Currently it supports Windows but I will include Android, too.
It is work-in-progress state. Expect some problems !

Limitations:

Cactus AI doesn’t support buffer import for processing. It requires actual files. (I tried virtual file mapping and it didn’t work) So, you need paths. I shared another plugin to export images. So, you will be fine and also I opened an issue about it. They said they will look at it with next updates.
I don’t have Apple and Linux devices. So, I will only support Android and Windows
But library itself supports Apple ecosystem.
There will be no support about UE4 or older versions than 5.6.

Dependencies

Library itself is here. I did some changes in order solve MSVC related problems and created pull request. You can solve them if you want to build library by yourself and good at C++ but probably you won’t need because I will update it if there is an important update.
GitHub - cactus-compute/cactus: Run AI locally on phones, wearables and AI-native hardware
nlohmann json: Already included. Framework requires json to format prompt. Actually we can do it with Unreal’s json implementation (for example LLM side) but if we need to ask something to Cactus team, using generic codes will makes everyone’s job easier. For this reason I used nlohmann for VLM sideç But you can delete it if you want. In that case just remember to implement your own json system. You can use LLM side as sample.
Extended Variables: One of my other plugins. It allows you to extract texture buffers and export them. GitHub - Frozen-Projects/ExtendedVars: UE5 plugin for converting some variables to each other, exporting bitmaps, runtime font load, getting widgets as textures and getting color arrays of textures.

I can help you if you have “problems” about Unreal side but for framework related problems, ask them to Cactus’ github repo and I don’t offer training/tutorials about your project.

Plugin have sample blueprints in plugin’s content folder. You can look at them to understand workflow.

I am NOT affiliated with Cactus AI team.

Carlos_ECarvalho · August 18, 2025, 2:45pm

Hi! I liked a lot your project, I´m trying to learn more about AI. Do you have any tips from where to start?

eray_ozr · August 18, 2025, 5:35pm

If your question is about AI itself (such as model choosing, fine tuning and etc.) you should look at Cactus AI’s own discord channel and Github page. I am not in their team. I am just an acquaintance of them.

If your question is about Unreal implementation, you can ask me. But before that I suggest you to look at plugin’s content folder. I put some samples.

Other than that “how can I start to AI” is a vague expression and depends on what you want to do. For example do you want to create AI integrated products or do you want to work on AI itself ?

Carlos_ECarvalho · August 19, 2025, 1:46pm

I wanted to make some NPCs that remembers conversations etc. I saw the sav file, would be great. I´m trying to install the plugin right now, created a new third person template project (5.6.1). It asked to compile and I got an error. Will try again

Update: It compiled! I just updated the visual studio

Carlos_ECarvalho · August 19, 2025, 2:39pm

To be more specific, I wanted an companion npc to be believable. To be able to input some more information (or unlock) in it. Also I wanted to activate it with triggers in the world, so the NPC could “react” to some things without being asked by the player. Another thing is that It has to be light for the processing, since there is aleady a game running. I don´t know if I´m dreaming too big, but that´s what I was thinking

eray_ozr · August 19, 2025, 6:39pm

I already integrated “remembering conversation history”. You just need to use import and export functions.
You can implement “interacting with world without being asked” feature on your own. Use line trace or collision box to detect surroundings. And based on what they see, you have to generate a prompt. There is no template for that. it depends on your imagination.

For example: Greeting command
If Hit Actor = player, get its experience level and use that value in a prompt string with append to have different greetings based on player’s level.

You don’t need to use image processing to understand environment. It will be stupidly expensive and unnecessary.
To trigger something from AI:
There is no “runtime” agent like way because there is no constant or predictable way to do it for every user. You can send an email or do some editor side operations, because workflows, classes, functions and buttons are same for everyone. For example, import something to editor has a specific workflow. only variable is paths. But not in runtime. Runtime mechanics are subjective to its developer and it causes ambiguity.

Let’s assume we solved that. We developed something special only for specific project. You need to be sure that your model will give you correct answer without hallucinations everytime.

If you can solve them (I mean, all function and variable names are ready, LLM is fine tuned… and these are not my or plugin’s job), I have another plugin to allow you to call a function, set a variable or get its values with only string without actual variable reference.

For example, your generated text should be like this.

Call “APlayer::BowAnimation”, set “bending” level to “20”, “print” “Greetings Player” on “UNPCWidget::Conversation_Text”

If you have that string and objects like that, AI can even trigger things (for example animations) on its own but making sure that LLM will generate exactly these strings everytime (with fine tuning or prompts) is your job.

In short, sky is the limit and there is no automatic way. Just mess around.

Carlos_ECarvalho · August 19, 2025, 7:34pm

Thanks a lot! It´s been some busy days studying ai.. very nice!
It´s working, the plugin is working great! great work! The responses are taking a while, around 25 seconds depending on the question. I think the model is not the best for an NPC, I downloaded another one with 1B from huggingface, but it keeps talking about my case or something like that. And now I´m thinking about what would be the best way to make the model talk like the character. I´m searching a chat model in the huggingface, but no luck until now. But I´m happy with the progress!

eray_ozr · August 20, 2025, 10:43pm

You can’t get too much speed with complex personality prompts on local libraries that run on end-user hardwares. You guys get used to Gemini, ChatGPT and Claude. They use billion dollars worth hardwares.

Of course there are ways to make local LLMs faster. but there is nothing to do on Unreal’s side. Cactus team can do it if they see fit with their scope. For example, they can increase usages of hardware specific instruction sets or some advanced hardware accelerated calculation systems.

But these are not Cactus Framework’s job. This is a simple, easy-to-use and implement library solution especially for edge devices when privacy required or there is no network connection.

More dependency means more weight and complexity.

Also, UE5 is very problematic about third party libraries. Because sometimes already included third party libraries cause clashes and we can’t update them without engine modifications. So, you wouldn’t be able to use it.

And I will be honest with you. If I needed to integrate that much libraries to UE5 such as CuBlast and solve all the problems, I wouldn’t share it as a free & open source project. Because my goodness has its limits

You can increase the thread counts, though but you have to be careful with it. Because Unreal already use game thread, render thread, network thread (if your project is multiplayer), audio thread and maybe you can use some allocated threads (such as FRunnables) in the future. So, if you have 8 core 16 thread computer, don’t give more than half.

Also, you can use more advanced models but they need more RAM.

If your game requires clever, philosophical conversations, local LLMs might not be good for you (or at least not with every hardwares). Design your project in a way that weirdness of AI halucinations are good for you. Consume and use that… madness.

CyberAxe · September 1, 2025, 5:21am

Could you make a video guide for us special folk
Using input text to NPC text and text to voice Output…in blueprints

eray_ozr · September 2, 2025, 4:03pm

nope.

I didn’t implement voice output, yet. Also even if I generate “sound” with AI. I don’t know anything about Unreal’s audio system and lipsync. So, I can’t integrate runtime audio import system at short term. Actually that’s why I shared it as “free and open source”. People who have more audio knowledge than me, should fork and improve it.
For text output only, Plugin’s content folder has very straightforward sample. It’s not my job to teach basic blueprints to people. Even if I have an intent like that, these kind of stuffs are very subjective. Everybody wants their own “unique” implementation sample and I don’t have time for that.
A sample workflow for you: Create your widget with a text box or some options as questions. Add my manager class to your NPC class as a child actor component. Or spawn one manager class in scene and get it in your NPC actor.
This is game development. You have to try, break and build things by yourself. Nothing will be ready like lego.