Java in UE4 - Text Extractor Plugin
Unreal Engine 4 integration of Apache Tika™, a Java-based library that detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
Github: https://github.com/Deams51/TextExtractor-UE4
This was originally part of a paid job, but since I couldn’t find a proper example of Java integration in UE4, here it is.
Feedback is more than welcome on the implementation!
Usage
Add the plugin to the Plugins folder of your project.
A single method is then accessible from C++ or Blueprints (BP): “GetTextFromFile”. It takes the path of the file as its argument and returns the parsed content if the extraction succeeds.
If you want to use it from C++, add TextExtractionPlugin to the public dependencies (PublicDependencyModuleNames) in your module's Build.cs file.
Strategy
A first C++ wrapper takes care of creating a Java Virtual Machine (JVM) and loading the Tika Java library via JNI.
This wrapper is compiled into a DLL, which UE4 loads dynamically and calls from a Blueprint library.
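To make the wrapper idea more concrete, here is a minimal sketch of how a C++ layer can start a JVM and call into Java through the JNI invocation API. It is not the plugin's actual code: the jar names, the WITH_JNI switch, and the Java class TextExtractor with a static extract(String) method are all hypothetical. The JNI part only compiles when WITH_JNI is defined and a JDK (jni.h plus the jvm library) is available; the classpath helper is plain C++.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

#ifdef WITH_JNI   // hypothetical opt-in switch: requires jni.h and linking against the jvm library
#include <jni.h>
#endif

// Builds the "-Djava.class.path=..." option that tells the JVM where the jars are.
// The separator is ';' on Windows and ':' on Linux/macOS.
std::string BuildClassPathOption(const std::vector<std::string>& Jars, char Separator = ';')
{
    assert(Separator != '\0');
    std::string Option = "-Djava.class.path=";
    for (std::size_t i = 0; i < Jars.size(); ++i)
    {
        if (i > 0) Option += Separator;
        Option += Jars[i];
    }
    return Option;
}

#ifdef WITH_JNI
// Starts a JVM inside the current process. Returns nullptr on failure.
JNIEnv* StartJvm(const std::vector<std::string>& Jars, JavaVM** OutVm)
{
    // Must stay alive until JNI_CreateJavaVM returns.
    std::string ClassPath = BuildClassPathOption(Jars);

    JavaVMOption Options[1];
    Options[0].optionString = const_cast<char*>(ClassPath.c_str());
    Options[0].extraInfo = nullptr;

    JavaVMInitArgs Args;
    Args.version = JNI_VERSION_1_8;
    Args.nOptions = 1;
    Args.options = Options;
    Args.ignoreUnrecognized = JNI_FALSE;

    JNIEnv* Env = nullptr;
    jint Result = JNI_CreateJavaVM(OutVm, reinterpret_cast<void**>(&Env), &Args);
    return Result == JNI_OK ? Env : nullptr;
}

// Calls the hypothetical Java method: static String TextExtractor.extract(String path).
// Returns an empty string if the class or method is missing or Java throws.
std::string ExtractText(JNIEnv* Env, const std::string& Path)
{
    jclass Cls = Env->FindClass("TextExtractor");
    if (!Cls) { Env->ExceptionClear(); return ""; }

    jmethodID Method = Env->GetStaticMethodID(
        Cls, "extract", "(Ljava/lang/String;)Ljava/lang/String;");
    if (!Method) { Env->ExceptionClear(); return ""; }

    jstring JPath = Env->NewStringUTF(Path.c_str());
    jstring JText = static_cast<jstring>(Env->CallStaticObjectMethod(Cls, Method, JPath));
    if (Env->ExceptionCheck() || !JText) { Env->ExceptionClear(); return ""; }

    const char* Chars = Env->GetStringUTFChars(JText, nullptr);
    std::string Text(Chars);
    Env->ReleaseStringUTFChars(JText, Chars);
    return Text;
}
#endif
```

In a setup like this, the DLL would call StartJvm once when it is loaded and keep the JavaVM pointer around, since creating a JVM is expensive and normally done only once per process.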
WARNING
Since the plugin runs a Java virtual machine, you will see memory access exceptions when debugging with Visual Studio.
As far as I can tell they are raised and handled by the JVM itself, so you can just continue past them.