Character encoding problem with FString::Contains

Hello
I’m making a search field in UMG. The inputs are a string array and a single search string, and the output is a string array. I just need to find every entry of the input array that contains the search string and return those entries as the output.

The problem appears when I try to lower-case Russian characters: Unreal Engine does not convert them. I found in the documentation that UE4 handles only ANSI. Honestly, I don’t want to write my own loop that maps each capital letter to its small one.

Is there any solution for this situation?


The base encoding unfortunately won’t work with the Russian language.

FString passedText = TEXT("test");
// Converting to UTF-8 and then calling ToLower() is the combination that fails.
FString s(TCHAR_TO_UTF8(*passedText));
s = s.ToLower();

This breaks all non-Latin characters.
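One plausible reason for the breakage, sketched in plain standard C++ rather than with engine types: if the UTF-8 bytes are widened one byte per character (an assumption about what the conversion above ends up doing, not something confirmed in this thread), a single Cyrillic letter turns into two unrelated characters, and no case conversion can repair that. The helper names EncodeUtf8 and WidenBytes are hypothetical.

```cpp
#include <cassert>
#include <string>

// Encode one code point from the Basic Multilingual Plane as UTF-8 bytes.
std::string EncodeUtf8(char32_t cp) {
    assert(cp <= 0xFFFF);  // sketch only: no 4-byte sequences
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Widen every byte into its own character, the way a byte-per-character
// (ANSI-style) conversion would. This is the assumed failure mode.
std::u32string WidenBytes(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) {
        out += static_cast<char32_t>(b);
    }
    return out;
}
```

For U+041F (CYRILLIC CAPITAL LETTER PE), EncodeUtf8 gives the two bytes 0xD0 0x9F, and WidenBytes then yields two separate characters instead of one letter.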

According to the documentation, you would need a converter from ANSI to ISO/IEC 8859-5 and back.

ToUpper() and ToLower() Non-Trivial in Unicode

UE4 currently only handles ANSI (ASCII, code page 1252, Western European).

The least-bad way to do this for all languages seems to be the one described at en.wikipedia.org/wiki/ISO/IEC_8859:

  • ISO/IEC 8859-1 for English, French, German, Italian, Portuguese, and both Spanishes
  • ISO/IEC 8859-2 for Polish, Czech, and Hungarian
  • ISO/IEC 8859-5 for Russian

The mappings from ftp.unicode.org/Public/MAPPINGS/ISO8859/ contain the conversion rules for the above-mentioned languages. The ‘CAPITAL LETTER’ and ‘SMALL LETTER’ entries would be cross-referenced with the corresponding Unicode characters to get the desired result.
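As a rough idea of what such a mapping boils down to for Russian, here is a minimal sketch in plain standard C++ (no engine types). It works on Unicode code points directly instead of loading the mapping files, and it only covers ASCII plus the Cyrillic capitals in the range U+0400 to U+042F. The function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Lower-case a single code point; anything outside the handled ranges is returned unchanged.
char32_t ToLowerCodePoint(char32_t c) {
    if (c >= U'A' && c <= U'Z') return c + 0x20;      // A-Z -> a-z
    if (c >= 0x0410 && c <= 0x042F) return c + 0x20;  // CAPITAL A..YA -> SMALL A..YA
    if (c >= 0x0400 && c <= 0x040F) return c + 0x50;  // e.g. CAPITAL IO (U+0401) -> SMALL IO (U+0451)
    return c;
}

// Lower-case a whole string, one code point at a time.
std::u32string ToLowerString(std::u32string s) {
    for (char32_t& c : s) {
        c = ToLowerCodePoint(c);
    }
    return s;
}

// Usage: the mixed-case Cyrillic letters PE, ER, I followed by " T"
// become their small forms followed by " t".
void Example() {
    assert(ToLowerString(U"\u041F\u0440\u0418 T") == U"\u043F\u0440\u0438 t");
}
```

A real converter for all the languages listed above would need the full tables, since most scripts do not have a constant offset between capital and small letters.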


It sounds good, but I’m not sure my C++ experience is enough to get through this problem.

By the way, I solved this problem in Blueprints by replacing the Russian characters using an array.

Thank you for the answer.

@F1nansist I got an international version working!

Digging through various files and documentation, I stumbled upon FTextTransformer.

You just need to add

#include "Internationalization/TextTransformer.h" 

Then you can try

	FString s(TCHAR_TO_UTF16(*passedText));					
	FString passedTextLow = FTextTransformer::ToLower(s);
	FString passedTextUP = FTextTransformer::ToUpper(s);

where passedText is an FString.

It seems to work with many other languages. I tested it with Polish, and it handles upper case and lower case well. I also did a quick test with some Russian letters, and it seems to handle them well too.


@EpicChris I was wondering whether someone from the staff could add a Unicode-aware comparison like this natively to the Contains function for strings?

Could the native uppercase and lowercase conversions for FText use UTF-16 with FTextTransformer, or at least offer an option (maybe a bool) for Unicode-compliant conversion?


.h

	UFUNCTION(BlueprintCallable, BlueprintPure)
	static FString ToLower(const FString& passedText);

	UFUNCTION(BlueprintCallable, BlueprintPure)
	static FString ToUpper(const FString& passedText);

.cpp

#include "Internationalization/TextTransformer.h" 
FString UTextExtraFunctionLibrary::ToLower(const FString& passedText) {	
	FString s(TCHAR_TO_UTF16(*passedText));					
	FString passedTextLow = FTextTransformer::ToLower(s);	
	return passedTextLow;
}

FString UTextExtraFunctionLibrary::ToUpper(const FString& passedText) {
	FString s(TCHAR_TO_UTF16(*passedText));	
	FString passedTextUP = FTextTransformer::ToUpper(s);	
	return passedTextUP;
}

Yes, it works in UE4. Thank you a lot.

@F1nansist Here is a case-sensitive version of find and contains:

bool UTextExtraFunctionLibrary::ContainsCS(const TArray<FString>& fullArray, const FString& passedText) {
	FString s = TCHAR_TO_UTF16(*passedText);

	// True if any element is exactly equal to passedText (case-sensitive).
	for (int32 i = 0; i < fullArray.Num(); i++) {
		FString e = TCHAR_TO_UTF16(*fullArray[i]);
		if (e.Compare(s, ESearchCase::CaseSensitive) == 0) {
			return true;
		}
	}
	return false;
}

int32 UTextExtraFunctionLibrary::FindCS(const TArray<FString>& fullArray, const FString& passedText) {
	FString s = TCHAR_TO_UTF16(*passedText);

	// Index of the first element exactly equal to passedText, or -1 if there is none.
	for (int32 i = 0; i < fullArray.Num(); i++) {
		FString e = TCHAR_TO_UTF16(*fullArray[i]);
		if (e.Compare(s, ESearchCase::CaseSensitive) == 0) {
			return i;
		}
	}
	return -1;
}

Find and contains could probably be combined into one node; you’d just have to return a tuple<bool, int32> (if I remember right, it’s TTuple in UE).
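A minimal sketch of that combined lookup, written in plain standard C++ so it can be tried outside the engine. It uses std::pair in place of the TTuple mentioned above and std::u16string in place of FString; the name FindExact is hypothetical, and whether such a return type can be exposed to Blueprints is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Returns {found, index}. The index is -1 when nothing matches, like FindCS above.
std::pair<bool, int> FindExact(const std::vector<std::u16string>& items,
                               const std::u16string& target) {
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (items[i] == target) {  // exact, case-sensitive comparison
            return {true, static_cast<int>(i)};
        }
    }
    return {false, -1};
}

// Usage: one call answers both "is it there?" and "where?".
void Example() {
    const std::vector<std::u16string> items = {u"alpha", u"beta"};
    const std::pair<bool, int> hit = FindExact(items, u"beta");
    assert(hit.first && hit.second == 1);
}
```

Since the index alone already encodes the answer (-1 means "not found"), the bool is a convenience for callers rather than extra information.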

By the way, this is what my search over a string array by an input string looks like:

#include "Internationalization/TextTransformer.h"
#include "Algo/Copy.h"

void UFindUserPluginBPLibrary::CPPFindMatchesBySubstring(TArray<FString> InArray, FString TargetWord, bool IgnoreCase, TArray<FString>& FoundMatches)
{
	if (IgnoreCase)
		TargetWord = FTextTransformer::ToLower(TargetWord);

	Algo::CopyIf(InArray, FoundMatches, [&](const FString str) {
		FString temp(TCHAR_TO_UTF16(*str));
		if (IgnoreCase)
			temp = FTextTransformer::ToLower(temp);
		return temp.Find(TargetWord, ESearchCase::IgnoreCase) != INDEX_NONE;
	});
}

@F1nansist
You have a bug in your code: the search always uses ESearchCase::IgnoreCase, even when IgnoreCase is false. Try:


void UFindUserPluginBPLibrary::CPPFindMatchesBySubstring(TArray<FString> InArray, FString TargetWord, bool IgnoreCase, TArray<FString>& FoundMatches)
{
	if (IgnoreCase)
		TargetWord = FTextTransformer::ToLower(TargetWord);

	Algo::CopyIf(InArray, FoundMatches, [&](const FString& str) {
		FString temp(TCHAR_TO_UTF16(*str));
		if (IgnoreCase) {
			temp = FTextTransformer::ToLower(temp);
			return temp.Find(TargetWord, ESearchCase::IgnoreCase) != INDEX_NONE;
		}
		else {
			return temp.Find(TargetWord, ESearchCase::CaseSensitive) != INDEX_NONE;
		}
	});
}

@3dRaven

I also forgot to convert TargetWord to UTF-16.

Thank you a lot!
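The two points from this exchange (pick the search mode from the flag, and give the search word the same treatment as the array entries) can be sketched in plain standard C++ without engine types. This is only an illustration under stated assumptions: FoldChar is a hypothetical stand-in for FTextTransformer::ToLower that covers just ASCII and the Cyrillic capitals U+0400 to U+042F, and std::u32string stands in for FString.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a Unicode-aware lower-casing routine.
char32_t FoldChar(char32_t c) {
    if (c >= U'A' && c <= U'Z') return c + 0x20;
    if (c >= 0x0410 && c <= 0x042F) return c + 0x20;
    if (c >= 0x0400 && c <= 0x040F) return c + 0x50;
    return c;
}

std::u32string FoldString(std::u32string s) {
    for (char32_t& c : s) {
        c = FoldChar(c);
    }
    return s;
}

// Returns every item that contains target as a substring.
std::vector<std::u32string> FindMatchesBySubstring(const std::vector<std::u32string>& items,
                                                   std::u32string target,
                                                   bool ignoreCase) {
    if (ignoreCase) {
        target = FoldString(target);  // fold the search word once, up front
    }
    std::vector<std::u32string> matches;
    for (const std::u32string& item : items) {
        // Fold a copy for comparison only, so the original spelling is what gets returned.
        const std::u32string haystack = ignoreCase ? FoldString(item) : item;
        if (haystack.find(target) != std::u32string::npos) {
            matches.push_back(item);
        }
    }
    return matches;
}

// Usage: "TEST" is found inside "my test" only when case is ignored.
void Example() {
    const std::vector<std::u32string> items = {U"my test", U"other"};
    assert(FindMatchesBySubstring(items, U"TEST", true).size() == 1);
    assert(FindMatchesBySubstring(items, U"TEST", false).empty());
}
```

Folding both sides with the same function is the important part; the case-sensitive branch compares the strings untouched.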
