What is the performance impact of empty indexes in arrays?

I’m trying to make it work like an address by multiplying it by hundreds of integers depending on the class type, but this make hundreds of empty indices. There are actually only a few allocations, but will there be any performance degradation in For Each Loop etc. due to empty indexes?

It is hard say how much slower they will be without a context because, as always “it depends on the case” but sparse arrays are definitely to be avoided.

Processing information in a continuous manner avoids cache misses which is a main point of optimization for high performance software like games.

Now I’ll not pretend to be an expert on the matter so you are probably better off researching it on your own but the jest is this:
When you read memory you take a big chunk of it in your processor cache (even bigger in the higher level cache and so on). If the next information you process is in that already fetched chunk it is lightning-fast but if it is not, the whole chunk has to be “flushed” before taking the next chunk and the higher level cache flushed the slower it is (all lower levels are flushed too).

So in your case the performance impact depends on how big the gaps are and how often do you trigger those cache misses. How often you try to read, how often you jump etc. A few missing indecies should not be a problem.

There is a map type that associates keys (which can be int) with data and they are still stored continuously.

P.S.
Here we are not talking at all about the amount of memory wasted.

1 Like

For example,Is there a difference in load between an array with just 5 indexes and an array with 100 empty indexes between them? How to measure cache miss frequency?

I don’t plan on using this array that often. I wish there was a better way to sort class types, but for now this is the only way I can think of.

Why don’t you use a Map container for this? :thinking:

1 Like

Maps cannot have same keys.

Two array elements can’t have the same index either. Use the keys of the map sparsely like you would use the indices in your array.

What is the actual thing you are trying to do? Maybe we can help you use the proper container type.

Ultimately the answer to your question is:
“Performance is impacted negatively the larger an array is. By adding empty elements, you reduce your performance without any real benefit. (without storing anything)”

About the cache misses - it’s a really low level thing and it depends on so many highly technical parts like operating system, compiler, processor… I don’t really feel comfortable giving advice there. I just know about it and I know that it is the reason performance is better when your data is tightly packed and you process it in sequence.

How big is the negative impact?

  • It depends on many things and it will be different on different devices and operating systems.

Is there a difference in load in an array with just 5 indexes and an array with 100 empty indexes between them?

  • Yes. Lading times depend on the ultimate size of the array not the amount of empty elements. Empty indexes still allocate memory so 500 element array would take the same time to load (and the same size in memory) regardless if elements are filled or not. The memory is allocated anyway and accessing it is the same for empty and “filled” elements. However, this is ultimately unnecessary memory use and work and there are ways to avoid it. This also applies to arrays of pointers/references but in this case the data in the array is just the memory address. The objects they refer/point to are another matter entirely.

I don’t plan on using this array that often.

  • Well… Does it matter? I’ll guess that you are making a personal or class project (not a commercial product) so you should really do whatever works for you. If you are interested in performance and optimization - all the power to you. It is fascinating how computers work so read and watch all the videos about Cache Misses and memory layout that you can. If you just want to get over it - do it as you can and move on. There are highly skilled professionals whose whole job is software optimization.
2 Likes

Well, all I wanted to know was whether or not it was recommended. It didn’t come from a single issue, but rather a question I personally had for a long time.
I couldn’t find a clear answer, so it’s important to know for this is not recommended.

I cannot go into detail on other points here as they are not relevant to the question. If I have any problems I will post a new question. Thank you for your answer.

when you declare an array with N elements, on the stack, in Unreal ALL arrays and structs are stack allocated (think RAM with a lower likelihood to be shunted to the HD on a overflow)

when a program needs a chunk of memory greater then what is already allocated to it then a call to Malloc (or Alloc on older hardware) is made, with the size of the data. an array of Size(N) (N being the number of bytes) the actually thing sent to Malloc will be a pointer to the memory address the data will be placed at the total size of the object, in this case the array, in raw bytes. The good news is because we are in C++ and not C we don’t need to pessimize the exact number of bytes to make the call.

  • for example if I want an array for 20 ints the actual call to Malloc will be something like int* MyIntArray = Malloc(80); where the 80 is the 20 * 4 (because each 32bit integer is 4 bytes in size)
  • and this size is for the entire array at the time of the request. this is why attempting to write a value into an index greater then the last index causes an OUT_OF_BOUNDS_EXCEPTIONS even though many operating system will lump “out of bounds” with just “Access Violation” when you attempt to access MyIntArray[21].
  • if the array was allocated “too big” for the initial task then the system will allocate the space for the whole thing contiguously (the other part of Stack allocation). when a call to Malloc is made the presumption of the operating system is that you know what you are doing (this is why C/C++ is considered dangerous because it is so easy to forget “delete”, even though Unreal tries to keep that stuff away), and because Unreal is built on C++, even though Blueprints go to a JiT compiler they are still making C++ Malloc calls.
  • the funny thing is that when requests to Malloc are made they are often times a little bit more then what was requested unless you are really optimizing the dataType, for example a request for an array of Integer with 10 elements , Malloc is very likely to find a contiguous chunk of size 48 bytes (enough for 2 bytes more then I asked), I will still get an “out of bounds” if I try to access those last 2, but the system will prefer “sqare numbers” and “powers of 2” when possible.
  • when it comes to structs in Cache specifically, most operations will be done with pairs of indices, and we should be putting early “continue” for blank indexes anyways.
  • when it comes to an array of Objects (like actors), the Array is Stack Allocated, but it is of pointers, which are just integers that “point” into the heap.
  • Strings are tough because a string is an array of unknown length of char, so that gets even more rough.

most of the operating system optimizations that dZh0 is talking about are on the access, and potiential compression on the Page file. the allocation, and size in Memory is based on IEEE, and ISA expectations (like size of generic int can vary as there are some systems that for example use 16 or 8 bit integer when int is declared), but if you send the entire array for something like a sort, or looking at each index the blanks can be ignored, to try and avoid Cache Invalidation which we short hand to Cache Miss.

  • a Cache invalidation is any time the contents of the current task needs to be stopped Now, and the Cache flushed. Cache Miss is the cause (we don’t need index 18, we need index 22 would be a cache miss), but you can think of a Cache Miss as “we need to take a trip to RAM” think about needing to go to the store to pick up Milk instead of going to the refrigerator.

now for the potential positives of making the Array “too big”:

  • having the array sit in memory not doing anything is a hit that happens, you are technically limited by total system RAM + Paging file (and on Modern Windows the Paging File is an opt-in on the part of the user)
  • it is reallocating the Array because it needs to grow that can be a much bigger impact. for example if I have an array of 20 elements, and then I need to add 1 index to it, a new call to Malloc will be made. if the new size still fits within what malloc already gave me Malloc will return the same pointer, and do nothing else. if it doesn’t fit:
    • then a space “big enough” for the new array is sectioned off, with the current array is still being held in memory
    • each element is then copied from the original array to the location at the new array byte by byte on the primary thread which is blocking
    • only after each element is copied to the new location will the original arrays space be marked as available
  • things like TMap, TSet, and a greater extent DataTable already are arrays (the difference is access), and with regards of TMap particularly they are actually best suited with a bunch of empty indexes, so the Hash function has room to get unique slots for the result.

so if the array is expected to grow often (you keep putting more and more stuff into it) then it might not be that much negative to allocate it “too big” at the start, because you will be saving the re-allocations later. Presuming you do not run into system RAM issues, and that the data Structure you are considering would not already run afoul of Cache size, but in most cases it is the size of the data being held not the size of the array itself.

If the array is just meant to be big, because it “might” get bigger. I would suggest looking into how big it might reasonably get, and look at the thing being put in the array.


-At the Same time-
one of the worst impacts of allocating an array that has “too big” initial size, YOU ARE responsible for your own bookkeeping for the indexes, and constantly doing a naive search for an empty slot, can easily be a bigger impact then calling to Malloc for an array of 1 element bigger.

1 Like