
Use AVX instruction


    [FEATURE REQUEST] Use AVX instruction

    Hello, why not use AVX instructions by default for 64-bit processors? Simply setting the architecture in VCToolChain lets a build use almost all the power of modern processors, which makes life easier for developers who are not aware of such wonderful things as AVX / AVX2.

    All modern Intel Core processors, from the Core i3 up (Sandy Bridge and later), support SSE4.1 / SSE4.2 and AVX.

    After all, this can give a huge boost in performance, in some places almost 10 times!







    My request on github: https://github.com/EpicGames/UnrealEngine/pull/1157




    Updated:
    For example, the editor's load time improved almost threefold, from 60 seconds to about 20. Most loops were vectorized, thanks to how heavily optimized UE4 is.
    https://en.wikipedia.org/wiki/SSE2
    http://en.wikipedia.org/wiki/Advanced_Vector_Extensions
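
To make the vectorization point concrete, here is a minimal sketch of the kind of loop a compiler can typically auto-vectorize once optimizations and a wider instruction set are enabled (for example /O2 together with /arch:AVX on MSVC). The function name scale_and_add is made up for illustration and is not taken from UE4. Whether a given loop is actually vectorized, and how much faster it gets, depends on the compiler and its settings.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of UE4): computes y[i] = a * x[i] + y[i].
// The loop body has no branches and no dependency between iterations,
// which is the shape auto-vectorizers handle best.
void scale_and_add(float a, const std::vector<float>& x, std::vector<float>& y) {
    // Process only the elements both vectors have.
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const float* xs = x.data();
    float* ys = y.data();
    for (std::size_t i = 0; i < n; ++i) {
        ys[i] = a * xs[i] + ys[i];
    }
}
```

With AVX enabled the compiler may process eight floats per instruction here instead of four with SSE2; the source code itself does not change.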




    #2
    This should be an optional architecture, as you proposed in your last message (I have a Q9300, which does not support AVX, so I would like to be able to build an SSE2 version without having to edit the UBT code).
    Also, I can't view your commit: your pull request is flooded with other commits and I can't find yours. That is probably because you opened the pull request against the release branch, so everyone else's master-branch commits are being pulled into it.
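
For what it's worth, a program can also check at run time whether the machine supports AVX before taking an AVX code path, which is one way to keep older CPUs such as the Q9300 working. The following is only a sketch: cpu_supports_avx and pick_code_path are hypothetical names, and nothing here reflects how UBT or UE4 actually handle this. On MSVC it uses the __cpuid and _xgetbv intrinsics; on GCC/Clang it uses __builtin_cpu_supports.

```cpp
#include <string>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>     // __cpuid
#include <immintrin.h>  // _xgetbv
#endif

// Hypothetical helper: true if both the CPU and the OS can run AVX code.
bool cpu_supports_avx() {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);  // leaf 1: feature flags, ECX is regs[2]
    const bool osxsave = (regs[2] & (1 << 27)) != 0;
    const bool avx = (regs[2] & (1 << 28)) != 0;
    if (!osxsave || !avx) {
        return false;
    }
    // XCR0 bits 1 and 2 must be set: the OS saves XMM and YMM state.
    return (_xgetbv(0) & 0x6) == 0x6;
#elif (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;  // Not x86, or an unknown compiler: assume no AVX.
#endif
}

// Hypothetical dispatcher: names the code path a program would select.
std::string pick_code_path() {
    return cpu_supports_avx() ? "avx" : "sse2";
}
```

A check like this only helps if the AVX code lives in separately compiled functions; a binary built entirely with /arch:AVX may still fail on a CPU without AVX before the check ever runs.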



      #3
      Originally posted by Shadowriver
      This should be an optional architecture, as you proposed in your last message (I have a Q9300, which does not support AVX, so I would like to be able to build an SSE2 version without having to edit the UBT code).
      Also, I can't view your commit: your pull request is flooded with other commits and I can't find yours. That is probably because you opened the pull request against the release branch, so everyone else's master-branch commits are being pulled into it.
      Intel's compiler can generate code for AVX as well as for the older instruction sets (SSE4/3/2). How the compiler used for UE4 handles this, I unfortunately do not know.
      Intel® Compiler Options for Intel® SSE and Intel® AVX generation (SSE2, SSE3, SSSE3, ATOM_SSSE3, SSE4.1, SSE4.2, ATOM_SSE4.2, AVX, AVX2, AVX-512) and processor-specific optimizations

      and

      The significance of SIMD, SSE and AVX
      Last edited by Prynec; 05-18-2015, 01:36 AM. Reason: added The significance of SIMD, SSE and AVX



        #4
        I'm not 100% sure, but you should be able to set your CPU architecture to AVX or AVX2 by setting the global CL or _CL_ environment variables.

        These environment variables are supposed to add the switches they contain to every invocation of cl.exe: the contents of CL are prepended to the command-line arguments and the contents of _CL_ are appended. Add a new environment variable CL or _CL_ and give it the value /arch:AVX or /arch:AVX2 or whatever else you need. I also give it the /MP switch, though that can cause problems.

        Note that you will need at least a Bulldozer or Sandy Bridge CPU (for AVX) or a Haswell (for AVX2) to run these executables.

        I'm not 100% sure, though, because I don't know what happens if the CL variable adds /arch:AVX and UBT then passes, say, /arch:SSE2. I'm sure it's documented. I think the superseding architecture is used (/arch:AVX2 also enables the SSE/SSE2/SSE3/SSE4/AVX instructions), but don't quote me on that.

        https://msdn.microsoft.com/en-us/library/7t5yh4fd.aspx

        Note that the default mechanism, the C++ properties page, doesn't exist for UE4 projects: UBT invokes cl.exe directly, not Visual Studio. Hence my suggestion to use the global CL and _CL_ environment variables, along with the __AVX__, __AVX2__, etc. preprocessor macros, to alter every invocation of cl.exe, (in theory) whether it is made by UBT or otherwise.

        I don't have the skill to read the executable in a hex editor, but if someone knows of a tool to check what architecture a program is compiled for, I'd love to look into it.
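
Short of inspecting the binary, one rough way to confirm that the switch reached the compiler is to have the code report which predefined macros were set when it was built. MSVC predefines __AVX__ and __AVX2__ under /arch:AVX and /arch:AVX2, and GCC/Clang define the same names (plus __SSE2__) for their equivalent options. The sketch below assumes nothing about UE4; compiled_simd_level is a hypothetical name, and it only describes the translation unit it is compiled in, not an arbitrary executable.

```cpp
#include <string>

// Hypothetical helper: names the widest x86 SIMD level this source file
// was compiled for, based on the compiler's predefined macros.
std::string compiled_simd_level() {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    // x64 builds always have SSE2; 32-bit MSVC reports it via _M_IX86_FP.
    return "SSE2";
#else
    return "unknown";
#endif
}
```

Logging the returned string once at startup would show whether a build made with the CL or _CL_ variable set really was compiled with the intended /arch value.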
        Last edited by User-658380556; 08-21-2015, 12:45 PM.
