Using cycle counters in a multithreaded function

Let's say I have a function:

void fun() {
    // ... Do something
}

But fun will be called by multiple threads at the same time, so the stat system will complain about multiple threads using the same CYCLE_STAT. What is the correct way to get a separate CYCLE_STAT for each thread, so that later, when I look at the stat file, I'll see a separate STAT_fun recorded for each of the different threads?