Learning agents with classification

Nitecon · March 1, 2024, 5:51am

I’ve been using Learning Agents for a while now, and the output is fine for throttle and other standard capabilities that rely on float outputs. However, I’m looking for more kinds of numeric output. It looks like the current architecture just takes the input features and sets up the network to output a single float. Would it be possible to at the very least get a classification-type output to drive more basic actions? For instance (action buttons):

Long Heal
Short Heal
Attack 1
Attack 2
Cover

That way it can be used in a more meaningful way with more generic applications. A basic example could be something like this:

import torch.nn as nn
import torch.nn.functional as F

class CSVClassifier(nn.Module):
    def __init__(self, num_features, num_classes):
        super().__init__()
        # Fully connected stack narrowing from num_features down to num_classes
        self.fc1 = nn.Linear(num_features, 512)
        self.fc2 = nn.Linear(512, 256)
        self.fc3 = nn.Linear(256, 128)
        self.fc4 = nn.Linear(128, 64)
        self.fc5 = nn.Linear(64, num_classes)
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        # Dropout is applied only after the two widest layers
        x = F.leaky_relu(self.fc1(x))
        x = self.dropout(x)
        x = F.leaky_relu(self.fc2(x))
        x = self.dropout(x)
        x = F.leaky_relu(self.fc3(x))
        x = F.leaky_relu(self.fc4(x))
        # Return raw logits; sigmoid / softmax is applied by the caller
        x = self.fc5(x)
        return x
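One thing worth flagging with a model like this: because it contains `nn.Dropout`, it should be switched to eval mode before inference, otherwise the predictions vary between calls. Here is a minimal sketch of that, using a deliberately smaller stand-in network (`TinyClassifier` is hypothetical, not the class above) and the five action buttons from the list as classes. The feature count of 8 and the ordering of the actions are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Action buttons from the post; the order (and so the class indices) is an assumption.
ACTIONS = ["Long Heal", "Short Heal", "Attack 1", "Attack 2", "Cover"]


class TinyClassifier(nn.Module):
    """Hypothetical, smaller stand-in for CSVClassifier, used only for this demo."""

    def __init__(self, num_features, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, 32),
            nn.LeakyReLU(),
            nn.Dropout(0.5),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):
        return self.net(x)  # raw logits, one per class


torch.manual_seed(0)
model = TinyClassifier(num_features=8, num_classes=len(ACTIONS))
model.eval()  # disables dropout so repeated calls give the same result

batch = torch.randn(4, 8)  # 4 dummy observations with 8 features each
with torch.no_grad():
    logits_a = model(batch)
    logits_b = model(batch)

# Pick the highest-scoring action for each observation.
chosen = [ACTIONS[i] for i in logits_a.argmax(dim=1).tolist()]
print(logits_a.shape)  # one row per observation, one column per action
print(chosen)
```

The output has one column per action, so the number of classes is set purely by the size of the last layer.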

That way you can still provide the number of input features but also receive the correct number of output features, and then just apply sigmoid / softmax as appropriate with something like:

from torch import sigmoid, softmax

# raw_preds has shape (batch, n_classes); response and ts come from the surrounding code
if n_classes == 1:
    # Binary classification
    # squeeze(-1) keeps the batch dimension, so a batch of one can still be iterated
    activated_preds = sigmoid(raw_preds).squeeze(-1)
    for pred in activated_preds:
        predicted_class = int(pred >= 0.5)
        confidence = float(pred) if predicted_class == 1 else float(1 - pred)
        response.append({"class": predicted_class, "confidence": confidence, "timestamp": ts})
else:
    # Multi-class classification
    activated_preds = softmax(raw_preds, dim=1)
    for pred in activated_preds:
        predicted_class = int(pred.argmax())
        confidence = float(pred[predicted_class])
        response.append({"class": predicted_class, "confidence": confidence, "timestamp": ts})

Something like that would be so beneficial within Learning Agents. Is this a current capability or something that is still coming?
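For anyone who wants to try the classification + confidence idea without PyTorch, the decoding step can be sketched in plain Python. This is only an illustration: `decode_prediction` and the `ACTIONS` ordering are hypothetical names chosen here, and it follows the same convention as the snippet above (a single logit means binary, otherwise softmax over the classes).

```python
import math

# Action buttons from the post; the order (and so the class indices) is an assumption.
ACTIONS = ["Long Heal", "Short Heal", "Attack 1", "Attack 2", "Cover"]


def decode_prediction(logits):
    """Return (class_index, confidence) for one row of raw model outputs."""
    if len(logits) == 1:
        # Binary case: numerically stable sigmoid, thresholded at 0.5.
        x = logits[0]
        if x >= 0:
            p = 1.0 / (1.0 + math.exp(-x))
        else:
            e = math.exp(x)
            p = e / (1.0 + e)
        cls = int(p >= 0.5)
        return cls, (p if cls == 1 else 1.0 - p)

    # Multi-class case: softmax with the max subtracted to avoid overflow.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    cls = max(range(len(probs)), key=probs.__getitem__)
    return cls, probs[cls]


# Example logits for the five actions (made-up values).
cls, conf = decode_prediction([0.2, -1.0, 2.5, 0.1, 0.3])
print(ACTIONS[cls], round(conf, 3))
```

The confidence value could then gate the action, for example by ignoring predictions below some threshold, though the right threshold would depend on the application.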

Nitecon · March 11, 2024, 8:35pm

@ranierin any chance you know or could shed some light on this, or know someone who might? Classification + confidence would be such a huge thing for a use case like the above, especially when combined with reinforcement learning.

ranierin · March 13, 2024, 3:15pm

I am not very familiar with LearningAgents (unfortunately) but maybe @Deathcalibur can help

Deathcalibur · March 14, 2024, 1:20pm

This is coming in 5.4

If you build from source, you can get it today by switching to the 5.4 branch or main.

Brendan

P.S. Sorry for the delayed response. If you use the “learning-agents” tag on any future questions, I will be notified and can respond quicker.