Deep neural networks (DNNs) for supervised labeling problems are known to produce accurate results on a wide variety of learning tasks. However, when accuracy is the only objective, DNNs frequently make over-confident predictions, and they always predict a label, even when the test data does not belong to any known label.
With the increasing prevalence of encrypted network traffic, cybersecurity analysts have been turning to machine learning (ML) techniques to characterize the traffic on their networks. However, ML models can become stale as new traffic emerges that falls outside the distribution of the training set. ...
StatSWAG implements several statistical estimators that, given noisy categorical predictions (labels) from multiple labelers for a set of data samples, estimate both the accuracy of each individual labeler and the true label for each data instance.
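To make the problem setting concrete, the following is a minimal sketch of one simple way to estimate labeler accuracies and true labels jointly. It is not StatSWAG's API, nor one of its estimators: the function name `estimate_labels_and_accuracies`, its parameters, and the toy data are all hypothetical, and the technique is a plain iterative weighted vote that uses agreement with the consensus as a stand-in for accuracy.

```python
from collections import Counter


def estimate_labels_and_accuracies(votes, n_iters=10):
    """Hypothetical helper, not part of StatSWAG.

    votes[i][j] is the categorical label that labeler j gave to sample i.
    Returns (consensus_labels, estimated_accuracies).
    """
    n_labelers = len(votes[0])
    weights = [1.0] * n_labelers  # start by trusting every labeler equally

    for _ in range(n_iters):
        # Step 1: weighted vote per sample, using the current weights.
        consensus = []
        for row in votes:
            scores = Counter()
            for j, label in enumerate(row):
                scores[label] += weights[j]
            # Ties are broken deterministically by the smallest label.
            consensus.append(max(sorted(scores), key=scores.get))

        # Step 2: re-estimate each labeler's accuracy as the fraction of
        # samples on which it agrees with the current consensus.
        accuracies = [
            sum(row[j] == c for row, c in zip(votes, consensus)) / len(votes)
            for j in range(n_labelers)
        ]
        weights = accuracies

    return consensus, accuracies


# Toy data (made up for illustration): 4 samples, 3 labelers.
votes = [
    ["a", "a", "b"],
    ["b", "b", "b"],
    ["a", "a", "a"],
    ["b", "b", "a"],
]
consensus, accuracies = estimate_labels_and_accuracies(votes)
print(consensus)   # ['a', 'b', 'a', 'b']
print(accuracies)  # [1.0, 1.0, 0.5]
```

In this toy run the third labeler disagrees with the consensus on half of the samples, so its estimated accuracy drops to 0.5 and its votes count for less in later iterations. The estimate is only as good as the consensus it is measured against, so it should be read as an agreement rate when no ground-truth labels are available.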