Deep neural networks (DNNs) for supervised labeling problems are known to produce accurate results on a wide variety of learning tasks. However, when accuracy is the only objective, DNNs frequently make over-confident predictions, and they always predict a label, even when the test data does not belong to any known class.
StatSWAG implements several statistical estimators that, given noisy categorical predictions (labels) from multiple labelers for a set of data samples, estimate both the accuracy of each individual labeler and the true label for each data instance.
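To make the problem setting concrete, the following is a minimal sketch of the simplest possible baseline: take the majority vote per sample as the estimated true label, then score each labeler by how often it agrees with that vote. This is not StatSWAG's API or one of its estimators; the function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most common label among one sample's votes.

    Ties are broken in favor of the label that appears first in `votes`.
    """
    return Counter(votes).most_common(1)[0][0]


def estimate_labels_and_accuracies(predictions):
    """Naive baseline (hypothetical helper, not part of StatSWAG).

    predictions[i][j] is the label that labeler j assigned to sample i.
    Returns (estimated true labels, estimated accuracy per labeler).
    """
    estimated_labels = [majority_vote(row) for row in predictions]
    n_samples = len(predictions)
    n_labelers = len(predictions[0])
    # A labeler's estimated accuracy is its agreement rate with the majority vote.
    accuracies = [
        sum(predictions[i][j] == estimated_labels[i] for i in range(n_samples))
        / n_samples
        for j in range(n_labelers)
    ]
    return estimated_labels, accuracies


# Toy data: 4 samples, 3 labelers (made up for this example).
predictions = [
    ["cat", "cat", "dog"],
    ["dog", "dog", "dog"],
    ["cat", "dog", "cat"],
    ["dog", "dog", "cat"],
]
labels, accuracies = estimate_labels_and_accuracies(predictions)
print(labels)      # ['cat', 'dog', 'cat', 'dog']
print(accuracies)  # [1.0, 0.75, 0.5]
```

A baseline like this treats every labeler as equally reliable when forming the vote, so a statistical estimator that models per-labeler accuracy jointly with the true labels would generally be expected to do better when labeler quality varies.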