Live QA Tester
Run the QA system against every entry in an eval set and compare each answer with its ground-truth label.
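
The loop described above might look roughly like the following sketch. It is an illustration under stated assumptions, not the tool's actual implementation: the QA system is assumed to be a callable that takes a question string and returns an answer string, the eval set is assumed to be a sequence of (question, label) pairs, and answers are compared by case-insensitive, whitespace-normalized exact match. The names `run_eval`, `EvalResult`, and `normalize` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass
class EvalResult:
    """Summary of one evaluation run (hypothetical structure)."""
    total: int = 0
    correct: int = 0
    # Each mismatch is recorded as (question, predicted, expected).
    mismatches: List[Tuple[str, str, str]] = field(default_factory=list)

    @property
    def accuracy(self) -> float:
        # Avoid division by zero on an empty eval set.
        return self.correct / self.total if self.total else 0.0


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace; an assumed comparison rule."""
    return " ".join(text.lower().split())


def run_eval(
    qa_system: Callable[[str], str],
    eval_set: Iterable[Tuple[str, str]],
) -> EvalResult:
    """Query the QA system on every entry and compare with its label."""
    result = EvalResult()
    for question, expected in eval_set:
        predicted = qa_system(question)
        result.total += 1
        if normalize(predicted) == normalize(expected):
            result.correct += 1
        else:
            result.mismatches.append((question, predicted, expected))
    return result


if __name__ == "__main__":
    # Stand-in QA system backed by a fixed lookup table (illustrative only).
    canned = {"Capital of France?": "Paris", "2 + 2?": "5"}
    report = run_eval(
        lambda q: canned.get(q, ""),
        [("Capital of France?", "paris"), ("2 + 2?", "4")],
    )
    print(f"accuracy: {report.accuracy:.2f}")
    for question, predicted, expected in report.mismatches:
        print(f"MISMATCH {question!r}: got {predicted!r}, expected {expected!r}")
```

Exact match is the simplest comparison; a real eval set with free-form answers would likely need a looser metric, which this sketch leaves out.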