How does an organization the size of Mozilla sift through the hundreds of bug reports that come in each day for projects like Firefox? With machine learning, as it turns out. In a blog post this morning, the organization detailed BugBug, an open source Bugzilla tool that automatically classifies bugs by product (e.g. Firefox, Firefox for Android, Thunderbird) and component (subsets of products).

“Getting bugs to the right eyes as soon as possible is essential in order to fix them quickly,” Mozilla’s Marco Castelluccio and Sylvestre Ledru wrote. “Historically, the product/component assignment has been mostly done manually by volunteers and some developers. [But] unfortunately, this process fails to scale, and it is effort that would be better spent elsewhere.”

Training BugBug to autonomously slot bugs into product and component categories required compiling a data set of over two years’ worth of reports, or 100,000 bugs. As if that weren’t arduous enough, the data couldn’t be used as-is: any change made to a bug after triage (i.e., screening and prioritization) was completed would be inaccessible to the tool during real operation.

To get around this, the BugBug team “rolled back” the bugs to the time they were originally filed, and filtered the corpus for components that had at least 1% of the number of bugs of the largest component.
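The two preparation steps described above can be sketched roughly as follows. This is an illustration rather than BugBug's actual code: the function names, the dictionary layout of a bug, and the shape of the change history are all assumptions made for the example.

```python
from collections import Counter


def roll_back(bug, history, label_fields=("product", "component")):
    """Undo recorded field changes, newest first, to recover the bug as filed.

    `history` is a hypothetical, oldest-first list of dicts such as
    {"field": "keywords", "removed": "", "added": "crash"}.
    Label fields keep their final (triaged) value so they can serve as targets.
    """
    filed = dict(bug)
    for change in reversed(history):
        if change["field"] not in label_fields:
            filed[change["field"]] = change["removed"]
    return filed


def keep_frequent_components(bugs, min_ratio=0.01):
    """Keep bugs whose component has at least `min_ratio` of the largest
    component's bug count (1% by default, as in the article)."""
    counts = Counter(bug["component"] for bug in bugs)
    if not counts:
        return []
    threshold = min_ratio * max(counts.values())
    kept = {name for name, n in counts.items() if n >= threshold}
    return [bug for bug in bugs if bug["component"] in kept]
```

Keeping the final product and component untouched while rewinding everything else reflects the idea that the model should see only what was known at filing time, yet still learn from the label that triage eventually assigned.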

Above: BugBug in action.

Image Credit: Mozilla

With the title, the first comment, and the keywords and flags associated with each bug in hand, they set about training BugBug’s model, which took 40 minutes on a six-core PC with 32GB of RAM. Manual bug assignment, by comparison, takes about a week. Castelluccio and Ledru say that since they deployed BugBug in production at the end of February 2019, about 350 bugs have been triaged with an accuracy of 93%. (Assignment is only performed when the model is over 60% confident in its decision, they note.)
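The confidence rule mentioned above amounts to a simple gate on the model's output. A minimal sketch, assuming the classifier returns a probability per component (the function name and input format are hypothetical, not taken from BugBug):

```python
def assign_if_confident(probabilities, threshold=0.60):
    """Return the most likely component, or None if the model is not
    more than `threshold` confident (60% per the article)."""
    if not probabilities:
        return None
    component, confidence = max(probabilities.items(), key=lambda item: item[1])
    return component if confidence > threshold else None
```

A bug that comes back as `None` would simply stay in the manual triage queue, so raising the threshold trades coverage for fewer wrong assignments.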

Currently, BugBug only categorizes bugs for Firefox-related products, but Mozilla plans to extend it to other projects in the future. It also hopes to use it to identify duplicate bugs and detect bug types (like “performance,” “memory usage,” and “crash”), isolate bugs in which documentation is missing, and suss out those that might be particularly important for or relevant to a given Firefox release.

“By presenting new bugs quickly to triage owners, we hope to decrease the turnaround time to fix new issues,” Castelluccio and Ledru said.