Google Brain's XLNet bests BERT at 20 NLP tasks

A group of Google Brain and Carnegie Mellon University researchers this week introduced XLNet, an AI model capable of outperforming Google's cutting-edge BERT in 20 NLP tasks and achieving state-of-the-art results on 18 benchmark tasks. BERT (Bidirectional Encoder Representations from Transformers) is Google's language representation model for unsupervised pretraining of NLP models, first introduced last fall.

XLNet achieved state-of-the-art performance in several tasks, including seven GLUE language understanding tasks, three reading comprehension tasks like SQuAD, and seven text classification tasks that include processing of Yelp and IMDB data sets. Text classification with XLNet saw a marked reduction of up to 16% in error rates compared to BERT. Google open-sourced BERT in the fall of 2018.

XLNet harnesses the best of autoregressive and autoencoding methods used for unsupervised pretraining through a variety of techniques detailed in an arXiv paper published Wednesday by a group of six authors.

"XLNet is a generalized autoregressive pretraining method that enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and [...] overcomes the limitations of BERT thanks to its autoregressive formulation," the paper reads.

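To make the idea of a "factorization order" concrete, here is a minimal sketch in Python. It is an illustration only, not code from the paper: the function name `permutation_contexts`, the example sentence, and the chosen order are all assumptions made for this example. It shows which tokens a model would be allowed to see when predicting each word under one sampled order.

```python
def permutation_contexts(order):
    """Map each target position to the positions visible when predicting it.

    `order` is a permutation of token positions (a hypothetical factorization
    order). A token may only condition on tokens that come earlier in `order`,
    regardless of where they sit in the original sentence.
    """
    contexts = {}
    for step, pos in enumerate(order):
        contexts[pos] = sorted(order[:step])
    return contexts


# Illustrative sentence and one arbitrary factorization order (assumptions).
tokens = ["New", "York", "is", "a", "city"]
order = [2, 4, 0, 3, 1]

contexts = permutation_contexts(order)
for pos in order:
    visible = [tokens[i] for i in contexts[pos]]
    print(f"predict {tokens[pos]!r} given {visible}")
```

In this order, "York" is predicted last and sees words on both its left and its right, while "is" is predicted first with no context at all. Training over many sampled orders is what lets an autoregressive model learn from context on both sides of a word.
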
The model's name is derived from Transformer-XL, an autoregressive model released in January by the same team of researchers. XLNet adopts Transformer-XL's segment recurrence mechanism and relative encoding scheme for pretraining. For its permutation language modeling methods, the model also borrows from NADE, which was created by researchers from Google DeepMind, Twitter, and academia.

XLNet is the most recent NLP model to emerge that performs better than BERT. Microsoft AI researchers introduced the Multi-Task Deep Neural Network (MT-DNN) in May. That model is based on BERT but achieves better performance on a number of GLUE language understanding benchmark tasks.
