Clova AI Research uses unsupervised learning to deliver state-of-the-art image style transfers

AI researchers in South Korea created an AI system named U-GAT-IT, an unsupervised learning model for image-to-image translation. They said U-GAT-IT outperforms four other top generative networks created with unsupervised learning in almost every instance.

Image-to-image translation can improve applications ranging from image style transfers to inpainting, the process of predicting missing pixels in a photo. Progress in unsupervised learning for generative neural networks means more advanced generative systems can be created without data sets labeled by human annotators.

U-GAT-IT ranked highest on most data sets in a survey that asked 135 people to judge the quality of images generated by five top unsupervised GAN models, including CycleGAN and CartoonGAN. Researchers asked participants to pick the best image generated from data sets made for turning horses into zebras (horse2zebra), cats into dogs (cat2dog), and selfies into anime characters (selfie2anime). UNIT, a framework for unsupervised generative adversarial networks that Nvidia introduced in 2017, narrowly outranked U-GAT-IT in human ranking of photo2portrait, which can turn a headshot into a black-and-white sketch.

U-GAT-IT also achieved the lowest and therefore best results in Kernel Inception Distance (KID), a method of measuring discrepancies between real and generated images from generative adversarial networks, or GANs.
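
For readers curious how such a score works, KID is commonly computed as an unbiased estimate of the squared maximum mean discrepancy between feature vectors of real and generated images, using a cubic polynomial kernel. The snippet below is a minimal sketch under that assumption, not the researchers' evaluation code. In practice the features come from an Inception network; here random NumPy arrays stand in for them, and the function names `polynomial_kernel` and `kid_estimate` are made up for illustration.

```python
import numpy as np

def polynomial_kernel(x, y):
    # Cubic polynomial kernel k(x, y) = (x . y / d + 1)^3,
    # where d is the feature dimension.
    d = x.shape[1]
    return (x @ y.T / d + 1.0) ** 3

def kid_estimate(real_feats, fake_feats):
    """Unbiased squared-MMD estimate between two sets of feature vectors.

    real_feats: array of shape (m, d); fake_feats: array of shape (n, d).
    Lower values mean the two sets look more alike to the kernel.
    """
    m, n = len(real_feats), len(fake_feats)
    k_rr = polynomial_kernel(real_feats, real_feats)
    k_ff = polynomial_kernel(fake_feats, fake_feats)
    k_rf = polynomial_kernel(real_feats, fake_feats)
    # Drop the diagonal (each sample compared with itself) so the
    # within-set averages are unbiased.
    mean_rr = (k_rr.sum() - np.trace(k_rr)) / (m * (m - 1))
    mean_ff = (k_ff.sum() - np.trace(k_ff)) / (n * (n - 1))
    return mean_rr + mean_ff - 2.0 * k_rf.mean()

# Stand-in features (assumption: real use would extract Inception features).
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 16))
close = rng.normal(size=(200, 16))        # same distribution as `real`
far = rng.normal(size=(200, 16)) + 2.0    # shifted distribution

print("similar sets:  ", kid_estimate(real, close))
print("different sets:", kid_estimate(real, far))
```

The set drawn from the same distribution should score far lower than the shifted one, which is the sense in which a lower KID is better.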

There's also a selfie-to-waifu demo of U-GAT-IT.

U-GAT-IT was created by Clova AI Research, video game company NCSOFT, and Boeing Korea Engineering and Technology Center. All three are located in South Korea. A paper detailing U-GAT-IT was accepted for publication by the International Conference on Learning Representations (ICLR). Originally scheduled to take place later this month in Addis Ababa, Ethiopia, ICLR will instead be held entirely online starting April 27.

U-GAT-IT's advances are the result of a normalization technique researchers call Adaptive Layer-Instance Normalization, or AdaLIN. The technique was inspired by the Batch Instance Normalization method proposed by Lunit AI researchers in 2018, and researchers said AdaLIN is "essential for translating various data sets that contain different amounts of geometry and style changes."

"The AdaLIN function helps our attention-guided model to flexibly control the amount of change in shape and texture. As a result, our model, without modifying the model architecture or the hyperparameters, can perform image translation tasks not only requiring holistic changes but also requiring large shape changes," the paper reads.
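
As a rough illustration of the idea, AdaLIN can be read as a learned blend of instance normalization (statistics per channel) and layer normalization (statistics across all channels), followed by a scale and shift. The snippet below is a minimal NumPy sketch under that reading, not the authors' implementation. The function name `adalin`, the argument shapes, and the `eps` value are assumptions; in the model the mixing weight `rho` is learned and the scale `gamma` and shift `beta` are produced by the network, while here they are plain inputs.

```python
import numpy as np

def adalin(x, gamma, beta, rho, eps=1e-5):
    """Blend instance-normalized and layer-normalized activations.

    x:     feature maps, shape (N, C, H, W)
    gamma: per-sample, per-channel scale, shape (N, C)
    beta:  per-sample, per-channel shift, shape (N, C)
    rho:   per-channel mixing weight, shape (C,), clipped to [0, 1]
    """
    # Instance norm: statistics over H and W for each sample and channel.
    mu_in = x.mean(axis=(2, 3), keepdims=True)
    var_in = x.var(axis=(2, 3), keepdims=True)
    x_in = (x - mu_in) / np.sqrt(var_in + eps)

    # Layer norm: statistics over C, H and W for each sample.
    mu_ln = x.mean(axis=(1, 2, 3), keepdims=True)
    var_ln = x.var(axis=(1, 2, 3), keepdims=True)
    x_ln = (x - mu_ln) / np.sqrt(var_ln + eps)

    # rho near 1 favors instance norm, rho near 0 favors layer norm.
    rho = np.clip(rho, 0.0, 1.0).reshape(1, -1, 1, 1)
    mixed = rho * x_in + (1.0 - rho) * x_ln

    return gamma[:, :, None, None] * mixed + beta[:, :, None, None]

# Toy usage with made-up shapes: 2 samples, 3 channels, 4x4 feature maps.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(2, 3, 4, 4))
out = adalin(x, gamma=np.ones((2, 3)), beta=np.zeros((2, 3)), rho=np.full(3, 0.5))
print(out.shape)
```

Because `rho` is set per channel, a single network can lean toward one kind of normalization or the other, which matches the researchers' description of controlling how much shape and texture change.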

Researchers first shared the AdaLIN method publicly last summer and updated it last week, according to preprint repository arXiv, which also lists nearly a dozen citations of the U-GAT-IT paper.
