Google today introduced TensorFlow.Text, a library for preprocessing text for language models built with TensorFlow. TensorFlow, the open source machine learning framework created by the Google Brain team, has seen more than 41 million downloads.
TensorFlow.Text can be installed using pip and comes with tokenizers that break text apart into tokens such as words, numbers, and punctuation for analysis.
At launch, TensorFlow.Text can split text on whitespace, by Unicode script, and into predetermined sequences of word fragments like suffixes or prefixes that Google calls wordpieces. Wordpieces are commonly used in approaches like BERT, a pretraining technique for language models Google open-sourced last fall.
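To make the idea of wordpieces concrete, here is a minimal sketch in plain Python of whitespace splitting followed by greedy longest-match wordpiece splitting. It is an illustration only, not TensorFlow.Text's implementation: the function names, the toy vocabulary, and the `[UNK]` fallback are assumptions made for this example, and the `##` marker for word-internal pieces follows the convention used by BERT vocabularies.

```python
def whitespace_tokenize(text):
    """Split text on runs of whitespace."""
    return text.split()


def wordpiece_tokenize(word, vocab, unk="[UNK]", prefix="##"):
    """Greedy longest-match-first split of one word into vocabulary pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = prefix + candidate  # mark a word-internal piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so treat the whole word as unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, invented for this example.
VOCAB = {"the", "un", "##afford", "##able", "token", "##s"}

tokens = [
    piece
    for word in whitespace_tokenize("the unaffordable tokens")
    for piece in wordpiece_tokenize(word, VOCAB)
]
print(tokens)
# ['the', 'un', '##afford', '##able', 'token', '##s']
```

A real vocabulary holds tens of thousands of pieces learned from a corpus, which is why rare words can still be represented as a handful of known fragments.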
The library also comes with ops for normalization, n-grams, and sequence constraints for labeling, according to a Medium post announcing the news.
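As a rough illustration of what normalization and n-gram ops do, the following plain-Python sketch uses only the standard library. It does not call TensorFlow.Text, and the helper names `normalize` and `ngrams` are hypothetical; the library's own ops may behave differently in their details.

```python
import unicodedata


def normalize(text):
    """Fold compatibility characters (NFKC) and lowercase the result."""
    return unicodedata.normalize("NFKC", text).lower()


def ngrams(tokens, n, sep=" "):
    """Return every run of n consecutive tokens, joined by sep."""
    return [sep.join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# "\uff34" is a full-width capital T, which NFKC maps to an ordinary "T".
words = normalize("\uff34ensorFlow Text ships tokenizers").split()
bigrams = ngrams(words, 2)
print(bigrams)
# ['tensorflow text', 'text ships', 'ships tokenizers']
```

Normalizing before tokenizing means visually identical strings map to the same tokens, and n-grams give a model short windows of context around each token.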
TensorFlow.Text’s tokenizers use RaggedTensors, a new kind of tensor that can hold sequences of varying length, such as sentences with different numbers of words. RaggedTensors and Unicode support for TensorFlow were first detailed by Google engineer Mark Omernick earlier this year at the TensorFlow Dev Summit.
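Why tokenized text needs a ragged structure can be sketched without TensorFlow: sentences yield different numbers of tokens, so they do not fit a rectangular array without padding. The plain-Python sketch below stores rows of unequal length as one flat list of values plus a list of row boundaries, which is the general idea behind a ragged tensor. The helper names `to_ragged` and `get_row` are hypothetical and this is not the TensorFlow API.

```python
def to_ragged(rows):
    """Flatten variable-length rows into (values, row_splits)."""
    values = []
    row_splits = [0]
    for row in rows:
        values.extend(row)
        row_splits.append(len(values))  # where the next row starts
    return values, row_splits


def get_row(values, row_splits, i):
    """Recover row i from the flat encoding."""
    return values[row_splits[i]:row_splits[i + 1]]


sentences = [
    ["hello", "world"],
    ["tensorflow", "text", "tokenizes", "strings"],
    [],  # an empty input still gets its own row
]
values, row_splits = to_ragged(sentences)
print(row_splits)
# [0, 2, 6, 6]
print(get_row(values, row_splits, 1))
# ['tensorflow', 'text', 'tokenizes', 'strings']
```

No padding tokens are stored, and each row's length is simply the difference between two neighboring boundaries.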
The news comes just days after the beta release of TensorFlow 2.0. The latest version of Google’s open source framework was released in alpha in March at the TensorFlow Dev Summit. TensorFlow 2.0 brings a slimmed-down set of APIs, deeper Keras integration, and runtime improvements for eager execution.
TensorFlow.Text is the latest dedicated library introduced by Google in the past few months to help people accomplish specific tasks with machine learning. TensorFlow Graphics was released last month and is designed to bring more deep learning to graphics and 3D models.
Perhaps the most popular of these libraries is TensorFlow Lite for embedded devices, which is now used on more than 2 billion devices, Google said earlier this year. Google uses TensorFlow Lite to power things like speech detection on Gboard and edge detection in Google Photos.