The computer science grad students trained image classification networks to determine whether a dog is sitting, standing, or lying down. If a dog responds to a command by adopting the correct posture, the machine dispenses a treat.
The students used an Nvidia Jetson edge AI platform for real-time trick recognition and treat dispensing. Stock and Cavey see their prototype system as a dog trainer’s aid (it handles the treats) or a way to school dogs on better behavior at home.
“We’ve demonstrated the potential for a future product to come out of this,” Stock said in a statement.
Fetching dog training data
The researchers needed dog images that exhibited the three specified postures. They found the Stanford Dogs dataset, which has more than 20,000 images of various sizes depicting dogs in many positions. The images required preprocessing, so they wrote a program to help label them quickly.
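The article does not describe the labeling program itself, so the following is only a minimal sketch of what such a helper might look like. The key mapping, the `label_images` function, and the CSV layout are all assumptions made for illustration, not the researchers' tool.

```python
import csv
import io

# Hypothetical key bindings: one keystroke per posture.
KEY_TO_LABEL = {"s": "sitting", "t": "standing", "l": "lying"}


def label_images(image_paths, ask, out_file):
    """Ask for one key per image and write (path, label) rows as CSV.

    `ask` is any callable that takes a path and returns a key; in an
    interactive session it could display the image and read a keypress.
    Unrecognized keys skip the image so unclear photos stay out of the set.
    """
    writer = csv.writer(out_file)
    writer.writerow(["path", "label"])
    labeled = 0
    for path in image_paths:
        label = KEY_TO_LABEL.get(ask(path).strip().lower())
        if label is None:
            continue
        writer.writerow([path, label])
        labeled += 1
    return labeled


# Scripted answers stand in for a person pressing keys.
answers = {"dog1.jpg": "s", "dog2.jpg": "x", "dog3.jpg": "L"}
buffer = io.StringIO()
count = label_images(answers, answers.get, buffer)
print(count)
print(buffer.getvalue())
```

Passing the prompt in as a callable keeps the bookkeeping separate from whatever image viewer is used, which also makes the helper easy to check with scripted answers.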
In an email to VentureBeat, Nvidia said, “It doesn’t yet work remotely; it’s currently for in-person. But that would be an easy setup to make it a remote system. You might think of it as a system, or IP, to license for devices like the Furbo. The researchers see many possible applications but haven’t committed to anything yet.”
To refine the model, the researchers used transfer learning, reusing features of dogs learned from ImageNet. Next, they applied post-training and optimization techniques to boost speed and reduce model size.
For optimizations, they tapped into Nvidia’s JetPack software development kit on Jetson, a lightweight AI platform for drones and other systems. JetPack offers an easy way to get things up and running quickly and to access the TensorRT and cuDNN libraries, Stock said. Nvidia TensorRT optimization libraries offered “significant improvements in speed,” he added.
Tapping into the university’s computing system, Stock trained the model overnight on two 24GB Nvidia RTX 6000 graphics processing units (GPUs).
Deployed models on Henry
The researchers tested their models on Henry, Cavey’s Australian Shepherd. The model achieved accuracy of up to 92% in tests and demonstrated an ability to make split-second inference at nearly 40 frames per second.
Using the Jetson Nano, the system makes real-time decisions about dog behaviors and reinforces correct responses by transmitting a signal to release a treat.
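As a rough illustration of how per-frame predictions could be turned into a treat decision, here is a minimal sketch. The `TreatDecider` class, the voting window, and the confidence threshold are assumptions for this example; the article does not say how the prototype decides when to send its signal.

```python
from collections import Counter, deque

POSTURES = ("sitting", "standing", "lying")


class TreatDecider:
    """Reward a dog once the commanded posture dominates recent frames."""

    def __init__(self, window=10, min_votes=8, min_confidence=0.6):
        self.recent = deque(maxlen=window)  # per-frame predicted labels
        self.min_votes = min_votes
        self.min_confidence = min_confidence

    def update(self, probabilities, command):
        """Record one frame; return True if a treat should be released now."""
        best = max(range(len(POSTURES)), key=lambda i: probabilities[i])
        # Low-confidence frames are stored as None so they cannot vote.
        confident = probabilities[best] >= self.min_confidence
        self.recent.append(POSTURES[best] if confident else None)
        votes = Counter(self.recent)[command]
        if votes >= self.min_votes:
            self.recent.clear()  # avoid rewarding the same pose twice
            return True
        return False


# Two "standing" frames followed by four "sitting" frames.
decider = TreatDecider(window=5, min_votes=4)
frames = [(0.1, 0.8, 0.1)] * 2 + [(0.9, 0.05, 0.05)] * 4
results = [decider.update(p, "sitting") for p in frames]
print(results)  # [False, False, False, False, False, True]
```

Voting over a short window means a single misclassified frame neither triggers nor blocks a reward, which matters when the classifier runs at tens of frames per second.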
“We looked at Raspberry Pi and Coral, but neither was adequate, and the choice was obvious for us to use Jetson Nano,” Cavey said.
Explainable AI helps provide transparency about the makeup of neural networks. It’s becoming more common in the financial services industry as a way to understand fintech models. Stock and Cavey included model interpretation in their paper to provide explainable AI for the pet industry.
They do this with images from the videos that show the posture analysis. One set of images relies on Grad-CAM, a common technique for displaying where a convolutional neural network model is focused. Another set explains the model with Integrated Gradients, which attributes a prediction to individual pixels.
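To give a sense of what Integrated Gradients computes, here is a minimal sketch on a toy three-input logistic model rather than an image network. The model, weights, and all-zero baseline are assumptions chosen for illustration. The method averages the model's gradient along a straight path from the baseline to the input and scales it by the difference between the two.

```python
import math


def model(x, weights, bias):
    """Toy stand-in for a classifier: logistic score for one class."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def model_gradient(x, weights, bias):
    """Analytic gradient of the logistic score with respect to each input."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Approximate the path integral with a midpoint Riemann sum."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_gradient(point, weights, bias)
        totals = [t + g for t, g in zip(totals, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


weights, bias = [2.0, -1.0, 0.5], 0.1  # assumed toy parameters
pixels = [0.9, 0.2, 0.4]               # the "image" being explained
baseline = [0.0, 0.0, 0.0]             # all-black reference input
attributions = integrated_gradients(pixels, baseline, weights, bias)
gap = model(pixels, weights, bias) - model(baseline, weights, bias)
print(attributions, sum(attributions), gap)
```

The attributions sum, up to numerical error, to the difference between the model's output on the input and on the baseline, so the printed sum and gap should nearly match. In practice, a library implementation would be applied to the trained network instead of a hand-written gradient.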
The researchers said it was important to create a trustworthy and ethical component of the AI system for trainers and general users. Otherwise, there’s no way to explain the methodology, should it come into question.
“We can explain what our model is doing, and that might be helpful to certain stakeholders — otherwise how can you back up what your model is really learning?” Cavey said.