Perhaps the most important feature of the new Amazon Fire Phone announced this week is Firefly, in which the handset uses its camera to recognize physical and media products in the real world and then links them to buying options on Amazon. It’s what makes the Fire Phone a shopping phone.
When Firefly is scanning an item, like a box of Cheerios, you’ll see bright little dots (fireflies) on the screen gathering around the item in the camera’s view. Once Firefly has recognized the product, a link to additional information (metadata, buying options) then appears at the bottom of the screen.
When Firefly is scanning the image, it pulls out only the uniquely identifiable pieces of the image. When scanning a box of Cheerios, Firefly would pull the outline and color of the box and the item’s logo.
The outlines of those shapes, not the whole image, zip to Amazon’s cloud servers, where the system looks for a match from among the thousands of images in its product catalog, according to Fire Phone product manager Cameron Janes.
If the product that you scan does not appear in the Amazon product catalog, Firefly won’t recognize it. The main reason for Firefly, after all, is to help people recognize items that they can then easily buy on Amazon. Ultimately, the Fire Phone is more or less a vending machine for Amazon products.
To find the right match without going through millions of product images in the Amazon product database, Janes says his company uses “heuristic approaches and computer vision” to narrow down the results.
The algorithm might look for certain characteristics of the scanned image, such as its color or the shape of the bounding box around it, and then look only at images with similar general characteristics for matches.
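As a rough illustration of that narrowing step, here is a minimal Python sketch. It is not Amazon’s implementation: the `CatalogImage` fields, the aspect-ratio tolerance, the set-overlap score, and the toy catalog are all assumptions made up for this example. The idea is simply to discard catalog images whose coarse traits (color, bounding-box shape) differ from the scan, and only then run a more detailed comparison on what is left.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogImage:
    # All fields are hypothetical stand-ins for whatever Firefly really extracts.
    product_code: str
    dominant_color: str
    aspect_ratio: float   # width / height of the bounding box
    features: frozenset   # detailed traits, e.g. logo or outline fragments


def coarse_filter(scan, catalog, ratio_tolerance=0.15):
    """Keep only catalog images that share the scan's general characteristics."""
    return [
        item for item in catalog
        if item.dominant_color == scan.dominant_color
        and abs(item.aspect_ratio - scan.aspect_ratio) <= ratio_tolerance
    ]


def similarity(a, b):
    """Jaccard overlap between two feature sets (an assumed scoring rule)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(scan, catalog, min_score=0.5):
    """Run the detailed comparison only on the narrowed candidate list."""
    candidates = coarse_filter(scan, catalog)
    best = max(
        candidates,
        key=lambda item: similarity(scan.features, item.features),
        default=None,
    )
    if best is None or similarity(scan.features, best.features) < min_score:
        return None  # nothing in the catalog is close enough
    return best


# Toy catalog and scan, invented for illustration only.
CATALOG = [
    CatalogImage("cereal-001", "yellow", 0.70, frozenset({"logo", "bowl", "heart"})),
    CatalogImage("cereal-002", "yellow", 0.72, frozenset({"logo", "bee"})),
    CatalogImage("soap-001", "blue", 0.70, frozenset({"logo", "wave"})),
]
SCAN = CatalogImage("", "yellow", 0.68, frozenset({"logo", "bowl", "heart", "spoon"}))

match = best_match(SCAN, CATALOG)
print(match.product_code if match else "no match")
```

In this toy run the blue item is dropped before any detailed comparison happens, which is the point of the heuristic: the expensive step sees two candidates instead of three. A scan with no close counterpart in the catalog returns `None`, mirroring the fact that Firefly won’t recognize products Amazon doesn’t list.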
Once Firefly finds the match, which usually takes only a few seconds, it displays a link at the bottom of the screen that leads to other content related to the product. It finds this content by using a code that is shared by the matched product and all of its metadata and related material.
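The shared-code lookup can be pictured as a simple index keyed by product code. The sketch below is an assumption-laden illustration, not Amazon’s data model: the record layout, the code values, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical related-content records: (product_code, kind, label).
RELATED_RECORDS = [
    ("cereal-001", "buy", "Buying options"),
    ("cereal-001", "info", "Product details"),
    ("soap-001", "buy", "Buying options"),
]


def index_by_code(records):
    """Group every piece of related material under its product code."""
    index = defaultdict(list)
    for code, kind, label in records:
        index[code].append((kind, label))
    return index


def related_content(index, product_code):
    """Return everything that carries the same code as the matched product."""
    return index.get(product_code, [])


INDEX = index_by_code(RELATED_RECORDS)
for kind, label in related_content(INDEX, "cereal-001"):
    print(kind, label)
```

Because every record carries the same code as the product, a single lookup after the match is enough to assemble the links shown at the bottom of the screen.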
Scans are stored until you delete them
My colleague, John Koetsier, wrote a story this week called “Amazon’s Fire Phone might be the biggest privacy invasion ever (and no one’s noticed)” about the privacy implications of the new Fire Phone. But Amazon says it doesn’t use Firefly scans for anything other than the shopper’s convenience and for improving the online retailer’s product-detection chops.
Amazon uses these images to help Firefly recognize similar images faster and with more accuracy. Here’s how the company explained it to me in a follow-up email after my 40-minute interview with Janes:
“As Jeff described in his presentation, computer vision is a complex problem and the more data we have to train our image recognizers, the more accurate the Firefly technology will be for customers,” Janes wrote. “For example, if we have multiple, different images of a single item, the recognizer can use common qualities among the images to more accurately identify the item.”
Firefly stores all of its scans on Amazon’s servers until you go to a storage-settings page on the phone and delete the images.
Firefly does not geo-tag scans of products, Amazon says, but it does geo-tag scans of phone numbers because it wants to be able to attach area codes to numbers that don’t have them. Firefly will do this only if a customer has enabled location services during phone set-up. Users can disable location services at the Firefly app level, Amazon says. Firefly can also capture URLs and email addresses.
Nor are Firefly scans combined with personal photos taken with the phone’s camera app and stored in the Amazon servers. “The Firefly app has nothing to do with the camera app,” Janes said.
Music and TV
Firefly can also recognize and find matches for music and movies. But Firefly doesn’t actually watch video, so there’s no need to hold the camera up to the set. “We’re recognizing the audio in the scene to recognize what you’re watching,” Janes said.
As the phone listens to audio, the “fireflies” on the screen gather in a tight square and appear to vibrate together.
The sound sample zips to the Amazon cloud, where it is compared against all the sounds in songs, movies, and TV shows in the Amazon catalog. For music and movies, too, an algorithm detects general characteristics (is it music, dialogue, or both?) to narrow down the number of sound files it must compare the sample against.
When a match for a song is found, Firefly returns a series of links and metadata about the music. It might provide a link to buy the music at Amazon and a link to a third-party app like iHeartRadio, which will use the sound sample as a seed for a new radio station.
When a match for a TV show is found, Firefly identifies the episode and even the part of the episode that was scanned. The link at the bottom of the screen leads to IMDb data about the show and the actors, and to the streaming, download, and physical-copy buying options at Amazon.
Firefly can also identify live TV from 160 broadcast stations. This has nothing to do with selling products in the Amazon marketplace; once Firefly finds a match, it returns information about the show and the people in it from the IMDb database. It also provides some options for sharing the content via social media.