MIT project lets you author code with screenshots, pictures

Programming Luddites may have less to fear in the future.

A new MIT project called Sikuli allows people to program using screenshots in lieu of written code. Basically, it lets you reference user interface elements like a Microsoft Word icon, the Trash can or a search bar with pictures of the button or icon instead of text in a script. (If you look at the picture to the right, you’ll see functions referencing icons and screenshots of buttons instead of text. It’s best explained in the video below.)

The idea is to make it dead-simple for casual computer users to write their own programs without having to know a programming language. Let’s say you’re building a location-based app that uses real-time bus locations to tell you when the next one is going to arrive. If the city transportation Web site keeps a map of where buses are at all times, you could take a screenshot of the map and instruct the program to send a notification when the marker pin reaches a certain point on the map, instead of entering latitude and longitude coordinates.
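To make the bus example concrete, here is a minimal sketch in plain Python of the underlying idea: find a small marker image inside a larger screenshot, then check whether it has reached a target spot. This is not Sikuli's own code or API. The names `find_template` and `marker_reached`, the grid-of-numbers image format and the `tolerance` parameter are all assumptions made for illustration, and the exact pixel comparison stands in for the fuzzier computer vision matching a real tool would need.

```python
from typing import List, Optional, Tuple

# Hypothetical image format: a grid of grayscale pixel values, row by row.
Image = List[List[int]]


def find_template(screen: Image, template: Image) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the first exact match of template in screen, or None."""
    th, tw = len(template), len(template[0])
    sh, sw = len(screen), len(screen[0])
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            # Compare the template row by row against the window at (r, c).
            if all(screen[r + dr][c:c + tw] == template[dr] for dr in range(th)):
                return (r, c)
    return None


def marker_reached(screen: Image, marker: Image,
                   target: Tuple[int, int], tolerance: int = 1) -> bool:
    """True if the marker's top-left corner is within `tolerance` pixels of target."""
    pos = find_template(screen, marker)
    if pos is None:
        return False
    return (abs(pos[0] - target[0]) <= tolerance
            and abs(pos[1] - target[1]) <= tolerance)


# Toy "map screenshot": a blank 5x6 grid with a 2x2 marker pin drawn at (2, 3).
marker = [[9, 9], [9, 9]]
screen = [[0] * 6 for _ in range(5)]
for dr in range(2):
    for dc in range(2):
        screen[2 + dr][3 + dc] = 9

if marker_reached(screen, marker, target=(2, 4)):
    print("Bus is near the stop, send a notification")
```

In a real setting the screenshot would be captured repeatedly and the comparison would have to tolerate small visual differences, which is where the computer vision work comes in.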

Sikuli catalogues user interface elements from online tutorials, computer books and documentation. It analyzes the text surrounding the icons in that documentation, performs optical character recognition and uses computer vision techniques to figure out what each visual element does.

The team behind Sikuli, which means ‘God’s Eye’ in the language of Mexico’s Huichol Indians, has also found other uses for the technology. You could use it as a visual search engine when you’re lost and don’t know what to do in a specific application. If you’re clueless about what a certain button does, you could just take a screenshot of it and use that to search for help. Google’s Goggles mobile app has similar functionality, but for landmarks, logos and business cards.

