Etleap, a stealthy startup that has gone through the prominent Silicon Valley accelerator Y Combinator, is launching today with a cloud service for easily cleaning up data from several sources and sticking it all into a data warehouse such as Redshift (from public cloud provider Amazon Web Services).
Etleap founder and chief executive Christian Romming was inspired to build something like this as a result of his experience at ad tech company VigLink, where he was chief technology officer. “Engineers were frustrated that they had to spend time on this [the extract-transform-load (ETL) process], and analysts were frustrated they didn’t have robust and timely access to data,” Romming told VentureBeat in an interview. Rather than buy expensive data preparation software, VigLink chose to work with its own system.
Now, three years after leaving VigLink, he and a small team have a tool that can help many companies. Customers include CrowdFlower, Kabam, and PagerDuty. Starting today, companies can sign up to use the tool via Etleap’s website.
Data cleaning is an old and arguably boring process. The heavyweight in the market is Informatica, which went private earlier this year. Talend has sought to focus more on the Hadoop open source big data software. Startups like Trifacta and Paxata have made this area exciting in the past couple of years.
Etleap works only as a cloud service, which reflects companies wanting to run their data warehouses in the cloud, Romming said. Over time, the team could make a similar version available for companies to run in their own on-premises data centers.
Etleap allows users to send in data from databases and other cloud services, including Amazon S3, Google Analytics, the Hadoop Distributed File System (HDFS), HTTP events, Marketo, MongoDB, MySQL, PostgreSQL, Salesforce, Segment, Singular, and Upsight. For now Etleap can only send data into Redshift and Apache Hive.
The software lets users set up data transformations visually by dragging and dropping. The system maintains multiple transformations for data sources at once, and a dashboard shows how they are running. For one thing, the dashboard shows the latency for each pipeline, so you can get an idea of how far the data warehouse lags behind the main data source.
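As a rough illustration of what a per-pipeline latency figure means, the Python sketch below computes how far a warehouse trails its source. It is not Etleap's implementation: the pipeline names, timestamps, and the `pipeline_latency` and `latency_report` helpers are all hypothetical, and the sketch assumes latency is simply the gap between the newest record seen at the source and the newest record loaded into the warehouse.

```python
from datetime import datetime, timedelta, timezone


def pipeline_latency(latest_source: datetime, latest_loaded: datetime) -> timedelta:
    """Return how far the warehouse trails the source for one pipeline.

    Assumption: latency is the gap between the newest source record and the
    newest record already loaded. It is clamped at zero so that clock skew
    never produces a negative value.
    """
    return max(latest_source - latest_loaded, timedelta(0))


def latency_report(pipelines: dict) -> dict:
    """Map each pipeline name to its latency.

    `pipelines` maps a name to a (latest_source, latest_loaded) tuple.
    """
    return {
        name: pipeline_latency(source, loaded)
        for name, (source, loaded) in pipelines.items()
    }


# Hypothetical example data: the names and timestamps are made up.
now = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
example_pipelines = {
    "mysql_orders": (now, now - timedelta(minutes=5)),
    "salesforce_leads": (now, now - timedelta(hours=2)),
}

for name, lag in latency_report(example_pipelines).items():
    print(f"{name}: warehouse is {lag} behind the source")
```

A dashboard built on a figure like this would let a user spot at a glance which pipelines are keeping up and which have fallen behind.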
Based in San Francisco, Etleap started in early 2013 and now has around four employees. The startup participated in Y Combinator’s winter 2013 batch.