E-commerce giant eBay needs to act on new usage data within seconds, to personalize content and detect fraud, among other things. So its engineers built a system to meet those needs: Pulsar.
The company revealed details about the system for the first time today, and eBay is making it available for anyone to use under an open-source license.
“Pulsar can be used to collect and process user and business events in real time, providing key insights and enabling systems to react to user activities within seconds,” eBay’s Sharad Murthy and Tony Ng wrote in a blog post today on the new system.
Google, for its part, has built a stream-processing system to meet its own data needs.
For eBay, a batch processing system — like the MapReduce framework at the heart of the Hadoop open-source software for storing and processing lots of different kinds of data — is no longer sufficient on its own.
“The sheer data volume and the low latency requirements demand in flight data processing instead of a store and process model as in batch oriented systems,” eBay’s Murthy, Ng, Bhaven Avalani, Xinglang Wang, Ken Wang, and Anand Gangadharan wrote in a new paper on Pulsar.
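To make the distinction in that quote concrete, here is a minimal Python sketch contrasting the two models. It is not Pulsar's API or eBay's code. The names `batch_counts` and `InFlightCounter`, the 10-second window, and the sample events are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict, deque


def batch_counts(stored_events):
    """Store-and-process: events are collected first, then scanned in one pass."""
    counts = defaultdict(int)
    for user, _timestamp in stored_events:
        counts[user] += 1
    return dict(counts)


class InFlightCounter:
    """In-flight: keep a per-user count over a sliding time window,
    updated as each event arrives (hypothetical illustration only)."""

    def __init__(self, window=10.0):
        self.window = window
        self.recent = defaultdict(deque)  # user -> timestamps still inside the window

    def on_event(self, user, timestamp):
        times = self.recent[user]
        times.append(timestamp)
        # Discard timestamps that have fallen out of the window.
        while times and times[0] <= timestamp - self.window:
            times.popleft()
        return len(times)


# Hypothetical (user, timestamp-in-seconds) events.
events = [("alice", 0.0), ("bob", 1.0), ("alice", 2.0), ("alice", 15.0)]

# Batch: an answer is available only after every event has been stored.
batch_result = batch_counts(events)

# In-flight: an up-to-date answer is available after each individual event.
counter = InFlightCounter(window=10.0)
stream_result = [counter.on_event(user, ts) for user, ts in events]
```

In the batch version nothing can react until the full pass finishes, while the in-flight version produces a fresh per-user count on every event, which is the kind of result a system would need in order to respond to user activity within seconds.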
The system integrates with existing open-source tools like the Cassandra NoSQL database and the Druid data store.
Inside eBay, the system is widely used.
“Several teams within eBay have successfully built solutions leveraging our platform, solving problems like in-session personalization, advertising, internet marketing, billing, business monitoring and many more,” the eBay employees wrote in the Pulsar paper.