Shopzilla buys into big data for inventory management

Why Shopzilla moved its inventory platform from a traditional RDBMS to VoltDB

With a user base of more than 40 million shoppers worldwide, Shopzilla is a leader in connecting buyers and sellers online. Each month, through both its destination websites and affiliate network, Shopzilla connects shoppers with more than 100 million products from tens of thousands of retailers. To provide the most recent inventory of products and prices, Shopzilla moved its inventory platform from a traditional relational database to VoltDB, a specialized, open source database for high-velocity big data.

Shopping the virtual mall

If you shop online, chances are you've landed at Shopzilla or its portfolio of brands -- including Bizrate, Beso, Retrevo, TaDa, RobotOatmeal, and others -- which help shoppers worldwide discover, compare, and purchase products.

Shopzilla's core value is that it presents an up-to-date inventory of products and prices, so shoppers don't need to surf endlessly across different sites. But until recently Shopzilla's inventory platform, which ingests terabytes of data from merchant feeds on a daily basis, was powered by a traditional relational database. The software, hardware, and administration resources needed to support high-velocity inventory updates were becoming prohibitively costly.

While researching database vendors, Shopzilla compared a number of NoSQL and sharded MySQL products before selecting VoltDB, an in-memory relational database designed specifically for high performance. The intent was to narrow the data "ingestion-to-decision gap" by running thousands of writes and tens of thousands of reads per second, while performing real-time tracking.


Getting up to speed

Working with VoltDB, Shopzilla significantly boosted the rate at which it can process inventory data and derive actionable intelligence. Higher-velocity updates drive revenues by delivering near-real-time information to consumers and by passing along more highly targeted leads to the thousands of retailers paying Shopzilla on a pay-per-click basis.

On a simple three-node evaluation cluster supporting full durability, Shopzilla achieved 80,000 to 100,000 writes per second with VoltDB. Once fully optimized and in production, that level of performance helped eliminate complicated caching and data pre-loading processes, simplifying Shopzilla's architecture and allowing it to interact directly with the database. The company also used VoltDB's point-in-time snapshot capability as a faster way to export inventory data for further analysis. In addition, Shopzilla gained the ability to filter offers coming into its system, removing duplicates and reducing the transactional load downstream from 2,500 TPS to 650 TPS, allowing the company to save on hardware and operational expenses.
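The offer-filtering step described above can be illustrated with a minimal Python sketch. It is not Shopzilla's or VoltDB's actual implementation: the Offer fields, the filter_changed_offers function, and the in-memory last_seen dictionary are all hypothetical names chosen for illustration. The idea is simply to forward an offer downstream only when it differs from the last version seen for the same merchant and product. The load_reduction helper works through the arithmetic on the figures quoted in the article.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Offer:
    # Hypothetical offer record; the real feed schema is not described in the article.
    merchant_id: str
    sku: str
    price_cents: int
    in_stock: bool


def filter_changed_offers(
    offers: Iterable[Offer],
    last_seen: Dict[Tuple[str, str], Offer],
) -> Iterator[Offer]:
    """Yield only offers that differ from the last version seen for the same key."""
    for offer in offers:
        key = (offer.merchant_id, offer.sku)
        if last_seen.get(key) == offer:
            continue  # exact duplicate: nothing new to write downstream
        last_seen[key] = offer
        yield offer


def load_reduction(before_tps: float, after_tps: float) -> float:
    """Fraction of downstream transactions removed by filtering."""
    return 1.0 - after_tps / before_tps


# Made-up sample feed: two of the five offers repeat an earlier, unchanged offer.
feed = [
    Offer("m1", "a", 1999, True),
    Offer("m1", "a", 1999, True),   # duplicate
    Offer("m2", "b", 4999, True),
    Offer("m1", "a", 1899, True),   # price change, so it is forwarded
    Offer("m2", "b", 4999, True),   # duplicate
]

forwarded = list(filter_changed_offers(feed, {}))
print(len(forwarded), "of", len(feed), "offers forwarded")

# Using the article's figures: 2,500 TPS before filtering, 650 TPS after.
print(f"downstream load reduced by {load_reduction(2500, 650):.0%}")
```

With the article's numbers, dropping from 2,500 TPS to 650 TPS means roughly 74 percent of the downstream transactions are no longer needed.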

VoltDB's stored SQL procedures allow Shopzilla to identify and fix errors in application updates before deployment. Shopzilla is also able to continuously update its sales and consumer feedback for online merchants and retail advertisers based on those real-time analytics.

Since switching to VoltDB, Shopzilla has achieved its first milestone of rapid feed ingestion with a five-fold increase in performance. It is the first step toward reducing the overall latency in delivering accurate product and pricing information.

This article, "Shopzilla buys into big data for fast meta-shopping," is from Andrew Lampitt's Think Big Data blog.

This story, "Shopzilla buys into big data for inventory management" was originally published by InfoWorld.