MongoDB CTO on cloud database inroads and riding the developer wave

MongoDB was the original NoSQL open source upstart. What began as a small experiment in developer-friendly document model storage grew into one of the most established players in data processing. Now it’s a big, publicly traded company with a product that has grown more powerful over the years.

The core of the product is still a NoSQL document store, but the company has continually added to that core and adapted it to developer needs. Today, developers who want to use SQL with MongoDB can send SQL queries that MongoDB will unpack and execute. It's not an experiment for the curious anymore -- it's a big, well-rounded tool that's designed to handle all of the data storage tasks that a modern enterprise might have.

VentureBeat sat down with Mark Porter, the CTO of MongoDB, to see how the company has been evolving, to discern its role in the database world, and to understand where company leaders and community members are taking the platform from here.

This interview has been edited for length and clarity.

VentureBeat: What do you believe developers are looking for?

Mark Porter: One of our holy grails is to focus on native integration into their languages. So rather than some of the other database vendors who make the developers call interfaces that look identical in Rust or C or whatever, our drivers actually are in the native language. We actually built 12 different drivers, to tell you the truth.

Developers just type in their own data structures and then the data is just committed to the database.
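As a rough illustration of what Porter describes, here is a minimal Python sketch in which an application's own data structures become a nested document with no SQL mapping step. The `Order` and `LineItem` classes, the sample values, and the database and collection names in the comment are hypothetical; the commented-out call assumes the PyMongo driver and a reachable MongoDB server.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class LineItem:
    sku: str
    quantity: int


@dataclass
class Order:
    customer: str
    items: list[LineItem] = field(default_factory=list)


def to_document(order: Order) -> dict:
    """Turn the application's own object into a nested, JSON-like document."""
    # asdict() recurses into nested dataclasses, so line items are embedded
    # inside the order rather than split into a separate table.
    return asdict(order)


order = Order(customer="store-0042", items=[LineItem("coffee-12oz", 3)])
document = to_document(order)

# With PyMongo installed and a server running, the document could be stored
# as-is (connection string, database, and collection names are placeholders):
#   from pymongo import MongoClient
#   MongoClient("mongodb://localhost:27017")["shop"]["orders"].insert_one(document)
```

The point of the sketch is that the embedded list of items travels with the order as one document, which is the shape the application code already works with.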

One of the things I said in my blog when I joined MongoDB was that no developer actually wants to learn SQL. They actually just want to code. The early 2010s fascination with infrastructure and AWS, and clouds and all that? That's all passed away. Now, developers are like, 'Yeah, that's all a commodity. It used to be a special thing.' Now, the special thing is to write applications faster.

So, we remove provisioning and monitoring and management and security patching and all that from developers' to-do list because, at the end of the day, their manager is asking them to get that app done and live on the site.

VentureBeat: Which kind of use cases are ideal for the Mongo document model?

Porter: We believe that for most purposes -- including all of the mission-critical ones -- it is an ideal general-purpose database. The convenience store 7-Eleven actually uses us as an end-to-end solution for retail where they have mobile devices in people's hands in the stores. And they manage all their inventory, they manage their stores, they manage their hours, they manage all these different features of 7-Eleven at over 8,500 stores. All that data is synced and backed up to MongoDB in the cloud. It syncs to all the other associates and all the reporting systems automatically. All they did was write a mobile application and the whole rest of it was handled for them.

VentureBeat: So the document model makes it flexible enough for any general application, and schema-free flexibility is a feature?

Porter: In fact, MongoDB is not schema-less. There's a schema there just like in a relational database. It's just that ours is more flexible. Now you can lock down your schema for the production app. We have JSON schema enforcement where you can say: 'No more fields' or 'these fields have to have values like this.' You can do all that same stuff. But by embedding stuff, the way we do, we actually find that the code that developers are writing is significantly shorter.
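To make the "no more fields" and "values like this" rules concrete, here is a hedged sketch of a `$jsonSchema` validator written as a Python dictionary. The field names and the `orders` collection are hypothetical. The commented-out call assumes the PyMongo driver and a running server.

```python
# A validator document in MongoDB's $jsonSchema form (field names are
# illustrative only).
order_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        # These fields must be present in every document.
        "required": ["customer", "quantity"],
        # "No more fields": reject anything not listed under "properties".
        # _id is listed explicitly so that ordinary inserts still pass.
        "additionalProperties": False,
        "properties": {
            "_id": {"bsonType": "objectId"},
            "customer": {"bsonType": "string"},
            # "Values like this": quantity must be an integer of at least 1.
            "quantity": {"bsonType": "int", "minimum": 1},
        },
    }
}

# With PyMongo and a server available, the rules could be attached when the
# collection is created (database handle and name are placeholders):
#   db.create_collection("orders", validator=order_validator)
```

Leaving out the validator gives the flexible behavior Porter mentions; adding it later is how a team might lock a schema down for production.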

VentureBeat: Do all of the field tags end up bloating the database?

Porter: It's actually not true. All that stuff is compressed, and it really doesn't affect us. So we've done overhead comparisons between WiredTiger, which is our storage engine, and relational storage and frankly, there's really not any kind of significant difference either up or down. It's not really there. We don't see that. We think that's kind of something people bring up that's not noticeable.

VentureBeat: MongoDB is moving into the cloud like everyone else. How did the company approach the pricing model? What did you decide to emphasize with your pricing model?

Porter: So, first off, we have a perpetual free tier, which allows people to come to us, and use this free on one of our smaller machines forever. And people love that. That's number one. Number two is, we really focus on making things incredibly easy and giving people choices around storage, around compute size, and memory.

Now, just like everybody else, you know, we are slowly moving into managing more of that infrastructure for customers. Our pricing model right now is a combination of compute, storage, and then, obviously, network bandwidth. That's our pricing model, and we believe we are completely competitive.

VentureBeat: How do you see the company competitively today in relation to big cloud providers?

Porter: We partner with and compete with the cloud vendors. Our technology is better, available in more places, and moving faster than the cloud providers. We see ourselves as a general purpose data platform for modern applications, as organizations are looking to reduce the complexity of their data layer and innovate faster. Our document data model is a superset of all data models and is versatile enough to serve the majority of application use cases, and we've expanded into a platform offering with Search, Analytics, Mobile, and Storage products. And our distributed database architecture allows enterprises to scale more easily and have more highly available systems.

For companies looking to build a competitive advantage, they need to maximize the productivity of their developers as working with data -- or worse, working around data -- is often a developer's most challenging problem and largest drain on their productivity. The key to MongoDB's technological differentiation is that our product is built to be the exact same experience in every environment whether you're running it on your laptop, a mainframe, or in the cloud. You don't need to rewrite a single line of code if you want to move from your own datacenter to the public cloud, move between clouds, or any other scenario.

VentureBeat: Are there any licensing issues or landmines for companies that want to use MongoDB as a commercial cloud?

Porter: To clarify, MongoDB is not a cloud provider -- we offer services at the next higher-value layer in the stack. We offer a database-as-a-service offering called MongoDB Atlas which is available in 80 regions across AWS, Azure, and Google Cloud. The majority of MongoDB usage comes through the public cloud whether it's from Atlas or our other versions (Enterprise Advanced and Community) that companies and developers self-manage.

VentureBeat: How is this reflected in the license?

Porter: MongoDB made a structural change to its license in 2018, moving to what we call the Server Side Public License (SSPL). SSPL basically mandates that it's still a free-to-use license, with the exception that if you plan to monetize a MongoDB-as-a-service offering, you can still do that, but you have to open source not just the code but also the management plane, so that anyone can take that code and also offer MongoDB as a service. Or, you can come and talk to us, like Tencent, Alibaba, IBM, and others have done -- and they are very successful offering our core MongoDB product as a managed service on their platforms.

Since we made that change, we've seen the majority of other open source startups also make licensing changes, and some have adopted the SSPL. It hasn't harmed our popularity or relationships with customers or partners. We've had 75 million downloads of our free product in the last 12 months, more than in the entire first decade of the company combined. Our open community is alive and well -- and we remain the database that developers most want to use, as voted by developers on Stack Overflow, four years running.

VentureBeat: What is coming up next for MongoDB?

Porter: We had this database thing and now we've added on search. But we added that differently than anybody else. We added Lucene search directly on the nodes that have the MongoDB data, so there's no duplication of data. You can stand up search with a click on the console. I think we've timed someone at 15 minutes from start to finish. At the end of 15 minutes, they're showing me a screen that has Lucene searching against that data.

VentureBeat: No duplication?

Porter: To stand up search, we stand up the search engine right on those same machines, and it accesses the MongoDB files for the data. Now it does have to build its own listing indexes, obviously, right? Because it has its indexes, but the actual core data is never duplicated.

The next area is the Atlas Data Lake. We can create buckets for you where you can move the data out of your online data store and you can still query that data. [Transaction processing] storage is more expensive than S3 or Azure blob storage, right?

We got this brilliant idea, which was: Why are we just using our own buckets? And so with that Atlas Data Lake, you can bring in your own buckets and we will actually let you federate queries with those buckets. That's important if you want to get a fraud verdict in under 400 milliseconds, which is the industry standard that I'm aware of for getting a fraud verdict on whether you should let a transaction go through. It's important because real-time analytics is so important.
