TigerGraph seeks to democratize graph databases

This week, graph database provider TigerGraph announced version 3.2 of its key product. The release is aimed at boosting performance -- especially for larger datasets -- while broadening accessibility for users. The new edition increases support for enterprise-critical technologies, such as Kubernetes, while upgrading practical features like cross-region replication to improve reliability in the face of network or hardware failures.

Redwood City, California-based TigerGraph claims to be the only graph database provider that can scale to handle the extremely large datasets that are becoming more common in enterprises.

TigerGraph said its database can handle the 36TB LDBC-SNB (Linked Data Benchmark Council Social Network Benchmark) business intelligence benchmark, a particularly challenging graph with more than 70 billion nodes and more than 500 billion edges. This opens opportunities for companies with extremely large customer data collections to apply graph algorithms that uncover complex connections.

The new release includes enhancements to TigerGraph's development and access tools, such as the visual GraphStudio platform. TigerGraph uses the new query language, called GQL, which is going through an industry standardization process and now includes at least 30 more functions and language enhancements that make it simpler to answer more complex questions with fewer queries. The company has also enhanced batch processing for faster responses to the bigger queries that often run in the background.

The v3.2 graph database release enhances support for the work data scientists do. TigerGraph's ability to scale to large datasets is combined with enhancements to the query language to make it easier to tackle more complex questions inside the database itself without exporting data to another process, the company said. TigerGraph is also expanding a collection of its open source-based solutions that can be customized quickly.

TigerGraph's Jay Yu on graph databases

To understand the implications of this release and TigerGraph's plans heading into the future, VentureBeat sat down with product and innovation VP Dr. Jay Yu.

This interview has been edited for clarity and brevity.

VentureBeat: So you've got a list of dozens of new features. Is it possible to summarize them?

Jay Yu: It's really all about how we democratize graph adoption -- by companies of all sizes. That's really the key thing.

VentureBeat: How do you even begin?

Yu: We have this cool tool that's very visual called GraphStudio. Think about this: It's a visual studio for the graph developer. So we allow you to draw a node, connect the nodes with edges, and then you attach attributes. But the best part is this allows you to query and examine the data visually because it's a graph.

Once you do that, we visually show the result itself as a graph. Then we can add some simple details. You can select easily [by saying]: "Hey, find where Jay [is] in the result." We can highlight those nodes, make them more interactive, more usable. And finally, if your result contains latitude and longitude on those nodes, we can automatically show how those things appear on the map.

VentureBeat: So this will open things up to the non-developers?

Yu: Yes. There are a lot of business intelligence users. They don't want to do programming. They only want to do configuration, right? We have this new thing called Visual Query Builder. It will literally allow you to visually describe what you want.

Say I want to understand the relationship between two people. Say the first person is Jay and the other person is Peter. And I want to see how many other people are in the middle and connect them. We allow you to draw that intention of your query and some expressions. Again, we translate that into a query so it makes it much easier for business users to adopt it without the need to code.
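To make the Jay-to-Peter question concrete, here is a minimal sketch in plain Python of the kind of path search such a visual query might stand for. It is not TigerGraph's query language or the output of Visual Query Builder; the `knows` graph, the function name, and the extra people in it are hypothetical and exist only for illustration. The sketch uses a breadth-first search, which finds a path with the fewest intermediaries.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return a shortest list of nodes from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == goal:
                # Walk the parent links back to the start, then reverse.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical undirected "knows" graph, stored as adjacency lists.
knows = {
    "Jay": ["Samantha", "Alex"],
    "Samantha": ["Jay", "Peter"],
    "Alex": ["Jay"],
    "Peter": ["Samantha"],
}

path = shortest_path(knows, "Jay", "Peter")
people_in_between = path[1:-1]
print(path, people_in_between)
```

In this toy graph the only person between Jay and Peter is Samantha. A graph database would typically run the same kind of traversal inside the engine rather than in application code.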

VentureBeat: Graphs are definitely made for graphical interaction, so that's got to open it up to the average user. Where can they take it?

Yu: We focused on the AI and machine learning libraries. We already have about 50 out-of-box graph algorithms that are open source and already available. We precoded them for everybody. People can copy and paste, and they adjust for their task. Now we've added 25 more: an expansion of the Similarity and Centrality algorithms, a new category of Topological Link Prediction algorithms, and new support for things such as graph embedding.
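For readers unfamiliar with these algorithm families, the following is a rough sketch in plain Python of one similarity measure (Jaccard) and one topological link prediction score (common neighbors). It is not code from TigerGraph's library; the function names and the `friends` graph are assumptions made up for this example.

```python
def jaccard_similarity(graph, a, b):
    """Shared neighbors divided by all distinct neighbors of a and b."""
    neighbors_a, neighbors_b = set(graph[a]), set(graph[b])
    union = neighbors_a | neighbors_b
    return len(neighbors_a & neighbors_b) / len(union) if union else 0.0


def common_neighbors(graph, a, b):
    """Link prediction score: more shared neighbors suggests a likelier future edge."""
    return len(set(graph[a]) & set(graph[b]))


# Hypothetical undirected friendship graph.
friends = {
    "Jay": {"Samantha", "Alex"},
    "Peter": {"Samantha"},
    "Samantha": {"Jay", "Peter"},
    "Alex": {"Jay"},
}

similarity = jaccard_similarity(friends, "Jay", "Peter")
score = common_neighbors(friends, "Jay", "Peter")
print(similarity, score)
```

Jay and Peter share one neighbor out of two distinct neighbors overall, so the Jaccard similarity is 0.5 and the common-neighbors score is 1.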

VentureBeat: Embedding the graph so it's easier for machine learning?

Yu: Yes! A graph is just nodes and edges and a weight on edges, right? Graph embedding is translating from the graph format into a mathematical matrix model so you can take it into a machine-learning algorithm.
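As a simplified illustration of moving from nodes, edges, and weights to a matrix, the sketch below builds a weighted adjacency matrix in plain Python. This is the most basic matrix view of a graph and only a stand-in: learned graph embeddings usually produce a compact vector per node instead, and nothing here reflects TigerGraph's actual embedding feature. The node names and weights are hypothetical.

```python
def adjacency_matrix(nodes, weighted_edges):
    """Build a symmetric weighted adjacency matrix as a list of rows."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for source, target, weight in weighted_edges:
        matrix[index[source]][index[target]] = weight
        matrix[index[target]][index[source]] = weight  # treat edges as undirected
    return matrix


# Hypothetical nodes and weighted edges.
nodes = ["Jay", "Samantha", "Peter"]
edges = [("Jay", "Samantha", 1.0), ("Samantha", "Peter", 2.0)]

matrix = adjacency_matrix(nodes, edges)
for row in matrix:
    print(row)
```

Each row can then be handed to a machine-learning routine as a numeric feature vector for the corresponding node, which is the basic idea behind feeding graph structure into a model.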

VentureBeat: Does that help you unlock more of the signal in the data?

Yu: We say we can apply a fair amount of machine learning or AI directly onto the exported data, and it works well with the embedded graph. But at the same time, there are certain limitations. We've also found that there's a hybrid use case; that's why we call it hybrid integration. The data in the graph represents the deep connections. It's often a multi-hop connection. Think about it. We're talking Jay to Peter, but only because Samantha arranged the call. With normal machine-learning training, the algorithm doesn't see that deep relationship. It only sees one hop. It's very hard to train.

But if we add what human beings already know about these kinds of deep relationships stored in the graph, we can easily expose this. We can extract those hops out. We can precompute those relationships that are much deeper inside the graph and bring that to your machine-learning algorithm. That's game-changing because that will bring human knowledge in the graph to the machine-learning model. Before that, you could only rely on flat data structure features; you would never discover those insights.
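One way to picture the precomputing of deeper relationships is the sketch below, which turns multi-hop reachability into ordinary columns of a flat feature row. It is written in plain Python under stated assumptions: the `calls` graph, the function names, and the feature names are hypothetical, and this is not how TigerGraph implements the idea.

```python
def neighbors_within(graph, start, max_hops):
    """Return the nodes reachable from start in 1..max_hops hops, excluding start."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        # Expand one hop, keeping only nodes not reached before.
        frontier = {n for node in frontier for n in graph.get(node, ()) if n not in seen}
        seen |= frontier
    return seen - {start}


def build_features(graph, node):
    """Flatten graph structure into a feature row a tabular model can consume."""
    return {
        "node": node,
        "direct_contacts": len(neighbors_within(graph, node, 1)),
        "contacts_within_2_hops": len(neighbors_within(graph, node, 2)),
    }


# Hypothetical call graph: Jay reaches Peter only through Samantha.
calls = {
    "Jay": ["Samantha"],
    "Samantha": ["Jay", "Peter"],
    "Peter": ["Samantha"],
}

features = build_features(calls, "Jay")
print(features)
```

A model trained only on one-hop data would see Jay's single direct contact. The two-hop column brings the Jay-Samantha-Peter connection into the training data.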

VentureBeat: You've also added a collection of features to make it simpler to run bigger, enterprise-scale clusters using Kubernetes, right?

Yu: We already have a lot of enterprise-grade customers. One has a largish graph database with 15 billion nodes. It has 5-10 billion addresses, right? How can we keep adding enterprise tools? The new features will improve manageability and supportability -- all these things so that we become a really, really mature company that can support the largest customers ever.

We actually have a couple of features I can highlight. One is cross-region replication. People can configure that themselves out of the box. So if one of the regions of AWS is gone, you continue with the other regions.

The second one is Kubernetes support. We want to simplify cloud management, like the ability to spin up and shut down instances or reclustering. Kubernetes support is key for this release. Any TigerGraph image on any VM can actually be managed by Kubernetes.

The third area is what we call in-place expansion. When we notice that a customer grows their data really fast, sometimes they have to upgrade their machine. Now you can scale up with a rolling update.

We are by nature a cluster, right? Now you can actually double the size of the thing with one single command, instead of having to do a lot of manual work, where you spin up a new cluster and move data over.

And then, finally, one of the features we want to highlight is how we now allow you to control the query workload because TigerGraph supports both OLTP and OLAP queries. Anybody knows that with database systems, if you've mixed those two workloads together, they will impact each other. No matter what, right? So we're actually going to support that. We're going to allow you to actually say: "I want to dedicate this cluster for OLAP query only. But I want to direct all my smaller OLTP training query to another cluster." So that means we can do both in parallel while minimizing impact to each other.

VentureBeat: Where do you want this to take you in the next few years?

Yu: Ultimately, we want to get to petabyte-level. Right now, we have a 36TB limit. We're going to go to 100TB. When we go to petabyte-level, we completely rewrite the book on big data. Imagine your data lake in one big graph. That's ultimately the goal we want to go to.
