Graph platform Neo4j today announced the general availability of Neo4j 5, the latest version of its cloud-ready graph database. The release follows a string of milestones in 2021: surpassing $100 million in annual recurring revenue, closing a $325 million Series F financing round at a valuation of more than $2 billion (which the company calls “the largest funding round in database history”), and launching a free tier of its fully managed cloud service.
Neo4j 5 promises better ease of use and performance through improvements in its query language and engine, as well as automated scale-out and convergence across deployments. Jim Webber, chief scientist at Neo4j, discussed Neo4j 5 as well as the bigger picture in the graph market in an interview with VentureBeat.
Markets and Markets anticipates the graph database market will reach $2.4 billion by 2023, up from $821.8 million in 2018. Analysts at Gartner expect that enterprise graph processing and graph databases will grow 100% annually through 2022, facilitating decision-making in 30% of organizations by 2023. However, the graph market isn’t immune to the economic downturn and has its own intricacies as well.
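For context, the two Markets and Markets figures imply a compound annual growth rate that can be worked out directly. The following is a minimal sketch, assuming five full years of compounding between 2018 and 2023; the function name and variables are illustrative and do not come from the analyst report.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start value, an end value and a span in years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in millions of US dollars.
start_musd = 821.8    # 2018 market size
end_musd = 2400.0     # projected 2023 market size
years = 2023 - 2018   # assumption: five full years of compounding

cagr = implied_cagr(start_musd, end_musd, years)
print(f"Implied CAGR: {cagr:.1%}")
```

Under that assumption the result comes to a little under 24% a year, which is far below the 100% annual growth Gartner cites for graph processing, so the two forecasts are evidently measuring different things.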
Query language and performance improvements
This is the first major release for Neo4j in two-and-a-half years, following up on Neo4j 4 released in 2020. Back then, CEO Emil Eifrem identified ease of use as the major objective going forward. To help achieve that objective, Neo4j doubled its engineering workforce between versions 4 and 5, from 100 to 200 engineers. The increased engineering resources are allowing Neo4j to improve the developer experience in several areas, Webber said.
Webber said that Cypher, Neo4j’s query language, has evolved considerably in several ways. First, the Neo4j engineering and product management teams made “spontaneous improvements.” These mostly simplify pattern matching so that the language behaves more like what SQL users would expect. Cypher could already perform pattern matching, but the new syntax makes the code shorter and easier to read, Webber said.
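For readers unfamiliar with the term, graph pattern matching means finding every set of nodes connected in a given shape. The sketch below is a toy illustration in plain Python, not Neo4j’s engine and not the new Cypher syntax; the graph data and the `match_two_hops` helper are hypothetical. It finds two-hop paths of the kind a Cypher-style pattern such as `(a)-[:KNOWS]->(b)-[:KNOWS]->(c)` describes.

```python
# Hypothetical toy graph: (source, relationship type, target) triples.
EDGES = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("bob", "KNOWS", "dave"),
    ("carol", "WORKS_AT", "acme"),
]


def match_two_hops(edges, rel_type):
    """Return (a, b, c) for every path a -[rel_type]-> b -[rel_type]-> c."""
    # Index outgoing neighbours per node, keeping only the requested relationship type.
    outgoing = {}
    for src, rel, dst in edges:
        if rel == rel_type:
            outgoing.setdefault(src, []).append(dst)

    # Extend every first hop (a -> b) with every second hop (b -> c).
    matches = []
    for a, middles in outgoing.items():
        for b in middles:
            for c in outgoing.get(b, []):
                matches.append((a, b, c))
    return matches


paths = match_two_hops(EDGES, "KNOWS")
for a, b, c in paths:
    print(f"{a} -> {b} -> {c}")
```

A declarative query language lets users state the shape once and leaves the traversal strategy to the engine, which is why shorter pattern syntax matters for ease of use.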
These “spontaneous improvements” weren’t the only way Cypher has evolved. Neo4j is also part of the Graph Query Language (GQL) standardization effort. Unlike relational databases, where SQL is the standardized query language that promotes interoperability among vendor implementations, NoSQL databases have no standardized query language. Since 2019, an ISO working group has been developing GQL in collaboration with a number of vendors, including Neo4j. This has provided Neo4j with useful ideas for the evolution of Cypher.
Beyond the query language, Neo4j’s query engine performance has also improved considerably as a result of R&D efforts. The company claims improvements of up to 1,000 times, although those figures apply to corner cases (i.e., scenarios that occur outside normal operating parameters). Webber said users should expect performance at least one order of magnitude better across the board.
There’s also a new runtime called the Parallel Runtime, which capitalizes on the results of a collaborative EU R&D project that Neo4j participated in. Neo4j’s indexing and storage engine have improved as well.
Historically, Neo4j hasn’t released benchmarks. However, Webber said that his team is interested in performance and happy with where Neo4j has gone so far. “If anything, my team is Neo4j’s fiercest critics in terms of performance. So if we’re not unhappy, I think that’s not such a bad outcome,” Webber said.
Improved operations and convergence across cloud and on-premises
The other major area of improvement that Webber identified is operations. Neo4j has offered an on-premises platform since its inception in 2007. AuraDB, Neo4j’s fully managed cloud platform, only came along in 2019. Since then, the Neo4j team has been working toward feature parity in both directions, and Webber said the gap is closing.
The on-premises version of Neo4j 5 offers new and enhanced features like autonomous clustering and fabric, enabling organizations to efficiently operate very large graphs and scale out in any environment. Neo4j 5 also automates the allocation and reassignment of computing resources. Webber said this drastically simplifies on-premises Neo4j operations and noted that lessons learned from AuraDB have been valuable in developing those features.
In the other direction, Webber noted that certain functions in APOC (Awesome Procedures on Cypher), Neo4j’s library of custom and prebuilt functions and procedures, were only available in the on-premises version due to security considerations in the cloud. That gap is closing, as Neo4j is researching intermediate representation analysis that will make it possible to verify that procedures are safe before deploying them to AuraDB. At that point, Webber said, the two offerings will reach feature parity. The goal is to make sure that the experience users have with AuraDB is similar to the one they have with Neo4j on-premises.
“For folks new to Neo4j who come straight into Aura, they’re not going to notice, as Aura is relatively friction-free. They can get going and be productive that way. But for certain people who have sophisticated on-premises installations, we want to ease their path into the cloud should they choose to go there over the medium term,” said Webber.
Neo4j 5 also sports a new tool called Neo4j Ops Manager, designed to provide a single pane for monitoring and managing global deployments and to give customers full control over their environments. The existing Neo4j Admin tool has been simplified as well. Webber noted that both Neo4j Admin and the new version of Cypher come with mechanisms to ensure backward compatibility, even though some breaking changes have been introduced.
Graph market outlook
As far as the bigger picture in the graph market goes, Webber said that while there are multiple forces at play, the overall outlook remains positive. Arguably, peak graph hype seems to be behind us. Webber said he’s “happy that we’re over the hype phase, because people started imagining all sorts of insane possibilities for graph databases, which weren’t backed up by computer science.”
These days people increasingly understand what graph databases are good for and that’s helping the market, Webber said. Modern data is sometimes very structured and uniform and sometimes very sparse and irregular, and that suits graphs very well, he added. Learning to tell the difference means that users come to Neo4j with realistic graph problems that Neo4j can help solve.
Webber said that analyst predictions about the graph market are broadly on target, despite the current macro climate, and the total addressable market remains substantial. That climate may still bring a bit of a shakeout, and Neo4j is not immune to it. Even before the economic downturn, the graph market was one in which a great number of vendors were vying for market share, and it’s predictable that not everyone will make it.
“This downturn has happened, but I think that’s a company-by-company thing. I don’t think it’s systemic across the graph database industry,” Webber said. “Certainly the metrics that we see and what we understand from the industry at large, including some of the web hyperscalers, is that the interest in graph continues to grow. I think that’s quite a solid foundation for the next decade or so of growth in the industry.”