Commentary: Relational databases aren't dead, but database innovation is cropping up in all sorts of places now, including graph databases.
I don't normally celebrate venture funding because customer money, not VC money, is the best signal of success. That said, $325 million is a lot of money. That's how much Neo4j, an open source graph database company, recently raised. When I was at MongoDB, we raised $311 million, but that was spread across a number of funding rounds; $325 million, all at once, is mind-blowing.
Graph databases, once wrongly considered niche, are finding their way as general purpose databases. In fact, this is arguably the real lesson from Neo4j's successful fund raise: graph databases are very much mainstream.
Finally getting their due
Graph databases aren't new: Neo4j was first released in 2010. About that same time (late 2009), Apache TinkerPop was born. Others like Apache Giraph and OrientDB emerged in the same general time frame, or a bit later. Pick through the DB-Engines database rankings and there are scads of graph databases listed, though only Neo4j has thus far managed to crack the top 20 (it currently ranks #18). Each of these graph databases has been around long enough to be well understood by developers.
Yet, they aren't.
A graph database stores individual data points (such as key-value pairs or documents), but it also stores the relationships between them. Those relationships are first-class citizens in a graph database, and that is what makes a graph database blindingly fast in applications with a connected dataset, like fraud detection. Speaking to me back in 2014, Neo4j CEO Emil Eifrem claimed that "a graph database can easily be a million times faster than a relational database."
Say that again? "It's basically 1,000 times performance improvements, despite a 1,000 times increase in data size." This is so because a graph database accelerates traversals and maintains that performance even as the database grows.
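To make the traversal point concrete, here is a minimal Python sketch. It is an illustration only, not how Neo4j or any other graph database is implemented. The account names, the `EDGES` list, and all three functions are hypothetical. The first approach keeps each node's neighbors next to the node, so a hop only touches that node's own relationships. The second rescans the full relationship list on every hop, roughly what an unindexed self-join would do.

```python
from collections import deque

# Hypothetical data: pairs of accounts that share a device or payment card.
EDGES = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("erin", "frank"),
]


def build_adjacency(edges):
    """Store each node's neighbors directly alongside the node."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def within_hops(adjacency, start, max_hops):
    """Breadth-first traversal: nodes reachable from start in at most max_hops."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(start)
    return seen


def within_hops_by_scanning(edges, start, max_hops):
    """Same answer, but every hop rescans the whole edge list."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        next_frontier = set()
        for a, b in edges:
            if a in frontier and b not in seen:
                next_frontier.add(b)
            if b in frontier and a not in seen:
                next_frontier.add(a)
        seen |= next_frontier
        frontier = next_frontier
    seen.discard(start)
    return seen


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    print(sorted(within_hops(adjacency, "alice", 2)))
    print(sorted(within_hops_by_scanning(EDGES, "alice", 2)))
```

Both functions return the same set of accounts. The difference is the work per hop: the adjacency version looks only at the neighbors of nodes it has already reached, while the scanning version's cost per hop grows with the total number of relationships stored.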
This is just one reason that databases are interesting again, something Eifrem called out in a funding post:
Everything we do in our daily digital lives—a text message, buying groceries online, a zoom call or even driving a modern car—is centered around information. All of that information ultimately lands in a database. Databases are a fundamental part of the fabric of present-day society.
The database is at the heart of every application. That's why the database market is the largest market in all of enterprise software. Companies spent more than $50B last year on database software, a number that's expected to grow to more than $100B by 2025.
However, from an innovation perspective that market has been static! The relational database was invented in the late '70s. And while Larry Ellison has bought many yachts and a few Hawaiian islands with the fruits of the commercial success of the relational database, the technology has remained remarkably the same.
Databases represent the single largest category of software spend, so perhaps it's not surprising that Neo4j would attract such significant investment. There's a lot at stake as we start to innovate with databases again.
But, again, for many (and not merely these Hacker News commenters) graph databases are niche. They're for social applications or other areas where relationships matter. The problem with this line of reasoning is that it overlooks the reality that relationships between data almost always matter. Of course, similar things can be (and have been) said about time series databases—they're not niche, they track how data changes over time, and who doesn't need that?
The so-called niche players, in other words, are branching into the mainstream. At the same time, general-purpose databases like Postgres and MongoDB are adding graph (and other) functionality.
I suppose all of this is a long way of repeating a point made above: after decades of database doldrums, where we tried to make all data fit the rows-and-columns of relational databases, we're finally innovating in databases again. Neo4j's big raise is indicative of this trend. Long may it continue.
Disclosure: I work for AWS, but the views expressed herein are mine.