Gartner Reprint. Licensed for Distribution.

Market Guide for Graph Database Management Solutions

Published 24 May 2021 - ID G00737853 - 33 min read
By Merv Adrian, Afraz Jaffri, and 1 more

Data and analytics leaders are challenged with a multitude of existing and new DBMS choices for graph analytics use cases. This Market Guide describes available graph database management solutions.

Overview

Key Findings

- Graph database management solutions (DBMSs) are growing in awareness and popularity as they and their associated tools become more mature. However, it is often difficult for data and analytics professionals to distinguish between different implementation models, and to fit them to their use case.
- Native and multimodel graph databases have the ability to scale to meet enterprise production requirements, including stability, availability and security, but native graph DBMSs offer performance gains when dealing with queries over very large graphs (typically billions of nodes).
- The traditional split between RDF and property graph support is becoming less important for choosing a graph DBMS compared to features that support enterprise readiness.
- Vendors in the graph DBMS market are expanding their stack to become platform plays for enterprise knowledge graphs or graph artificial intelligence (AI) with an associated ecosystem of products.
- Managed as-a-service offerings from established and new vendors and cloud service providers significantly lower the barrier to entry for organizations that do not have graph expertise.

Recommendations

Data and analytics leaders responsible for data management solutions, and implementing and evolving operational and analytical infrastructure, should:
- Invest time to fully develop a graph use case before choosing a graph DBMS by defining business value, sample queries and transactional versus analytical processing requirements.
- Assess the enterprise requirements for graph DBMS product functionality that go beyond prototypes by comparing security models, API support, ease of configuration and optimization, and end-user tools for visualization, query and schema building.
- Distinguish between RDF or property graph models only where there is a clear need by evaluating data exchange, ontology reuse and reasoning requirements.
- Test the ability of existing multimodel offerings to deliver the required analytics, scalability and performance before adding another vendor and product combination.
- Deploy managed services where available to minimize the overhead of managing and maintaining graph DBMS, which may be unfamiliar to existing team members.

Strategic Planning Assumption

By 2025, graph technologies will be used in 80% of data and analytics innovations, up from 10% in 2021, facilitating rapid decision making across the enterprise.

Market Definition

Graph DBMSs provide support for graph data models, in native and virtual formats, including data loading, conversion, consistency and security, with the ability to scale up and scale out. They support multiple types of interactions ranging from simple node and edge traversal and triple pattern matching for transactional uses to complex multihop queries, reasoning and inference, and algorithms that run across the entire graph structure for analytical workloads.

Market Description

As in the overall DBMS market, graph DBMS use cases can be classified as operational or analytical (and increasingly involve both).
Graph DBMS vendors have accordingly provided two types of product. Transactional graph DBMSs are optimized for retrieving small parts of one or multiple graphs, frequently through querying or traversal (transactional processing). Analytical graph DBMSs are optimized to perform operations on each node and edge in a graph, and to aggregate results across the entire graph or a large part of it. Some use-case examples are presented in Note 3.

Another type of division that exists in the graph DBMS market is in the way a graph is structured and how it can be queried:

- A property graph (also known as a labeled property graph) allows key-value pairs (properties) to be added to any node or edge in a graph. Nodes can be grouped together using labels.
- An RDF graph explicitly models every node and edge combination as a set of triples of the form (subject, predicate, object).

Use cases for each type of graph overlap, and the difference is becoming less important as an iteration of RDF, called RDF*, is widely implemented by vendors, which overcomes some of the modeling complexity of RDF. Property graph vendors are also adding support for graph partitioning, adding rules and graph exports. There are even products that allow both graph formats to be used interchangeably. Efforts to standardize a property graph query language (GQL) will further enhance interoperability.

Any of these products may be “native” graph DBMSs or multimodel DBMSs, which support graph as one data structure among others. The perceived advantage of multimodel is the ease of integration with existing data and the elimination of another vendor to manage. But native solutions may be more applicable for resource-heavy processing involving real-time calculations, machine learning (ML) or even standard queries on graphs that have several billions of nodes and edges.
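To make the property graph versus RDF distinction described above more concrete, the following is a minimal Python sketch, not taken from any vendor's API. The class names `Node` and `Edge`, the helper `to_triples` and the sample data are all hypothetical and exist only for illustration. It shows a tiny property graph and one possible way to flatten it into plain (subject, predicate, object) triples, under the assumption that node labels map to `rdf:type` statements.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Property graph node: an id, grouping labels and key-value properties."""
    node_id: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """Property graph edge: a typed link that can carry its own properties."""
    source: str
    target: str
    edge_type: str
    properties: dict = field(default_factory=dict)


def to_triples(nodes, edges):
    """Flatten a property graph into plain (subject, predicate, object) triples.

    Illustrative assumption: labels become rdf:type statements and node
    properties become literal-valued triples. Edge properties are dropped,
    because a plain triple has no place to attach them; this is the modeling
    gap that reification or RDF* is meant to address.
    """
    triples = []
    for node in nodes:
        for label in sorted(node.labels):
            triples.append((node.node_id, "rdf:type", label))
        for key, value in sorted(node.properties.items()):
            triples.append((node.node_id, key, value))
    for edge in edges:
        triples.append((edge.source, edge.edge_type, edge.target))
    return triples


if __name__ == "__main__":
    alice = Node("alice", {"Person"}, {"name": "Alice"})
    acme = Node("acme", {"Company"})
    job = Edge("alice", "acme", "WORKS_AT", {"since": 2019})
    for triple in to_triples([alice, acme], [job]):
        print(triple)
```

In this sketch the `since` property on the edge survives only in the property graph form, which is one reason the choice of model can matter when edges carry their own attributes.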
Another difference is in how many “hops” are needed to get from one node to another to reach a result. Due to the highly connected nature of most graphs, the number of nodes that need to be visited can rise exponentially on each hop. (See Note 4 for more discussion.) Parallel processing is key to supporting use cases that have this requirement, as is the ability to scale up and scale out (see Working With Graph Data Stores).

The modern graph DBMS landscape narrows the gap between all these different attributes. This includes transactional systems also being able to handle analytical use cases, RDF and property graphs (PGs) being used for both traversal and data integration with semantics, and multimodel data stores handling complex querying and, in some cases, graph analytics.

The end result is pure graph DBMSs expanding their stack of technology to handle two main types of use cases: (1) data integration and knowledge management as enterprise knowledge graph platforms (EKGPs); and (2) graph analytics, ML and AI as graph AI platforms (GAIPs). This convergence is shown in Figure 1.

Figure 1: Convergence of Capabilities in the Graph DBMS Landscape
[Figure not reproduced. Recoverable labels: Native RDF, Native LPG, Multimodel RDF, Multimodel LPG, App Dev, Visualization, Traversal, Distributed querying, Parallel processing, NoSQL integration, In-memory computation, Analytics and ML, EKGP. Source: Gartner]

Not all products discussed here will support all types of use cases equally well. Use Figure 1 to identify which features map best to the use cases you have in mind, and Table 1 to identify the offerings that support those features as a first step to creating your shortlist.
Finally, as elsewhere in the DBMS market, some offerings are cloud only, which includes as-a-service offerings, as opposed to hybrid on-premises DBMSs.

Market Direction

The growth of graph DBMSs is being driven by:

- Application developers moving toward graph for an increasing number of customer- and internally facing projects utilizing graph databases as the storage and execution back end
- Data architects designing knowledge graph-based solutions for content management, personalization and semantic data interoperability
- Data scientists needing to perform higher-order exploration into connections and relationships between data points for better insights through graph visualizations, queries and algorithms
- Database designers and specialists struggling to support users with relationship-based analytical requests, or simply seeking better solutions to growing volumes of “semistructured” data
- Business owners requiring purpose-built tools for use cases best served by graph technology, such as digital twins, 360-degree views of customers, fraud detection and other investigative intelligence use cases

All of these groups will benefit from the accelerating innovation in the graph DBMS space. Increasingly, the reuse of existing data is being considered, as migration tools for data from relational sources, object stores or even rival graph vendors are becoming common. Not surprisingly, as cloud platform providers enter the market, products like Amazon Neptune will drive more of this activity to attract users to the cloud as an ideal platform for experimentation. Hardware acceleration and hardware-based graphs will dramatically improve performance; support for these options will more likely occur on cloud platforms instead of at user sites.
The longtime competition among languages, acknowledged in Hype Cycle for Data Management, 2020, has been a challenge, but has also resulted in both significant enhancements to product capabilities and continuing efforts to promote standards such as GQL. Vendors are supporting GraphQL as an API so that developers can access the graph data model without learning SPARQL or Gremlin to build queries. RDF graph DBMS vendors are adding support for openCypher or Gremlin to support graph traversal. Over the next three to five years, we expect these efforts to enhance and expand the momentum we already see in driving graph DBMSs through the Trough of Disillusionment to the next level of adoption.

Market Analysis

Interest in and use of graph DBMSs has been increasing steadily. In 2019, 27% of 501 respondents to Gartner's Magic Quadrant for Operational Database Management Systems Survey reported that they were already using graph DBMS; an additional 20% indicated that they planned to within the next year. In 2020, inquiries to Gartner DBMS analysts about graph rose substantially. Gartner's 2019 AI in Organizations Survey found that more than 20% of surveyed organizations were utilizing graph techniques in AI development.

Vendors are describing dramatically increased interest and visibility as their revenue and pipelines continue to grow, and increasing cloud deployments are accelerating the trend further. Cloud service providers (CSPs) are adding graph capabilities to their own DBMSs and making competitive products (such as Neo4j) available on their platforms.

New DBMS deployments are dominated by cloud plays. Not all of these dominant vendors are competing aggressively in the graph DBMS space yet, but that will change as it grows.
Gartner expects the percentage of revenue attributable to cloud in the overall DBMS market to exceed 50% by 2023 (see Forecast: Public Cloud Services, Worldwide, 2019-2025, 1Q21 Update and Forecast: Enterprise Infrastructure Software, Worldwide, 2019-2025, 1Q21 Update). Trends that have become inherent to the overall DBMS market are manifesting themselves in the cloud DBMS market as well; in both cases, nearly 85% of the revenue is attributable to a handful of dominant vendors (see Figure 2).

Figure 2: DBMS Market Dynamics, Overall and Cloud
Overall DBMS market (dominant vendors' share: 83.4%): Oracle $15.2B; Microsoft $13.7B; AWS $9.4B; IBM $4.9B; SAP $3.5B; Google $1.5B; Teradata $0.8B
Cloud DBMS market (dominant vendors' share: 84.6%): AWS $9.4B; Microsoft $3.5B; Google $1.5B; Alibaba $0.8B; Oracle $0.6B; Tencent $0.4B; IBM $0.2B
Source: Gartner

As the incumbent CSP services begin to take market share, they will be a formidable barrier for small vendors that will have to both partner with and compete with them. As with other specialty markets, the small vendors' ability to be agile and concentrate on specific requirements such as enterprise knowledge graphs, domain solutions, or analytics and AI will help them stay ahead. However, the small vendors will also have to focus on integration with other systems and prove enterprise-level readiness.

Choosing to add an additional product to your toolkit should begin with an assessment of the capabilities of your existing products, which may be suitable for smaller, first-use graph cases. Graph
DBMS specialist offerings will need to demonstrate differentiation, leaning on hybrid deployment capabilities where data gravity keeps some data on-premises, and will need to clearly articulate why a new and different technology should be a part of your data and analytics portfolio.

Further advancements will be made in:

- Scale: parallel computation, GPU utilization
- Low-/no-code querying and schema development
- Providing a platform for application development
- Data transformation and ingestion
- Graph visualization
- ModelOps for graph data models

Developing these capabilities, bringing them to market and supporting them will require commercial success and investment. Data and analytics leaders must recognize that not all the vendors listed here will achieve sufficient scale and operational effectiveness to succeed. Moreover, at this still-early stage of market development, vendors have formidable barriers to overcome, including:

- Use-case identification/user awareness
- Concerns about vendor lock-in
- Finding a pricing model and level that prospects will accept
- Building a partner and service provider ecosystem
- Supporting and extending production deployments

The graph DBMS market includes relatively new entrants, as well as established vendors and cloud providers that position their products within a portfolio of other specialized platforms. Some are available on-premises, some in the cloud and some in both, potentially with the ability to query both at the same time in a hybrid architecture.

Representative Vendors

The vendors listed in this Market Guide do not imply an exhaustive list. This section is intended to provide more understanding of the market and its offerings.
Market Introduction

Table 1 provides a guide to the features offered by representative graph DBMS vendors.

Table 1: Features Guide for Representative Graph DBMS Vendors

Vendor, Product | Native or Multimodel | Supported Models | Deployment Platforms | Query Language | Supported Graph Algorithms/Libraries
AWS, Amazon Neptune | Native | Property, RDF | Cloud | TinkerPop, Gremlin, SPARQL | TinkerPop
Cambridge Semantics, AnzoGraph DB Engine | Native | RDF, RDF* (property) | On-premises, multicloud, hybrid | openCypher, SPARQL | Built-in
DataStax, DataStax Enterprise and DataStax Astra | Multimodel | Property | On-premises, multicloud | TinkerPop, Gremlin, GraphQL | TinkerPop
Dgraph | Native | Graph, JSON, RDF | On-premises, cloud | DQL, GraphQL | Built-in
Franz, AllegroGraph | Multimodel | Document, graph (JSON-LD), OWL, RDF, RDF* | On-premises, multicloud, hybrid | SPARQL, SPARQL*, FedShard-Parallel SPARQL, GraphQL, Prolog/Datalog, Lisp, JIG/Gremlin, domain-specific languages | [unclear in source]
MarkLogic | Multimodel | Triples | On-premises, multicloud, hybrid | JavaScript, Optic, Search, SPARQL, XQuery | [unclear in source]
Microsoft, Cosmos DB | Multimodel | Property | Cloud | Gremlin | [unclear in source]
Neo4j | Native | Property | On-premises, multicloud, hybrid | Cypher (openCypher), GraphQL, RDF/SPARQL, SQL | [unclear in source]
Ontotext, GraphDB | Native | RDF, RDF*, OWL/RDFS | On-premises, multicloud | GraphQL, SPARQL, SPARQL*, SQL | [unclear in source]
OpenLink Software, Virtuoso Universal Server | Multimodel | RDF | On-premises, multicloud, hybrid | SPARQL, SQL | [unclear in source]
Oracle, Oracle Database | Multimodel | Property, RDF | On-premises, multicloud, hybrid | PGQL, SPARQL | [unclear in source]
Redis Labs, RedisGraph | Multimodel | Property | On-premises, multicloud, hybrid | Cypher | LAGraph
SAP, Graph | Multimodel | Property | [unclear in source] | GraphScript (proprietary), openCypher, SQL, SQLScript | [unclear in source]
Stardog, Enterprise Knowledge Graph Platform | Native | RDF, RDF* | [unclear in source] | GraphQL, SPARQL, SQL | Built-in
TIBCO Software, TIBCO Graph Database | Native | Property | On-premises, multicloud, hybrid | Gremlin | [unclear in source]
TigerGraph, TigerGraph DB and TigerGraph Cloud | Native | Labeled property | On-premises, multicloud | GSQL | Built-in

Source: Gartner (May 2021)

Vendor Profiles

Amazon Web Services (AWS)

Amazon Web Services (AWS), based in Seattle, introduced Amazon Neptune in 2018 as a cloud-only managed service and claims thousands of active customers. Neptune is a native graph DBMS supporting property graph and the W3C's RDF models. ACID transactions, in-memory execution, up to 15 read replicas and high availability are part of the service.
Instances are priced by the hour and billed per second with no long-term commitments, and with added charges for storage, I/Os, backup, and data transfer in and out. The service supports graphs of up to 64TB, encryption-at-rest with customer-managed keys and cross-region snapshot sharing. AWS lists Neptune as a service in the scope of AWS compliance efforts for HIPAA, PCI, FIPS, ISO and CSA Star, and SOC.

Neptune can be queried with W3C's SPARQL as well as Apache TinkerPop Gremlin to build graph applications and implement custom graph algorithms. Open-source tools are available on GitHub under Apache 2.0 and MIT licenses. AWS is an active contributor to the Apache TinkerPop open-source project.

The AWS Graph Notebook, an open-source, Apache 2.0-licensed Jupyter Notebook developed by AWS, allows customers to visualize the results of queries run against the graph database and to get started with sample graph applications. NeptuneML supports ML to make predictions over graph data using graph neural networks (GNNs) from the Deep Graph Library.

Cambridge Semantics

Cambridge Semantics is based in Boston and has offered Anzo, a knowledge graph platform that includes the AnzoGraph DB Engine, since 2015 with freemium and enterprise software subscriptions on-premises, and via cloud marketplaces in multicloud and hybrid deployment. Pricing for both the platform and engine is based on the number of vCPU cores used by the graph engine.

AnzoGraph version 2.2 is a native graph engine supporting RDF and RDF* for property graph use cases. Inferencing is performed for RDFS and OWL 2 RL ontologies using in-memory materialization of triples. It utilizes an MPP OLAP engine serving use cases where calculations and analytics need to be performed across the whole of an RDF knowledge graph.
It also supports querying via SPARQL and openCypher. A library of analytics functions is provided with the product, and ML frameworks supported include Apache Arrow Flight, Apache Spark MLlib, TensorFlow, pandas and Jupyter. The Anzo Knowledge Graph Platform adds capabilities for knowledge graph management, metadata management, visual schema and query design tools, data ingestion, and integration with analytics and business intelligence (BI) tools.

DataStax

DataStax is based in Santa Clara, California, and added graph capabilities to DataStax Enterprise (DSE) in 2015. The capabilities are also available in DataStax's cloud DBMS offering, Astra. A nonrelational multimodel DBMS, it supports property graphs using GraphQL and Gremlin for a query language. With its history as the leading commercializer of Apache Cassandra, DataStax offers an open-core version of that DBMS, with added features in the enterprise version including graph support. Version 6.8 offers graph data models implemented natively within Cassandra.

DataStax Studio supports developers writing queries in Cassandra Query Language (CQL), Graph/Gremlin and Spark SQL language, and offers a collaborative notebook interface as well as visual exploration of DSE graphs without requiring Gremlin skills.

DataStax Enterprise also offers support for both Gremlin algorithms and GraphFrames, which integrate with Spark MLlib. This enables the multimodel aspect of the product to support combinations of technologies across a variety of data collections for analytics use cases and ML. Given Cassandra's broad adoption for operational use cases, this provides DataStax market differentiation.

Dgraph

Dgraph is based in Palo Alto, California, and is a recent entrant to the graph DBMS market with the Dgraph product in 2016.
The platform is written entirely in the Go language, with features aimed at two audiences: the application development community, which is increasingly using GraphQL as an API layer on top of graph data structures, and enterprise customers that use GraphQL as a layer to collect data from multiple back ends and present a unified service or API. There is a community edition available for download, together with hosted public and private cloud solutions in AWS, Microsoft Azure and Google Cloud Platform (GCP) with pricing based on CPU nodes.

Dgraph is a native graph DBMS that handles JSON data as well as RDF triples. Querying the database, however, is done using the Dgraph Query Language (DQL) and GraphQL. The platform is designed for real-time transactional workloads utilizing relationship-based sharding to optimize queries and traversals across distributed clusters.

Dgraph is optimized for transactional workloads. Its native support for graph algorithms includes recursive traversal of graph and k-shortest path algorithms. Dgraph plans to support Gremlin in upcoming releases. The product fits into the Graph ecosystem, including tools for querying and visualization.

Franz

Franz, based in Lafayette, California, entered the graph database market in 2006 with AllegroGraph, a multimodel triplestore supporting documents and graph that implements RDF*, SPARQL* and JSON-LD. The platform can be deployed in on-premises and cloud environments with pricing based on CPU cores.
An RDFS++ runtime reasoner allows usage of the RDFS modeling language and a subset of terms from the Web Ontology Language (OWL) at query time. OWL2 RL support and inference is also available using static materialization. A unique feature of AllegroGraph since its first release is the use of Prolog as a mechanism to extend or customize the RDF model and reasoning capabilities. AllegroGraph can federate SPARQL queries in parallel across multiple distributed triplestores using its proprietary FedShard technology. AllegroGraph's Triple Attributes Security uniquely addresses high-security data environments through role-based, cell-level data access.

Following the trend of RDF triplestores increasingly being able to support use cases considered to be LPG territory, graph traversal, graph analytics, graph algorithms and ML are all included natively within the product or through extensions with third-party libraries, such as PyTorch Geometric and Python Anaconda libraries providing an interface from SPARQL to pandas. Graph visualization and exploratory analysis are supported by Gruff's no-code environment, a tool natively integrated to the platform.

MarkLogic

MarkLogic, based in San Carlos, California, is a private company founded in 2001 and one of the largest nonrelational DBMS vendors by revenue. It entered the graph DBMS market in 2013 as a document-based multimodel product, with a focus on knowledge graph use cases. Unlike most other multimodel offerings, it also has a native triple store permitting it to optimally store data not already inside documents.

The MarkLogic Semantics capability is built into the core product, and sold as a license option. It enables direct querying in SPARQL as well as reasoning support for RDFS and OWL Horst ontologies. Custom rules can be defined using MarkLogic's own language, which is based on the SPARQL Construct operator.
MarkLogic can be deployed on-premises, in the cloud or in hybrid deployments, and has APIs for SPARQL, JavaScript, XQuery, Search, SQL, REST, Java, Node.js and Optic queries. It offers a free developer version and pricing by cores and/or consumption.

It provides tools that tend to be very focused on data curation and access: queries can be against both data and metadata within the same query, which makes it well-suited to building data hubs. It has a set of algorithms including textual functions; ONNX and PyTorch are supported for ML.

Microsoft

Microsoft, based in Redmond, Washington, provides graph capabilities in three offerings: Azure SQL, SQL Server and Azure Cosmos DB (discussed here), a nonrelational multimodel DBMS deployed as a managed service in the cloud. Azure Cosmos DB provides a property graph model and supports Gremlin as a query language. Gremlin is open sourced under the Apache 2.0 license. Azure Cosmos DB is offered with the choice of two pricing models: provisioned throughput capacity and pay-per-request consumption (serverless).

Visual design and query design tools are from the Gremlin ecosystem; Microsoft recommends Linkurious and KeyLines. Similarly, algorithms are available through Gremlin recipes. Microsoft provides a Spark connector to enable ML via the Spark ecosystem, which is broadly supported across Microsoft's Azure software portfolio.

Neo4j

Neo4j was founded in 2007 and is based in San Mateo, California. Neo4j is a native graph store supporting the property graph model. An open-source version of the platform is available, as well as an enterprise version that can be deployed across on-premises and cloud environments and a managed SaaS service called Neo4j Aura. Pricing for self-managed installations is based on number of machines, cores and RAM, with Aura pricing based on RAM and consumption.
Neo4j was the creator of the Cypher query language that has been adopted by other graph databases as openCypher. This is in addition to Gremlin for graph traversals. There are also connectors available for BI tools, Apache Kafka streaming data, Spark, and ingestion from various databases and file formats. Neo4j is a member and active participant in the development of the ISO GQL standard, which aims to provide a standard framework for querying property graph databases.

Neo4j can be used for a variety of graph use cases. Neo4j for Graph Data Science includes over 50 algorithms and natively available graph embeddings, as well as internal support for unsupervised and supervised ML. A number of third-party and community-supported tools are also available for visualization, schema design and data source connections.

Ontotext

Ontotext is based in Bulgaria and provides GraphDB in standard and enterprise editions, priced according to the number of CPU cores in a deployment, as well as a free edition. GraphDB is a native triplestore supporting RDF* and SPARQL*. Reasoning and inference are also supported for RDFS, OWL Lite and OWL2 RL and QL via materialization of triples at load time.

GraphDB supports virtualization over RDBMS systems, allowing SPARQL queries to be run against relational databases. Downstream systems can access GraphDB over JDBC. GraphDB Enterprise Edition features high-availability cluster architecture and integration of authentication services. GraphDB is part of the bigger Ontotext Platform, which offers additional capabilities for data ingestion, including prebuilt text processing pipelines, schema creation and platform deployment via Kubernetes.
Ontotext also focuses on making it easier for developers to build applications with GraphQL interfaces, overcoming the need to create SPARQL queries and understand the RDF data model, and providing role-based access control (RBAC).

GraphDB and the Ontotext Platform are suited to traditional RDF use cases that are transaction-centric, but interoperability with Python ML libraries for analytics and ML is also supported. In addition, ML algorithms for semantic similarity, based on graph- and word-embedding, complement the full-text search capabilities backed by Elastic and Apache Solr.

GraphDB Workbench offers a set of facilities for graph development, querying, exploration and visualization, provided under an open-source license, which allows those to be integrated and customized in proprietary applications. Tabular data can be ingested, reconciled and converted into an RDF graph via an integrated version of OpenRefine. GraphDB's plug-in architecture allows for implementation of connectors to other engines, custom-made indexes or analytics “close to the metal.” A rich set of plug-ins is provided on an open-source basis too: RDFRank (computing node importance using PageRank), GeoSPARQL, Reasoning Provenance (query-time inference) and Proof (tracing how a given statement was inferred), Change-tracking, Data History and Versioning.

OpenLink Software

OpenLink Software, based in Burlington, Massachusetts, entered the graph DBMS market in 2006 with its product Virtuoso Universal Server. Virtuoso is a multimodel database platform supporting both relational and RDF data types. It is available on-premises and in hybrid and multicloud deployments, with pricing based on the number of concurrent users and CPU cores assigned to multithreaded operations.

Virtuoso is a data virtualization platform that utilizes the URI element of RDF graphs to act as “super keys” to bring data together in a knowledge graph from different data repositories.
The platform hosts the Linked Open Data Cloud, a collection of over 1,200 datasets (as of May 2020) accessible as a knowledge graph on the web using Linked Data principles.

Virtuoso has a number of APIs and services to bring data into the platform, including HTTP, ODBC, JDBC, ADO.NET, OLE DB and others. It also supports inference on subclass, subproperty, inverse-functional properties and same-as constructs at query time. A transitive clause can be added to SPARQL queries, enabling graph traversal. The query engine also supports SPARQL statements within SQL queries and GeoSPARQL for geospatial queries. A number of graph algorithms are supported natively, including Centrality, Betweenness, Shortest Path, Eigenvector Centrality and General Transitivity Modeling.

Oracle

Oracle, based in Redwood Shores, California, entered the graph DBMS market in 2009 by adding graph capabilities to its multimodel Oracle DBMS. The product is one of the few in the market that offers both RDF and property graph capabilities without the need to maintain separate instances. Oracle DBMS supports SPARQL and PGQL. Oracle is a longtime, active participant in ISO SQL/PGQ standards activity and plans support for this standard.

The Oracle DBMS supports multiple pricing models: perpetual license based on users, servers or enterprise; subscription; and consumption-based. Oracle's graph technologies are included in on-premises and cloud database licenses and Autonomous Database at no additional cost.

The Oracle DBMS is available on-premises and in multicloud and hybrid deployments, and offers visual schema and query design tools. It is available as a managed service and includes a library of several dozen high-performance graph algorithms.
Users can also develop custom algorithms in Java syntax using a compiler that creates parallel algorithms. ML frameworks that support pandas or NumPy, including Oracle's own Machine Learning, may be used.

Redis Labs

Redis Labs, based in Mountain View, California, entered the graph DBMS market in 2018 with RedisGraph, a Redis module that is available to use with open-source Redis and commercial Redis Enterprise. It can be deployed across all major public clouds in a fully managed service, or on-premises, in multicloud and hybrid configurations over Kubernetes or virtual machine architectures using Redis Enterprise Software. Pricing is based on memory usage and throughput. RedisGraph supports the Cypher query language and multiple graph algorithms. It uses SuiteSparse, an implementation of the GraphBLAS technology. In this architecture, Cypher queries are translated to linear algebraic operations over a sparse matrix. RedisGraph can be used in conjunction with the RedisAI module for ML applications.

SAP

SAP, headquartered in Walldorf, Germany, offers SAP HANA graph capabilities as a component of its Enterprise edition. Today, it is available on-premises and as a multicloud managed service in SAP HANA Cloud. As a multimodel offering based on an in-memory columnar RDBMS, HANA graph supports property graphs with query via the proprietary GraphScript language, openCypher and SQL (SQL procedures via SQLScript). Pricing is based on used memory on-premises and memory consumption in the cloud. SAP graph offers its own library of graph algorithms and users can build custom ones as database procedures using GraphScript. The SAP HANA Predictive Analysis Library and ML frameworks such as TensorFlow may be used for ML. The Graph Viewer offers visual tools for developers.
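The Redis Labs entry above notes that Cypher queries are translated to linear algebraic operations over a sparse matrix. As a rough illustration of that idea only, the sketch below expresses a graph traversal as repeated boolean vector-matrix products. It is not RedisGraph's implementation and does not use the GraphBLAS or SuiteSparse APIs. The names `SparseMatrix`, `vxm` and `k_hop`, and the sample graph, are hypothetical and chosen for this example.

```python
from typing import Dict, Set

# Sparse boolean adjacency matrix: row i stores only the columns j where
# A[i][j] is true, i.e. where an edge i -> j exists. Hypothetical layout.
SparseMatrix = Dict[int, Set[int]]


def vxm(frontier: Set[int], adjacency: SparseMatrix) -> Set[int]:
    """Boolean vector-matrix product over the (OR, AND) semiring.

    The frontier is a sparse boolean vector (the set of active nodes).
    The result holds every node j with an edge i -> j for some active i.
    """
    result: Set[int] = set()
    for i in frontier:
        result |= adjacency.get(i, set())
    return result


def k_hop(start: int, adjacency: SparseMatrix, hops: int) -> Set[int]:
    """Nodes reached from `start` after exactly `hops` products."""
    frontier = {start}
    for _ in range(hops):
        frontier = vxm(frontier, adjacency)
    return frontier


# Made-up sample graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, 2 -> 4, 3 -> 5
graph: SparseMatrix = {0: {1, 2}, 1: {3}, 2: {3, 4}, 3: {5}}
one_hop = k_hop(0, graph, 1)
two_hop = k_hop(0, graph, 2)
```

Each call to `vxm` corresponds to one hop of a traversal, so a multihop pattern becomes a short chain of matrix operations that only touches the stored (nonzero) entries.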
Stardog

Stardog is headquartered in Arlington, Virginia, and entered the graph DBMS market in 2015. Its Stardog Enterprise Knowledge Graph Platform is a native triplestore supporting RDF* and SPARQL*. Stardog is one of the few triplestores that supports all OWL 2 subsets, with reasoning performed at query time. Custom rules can be added using SPARQL-like syntax and also SWRL. The platform is offered on a subscription pricing model based on the number of nodes, or on a consumption or use-case basis. A fully managed platform is provided on AWS. Virtual graphs can be used to represent data without loading it into the graph when duplicating data is not desirable or feasible.

Stardog, like other triplestore vendors, is extending its product from a standard graph store to an enterprise knowledge graph platform. This includes Stardog Studio, a UI for business users to explore the graph, and a developer UI that provides features to trace data lineage, write queries and edit ontologies. A feature of Stardog that is not commonly found in other triplestores is the ability to do in-database ML using SPARQL constructs that support classification, regression and similarity models. Native integration with Spark is also included for typical graph analytics and ML. For building applications, loading and querying data can be done using GraphQL.

TIBCO Software

TIBCO Software, based in Palo Alto, California, introduced TIBCO Graph Database in 2016, and version 3.0 became generally available in November 2020. A native graph store, TIBCO Graph Database offers a property graph model featuring SQL-92 data types. It supports the Apache Gremlin Query Language, both as a functional Java API and as a graph traversal query language. TIBCO Graph Database supports clustering for high availability and scalability with multitenancy. TIBCO Graph Database is available on-premises and in multicloud hybrid deployments on AWS and Azure.
Its open-source community edition is free; on-premises, the Enterprise Edition is priced by cores and connections. In the cloud, pricing is based on hourly consumption. TIBCO Graph Database is not available as a managed service. TIBCO Spotfire provides visualization for network charts, and a visual graph builder is available as a community open-source resource. Built-in graph algorithms can be supplemented with stored procedures using Python, and these can also be used with ML frameworks.

TigerGraph

TigerGraph, based in Redwood City, California, has offered the TigerGraph DBMS since 2017, on-premises and in multiple clouds. Now in general availability, Version 3.1 is a native graph store supporting the labeled property graph model and GSQL for queries. TigerGraph’s algorithm library, connectors, and TigerGraph Cloud Starter Kits with prebuilt graph schema and queries are all open source. Its on-premises Enterprise Edition is free up to 50GB of data as compressed by TigerGraph's storage engine; there is also a free tier available on TigerGraph Cloud for both development and production. TigerGraph features no-code migration from relational to graph, as well as a visual query builder for designing complex analytics patterns with no coding required. It supports Python for connection to ML frameworks via its pyTigerGraph API and also in-database ML via GSQL. TigerGraph’s data storage architecture is suited for multihop queries utilizing a highly distributed and parallel computational implementation. TigerGraph also positions itself as a graph AI platform (GAIP) provider.
In early 2021, TigerGraph raised $105 million in Series C funding, the largest funding round to date within the graph database and analytics market.

Market Recommendations

■ Differentiate graph use cases between those that require transactional processing and analytical processing by defining query structures for partial graph traversal and entire graph processing. Choose the DBMS accordingly.
■ Assess graph DBMS product functionality by comparing ease of visualization, query and schema building, and system integration and management.
■ Determine if there is a clear requirement for using RDF as the graph data model by looking at schema separation, ontology reuse, inference and universal identifiers. Without these characteristics, the choice of graph data model is arbitrary.
■ Test the ability of existing multimodel offerings to deliver the required analytics, scalability and performance before adding another vendor and product combination.
■ Consider managed services, where available, to minimize the overhead of managing and maintaining graph DBMS, which may be unfamiliar to existing team members.

Acronym Key and Glossary Terms

Edge: A connection from one node to another node in the same graph structure. This connection type could be implied (without a label) when all edges represent the same connection (messages passed on a network, friends in a social network) or explicitly defined when there is more than one type (different message types, connections between people and content). In addition, edges can carry weights representing the strength of the connection or some other measure. Edges can also have direction.

EKGP: An enterprise knowledge graph platform is a suite of tools to analyze data from structured and unstructured data sources, and construct schemas and ontologies; automate the population of knowledge graphs according to those schemas; and provide querying capability and services for application development.

GAIP: A graph AI platform provides a suite of tools for constructing a graph data model and running in-database graph algorithms and constructing ML models, including graph-based feature engineering and graph representation learning.

GQL: A collaboration between vendors, specialists, commercial organizations and academics to create a standard query language for property graph implementations. (See Graph Query Language GQL - GQL Standard.)

GraphQL: A query language for APIs that uses a syntax describing how to ask for data. It can be offered by many different databases and services, not just graph DBMSs. (See Graph Query Language GQL - GQL Standard.)

Hop: A single hop takes a starting node and follows all edges from that node to adjacent nodes that are directly connected. “Multihop” paths are those that take subsequent hops from the adjacent nodes to nodes that are further away from the origin node.

Node: A single data point in a graph data structure.

Ontology: An ontology describes a domain of interest at a conceptual level and can be thought of as a richer form of data schema that can include rules, restrictions and complex relationships. An ontology can be modeled as a graph using the W3C standards for OWL 2, which includes a number of subsets implemented by graph DBMS vendors. (See OWL 2 Web Ontology Language Primer [Second Edition].) Using an ontology to infer and add nodes and edges (triples) to a graph based on the defined rules is known as inference.

RDF: Resource Description Framework is a W3C standard for knowledge representation using triples that can be represented as a graph and serialized in various formats. (See RDF 1.1 Primer.)

Semantics: The concepts, rules and relationships described in the graph for a particular domain are collectively referred to as semantics.

Traversal: Taking a number of hops from a starting node to an end node or set of nodes, either specifying the number of hops or the end node. A common computing framework used for graph traversal is the Apache TinkerPop framework and associated Gremlin language. (See Apache TinkerPop.)

Evidence

1 The 2019 Gartner AI in Organizations Survey was conducted online during November and December 2019, among 607 respondents from organizations in the U.S., the U.K. and Germany. Quotas were established for company size and for industries, to ensure that the sample had good representation. Organizations were required to have already developed AI or intend to deploy AI within the next three years. Respondents were screened to:

1. Be part of the organization's corporate leadership or report into corporate leadership roles.
2. Have a high level of involvement with at least one AI initiative.
3. Have one of the following roles when related to AI in their organizations:
   ■ Determine AI business objectives.
   ■ Measure the value derived from AI initiatives.
   ■ Manage AI initiative development and implementation.

Results of this study do not represent global findings or the market as a whole, but reflect sentiment of the respondents and companies surveyed.

Note 1: Representative Vendor Selection

The 16 vendors named in this Market Guide were selected to represent the types of solutions as discussed: native and multimodel, property graph and RDF, on-premises and cloud, offered as commercial products with support.
Note 2: Gartner's Initial Market Coverage

This Market Guide provides Gartner's initial coverage of the market and focuses on the market definition, rationale for the market and market dynamics.

Note 3: Some Use-Case Examples

Transactional use cases include complex join removal, recursive querying, 360-degree view of the customer, recommendation, compliance monitoring, semantic search, data fabric and content management. Product features and architecture enable these uses by supporting appropriate languages, which can vary by the architectural model chosen for the DBMS:

■ Property graph (PG), native or multimodel: using Apache TinkerPop with Gremlin and vendor-specific query languages
■ Triplestores, native or multimodel: using RDF and SPARQL

Analytics use cases include route optimization, community detection, link prediction, anti-money-laundering, influencer analysis, threat detection (cybersecurity), network analysis and supply chain optimization:

■ Graph DBMS (native or multimodel) with built-in functions/algorithms
■ Graph DBMS (native or multimodel) with integration with compute framework/BI tool
■ Stand-alone compute framework

Note 4: The Processing Impact of Graph DBMS Architecture

Consider the mathematics: A relational representation of 10 connected entities generally follows a predefined tree structure or established hierarchy resulting in a maximum of nine potential connections. Graph representations explore the entire potential of 10-factorial connections (3,628,800), although this would only occur in a fully connected graph, which is rarely the case. Moving beyond the obvious, graph allows measurement and comparison of such attributes as influence, frequency of interaction and probability of clusters of nodes forming.
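The figures quoted in Note 4 can be checked with a few lines of arithmetic. The sketch below is illustrative only, and the function names are hypothetical. Reading the 10-factorial figure as the number of distinct orderings of the 10 entities (for example, paths that visit each entity exactly once) is an interpretation made here, not something the guide states. For comparison, the sketch also computes the number of direct pairwise links in a fully connected undirected graph, which is a much smaller number.

```python
import math


def tree_edges(n: int) -> int:
    """Edges in a tree (strict hierarchy) that connects n entities."""
    return n - 1


def complete_graph_edges(n: int) -> int:
    """Direct pairwise links in a fully connected undirected graph."""
    return n * (n - 1) // 2


def orderings(n: int) -> int:
    """Distinct orderings of n entities, i.e. n factorial."""
    return math.factorial(n)


n = 10
print(tree_edges(n))            # 9
print(complete_graph_edges(n))  # 45
print(orderings(n))             # 3628800
```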
This is analysis by discovery, not just description and confirmation, and as such it can be profoundly transformative. An effective way to pursue such problems, especially where scale or performance may be a critical impediment, is to use a tool designed for the task. Graph DBMS vendors are attacking this challenge with low-code and no-code graphical tools that give business analysts and other users self-service capabilities. As they move beyond the early offerings, which were challenging for existing database designers and users because of specialized modeling and language needs, graph DBMSs can open a new class of solutions to broad use, tackling hitherto unapproachable problems that can drive business transformation. Six main types of graph analytics (path, community, connectivity, centrality, link prediction and similarity) represent enormous opportunities to leverage data in entirely new ways.

© 2022 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. and its affiliates. This publication may not be reproduced or distributed in any form without Gartner's prior written permission. It consists of the opinions of Gartner's research organization, which should not be construed as statements of fact. While the information contained in this publication has been obtained from sources believed to be reliable, Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner research may address legal and financial issues, Gartner does not provide legal or investment advice and its research should not be construed or used as such. Your access and use of this publication are governed by Gartner's Usage Policy.
Gartner prides itself on its reputation for independence and objectivity. Its research is produced independently by its research organization without input or influence from any third party. For further information, see “Guiding Principles on Independence and Objectivity.”
