Bridging generative AI and business data, with Neo4j
Dec 12, 2023
Written by One Peak
How can enterprise companies bridge generative AI and business data?
Enterprises want to use AI in their mission-critical applications, but they can't when the unpredictable nature of large language models (LLMs) leads to hallucinations and the path from an LLM's prompt to its answer is a black box that can't be explained.
Neo4j solves these big challenges by combining an LLM with a knowledge graph, which helps reveal the patterns powering its logic.
One Peak portfolio company founders and technical leaders recently joined Neo4j for a tour of their new GenAI Stack, which provides a working reference for combining the magic of LLMs with the reliability and accuracy of databases.
The GenAI Stack is a partnership between Neo4j, Docker, LangChain, and Ollama. It bundles the core components needed to quickly start building GenAI-backed applications and enables experimentation with new models hosted locally or accessed via APIs. Neo4j is set up as the default database for vector search and knowledge graphs, and the stack supports a retrieval-augmented generation (RAG) architecture for LLM apps, streamlining the integration of an LLM into an application and giving it access to that company's proprietary data.
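To make that pattern concrete, here is a minimal sketch, not the official GenAI Stack code, assembled from the same building blocks the stack bundles: LangChain, a locally hosted Ollama model, and Neo4j as the vector store. The connection details, model name, and index name are illustrative assumptions.

```python
# A minimal RAG sketch (not the official GenAI Stack code) assembled from the
# building blocks the stack bundles: LangChain, a local Ollama model, and
# Neo4j as the vector store. Connection details, the model name, and the
# index name are illustrative assumptions.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Neo4jVector
from langchain.chains import RetrievalQA

embeddings = OllamaEmbeddings(model="llama2")   # local embedding model
llm = Ollama(model="llama2")                    # local completion model

# Point at a vector index that already holds the company's document chunks.
vector_store = Neo4jVector.from_existing_index(
    embedding=embeddings,
    url="bolt://localhost:7687",                # assumed local Neo4j instance
    username="neo4j",
    password="password",
    index_name="company_docs",                  # hypothetical index name
)

# Retrieval-augmented generation: fetch the most relevant chunks, then have
# the LLM answer the question grounded in that proprietary context.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_store.as_retriever(search_kwargs={"k": 4}),
)
print(qa_chain.invoke({"query": "What does our procurement policy say about pricing?"}))
```

Keeping both the model (via Ollama) and the data (in Neo4j) on local infrastructure is one reason this kind of setup is attractive for experimenting with proprietary data.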
Here are five takeaways from our recent webinar with Neo4j on generative AI and business data, covering everything from common integration patterns to emerging business use cases:
1. Generative AI is an “alien technology” unlocking a number of magical use cases, but to really maximize its strategic value, businesses need to be able to verify its logic and also integrate their proprietary data sources with it.

A few popular memes describe generative AI as 1) a parrot, a big neural network that knows how to regurgitate information it's heard without necessarily understanding it; 2) a sock puppet whose responses are heavily influenced by who asks it questions, and how; and 3) an all-powerful alien technology that is truly magical, but not always accurate or controllable.
In likening generative AI to an alien technology, Neo4j highlights the importance of verifying what's actually going on inside these black-box systems, and of extending their magic beyond public knowledge by anchoring it in the data sources enterprises already have and need to work with.
2. Graphs can represent hidden patterns and relationships within data, enabling GenAI models to make sense of the world and to understand a specific company's data and perspective.

Knowledge graphs can help solve both of these challenges: verification and integration. They provide facts about people, places, or things, interlinked by their relationships, in a format that both humans and LLMs can read, anchored in organizing principles that provide context about the data. Further, many organizations face data silos, and knowledge graphs can help with data governance, knowledge management, cause-effect chains, search and recommendation, and more.
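As a deliberately tiny illustration, the sketch below uses the official neo4j Python driver to store a handful of facts as nodes and relationships and then query them as a pattern; the labels, properties, and credentials are hypothetical rather than a prescribed schema.

```python
# A deliberately tiny knowledge graph: facts as nodes, context as
# relationships. Labels, properties, and credentials are hypothetical.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Store facts about people, companies, and documents, linked by relationships.
    session.run(
        """
        MERGE (a:Person {name: 'Ada'})
        MERGE (c:Company {name: 'Acme'})
        MERGE (d:Document {title: 'Q3 pricing report'})
        MERGE (a)-[:WORKS_AT]->(c)
        MERGE (a)-[:AUTHORED]->(d)
        """
    )
    # The organizing principle is the graph pattern itself: "who at Acme
    # authored which documents?" maps directly onto a path query.
    result = session.run(
        """
        MATCH (p:Person)-[:WORKS_AT]->(:Company {name: 'Acme'}),
              (p)-[:AUTHORED]->(d:Document)
        RETURN p.name AS author, d.title AS document
        """
    )
    for record in result:
        print(record["author"], "authored", record["document"])

driver.close()
```

Because the relationships are explicit, the path from question to answer can be inspected, which is what makes graphs useful for the verification challenge described above.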
3. Retrieval-augmented generation (RAG) makes requests to a database through three potential data access scenarios (pure text, mixed text and data, or pure data) representing a range of business use cases.

Enterprises want to make sense of their data across three different formats: 1) pure text (which can include PDFs, blog posts, etc.), 2) mixed text and data (where a user might want to find an answer to a simple question using a database, without needing to write SQL or any other kind of query language), or 3) pure data (in the form of a data warehouse, database, or data lake). Each of these scenarios requires different approaches for translating this information into “language” for LLMs to understand and process.
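One possible way to handle the second scenario, sketched here under heavy assumptions (a local Ollama model, a graph shaped like the earlier example, and a hand-written prompt), is to have the LLM translate a plain-English question into a Cypher query and then execute that query with the Neo4j driver.

```python
# A rough sketch of the "mixed text and data" scenario: a plain-English
# question is translated into Cypher by a local LLM and then executed.
# The prompt, model name, schema description, and credentials are all
# illustrative assumptions.
from langchain_community.llms import Ollama
from neo4j import GraphDatabase

llm = Ollama(model="llama2")
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

question = "Which people work at Acme?"
prompt = (
    "You write Cypher for a graph with (:Person {name})-[:WORKS_AT]->"
    "(:Company {name}). Return only a single Cypher query, with no prose "
    "and no code fences.\n"
    f"Question: {question}"
)
cypher = llm.invoke(prompt).strip()   # in practice, validate before running

with driver.session() as session:
    for record in session.run(cypher):
        print(dict(record))

driver.close()
```

Production implementations typically add schema inspection, query validation, and guardrails around what the generated query is allowed to do.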
4. Text is the coupling to the alien technology, bridging it to a company’s data, but translating unstructured data to text is not always a straightforward process.

LLMs need to process language. To accomplish this in the mixed text and data or pure data scenarios, a process called "embedding" turns text into numbers that co-exist in a vector space, so that similarity searches can be run and relevant patterns identified. But language doesn't exist in a vacuum; it needs context. A related process, "entity recognition", enables businesses to put context around text and gather the relevant metadata: not just recognizing that a piece of copy is prose, for example, but also who its author is and when and where it was written.
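To make the embedding step tangible, the sketch below turns a small, made-up corpus and a question into vectors with LangChain's OllamaEmbeddings and ranks the corpus by cosine similarity; the model name and example sentences are illustrative assumptions.

```python
# Embeddings in miniature: text becomes vectors, and "similarity search" is
# a distance calculation in that vector space. The model name and example
# sentences are illustrative assumptions.
import numpy as np
from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="llama2")

corpus = [
    "Invoice payment terms are net 30 days.",
    "The office closes early on Fridays.",
]
query = "When do suppliers get paid?"

corpus_vectors = np.array(embeddings.embed_documents(corpus))
query_vector = np.array(embeddings.embed_query(query))

# Cosine similarity: higher scores mean the texts sit closer together in
# the vector space, i.e. they are more semantically similar.
scores = corpus_vectors @ query_vector / (
    np.linalg.norm(corpus_vectors, axis=1) * np.linalg.norm(query_vector)
)
print(corpus[int(np.argmax(scores))])   # expect the payment-terms sentence
```

Entity recognition then layers context on top of these vectors, attaching metadata such as authorship, dates, and locations that a pure similarity search would miss.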
5. RAG is the way to bridge GenAI with business data, turning complex data processing efforts into real business results.

One example of a large data processing effort: a government agency had a huge volume of requests for proposals (RFPs) and struggled to process them and actually derive knowledge from them, largely because the RFPs were written as one-offs. The agency needed to distill the data from its RFPs into patterns, to understand common requests and norms around pricing and ultimately prioritize tax dollars. Using RAG, the agency was able to supply more context, extracting entities that carry metadata, to surface the patterns that recur across documents. By putting a graph around that data, the agency has the objective facts it needs to make more informed business decisions.
This is just the tip of the iceberg when it comes to learning how to bridge generative AI with business data. To dig deeper into Neo4j and their new GenAI Stack, check out their website.