Explainable AI:

How Xapien’s AI due diligence software works

When it comes to due diligence, you need comprehensive research with maximum efficiency. But it’s easier said than done. Whether you rely on third-party firms or have your team manually search, analyse, and compile reports, the process is both costly and time-consuming. The role of a research analyst has become more research than analysis. However, advances in AI technology have introduced powerful AI due diligence software options to reduce manual strain.

It’s important to note that “AI” is often equated with Generative AI (GenAI) tools such as ChatGPT and Gemini. This misunderstanding can be risky. On its own, GenAI isn’t suitable for due diligence because it lacks access to structured datasets such as AML databases or corporate registries. It’s also known for generating inaccurate information, often due to misinterpreting data, using poor-quality training data, or attempting to fill gaps with incorrect conclusions.

Reliable AI due diligence software combines multiple AI models that are wrapped around GenAI to ensure trustworthy results. Here’s how we’ve done exactly that at Xapien.

What does Xapien’s AI due diligence software do?

Searching through the entire indexed internet and compliance data spread across various platforms and web pages has become if not impossible, then impractical, for most businesses. The ever-growing quantity of sources and datasets is all-consuming. Meanwhile, the pile of due diligence requests keeps growing.

Xapien’s AI due diligence software handles manual research, analysis and report writing, freeing up teams’ time to focus on serving clients better. It automates in-depth research, covering the entire process from initial search to summarising insights in a final report. Our AI scours millions of registries and screening data, as well as trillions of web pages across the entire indexed internet. It extracts and contextualises fragments of information about your subject, whether it’s an individual or a company.

The findings are then compiled into a fully sourced, summarised report that can easily be shared with colleagues, committees, regulators, and third parties. To begin, all that’s required is a name and some context. This allows due diligence teams and analysts to focus on building trust-based relationships with clients and third parties.

How does Xapien’s AI due diligence software work?

Xapien begins by searching across licensed data sets, official registries, and the entire indexed internet to comprehensively gather all available data about a subject. This data may be in various formats, such as images, web pages, news articles, and corporate filings.

Machines generally don’t do well with all these diverse formats, so we pass pages of information through our Natural Language Processing (NLP) engine to read them and extract valuable text. This leaves Xapien with a clean ‘document’ for each web page, news article, or similar source.

It then breaks down each document into individual sentences and smaller pieces of information, capturing all potential knowledge. Fragments are then encoded into vectors.
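As a rough illustration of this step, the sketch below splits a clean document into sentence-level fragments and keeps track of where each one came from. It is a minimal example and not Xapien's actual pipeline: the `Fragment` class, the `split_into_fragments` function and the `news-001` identifier are hypothetical names, and the splitting rule is deliberately naive (it would stumble on abbreviations such as "Ltd.").

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    source_id: str  # hypothetical identifier for the source document
    index: int      # position of the sentence within that document
    text: str


def split_into_fragments(source_id: str, document: str) -> List[Fragment]:
    # Naive rule (an assumption): a sentence ends at '.', '!' or '?'
    # followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    return [Fragment(source_id, i, s) for i, s in enumerate(sentences) if s]


doc = (
    "Jane Doe founded Acme Ltd in 2009. "
    "She sold the company in 2018! "
    "Where did the proceeds go?"
)
fragments = split_into_fragments("news-001", doc)
for fragment in fragments:
    print(fragment.index, fragment.text)
```

Keeping the source identifier and position on every fragment is what makes it possible to trace a finding back to its origin later on.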

Vectors 101

Vectors are a way of encoding information as a set of numbers, because machines work with numbers rather than words. To illustrate this, consider eye colour. We can categorise individuals into teams based on their eye colour. For instance, blue eyes belong to Team One, green eyes to Team Two, and brown eyes to Team Three.

This is a one-dimensional vector. Now, let’s say a person has brown eyes but also has blonde hair, or brown eyes and brown hair. This represents a two-dimensional vector.
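The eye and hair colour example can be written out in a few lines of code. The eye colour teams follow the numbering above; the hair colour numbering is an assumption made purely for illustration, as is the `encode_person` helper.

```python
# Team numbers for eye colour, as in the example above.
EYE_TEAMS = {"blue": 1, "green": 2, "brown": 3}

# Hypothetical team numbers for hair colour (not specified in the text).
HAIR_TEAMS = {"blonde": 1, "brown": 2}


def encode_person(eye_colour, hair_colour):
    """Return a two-dimensional vector: (eye colour team, hair colour team)."""
    return (EYE_TEAMS[eye_colour], HAIR_TEAMS[hair_colour])


print(encode_person("brown", "blonde"))  # (3, 1)
print(encode_person("brown", "brown"))   # (3, 2)
```

Each extra attribute adds one more number, and therefore one more dimension, to the vector.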

Xapien generates vectors with thousands of dimensions to represent names, organisations and insights about a subject. It uses not only the semantic meaning of the words in the text, but also the linkages to other things that set those people, organisations or events in context.

Using advanced machine learning techniques, Xapien can then bring these vectors together based on their underlying similarities as part of a wider process to resolve people, companies and their mentions across different sources.
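One common way to measure how similar two vectors are is cosine similarity. The sketch below groups mentions whose vectors point in nearly the same direction. It is a simplified stand-in rather than Xapien's method: the three-dimensional vectors, the 0.9 threshold, the greedy grouping rule and the names `cosine_similarity` and `group_mentions` are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def group_mentions(vectors, threshold=0.9):
    """Greedy grouping: a mention joins the first group whose first
    member is at least `threshold` similar, else it starts a new group."""
    groups = []
    for name, vec in vectors.items():
        for group in groups:
            representative = vectors[group[0]]
            if cosine_similarity(vec, representative) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Toy vectors (made up): two mentions of one person, one of a namesake.
mentions = {
    "article_a": [0.9, 0.1, 0.0],
    "registry_b": [0.8, 0.2, 0.1],
    "blog_c": [0.0, 0.1, 0.95],
}
groups = group_mentions(mentions)
print(groups)  # [['article_a', 'registry_b'], ['blog_c']]
```

In practice, similarity between vectors would be only one signal among several in a wider resolution process, as described above.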

Xapien generates a vast vector database as part of each and every enquiry, as well as extracting detailed linkages between people, organisations, locations, events and thematics (a vector-augmented ‘knowledge graph’ if you prefer to get ‘technical’).

It then uses this enormous knowledge graph to generate fully-sourced, traceable insights about the subject using state-of-the-art generative techniques.
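To make the idea of sourced linkages concrete, here is a minimal sketch of a graph in which every link between two entities carries the identifier of the source it was extracted from. The `Link` and `KnowledgeGraph` classes, the relation names and the source identifiers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    subject: str
    relation: str
    obj: str
    source_id: str  # where this link was found, so it stays traceable


class KnowledgeGraph:
    def __init__(self):
        self._by_subject = defaultdict(list)

    def add(self, link):
        self._by_subject[link.subject].append(link)

    def links_for(self, subject, relation=None):
        """Return links for a subject, optionally filtered by relation."""
        return [
            link
            for link in self._by_subject.get(subject, [])
            if relation is None or link.relation == relation
        ]


kg = KnowledgeGraph()
kg.add(Link("Jane Doe", "director_of", "Acme Ltd", "registry-17"))
kg.add(Link("Jane Doe", "mentioned_with", "Acme Ltd", "news-001"))

for link in kg.links_for("Jane Doe", relation="director_of"):
    print(link.obj, "from", link.source_id)  # Acme Ltd from registry-17
```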

Xapien’s summarisation capability using GenAI

The team at Xapien have developed machine learning techniques to extract the relevant insights that clients need across every fragment of information collected. For example, knowing a subject’s Source of Wealth is essential for Enhanced Due Diligence.

To generate a summarised research report, Xapien runs hundreds of queries against its knowledge graph, asking all the questions users need the answers to, usually asking the question in lots of different ways.

In the Source of Wealth example, the vectors and linkages provide a set of relevant insights about the subject. Xapien then uses a Large Language Model (LLM) as part of a wider ‘generative’ layer to summarise those insights into a concise paragraph that reads like a human-written report. It repeats this process for each section in a Xapien Insights report.
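The pattern of asking one question in several ways, collecting the matching fragments, and handing only those fragments to an LLM can be sketched as follows. This is a simplified illustration, not Xapien's implementation: it uses keyword overlap in place of vector search, stops before any model is called, and the stopword list, function names, fragment identifiers and prompt wording are all assumptions.

```python
import re

# A small, hand-picked stopword list (an assumption for this example).
STOPWORDS = {"the", "a", "of", "in", "for", "from", "her", "his", "did",
             "how", "what", "is", "at", "to", "or", "has", "subject"}


def keywords(text):
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(fragments, phrasings, min_overlap=1):
    """Collect each fragment once if it shares keywords with any phrasing."""
    seen, hits = set(), []
    for question in phrasings:
        wanted = keywords(question)
        for fragment_id, text in fragments:
            if fragment_id in seen:
                continue
            if len(wanted & keywords(text)) >= min_overlap:
                seen.add(fragment_id)
                hits.append((fragment_id, text))
    return hits


def build_prompt(topic, hits):
    lines = [
        f"Summarise the subject's {topic} using only the fragments below.",
        "Cite the fragment id after each claim.",
    ]
    lines += [f"[{fragment_id}] {text}" for fragment_id, text in hits]
    return "\n".join(lines)


fragments = [
    ("f1", "Jane Doe sold her company Acme Ltd for 40 million pounds in 2018."),
    ("f2", "Jane Doe inherited property from her family."),
    ("f3", "Jane Doe enjoys sailing at weekends."),
]
phrasings = [
    "What company did the subject sell or own?",
    "Has the subject inherited family money or property?",
]
hits = retrieve(fragments, phrasings)
prompt = build_prompt("source of wealth", hits)
print(prompt)
```

Because the prompt contains only retrieved fragments, each tagged with an identifier, the resulting summary can be checked against them afterwards.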

What are Large Language Models (LLMs)?

Put simply, LLMs are advanced computer systems that have been trained to process and generate human-like language. They can read and write text in a way that closely resembles how humans do.

But how does Xapien use LLMs to generate concise summaries?

LLMs are trained on vast datasets that contain large amounts of real-world knowledge, and in the process they learn common word sequences. Because of this, LLMs can generate human-like language by predicting the next word in a sentence.

For instance, when you input ‘the sky is…’, the model would predict ‘blue’ as the next word, because that is the most common continuation. But if you request a more creative response, it might generate ‘the limit.’ These predictions are based on statistical probabilities of word sequences.
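A toy version of next-word prediction can be built by counting which word follows which in a tiny corpus. Real LLMs use neural networks trained on vastly more text, so this is only an illustration of the statistical idea; the four-sentence corpus and the function names are made up for the example.

```python
import random
from collections import Counter, defaultdict

# A made-up corpus in which 'blue' follows 'is' more often than 'the'.
corpus = [
    "the sky is blue",
    "the sky is blue",
    "the sky is blue",
    "the sky is the limit",
]

# Count how often each word follows each other word.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1


def most_likely_next(word):
    """Always pick the most frequent continuation."""
    return next_word_counts[word].most_common(1)[0][0]


def sample_next(word, rng):
    """Pick a continuation at random, weighted by frequency."""
    candidates = next_word_counts[word]
    return rng.choices(list(candidates), weights=list(candidates.values()), k=1)[0]


print(most_likely_next("is"))               # blue
print(sample_next("is", random.Random()))   # usually 'blue', sometimes 'the'
```

Always taking the most likely word gives the predictable answer, while sampling is what allows a more "creative" continuation such as "the limit".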

A great deal of care needs to be used when working with LLMs and generative AI to ensure that the information is not just ‘nice to read’ or ‘impressive’, but is factually accurate, consistent and traceable.

How can I trust Xapien’s AI due diligence software?

Every piece of information provided in Xapien’s rich summaries is fully traceable, down to sentence level and sometimes to word or phrase level, showing the exact part of the specific news article or webpage from which Xapien collected the original information.

It’s essential that we provide fully attributable and traceable reports. When conducting research and due diligence, you can’t base decisions on facts or statements without being able to judge the credibility or reliability of the underlying source.

Xapien traces the lineage of insights and reveals where specific sentences within the report come from. This offers a fundamentally different way of receiving insights from generative AI: a transparent, intuitive, and enterprise-ready process.
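One simple way to represent this kind of lineage is to store, alongside each statement in a report, the source it came from and the exact character span within that source. The sketch below is an assumption about how such a record could look, not a description of Xapien's internals; `Citation`, `quote` and the sample source are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    source_id: str  # which article or web page
    start: int      # character offset where the supporting text begins
    end: int        # character offset where it ends (exclusive)


def quote(citation, sources):
    """Return the exact passage in the source that a citation points to."""
    return sources[citation.source_id][citation.start:citation.end]


sources = {
    "news-001": "Jane Doe founded Acme Ltd in 2009. She sold the company in 2018.",
}

text = sources["news-001"]
start = text.index("She sold")
citation = Citation("news-001", start, len(text))

print(quote(citation, sources))  # She sold the company in 2018.
```

With offsets stored this way, a reader can jump from any sentence in the report to the passage that supports it and judge the source for themselves.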

How does Xapien combat hallucinations?

The world was initially captivated by the launch of ChatGPT and other generative AI tools. But as people encountered issues, some of that excitement began to fade.

Instances of false information and “hallucinations” generated by the technology raised concerns. The question then became: how do you use LLMs effectively without the risk of generating false information?

Xapien’s approach involves breaking down the process into atomic components. We don’t just ask an LLM to answer a question based on the data it’s been trained on. Xapien’s new generative AI system builds on over five years’ worth of tried and tested research, Natural Language Processing (NLP), and our proprietary disambiguation technology.

To illustrate the challenge, you can’t gather 500 documents about someone named Chris Smith and summarise financial crime insights from that. You’d have to determine whether this is the specific Chris Smith you’re interested in.

This is a fundamental and incredibly difficult step. But it’s a technology we have been perfecting that sits right at the heart of Xapien.

It’s easy to think that when you type a famous person’s name into Google, you’re going to get content about them. But even famous people share their names. For example, the actor David Schwimmer has the same name as the CEO of the London Stock Exchange Group.
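The role of context in telling namesakes apart can be shown with a deliberately crude check: keep a document only if it mentions enough of the context terms supplied with the name. Xapien's proprietary disambiguation is far more sophisticated than this; the function name, the threshold and the two sample documents are invented for illustration.

```python
def matches_subject(document, context_terms, min_hits=2):
    """Keep a document only if it mentions enough of the context terms."""
    text = document.lower()
    hits = sum(1 for term in context_terms if term.lower() in text)
    return hits >= min_hits


# Context a user might supply alongside the name (an assumption).
context = ["CEO", "London Stock Exchange"]

# Made-up documents that all mention the same name.
documents = {
    "d1": "David Schwimmer, CEO of the London Stock Exchange Group, spoke today.",
    "d2": "David Schwimmer, the actor, appeared in a new stage play.",
}

relevant = [doc_id for doc_id, text in documents.items()
            if matches_subject(text, context)]
print(relevant)  # ['d1']
```

Only documents that pass a check like this should feed into any summary about the subject.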

At the generative AI level, we’ve developed proprietary technologies to counter the hallucinatory tendencies innate within generative AI. We’ve built a system of algorithmic safeguards using non-generative, non-machine-learning technology with roots in deep NLP, which acts as a protective layer around our interaction with LLMs.

This enables us to harness their power but with unique control over the quality and accuracy of the output, ensuring that everything we surface has been independently checked and verified by a totally different algorithm. This means that insofar as can reasonably be asserted, you’re always working with a distilled and accurate representation of the information that Xapien has sourced.
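The principle of checking generated text with a separate, non-generative algorithm can be illustrated with a simple word-coverage test: a generated sentence passes only if most of its words appear in at least one source fragment. This is a toy stand-in for the safeguards described above, not Xapien's actual checks; the 0.8 threshold and the function names are assumptions.

```python
import re


def words_in(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_supported(claim, fragments, min_coverage=0.8):
    """True if some single fragment contains most of the claim's words."""
    claim_words = words_in(claim)
    if not claim_words:
        return False
    for fragment in fragments:
        coverage = len(claim_words & words_in(fragment)) / len(claim_words)
        if coverage >= min_coverage:
            return True
    return False


fragments = ["Jane Doe sold Acme Ltd in 2018 for 40 million pounds."]

print(is_supported("Jane Doe sold Acme Ltd in 2018.", fragments))           # True
print(is_supported("Jane Doe was convicted of fraud in 2018.", fragments))  # False
```

A sentence that fails such a check would be withheld or flagged rather than shown to the reader, which is the general idea behind wrapping an LLM in independent verification.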

Want to keep learning about our technology? Check out our other Explainable AI blogs.

From Ivy League to Magic Circle: How our customers use Xapien

Short answer: right at the start.

Xapien users run their clients, prospects, donors, suppliers, investors, and other third parties through the tool for “initial due diligence.” This enables them to gain early insights into potential risks and opportunities. Instead of spending time gathering research, their teams can focus on minimising those risks and maximising opportunities.

Dartmouth College trusts Xapien for reputational risk checks

Dartmouth adopted Xapien to enhance its reputational risk research. Its efficiency and depth soon led to replacing traditional methods and centralising the process across the university. Dartmouth’s strategic analysts now manage all reputational risk assessments, identifying red flags early in prospecting. Xapien’s ability to perform donor due diligence at scale and generate comprehensive reports cleared a backlog of donor checks. Dartmouth now uses Xapien to assess over 200 board trustee names annually, many of whom may not be appointed but are on shortlists for each board pipeline.

Pinsent Masons use Xapien to take a risk-based approach to client onboarding

Pinsent Masons’ compliance team have a financial crime checklist for their deal screening process, of which Xapien is a fundamental part. They run Xapien reports on the sellers, the target, and the directors as soon as they have the information. This proactive approach allows them to catch risks early in the process and prevents both teams from investing excessive resources in a deal that might not proceed.

University of Liverpool built due diligence into prospect research

The University of Liverpool’s operations team built their due diligence process into prospect research from the start using Xapien. Instead of spending hours on manual research, Xapien summarises all relevant information about a prospect in minutes. This means due diligence can be done upfront without holding up the prospect research process. By combining the two early on, the fundraising team can focus their efforts on which prospects to approach for donations.

Griffin runs all third parties through Xapien at the start

As soon as a prospect or supplier is mentioned, banking-as-a-service (BaaS) provider Griffin runs their name through Xapien. Inputting the search terms takes 30 seconds, and the compliance team can leave the software to run while they get on with other tasks. The results are neatly summarised and take only half an hour to review. Xapien effectively flags direct risks, enabling the team to streamline their focus and skip irrelevant information.

The University of Michigan used Xapien to build a due diligence process

As soon as the research team is alerted to a new potential donor, they run the donor through Xapien. They do this before the fundraising team begins deeper conversations, so that fundraiser time isn’t wasted if an issue comes back in the initial report. Rather than serving as information gatherers, the researchers become true analysts who can apply meaningful analysis to their briefings.

WaterAid improved the quality of its ethical checks with Xapien

Xapien is primarily used to conduct ethical checks relevant to WaterAid’s line of work, focusing on understanding potential risks associated with engaging new donors, partners, or suppliers, and identifying opportunities from these relationships. Xapien enables WaterAid to conduct a full, in-depth ethical assessment. WaterAid has achieved greater consistency in its ethical assessments across various departments and regions.
