Explainable AI:
How to use ChatGPT for due diligence
ChatGPT was a global sensation when it became publicly available in November 2022.
It could summarise fragmented thoughts, answer almost any question, and compose entire articles. For research teams, this was something they had never experienced before. Naturally, they began testing ChatGPT for due diligence purposes. However, it quickly became evident that ChatGPT had limitations, especially for business and regulatory use cases.
When using generative AI (or any AI tools) for due diligence, three crucial criteria must be met: the output must be traceable, consistent, and trustworthy. But as many have realised, ChatGPT and other generative AI tools like Gemini don’t meet them. This comparison blog will focus on how you can use ChatGPT for due diligence, where it falls short, and what you can use instead.
TL;DR
ChatGPT and similar AI tools can assist with initial research by identifying focus areas and summarising information. But you can’t simply ask them to produce a due diligence report. They lack the precision and nuanced understanding required for thorough analysis. You’d need to integrate personal expertise, protect data privacy, remain cautious of data limitations, and check the quality of the analysis.
Xapien, on the other hand, is purpose-built for due diligence. It’s been designed to provide accurate, comprehensive, and compliant reports from a single search, with everything you need to know, from PEP checks to source of funds. Its disambiguation algorithms, integration of structured and compliance datasets, privacy-friendly data processing, anti-hallucination technology, and consistent, unbiased analysis ensure reliable due diligence outcomes every time.
Where can you use GPT in due diligence, and what must you do?
A due diligence report should include:
- Political exposure
- Sanctions exposure
- Crime and controversy
- Source of wealth
- Business associates
- Career history
- ESG risks
- Locations
Can GPT help you find this information and present it coherently? Not by itself.
GPT models are designed to generate human-like text based on the input provided to them. They have been trained on vast amounts of text data from the internet, allowing them to understand and generate text across varying topics and writing styles. They can help you with a few stages of research and report writing. However, you will need to add your own research and analysis too.
Remember to consider data privacy
Before you start you’ll need to consider a critical risk: data privacy.
To get the best out of ChatGPT for background research on individuals and entities, you’ll likely need to feed it a large amount of data yourself. This is the only way that it will be able to find information that relates to the specific research subject that you are interested in, and not other individuals or entities with similar names.
Inputting sensitive or proprietary data into ChatGPT poses privacy risks, including potential data breaches or unintended disclosure of confidential information.
This is a contested space, but Italian data regulators have said they believe OpenAI’s use of data violates European data protection laws. Ask yourself what information you’re comfortable sharing with a major tech company.
Implementing measures such as anonymisation to safeguard sensitive information and ensure compliance with privacy regulations could be one solution, but this would limit the effectiveness and efficiency of the research.
With that in mind…
How you could go about using GPT for due diligence
1. Initial risk analysis
Objective: Gain a preliminary understanding of the entity’s sanctions exposure, PEP risks and other key AML risk exposure from internet sources. Example: “Is X a politician?” or “Is Entity Y state-owned?”
Challenge: GPT might overlook nuances or context, resulting in a superficial or incomplete analysis. Crucially, it doesn’t have access to AML screening databases, so it can only flag risks that appear on sites that search engines like Bing or Google can access.
Solution: Use GPT to identify key themes or patterns from web-based searches, then manually review the results to better understand the context. Cross-reference GPT’s summaries with original research to capture any nuances or important details missed. Check AML screening databases independently.
2. Deep dive into specific areas
Objective: Based on the initial results, investigate different risk areas using web-based information. For a comprehensive due diligence report, you should check political exposure, sanctions exposure, background history, crime and controversy, sources of wealth, business associates, career history, ESG risks, and locations.
Example: “Who owns Entity X?” or “What companies does individual Y own?”
Challenge: GPT can struggle with complex content, potentially missing critical insights. It doesn’t have access to all information, including databases and some web pages and images. It can confuse two or more entities if they share the same name.
Solution: Analyse smaller, detailed segments. Follow up on potential concerns identified by GPT with targeted research. For each search, you may need to provide more context so that GPT can find the right subject. For instance, if you’re investigating an individual, consider providing key information to help specify the research scope, such as their place of residence, date of birth, or current occupation.
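One way to make that context consistent across searches is to build each prompt from the same template. The following is a minimal sketch, not a recommended or official prompt format: the function name `build_subject_prompt`, the field names, and the example subject are all hypothetical, and the wording of the instruction is an assumption you would need to test against your own use case.

```python
def build_subject_prompt(question, name, **context):
    """Combine a question with identifying context about one research subject.

    All names here are illustrative assumptions, not part of any ChatGPT API.
    """
    # Turn keyword arguments such as date_of_birth="..." into readable pairs,
    # skipping any field that was left empty.
    details = "; ".join(
        f"{key.replace('_', ' ')}: {value}"
        for key, value in context.items()
        if value
    )
    lines = [f"Research subject: {name}"]
    if details:
        lines.append(f"Identifying context: {details}")
    lines.append(
        "Only use information that clearly refers to this subject. "
        "If you cannot tell whether a source refers to this subject, say so."
    )
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Hypothetical subject and details, for illustration only.
prompt = build_subject_prompt(
    "What companies does this person own?",
    "Jane Example",
    place_of_residence="Leeds, UK",
    date_of_birth="1970-01-01",
    current_occupation="Company director",
)
print(prompt)
```

Every field you add is personal data sent to a third party, so include only as much context as you need to separate your subject from others with a similar name.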
3. Drawing connections
Objective: Identify links and networks which may expose your company to risks.
Example: “You said that Entity X is owned by a Russian businessman. Does he have any connections to state Y?”
4. Report writing
Objective: Synthesise findings into a comprehensive report.
Example: “Turn this list of question-and-answer points into a report with sections on different risks.”
Challenge: GPT-generated drafts may oversimplify insights or hallucinate, resulting in incorrect information in your report.
Solution: Iteratively refine the GPT-generated draft, incorporating your own analysis and the latest web research findings. Include footnotes so that all content is traceable back to its original source. Create your own time-stamping methodology to use the reports for audit purposes.
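The footnoting and time-stamping described above could be handled with a simple record per finding. This is a rough sketch under stated assumptions: the `Finding` class, the `render_report` function, the footnote layout, and the example URL are hypothetical, and a real audit trail would need to follow your own compliance requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Finding:
    """One statement in the report, with the source it came from."""
    statement: str
    source_url: str
    retrieved_at: datetime  # when the analyst checked the source


def render_report(findings):
    """Return report text where each statement carries a numbered footnote."""
    body, footnotes = [], []
    for number, finding in enumerate(findings, start=1):
        body.append(f"{finding.statement} [{number}]")
        # Normalise to UTC so timestamps are comparable across analysts.
        stamp = finding.retrieved_at.astimezone(timezone.utc).strftime(
            "%Y-%m-%d %H:%M UTC"
        )
        footnotes.append(f"[{number}] {finding.source_url} (retrieved {stamp})")
    return "\n".join(body + ["", "Sources"] + footnotes)


# Hypothetical finding, for illustration only.
findings = [
    Finding(
        "Entity X is registered in the UK.",
        "https://example.com/registry/entity-x",
        datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    ),
]
report = render_report(findings)
print(report)
```

Keeping the statement, source, and retrieval time together means any sentence that GPT drafted without a checkable source is easy to spot: it simply has no record to attach.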
Best practices for using GPT for due diligence
As you can see, GPT can help with initial research, identifying areas of focus, and summarisation, but you can’t simply ask it to produce a due diligence report for you. The result would be neither comprehensive nor compliant in a regulatory environment.
An effective approach combines GPT’s broad analysis with human refinement for in-depth interpretation. We suggest you always investigate the issues GPT identifies, and verify its output yourself, to ensure the analysis is comprehensive and compliant.
1. Integrate your own expertise
Start by using GPT to analyse data broadly and identify patterns. Then, ensure that a trained due diligence analyst probes into specific areas that require deeper analysis. This is essential for the regulatory use case. ChatGPT won’t know what counts as a “risk” for the regulator that you report to, or for your business.
2. Be conscious of data privacy
Use only necessary data with GPT, prioritising anonymisation and understanding the data policies of any cloud-based services used to ensure security.
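As an illustration of what anonymisation might look like in practice, the sketch below swaps known sensitive names for placeholders before text is sent to an external tool, and restores them afterwards. It is a minimal example under stated assumptions: the function names `pseudonymise` and `restore`, the placeholder format, and the sample text are hypothetical. It only catches the exact terms you list, so it is not a substitute for a proper privacy review.

```python
import re


def pseudonymise(text, sensitive_terms):
    """Replace each listed term with a placeholder.

    Returns the redacted text and a mapping of placeholder to original term.
    """
    mapping = {}
    redacted = text
    # Replace longer terms first so a short term never breaks up a longer one.
    ordered = sorted(sensitive_terms, key=len, reverse=True)
    for index, term in enumerate(ordered, start=1):
        placeholder = f"[SUBJECT_{index}]"
        mapping[placeholder] = term
        redacted = re.sub(re.escape(term), placeholder, redacted, flags=re.IGNORECASE)
    return redacted, mapping


def restore(text, mapping):
    """Put the original terms back into text returned by the external tool."""
    for placeholder, term in mapping.items():
        text = text.replace(placeholder, term)
    return text


# Hypothetical note, for illustration only.
note = "Jane Example is a director of Example Holdings Ltd. Jane Example lives in Leeds."
redacted, mapping = pseudonymise(note, ["Jane Example", "Example Holdings Ltd"])
print(redacted)
print(restore(redacted, mapping))
```

The trade-off is the one described earlier: once the names are removed, the tool can no longer search for the subject itself, so this approach suits summarising or restructuring material you already hold rather than new research.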
3. Recognise data limitations
Recognise that GPT may not have access to real-time internet data or specialist datasets, and may not grasp complex, industry-specific nuances. Supplement its analysis with your own insights from other sources.
4. Check analysis quality
Be wary of GPT’s varying accuracy levels with complex documents. Review its analysis by cross-referencing the data it provides with information that you have gathered elsewhere. This is important for checking that it hasn’t produced any incorrect conclusions or outputs, also known as hallucinations.
5. Keep up to date with prompt engineering tips
Tools like ChatGPT are evolving continuously, and as a result, the most effective methods to prompt it to provide the answers you need are constantly changing. Stay updated on the latest developments by reading about them on tech forums such as Medium or Reddit.
Why is Xapien better at due diligence than GPT?
Using tools like GPT for due diligence might speed things up in certain areas, like report writing, but the need to double-check everything means you won’t necessarily save much time.
In contrast, Xapien provides fully sourced and compliance-ready reports that are comprehensive and reliable from a single search in minutes, with a focus on privacy and ease of use. These reports contain everything you need to move on to the next stage of business, whether it’s signing on a new client or accepting a donation.
Xapien uses generative AI to write reports, along with different AI models orchestrating the research and cross-referencing data from the internet and datasets. Crucially, you can be certain that you have access to all possible information about your subject.
Tools like ChatGPT do not offer the same level of specificity, accuracy, or data integration needed for thorough due diligence.
Here’s what sets Xapien apart.
1. Disambiguation and entity resolution
Xapien excels at accurately identifying and distinguishing between entities sharing the same or similar names. This ensures that users gather information about the correct subject, significantly reducing the time and potential errors involved in manual research. Unlike general generative AI, which struggles with this level of precision, Xapien uses sophisticated algorithms to filter out irrelevant data, enhancing the accuracy of due diligence reports.
2. Integration of structured and compliance-focused data sets
Xapien doesn’t stop at web searches: it incorporates a variety of data sources, including AML screening datasets and corporate records. This access allows Xapien to ground its findings in concrete, verified information. This is key, considering that researchers in most countries conducting due diligence will need to use an AML screening tool to check for risks associated with PEPs, sanctions, and watchlist exposure, and to check corporate records.
3. GDPR and data privacy compliance
Xapien minimises GDPR and data privacy risks by reducing the need for extensive personal data input. Its advanced disambiguation technology allows for minimal sensitive data to be provided upfront, adhering to privacy regulations more effectively than other AI tools, which may require a substantial amount of information to function accurately. The only data you need to provide Xapien in order for it to produce a comprehensive due diligence report is a name and one piece of context. It will do the rest from there, without any need to iteratively provide more and more sensitive data.
4. Anti-hallucination technology
Xapien addresses the issue of “hallucinations” or inaccuracies common in generative AI models by cross-referencing the AI’s outputs with verified sources. This ensures that the information provided in Xapien’s reports is not only relevant and coherent but also accurate and trustworthy. It provides sourcing for every piece of information, down to phrase level, so you can check all the data yourself.
5. Consistent analysis
Through the use of pre-trained models and a structured methodology, Xapien offers consistent analysis across all searches. It fully automates the research process from initial search to data extraction, analysis, and re-search until it can generate a comprehensive report containing all the data you need (and none that you don’t). This means you don’t have to be a prompt engineer to use Xapien: just input a name and some context, press “GO”, and you can be confident you’ll get a full due diligence report.
6. Security
Xapien prioritises customer data privacy through encryption, strict data categorisation, and customer-determined retention policies. No customer data is used for model training, ensuring reliance solely on publicly available information. Staff undergo background checks and adhere to confidentiality clauses. Any data sent to third-party providers is protected through obfuscation (turned into code so humans can’t understand what it says), with all searches attributed to “Xapien” to maintain anonymity and confidentiality.
7. User-friendly interface
Despite its complex underlying technology, Xapien maintains a straightforward and consistent user interface, ensuring that users can always obtain the necessary information through a simple search, without the need to navigate changing prompts or methodologies.
Final thoughts
GPT and other Large Language Models (LLMs) could potentially help to generate a due diligence report, offering support for initial data analysis, identifying trends, and summarising complex information. However, their application requires a careful approach, blending AI’s broad capabilities with human expertise for depth and accuracy.
While GPT can streamline aspects of due diligence, like preliminary analysis and report drafting, it falls short in areas requiring precision, up-to-date data, and nuanced understanding. You’ll still need to integrate personal expertise, protect data privacy, be wary of data limitations, and check the quality of the analysis.
Xapien, on the other hand, is purpose-built for due diligence. It has been designed to provide accurate, comprehensive, and compliant reports from a single search, which contain everything you need to know, from PEP checks to source of funds.
Xapien’s sophisticated disambiguation algorithms, integration of structured and compliance-focused datasets, privacy-friendly data processing, anti-hallucination technology, and consistent, unbiased analysis ensure reliable due diligence outcomes every time. In short, Xapien meets the three crucial criteria for due diligence: the output is traceable, consistent, and trustworthy. The same can’t be said for GPT.
To try Xapien for yourself, fill in the form below.