Client intake:
Why search engines aren’t reliable risk assessment tools
In today’s Big Data era, law firms have vast amounts of structured and unstructured information at their fingertips. Search engines might seem like the obvious tool for customer due diligence — they make it easy to find this data with a few keystrokes. However, the reality of using search engines for due diligence is more complex than that.
Due diligence today needs to go beyond traditional risk typologies to consider broader and more nuanced risks such as ESG. That demands a comprehensive understanding of a subject’s background, not just a surface-level view of their activities.
A quick web search might surface some information, but it won’t tell you everything you need: a subject’s exposure to crime and controversy, their source of wealth, business associates, and career history. Information relevant to due diligence is scattered across the open web in blogs, reports, leaks such as the Panama Papers, press releases, news articles and wider media coverage.
You can search for a name and get millions of results back, but you’ll gain no real insight. Analysts must cross-reference live internet data with corporate records, sanctions lists, and lists of politically exposed persons (PEPs) to get the whole story. That requires time-consuming manual research, which can take hours for in-house teams or significant funds to outsource.
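As a purely illustrative sketch (hypothetical names and a toy watchlist, not how Xapien or any screening provider actually works), the snippet below shows the simplest form of that cross-referencing step: fuzzy-matching a subject’s name against a sanctions list.

```python
# Illustrative only: naive fuzzy matching of a subject's name against a
# hypothetical sanctions list, using Python's standard library. Real screening
# combines far more data points; this just shows the basic cross-referencing step.
from difflib import SequenceMatcher

SANCTIONS_LIST = [  # made-up entries for illustration
    "Ivan Petrov",
    "Jon Smith",
    "Acme Trading LLC",
]

def name_similarity(a: str, b: str) -> float:
    """Return a rough 0-1 similarity score between two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def screen(subject: str, threshold: float = 0.8) -> list[tuple[str, float]]:
    """Return watchlist entries whose names loosely match the subject."""
    scores = [(entry, name_similarity(subject, entry)) for entry in SANCTIONS_LIST]
    return [(entry, round(score, 2)) for entry, score in scores if score >= threshold]

print(screen("John Smith"))  # [('Jon Smith', 0.95)] - a possible match, or a false positive
```

Even this toy example surfaces a near-match that an analyst would still have to investigate by hand, which is exactly where the hours go.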
If you’re still using search engines for due diligence, here’s why it’s time to reconsider.
1. The information is unstructured
Unstructured data, though valuable, is often underused. Why? It’s messy and fragmented. For human analysts, finding a starting point is overwhelming. Search engines give links, not clean summaries. Analysts must trawl pages for what they need and stitch fragmented information together.
But that’s not all. Once the data is gathered, it still needs to be analysed and summarised. Turning findings into actionable intelligence requires analytical skill, but with so much information-gathering and cross-referencing to do, the mental space for that valuable work shrinks. Errors creep in, and misinterpretations become more common.
2. False positives are all too common
The paradox of having so much information is that more doesn’t necessarily mean better. In fact, it just creates more noise and clutter for analysts to sift through. This can easily result in mixing up names.
We call it the “Common Name” problem. For example, there are two people named Manjit Singh. One is a convicted money launderer who laundered over 15 million dollars from NatWest. The other works in threat intelligence at NatWest focused on preventing money laundering. While searching for a name will give you results about your subject, you’ll also get information about everyone else sharing that name — just like the two Manjit Singhs.
Telling them apart hinges on the analyst’s ability to draw connections between specific locations, past organisations, business associates, and other personal identifiers to confirm that a result really relates to the person or organisation they’re researching.
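As a minimal sketch (with made-up identifiers, and not a description of Xapien’s matching logic), this is the kind of disambiguation an analyst performs mentally: scoring candidate records that share a name by how many known identifiers they have in common.

```python
# Illustrative only: disambiguating two records that share a name by comparing
# known identifiers (locations, organisations, associates) rather than the name.
from dataclasses import dataclass, field

@dataclass
class Profile:
    name: str
    identifiers: set[str] = field(default_factory=set)  # places, orgs, associates

def identifier_overlap(subject: Profile, candidate: Profile) -> float:
    """Fraction of the subject's known identifiers shared by a candidate record."""
    if not subject.identifiers:
        return 0.0
    return len(subject.identifiers & candidate.identifiers) / len(subject.identifiers)

# What we already know about the person we are actually researching (hypothetical).
subject = Profile("Manjit Singh", {"NatWest", "threat intelligence", "London"})

# Two candidate records found on the open web that share the name (hypothetical).
candidates = [
    Profile("Manjit Singh", {"NatWest", "money laundering conviction"}),
    Profile("Manjit Singh", {"NatWest", "threat intelligence"}),
]

best = max(candidates, key=lambda c: identifier_overlap(subject, c))
print(best.identifiers)  # the record that also shares 'threat intelligence'
```

A purpose-built tool automates this overlap check across far more identifiers and sources than a manual web search ever could.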
3. Search engines dictate what you see
Search engines cater to consumers, not to the needs of compliance teams. They use algorithms to predict what information users want and tailor their results accordingly. Eli Pariser called this the ‘filter bubble’, illustrating it with an example: one user searching ‘BP’ saw investment news first, while another saw news about the Deepwater Horizon oil spill.
In another example, a corporate partner might have been party to civil litigation without facing criminal charges. Unless this was reported in the news, an analyst is unlikely to uncover it. Missing this kind of information puts an organisation on uneven ground, risking its reputation if the story emerges further down the line.
4. SEO can manipulate the results
Search engines use bots to scan the web for keywords and index pages by popularity, prioritising what’s most useful to users. This is great for finding trending topics. But as search engines get better at connecting advertisers with consumers, they become worse at surfacing obscure, less-optimised information crucial to due diligence.
It’s much easier for bad actors to conceal information in the dark corners of the internet, where a human researcher rarely looks. Since search engines don’t specifically look for criminal activity, a reference to money laundering could be buried deep on page 80 of the results.
Search engines aren’t built for due diligence, but Xapien is
Web searches cover only around two per cent of the internet’s total pages. The rest sits in the deep web, largely hidden from view. There is a wealth of information analysts don’t know exists and lack the resources to access.
Xapien is a research and due diligence tool with 20 specialised AI models under the hood. These models screen subjects across millions of registries, screening sources and comprehensive compliance datasets, as well as trillions of web pages on the indexed internet. Our generative AI layer synthesises and summarises all of this information to produce a fully-sourced, written due diligence report.
Xapien can spot red flags in trillions of pages of unstructured data such as news articles, legal filings, or images. It identifies legal issues, financial instability, or connections to sanctioned entities by understanding context, recognising patterns, and using intelligent matching strategies. This ensures precise identification and compliance-ready reports.
By removing the need for analysts to spend time and resources on manual internet searches, AI transforms the research workflow, improving both efficiency and effectiveness.
Complete the form to try Xapien for yourself.