HOW IT WORKS
Deep technology extracts, reads and links information, delivering truly unique insight
Our knowledge extraction technology
Our technology works just as a human would: reading, digesting and learning, but at unparalleled speed and scale.
Fluenci’s natural language processing and machine learning algorithms identify not just the knowledge in any piece of text, but importantly how it interrelates.
This enables us to stitch together a unique understanding of your subject and the people, companies, events and concepts that relate to them. We tie together oblique references to your subject and the topics relating to them, saving you hours of wading through search results and websites.
Websites and news articles often mention the full name of a person or company only once. Thereafter, pronouns (he/she/it) or descriptors such as ‘the billionaire businesswoman’ are used to reference them. Our machine learning models identify and tie together these references, enabling us to identify whether risk or facts tied to ‘her’ are actually about your subject or not.
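The reference-linking idea above can be sketched with a deliberately simple heuristic: link each pronoun back to the most recently named mention. This is an illustration only, not Fluenci's actual model, which is learned rather than rule-based.

```python
# Toy coreference heuristic: link pronouns to the most recent named mention.
# Purely illustrative -- real systems use trained machine learning models.

PRONOUNS = {"he", "she", "her", "his", "him", "it", "its"}

def link_pronouns(tokens):
    """Return a list of (token, resolved_entity) pairs."""
    last_entity = None
    resolved = []
    for token in tokens:
        if token.lower() in PRONOUNS and last_entity:
            resolved.append((token, last_entity))       # pronoun -> last name seen
        elif token[:1].isupper():
            last_entity = token                          # treat capitalised word as a name
            resolved.append((token, token))
        else:
            resolved.append((token, None))
    return resolved

mentions = link_pronouns(["Smith", "was", "fined", ";", "she", "appealed"])
```

Real models also weigh gender, distance and syntax; this sketch shows only the core idea of carrying the last-seen entity forward.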
Natural Language Processing algorithms join facts to people and companies, identifying how they participate in events, where those events took place and the other people or companies involved in them. Whether it’s a job role, investment, takeover or marriage, we set everything in context to provide you with the richest understanding.
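One common way to represent facts set in context like this is as subject–relation–object records carrying event metadata. The structure and field names below are assumptions for illustration, not Fluenci's actual schema.

```python
# Illustrative sketch: facts as subject-relation-object records with event
# context (location, other participants). Field names are assumed, not real.

from dataclasses import dataclass, field

@dataclass
class Event:
    subject: str                      # who the fact is about
    relation: str                     # e.g. "job_role", "investment", "marriage"
    obj: str                          # the other party or value
    location: str = ""                # where the event took place
    participants: list = field(default_factory=list)

events = [
    Event("Jane Doe", "job_role", "CEO of Acme Ltd", location="London"),
    Event("Acme Ltd", "takeover", "Widget Corp",
          participants=["Jane Doe", "John Roe"]),
]

def facts_about(subject, events):
    """Everything involving a subject, directly or via participation."""
    return [e for e in events
            if e.subject == subject or subject in e.participants]
```

Querying `facts_about("Jane Doe", events)` returns both her job role and the takeover she participated in, which is the "richest understanding" idea in miniature.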
Linking and enrichment
As humans, we know a huge amount about the world around us. We use this knowledge, without thinking, to interpret what we read and resolve its ambiguities. Machines have limited real-world knowledge, so we have taught our software how to find supporting information on the web, allowing us to enrich everything and set it in real-world context.
Our disambiguation engine
Names are not unique identifiers. Whether you are reading text or reviewing corporate records, it is hard to identify whether a mention of a person or company is the same as the one you are looking for.
This is rarely a simple decision based on the name alone. As humans, we look at many factors surrounding the mention. We look at other people, sectors, topics or organisations mentioned in nearby context. We then use this information along with what we already know to form a judgement as to how likely it is to be about our subject.
We call this disambiguation. It’s a critical task in research, and incorrect decisions or assumptions can have huge consequences.
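The human judgement described above, comparing the people, sectors and organisations around a mention with what we already know, can be approximated by a simple overlap score. Jaccard similarity is a stand-in here for the machine-learning approach, not the real model.

```python
# Toy disambiguation score: Jaccard overlap between the entities seen near a
# mention and the subject's known profile. A stand-in for the learned model.

def context_score(mention_context, subject_profile):
    """Both arguments are sets of nearby people, sectors, organisations."""
    if not mention_context or not subject_profile:
        return 0.0
    overlap = mention_context & subject_profile
    return len(overlap) / len(mention_context | subject_profile)

profile = {"Acme Ltd", "mining", "Jane Doe"}
score_a = context_score({"Acme Ltd", "mining", "quarterly results"}, profile)
score_b = context_score({"football", "transfer fee"}, profile)
```

A mention surrounded by the subject's known company and sector scores far higher than one surrounded by unrelated topics, which is exactly the judgement a human researcher makes.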
Our proprietary technology models every piece of information as ‘possibly’ true, capturing where it came from and our confidence in it. Our machine learning algorithms, working on top of complex networks of probabilistic modelling, then resolve the most consistent view of the data.
This means the information we show you in our reports is harmonious and traceable, and we have high confidence that it is about your subject, not someone else.
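A minimal sketch of the 'possibly true' idea: each candidate fact keeps its source and a confidence score, and resolution keeps the candidate we trust most per attribute. The real resolution runs over networks of probabilistic modelling; the attribute names and scores below are made up for illustration.

```python
# Sketch: 'possibly true' facts with provenance and confidence. Resolution
# here simply keeps the highest-confidence candidate per attribute, so the
# resolved view stays traceable back to its source.

facts = [
    {"attr": "employer", "value": "Acme Ltd",    "source": "news-article-1", "confidence": 0.9},
    {"attr": "employer", "value": "Widget Corp", "source": "old-blog-post",  "confidence": 0.3},
    {"attr": "city",     "value": "London",      "source": "company-filing", "confidence": 0.8},
]

def resolve(facts):
    best = {}
    for f in facts:
        if f["attr"] not in best or f["confidence"] > best[f["attr"]]["confidence"]:
            best[f["attr"]] = f
    return best

view = resolve(facts)
```

The resolved view says the subject's employer is Acme Ltd, and because the winning fact carries its source, the claim remains traceable.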
Faces are one of the strongest indicators that two mentions across two webpages are indeed the same person, whether that’s a biography on a team page, a social profile or a photo of them in the news.
Our technology identifies every face in every image on every page and article we process. Fast neural networks extract the key features and match them against every other face we have processed.
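The matching step can be pictured as comparing feature vectors: a neural network (not shown here) maps each face to a vector, and two faces match when their vectors are close. The vectors and threshold below are invented for illustration.

```python
# Sketch of embedding-based face matching: faces match when their feature
# vectors (produced by an upstream neural network, assumed here) are close
# under cosine similarity.

import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

team_page_face = [0.9, 0.1, 0.4]     # made-up embedding from a team page
news_photo_face = [0.88, 0.12, 0.41] # made-up embedding from a news photo
other_person = [0.1, 0.95, 0.2]      # made-up embedding of someone else

MATCH_THRESHOLD = 0.95  # assumed threshold, purely illustrative

same = cosine_similarity(team_page_face, news_photo_face) > MATCH_THRESHOLD
diff = cosine_similarity(team_page_face, other_person) > MATCH_THRESHOLD
```

In production the vectors have hundreds of dimensions and the threshold is tuned, but the comparison is the same idea.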
Global address matching
Address data is one of the most inconsistently presented data types, but being able to match addresses can be critical.
We have replicated this matching capability at global scale. Our technology uses vast geospatial data sets and machine inference to resolve locations, in any script, anywhere in the world.
Using population and size data about each location and locality, we are able to determine how significant that location is as evidence for a match.
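The significance idea can be sketched by weighting each address component inversely to its size: a shared name in a tiny village is much stronger evidence than a shared big-city name. The population figures and formula below are rough illustrative assumptions.

```python
# Sketch: weight address components by distinctiveness. Smaller places are
# stronger evidence for a match. Population figures are rough illustrations.

import math

POPULATION = {"london": 9_000_000, "little snoring": 600}

def component_weight(place):
    """Inverse-log population: rarer places carry more matching weight."""
    pop = POPULATION.get(place.lower(), 100_000)  # assumed default
    return 1.0 / math.log10(pop)

w_city = component_weight("London")
w_village = component_weight("Little Snoring")
```

Two records sharing "Little Snoring" are therefore far more likely to describe the same entity than two records sharing "London".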
When deciding if two people or companies are the same, we consider all the information surrounding them. This looks a bit like a solar system of stars and linkages, known mathematically as a ‘graph’.
We use large-scale parallel graph algorithms, including machine learning techniques, to resolve identities within this solar system of facts and knowledge, making an incredibly complicated task as simple as clicking ‘go’.
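A classic building block for identity resolution on such a graph is union-find: mentions joined by high-confidence match edges merge into one identity, while weak edges are ignored. This is a single-machine simplification of the large-scale parallel algorithms described above.

```python
# Sketch of identity resolution on a mention graph: union-find merges
# mentions connected by high-confidence match edges into one identity.

def resolve_identities(mentions, edges, threshold=0.8):
    parent = {m: m for m in mentions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, confidence in edges:
        if confidence >= threshold:
            parent[find(a)] = find(b)      # merge the two clusters

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), []).append(m)
    return list(clusters.values())

mentions = ["news:J. Smith", "filing:John Smith", "blog:J Smith", "news:Jane Smith"]
edges = [("news:J. Smith", "filing:John Smith", 0.95),
         ("filing:John Smith", "blog:J Smith", 0.85),
         ("news:J. Smith", "news:Jane Smith", 0.30)]   # too weak to merge

identities = resolve_identities(mentions, edges)
```

Three strongly linked mentions collapse into one identity, while the weakly linked "Jane Smith" stays separate.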
Our neural risk classifier
We understand the ‘meaning’ of words.
We identify each mention of your subject within an article and examine the words linguistically associated with that mention. This lets us understand what each word really means, enabling us to flag genuine risk, and only when it actually relates to your subject, whether directly or indirectly.
Risk means many different things to different people. Our technology can bring any topic to your attention, be that an accusation, an award or maybe just association with a sector of interest.
The same word can have many meanings. For example, ‘poaching an egg’ is very different from ‘poaching an elephant’. We use machine learning technology, trained on millions of documents, to ‘understand’ the semantic meaning of words in context. Hence we can identify whether a verb linked to your subject is risky or not.
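The 'poach' example can be made concrete with a toy sense check: the risk of the verb depends on what it is applied to. The real system learns this from millions of documents; the hand-made lookup tables here are purely illustrative.

```python
# Toy word-sense check for 'poach': the object of the verb decides whether
# the sense is wildlife crime or cooking. Hand-made stand-in for a model
# learned from millions of documents.

WILDLIFE = {"elephant", "rhino", "tiger"}
FOOD = {"egg", "salmon", "pear"}

def is_risky_poaching(obj):
    obj = obj.lower().rstrip("s")   # crude singularisation
    if obj in WILDLIFE:
        return True                 # wildlife-crime sense: flag as risk
    if obj in FOOD:
        return False                # cooking sense: benign
    return None                     # unknown -- would need the learned model

risky = is_risky_poaching("elephants")
benign = is_risky_poaching("egg")
```

A learned model generalises far beyond fixed lists, but the decision it makes per mention has this same shape: risky sense or benign sense, given the context.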
Context is everything. To be accused of something is very different to being the person making the accusation.
Our natural language processing determines which way around each risk is. If it does not directly implicate your subject, we mark it as an ‘indirect risk’, just as we do for risks attached to the subject’s companies, for example.
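The "which way around" decision can be sketched from a parsed accusation triple: the risk implicates the party accused, not the party making the accusation. The triple is assumed to come from an upstream parse (not shown), and the labels here are illustrative.

```python
# Sketch of risk direction: the same risky verb implicates the accused
# party, not the accuser. The (accuser, verb, accused) triple is assumed
# to come from an upstream dependency parse.

def classify_risk(triple, person):
    """Return a risk label for `person` given an accusation triple."""
    accuser, verb, accused = triple
    if person == accused:
        return "direct risk"
    if person == accuser:
        return "indirect risk"   # involved in the event, but not implicated
    return "no risk"

label_a = classify_risk(("Regulator", "accused", "Acme Ltd"), "Acme Ltd")
label_b = classify_risk(("Jane Doe", "accused", "Acme Ltd"), "Jane Doe")
```

The same verb, ‘accused’, yields a direct risk for the accused company and only an indirect one for the person making the accusation.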
Not all risk is obvious in the words themselves. In many cases, an organisation simply being associated with a sector (tobacco, for example) could be of concern.
We resolve every company, location, person and phrase to wider knowledge on the internet. This means we know that Marlboro, for example, is a brand of cigarettes, a tobacco product, and we can hence flag that association as a risk.
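The Marlboro chain can be sketched as a walk up 'is-a' links until a sector of concern is reached. The tiny taxonomy below is a hand-made stand-in for knowledge resolved from the wider internet.

```python
# Sketch: flag risk by walking a chain of 'is-a' links until we hit a
# sector of concern. The taxonomy here is a hand-made stand-in for
# knowledge resolved from the wider internet.

IS_A = {"marlboro": "cigarette",
        "cigarette": "tobacco product",
        "tobacco product": "tobacco"}

SECTORS_OF_CONCERN = {"tobacco", "gambling"}

def flag_sector(entity, max_hops=5):
    node = entity.lower()
    for _ in range(max_hops):
        if node in SECTORS_OF_CONCERN:
            return node              # sector of concern reached
        if node not in IS_A:
            return None              # chain exhausted, nothing to flag
        node = IS_A[node]
    return None

flag = flag_sector("Marlboro")
```

Starting from a brand name, the walk surfaces the underlying sector, which is how an otherwise innocuous mention becomes flaggable.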
Search engines are great, but they are only the starting point. Finding, reading and condensing the full picture is slow, hard, and painstaking work. Xapien can help.