Machine learning - the key to modern fraud detection

Machine learning detects fraud more efficiently than ever - and that has everything to do with the data it learns from.


Written by the Access Worldpay Team
10 January 2022

There are good reasons why machine learning is supplanting traditional systems as the preferred way to detect payments fraud. In this article, we describe the types of learning that underpin FraudSight, Worldpay's fraud detection system, and explain how it benefits from our place in the payments universe.

Detecting fraud involves striking a balance. Hypothetically, a system could achieve total security by blocking all payments; the other extreme would let every payment through. Neither is practical - a balance is necessary.

This is also a balance of risks. On the one hand, detecting fraud while processing a payment request saves merchants the cost of chargebacks or lost goods, and reduces reputational risks. On the other hand, automated systems for real-time fraud detection always make some mistakes.

Some such errors are 'false positives' - genuine payments identified as fraudulent, then blocked. Other mistakes are 'false negatives' - fraudulent payments mistaken for genuine ones and allowed through. The mark of good fraud detection is to maximize the number of fraudulent payments correctly caught ('true positives') while also minimizing the number of false positives and false negatives.
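To make these terms concrete, here is a minimal Python sketch that counts the four possible outcomes for a handful of payments and derives two common summary measures, precision and recall. The `Outcome` type, the function names and the sample data are all invented for illustration; they are not part of FraudSight.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    is_fraud: bool  # what the payment turned out to be
    blocked: bool   # what the detection system decided


def confusion_counts(outcomes):
    """Count the four combinations of truth and decision."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for o in outcomes:
        if o.blocked and o.is_fraud:
            counts["tp"] += 1  # fraud correctly blocked
        elif o.blocked and not o.is_fraud:
            counts["fp"] += 1  # genuine payment wrongly blocked
        elif not o.blocked and o.is_fraud:
            counts["fn"] += 1  # fraud wrongly let through
        else:
            counts["tn"] += 1  # genuine payment correctly accepted
    return counts


def precision_recall(counts):
    """Precision: share of blocked payments that were fraud.
    Recall: share of fraudulent payments that were blocked."""
    blocked = counts["tp"] + counts["fp"]
    fraud = counts["tp"] + counts["fn"]
    precision = counts["tp"] / blocked if blocked else 0.0
    recall = counts["tp"] / fraud if fraud else 0.0
    return precision, recall


# Made-up sample: 3 frauds (2 caught) and 5 genuine payments (1 wrongly blocked).
sample = (
    [Outcome(True, True)] * 2
    + [Outcome(True, False)]
    + [Outcome(False, True)]
    + [Outcome(False, False)] * 4
)
counts = confusion_counts(sample)
precision, recall = precision_recall(counts)
print(counts, round(precision, 2), round(recall, 2))
```

Blocking more payments tends to raise recall at the expense of precision, and vice versa, which is the balance described above.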

The human touch

Initially, human teams established and operated fraud detection systems. These usually amounted to lots of complicated business rules, informed by cases of genuine fraud discovered after the fact. A fraud manager would examine the characteristics of a fraudulent payment and try to infer a rule to catch similar frauds in the future.

For example, a fraud team might notice that a fraudulent payment involved a mismatch between billing and delivery addresses. A new proposal to block similar mismatches would certainly prevent some future fraud, but would also likely carry the cost of creating many new false positives.
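As a rough illustration, the sketch below expresses that address-mismatch rule in Python and applies it to four invented payments. The field names, postcodes and fraud labels are hypothetical and exist only to show how one rule can catch a fraud while also blocking genuine customers.

```python
def address_mismatch_rule(payment):
    """Hypothetical business rule: block when billing and delivery postcodes differ."""
    return payment["billing_postcode"] != payment["delivery_postcode"]


# Invented payments; is_fraud is only known after the fact.
payments = [
    {"id": "p1", "billing_postcode": "AB1 2CD", "delivery_postcode": "ZZ9 9ZZ", "is_fraud": True},
    {"id": "p2", "billing_postcode": "AB1 2CD", "delivery_postcode": "AB1 2CD", "is_fraud": False},
    # A gift sent to a friend: genuine, but the addresses differ.
    {"id": "p3", "billing_postcode": "EF3 4GH", "delivery_postcode": "IJ5 6KL", "is_fraud": False},
    # Delivery to a workplace: genuine, but the addresses differ.
    {"id": "p4", "billing_postcode": "MN7 8OP", "delivery_postcode": "QR9 0ST", "is_fraud": False},
]

blocked = [p for p in payments if address_mismatch_rule(p)]
caught_fraud = [p["id"] for p in blocked if p["is_fraud"]]
false_positives = [p["id"] for p in blocked if not p["is_fraud"]]

print("caught:", caught_fraud)
print("wrongly blocked:", false_positives)
```

In this made-up sample the rule stops the one fraudulent payment but also blocks two genuine ones, which is the kind of cost a fraud team has to weigh before adopting it.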

Enter the machines

The cost of manually maintaining complex sets of business rules has proved enormous, with the rules themselves often producing poor results. As a result, even though manual systems may be more familiar and appear more intuitive, they are increasingly being replaced by automated systems based on the complex algorithms of machine learning.

The principle remains the same, though: in place of hand-written business rules, machine learning identifies logic that detects as many true positives as possible while minimizing false positives. This logic is encoded in a machine learning model, which can process quantities of information far beyond the capabilities of even the best human teams. Using this information, the model trains itself to differentiate between payments that are typically fraudulent and those that are not, so that good payments can go through.

Finding the line

Worldpay's automated fraud-detection system, FraudSight, is underpinned by two main types of machine learning. The first is Supervised Learning, which determines whether a new transaction can be seen as typically fraudulent or typically genuine.

Imagine a graph populated with lots of red and green dots. Red dots represent fraudulent transactions; green dots are genuine ones.

Graph with red and green dots

Supervised Learning looks for the line that best separates the red dots from the green ones. If we know just two facts about any payment - the amount and time, say - then the graph is a simple X/Y chart and the boundary is a straight line. If we know a third fact, such as the age of the cardholder account, then the graph has three dimensions and the 'line' becomes a plane.

In practice, FraudSight's Supervised Learning model ingests so many facts that the 'line' is a multi-dimensional hyperplane. Nevertheless, the model still works to identify the 'lines' that best distinguish genuine from fraudulent payments and then uses these lines to predict whether or not a new payment is genuine. Ultimately, its purpose is to say that 'we think this payment is fraud because it looks more like fraud than genuine payments'.
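The idea of learning a separating 'line' can be sketched with a perceptron, one of the simplest supervised algorithms. This is a swapped-in teaching example, not the algorithm FraudSight uses, and the two features (amount and time of day, both scaled to the range 0 to 1) and all data points are invented.

```python
def train_perceptron(points, labels, epochs=1000):
    """Learn weights and a bias so that weights . x + bias > 0 means fraud.
    Labels are +1 (fraud, a red dot) or -1 (genuine, a green dot)."""
    weights = [0.0] * len(points[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # point is on the wrong side of the line
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training point is now separated
            break
    return weights, bias


def classify(weights, bias, x):
    """Report which side of the learned line a payment falls on."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return "fraud" if score > 0 else "genuine"


# Invented training data: (scaled amount, scaled time of day).
genuine = [(0.2, 0.5), (0.3, 0.6), (0.1, 0.4), (0.4, 0.5)]
fraud = [(0.9, 0.1), (0.8, 0.05), (0.95, 0.15), (0.7, 0.1)]
points = genuine + fraud
labels = [-1] * len(genuine) + [1] * len(fraud)

weights, bias = train_perceptron(points, labels)
print(classify(weights, bias, (0.85, 0.1)))  # a new, unseen payment
```

With more than two features the same code learns a hyperplane rather than a line, because the weights list simply grows with the number of facts known about each payment.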

Spotting the anomalies

The second process is Unsupervised Learning. Instead of trying to differentiate between genuine and fraudulent transactions, it models the distribution of the data and then identifies anomalies. By asking questions such as: “Is this payment normal for this customer?”, it aims to say that a payment looks like fraud because it seems anomalous.

Unsupervised Learning can also enrich FraudSight's Supervised Learning. For example, instead of simply using the currency amount of a given transaction, the model might first learn what a 'normal' payment amount is for that card, and then measure how far each new payment deviates from that norm to see how anomalous it is. Such a 'difference to normal' ratio can then become a new characteristic in the multi-factor Supervised Learning model.
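One simple way to express a 'difference to normal' measure is a z-score: the number of standard deviations between a new payment and the card's average spend. The sketch below assumes that definition, which is our illustration rather than FraudSight's actual formula, and the function name, feature names and amounts are all hypothetical.

```python
from statistics import mean, pstdev


def difference_to_normal(history, new_amount):
    """Return how many standard deviations new_amount sits from the card's mean spend."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Every past payment was identical, so any deviation is maximally unusual.
        return 0.0 if new_amount == mu else float("inf")
    return (new_amount - mu) / sigma


# Invented spending history for one card: mean 20, standard deviation 2.
card_history = [18, 22, 18, 22]

typical = difference_to_normal(card_history, 21)
unusual = difference_to_normal(card_history, 200)

# The derived value sits alongside the raw amount as an extra feature
# for a supervised model.
feature_row = {"amount": 200, "amount_vs_normal": unusual}
print(typical, unusual, feature_row)
```

The same 200 payment could be unremarkable on a card that routinely spends that much, which is why the derived feature can carry more signal than the raw amount alone.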

Diversity is strength

What Supervised and Unsupervised Learning both require is data - the more, the better. Here, FraudSight's access to 40 billion transactions each year, including millions of confirmed fraudulent transactions, is a valuable asset.

Quantity of data matters; diversity matters more. Worldpay's role across the payments ecosystem provides FraudSight with an exceptionally wide field of vision. Specifically, Worldpay can access three complementary types of transaction data:

Gateway data: rich in customer and merchant data but lacking in fraud outcomes.
Acquiring data: rich in fraud and payments data, but lacking in customer and merchant data.
Issuer data: rich in customer account-level data, but without device or holistic merchant level data.

FraudSight also sees data from a wide variety of issuers, merchants and acquirers, as well as different payment devices and schemes. This breadth of data offers an unusually rich view across the entire payments system, making FraudSight uniquely effective at detecting and preventing fraud. In fact, analysis of earlier versions of the system shows that it reduced fraud chargebacks for a large omnichannel merchant by 95%.

How it works

FraudSight operates at three levels. The first is our real-time, instantaneous decision to accept or decline a payment. This decision is informed by a second level - a near-live process which is constantly recalculating different variables, such as the average spend over the last day or the maximum spend during the past week. These calculations mean that FraudSight's picture of the current payments landscape is always being updated.
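The near-live variables mentioned above can be pictured as rolling aggregates over a sliding time window. The following is a minimal sketch under our own assumptions: the `SpendWindow` class and its method names are hypothetical, timestamps are plain seconds, and events are assumed to arrive in time order. It does not describe how FraudSight is implemented.

```python
from collections import deque

DAY = 24 * 60 * 60
WEEK = 7 * DAY


class SpendWindow:
    """Keeps recent (timestamp, amount) pairs for one card, oldest first."""

    def __init__(self, horizon_seconds=WEEK):
        self.horizon = horizon_seconds
        self.events = deque()

    def add(self, timestamp, amount):
        self.events.append((timestamp, amount))
        # Drop events that have fallen out of the longest window we serve.
        while self.events and self.events[0][0] <= timestamp - self.horizon:
            self.events.popleft()

    def _amounts_since(self, now, seconds):
        return [a for t, a in self.events if now - seconds < t <= now]

    def average_spend_last_day(self, now):
        amounts = self._amounts_since(now, DAY)
        return sum(amounts) / len(amounts) if amounts else 0.0

    def max_spend_last_week(self, now):
        amounts = self._amounts_since(now, WEEK)
        return max(amounts) if amounts else 0.0


now = 10 * DAY
window = SpendWindow()
window.add(now - 8 * DAY, 500)  # too old to count by the time we query
window.add(now - 3 * DAY, 120)
window.add(now - 2 * 3600, 30)
window.add(now - 3600, 50)

print(window.average_spend_last_day(now))  # averages the 30 and 50 payments
print(window.max_spend_last_week(now))     # the 500 payment is outside the week
```

Keeping these values up to date in the background means the real-time decision only has to read them, not recompute them, when a payment arrives.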

The third level is the build and update cycle, which underpins everything else. The cycle happens offline, at regular intervals, using a heavy compute platform and enormous datasets of transactions and fraud outcomes. Fundamentally, it analyzes millions of data points to find the 'line' that best separates the red and green dots.

What comes next?

Access FraudSight is currently in pilot and will slowly go live over the next few months. It uses our Access Worldpay platform to offer merchants a fraud-detection system that is wrapped up into a single HTTP event, making it straightforward to integrate.

We will publish an article that covers Access FraudSight in more detail in the near future - watch this space!