How To Un-Botch Predictive AI: Business Metrics

By Eric Siegel, Contributor


Predictive AI offers tremendous potential – but it has a notoriously poor track record. Outside Big Tech and a handful of other leading companies, most initiatives fail to deploy, never realizing value. Why? Data professionals aren’t equipped to sell deployment to the business. The technical performance metrics they typically report on do not align with business goals – and mean nothing to decision makers.

To plan, sell and greenlight predictive AI deployment, stakeholders and data scientists alike must establish and maximize the value of each machine learning model in terms of business outcomes like profit, savings – or any KPI. Only by measuring value can the project actually pursue value. And only by getting business and data professionals onto the same value-oriented page can the initiative move forward and deploy.
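For concreteness, here is a minimal sketch of what that translation can look like: converting a fraud model's raw confusion-matrix counts into a single dollar figure. All of the counts and per-outcome costs below are hypothetical placeholders, not benchmarks from any real deployment.

```python
# Minimal sketch: translate a fraud model's confusion-matrix counts into a
# business metric (net savings versus deploying no model at all).
# All counts and dollar figures are hypothetical placeholders.

def net_savings(tp, fp, fn,
                fraud_loss_per_miss=250.0,    # assumed avg. loss when a fraud goes unflagged
                cost_per_false_alarm=100.0):  # assumed cost of blocking a legitimate transaction
    """Dollar value of the model relative to flagging nothing.

    Baseline (no model): every fraudulent transaction becomes a loss.
    With the model: missed fraud (FN) still costs money, and each false
    alarm (FP) costs money, but each caught fraud (TP) avoids a loss.
    """
    baseline_loss = (tp + fn) * fraud_loss_per_miss
    model_loss = fn * fraud_loss_per_miss + fp * cost_per_false_alarm
    return baseline_loss - model_loss

# Hypothetical holdout results: 400 frauds caught, 900 false alarms, 100 frauds missed
print(net_savings(tp=400, fp=900, fn=100))  # -> 10000.0
```

A single figure like this speaks the business's language in a way that a precision score or an AUC never will.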

Why Business Metrics Are So Rare for AI Projects

Given their importance, why are business metrics so rare? Research has shown that data scientists know better, but generally don't abide: they rank business metrics as the most important, yet in practice focus more on technical metrics. Why do they usually skip past such a critical step – calculating the potential business value – to the detriment of their own projects?

That’s a damn good question.

The industry isn't stuck in this rut for psychological and cultural reasons alone – although those are contributing factors. After all, it's gauche and so "on the nose" to talk money. Data professionals feel compelled to stick with the traditional technical metrics that exercise and demonstrate their expertise. It's not only that this makes them sound smarter – jargon being a common way for any field to defend its own existence and salaries. There's also a common but misguided belief that non-quants are incapable of truly understanding quantitative reports of predictive performance, and that they would only be misled by reports framed in their own straightforward business language.

But if those were the only reasons, the "cultural inertia" would have given way years ago, considering the enormous business win when ML models do successfully deploy.

The Credibility Challenge: Business Assumptions

Instead, the biggest reason is this: Any forecast of business value faces a credibility question because it must rest on certain assumptions. Estimating the value that a model would capture in deployment isn't enough. The calculation must also prove trustworthy, because it depends on business factors that are subject to change or uncertainty, such as:

  • The monetary loss for each false positive, such as when a model flags a legitimate transaction as fraudulent. With credit card transactions, for example, this can cost around $100.
  • The monetary loss for each false negative, such as when a model fails to flag a fraudulent transaction. With credit card transactions, for example, this can cost the amount of the transaction.
  • Factors that influence the above two costs. For example, with credit card fraud detection, the cost for each undetected fraudulent transaction might be lessened if the bank has fraud insurance or if the bank’s enforcement activities recoup some fraud losses downstream. In that case, the cost of each FN might be only 80% or 90% of the transaction size. That percentage has wiggle room when estimating a model’s deployed value.
  • The decision boundary, that is, the percentage of cases to be targeted. For example, should the top 1.5% of transactions that the model considers most likely to be fraudulent be blocked, or the top 2.5%? That percentage is the decision boundary (which in turn determines the decision threshold). Although this setting tends to receive little attention, it often makes a greater impact on project value than improvements to the model or data. Setting it is a business decision driven by business stakeholders – a fundamental choice that defines precisely how the model will be used in deployment. By turning this knob, the business can strike a balance in the tradeoff between a model's primary bottom-line/monetary value and the number of false positives and false negatives, as well as other KPIs – as the sketch after this list illustrates.
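To make the decision boundary's impact tangible, here is a rough sketch that sweeps the targeting percentage and recomputes the model's dollar value at each setting. The scores, transaction amounts, fraud rate and cost assumptions are all synthetic stand-ins, chosen only to show the shape of the analysis, not to reflect any real portfolio.

```python
import numpy as np

# Sketch: sweep the decision boundary (the top percentage of transactions to
# block) and recompute the model's dollar value at each setting.
# Scores, amounts, fraud labels, and cost assumptions are synthetic stand-ins.

rng = np.random.default_rng(0)
n = 100_000
amount = rng.exponential(80.0, n)             # transaction sizes in dollars
is_fraud = rng.random(n) < 0.002              # ~0.2% of transactions are fraudulent
score = rng.random(n) + is_fraud * 0.5        # stand-in model score (higher = riskier)

FP_COST = 100.0     # assumed cost of blocking a legitimate transaction
FN_SHARE = 0.85     # assumed share of a missed fraud's amount the bank actually eats

for pct in [0.5, 1.0, 1.5, 2.5, 5.0]:
    threshold = np.quantile(score, 1 - pct / 100)   # decision threshold implied by the boundary
    flagged = score >= threshold
    false_alarm_cost = FP_COST * np.sum(flagged & ~is_fraud)
    missed_fraud_cost = FN_SHARE * np.sum(amount[~flagged & is_fraud])
    avoided_loss = FN_SHARE * np.sum(amount[flagged & is_fraud])
    print(f"block top {pct:>3}%: net value {avoided_loss - false_alarm_cost:>12,.0f}  "
          f"(false alarms {false_alarm_cost:,.0f}, missed fraud {missed_fraud_cost:,.0f})")
```

With numbers like these in hand, the choice between blocking the top 1.5% or the top 2.5% stops being an arbitrary technical setting and becomes a business decision about how much false-alarm cost to tolerate for each dollar of fraud avoided.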

Establishing The Credibility of Forecasts Despite Uncertainty

The next step is to make an existential decision: Do you avoid forecasting the business value of ML altogether? That would keep the can of worms closed. Or do you recognize ML valuation as a challenge that must be addressed, given the dire need to calculate the potential upside of ML deployment in order to achieve it? If it isn't already obvious, my vote is for the latter.

To address this credibility question and establish trust, the impact of uncertainty must be accounted for. Try out different values at the extreme ends of the uncertainty range. Interact in that way with the data and the reports. Find out how much the uncertainty matters and whether it must somehow be narrowed in order to establish a clear case for deployment. Only with insight and intuition into how much of a difference these factors make can your project establish a credible forecast of its potential business value – and thereby reliably achieve deployment.
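One lightweight way to do this is a sensitivity check: re-run the value calculation at the extreme ends of each uncertain assumption and see whether the case for deployment survives across the whole range. The sketch below reuses the hypothetical counts from the first example; the ranges for the false-alarm cost and for the share of fraud losses actually borne (per the 80%-90% wiggle room mentioned above) are assumptions chosen purely for illustration.

```python
from itertools import product

# Sensitivity sketch: recompute the model's dollar value at the extreme ends
# of each uncertain business assumption. Counts and ranges are hypothetical.

tp, fp = 400, 900                  # caught frauds and false alarms from a holdout evaluation
avg_fraud_amount = 250.0           # assumed average size of a fraudulent transaction

fp_cost_range = (60.0, 140.0)      # uncertain cost of each false alarm
borne_share_range = (0.80, 0.90)   # uncertain share of each missed fraud's amount the bank eats

for fp_cost, borne_share in product(fp_cost_range, borne_share_range):
    per_fraud_saving = borne_share * avg_fraud_amount
    net_value = tp * per_fraud_saving - fp * fp_cost   # avoided losses minus false-alarm cost
    print(f"FP cost ${fp_cost:>5.0f}, FN = {borne_share:.0%} of amount -> net value ${net_value:>9,.0f}")
```

In this toy example the net value swings from roughly -$46,000 to +$36,000 across the assumed ranges – a sign that the uncertainty matters and would have to be narrowed before the forecast could credibly support deployment. Had the value stayed positive at every corner, the imprecision of the assumptions would be largely moot.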



