How to Get the Most Bang for Your AI Buck

Nate Buchanan, Director, Pathfindr

Those of you who are avid readers of this esteemed publication may recall that way back in Edition #4 we talked about why it’s so difficult to calculate value from AI. This week we’re going to show you how to overcome those difficulties and put together a value framework that will help your team decide where to invest in AI capabilities and how to maximize the return on that investment.

We begin by establishing a simple foundation, one that is likely familiar to anyone who has had to create a business case or track metrics for a project: the difference between quantitative value and qualitative value.

Note that there are some aspects of value that could fall into both categories, such as quality.

It’s possible to measure quality in a quantitative fashion - for example, you could calculate the number of defects found in a production application relative to how many were found during development and testing - but that doesn’t tell the whole story. As any moviegoer or hip-hop connoisseur will tell you, quality is also a qualitative metric that is heavily dependent on an individual’s point of view and past experience. For example, if I were to posit that the majority of hip hop released after 2010 is of low quality, that is a qualitative, not quantitative, statement (however true it might be).

When it comes to AI, there are many different ways to calculate the benefits that your team is getting. However, the best way is to isolate the process you are looking to improve in a test environment or “sandbox” that mirrors the real world as closely as possible. Once you’ve done that, you can introduce the AI enhancements that you’ve developed to improve the process and ask a seasoned practitioner to execute the process with and without the help of AI.

Suppose you supervise a team of 10 that is responsible for taking invoices received via email in PDF format and manually converting them into line items in a table in your accounting system. A simple experiment to calculate value could consist of splitting the team into two groups of five, giving each group the same stack of 100 invoices, and allowing one group (but not the other) to use a new AI application you have developed that lets the user upload a PDF invoice and converts it into a table entry that can then be uploaded into the accounting system.

More than likely, the group using AI will be able to process the invoices in far less time - let’s say it saves 2 hours. By multiplying those 2 hours by the average hourly rate of the team, you can arrive at an estimate of cost savings per 100 invoices processed. Divide the number of invoices processed in a typical year by 100, multiply that by the cost savings number you just calculated, and you’ve got an estimate of annual cost savings from an AI-powered invoice processing solution.
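The invoice arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive model: it treats the 2 hours as the labor time saved per batch of 100 invoices, and the hourly rate ($40) and annual invoice volume (50,000) are made-up placeholders that you would replace with your own figures. The function name `annual_savings` is likewise hypothetical.

```python
def annual_savings(hours_saved_per_batch: float,
                   hourly_rate: float,
                   invoices_per_year: int,
                   batch_size: int = 100) -> float:
    """Estimate yearly labor cost savings from the sandbox experiment.

    hours_saved_per_batch: labor hours saved on one batch (2 in the example).
    hourly_rate: average hourly rate of the team (placeholder value below).
    invoices_per_year: invoices processed in a typical year (placeholder).
    batch_size: number of invoices in the experiment's stack (100).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Cost savings per batch: hours saved multiplied by the hourly rate.
    savings_per_batch = hours_saved_per_batch * hourly_rate
    # Number of batches in a year: annual volume divided by the batch size.
    batches_per_year = invoices_per_year / batch_size
    return savings_per_batch * batches_per_year


if __name__ == "__main__":
    # Assumed inputs: $40/hour and 50,000 invoices per year are illustrative only.
    estimate = annual_savings(hours_saved_per_batch=2,
                              hourly_rate=40.0,
                              invoices_per_year=50_000)
    print(f"Estimated annual savings: ${estimate:,.2f}")
```

Keeping the inputs as named parameters makes it easy to rerun the estimate when the experiment is repeated with a different stack of invoices or a different team.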

For processes with discrete inputs and outputs - such as invoice processing, document translation, customer call summarization, and so on - this is a fairly straightforward way to calculate quantitative value. Other tasks may lean more towards qualitative value, usually because they are inconsistent or subjective in some way. These include:

1. Writing emails - while Microsoft Copilot, Google Gemini, and ChatGPT can be used to write emails for you, the length of an individual email varies widely depending on who’s writing it and the topic being discussed. While writing emails often takes up a significant portion of my day, I’d be hard-pressed to put a number on it because it changes often.

2. Content creation - creating content with AI, whether it’s a presentation, an image, or a video, can be a time saver but is also highly unpredictable. For use cases where the specifics of the content don’t matter so much - a stock photo accompanying a marketing email blast, for example - AI-generated images may suffice, but you wouldn’t want to rely on AI to create a client-facing presentation for you.

3. Coaching or advisory - LLMs can be great at helping you think through multiple angles to a problem or suggesting ideas you might not have otherwise thought of, but it’s difficult to calculate value from this. Looking at outcomes in the aggregate - for example, the long-term success of an advertising campaign that used AI to help brainstorm ideas - is one way to show benefit from this type of application.

This is not to say that you can’t calculate both types of value for these use cases - you can - just that each one is different and your team needs to be able to understand value in different contexts. Of course, value is only one side of the ROI equation. The other side is cost, and cost can be quite tricky to calculate when you’re working with LLMs because of the nature of token-based pricing.

Let’s set aside considerations such as the build and maintenance of the AI application itself - those costs are not insignificant, but they are fixed (more or less), while each call to the LLM may vary in complexity and, therefore, cost. So any calculation of AI ROI needs to factor this in. There are many free calculators online to help you estimate the cost of an individual LLM call - DocsBot makes one such calculator available.
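To make token-based pricing concrete, here is a rough sketch of the calculation such a calculator performs, assuming the common scheme where input and output tokens are billed at separate per-million-token rates. The function name `llm_call_cost` and the prices used in the example ($10 and $30 per million tokens) are hypothetical placeholders, not any provider’s actual rates; check the current price list before relying on the output.

```python
def llm_call_cost(input_tokens: int,
                  output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Estimate the cost of one LLM call under per-token pricing.

    Prices are expressed in dollars per one million tokens, with separate
    rates for the prompt (input) and the generated response (output).
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


if __name__ == "__main__":
    # Placeholder prices for illustration only, not real provider rates.
    cost = llm_call_cost(input_tokens=1_000,
                         output_tokens=2_000,
                         input_price_per_million=10.0,
                         output_price_per_million=30.0)
    print(f"Estimated cost per call: ${cost:.4f}")
```

Because the cost depends on both the prompt length and the response length, two calls to the same model can differ widely in price, which is why the per-call figure has to be estimated for each use case.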

Suppose I want to build a custom application that uses an OpenAI service to automatically craft responses to customer complaints received via email. Let’s further suppose that the quality of the response is more important to me than speed, so I want to ensure I’m using the best model available for high-order analysis and reasoning. If I use GPT-4, and an average incoming complaint is 200 words (or about 270 tokens), and I estimate that a response would be more than double that at 500 words once we have accounted for profuse apologies and offers to “make it right”, it’ll cost me $4.80 per response.

A good estimate for the per-minute wage of a typical customer service representative in Australia is about 60 cents, so in order to make “cents” (get it?) it would need to take a human at least 8 minutes to craft a customer complaint response manually. That might be unrealistic…but what if we were able to get satisfactory performance out of GPT-4 Turbo or even GPT-4o? Then the ROI changes significantly. Using GPT-4o, the threshold drops from 8 minutes to 2 minutes, which is much more reasonable.

There’s also the opportunity cost to factor in - a person trained to respond to customer complaints could spend time on the phone ironing out more pressing issues directly with human-to-human contact instead of responding to emails. It’s hard to quantify that in the short term, but if the number of customer complaints drops in the long term, you know you’re doing something right.
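The break-even reasoning in the complaint-response example can be sketched as follows. This is a simple illustration using the article’s own figures ($4.80 per response and a 60-cent per-minute wage); the function name `break_even_minutes` is hypothetical, and the $1.20 figure in the second call is not a quoted price but the cost implied by the 2-minute threshold at that wage.

```python
def break_even_minutes(cost_per_response: float,
                       wage_per_minute: float) -> float:
    """Minutes a human must spend per response for the AI call to pay off.

    If writing a response manually takes longer than this, the LLM call is
    cheaper than the labor it replaces.
    """
    if wage_per_minute <= 0:
        raise ValueError("wage_per_minute must be positive")
    return cost_per_response / wage_per_minute


if __name__ == "__main__":
    wage = 0.60  # dollars per minute, the article's estimate

    # GPT-4 scenario from the article: $4.80 per response.
    print(f"Break-even at $4.80: {break_even_minutes(4.80, wage):.1f} minutes")

    # Assumed cost implied by the article's 2-minute threshold (2 x $0.60).
    print(f"Break-even at $1.20: {break_even_minutes(1.20, wage):.1f} minutes")
```

This only captures direct labor cost. The opportunity cost described above would push the real break-even point lower, but it is not modeled here.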
