The writer is the Andrew M. Heller Professor at the University of Pennsylvania’s Wharton School.
Whether for writing essays, passing academic examinations, or creating software code, ChatGPT has been making headlines since its launch in November 2022.
It is well documented that the artificial intelligence chatbot is knowledgeable — enough, for example, to be worth a solid pass in my MBA class at the Wharton School. It is also friendly and smart. But the quality of its answers is proving highly erratic.
Social media is flooded with examples of questions that ChatGPT fails to answer correctly. Its mistakes are now often referred to as hallucinations. These include its confident explanation of why adding broken porcelain to breast milk can help with an infant’s digestion, and its inference that, if two cars take two hours to drive from A to B, it must take four hours for four cars to complete the same journey.
What, then, is the best use for a technology that probably should not (yet) be trusted without close human supervision?
One opportunity is to turn the tool’s weakness — its unpredictability of response — into a strength.
In most management settings, “high variance” — erratic and unpredictable behaviour — is a bad thing. We want our pilots, doctors or underwriters to perform in an orderly way with the variance minimised. So, in these circumstances, Six Sigma — a set of management strategies for reducing errors — is the name of the game.
For example, any airline recruiting 10 pilots would prefer all of them to be solid, reliable hires (scoring, say, 7 out of 10 on piloting skill) rather than one brilliant pilot (scoring 10 out of 10) and nine terrible ones (scoring 1 out of 10) who might crash the plane.
But when it comes to creativity and innovation, such as finding a way to improve the air travel experience or launching a new aviation venture, that same airline would prefer one idea that is excellent (10 out of 10) and nine that are nonsense, rather than a set of 10 solid ideas.
The reason is that, in creative tasks, variance is your friend. An idea is like a financial option: you simply use it if it is good (or better, great) and just forget about it if it is not. This insight has important implications for the design and use of AI systems, in general, and for their use in creativity and innovation, in particular.
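The pilot and idea examples above can be put into numbers with a minimal sketch. It assumes, purely for illustration, that a group doing execution work is worth its average score, while a set of ideas is worth only its best member, because the rest can be discarded at no cost. The function names are hypothetical; the scores are the ones used in the examples.

```python
from statistics import mean

# Scores (out of 10) taken from the examples in the text.
reliable = [7] * 10        # ten solid pilots, or ten solid ideas
erratic = [10] + [1] * 9   # one brilliant, nine terrible


def execution_value(scores):
    """Assumption: in execution, every outcome counts, so use the average."""
    return mean(scores)


def option_value(scores):
    """Assumption: with ideas, only the best one is used, so take the maximum."""
    return max(scores)


for name, scores in [("reliable", reliable), ("erratic", erratic)]:
    print(name, "average:", execution_value(scores), "best:", option_value(scores))
```

On the average, the reliable group wins easily. On the best single outcome, the erratic group wins, which is the sense in which variance helps in creative tasks.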
First, we should distinguish between using ChatGPT for idea generation and for the execution of a specific assignment. A student asking: “What would be ten course project ideas given my interest in space travel and psychology?” is an example of idea generation. “Write me a three-page essay about the role of psychology in space travel” is an example of a specific assignment.
Though subtle, this distinction matters. In most academic assessments, ChatGPT answered around 50 to 70 per cent of questions correctly. For students or managers, that makes it a helpful starting point for an assignment, such as an essay, but not enough to get the job done. In idea generation, by contrast, all we need to succeed is one good idea, so we can tolerate many more mistakes.
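A rough calculation shows how forgiving idea generation can be. The sketch below assumes that each idea is good with the same probability and independently of the others, which is a simplification, and the 10 per cent hit rate in the usage line is an illustrative figure rather than a measured one. The function name is hypothetical.

```python
def chance_of_at_least_one_good(hit_rate: float, ideas: int) -> float:
    """Probability that at least one of `ideas` independent ideas is good.

    Assumes every idea is good with probability `hit_rate`, independently.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    if ideas < 0:
        raise ValueError("ideas must not be negative")
    # All ideas fail with probability (1 - hit_rate) ** ideas.
    return 1.0 - (1.0 - hit_rate) ** ideas


# Illustrative only: a tool that is wrong 90 per cent of the time, asked for 10 ideas.
print(round(chance_of_at_least_one_good(0.1, 10), 2))
```

Under these assumptions, even a tool that misses nine times out of ten delivers at least one good idea in roughly two out of three batches of 10.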
Second, when we seek only one great idea, we should prompt our AI helper to go wild, just as we should welcome “out of the box” thinking in traditional brainstorming sessions. With ChatGPT, this can be achieved by prefacing the prompt with “imagine you are a six-year-old”, or “what ideas would Steve Jobs come up with?”
As new versions of the technology emerge, we can imagine the user setting a balance between factually accurate (low variance) and totally crazy (high variance) responses.
Third, even the best idea is of little value if no one acts on it. What is the advantage of having one brilliant idea and nine bad ones if we, as human decision makers, are bad at picking the winner? For that, we have to think more carefully about the idea selection process. A large body of academic research shows that even experts (such as venture capitalists) are bad at identifying the best idea.
One way of dealing with this selection problem is parallel exploration. Choosing the best from 10 ideas is hard. But could you pick the best five and explore them a bit further? This approach — which is often referred to as a tournament process — seeks to validate the ideas based on small and inexpensive experiments.
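The tournament process can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed method: the function name, the idea labels and all the scores are hypothetical, and the "experiment" is simply a lookup standing in for a small, inexpensive real-world test.

```python
from typing import Callable, Dict


def tournament(first_impressions: Dict[str, float],
               run_experiment: Callable[[str], float],
               shortlist_size: int = 5) -> str:
    """Shortlist ideas by first impression, then pick the winner by experiment."""
    if not first_impressions:
        raise ValueError("need at least one idea")
    # Round one: keep the ideas that look best at first glance.
    shortlist = sorted(first_impressions, key=first_impressions.get,
                       reverse=True)[:shortlist_size]
    # Round two: test only the shortlisted ideas and keep the best result.
    results = {idea: run_experiment(idea) for idea in shortlist}
    return max(results, key=results.get)


# Hypothetical data: gut-feel scores for 10 ideas, and what testing would reveal.
first_impressions = {"A": 9, "B": 8, "C": 7, "D": 6, "E": 5,
                     "F": 4, "G": 3, "H": 2, "I": 1, "J": 0}
experiment_results = {"A": 3, "B": 5, "C": 4, "D": 9, "E": 2,
                      "F": 6, "G": 1, "H": 2, "I": 3, "J": 1}

winner = tournament(first_impressions, lambda idea: experiment_results[idea])
print(winner)
```

In this made-up data, the idea that looked best at first glance does not win once the shortlisted ideas are tested, which is the point of exploring several in parallel rather than betting on a single pick.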
Another way is to turn to ChatGPT and ask it to critically evaluate its own answers. “What problems do you see with this idea?” might be a good question to ask. The answer could tell us why our idea is not as good as we thought. But, remember: all it takes for creativity and innovation is one good idea.