Developed to Take Actions

Ergodic Team
4 Dec

Imagine you are stuck in a maze, looking for the place where rich rewards are hidden. Your task is complicated by dangers along the way: you can fall into one of many traps, or step through a magic door that drops you at a random spot in the maze.

As intelligent human beings, we’re quick to develop strategies for navigating this maze: we identify the traps and learn to reorient ourselves after stepping through a magic door. We recognise situations we’ve seen before and learn to follow the “good paths” that lead us to rewards, all while avoiding the traps along the way.

World models are essential for making decisions and taking action. In this article, we’ll explore how a world model gives you a deeper understanding of your enterprise’s environment, much as a mental map helps you navigate a maze.

Creating a World Model

The first thing to do is explore. By exploring the maze, we learn to construct a mental map of our environment: “if I turn left, I end up in the green room; if I turn right, I might return to the start”. We’re building a world model of the space around us: identifying our state within the map, our possible actions, and the consequences of taking those actions.
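To make the idea concrete, here is a minimal sketch of learning a world model by exploration. The maze layout, the room names, and the helper functions are all invented for illustration; the point is only that a world model can be as simple as a table of "in this state, taking this action, here is where I ended up and how often".

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy maze: (state, action) -> [(next_state, probability), ...].
# The explorer never sees this table; it only observes where it lands.
MAZE = {
    ("start", "left"): [("green_room", 1.0)],
    ("start", "right"): [("start", 0.5), ("magic_door", 0.5)],
    ("green_room", "left"): [("trap", 1.0)],
    ("green_room", "right"): [("reward", 1.0)],
    # The magic door ignores the chosen action and teleports at random.
    ("magic_door", "left"): [("start", 0.5), ("green_room", 0.5)],
    ("magic_door", "right"): [("start", 0.5), ("green_room", 0.5)],
}
ACTIONS = ("left", "right")
TERMINAL = {"trap", "reward"}  # reaching either one ends the episode


def step(state, action, rng):
    """Sample the next state from the hidden maze dynamics."""
    outcomes = MAZE[(state, action)]
    next_states = [s for s, _ in outcomes]
    weights = [p for _, p in outcomes]
    return rng.choices(next_states, weights=weights, k=1)[0]


def explore(n_steps, seed=0):
    """Wander at random, counting what each action led to in each state."""
    rng = random.Random(seed)
    counts = defaultdict(Counter)
    state = "start"
    for _ in range(n_steps):
        action = rng.choice(ACTIONS)
        next_state = step(state, action, rng)
        counts[(state, action)][next_state] += 1
        # After a trap or a reward, begin a new episode from the start.
        state = "start" if next_state in TERMINAL else next_state
    return counts


def world_model(counts):
    """Turn raw counts into estimated transition probabilities."""
    model = {}
    for key, seen in counts.items():
        total = sum(seen.values())
        model[key] = {s: n / total for s, n in seen.items()}
    return model


model = world_model(explore(2000))
```

After enough exploration, `model[("start", "left")]` says that turning left at the start always leads to the green room, while `model[("start", "right")]` holds a rough estimate of how often that door sends us back to the start.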

This is not very different from a typical business environment: we have a set of complex actions we need to take, we have uncertainty over their consequences, and our goal is to reach a certain reward as quickly and cheaply as possible.

Let’s say you run a publication and your objective is to get as many paying customers as possible. A typical question you could ask is: does offering free articles increase the chance of conversion? Try asking ChatGPT! The answer, obviously, is that it depends: on the customer, on the quality of the publication, on macroeconomic factors, and so on. But does it work for you? Only you can answer that question.

As with the maze, you can’t answer that question until you start building a world model of this environment:

  • What if I offer one free article per viewer?
  • What if I offer 5 free articles but require a free trial subscription?
  • What if I offer 5 free articles but only for readers in London who are interested in finance?

There are countless questions we can ask to untangle our way through this maze and find the shortest path to the reward.

Which Actions Should We Take?

As in the maze, we need to start somewhere. Maybe we already have data on paths that were taken, and have already measured their impact on our rewards. Say, for instance, you ran a trial program in which you offered 5 free articles to everyone. And let’s say conversions did improve! Does that mean it worked?

To answer that question, consider the following example. eBay famously spent a lot of money on search ads for one specific keyword: “ebay”. You google eBay, and the first ad that appears is eBay. These ads also looked like a big source of revenue, since they drove a lot of traffic to the website. But what if you turned them off? To find out, scientists at eBay ran a controlled experiment in which they turned off the ads for a third of the country and measured the impact: zero. It turns out that brand keyword ads had no measurable short-term benefit.
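The same kind of controlled comparison can be sketched for the free-article trial. The snippet below is a rough illustration, not a full analysis: it assumes readers were split at random into a group that got the offer and a group that did not, and it uses a simple normal approximation for the difference between two conversion rates. The function name and all the counts are made up.

```python
import math


def lift_with_interval(conv_treat, n_treat, conv_ctrl, n_ctrl, z=1.96):
    """Difference in conversion rate (treatment minus control) with an
    approximate 95% interval from the normal approximation."""
    p_treat = conv_treat / n_treat
    p_ctrl = conv_ctrl / n_ctrl
    diff = p_treat - p_ctrl
    std_err = math.sqrt(
        p_treat * (1 - p_treat) / n_treat + p_ctrl * (1 - p_ctrl) / n_ctrl
    )
    return diff, (diff - z * std_err, diff + z * std_err)


# Invented numbers: 10,000 readers per group, 510 vs 500 conversions.
diff, (low, high) = lift_with_interval(510, 10_000, 500, 10_000)
no_clear_effect = low < 0 < high  # the interval includes zero
```

With these made-up counts the treated group converts slightly more often, but the interval comfortably includes zero, so the trial alone would not tell us that the offer worked.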

Building a world model around this problem is only possible if we understand that an ad does not just drive conversions through paid clicks; it also “steals clicks” from organic search. The ad simply rechannels traffic that would have reached eBay anyway into the paid link, since the two results look much the same to the user: a very expensive re-routing.
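A small worked example shows why the distinction matters. Every number here is invented for illustration (these are not eBay's figures): we compare the visits that an ad dashboard would attribute to the ad with the visits that actually disappear when the ad is switched off.

```python
def cost_per_visit(spend, visits):
    """Spend divided by visits; infinite if there were no visits at all."""
    return spend / visits if visits > 0 else float("inf")


# Hypothetical daily figures for one region.
ad_spend = 2000.0
ads_on = {"paid": 400, "organic": 600}   # visits while the ad is running
ads_off = {"paid": 0, "organic": 990}    # visits with the ad switched off

# What the ad dashboard reports: every paid click counts as a win.
attributed = ads_on["paid"]                                   # 400 visits
# What the ad really added: total visits with it minus total without it.
incremental = sum(ads_on.values()) - sum(ads_off.values())    # 10 visits

naive_cost = cost_per_visit(ad_spend, attributed)    # 5.0 per visit
true_cost = cost_per_visit(ad_spend, incremental)    # 200.0 per visit
```

In this toy scenario most of the "paid" visitors simply come back through the organic link once the ad is gone, so the cost of each genuinely new visit is forty times what the attribution report suggests.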

So, back to our example: we need to identify the possible outcomes and measure them in a controlled way, while also understanding where the pockets of value for our interventions live. To properly get to know our environment, we need a system that can:

Measure -> Create Assumptions -> Plan -> Execute -> Measure

Measuring past data is crucial to help us form assumptions about the environment around us. What do we know? What don’t we know? How much do we stand to gain or lose?

Translating these answers into assumptions gets us to a point where we can plan to target those exact pockets of value or ignorance. We either need to learn more or try to reap the rewards of intervening in our environment, and our plan should use our assumptions to tell us where to intervene.

Measuring the impact of our actions should then refine our assumptions and teach us more about the environment, guiding us towards a better world model, one that shows us where to find the rewards in the maze.
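One simple way to sketch the Measure, Create Assumptions, Plan, Execute, Measure loop in code is Thompson sampling, a standard technique for balancing learning against earning. This is our illustration of the loop, not a description of any particular product: the offers, their "true" conversion rates, and the function name are all hypothetical, and in reality the true rates are exactly what we don't know.

```python
import random

# Invented conversion rates, hidden from the planner and used only to
# simulate how readers respond to each offer.
TRUE_RATES = {"offer_a": 0.05, "offer_b": 0.30, "offer_c": 0.10}


def run_loop(true_rates, rounds, seed=0):
    rng = random.Random(seed)
    # Assumptions: a Beta(1, 1) belief per offer, meaning "we know nothing yet".
    beliefs = {offer: [1, 1] for offer in true_rates}
    shown = {offer: 0 for offer in true_rates}
    for _ in range(rounds):
        # Plan: draw a plausible conversion rate for each offer from our
        # current beliefs and pick the offer whose draw is highest.
        draws = {o: rng.betavariate(a, b) for o, (a, b) in beliefs.items()}
        choice = max(draws, key=draws.get)
        # Execute: show the chosen offer to one (simulated) reader.
        converted = rng.random() < true_rates[choice]
        # Measure: fold the outcome back into our assumptions.
        beliefs[choice][0 if converted else 1] += 1
        shown[choice] += 1
    return beliefs, shown


beliefs, shown = run_loop(TRUE_RATES, rounds=3000)
most_shown = max(shown, key=shown.get)
```

Early on the loop spreads its attention across all offers because it is uncertain about each of them; as the measurements accumulate, it concentrates on the offer its refined assumptions favour.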

The Science

Coming Soon... 👀