6 months, 21 days ago

Exploitation and Exploration

Link: http://www.strategicstructures.com/?p=1583

I love going to jazz festivals. Listening to good jazz at home is enjoyable, but there is something special about the electricity that sparks during a live performance. And it’s not the same when you listen to a recording of a concert. It is completely different when you are actually there, immersed, experiencing directly with all your senses. I guess it’s similar with other types of music. But what makes the difference between listening to a recording and being at a concert even bigger for jazz is that jazz is all about improvisation. The experience of a single concert is also different from that of a festival. With a concert, you immerse yourself in that magic for a couple of hours and then go back to the normal world. But with a jazz festival, you relocate to live in a music village for a couple of days. This doesn’t only make it a different experience, but also calls for a different kind of decision.

Previously, when I learned of a new jazz festival or read the line-up of a familiar one, the way I decided whether to go was simple. I just checked who would be performing. If there were musicians that I liked, but hadn’t watched live, or some that I had but wanted to see again, then I went. If not, I usually wouldn’t risk it.

Once I chose to go, this brought another set of decisions. Jazz festivals usually have many stages with concerts running in parallel during the day and into the night. Last time I went to the North Sea Jazz Festival there were over eighty performances in only a few days. So there is a good chance that some of those you want to watch will clash, and you are forced to choose. And I kept applying the same low-risk strategy for choosing what to watch as I did for deciding if I should go at all.

Then one day, I arrived late to a festival just before two clashing sets were about to begin. I dashed into the closest hall with no clue what I would find. And there I experienced what turned out to be the best concert of the whole festival. I hadn’t heard of the group and if I had read the description beforehand I would have avoided their performance.

I realised then that, by choosing only concerts with familiar musicians, I was over-exploiting and under-exploring. My strategy was depriving me of learning opportunities and reducing the overall value I got from the festivals.

What happened at that festival changed the way I decide whether to go and which concerts to attend. Now I go to many more concerts by musicians previously unknown to me, and the absence of a familiar name in the line-up no longer stops me from buying tickets.

However, when the whole line-up of the festival is completely unknown, then going is all exploration and that’s highly risky. When there are no familiar musicians, I listen to recordings of previous concerts of some of the groups. If I like at least two of them, then I usually go to the festival, and once there, I still check out a few acts I don’t know. That’s again a way to balance exploitation and exploration.

When I am in control, “I restrict the world to what I can imagine or permit”, writes Ranulph Glanville. He gives the example of going to a restaurant with friends. If it’s always him who chooses the restaurant, the group will only go to the restaurants that he knows. They are limited to his taste and knowledge, or rather – as he admits – ignorance. Letting go of control by letting others choose would not only expand his knowledge but would often give a better experience for everyone.

The wrong strategy for jazz festivals and restaurants only reduces the pleasure, and with stakes this small, these examples may not show how important the balance is (Note: Exploration and Exploitation is one of the Essential Balances in Organizations). Yet, we make similar choices all the time. For example, you might decide to invest your time in getting better at what you currently do well while not allocating time to trying out new things. This may put you in a very unpleasant situation when there is no more demand for what you are skilled at, or when you need a change but have difficulty choosing because you haven’t tested many alternatives.

Yet it seems that throughout our lives, many of us realise that when making choices, we should keep the balance between exploration and exploitation. We should let go of some control and not limit ourselves to what we already know. And that’s an important first step, but it’s not quite enough. It takes a greater effort to keep this awareness awake. But somehow, it’s also easier at a personal level.

We live our lives and are experiencing every minute of every day. We absorb sounds, tastes, smells and light and feel the air on our skin. Through evolution we are well equipped to receive a signal when there is even a small problem. We get a scratch and react right away. That’s not the case with organizations. They might be missing a whole limb or – and here the metaphor will fail to produce a feeling of exaggeration – a head, without noticing for years. To understand how the balance between exploration and exploitation works in organizations, we’ll start with the problem of resource allocation, and then move to more complex situations.

Allocation of resources

You have discovered a gold deposit. How do you allocate your limited resources? How much should you invest in exploiting the deposit you have, versus the amount spent in looking for new deposits? You don’t know how big the lode is or how much gold you can extract from the current mine, and you also don’t know whether you’ll find a new one. If you put all your efforts and resources into exploiting the current deposit, you might not have sufficient resources left to look for new ones when this one is exhausted. And the next one you find might be bigger or of higher quality. However, you might just as well invest a lot in exploring and find no other deposit at all, in which case, unlike with exploitation, all those resources will be wasted.

Companies face this dilemma all the time. For example, when there are a few successful markets (clients, consumer groups or regions), should the company come up with new products and services to sell to the existing markets, or explore new territories with current (or new) offerings? Should it improve the current technologies or explore new ones? This dilemma is present not only for the marketing and production strategy but in almost every investment decision.

Thus, the basic understanding of exploitation-exploration is as an optimisation problem. And indeed, as such, it is heavily researched. In probability theory, it is known as the multi-armed bandit problem, and there are plenty of optimisation strategies and computer algorithms that were developed in the second half of the last century that can lead to solutions. The applications are wide-ranging, from clinical trials and financial portfolio design to machine learning. Going through all of them is beyond the objective of this post. It is sufficient to say that studying them is useful, as a lot of thinking, mathematics, modelling and experimenting have been invested with impressive results. The discovered patterns, algorithms and strategies can be useful to some extent in certain organizational contexts. But overall, all these studies work with a lot of assumptions and simplifications which don’t hold in real-life situations.
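
To make the optimisation view concrete, here is a minimal sketch of epsilon-greedy, one of the simplest strategies for the multi-armed bandit problem. It is only an illustration, not a method proposed in this post. The function name, the payout probabilities (think of them as three deposits of unknown richness), and the values of epsilon, rounds and seed are all assumptions made for the example.

```python
import random


def epsilon_greedy(true_means, epsilon=0.1, rounds=5000, seed=0):
    """Simulate an epsilon-greedy player on arms that pay 1 or 0.

    true_means: hypothetical payout probability of each arm (unknown to the player).
    Returns (pull_counts, estimated_means, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm has been pulled
    estimates = [0.0] * n_arms   # running average reward per arm
    total_reward = 0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: try a randomly chosen arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm that looks best so far (ties go to the first).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1 if rng.random() < true_means[arm] else 0
        counts[arm] += 1
        # Incremental update of the running average for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


# Illustrative usage with made-up payout probabilities.
deposits = [0.2, 0.5, 0.8]
print(epsilon_greedy(deposits, epsilon=0.1))  # some exploration
print(epsilon_greedy(deposits, epsilon=0.0))  # exploitation only
```

With epsilon set to zero, the player in this sketch never leaves the first arm, however poor it is, which is the over-exploitation trap described above. A small positive epsilon spends part of the budget on trying the other arms. Like the studies mentioned, the sketch rests on simplifications, such as fixed payout probabilities and costless switching, that rarely hold in organizations.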

A broader understanding

The exploitation-exploration dilemma is present everywhere in organizations: in strategy, marketing, sales, research, operations and projects. It appears in different guises and is communicated in various terms. Yet it is not very common to hear it explicitly discussed at meetings.

As James March wrote in 1991,

Exploration includes things captured by terms such as search, variation, risk taking, experimentation, play, flexibility, discovery, innovation. Exploitation includes such things as refinement, choice, production, efficiency, selection, implementation, execution. Adaptive systems that engage in exploration to the exclusion of exploitation are likely to find that they suffer the costs of experimentation without gaining many of its benefits. They exhibit too many undeveloped new ideas and too little distinctive competence. Conversely, systems that engage in exploitation to the exclusion of exploration are likely to find themselves trapped in suboptimal stable equilibria. As a result, maintaining an appropriate balance between exploration and exploitation is a primary factor in system survival and prosperity.

Exploration and exploitation compete for resources and so organizations have to make choices. Some of them are explicit, but most are implicit. The explicit choices are seen as decisions made comparing alternatives. A typical example is investment decisions. In comparison, implicit choices are – as March put it – “buried in many features of organizational forms and customs, for example, in organizational procedures for accumulating and reducing slack, in search rules and practices, in the ways in which targets are set and changed, and in incentive systems”.

Working with the exploration-exploitation balance does not only help in seeing the exploitation and exploration patterns in organizational communications and decisions. It also shifts the attitude to what’s going on, away from what is accepted as rational or intuitive.

For example, a quick-learning new employee starts contributing to the organization sooner. She does so by absorbing the organizational knowledge in a shorter time. That may be good for her and for the organization in the short term, but it might be bad in the long run. When a slow learner joins the organization, it will take longer for him to fit in, but that might actually improve the organizational knowledge and norms. And the same person, once well established, will be slow to absorb new knowledge. This would often be a healthy conservatism, as it would reduce the risk of investing in fads, as March pointed out.

Exploration and exploitation in time

Exploitation follows exploration. We first explore the menu, select, order and then consume what we’ve ordered (or what we think we’ve ordered). A pharmaceutical company carries out a lot of experiments to come up with a new formula which will work against a certain disease. These experiments may be futile or fruitful or, in fact, come up with something that does not treat the intended disease but turns out to be useful against another. In the first case, that exploratory path does not end in exploitation, but in the other two, it does, in an expected or an unexpected way. In any case, first comes exploration and then exploitation.

Yet we can see the exploration and exploitation in such a sequence only if we focus on a particular element like choosing the food in a restaurant. But things are connected and they interact all the time. If I go to explore a forest, I can do that by exploiting my shoes. The pharmaceutical company is carrying out experiments exploiting laboratory equipment. Spacecraft explore the universe by exploiting various technologies.

In these examples, exploration and exploitation go in parallel, coordinated, but their objects are different. I’m exploring the forest, but exploiting my shoes when doing that, not the forest. The pharmaceutical company and the spacecraft also have different objects of exploration and exploitation. There are some cases, however, when exploration and exploitation can be on the same object and at the same time. And this can work pretty well.

One such case is Twitter. Sharing a tweet using a so-called “re-tweet” was neither designed nor planned as a feature in the initial releases of Twitter. People simply started tweeting others’ tweets, adding “RT” for re-tweet, and this was taken up and went viral. Then both Twitter and the apps and services in the Twitter ecosystem added a lot of capabilities around RT. It evolved this way because people were exploring Twitter at the same time they were exploiting it. That exploration produced many other ways of using it which did not catch on, or at least not to the point of becoming one of the essential capabilities of the service.

Something similar happens in the mobile apps markets. By the end of 2018, there were over two million apps in each of the two biggest stores. By making it easy for the app writers to release new versions and simple for the app users to install and update, an ecosystem was created that was quite different from the traditional software world. When releasing new apps and features, app writers explore the market while at the same time exploiting it. Each app and each feature work as an almost unbiased market survey. At the same time, they are actual business, actual exploitation, with revenue being generated either by ads or by selling the app.

It’s similar for the users. They don’t know what will fit their needs and preferences. While trying out new apps and features, app users are also using them, in this way exploring and exploiting at the same time.

Depending on the perspective (users or providers) and the zoom level (features, apps, market participants), we can see different dynamics. At the level of a single app, this period of parallel exploration and exploitation evolves into exploitation only, but if we zoom out, we’ll see that they keep running in parallel. While being used to certain apps, users keep exploring the app market. By utilising new apps (a level up) or new features (a level down), they co-explore with the app writers.

App writers, both individuals and companies, compete by improving the existing capabilities and releasing new ones. Some of them develop new apps, entering again into a mode of parallel compressed exploration and exploitation. The balance can be seen at the next level as well, the level of the developers (entrepreneurs). New ones come and some grow, others go. The invisible hand of the market produces entrepreneurs who try out new things and stay (exploit) if they are successful and leave if not.

Going to jazz festivals taught me a lesson about the balance between exploration and exploitation. Observing app markets shows what it is to explore while exploiting. But for an example of the latter, I could’ve just stayed at the jazz festival. Jazz improvisation is where exploitation and exploration run in parallel to form a compressed and precarious balance. With classical music, exploration and exploitation are separated in time and space. A composer explores by trying out different harmonies and melodies and later on an orchestra exploits the composed fixed sequence of notes with prescribed length and manner of playing. In a jazz band, composing and performing happen at the same time. Each musician is exploring and exploiting the territory marked by the main theme, their own ideas and others’ inventions and provocations. It takes a long time and hard work to reach that level of awareness, intuition and skill. Musicians put in many years of practice to produce just a few minutes of good unprepared music. The years of preparation supply jazz musicians both with an arsenal of patterns to use when short of ideas (exploit) and with the skills necessary to break out of them (explore).

For organizations, over-exploiting would either exhaust the resource they depend on or make them slow to adapt and less competitive, until eventually they are driven out of the market. Over-exploring, on the other hand, would exhaust their own resources. Maintaining the balance is always necessary for survival. However, depending on the level of complexity, just maintaining it might not be sufficient, and how the balance is kept becomes crucial.

In biological evolution, sight might be considered one of the most impressive achievements. For species that excel in seeing, perception (exploitation) happens together with, and is complemented by, eye movement (exploration). The evolutionary advantage of not just balancing exploration and exploitation but bringing them closer together applies to organizations as well, as the examples of Twitter and app markets show.

The more uncertain the environment, the more exploration and exploitation should follow each other in shorter cycles, and the more organizations need to invent new business models where exploration and exploitation go together or are both characteristics of one and the same activity.

This is an excerpt from the chapter “Exploration and Exploitation” from the forthcoming book “Essential Balances in Organizations”.

Thanks to Rob Worth for reviewing and improving the text.

