August 20, 2020

Good moves and goodmoves

"All that matters on the chessboard is good moves."
Bobby Fischer, world chess champion

This quote comes from Bobby Fischer, winner of the 1972 "Match of the Century", in which he defeated his Soviet opponent Boris Spassky to become the eleventh World Chess Champion.

In this non-technical overview, you’ll learn how reinforcement learning can teach itself the best moves, both on the chessboard and when engaging with customers.

We chose deep reinforcement learning to tackle the challenges of campaign management for three reasons:

The art of generalization

Deep neural nets can be used to find patterns amongst many individual customer touch-points

The number of possible chess games is enormous -- some 10^120 by Claude Shannon's classic estimate! This number is so vast that it exceeds the number of atoms in the observable universe. Even the number of distinct board states alone is far too large to store one by one. To still cover many of the possibilities, we can group comparable board states together. That’s precisely what deep neural nets are for: they are tools for finding those patterns in data that generalize the most.

By observing many matches, deep neural nets can learn that these two board states are similar.

.. deep neural nets are tools for finding patterns in data that generalize the most.

By learning these patterns, deep neural nets can make informed decisions for data they have never seen before -- be it in chess or in campaign management. Because of their power to generalize, deep neural nets can match human performance in leukemia diagnosis and greatly outperform it in chess-playing ability.

As in chess, the number of all possible interactions with your customers is enormous. goodmoves uses deep neural nets to learn what patterns generalize well across many individual customer touch-points. By learning these patterns, our deep neural nets can make informed decisions for new customers and in novel situations.

goodmoves uses deep neural nets to make informed decisions for new customers and in novel situations

Despite the hype surrounding deep learning, we recognize that it is not always the right choice. Deep learning methods typically require large amounts of data to identify relevant features and associate them with a suitable action. Such amounts of data are not always available. goodmoves therefore relies on a family of complementary algorithms that stand by to take the helm.

Acting with foresight

Reinforcement learning can be used for long-term, strategic thinking

What makes a chess move “good”? For example, is it always bad to lose a piece? Usually it is, but sacrificing a piece can also be used to great strategic advantage. The loss of a single piece can be the best play if, for example, it forces the opponent to break up an advantageous formation. In other words, decisions should be assessed based on their long-term benefit as opposed to immediate gains. In both chess and campaign management, the best move is the one that maximizes pay-off for an entire sequence of actions rather than for decisions individually.


.. the best move is the one that maximizes pay-off for an entire sequence of actions rather than for decisions individually.


What kind of algorithm incorporates the sequence-dependent nature of chess and campaign management? The answer is elegant yet subtle: if the result of an individual decision within a sequence of interactions was good, one increases the desirability value not only of this decision, but of all decisions that led to this situation. After all, winning a game of chess isn’t only about the very last move you play! On the contrary, a single key move may have set the stage for a win (or loss) many moves later.


.. if the result of an individual decision within a sequence of interactions was good, one increases the desirability value not only of this decision, but of all decisions that led to this situation
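One common way to spread credit backwards over a sequence is the discounted return: each decision is credited with its own reward plus a shrinking share of every reward that follows. The snippet below is a minimal sketch of that idea, not a description of how goodmoves is implemented. The function name, the discount factor `gamma` and the example journey are all assumptions made for illustration.

```python
def discounted_returns(rewards, gamma=0.9):
    """For each step, sum that step's reward and all later rewards,
    with later rewards weighted down by powers of gamma."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so that each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical customer journey: three touch-points without immediate pay-off,
# followed by a contract renewal worth 1.0.
journey_rewards = [0.0, 0.0, 0.0, 1.0]
credit = discounted_returns(journey_rewards, gamma=0.5)
print(credit)  # [0.125, 0.25, 0.5, 1.0]
```

Every earlier decision receives part of the credit for the final renewal, and decisions closer to the outcome receive more of it. A `gamma` close to 1 would treat early and late decisions almost equally.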


A typical customer journey at a mobile virtual network operator. In this example, offering a 5% discount upfront would have eliminated the need for a 10% win-back bonus.

What does this have to do with customer loyalty? Well, offering a well-placed discount to a customer can be an example of strategic thinking -- in exactly the same way as sacrificing a pawn in the game of chess! Just like in chess, what is the best play in campaign management crucially depends on context and sequence. People refer to this as the problem of the next-best-action (NBA). The word "next" indicates the sequential nature of interacting with customers.

A customer who is considering terminating their contract might be dissuaded from churning -- but only if given the right offer at the right time. For some customers, that might be a short-term sacrifice in the form of a discount; for others, a notification about additional premium features. Which of these actions to take depends on the customer (that's the context part). When to propose an action depends on how a customer will respond to it in the long term (that’s the sequence part of NBA).

The problem of churn does not usually arise from a single decision at the end of the contract, but rather from a sequence of earlier suboptimal decisions. At res mechanica, we took on the challenge of finding the NBA using reinforcement learning, the state-of-the-art artificial intelligence tool for solving sequential decision problems.


With the help of reinforcement learning, decisions can be optimized in their entirety and dynamically adapted.

Orchestrating all measures in sometimes long-standing customer relationships, across many channels and within a highly dynamic market, poses a massive problem for many companies. Due to a lack of awareness or against their better judgement, companies have so far optimized individual decisions in isolation, often based on simple rules. With the help of reinforcement learning, decisions can be optimized in their entirety and dynamically adapted.

Adaptability

Reinforcement learning algorithms can adapt to changing markets without expert fine-tuning

goodmoves engages with customers in an ever-changing market by continuously trying out new strategies. goodmoves explores these new strategies only rarely, because the current strategy is close to optimal, making those trial strategies suboptimal most of the time. But exploring new strategies is crucial -- the alternative is maintaining a strategy that was optimized for a market that no longer exists. Thus, there exists a trade-off between exploiting optimal ways to engage with customers in the market of yesterday, and exploring novel strategies that may better fit the market of today.


.. there's a trade-off between exploiting optimal ways to engage with customers in the market of yesterday, and exploring novel strategies that may better fit the market of today
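A textbook way to balance exploiting and exploring is the epsilon-greedy rule: usually pick the action that currently looks best, but with a small probability try a random one. The sketch below illustrates that rule on a simulated market that changes halfway through. It is an illustration under stated assumptions, not the algorithm goodmoves actually uses. The function names, the two hypothetical offers, their response rates, `epsilon` and `step_size` are all invented for the example.

```python
import random


def choose_action(estimates, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))  # explore: any action, at random
    return max(range(len(estimates)), key=lambda a: estimates[a])  # exploit


def update_estimate(estimates, action, reward, step_size=0.1):
    """Move the estimate for the chosen action a small step towards the reward.
    A constant step size lets old observations fade, which suits a changing market."""
    estimates[action] += step_size * (reward - estimates[action])


def simulate(response_rates_by_phase, steps_per_phase, epsilon=0.1, seed=0):
    """Run epsilon-greedy through successive market phases; return final estimates."""
    rng = random.Random(seed)
    estimates = [0.0] * len(response_rates_by_phase[0])
    for rates in response_rates_by_phase:
        for _ in range(steps_per_phase):
            action = choose_action(estimates, epsilon, rng)
            reward = 1.0 if rng.random() < rates[action] else 0.0
            update_estimate(estimates, action, reward)
    return estimates


# Hypothetical market: offer 0 works best at first, then offer 1 takes over.
estimates = simulate([[0.8, 0.2], [0.2, 0.8]], steps_per_phase=2000)
print(estimates)
```

With `epsilon=0` the rule would never try the second offer again once the first one looked better, so it could not notice that the market had shifted. The occasional random trial is what keeps the estimates current.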


Adapting to dynamic markets has traditionally been done through expert tuning and A/B tests. Both are costly and slow. Expert tuning has the advantage of human insight and intuition, but does not take into account all available information about your customers. A/B tests do incorporate that information, but can only be run infrequently. Because goodmoves tries out new strategies on the fly, it cuts costs by learning autonomously from all available information, and raises revenue by dynamically adapting to market trends.


.. goodmoves cuts costs by learning autonomously, and raises revenue by dynamically adapting to market trends.

Summary

Succeeding in either chess or campaign management requires three basic skills, which goodmoves tackles using deep reinforcement learning:

generalize
Use deep learning to carry over what past customer touch-points have taught you to new customers and new situations.
act with foresight
Use reinforcement learning to predict the pay-off for an entire sequence of actions rather than for decisions individually.
adapt
Keep exploring novel strategies to keep pace with changes in the marketplace.

Did we spark your interest? At res mechanica, we can statistically estimate the added value of goodmoves by running it on your historical data. Please contact us for a complimentary consultation.

For the technology behind goodmoves, we achieved the highest score ever awarded for the renowned "EXIST" grant from the German Federal Ministry for Economic Affairs and Energy. Renowned companies from industries with long-term customer relationships (banks, mobile service providers, insurance and utility companies) are using our services.

We are happy to talk to you about customer retention and about a statistical estimate of the benefits of goodmoves based on your historical data. Just get in touch!