Online merchandising and the bandit problem

There is a perception in ecommerce that a focus on promoting top selling products is a sound strategy. That is perfectly logical, reasonable, and wrong.

In fact, promoting only top selling products is a complete waste of time, money and effort, which springs from a fundamental misunderstanding of the ecommerce challenge and how it differs from the offline world.

Intelligent exposure

Product promotion via recommendation and search builds on one of retail’s oldest techniques – the top seller list, which first appeared in ‘The Bookman’ in London in 1891.

The success of the top seller list in that context was built on three simple truths. First, people buy what is popular; second, showing people the best sellers makes them buy more; and third, top seller lists encourage people to buy from the top of the list, not the long tail.

Offline, there are very good reasons for this: limited space for stock and the need to keep production costs down, for instance, mean the imperative is to streamline demand.

But ecommerce presents very different challenges, and demands a different view of promotion. Online, space is practically unlimited so the imperative is simpler – maximise sales. As a result, promotion is not about encouraging a focus on the top sellers only; it is about maximising product exposure in an intelligent way. In this context, intelligent means relevant – that is, exposing more of the right products to the right people, more of the time.

In terms of the timeless top seller list, our proxy for product promotion here, those online priorities change the rules. Yes, the aim is still to encourage people to buy more, but the need to expose more products, not fewer, requires that we add a crucial new ingredient to the mix - volatility.

Online, ranking top sellers by sales over a fixed window as long as a month is the wrong approach, because a brief spike is averaged away before it ever reaches the list. Rather, recommendations and search must reflect both consistent best sellers and micro trends – those short-lived, but highly valuable sales spikes driven by hard-to-predict external events. Volatility is critical to capitalising on those events, when the time is right and for the right period of time.
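To make the point about time windows concrete, here is a minimal sketch in Python. It assumes one possible way of adding volatility, an exponentially decayed sales score, which is an illustration and not the method described in this article. The half-life value, the function name and the sales data are all hypothetical.

```python
def decayed_score(sale_ages_hours, half_life_hours=24.0):
    """Sum of sales, each weighted by 0.5 ** (age / half-life).

    A sale made just now counts as 1.0; a sale made one half-life ago
    counts as 0.5, and so on. The half-life is an assumed tuning knob.
    """
    return sum(0.5 ** (age / half_life_hours) for age in sale_ages_hours)


# Hypothetical data: hours elapsed since each individual sale.
steady_seller = [6 * i for i in range(120)]   # one sale every 6 hours for 30 days
sudden_spike = [0.5 * i for i in range(40)]   # 40 sales, all in the last 20 hours

# A plain monthly count ranks the steady seller first...
monthly_count = {"steady": len(steady_seller), "spike": len(sudden_spike)}

# ...while the decayed score ranks the spiking product first.
decayed = {
    "steady": decayed_score(steady_seller),
    "spike": decayed_score(sudden_spike),
}

print(monthly_count)
print(decayed)
```

With these made-up numbers the monthly count favours the steady seller, while the decayed score favours the product that is spiking right now. A shorter half-life makes the list more volatile; a longer one moves it back towards the classic monthly chart.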

Exploitation versus exploration

In essence, then, online promotion via recommendation and search is about striking the right balance between exploitation and exploration:

  • Exploiting the top sellers to maintain consistent sales and to preserve their impact as the all-important basket openers.
  • Encouraging exploration further into the product catalogue, to drive additional sales while gathering the information that helps to identify emerging trends and predict future best sellers.

Crucially, that exploration should be shaped by a mixture of behavioural feedback to identify trends and business metrics to ensure promotion makes commercial sense.
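As a rough illustration of that balance, the sketch below fills a promotion slate with a few top sellers and then adds exploration slots chosen by a score that combines behavioural feedback (recent conversion) with a business metric (margin). The Product fields, the scoring formula, the slot counts and the catalogue are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    units_sold: int    # longer-term sales volume
    recent_views: int  # behavioural feedback: recent impressions
    recent_buys: int   # behavioural feedback: recent purchases
    margin: float      # business metric: profit per unit


def explore_score(p):
    # Smoothed conversion rate, so products that have barely been shown
    # are not stuck at zero (an assumed, deliberately optimistic prior).
    conversion = (p.recent_buys + 1) / (p.recent_views + 2)
    return conversion * p.margin


def build_slate(products, exploit_slots=2, explore_slots=2):
    by_sales = sorted(products, key=lambda p: p.units_sold, reverse=True)
    top_sellers = by_sales[:exploit_slots]   # exploitation
    rest = by_sales[exploit_slots:]
    explorers = sorted(rest, key=explore_score, reverse=True)[:explore_slots]
    return top_sellers + explorers           # exploitation first, then exploration


# Hypothetical catalogue.
catalogue = [
    Product("A", 900, 1000, 50, 2.0),
    Product("B", 700, 800, 30, 2.5),
    Product("C", 50, 40, 10, 3.0),
    Product("D", 30, 200, 2, 5.0),
    Product("E", 10, 0, 0, 4.0),
]

slate = [p.name for p in build_slate(catalogue)]
print(slate)
```

In this toy catalogue the two best sellers keep their places as basket openers, while the remaining slots go to a product that has never been shown and to one that converts well on limited exposure. The high-margin product that is already known to convert poorly is left out.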

The bandit problem

As it happens, this exploitation versus exploration conundrum is neatly encapsulated in a probability theory scenario known as the multi-armed bandit problem. In simple terms, the multi-armed bandit problem describes a gambler standing at a row of slot machines (or one-armed bandits) who must decide which machines to play, in what order and how many times in order to maximise returns.

The crucial trade-off the gambler faces is between exploiting the machine with the highest expected pay-out so far, and exploring the other machines to learn whether any of them pays out more.
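The trade-off can be sketched with epsilon-greedy, one of the simplest textbook strategies for the bandit problem. It is shown here purely as an illustration, not as the approach used by any retailer mentioned in this article. The pay-out probabilities, the number of rounds and the exploration rate are all assumed values.

```python
import random


def epsilon_greedy(payout_probs, rounds=5000, epsilon=0.1, seed=0):
    """Play a row of simulated slot machines with an epsilon-greedy gambler.

    With probability epsilon the gambler explores a random machine;
    otherwise the gambler exploits the machine with the best estimate so far.
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    pulls = [0] * n_arms        # how often each machine was played
    estimates = [0.0] * n_arms  # running average pay-out per machine
    total_reward = 0

    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                            # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])   # exploit
        reward = 1 if rng.random() < payout_probs[arm] else 0
        pulls[arm] += 1
        # Incremental update of the running average for the chosen machine.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return pulls, estimates, total_reward


# Three hypothetical machines; the gambler does not know these probabilities.
pulls, estimates, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
print(pulls, [round(e, 2) for e in estimates], total_reward)
```

A gambler who never explores (epsilon of 0) can lock on to whichever machine happens to pay out first. A small exploration rate costs a few pulls on weaker machines but lets the gambler discover the best one and spend most of the remaining rounds there.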

In ecommerce, focusing on top sellers is like pulling only one lever, and ignoring all the other machines. The challenge in this variation on the bandit problem is not to find the best single slot machine, but to find the combination of slot machines that, together, pays out more than the best single machine.

This preserves the upsides of promoting top sellers, but also enables the significant benefits associated with exploration, or adding volatility to promotion – benefits that are otherwise ignored.

The winds of change

Few retailers have yet come to terms with these fundamental differences between offline and online merchandising, much less implemented strategies that exploit them. It is, however, only a matter of time. Once the results enjoyed by the few that have taken the lead are widely understood, the rest will follow.

One such leader is Adlibris, a Swedish online bookseller. It took on the challenge of transforming a key element of online merchandising - using real time behavioural data and longer term sales volumes to predict top seller trends and therefore surface more relevant products, more of the time, and for the right period of time. The results were extraordinary.

The announcement of the 2014 Nobel Peace Prize, awarded to Malala Yousafzai, provided the ultimate test, as it drove a huge spike in interest in her autobiography, I am Malala. And Adlibris’ sophisticated merchandising model passed with flying colours.

It drove efficient navigation and merchandising across multiple categories – even autocomplete - and merchandising was optimised in terms of timing. The hot trend was promoted for precisely the right time period – and this was clearly reflected in A/B test results.

In search alone, this new approach to merchandising in the online context drove revenue increases of 6.4 per cent (over two consecutive A/B tests), while the number of search refinements decreased by 41 per cent. Overall, the solution delivered a return on investment of £30,000 per month.

In a retail environment where smart retailers are already questioning the value of standalone, plug and play merchandising solutions, it's only a matter of time before this kind of holistic approach is the norm, not the exception.

Jakob Bignert, Vice President of Product at Apptus
