Implementing a machine learning model for ranking in e-commerce search requires a well-designed approach to defining the target metric. The challenge in e-commerce is that “relevance, given an intent to buy something” differs from the pure question “How relevant is this product to a given query?”. Thus, we cannot use crowdsourced data for training; instead, we need to derive relevance from the customer interactions we collect every day on our website. We call these calculated relevance values judgements. We use them to create the gold-standard ranking on which we train our ranking model.
Based on experiments, we determine both the KPI (key performance indicator, e.g. clicks or orders) and the mathematical model of judgements. For a sampled set of search queries, we show 50 percent of our customers a ranking based on our judgements, while the other 50 percent see the status quo. By presenting our customers with a ranking based on a pure target metric, we can evaluate the quality of our definition of relevance.
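To make the idea concrete, here is a minimal sketch of how judgements might be computed from interaction logs and used in a 50/50 experiment. It is not the model presented in the talk: the event format, the `KPI_WEIGHTS` values, the per-impression normalisation, and the hash-based split are all assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict

# Hypothetical KPI weights (assumption); the talk determines the KPI
# and the judgement model experimentally.
KPI_WEIGHTS = {"click": 1.0, "order": 5.0}


def judgements(events):
    """Compute one judgement per (query, product) pair.

    events: iterable of (query, product_id, kind) tuples, where kind is
    "impression", "click" or "order" (assumed log format).
    The judgement is the weighted interaction count per impression.
    """
    impressions = defaultdict(int)
    weighted = defaultdict(float)
    for query, product, kind in events:
        key = (query, product)
        if kind == "impression":
            impressions[key] += 1
        else:
            weighted[key] += KPI_WEIGHTS.get(kind, 0.0)
    return {key: weighted[key] / n for key, n in impressions.items() if n > 0}


def rank_by_judgement(query, scores):
    """Order the products seen for a query by descending judgement."""
    candidates = [(p, s) for (q, p), s in scores.items() if q == query]
    return [p for p, _ in sorted(candidates, key=lambda x: x[1], reverse=True)]


def variant(customer_id):
    """Assign a customer deterministically to one half of the experiment."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return "judgement" if digest[0] % 2 == 0 else "status_quo"
```

Customers in the `judgement` variant would see the output of `rank_by_judgement` for the sampled queries, while the rest see the existing ranking. Comparing the chosen KPI between the two groups then indicates how well the judgement definition captures relevance.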
In this talk, we will share what we learned about our customers and our products, and about the advantages of fast iterations on the way to finding a good judgement model.
This talk is presented by Otto.