Seminar: Ioanna Manolopoulou

STATISTICS SEMINAR SERIES, APRIL 2021

Ioanna Manolopoulou, Department of Statistical Science, UCL, UK

Joint work with: Mariflor Vega and Mirco Musolesi

Posterior summaries of topic models: an example from grocery retail baskets

ABSTRACT

Understanding the shopping motivations behind market baskets is an important goal in the grocery retail industry. Analyzing shopping transactions demands techniques that can cope with the volume and complicated dependencies of grocery transactional data, while keeping interpretable outcomes. Latent Dirichlet Allocation (LDA) provides a natural framework to process grocery transactions and to discover a broad representation of customers' shopping motivations. However, summarising the posterior distribution of an LDA model is challenging, because LDA is inherently a mixture model and can exhibit substantial label-switching. Averaging across posterior draws (even after resolving label-switching) inevitably merges semantically different topics which may appear or disappear across draws, and whose average may be semantically meaningless. Moreover, a summary of corresponding uncertainty is not straightforwardly available. In this paper, we introduce clustering methodology that post-processes posterior LDA draws to summarise the entire posterior distribution and identify semantic modes represented as recurrent topics. We illustrate our methods on an example from a large UK supermarket chain.
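As a rough illustration of the general idea of pooling topics across posterior draws and grouping them into recurrent ones, here is a minimal Python sketch. It is not the speakers' methodology: the greedy cosine-distance clustering, the function name cluster_topics, the threshold value, and the toy data are all assumptions made for this example. Each draw is taken to be an array whose rows are topic-word distributions, and a topic's "recurrence" is taken to be the fraction of draws in which its cluster appears.

```python
import numpy as np


def cluster_topics(draws, threshold=0.3):
    """Greedily cluster topics pooled from several posterior draws.

    draws: list of arrays of shape (n_topics, vocab_size); each row is a
        topic-word distribution from one posterior draw.
    threshold: maximum cosine distance to a cluster centroid for a topic
        to join that cluster (an assumed, illustrative value).
    Returns a list of dicts with keys 'centroid', 'members' and 'recurrence'.
    """
    clusters = []
    for d, topics in enumerate(draws):
        for k, topic in enumerate(np.asarray(topics, dtype=float)):
            best, best_dist = None, threshold
            for c in clusters:
                centroid = c["sum"] / len(c["members"])
                cos = topic @ centroid / (
                    np.linalg.norm(topic) * np.linalg.norm(centroid)
                )
                dist = 1.0 - cos
                if dist < best_dist:
                    best, best_dist = c, dist
            if best is None:
                # No existing cluster is close enough: start a new one.
                clusters.append({"sum": topic.copy(), "members": [(d, k)]})
            else:
                best["sum"] += topic
                best["members"].append((d, k))

    n_draws = len(draws)
    return [
        {
            "centroid": c["sum"] / len(c["members"]),
            "members": c["members"],
            # Fraction of draws in which this cluster has at least one topic.
            "recurrence": len({d for d, _ in c["members"]}) / n_draws,
        }
        for c in clusters
    ]


# Toy example: three draws over a four-word vocabulary. The labels are
# switched in the second draw, and the third draw contains a topic that
# appears nowhere else.
toy_draws = [
    np.array([[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.7]]),
    np.array([[0.0, 0.1, 0.3, 0.6], [0.6, 0.3, 0.1, 0.0]]),
    np.array([[0.7, 0.2, 0.1, 0.0], [0.05, 0.45, 0.45, 0.05]]),
]
summary = cluster_topics(toy_draws)
for c in summary:
    print(c["members"], round(c["recurrence"], 2))
```

Because topics are matched by similarity rather than by index, the switched labels in the second toy draw do not matter, and the topic that occurs in only one draw is kept as its own low-recurrence cluster instead of being averaged into another.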

Teams link: https://bit.ly/3opz2kK

Event date:
Thursday, April 1, 2021 - 12:30