A Probabilistic Model for Personalized Tag Prediction

Dawei Yin, Zhenzhen Xue, Liangjie Hong and Brian D. Davison

Full Paper (10 pages)
Official ACM published version: http://doi.acm.org/10.1145/1835804.1835925
Author's version: PDF (192KB)

Watch KDD Presentation (10 min)

Abstract

Social tagging systems have become increasingly popular for sharing and organizing web resources, and tag recommendation is a common feature of such systems. Social tagging is by nature an incremental process: once a user has saved a web page with tags, the tagging system can use that accumulated behavior to make more accurate predictions for the same user. Existing tag prediction methods, however, do not take this factor into account, because their training and test datasets are either split at a fixed timestamp or randomly sampled from a larger corpus. In our temporal experiments, we perform time-sensitive sampling on an existing public dataset, which yields a scenario much closer to real-world use. In this paper, we address the problem of tag prediction by proposing a probabilistic model for personalized tag prediction. The model takes a Bayesian approach and integrates three factors: an ego-centric effect, environmental effects, and web page content. Two methods are provided for parameter estimation: intuitive calculation and learning optimization. Pure graph-based methods may impose significant constraints (for example, that every user, every item, and every tag occur in at least p posts) and therefore cannot make a prediction in most real-world cases. Our model, by contrast, improves the F-measure by over 30% compared to a leading algorithm on a publicly available real-world dataset.
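The incremental, time-ordered evaluation described in the abstract can be illustrated with a minimal sketch. This is not the paper's model or its experimental code: the post layout `(timestamp, user, tags)`, the function names `f_measure` and `incremental_eval`, and the frequency-based predictor are all assumptions made for illustration. The predictor is a simple stand-in that uses only the user's own past tags, and it ignores environmental effects and page content.

```python
from collections import Counter, defaultdict


def f_measure(predicted, actual):
    """Harmonic mean of precision and recall for one post's tag set."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(actual)
    return 2 * precision * recall / (precision + recall)


def incremental_eval(posts, k=2):
    """Replay posts in time order, predicting each one from strictly earlier posts.

    `posts` is an iterable of (timestamp, user, tags) tuples (hypothetical layout).
    Returns the mean per-post F-measure.
    """
    history = defaultdict(Counter)  # user -> counts of tags used so far
    scores = []
    for _, user, tags in sorted(posts, key=lambda p: p[0]):
        # Stand-in predictor (an assumption): the user's k most frequent past tags.
        predicted = [tag for tag, _ in history[user].most_common(k)]
        scores.append(f_measure(predicted, tags))
        # Only after scoring does the post become part of the user's history.
        history[user].update(tags)
    return sum(scores) / len(scores) if scores else 0.0
```

Because each post is scored before it is added to the history, a user's first post always receives an empty prediction, and later posts benefit from everything that user has tagged so far. This differs from a single fixed-timestamp or random split, where later behavior never feeds back into the predictor.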

In Proceedings of the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 959-968, Washington, DC, July 2010.

Last modified: 3 October 2010
Brian D. Davison