The Tag Genome Dataset for Books

Denis Kotkov, Alan Medlar, Alexandr Maslov, Umesh Raj Satyal, Mats Neovius, Dorota Glowacka

Research output: Chapter in book/report/conference proceedings › Conference article › Scientific › Peer-reviewed

3 Citations (Scopus)


Attaching tags to items, such as books or movies, is found in many online systems. While a majority of these systems use binary tags, continuous item-tag relevance scores, such as those in tag genome, offer richer descriptions of item content. For example, tag genome for movies assigns the tag “gangster” to the movie “The Godfather (1972)” with a score of 0.93 on a scale of 0 to 1. Tag genome has received considerable attention in recommender systems research and has been used in a wide variety of studies, from investigating the effects of recommender systems on users to generating ideas for movies that appeal to certain user groups.

In this paper, we present tag genome for books, a dataset containing book-tag relevance scores, where a significant number of tags overlap with those from tag genome for movies. To generate our dataset, we designed a survey based on popular books and tags from the Goodreads dataset. In our survey, we asked users to provide ratings for how well tags applied to books. We generated book-tag relevance scores based on user ratings along with features from the Goodreads dataset. In addition to being used to create book recommender systems, tag genome for books can be combined with the tag genome for movies to tackle cross-domain problems, such as recommending books based on movie preferences.
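
The cross-domain use mentioned above can be illustrated with a minimal sketch. It assumes that each item is a mapping from tag to relevance score on the 0-to-1 scale and that books are ranked by cosine similarity over the tags shared with a movie. This is one simple option, not the method used in the paper. Apart from the 0.93 "gangster" score for "The Godfather (1972)", every score, book title, and function name below is hypothetical.

```python
import math

# Toy item-tag relevance scores on the 0-to-1 scale.
# Only the 0.93 "gangster" score for "The Godfather (1972)" comes from the
# abstract; all other scores and both book titles are invented for illustration.
movie_tags = {
    "The Godfather (1972)": {"gangster": 0.93, "family": 0.80, "romance": 0.10},
}
book_tags = {
    "hypothetical crime novel": {"gangster": 0.90, "family": 0.70, "romance": 0.05},
    "hypothetical love story": {"gangster": 0.02, "family": 0.30, "romance": 0.95},
}


def cosine_on_shared_tags(a, b):
    """Cosine similarity restricted to the tags present in both items."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(a[t] ** 2 for t in shared))
    norm_b = math.sqrt(sum(b[t] ** 2 for t in shared))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_books_for_movie(movie, movies, books):
    """Return (book, similarity) pairs, most similar to the movie first."""
    profile = movies[movie]
    scored = [(title, cosine_on_shared_tags(profile, tags))
              for title, tags in books.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_books_for_movie("The Godfather (1972)", movie_tags, book_tags)
for title, score in ranking:
    print(f"{title}: {score:.3f}")
```

Restricting the comparison to shared tags matters here because only part of the tag vocabulary overlaps between the two datasets. A real system would load the published relevance scores instead of these toy dictionaries.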
Title of host publication: Proceedings of the 2022 ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR ’22)
Subtitle: March 14–18, 2022, Regensburg, Germany
Place of publication: New York
ISBN (print): 978-1-4503-9186-3
Status: Published - 2022
Ministry of Education and Culture (OKM) publication type: A4 Article in conference proceedings
Event: ACM SIGIR Conference on Human Information Interaction and Retrieval: CHIIR -
Duration: 14 Mar 2022 → …


Conference: ACM SIGIR Conference on Human Information Interaction and Retrieval
Period: 14/03/22 → …

