BibRank: Automatic Keyphrase Extraction Platform Using Metadata

Date

2021

Publisher

Tartu Ülikool

Abstract

Automatic keyphrase extraction is the process of automatically identifying the essential phrases in a document. Keyphrases are used in crucial tasks such as document classification, clustering, recommendation, indexing, searching, and summarization. This thesis introduces BibRank, a new semi-supervised automatic keyphrase extraction method that exploits an information-rich dataset collected by parsing bibliographic data in BibTeX format. BibRank combines a novel weighting technique for the bibliographic data with positional, statistical, and word co-occurrence information. We benchmarked BibRank and state-of-the-art techniques against this dataset. The evaluation indicates that BibRank is more stable and performs better than the state-of-the-art methods.

Keywords

Keyphrase Extraction, Metadata, Natural Language Processing
