Released

Paper

Uncovering Hidden Semantics of Set Information in Knowledge Bases

MPS-Authors
Ghosh, Shrestha
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Razniewski, Simon
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Fulltext (public)

arXiv:2003.03155.pdf
(Preprint), 2MB

Citation

Ghosh, S., Razniewski, S., & Weikum, G. (2020). Uncovering Hidden Semantics of Set Information in Knowledge Bases. Retrieved from http://arxiv.org/abs/2003.03155.


Cite as: https://hdl.handle.net/21.11116/0000-0007-0662-4
Abstract
Knowledge Bases (KBs) contain a wealth of structured information about
entities and predicates. This paper focuses on set-valued predicates, i.e., the
relationship between an entity and a set of entities. In KBs, this information
is often represented in two formats: (i) via counting predicates such as
numberOfChildren and staffSize, which store aggregated integers, and (ii) via
enumerating predicates such as parentOf and worksFor, which store individual set
memberships. The two formats are typically complementary: unlike enumerating
predicates, counting predicates do not reveal individual set members, but they
are more likely to be informative about the true set size. Their coexistence
could therefore enable interesting applications in question answering and KB
curation.
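
To make the complementarity concrete, here is a minimal sketch that compares a counting predicate with an enumerating predicate on a toy set of triples. The triples, the entity names, and the helper `compare_count_and_enumeration` are hypothetical illustrations and are not taken from the paper or its system; only the predicate names numberOfChildren and parentOf come from the abstract.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not data from the paper.
TRIPLES = [
    ("alice", "numberOfChildren", 3),
    ("alice", "parentOf", "bob"),
    ("alice", "parentOf", "carol"),
    ("dave", "numberOfChildren", 1),
    ("dave", "parentOf", "erin"),
]


def compare_count_and_enumeration(triples, counting_pred, enumerating_pred):
    """Return {subject: (stated_count, enumerated_count)} for subjects with both."""
    stated = {}                 # subject -> integer stored by the counting predicate
    listed = defaultdict(set)   # subject -> set of objects listed by the enumerating predicate
    for subject, predicate, obj in triples:
        if predicate == counting_pred:
            stated[subject] = int(obj)
        elif predicate == enumerating_pred:
            listed[subject].add(obj)
    return {s: (stated[s], len(listed[s])) for s in stated if s in listed}


report = compare_count_and_enumeration(TRIPLES, "numberOfChildren", "parentOf")
for subject, (stated_count, enumerated_count) in sorted(report.items()):
    # A positive gap suggests the enumeration is incomplete for this subject.
    gap = stated_count - enumerated_count
    print(f"{subject}: stated={stated_count}, enumerated={enumerated_count}, gap={gap}")
```

In this toy data the count says more about the set size than the enumeration does, while only the enumeration names individual members, which is the kind of gap a KB curation tool could flag.
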
In this paper we aim to uncover this hidden knowledge. We proceed in two
steps. (i) We identify set-valued predicates among the predicates of a given KB
via statistical and embedding-based features. (ii) We link counting predicates
and enumerating predicates by a combination of co-occurrence, correlation and
textual relatedness metrics. We analyze the prevalence of count information in
four prominent knowledge bases, and show that our method achieves up to
0.55 F1 score in set predicate identification versus 0.40 F1 score for a random
selection, and normalized discounted gains of up to 0.84 at position 1 and 0.75
at position 3 in relevant predicate alignments. Our predicate alignments are
showcased in a demonstration system available at
https://counqer.mpi-inf.mpg.de/spo.
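
The linking signals and the ranking measure named in the abstract can be illustrated with a short sketch. It assumes a simple co-occurrence ratio, a Pearson correlation between stated counts and enumerated set sizes, and the standard NDCG formula as a reading of "normalized discounted gains". These are illustrative stand-ins, not the exact metrics or feature definitions used by the authors, and all function names and sample values are hypothetical.

```python
import math


def cooccurrence(count_values, enum_sizes):
    """Fraction of subjects with a counting fact that also have enumerated facts."""
    if not count_values:
        return 0.0
    both = sum(1 for subject in count_values if subject in enum_sizes)
    return both / len(count_values)


def pearson(count_values, enum_sizes):
    """Pearson correlation of stated counts and enumerated sizes on shared subjects."""
    shared = sorted(set(count_values) & set(enum_sizes))
    n = len(shared)
    if n < 2:
        return 0.0
    xs = [count_values[s] for s in shared]
    ys = [enum_sizes[s] for s in shared]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant values; report 0 here
    return cov / math.sqrt(var_x * var_y)


def ndcg_at_k(relevances, k):
    """Standard NDCG@k for a ranked list of graded relevance scores."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical per-subject values for one (counting, enumerating) predicate pair.
count_values = {"a": 3, "b": 1, "c": 2, "d": 4}   # e.g. numberOfChildren
enum_sizes = {"a": 2, "b": 1, "c": 2}             # e.g. number of parentOf objects

print("co-occurrence:", cooccurrence(count_values, enum_sizes))
print("correlation:", round(pearson(count_values, enum_sizes), 3))
print("NDCG@3:", round(ndcg_at_k([1, 0, 1], 3), 3))
```

A candidate pair with high co-occurrence and high correlation would rank near the top of the alignment list, and NDCG at positions 1 and 3 then measures how well such a ranking agrees with human relevance judgments.
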