Paper

GPTKB: Building Very Large Knowledge Bases from Language Models

MPS-Authors

Ghosh, Shrestha
Databases and Information Systems, MPI for Informatics, Max Planck Society

Nguyen, Tuan-Phong
Databases and Information Systems, MPI for Informatics, Max Planck Society

Fulltext (public)

arXiv:2411.04920.pdf
(Preprint), 365KB

Citation

Hu, Y., Ghosh, S., Nguyen, T.-P., & Razniewski, S. (2024). GPTKB: Building Very Large Knowledge Bases from Language Models. Retrieved from https://arxiv.org/abs/2411.04920.


Cite as: https://hdl.handle.net/21.11116/0000-0010-133A-8
Abstract
General-domain knowledge bases (KB), in particular the "big three" --
Wikidata, Yago and DBpedia -- are the backbone of many intelligent
applications. While these three have seen steady development, comprehensive KB
construction at large has seen few fresh attempts. In this work, we propose to
build a large general-domain KB entirely from a large language model (LLM). We
demonstrate the feasibility of large-scale KB construction from LLMs, while
highlighting specific challenges arising around entity recognition, entity and
property canonicalization, and taxonomy construction. As a prototype, we use
GPT-4o-mini to construct GPTKB, which contains 105 million triples for more
than 2.9 million entities, at a cost 100x less than previous KBC projects. Our
work is a landmark for two fields: For NLP, for the first time, it provides
constructive insights into the knowledge (or beliefs) of LLMs. For the
Semantic Web, it shows novel ways forward for the long-standing challenge of
general-domain KB construction. GPTKB is accessible at http://gptkb.org.