File(s) under permanent embargo
HET-KG: Communication-Efficient Knowledge Graph Embedding Training via Hotness-Aware Cache
conference contribution
posted on 2023-02-21, 03:06 authored by S Dong, X Miao, P Liu, X Wang, B Cui, Jianxin Li

With the popularization and application of Artificial Intelligence technology, knowledge graph embedding methods are widely used for a variety of machine learning tasks. However, most current knowledge graph embedding models have a large number of parameters and high computational time complexity, which is a major obstacle to applying them to large-scale knowledge graphs. To address this challenge, we propose HET-KG, a distributed system for training knowledge graph embeddings efficiently. HET-KG reduces communication overhead by introducing a cache embedding table structure that maintains hot-embeddings at each worker. To improve the effectiveness of the cache mechanism, we design a prefetching algorithm and a filtering algorithm for adaptively selecting hot-embeddings, and provide two kinds of hot-embedding table construction strategies. To address the inconsistency between the locally cached hot-embeddings and the global embeddings, we also develop a hot-embedding synchronization algorithm for dynamically updating the cache embedding table, which guarantees that the inconsistency stays bounded within a given threshold. Finally, extensive experiments are conducted on three knowledge graph datasets: FB15k, WN18, and Freebase-86m. The experimental results show that HET-KG achieves 3.7x and 1.1x speedup over the state-of-the-art systems PyTorch-BigGraph and DGL-KE, respectively.
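As a rough illustration of the idea of a per-worker hot-embedding cache with bounded inconsistency, the sketch below may help. It is not the HET-KG implementation: the names `select_hot_ids` and `HotEmbeddingCache`, the frequency-based selection rule, and the use of a local read counter as the staleness measure are all assumptions made for this example, and a plain dictionary stands in for the global embedding table.

```python
from collections import Counter


def select_hot_ids(access_log, capacity):
    """Pick the `capacity` most frequently accessed embedding ids.

    Assumption: "hotness" is approximated by raw access frequency.
    """
    counts = Counter(access_log)
    return [emb_id for emb_id, _ in counts.most_common(capacity)]


class HotEmbeddingCache:
    """Hypothetical worker-side cache over a global embedding table."""

    def __init__(self, global_table, hot_ids, max_staleness):
        self.global_table = global_table    # stands in for the remote table
        self.max_staleness = max_staleness  # assumed bound: reads per refresh
        self.cache = {i: list(global_table[i]) for i in hot_ids}
        self.staleness = {i: 0 for i in hot_ids}
        self.remote_fetches = 0             # counts simulated communication

    def lookup(self, emb_id):
        # Cold embeddings always go to the global table.
        if emb_id not in self.cache:
            self.remote_fetches += 1
            return list(self.global_table[emb_id])
        # Hot embeddings are refreshed once the staleness bound is reached.
        if self.staleness[emb_id] >= self.max_staleness:
            self.remote_fetches += 1
            self.cache[emb_id] = list(self.global_table[emb_id])
            self.staleness[emb_id] = 0
        self.staleness[emb_id] += 1
        return self.cache[emb_id]


if __name__ == "__main__":
    table = {0: [0.1, 0.2], 1: [0.3, 0.4], 2: [0.5, 0.6]}
    hot = select_hot_ids([0, 0, 0, 1, 1, 2], capacity=2)
    worker = HotEmbeddingCache(table, hot, max_staleness=2)
    worker.lookup(0)  # served locally
    worker.lookup(2)  # cold id, fetched from the global table
    print(hot, worker.remote_fetches)
```

In this toy version, a cached embedding is read locally at most `max_staleness` times before it is re-fetched, so the worker trades a bounded amount of staleness for fewer remote lookups.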
History
Volume: 2022-May
Pagination: 1754-1766
Location: ELECTR NETWORK
Publisher DOI:
Start date: 2022-05-09
End date: 2022-05-11
ISSN: 1084-4627
ISBN-13: 9781665408837
Language: English
Publication classification: E1 Full written paper - refereed
Title of proceedings: Proceedings - International Conference on Data Engineering
Event: 38th IEEE International Conference on Data Engineering (ICDE)
Publisher: IEEE COMPUTER SOC
Series: IEEE International Conference on Data Engineering