Fierens, Amaury [UCL]
Jodogne, Sébastien [UCL]
Many medical applications are envisioned for Large Language Models (LLMs), such as the automated summarization of a patient's health condition or the automated codification of electronic health records. Even though training LLMs directly inside hospitals is highly desirable to exploit local clinical data while avoiding data privacy concerns, this process requires a costly, complex computing infrastructure. This paper explores the recent Cramming approach as a cost-effective way to train LLMs within medical institutions in one day using one GPU. We show that the Cramming approach, originally designed for English, can be transposed to French, and that the resulting models can be successfully fine-tuned for healthcare-related tasks in the French language. This research opens the path to the creation of LLMs that are tailored to the specific needs of institutions that handle sensitive textual data in a language other than English.
Bibliographic reference
Fierens, Amaury ; Jodogne, Sébastien. BERTinchamps: Cost-Effective Training of Large Language Models for Medical Tasks in French. Workshop on Natural Language for Artificial Intelligence (NL4AI, 7th edition) (Rome, Italy, 06/11/2023 to 09/11/2023). In: CEUR Workshop Proceedings, Vol. 3551 (2023)
Permanent URL
http://hdl.handle.net/2078.1/279237