Persistent URL of this record https://hdl.handle.net/1887/77332
Documents
- Multimodal_Technologies_and_Interactions_3oe2019oe60
open access
Data-Driven Lexical Normalization for Medical Social Media
complementary knowledge source to scientific medical literature. The extraction of this knowledge is
complicated by colloquial language use and misspellings. However, lexical normalization of such
data has not been addressed effectively. This paper presents a data-driven lexical normalization
pipeline with a novel spelling correction module for medical social media. Our method significantly
outperforms state-of-the-art spelling correction methods and can detect mistakes with an F1 of 0.63
despite extreme imbalance in the data. We also present the first corpus for spelling mistake detection
and correction in a medical patient forum.
- All authors: Dirkson, A.R.; Verberne, S.; Sarker, A.; Kraaij, W.
- Date: 2019-08-20
- Volume: 3
- Issue: 3
- Pages: 60