Kobe University Library Digital Archive
https://hdl.handle.net/20.500.14094/0100488375
Access count for this item: 17 (tallied 2024-05-21 13:10)
Available files
File
0100488375 (fulltext)
Format
pdf
Size
1.19 MB
Views
18
Metadata
Metadata ID
0100488375
Access rights
open access
Publication type
Version of Record
Title
Dysarthric Speech Recognition Using Pseudo-Labeling, Self-Supervised Feature Learning, and a Joint Multi-Task Learning Approach
Authors
Takashima, Ryoichi ; Sawa, Yuya ; Aihara, Ryo ; Takiguchi, Tetsuya ; Imai, Yoshie
Author ID
A2510
Researcher ID
1000050846102
KUID
https://kuid-rm-web.ofc.kobe-u.ac.jp/search/detail?systemId=df7a61d0afafcfc6520e17560c007669
Author name
Takashima, Ryoichi
髙島, 遼一
タカシマ, リョウイチ
Affiliation
Research Center for Urban Safety and Security
Author name
Sawa, Yuya
Author name
Aihara, Ryo
Author ID
A1279
Researcher ID
1000040397815
ORCID
0000-0001-5005-7679
KUID
https://kuid-rm-web.ofc.kobe-u.ac.jp/search/detail?systemId=b3ec2a1710d8267b520e17560c007669
Author name
Takiguchi, Tetsuya
滝口, 哲也
タキグチ, テツヤ
Affiliation
Research Center for Urban Safety and Security
Author name
Imai, Yoshie
Source title
IEEE Access
Volume (issue)
12
Pages
36990-36999
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Publication date
2024-03-07
Release date
2024-04-02
Abstract
In this paper, we investigate the use of the spontaneous speech of dysarthric people for training an automatic speech recognition (ASR) model for them. Although the spontaneous speech of dysarthric people can be collected relatively easily compared to script-reading speech, which is obtained by having them read a prepared script, labeling the spontaneous speech of dysarthric people is very difficult and costly. For training an ASR model using unlabeled speech data, pseudo-labeling and self-supervised feature learning have been studied as effective approaches; however, the effectiveness of these approaches has not been clear when they are applied to the unlabeled dysarthric speech. In addition, pseudo-labeling may not be effective since the pseudo-labels of dysarthric speech include many errors and are not reliable. In this paper, we evaluate the above two approaches for the dysarthric speech recognition, and we propose a multi-task learning approach, which combines these approaches to train an ASR model that is robust against the errors in the pseudo-labels. Experimental results using Japanese and English datasets demonstrated that all approaches are effective, but among them, the proposed multi-task learning approach showed the best performance.
Keywords
Speech recognition
dysarthria
pseudo-labeling
self-supervised feature learning
Category
Research Center for Urban Safety and Security
Journal article
Rights
© 2024 The Authors.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
Resource type
journal article
Language
English
eISSN
2169-3536
Related information
DOI
https://doi.org/10.1109/ACCESS.2024.3374874