Character Eyes: Seeing Language through Character-Level Taggers

Marone, Marc

Files

MARONE-UNDERGRADUATERESEARCHOPTIONTHESIS-2018.pdf (260.5 KB)

Author(s)

Marone, Marc

Advisor(s)

Eisenstein, Jacob

Associated Organization(s)

Organizational Unit

College of Computing

Organizational Unit

School of Computer Science

Organizational Unit

Undergraduate Research Opportunities Program

Abstract

Character-level models have been used extensively in recent years in NLP tasks as both supplements to and replacements for closed-vocabulary token-level word representations. In one popular architecture, character-level RNNs, typically LSTMs, form a bottom tier creating a word representation for a sequence tagger used to predict token-level annotations such as part-of-speech (POS) tags. In this work, we examine the behavior of POS taggers from the perspective of individual hidden units within the character-level LSTM. Analysis of activation patterns on a macro scale allows us to identify the ways in which the burden of POS detection is spread across the hidden layer in different languages, as a function of their morphological properties. Using ablation tests, we show how different allocations of forward and backward units affect model arrangement and performance in different categories of languages. We use these results to offer heuristics for hyperparameter selection that are based on known linguistic traits.
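As a rough illustration of the setup the abstract describes, the sketch below builds a word representation from a forward and a backward character-level recurrence with different numbers of hidden units, then ablates selected units by zeroing them. It is not the thesis implementation: a plain tanh RNN stands in for the LSTM, the weights are random rather than trained, zeroing units in the final representation is only one simple way to ablate, and every name (`make_rnn`, `run_rnn`, `word_representation`, the 3/5 unit split) is hypothetical.

```python
import math
import random


def make_rnn(n_units, n_chars, rng):
    """Random weights for a tiny tanh RNN (an assumed stand-in for an LSTM)."""
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_chars)]
            for _ in range(n_units)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(n_units)]
             for _ in range(n_units)]
    return w_in, w_rec


def run_rnn(params, char_ids):
    """Return the final hidden state after reading char_ids in order."""
    w_in, w_rec = params
    n = len(w_in)
    h = [0.0] * n
    for c in char_ids:
        # One-hot character input reduces to picking column c of w_in.
        h = [math.tanh(w_in[i][c] + sum(w_rec[i][j] * h[j] for j in range(n)))
             for i in range(n)]
    return h


def word_representation(word, alphabet, fwd, bwd, ablate=()):
    """Concatenate forward and backward final states; zero ablated units."""
    ids = [alphabet.index(ch) for ch in word]
    rep = run_rnn(fwd, ids) + run_rnn(bwd, ids[::-1])
    return [0.0 if i in ablate else v for i, v in enumerate(rep)]


rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz"

# Asymmetric allocation (hypothetical): 3 forward units, 5 backward units.
fwd = make_rnn(3, len(alphabet), rng)
bwd = make_rnn(5, len(alphabet), rng)

rep = word_representation("walked", alphabet, fwd, bwd)
# Ablate forward unit 0 and backward unit 4 (index 3 + 4 = 7).
ablated = word_representation("walked", alphabet, fwd, bwd, ablate={0, 7})
```

In a full tagger, `rep` would feed a token-level sequence model that predicts POS tags, and comparing tagging accuracy with and without the ablated units would indicate how much each unit contributes. Changing the 3/5 split while keeping the total fixed mirrors, in miniature, the forward/backward allocation question the abstract raises.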

Date Issued

2018-12

Resource Type

Text

Resource Subtype

Undergraduate Thesis

Georgia Tech Library
