Files in this item

Name: paisa.raw.utf8.gz
Size: 521.58 MB
Format: application/gzip
Description: raw cleaned web texts
MD5: d7804d4d9af31ddaec5bfa7409926f2e

Name: paisa.annotated.CoNLL.utf8.gz
Size: 1.84 GB
Format: application/gzip
Description: cleaned and linguistically annotated web texts in CoNLL format
MD5: 9d49fd1e86c9e6de3a6cb67a6c10a2f2

Name: lemma-WITHOUTnumberssymbols-frequencies-paisa.txt.gz
Size: 6.94 MB
Format: application/gzip
Description: lemma frequencies (only lemmas composed of letters and the following three symbols: . - ')
MD5: 6d3959478ad4c5fecfe9c9cc305c68af

Name: lemma-frequencies-paisa.txt.gz
Size: 9.53 MB
Format: application/gzip
Description: lemma frequencies
MD5: ea27fe186efc59410d5ea39c4130315b
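
Since each file is listed with an MD5 checksum, a download can be checked against it before use. The following is a minimal sketch, assuming the files were saved under their listed names in a single directory; the helper names `md5_of_file` and `verify_downloads` are illustrative and not part of the item. Files are read in chunks so the 1.84 GB archive does not need to fit in memory.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing for this item.
EXPECTED_MD5 = {
    "paisa.raw.utf8.gz": "d7804d4d9af31ddaec5bfa7409926f2e",
    "paisa.annotated.CoNLL.utf8.gz": "9d49fd1e86c9e6de3a6cb67a6c10a2f2",
    "lemma-WITHOUTnumberssymbols-frequencies-paisa.txt.gz": "6d3959478ad4c5fecfe9c9cc305c68af",
    "lemma-frequencies-paisa.txt.gz": "ea27fe186efc59410d5ea39c4130315b",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True (match), False (mismatch),
    or None (file not present in the directory)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = (md5_of_file(path) == expected)
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, status in verify_downloads().items():
        print(f"{name}: {labels[status]}")
```

Run from the download directory, the script prints one status line per file. A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again.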