Cross level semantic similarity: an evaluation framework for universal measures of similarity

JURGENS, DAVID ALAN; PILEHVAR, MOHAMMED TAHER; NAVIGLI, ROBERTO
2016

Abstract

Semantic similarity has typically been measured between items of approximately the same size. As a result, similarity measures have largely ignored the fact that different types of linguistic item can potentially have similar or even identical meanings, and are therefore designed to compare only one type of linguistic item. Furthermore, nearly all current similarity benchmarks within NLP contain pairs of approximately the same size, such as word or sentence pairs, preventing the evaluation of methods that are capable of comparing items of different sizes. To address this, we introduce a new semantic evaluation called cross-level semantic similarity (CLSS), which measures the degree to which the meaning of a larger linguistic item, such as a paragraph, is captured by a smaller item, such as a sentence. Our pilot CLSS task was presented as part of SemEval-2014, where it attracted 19 teams who submitted 38 systems. The CLSS data contains a rich mixture of pairs, spanning from paragraphs to word senses, in order to fully evaluate similarity measures that are capable of comparing items of any type. The data was also drawn from diverse corpora beyond just newswire, including domain-specific texts and social media. We describe the annotation process and its challenges, including a comparison with crowdsourcing, and identify the factors that make the dataset a rigorous assessment of a method's quality. We further examine in detail the systems that participated in the SemEval task, identifying the common factors associated with high performance and the aspects that proved difficult for all systems. Our findings demonstrate that CLSS poses a significant challenge for similarity methods and point to clear directions for future work on universal similarity methods capable of comparing any pair of items.
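For readers unfamiliar with the task setup, the sketch below illustrates the kind of comparison CLSS requires: scoring how well a short item captures the meaning of a longer one. It is a toy lexical-overlap baseline only, not one of the participating systems or the paper's method; the function name, example texts, and bag-of-words cosine approach are all illustrative assumptions, and real systems in the task used far richer representations.

from collections import Counter
from math import sqrt

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts of any length.

    Illustrative only: a naive baseline that works regardless of whether
    the inputs are paragraphs, sentences, phrases, or single words.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

# A cross-level pair: a paragraph compared against a single sentence.
paragraph = ("The committee reviewed the proposal for several weeks, "
             "ultimately rejecting it over concerns about its cost.")
sentence = "The proposal was rejected because it was too expensive."
print(f"similarity: {cosine_similarity(paragraph, sentence):.3f}")

Note that a purely lexical measure like this one struggles exactly where CLSS is hardest: the sentence above paraphrases the paragraph with little word overlap, so the score understates the true semantic similarity.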
evaluation; semantic textual similarity; similarity
01 Journal publication::01a Journal article
Cross level semantic similarity: an evaluation framework for universal measures of similarity / JURGENS, DAVID ALAN; PILEHVAR, MOHAMMED TAHER; NAVIGLI, ROBERTO. - In: LANGUAGE RESOURCES AND EVALUATION. - ISSN 1574-020X. - PRINT. - 50:1(2016), pp. 5-33. [10.1007/s10579-015-9318-3]
Files attached to this item

File: Jurgens_Cross-level_2016.pdf (archive administrators only)
Type: Publisher's version (the published version with the publisher's layout)
License: All rights reserved
Size: 675.35 kB
Format: Adobe PDF
Access: contact the author

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/845288
Citations
  • PMC: N/A
  • Scopus: 9
  • Web of Science (ISI): 8