A Fast Algorithm of Computing Word Similarity

Author

Xingyuan Chen ; Xia Yang ; Bingjun Su

Author_Institution

Sch. of Comput. Sci., Leshan Normal Univ., Leshan, China

fYear

2013

fDate

14-15 Dec. 2013

Firstpage

405

Lastpage

408

Abstract

Computing distributional similarity is an effective strategy for finding synonyms. The naive nearest-neighbor approach to computing distributional word similarity has time complexity O(n*n*m), which makes it inefficient for accurately representing synonymy over a large corpus. We find a property of parse triples: the growth rate of the average number of triples per word levels off as the corpus size increases. Using this property, we design a fast algorithm for computing word similarity whose time complexity is O(n*n). We demonstrate the efficiency of this algorithm on the English Gigaword corpus.
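
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that n is the vocabulary size and m the number of distinct context features, that each parse triple has the form (word, relation, other_word), and that similarity is measured by cosine; none of these details are confirmed by the record. All function names and the toy data are hypothetical. The point it illustrates is that if each word is stored as a sparse vector holding only its own triples, one pairwise comparison costs time proportional to the number of triples of the smaller word rather than m, so a bounded average triple count gives roughly O(n*n) work overall.

```python
import math
from collections import defaultdict


def build_vectors(triples):
    """Map each word to a sparse vector {(relation, other_word): count}."""
    vectors = defaultdict(lambda: defaultdict(int))
    for word, relation, other in triples:
        vectors[word][(relation, other)] += 1
    return {word: dict(feats) for word, feats in vectors.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    # Iterate over the smaller vector so the dot product costs
    # O(min(len(u), len(v))) instead of O(m).
    if len(u) > len(v):
        u, v = v, u
    dot = sum(weight * v.get(feat, 0) for feat, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(vectors, target, top_k=3):
    """Return up to top_k (word, score) pairs most similar to target."""
    scores = [
        (cosine(vectors[target], vec), word)
        for word, vec in vectors.items()
        if word != target
    ]
    scores.sort(reverse=True)
    return [(word, score) for score, word in scores[:top_k]]


# Hypothetical toy triples, invented purely for illustration.
TOY_TRIPLES = [
    ("car", "obj-of", "drive"),
    ("car", "mod", "fast"),
    ("automobile", "obj-of", "drive"),
    ("automobile", "mod", "fast"),
    ("banana", "obj-of", "eat"),
    ("banana", "mod", "ripe"),
]

if __name__ == "__main__":
    vectors = build_vectors(TOY_TRIPLES)
    print(most_similar(vectors, "car", top_k=2))
```

Raw counts are used as feature weights only to keep the sketch short; distributional similarity work often uses an association weight such as pointwise mutual information instead, and the record does not say which weighting the paper uses.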

Keywords

computational complexity; natural language processing; English Gigaword corpus; distributional word similarity computation; naive nearest-neighbor approach; parse triple property; synonym finding; synonymy representation; time complexity; Algorithm design and analysis; Computer science; Context; Educational institutions; Manuals; Time complexity; Vocabulary; computing complexity; distributional word similarity; triples

fLanguage

English

Publisher

IEEE

Conference_Titel

Computational Intelligence and Security (CIS), 2013 9th International Conference on

Conference_Location

Leshan

Print_ISBN

978-1-4799-2548-3

Type

conf

DOI

10.1109/CIS.2013.92

Filename

6746428