Title :
A comparison between two literary printed Romanian corpora based on the statistical letter structure with orthography and punctuation marks
Author :
Ciuca, Stefan ; Vlad, Adriana ; Mitrea, Adrian
Author_Institution :
Fac. of Electron., Telecommun. & Inf. Technol., Politeh. Univ. of Bucharest, Bucharest, Romania
Abstract :
This paper compares a new literary corpus with an existing corpus of printed Romanian, with the aim of reinforcing the evidence for the stationarity property of the language. Through the statistical approach applied, the paper also presents a way of extending a linguistic corpus for mathematical purposes.
Keywords :
computational linguistics; document handling; statistical analysis; literary linguistics corpus; literary printed Romanian corpora; mathematical purpose; orthography; punctuation marks; statistical approach; statistical letter structure; Artificial intelligence; Biographies; Books; Estimation theory; Frequency; Information technology; Probability; Protection; Sampling methods; Testing; literary corpus linguistics; mathematical comparison between corpora; natural language stationarity; orthography and punctuation marks; statistical error control
Conference_Title :
Communications (COMM), 2010 8th International Conference on
Conference_Location :
Bucharest
Print_ISBN :
978-1-4244-6360-2
DOI :
10.1109/ICCOMM.2010.5509040