Title :
A study on the common words found in different literary Romanian corpora
Author :
Mitrea, Adrian ; Vlad, Adriana ; Hodea, Octavian ; Dragomir, Roxana
Author_Institution :
Fac. of Electron., Telecommun. & Inf. Technol., Politeh. Univ. of Bucharest, Bucharest, Romania
Abstract :
The experimental analysis focused on a corpus of 107 literary works (novels and short stories) comprising about 12.7 million words, obtained by combining two different collections of writings of similar length. The main objective was to identify the common words, that is, the words occurring in every book, at three levels: the corpus as a whole, each sub-corpus organized by author, and each of the two sub-collections of texts.
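The notion of a "common word" described in the abstract can be illustrated as a set intersection over per-book vocabularies. The sketch below is only a minimal illustration under stated assumptions, not the authors' procedure: the tokenizer (lowercasing, letters only, so hyphenated forms are split), the function names, the group labels, and the toy sentences are all hypothetical, and the paper may define words and normalization differently.

```python
import re
import unicodedata
from collections import defaultdict


def tokenize(text):
    """Return the set of distinct lowercase word forms in a text.

    Assumption: a word is a maximal run of Unicode letters, so a hyphenated
    form is split into its parts. The paper may tokenize differently.
    """
    normalized = unicodedata.normalize("NFC", text).lower()
    return set(re.findall(r"[^\W\d_]+", normalized))


def common_words(texts):
    """Return the words that occur in every text of the iterable."""
    vocabularies = [tokenize(t) for t in texts]
    if not vocabularies:
        return set()
    return set.intersection(*vocabularies)


def common_words_per_group(books):
    """Group (label, text) pairs by label and intersect within each group."""
    groups = defaultdict(list)
    for label, text in books:
        groups[label].append(text)
    return {label: common_words(texts) for label, texts in groups.items()}


# Hypothetical toy data: (author label, text) pairs standing in for books.
books = [
    ("author_a", "Și el a venit în sat."),
    ("author_a", "El a plecat și nu a mai venit."),
    ("author_b", "Ea a rămas în sat și a scris."),
]

whole_corpus = common_words(text for _, text in books)
per_author = common_words_per_group(books)
```

Here `whole_corpus` holds the words shared by all three toy texts, and `per_author` maps each label to the words shared by that author's texts. The same grouping function could be applied with sub-collection labels instead of author labels.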
Keywords :
literature; natural languages; text analysis; common words; literary Romanian corpora; natural language; Artificial intelligence; Educational institutions; Mathematics; Pragmatics; Vocabulary; Writing; common words found in different corpora; mathematics of natural language
Conference_Title :
2014 10th International Conference on Communications (COMM)
Conference_Location :
Bucharest
DOI :
10.1109/ICComm.2014.6866729