Regional Information Center for Science and Technology - Lexical criminal identification for chatting corpus

DocumentCode :

501554

Title :

Lexical criminal identification for chatting corpus

Author :

Marjuni, Siti Hanom ; Mahmod, Ramlan ; Ghani, Abd Azim Abd ; Zain, Abdullah Bin Mohd ; Mustapha, Aida

Author_Institution :

Fac. of Comput. Sci. & Inf. Tech, Univ. Putra Malaysia, Serdang, Malaysia

fYear :

2009

fDate :

8-11 Aug. 2009

Firstpage :

360

Lastpage :

364

Abstract :

This paper aims to identify the lexical items that carry criminal elements in a chatting corpus of suspect and victim conversation utterances. Lexical criminal identification requires three processes. The first is tokenization, which automatically assigns each lexical item a corresponding serial number in every suspect and victim utterance. The second is tagging the lexical items with parts of speech to identify the verbs and nouns in the utterances. The third is identifying and analyzing the interrogative criminal construct to obtain the criminal evidence. The chatting corpus consists of 3,067 suspect and victim utterances with 16,278 words, collected from 9 criminal chatting cases. The results indicate that verbs and nouns are the most important parts of speech for representing the criminal constructs in chat utterances.
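
The first two processes in the abstract (serial-numbered tokenization, then part-of-speech tagging to pick out verbs and nouns) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the paper: the `TOY_POS_LEXICON` lookup table stands in for a real tagger, the sample utterance is invented, and serial numbers are assumed to restart at 1 in each utterance. The third process, analysis of the interrogative criminal construct, is not attempted.

```python
import re
from typing import List, NamedTuple

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
# The words and tags are illustrative only, not taken from the paper.
TOY_POS_LEXICON = {
    "send": "VERB", "meet": "VERB", "give": "VERB",
    "money": "NOUN", "photo": "NOUN", "password": "NOUN",
    "me": "PRON", "your": "PRON", "the": "DET",
}


class Token(NamedTuple):
    serial: int  # position of the token within its utterance, starting at 1
    text: str    # lower-cased token text
    pos: str     # part-of-speech tag, or "UNK" if not in the toy lexicon


def tokenize(utterance: str) -> List[str]:
    """Split an utterance into lower-cased word tokens."""
    return re.findall(r"[a-z0-9']+", utterance.lower())


def tag_utterance(utterance: str) -> List[Token]:
    """Assign each token a serial number and a part-of-speech tag."""
    return [
        Token(serial, word, TOY_POS_LEXICON.get(word, "UNK"))
        for serial, word in enumerate(tokenize(utterance), start=1)
    ]


def verbs_and_nouns(tokens: List[Token]) -> List[Token]:
    """Keep only the tokens tagged as verbs or nouns."""
    return [t for t in tokens if t.pos in ("VERB", "NOUN")]


# Invented example utterance, not drawn from the corpus.
tokens = tag_utterance("Send me your password")
candidates = verbs_and_nouns(tokens)
```

A dictionary lookup keeps the sketch dependency-free; in practice a trained tagger would replace `TOY_POS_LEXICON`, since chat text contains many words no fixed list would cover.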

Keywords :

natural language processing; police data processing; speech processing; chatting corpus; lexical criminal identification; nouns; suspect utterances; tokenization; verbs; victim conversation utterance; victim utterance; victim utterances; words; Application software; Communication system security; Computer crime; Computer science; Computer security; Context; Information systems; Speech; Tagging; Vocabulary; Chatting; Criminal Construct; Criminal Evidence; Lexicon; Part of Speech

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Computer Science and Information Technology, 2009. ICCSIT 2009. 2nd IEEE International Conference on

Conference_Location :

Beijing

Print_ISBN :

978-1-4244-4519-6

Electronic_ISBN :

978-1-4244-4520-2

Type :

conf

DOI :

10.1109/ICCSIT.2009.5234700

Filename :

5234700

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=501554