Title :
Textual risk mining for maritime situational awareness
Author :
Razavi, Amir H. ; Inkpen, Diana ; Falcon, Rafael ; Abielmona, Rami
Author_Institution :
Electr. Eng. & Comput. Sci., Univ. of Ottawa, Ottawa, ON, Canada
Abstract :
In this paper, we propose an auxiliary Machine Learning (ML) and Natural Language Processing (NLP) integrated system for maritime situational awareness (MSA) operations. We incorporate a new and influential asset - human intuition and perception - into the existing semi-automated decision support systems, which mostly rely on numerical data collected by electronic sensors or cameras located either directly on the vessels or in the maritime command-and-control centers. For our project, we gathered weekly textual reports spanning twelve months from the United States Worldwide Threats to Shipping Reports repository, which belongs to the National Geospatial-Intelligence Agency (NGA). We considered the maritime incident reports written by human operators as a valuable and accessible unstructured textual input source, in which a span of text is called a “risk” if it expresses one of the following kinds of vessel incidents: fired, robbed, boarded, hijacked, attacked, chased, approached, kidnapped, boarding attempted, suspiciously approached or clashed with. Our approach benefits from the probability distributions of several useful features, annotated using a list of lexicons that contain expressions denoting vessel types, risk types, risk associates, maritime geographical locations, dates and times. These distributions are captured and used to anchor the spans of “risks” as they are described in the textual reports. After preprocessing steps that include tokenization, named entity extraction and part-of-speech tagging, the textual risk mining system applies a variety of sequence classification algorithms, e.g., Conditional Random Fields, Conditional Markov Models and Hidden Markov Models, in order to compare their risk classification performance. Empirical results show that our NLP/ML-based system can extract variable-length risk spans from the textual reports with about 90% correctness.
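As a rough illustration of the lexicon-based feature annotation and variable-length span extraction described in the abstract, the following Python sketch may help. It is not the authors' implementation: the toy lexicons, the function names (tokenize, lexicon_features, bio_to_spans) and the B-RISK / I-RISK / O labelling scheme are all assumptions made for this example, and the sequence classifier itself (CRF, Conditional Markov Model or HMM) is left out; the labels are supplied by hand where a trained model would normally produce them.

```python
import re

# Hypothetical toy lexicons; the paper's actual lexicons are not listed in the abstract.
LEXICONS = {
    "VESSEL": {"tanker", "vessel", "carrier", "tug"},
    "RISK": {"fired", "robbed", "boarded", "hijacked", "attacked",
             "chased", "approached", "kidnapped"},
}


def tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def lexicon_features(tokens):
    """Return one feature dict per token, flagging membership in each lexicon."""
    features = []
    for token in tokens:
        lower = token.lower()
        feats = {"lower": lower}
        for name, words in LEXICONS.items():
            feats["in_" + name] = lower in words
        features.append(feats)
    return features


def bio_to_spans(tokens, labels):
    """Collect variable-length risk spans from B-RISK / I-RISK / O labels
    (an assumed labelling scheme, not one confirmed by the abstract)."""
    spans, current = [], []
    for token, label in zip(tokens, labels):
        if label == "B-RISK":
            if current:                # close a span that was still open
                spans.append(" ".join(current))
            current = [token]
        elif label == "I-RISK" and current:
            current.append(token)
        else:                          # "O" or a stray I-RISK ends any open span
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


tokens = tokenize("Pirates boarded a tanker near Lagos.")
features = lexicon_features(tokens)
# Labels written by hand here; a trained sequence classifier would predict them.
labels = ["O", "B-RISK", "I-RISK", "I-RISK", "O", "O", "O"]
print(bio_to_spans(tokens, labels))
```

In a full pipeline, the per-token feature dicts would be passed to a sequence classifier, and the predicted label sequence would be decoded into risk spans as in bio_to_spans.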
Keywords :
command and control systems; decision support systems; hidden Markov models; learning (artificial intelligence); marine engineering; marine safety; natural language processing; risk management; statistical distributions; MSA operation; NGA; NLP/ML-based system; National Geospatial-Intelligence Agency; United States worldwide threats to shipping reports repository; approached vessel incident; attacked vessel incident; auxiliary ML; auxiliary machine learning; boarded vessel incident; boarding attempted vessel incident; cameras; chased vessel incident; conditional Markov models; conditional random fields; electronic sensors; entity extraction; feature annotation; fired vessel incident; hidden Markov models; hijacked vessel incident; human intuition; human perception; kidnapped vessel incident; maritime command-and-control centers; maritime geographical locations; maritime incident reports; maritime situational awareness operation; natural language processing; part-of-speech tagging; probability distributions; risk associates; risk classification performance; risks types; robbed vessel incident; semiautomated decision support systems; sequence classification algorithms; textual risk mining system; tokenization; variable-length risk span extraction; vessel types; vessels; weekly textual reports; Classification algorithms; Data mining; Feature extraction; Hidden Markov models; Markov processes; Risk management; machine learning; maritime domain awareness; maritime situational awareness; natural language processing; risk detection; sequence-based classifiers; text analysis;
Conference_Title :
Cognitive Methods in Situation Awareness and Decision Support (CogSIMA), 2014 IEEE International Inter-Disciplinary Conference on
Conference_Location :
San Antonio, TX
Print_ISBN :
978-1-4799-3563-5
DOI :
10.1109/CogSIMA.2014.6816558