Regional Information Center for Science and Technology (RICeST) - Extracting statistical data from free-form text

DocumentCode :

1291658

Title :

Extracting statistical data from free-form text

Author :

Hill, L. Owen ; Zein, David A.

Author_Institution :

IBM Corp., East Fishkill, NY, USA

Volume :

Issue :

fYear :

1986

fDate :

5/1/1986 12:00:00 AM

Firstpage :

Lastpage :

Abstract :

The authors describe a method for processing free-form text files. The method consists of separating the text into four physically and logically identifiable regions, which are then postprocessed to update three history files that contain information about manufactured products over a period of time. The technique falls under the general category of data segregation and character recognition. It uses logical and mathematical operations to recognize region boundaries and types of data fields and to establish uniqueness in name recognition. Hashing methods, combined with logical matrix multiplication, are used to update the history files. Sparse formats are used to store multiple large arrays on disk, reducing storage requirements by more than a factor of two. The techniques are implemented in an automated system using multiprogramming environments.
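
The abstract names sparse storage and logical matrix multiplication without giving details. The following Python sketch is only an illustration of those two general ideas, not the authors' implementation: a mostly-zero logical array is stored as a mapping from each nonempty row to the set of its nonzero column indices, and a logical (Boolean) matrix product is computed directly on that form. The function names `to_sparse` and `bool_matmul` and the sample arrays are hypothetical.

```python
def to_sparse(dense):
    """Keep only the nonzero column indices of each nonempty row.

    Hypothetical helper: returns {row_index: {column indices with a truthy value}}.
    Rows that are entirely zero are not stored at all.
    """
    return {
        i: {j for j, v in enumerate(row) if v}
        for i, row in enumerate(dense)
        if any(row)
    }


def bool_matmul(a, b):
    """Logical matrix product on the sparse form.

    c[i] contains j when a[i][k] and b[k][j] are both true for some k.
    """
    c = {}
    for i, ks in a.items():
        cols = set()
        for k in ks:
            # Rows of b that are all zero are absent, so fall back to an empty set.
            cols |= b.get(k, set())
        if cols:
            c[i] = cols
    return c


# Illustrative data only; the record does not describe the real arrays.
a = to_sparse([[1, 0, 0],
               [0, 0, 1]])
b = to_sparse([[0, 1],
               [0, 0],
               [1, 1]])
product = bool_matmul(a, b)
```

Python dictionaries and sets are hash tables, so the row and column lookups here are hash-based, which loosely echoes the hashing mentioned in the abstract. The storage saving from a layout like this depends entirely on how sparse the data is.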

Keywords :

data handling; manufacturing data processing; statistics; word processing; character recognition; data extraction; data fields; data segregation; free-form text; hashing; history files; logical operations; manufactured products; mathematical operations; matrix multiplication; multiple large arrays; multiprogramming environments; name recognition; region boundaries; sparse formats; statistical data; storage requirements; uniqueness; Arrays; Data mining; Graphics; History; Logic arrays; Matrix converters; Vectors;

fLanguage :

English

Journal_Title :

Circuits and Devices Magazine, IEEE

Publisher :

IEEE

ISSN :

8755-3996

Type :

jour

DOI :

10.1109/MCD.1986.6311822

Filename :

6311822

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1291658