• DocumentCode
    476080
  • Title
    A manual experiment on commonsense knowledge acquisition from web corpora
  • Author
    Yao Zhui ; Liang-Jun Zang ; Dong-Sheng Wang ; Cun-Gen Cao
  • Author_Institution
    Key Lab. of Intell. Inf. Process., Chinese Acad. of Sci., Beijing
  • Volume
    3
  • fYear
    2008
  • fDate
    12-15 July 2008
  • Firstpage
    1564
  • Lastpage
    1569
  • Abstract
    Acquiring commonsense knowledge from text is an important but challenging problem. In this paper, we describe a three-subject experiment on commonsense knowledge acquisition from Chinese sentences extracted from a web corpus, aiming to investigate how people acquire commonsensical assertions from given sentences. We analyze the experimental results using agreement, concordance, and divergence tests. An important conclusion of our experiment is that sentences differ in their suitability, i.e. difficulty grade, for commonsense knowledge acquisition. This difficulty grade also affects the number of commonsensical assertions acquired from a sentence, as well as the differences in acquisition performance among human subjects. We also discuss the problem of characterizing the difficulty grade by the co-occurrence frequency of words and by basic level category words.
  • Keywords
    Internet; knowledge acquisition; text analysis; Chinese sentences; Web corpora; Web corpus; commonsense knowledge acquisition; commonsensical assertions; Art; Cybernetics; Frequency; Humans; Knowledge acquisition; Large-scale systems; Machine learning; Manuals; Neck; Testing; Agreement Test; Basic Level Category; Co-occurrence Frequency; Commonsense Knowledge Acquisition; Concordance Test; Divergence Test; Manual Experiment; Web Corpora
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2008 International Conference on
  • Conference_Location
    Kunming
  • Print_ISBN
    978-1-4244-2095-7
  • Electronic_ISBN
    978-1-4244-2096-4
  • Type
    conf
  • DOI
    10.1109/ICMLC.2008.4620655
  • Filename
    4620655