• DocumentCode
    1875735
  • Title
    Text mining of English guidebooks for Hokuriku region in Japan
  • Author
    Ban, Hiromi; Oyabu, Takashi
  • Author_Institution
    Eng. Dept., Fukui Univ. of Technol., Fukui, Japan
  • fYear
    2012
  • fDate
    6-8 Sept. 2012
  • Firstpage
    326
  • Lastpage
    331
  • Abstract
    Ishikawa Prefecture is located in the Hokuriku region of Japan. One challenge for tourism in Ishikawa is increasing the number of tourists from foreign countries. Addressing this challenge requires providing foreign tourists with “language service.” In this study, to understand the state of language service for foreign tourists, we investigated the linguistic characteristics of English guidebooks for Kanazawa, the capital city of Ishikawa, and Toyama, which is also in Hokuriku, and compared them with the official guidebooks for Tokyo, Fuji, Kyoto, and Hida. Specifically, the frequency characteristics of character and word appearance were investigated using a program written in C++. These characteristics were approximated by an exponential function. Furthermore, we calculated the percentage of Japanese junior high school required vocabulary and American basic vocabulary to obtain the difficulty level, as well as the K-characteristic, of each material. The results clearly show that the English guidebooks for Hokuriku resemble literary writings in their character-appearance characteristics. In addition, their K-characteristic values are high, and their difficulty level, especially for Kanazawa, is low.
  • Keywords
    computational linguistics; data mining; text analysis; American basic vocabulary; C++; English guidebooks; Hokuriku region; Ishikawa Prefecture; Japanese junior high school; character appearance; exponential function; foreign tourist; frequency characteristics; language service; linguistic characteristics; literary writings; text mining; Cities and towns; Educational institutions; Materials; Organizations; Pragmatics; Vocabulary; Writing; metrical linguistics; statistical analysis; tourism
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems (IS), 2012 6th IEEE International Conference
  • Conference_Location
    Sofia
  • Print_ISBN
    978-1-4673-2276-8
  • Type
    conf
  • DOI
    10.1109/IS.2012.6335237
  • Filename
    6335237
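
The measurements named in the abstract (ranked character and word frequencies, an exponential approximation, a K-characteristic, and the share of basic vocabulary) might be sketched as follows. This is an illustration only, not the authors' program: the original was written in C++, while Python is used here for brevity. All function names are hypothetical. The sketch assumes that the approximation has the form y = c * exp(-b * x) over frequency rank x and is fitted by least squares on ln(y), that the K-characteristic is Yule's K, and that vocabulary coverage is counted per token; the record itself confirms none of these details.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def ranked_frequencies(items):
    """Relative frequencies in percent, ordered from most to least common."""
    counts = Counter(items)
    total = sum(counts.values())
    return [100.0 * n / total for _, n in counts.most_common()]


def fit_exponential(freqs):
    """Fit y = c * exp(-b * x) to rank/frequency data; returns (c, b).

    x is the 1-based rank. The fit is an ordinary least-squares line
    through (x, ln y), so it needs at least two strictly positive values.
    """
    n = len(freqs)
    xs = list(range(1, n + 1))
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def yule_k(words):
    """Yule's K = 10^4 * (sum_m m^2 * V_m - N) / N^2.

    N is the token count and V_m the number of word types occurring m
    times; sum_m m^2 * V_m equals the sum of squared per-type counts.
    """
    counts = Counter(words)
    n_tokens = sum(counts.values())
    sum_sq = sum(c * c for c in counts.values())
    return 1e4 * (sum_sq - n_tokens) / (n_tokens ** 2)


def vocabulary_coverage(words, basic_vocabulary):
    """Percentage of tokens that appear in a given basic word list."""
    hits = sum(1 for w in words if w in basic_vocabulary)
    return 100.0 * hits / len(words)


if __name__ == "__main__":
    # Made-up sample text and word list, for illustration only.
    sample = "Kanazawa is a castle town. The castle and the garden are in the city."
    words = tokenize(sample)
    letters = [ch for ch in sample.lower() if ch.isalpha()]
    c, b = fit_exponential(ranked_frequencies(letters))
    basic = {"is", "a", "the", "and", "are", "in"}
    print(f"c={c:.3f} b={b:.3f} K={yule_k(words):.1f}")
    print(f"coverage={vocabulary_coverage(words, basic):.1f}%")
```

A higher K indicates that a text repeats a smaller set of words more often, and a higher coverage percentage indicates easier vocabulary under the chosen word list. The actual word lists (Japanese junior high school required vocabulary, American basic vocabulary) are not given in this record and would have to be supplied separately.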