• DocumentCode
    2406585
  • Title

    A simple architecture for the fine-grained documentation of endangered languages: The LACITO multimedia archive

  • Author

    Michailovsky, Boyd ; Michaud, Alexis ; Guillaume, Séverine

  • Author_Institution
    LACITO, France
  • fYear
    2011
  • fDate
    26-28 Oct. 2011
  • Firstpage
    14
  • Lastpage
    23
  • Abstract
    The LACITO multimedia archive [1] provides free access to documents of connected, spontaneous speech, mostly in “rare” or endangered languages, recorded in their cultural context and transcribed in consultation with native speakers. Its goal is to contribute to the documentation and study of a precious human heritage: the world's languages. It has a special strength in languages of Asia and the Pacific. The LACITO archive was built with little personnel and less funding. It has been devised, developed and maintained over two decades by two researchers assisted by one engineer. Its simple architecture is based on current standards: Unicode character coding, XML markup, and Dublin Core/Open Language Archives Community recommendations for metadata. The data can be consulted online with any standard browser. The technical simplicity of the tools developed at LACITO makes them suitable for the creation of similar databases at other institutions. (For instance, tools from this archive were successfully adapted in the creation of the Formosan Languages archive [2].)
  • Keywords
    document handling; information retrieval; information retrieval systems; linguistics; multimedia databases; speech coding; Asia and the Pacific; Dublin Core recommendation; Formosan languages archive; LACITO multimedia archive; Open Language Archives Community recommendation; XML markup; cultural context; document free access; endangered languages; fine-grained documentation; human heritage; unicode character coding; Communities; Databases; Documentation; Pragmatics; Speech; Standards; XML; Multimedia corpora; endangered languages; interlinear glossing; language documentation; long-term preservation; online databases; spontaneous speech;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Speech Database and Assessments (Oriental COCOSDA), 2011 International Conference on
  • Conference_Location
    Hsinchu
  • Print_ISBN
    978-1-4577-0930-2
  • Type

    conf

  • DOI
    10.1109/ICSDA.2011.6085973
  • Filename
    6085973