• DocumentCode
    2455478
  • Title

    Automatic learning of phonetic mappings for cross-language phonetic-search in keyword spotting

  • Author

    Bar-Yosef, Yossi ; Aloni-Lavi, Ruth ; Opher, Irit ; Lotner, Noam ; Tetariy, Ella ; Silber-Varod, Vered ; Aharonson, Vered ; Moyal, Ami

  • Author_Institution
    NICE Syst., Ra'anana, Israel
  • fYear
    2012
  • fDate
    14-17 Nov. 2012
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Phonetic-search (PS) is an extremely fast technique used for spoken keyword spotting over large amounts of audio data. PS is based on matching a desired phonetic pattern over existing phonetic lattices, avoiding heavy computations of acoustic probabilities during the search. Since PS requires substantial acoustic and language resources (LR) for training acoustic models, there is a need to reduce model training costs to support new target languages. Under-resourced languages pose an even greater challenge for PS, as the available LR are not sufficient for acoustic model training. This study examines methods for keyword search in a new target language, using existing models of another source language in the lattice generation phase. We explore methodologies for learning cross-language phonetic mappings depending on the availability of data in the target language. We describe three approaches for creating phonetic mappings (linguistic, acoustic, and statistical) and introduce an efficient way to learn a robust statistical cross-language mapping. Our cross-language PS experiments showed that learning a good cross-language mapping can alleviate acoustic mismatches between languages and significantly improve cross-language phonetic-search.
  • Keywords
    learning (artificial intelligence); probability; speech recognition; statistical analysis; acoustic model training; acoustic probabilities; audio data; automatic learning; automatic speech recognition; cross-language PS experiments; cross-language phonetic mapping learning; cross-language phonetic-search; language resources; lattice generation phase; model training cost reduction; phonetic lattices; phonetic mappings; phonetic pattern; robust statistical cross-language mapping; spoken keyword spotting; target language; under-resourced language; Acoustics; Adaptation models; Hidden Markov models; Lattices; Pragmatics; Speech; Training; phonetic-mapping; phonetic-search; spotting; under-resourced languages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electrical & Electronics Engineers in Israel (IEEEI), 2012 IEEE 27th Convention of
  • Conference_Location
    Eilat
  • Print_ISBN
    978-1-4673-4682-5
  • Type

    conf

  • DOI
    10.1109/EEEI.2012.6376955
  • Filename
    6376955