Regional Information Center for Science and Technology (RICeST) - A comparison of different machine learning methods for Persian extractive speech-to-speech summarization without using a transcript

Record number:

997281

Article title:

A comparison of different machine learning methods for Persian extractive speech-to-speech summarization without using a transcript

Title in another language:

A comparison of machine learning techniques for Persian Extractive Speech to Speech Summarization without Transcript

Authors:

Sadat Jafari, Hoda (Amirkabir University of Technology, Tehran - Faculty of Computer Engineering and Information Technology); Homayounpour, Mohammad Mehdi (Amirkabir University of Technology, Tehran - Faculty of Computer Engineering and Information Technology)

Number of pages:

From page:

143

To page:

157

Keywords:

Extractive speech summarization, speech signal, key patterns, S-DTW algorithm, machine learning

Persian abstract:

In this paper, extractive speech summarization using different machine learning methods is studied. Summarizing a speech file means extracting the important and salient parts of the speech so that the information in speech files can be accessed, searched, extracted, and browsed more easily and at lower cost. A new speech summarization method that does not use an automatic speech recognition system is presented. Repeated patterns between two spoken sentences are identified directly from the speech signal using the S-DTW algorithm. After the similarity between two sentences is determined and a number of features are extracted from each sentence, the effect of different machine learning methods (supervised, unsupervised, and semi-supervised) is examined. Experiments were carried out on a corpus of read Persian news. The results show that, with suitable features and without using a transcript, higher performance than the baseline methods can be achieved (a 3% increase compared with selecting the first sentences and a 5% increase compared with selecting the longest sentences, using the ROUGE-3 measure).

English abstract:

In this paper, extractive speech summarization using different machine learning algorithms is investigated. Speech summarization extracts the important and salient segments of a recording so that speech files can be accessed, searched, and browsed more easily and at lower cost. We propose a new method for speech summarization that does not use an automatic speech recognition (ASR) system. ASR systems usually have high error rates, especially in adverse acoustic environments and for low-resource languages. Our goal was to answer the question: is it possible to summarize Persian speech without ASR, using little or no training data? The proposed method discovers salient parts directly from the speech signal using a semi-supervised algorithm. It consists of three main stages: feature extraction, identification of key patterns, and selection of important sentences. First, the speech was segmented manually into sentences to eliminate sentence segmentation errors, which allows a fairer comparison between summarization methods. Next, features were extracted from each sentence, such as the sentence duration and whether the sentence is the first or last in the recording. In addition, repetitive patterns between each pair of sentences are discovered directly from the speech signal with the S-DTW algorithm, which finds repeated patterns between two speech signals using MFCC features. These repetitive patterns are used to build a similarity matrix over all sentence pairs. The similarity between each pair of sentences can therefore be measured, and redundant sentences can be removed from the summary, without an ASR system. After the similarity between each pair of speech segments has been computed and the features have been extracted from each segment, various machine learning algorithms are used to extract the salient parts: unsupervised (MMR, TextRank), supervised (SVM, Naïve Bayes), and semi-supervised (self-training, co-training). Experiments were conducted on a read Persian news corpus. The results show that, with the semi-supervised co-training method and appropriate features, the performance of the speech summarization system on this corpus improves by about 3% compared with selecting the first sentences and by about 5% compared with selecting the longest sentences, when ROUGE-3 is used as the evaluation measure.
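The similarity-then-selection pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: plain dynamic time warping over whole sentences stands in for S-DTW (which searches for matching sub-segments), each sentence is a hypothetical list of MFCC-like frame vectors, the `relevance` scores are assumed to come from some upstream feature-based scorer, and the function names `dtw_distance`, `similarity_matrix`, and `mmr_select` are illustrative only.

```python
import math


def dtw_distance(a, b):
    """Length-normalised DTW cost between two sequences of frame vectors.

    Assumption: plain DTW over whole sentences, not the segmental S-DTW
    named in the abstract.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # skip a frame of b
                                 cost[i][j - 1],      # skip a frame of a
                                 cost[i - 1][j - 1])  # align both frames
    return cost[n][m] / (n + m)


def similarity_matrix(sentences):
    """Symmetric matrix with sim = 1 / (1 + DTW distance), so identical
    sentences score 1.0 and dissimilar ones approach 0."""
    k = len(sentences)
    sim = [[1.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            s = 1.0 / (1.0 + dtw_distance(sentences[i], sentences[j]))
            sim[i][j] = sim[j][i] = s
    return sim


def mmr_select(relevance, sim, budget, lam=0.7):
    """Greedy Maximal Marginal Relevance: trade relevance against the
    highest similarity to any sentence already in the summary."""
    selected = []
    candidates = set(range(len(relevance)))
    while candidates and len(selected) < budget:
        def score(i):
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * redundancy
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: sentence 1 repeats sentence 0; sentence 2 is acoustically different.
toy_sentences = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (1.0, 1.0)],
    [(5.0, 5.0), (6.0, 6.0)],
]
toy_sim = similarity_matrix(toy_sentences)
toy_summary = mmr_select([0.9, 0.8, 0.5], toy_sim, budget=2, lam=0.5)
```

In this toy case the second sentence is skipped despite its high relevance score, because it duplicates the first; this is the redundancy removal that the similarity matrix makes possible without a transcript.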

Publication year:

1396

Journal title:

Signal and Data Processing

PDF file:

7329421

Link to this record:

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=8&DC=997281