Blind Two-Channel Speech Source Separation Based on Localization

Original title (Persian)

تفكيك كور منابع گفتار دوكاناله بر اساس مكان‌يابي

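Authors

Soufi, Hassan Ali (Ferdowsi University of Mashhad, Faculty of Engineering, Department of Electrical Engineering); Khademi, Morteza (Ferdowsi University of Mashhad, Faculty of Engineering, Department of Electrical Engineering); Ebrahimi Moghadam, Abbas (Ferdowsi University of Mashhad, Faculty of Engineering, Department of Electrical Engineering)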

Keywords

Angular spectrogram, generalized cross-correlation function, blind speech source separation

Abstract (translated from Persian)

This paper presents a new method for blind two-channel speech source separation that requires no prior knowledge of the speech sources. The proposed method separates the sources by weighting the spectrum of the mixture signal according to the distance of each speech source from the microphones. To this end, the speech sources in the mixture signal are first localized by forming an angular spectrogram with the generalized cross-correlation function. Then, according to the spatial position of the sources in terms of their distance from the microphones, the magnitude spectrum of the mixture signal is weighted. By multiplying the weighted magnitude spectrum by the values obtained from the angular spectrogram and comparing the results with one another, a binary mask is built for each source. Applying the binary masks to the magnitude spectrum of the mixture signal separates the speech sources it contains. The method was tested on data from the SiSEC database and evaluated with the measurement tools and criteria provided with that database. The results show that, in terms of those criteria, the proposed method is comparable to competing methods and has lower computational complexity.

Abstract (English)

This paper presents a new method for blind two-channel speech source separation that needs no prior knowledge about the speech sources. The proposed method separates the speech sources by weighting the spectrum of the mixture signal according to the location of the sources, in terms of their distance to the microphones. First, the speech sources in the mixture signal are localized by forming an angular spectrogram with the generalized cross-correlation function. Then, according to the location of the sources, the amplitude of the mixture signal spectrum is weighted. By multiplying the weighted spectrum by the values obtained from the angular spectrogram and comparing the results, a binary mask is constructed for each source. Applying the binary masks to the amplitude of the mixture signal spectrum separates the speech sources. The method is evaluated on the SiSEC database, using the measurement tools and criteria provided with that database. The results show that, in terms of those criteria, the proposed method is comparable to competing methods and has lower computational complexity.
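
The pipeline outlined in the abstract (localize with a generalized cross-correlation angular spectrogram, then build one binary mask per source) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the PHAT weighting for the generalized cross-correlation, uses inter-channel time delay as a stand-in for source location, and omits the distance-based spectrum weighting because the abstract gives no formula for it. The function names, the delay grid, and the synthetic STFT-domain mixture are all hypothetical and chosen only for illustration.

```python
import numpy as np

def angular_spectrogram(X1, X2, freqs, delays):
    """GCC-PHAT value for every (frame, frequency bin, candidate delay).

    A delay tau means channel 2 lags channel 1 by tau seconds.
    """
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12            # PHAT: keep only the phase
    steer = np.exp(-2j * np.pi * freqs[:, None] * delays[None, :])
    return np.real(cross[:, :, None] * steer[None, :, :])

def pick_delays(ang, n_src=2):
    """Indices of the n_src largest local maxima of the pooled spectrum."""
    pooled = ang.sum(axis=(0, 1))
    interior = np.arange(1, pooled.size - 1)
    is_peak = (pooled[interior] > pooled[interior - 1]) & \
              (pooled[interior] >= pooled[interior + 1])
    peaks = interior[is_peak]
    best = peaks[np.argsort(pooled[peaks])[-n_src:]]
    return np.sort(best)                      # ascending delay order

def binary_masks(ang, peak_idx):
    """Give each time-frequency bin to the source with the largest value."""
    winner = np.argmax(ang[:, :, peak_idx], axis=2)
    return [winner == j for j in range(len(peak_idx))]

# Synthetic two-source mixture built directly in the STFT domain (assumed
# setup): every bin is occupied by exactly one source, and channel 2 is a
# delayed copy of channel 1 with a source-dependent delay.
rng = np.random.default_rng(0)
fs, n_fft, n_frames = 16000, 512, 20
freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
delays = np.arange(-8, 9) / fs                # candidate delays, in seconds
true_delays = np.array([-3, 4]) / fs
shape = (n_frames, freqs.size)
owner = rng.integers(0, 2, size=shape)        # source index active in each bin
X1 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
X2 = X1 * np.exp(-2j * np.pi * freqs[None, :] * true_delays[owner])

ang = angular_spectrogram(X1, X2, freqs, delays)
peak_idx = pick_delays(ang, n_src=2)
est_delays = delays[peak_idx]
masks = binary_masks(ang, peak_idx)
separated = [np.abs(X1) * m for m in masks]   # masked magnitude spectra
```

In this toy setup the estimated delays match the two simulated ones, and each mask recovers the bins of its source everywhere except the zero-frequency bin, where the phase carries no delay information. Real recordings would need an STFT front end, and the distance-based weighting described in the abstract would be applied before the comparison step.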

Publication year

1400

Journal title

مهندسي برق و مهندسي كامپيوتر ايران (Electrical Engineering and Computer Engineering of Iran)

PDF file

8476427

Link to this record

https://search.isc.ac/dl/search/defaultta.aspx?DTC=8&DC=1248092