Regional Information Center for Science and Technology - Automatic Evaluation of Video Search Engines in the Persian Web Domain Based on Majority Voting

Record number :

1017915

Article title :

Automatic Evaluation of Video Search Engines in the Persian Web Domain Based on Majority Voting

Title in another language :

Automatic Evaluation of Video Search Engines in Persian Web Domain Based on Majority Voting

Authors :

Yadollahi, Mohammad Mehdi, Iran Telecommunication Research Center ; Zargari, Farzad, Iran Telecommunication Research Center ; Farhoodi, Mojgan, Iran Telecommunication Research Center

Number of pages :

From page :

To page :

Keywords :

Persian web , video search engines , automatic evaluation

Persian abstract :

Today, the very rapid growth of the Internet and its increasing penetration into people's lives have led many users to turn to search engines to meet their daily needs, and these search engines require continuous development and improvement. Evaluating search engines to determine their performance is therefore of great importance. In Iran, as in other countries, extensive research has been carried out on building native special-purpose search engines. One of the most important special-purpose search engines built so far is the video search engine, which is responsible for retrieving videos from the web. To evaluate the quality of these search engines and improve them continuously, the service level of each search engine must be assessed in comparison with the other existing search engines. Since the speed of evaluation plays an important role in determining the course of the required corrections, the automatic evaluation of search engines becomes very important. In this paper, a method based on vote aggregation is presented for the automatic evaluation of video search engines. The main focus of this method is the Persian web domain, and it attempts to evaluate video search engines by developing a novel content-based similarity measure built on the motion vectors of videos. To benchmark the proposed method, a mechanism was designed to compare its results with the results of human evaluation. The results show a correlation of more than 94% between the two methods, which indicates that the proposed automatic evaluation method is reliable.

English abstract :

Today, the growth of the Internet and its strong influence on people's lives have led many users to rely on search engines for their daily needs; hence, search engines need to be modified and continuously improved. Evaluating search engines to determine their performance is therefore of paramount importance. In Iran, as in other countries, extensive research is being performed on search engines. To evaluate the quality of search engines and continually improve their performance, it is necessary to evaluate each search engine and compare it with the other existing ones. Since speed plays an important role in performance assessment, automatic search engine evaluation methods have attracted great attention. In this paper, a method based on majority voting is proposed to assess video search engines. We introduce a mechanism to assess the automatic evaluation method by comparing its results with those obtained by human evaluation of the search engines. The results show a 94% correlation between the two methods, which indicates the reliability of the automated approach. In general, the proposed method can be described in three steps.
Step 1: Retrieve the first k_retrieve results of n different video search engines and build the returned result set for each written query.
Step 2: Determine the level of relevance of each result retrieved from the search engines.
Step 3: Evaluate the search engines by computing different evaluation criteria based on the relevance decisions for the videos retrieved by each search engine.
Clearly, the main part of any evaluation system whose goal is to evaluate the accuracy of search engines is the second step. In this paper, we present a new solution based on the aggregation of votes to determine whether a result is relevant or not, as well as its level of relevance.
For this purpose, for each query the results returned by the different search engines are compared with each other, and the results returned by more than m of the search engines (m ...), and the results whose URLs (after normalization) are similar to the normalized URLs from m-1 of the other search engines, are considered relevant. At the second level, the retrieved results are compared in terms of content. After the URL-based similarity has been calculated, all results are passed to the motion vector extraction component, which extracts and stores their motion vectors. In the content-based similarity algorithm, the set of motion vectors of a video is first treated as a sequence of motion vectors. We then try to find the greatest similarity of the smaller sequence with the larger sequence, and report it as the maximum similarity of the two videos. To find the maximum similarity, we consider a window whose length equals that of the shorter video sequence; within this window we calculate and store the similarity of the two sequences. In the proposed method, after the similarity between the results returned by the different search engines has been identified, each result is assigned one of three relevance levels: "unrelated" (0), "slightly related" (1) and "related" (2). Since Google is currently the world's largest and best-performing search engine, and most search engines are compared with it and try to achieve the same functionality, the first five Google results are assigned, by default, the minimum relevance level, "slightly related". The similarity module is then used to evaluate the similarity of the n results retrieved from the tested search engines.
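
The two comparison levels described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the normalization rules in `normalize_url`, the `min_votes` threshold (a hypothetical parameter standing in for the condition on m), the representation of each video as one 2-D motion vector per frame, the per-pair measure in `pair_similarity`, and the choice to slide the window over every offset of the longer sequence.

```python
import math
from urllib.parse import urlsplit


def normalize_url(url):
    # Assumed normalization (not given in the abstract): drop the scheme,
    # "www.", query string, fragment and trailing slash; lowercase the host.
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def url_votes(results_by_engine):
    # Count how many engines returned each normalized URL.
    # Each engine votes at most once per URL.
    votes = {}
    for urls in results_by_engine.values():
        for key in {normalize_url(u) for u in urls}:
            votes[key] = votes.get(key, 0) + 1
    return votes


def relevant_by_vote(votes, min_votes):
    # min_votes is a hypothetical threshold, not a value from the paper.
    return {url for url, count in votes.items() if count >= min_votes}


def pair_similarity(u, v):
    # Assumed measure: 1 for identical vectors, decreasing with distance.
    return 1.0 / (1.0 + math.dist(u, v))


def max_window_similarity(seq_a, seq_b):
    # Compare the shorter sequence against every window of the same
    # length in the longer sequence and keep the best mean similarity.
    short, long_ = sorted((seq_a, seq_b), key=len)
    if not short:
        return 0.0
    best = 0.0
    for start in range(len(long_) - len(short) + 1):
        window = long_[start:start + len(short)]
        score = sum(pair_similarity(u, v) for u, v in zip(short, window)) / len(short)
        best = max(best, score)
    return best
```

Dropping the query string is a simplification that would merge distinct videos on sites that identify a video by a query parameter, so a real normalizer would need site-specific rules.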

Publication year :

1397

Journal title :

Signal and Data Processing

PDF file :

7500377

Link to this document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=8&DC=1017915