• DocumentCode
    185683
  • Title

    Design and implementation of one vertical video search engine

  • Author

    Yingyi Liang ; Zhenyu He ; Yi Li

  • Author_Institution
    Sch. of Comput. Sci., Harbin Inst. of Technol., Shenzhen, China
  • fYear
    2014
  • fDate
    18-19 Oct. 2014
  • Firstpage
    75
  • Lastpage
    79
  • Abstract
    In this paper, a vertical video search engine is designed and implemented based on the theory of vertical search engines. First, we introduce the vertical search engine and the state of research on it in China and abroad, analyze the principles of implementing a vertical search engine, and introduce the key technologies used in this paper, such as the subject information acquisition method, the Chinese segmentation algorithm, and search result re-sorting. We describe the video resource acquisition process, the storage of video resources, and the exclusion of duplicate video resources. Then, we analyze Lucene, an information retrieval tool library with an advanced design and superior performance. Based on this library, a Chinese segmentation algorithm and a result sorting method are added. Unlike other current studies, a variable-length matching strategy is adopted for Chinese word segmentation, with a bidirectional matching method for disambiguation. Compared with the latest open source word segmentation algorithm, the segmentation algorithm designed in this paper performs better. With the video resources fetched from the Internet and the Chinese word segmentation of VKAnalyzer, which extends Lucene and is designed and implemented in this paper, we design video re-sorting methods based on different criteria, such as length, times, and comments, and implement the sorting of search results according to users' various requirements. The experiments show that the recall rate of the search engine is 90% and the accuracy is 97%, which are satisfactory.
  • Keywords
    Internet; image matching; natural language processing; search engines; video retrieval; Chinese segmentation algorithm; Chinese word segmentation; Lucene; VKAnalyzer; bidirectional matching method; information retrieval tool library; one vertical video search engine; recall rate; repeat video resources exclusion; search result resorting; subject information acquisition method; variable length matching strategy; video resorting methods; video resource acquisition process; video resource fetching; video resources storage; Educational institutions; Indexing; Sorting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Security, Pattern Analysis, and Cybernetics (SPAC), 2014 International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4799-5352-3
  • Type

    conf

  • DOI
    10.1109/SPAC.2014.6982660
  • Filename
    6982660
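
The abstract mentions a bidirectional matching method for disambiguating Chinese word segmentation but gives no implementation details. The following is a minimal sketch of the general technique (forward and backward maximum matching, then choosing between the two results), not the paper's VKAnalyzer. The toy dictionary `VOCAB`, the `max_len` limit, and the tie-breaking rule (fewer tokens, then fewer single-character tokens, preferring the backward result on a tie) are assumptions chosen for illustration.

```python
# Toy dictionary (assumption); a real system would load a large lexicon.
VOCAB = {"研究", "研究生", "生命", "起源"}


def forward_max_match(text, vocab, max_len):
    """Scan left to right, taking the longest dictionary word at each step."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, vocab, max_len):
    """Scan right to left, taking the longest dictionary word at each step."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()
    return tokens


def bidirectional_match(text, vocab, max_len=4):
    """Run both scans and keep the segmentation with the lower cost."""
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)
    if fwd == bwd:
        return fwd

    def cost(tokens):
        # Heuristic (assumption): fewer tokens, then fewer single characters.
        singles = sum(1 for t in tokens if len(t) == 1)
        return (len(tokens), singles)

    return bwd if cost(bwd) <= cost(fwd) else fwd


if __name__ == "__main__":
    print(bidirectional_match("研究生命的起源", VOCAB))
```

On the sample phrase, the forward scan greedily takes "研究生" and leaves a stray "命", while the backward scan recovers "研究" and "生命"; the cost heuristic then selects the backward result.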