Prediction of novel pseudogenes in the sheep reference genome

Subtitle

Prediction of Novel Pseudogenes in Ovine Reference Genome

Author

Bakhtiarizadeh, Mohammad Reza

Contributors

Mozduri, Zohreh (author); Shakeri, Pedram (author, PhD student in Animal Genetics and Breeding)

Organization

Department of Animal and Poultry Science, Aburaihan Campus

Number of pages

From page

484

To page

497

Keywords

Homology, mismatch, PseudoPipe software, genome annotation

Persian abstract

Pseudogenes are copies of an ancestral gene whose activity has changed over time relative to the original gene; they arise in the genome through processes such as gene duplication and reverse transcription. Pseudogenes were long regarded as nonfunctional genomic sequences. However, recent studies have reported biological activity for these genes, and their potential functionality has increased interest in annotating them more accurately in the genomes of organisms. In the present study, to improve the annotation of the sheep genome, pseudogenes related to protein-coding genes were identified genome-wide for the first time, using homology-based computational methods implemented in the PseudoPipe software. The functional groups of the parent genes from which the pseudogenes were derived were also examined using the DAVID web resource. Finally, various characteristics of the newly identified candidate pseudogenes were compared with those of known pseudogenes in human, mouse, and cattle. In total, 4,098 high-confidence pseudogenes were identified, comprising 1,102 duplicated and 2,996 processed pseudogenes. The results showed that the identified pseudogenes are involved in diverse biological processes such as mRNA splicing, ribosome biogenesis, rRNA binding, mitochondrial electron transport, and translation. Comparison of various characteristics of the identified genes with other species showed that the results of this study agree with previous studies. The results of this study will help improve the annotation of the sheep genome.

English abstract

Introduction: Pseudogenes are copies of ancestral genes that have accumulated changes over time; they arise in the genome through gene duplication and reverse transcription. They have been reported in all types of organisms, from bacteria to mammals. Pseudogenes increase the genetic diversity of many genes through gene conversion and recombination. Three classes of pseudogenes are known: duplicated pseudogenes; processed (retrotransposed) pseudogenes; and unitary (disabled) pseudogenes. Pseudogenes were long considered nonfunctional genomic sequences. However, recent studies have reported that many of them may have some form of biological activity. It has recently been reported that pseudogenes represent a conspicuous part of the human transcriptome and proteome, as thousands of them are transcribed and hundreds are also translated. It has also been demonstrated that pseudogenes exert important coding-dependent and coding-independent functions within complex regulatory networks. Hence, the possibility that these genes are functional has increased interest in their accurate annotation. To the best of our knowledge, there is no available report on high-throughput pseudogene identification in sheep. Therefore, in the present study, to improve the annotation of the sheep genome, we present the first genome-wide identification of pseudogenes derived from protein-coding genes, using a homology-based computational approach.
Materials and Methods: The pseudogene content of the sheep genome was estimated using the PseudoPipe computational annotation pipeline. PseudoPipe predicts pseudogenes in a genome using a homology-based method (BLAST and a clustering algorithm). The repeat-masked sheep reference genome (Ovis_aries.Oar_v3.1), the genome annotation GTF file (version 77), and the sequences of all protein-coding genes were downloaded from the Ensembl database. To identify pseudogenes, the sheep genome was searched in a comprehensive and consistent manner. The key steps in the pipeline were using BLAST to rapidly cross-reference potential "parent" proteins against the intergenic regions of the genome, and then processing the resulting "raw hits": eliminating redundant hits, clustering neighboring hits together, and associating and aligning each cluster with a unique parent. Pseudogenes were then classified using a combination of criteria, including homology, intron/exon structure, and the presence of stop codons and frameshifts. Finally, we inspected the results manually and removed false positives. The gene ontology (GO) of the parent genes from which the pseudogenes were derived was investigated using DAVID. Furthermore, different characteristics of the newly identified candidate pseudogenes were compared with those of known pseudogenes in human, mouse, and cattle.
Results and Discussion: Identifying pseudogenes is vital for better genome annotation and for understanding disease-related molecular mechanisms. Identification of pseudogenes is an ongoing effort, and several groups are continuously working on it. The complexity of pseudogene identification can be addressed by in silico analysis using a homology-based, whole-genome identification approach. Here, using a computational method, we identified 4,098 high-confidence pseudogenes in the sheep genome, comprising 1,102 duplicated and 2,996 processed pseudogenes. The GO analysis showed that the parent genes of the identified pseudogenes are significantly enriched in various biological processes, such as mRNA splicing, ribosome structure, rRNA binding, mitochondrial electron transport, and translation. Interestingly, a growing body of evidence suggests that the parent genes of pseudogenes are associated with ribosomal, rRNA, and translation-related biological processes. Detailed comparison with other species showed that our results are consistent with previous studies. For example, the distribution of pseudogenes across the sheep chromosomes was consistent with that in the human and mouse genomes. Moreover, duplicated pseudogenes are reported to be commonly found on the same chromosome as their parent genes. In our results, about 28% of the identified duplicated pseudogenes were on the same chromosome as their parent genes. The agreement of these results with previous studies supports the accuracy of the method used in this research.
Conclusion: This study has generated, for the first time, a catalog of 4,098 putative sheep pseudogenes. Our findings provide evidence of the pseudogene content of the sheep genome, which is a starting point for understanding their regulatory mechanisms. The identification of these novel pseudogenes greatly improves the annotation of the sheep genome. Similar methods can also be used to improve the genome annotation of other organisms.
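The hit-processing steps described in Materials and Methods (clustering neighboring hits, assigning each cluster to a parent, and classifying by intron/exon structure) can be illustrated with a minimal Python sketch. This is not PseudoPipe's code or interface. The names Hit, cluster_hits, classify, and same_chrom_fraction, the 50 bp max_gap threshold, and the single-criterion classification rule are all hypothetical simplifications introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One hit of a parent protein against an intergenic region (hypothetical type)."""
    chrom: str
    start: int
    end: int
    parent: str  # identifier of the parent protein


def cluster_hits(hits, max_gap=50):
    """Merge overlapping or neighboring hits to the same parent on one chromosome.

    max_gap (in bp) is an illustrative threshold, not a PseudoPipe default.
    """
    clusters = []
    for h in sorted(hits, key=lambda x: (x.chrom, x.parent, x.start)):
        last = clusters[-1] if clusters else None
        if (last is not None
                and last.chrom == h.chrom
                and last.parent == h.parent
                and h.start - last.end <= max_gap):
            last.end = max(last.end, h.end)  # extend the current cluster
        else:
            clusters.append(Hit(h.chrom, h.start, h.end, h.parent))
    return clusters


def classify(retains_introns):
    """Simplified rule: a candidate that keeps the parent's intron/exon
    structure is called duplicated; an intronless copy is called processed.
    The study combines several criteria; this uses only one."""
    return "duplicated" if retains_introns else "processed"


def same_chrom_fraction(pairs):
    """Fraction of (pseudogene_chrom, parent_chrom) pairs on the same chromosome."""
    if not pairs:
        return 0.0
    return sum(1 for pg, par in pairs if pg == par) / len(pairs)


if __name__ == "__main__":
    # Made-up coordinates, for demonstration only.
    hits = [
        Hit("1", 100, 200, "P1"),
        Hit("1", 220, 300, "P1"),
        Hit("1", 1000, 1100, "P1"),
        Hit("2", 100, 200, "P1"),
    ]
    for c in cluster_hits(hits):
        print(c.chrom, c.start, c.end, c.parent)
```

A function like same_chrom_fraction, applied to the duplicated pseudogenes and their parents, corresponds to the kind of same-chromosome proportion reported in Results and Discussion (about 28%).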

Publication year

1396

Journal title

Iranian Journal of Animal Science Research

Link to this record

https://search.isc.ac/dl/search/defaultta.aspx?DTC=8&DC=969887