Evaluating the performance of Boosting and Bayes A methods under different genomic architecture challenges for discrete and continuous traits

Article type: Research article

Author

Full-time faculty member, Islamic Azad University, Astara Branch

Abstract

Background and objectives: Genomic selection is a promising approach for uncovering the genetic basis of quantitative and qualitative traits in order to improve genetic gain and the accuracy of genomic prediction in animal breeding. In this study, the performance of the Boosting and Bayes A methods in estimating genomic breeding values for binary threshold and continuous traits was investigated at different marker densities under different genomic architectures.
Materials and methods: Genomic data were simulated with the QMSim software for 30 chromosomes, with different levels of heritability (0.1 and 0.3), different levels of LD (low and high), different numbers of quantitative trait loci (150 and 450) and different marker densities (10k and 50k). To create binary threshold phenotypes in the reference set, the individuals of each generation were ranked according to their continuous phenotype in the QMSim output; the threshold phenotype of each individual was then defined relative to the mean of the simulated population, as code 0 (below the trait mean) or code 1 (above the trait mean). Finally, genomic breeding values were computed with the Boosting and Bayes A methods and used to assess the genomic accuracy of threshold and continuous traits.
Results: Boosting showed a wider range of genomic accuracy than Bayes A as marker density changed. Compared with threshold Bayes A, Boosting increased the genomic accuracy of threshold traits by 6.3 and 7.3 percent at marker densities of 10k and 50k, respectively. For traits with a continuous phenotypic distribution, Bayes A performed considerably better than Boosting, especially in scenarios with low marker density. Components of genomic architecture, including heritability, number of QTL and LD, were among the factors affecting the genomic accuracy of the Bayes and Boosting methods. Among these, the effect of heritability on the performance of each method was the most evident. Overall, the genomic accuracy of the Bayes method was more sensitive to changes in the number of QTL, and that of Boosting to changes in the level of LD. At high marker density and for threshold traits, the highest and lowest genomic accuracies were obtained with Boosting (0.598) and threshold Bayes A (0.510), respectively, when the number of QTL was high. For continuous traits, the highest and lowest genomic accuracies were obtained with Bayes A (0.702) and Boosting (0.569), respectively, when the number of QTL was low. The positive effect of increasing LD on the genomic accuracy of Boosting and Bayes A was more evident in scenarios with low marker density than in scenarios with high marker density.
Conclusion: The overall trend of the results indicates that Boosting performs best in the genomic evaluation of threshold traits and Bayes A performs best in the evaluation of continuous traits.
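
The dichotomization step described in the materials and methods can be illustrated with a minimal sketch. It is not the author's code: the function name `dichotomize_by_mean` and the phenotype values are hypothetical, and two details are assumptions because the abstract does not state them: the mean is taken over whatever group of records is passed in, and a value exactly equal to the mean is coded 0.

```python
from statistics import mean


def dichotomize_by_mean(phenotypes):
    """Code each continuous phenotype as 0 (not above the mean) or 1 (above the mean).

    Assumption: ties with the mean are coded 0; the abstract does not say.
    """
    threshold = mean(phenotypes)
    return [1 if y > threshold else 0 for y in phenotypes]


# Hypothetical continuous phenotypes (illustrative values, not simulation output).
continuous = [9.5, 10.5, 12.0, 8.0, 11.0]
binary = dichotomize_by_mean(continuous)
print(binary)  # [0, 1, 1, 0, 1] because the mean is 10.2
```

Because the cut point is the mean, the two classes are roughly balanced only when the continuous phenotype is roughly symmetric.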

Article title [English]

Performance evaluation of Boosting and Bayes A methods under different genomic architecture challenges for discrete and continuous traits

Author [English]

  • Yousef Naderi
Assistant Professor, Department of Animal Science, Islamic Azad University, Astara Branch, Astara, Iran. Young Researchers Club, Islamic Azad University, Astara Branch, Astara, Iran
Abstract [English]

Background and objectives: Genomic selection is a promising approach for discovering genetic variants that influence quantitative and threshold traits, with the aim of improving genetic gain and the accuracy of genomic prediction in animal breeding. In this study, the performance of the Boosting and Bayes A methods in estimating genomic breeding values for binary threshold and continuous traits was investigated at different marker densities under different genomic architectures.
Materials and methods: Genomic data were simulated with the QMSim software for 30 chromosomes to reflect variation in heritability (h2 = 0.1 and 0.3), linkage disequilibrium (LD; low and high), number of QTL (150 and 450) and marker density (10k and 50k). To create binary threshold phenotypes in the training set, the individuals of each generation were ranked according to their continuous phenotypes in the QMSim output. The threshold phenotype of each individual was then defined relative to the mean of the simulated population, as code 0 (below the trait mean) or code 1 (above the trait mean). Finally, genomic estimated breeding values were calculated using the Bayes A and Boosting methods to evaluate the accuracy of genomic prediction for threshold and continuous traits.
Results: Compared with the Bayes A method, the Boosting algorithm showed a wider range of genomic accuracy as marker density changed. Compared with the threshold Bayes A method, Boosting increased the genomic accuracy of threshold traits by 6.3 and 7.3 percent when the 10k and 50k SNP panels were used, respectively. For traits with a continuous phenotypic distribution, Bayes A performed considerably better than Boosting, especially when the sparse panels were used. Components of genomic architecture, including heritability, number of QTL and LD, were the most important factors affecting the accuracy of genomic prediction with the Bayes A and Boosting methods. Among these, the effect of heritability on the performance of each method was the most evident. Overall, the genomic accuracy of Bayes A was more sensitive to changes in the number of QTL, and that of Boosting to changes in LD. For threshold traits with high-density marker panels, the highest and lowest genomic accuracies were obtained with Boosting (0.598) and threshold Bayes A (0.510), respectively, when the number of QTL was high. For continuous traits, the highest and lowest genomic accuracies were obtained with Bayes A (0.702) and Boosting (0.569), respectively, when the number of QTL was low. The positive effect of increasing LD on the genomic prediction accuracy of Boosting and Bayes A was more noticeable for the sparse panels than for the high-density panels.
Conclusion: The general trend of the results indicates that Boosting performs best for threshold traits and Bayes A performs best for continuous traits.
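
The abstract reports genomic accuracy without defining it. In simulation studies it is commonly computed as the Pearson correlation between estimated and true breeding values, and the sketch below assumes that definition; it is not confirmed by the abstract. The function name and the breeding values are hypothetical and serve only to illustrate the calculation.

```python
from math import sqrt


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical true and estimated breeding values (illustrative only).
true_bv = [1.0, 2.0, 3.0, 4.0]
gebv = [1.5, 1.0, 3.5, 3.0]

# Assumed definition: accuracy = cor(GEBV, true breeding value).
accuracy = pearson_correlation(true_bv, gebv)
print(round(accuracy, 3))
```

Under this definition, figures such as 0.598 and 0.510 would be read as correlations, so a higher value means the estimated breeding values rank individuals more like their true breeding values do.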

Keywords [English]

  • Threshold traits
  • Genomic accuracy
  • Heritability
  • Machine learning
  • Linkage disequilibrium