Research Paper Volume 12, Issue 5 pp 4445–4462
Distribution patterns of microsatellites and development of its marker in different genomic regions of forest musk deer genome based on high throughput sequencing
- 1 Chongqing Engineering Laboratory of Green Planting and Deep Processing of Three Gorges Reservoir Famous-region Drug, College of Biology and Food Engineering, Chongqing Three Gorges University, Chongqing 404120, P. R. China
- 2 Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, P. R. China
- 3 Sichuan Institute of Musk Deer Breeding, Chengdu 611830, P. R. China
- 4 College of Environmental and Chemistry Engineering, Chongqing Three Gorges University, Chongqing 404120, P. R. China
- 5 Chongqing Engineering Technology Research Center for GAP of Genuine Medicinal Materials, Chongqing Institute of Medicinal Plant Cultivation, Chongqing 408435, P. R. China
Received: November 2, 2019 Accepted: February 25, 2020 Published: March 10, 2020
https://doi.org/10.18632/aging.102895
Copyright © 2020 Qi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
Forest musk deer (Moschus berezovskii, FMD) is an endangered artiodactyl species whose males produce musk. We sequenced the whole genome of FMD, completed the genome assembly and annotation, and performed bioinformatic analyses. Our results showed that microsatellites (simple sequence repeats, SSRs) were nonrandomly distributed across genomic regions, and SSR abundances were much higher in the intronic and intergenic regions than in other genomic regions. Tri- and hexanucleotide perfect (P) SSRs predominated in coding regions (CDSs), whereas tetra- and pentanucleotide P-SSRs were less abundant. Trifold P-SSRs had higher GC content in the 5′-untranslated regions (5′UTRs) and CDSs than in other genomic regions, whereas mononucleotide P-SSRs had the lowest GC content. The repeat copy numbers (RCN) of the same mono- to hexanucleotide P-SSRs were distributed differently among genomic regions. The RCN of trinucleotide P-SSRs was significantly higher in the CDSs than in the transposable elements (TEs), intronic, and intergenic regions. Analysis of the coefficient of variability (CV) of P-SSRs showed that the RCN of mononucleotide P-SSRs varied relatively more across genomic regions, followed in order by dinucleotide P-SSRs > trinucleotide P-SSRs > tetranucleotide P-SSRs > pentanucleotide P-SSRs > hexanucleotide P-SSRs. The CV of RCN for the same mono- to hexanucleotide P-SSRs was relatively higher in the intronic and intergenic regions, followed by the TEs, and relatively lower in the 5′UTRs, CDSs, and 3′UTRs. Fifty-eight novel polymorphic SSR loci were detected by genotyping DNA from 36 captive FMD, and 22 SSR markers ultimately showed polymorphism, stability, and repeatability.
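To make the quantities in the abstract concrete, the following minimal Python sketch shows one way perfect SSRs, their repeat copy numbers, GC content, and the CV of RCN could be computed for a DNA sequence. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not name the detection tool or its thresholds, so the minimum repeat numbers in `MIN_REPEATS`, the function names, and the use of the population standard deviation in the CV are all hypothetical choices.

```python
import re
from statistics import mean, pstdev

# Hypothetical minimum repeat copy numbers per motif length (1-6 bp).
# These thresholds are assumptions for illustration, not values from the paper.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, rcn) for perfect SSRs with 1-6 bp motifs."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-bp motif followed by at least n - 1 exact copies of itself.
        pattern = re.compile("([ACGT]{%d})\\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # e.g. 'AA' is counted as a mononucleotide SSR instead
            rcn = len(m.group(0)) // k
            hits.append((m.start(), motif, rcn))
    return sorted(hits)


def gc_content(seq):
    """GC content of a sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cv_percent(values):
    """Coefficient of variability: standard deviation / mean, as a percentage.

    Uses the population standard deviation (an assumption of this sketch).
    """
    values = list(values)
    if not values or mean(values) == 0:
        return 0.0
    return 100.0 * pstdev(values) / mean(values)


if __name__ == "__main__":
    # Toy sequence: (A)12, (CAG)5 and (AT)7 separated by short spacers.
    toy = "GG" + "A" * 12 + "CC" + "CAG" * 5 + "GG" + "AT" * 7
    for start, motif, rcn in find_perfect_ssrs(toy):
        print(start, motif, rcn, round(gc_content(motif * rcn), 2))
```

In a region-level analysis of the kind the abstract describes, the SSRs would first be grouped by annotated region (for example CDS, intron, or intergenic), and `cv_percent` would then be applied to the repeat copy numbers of each motif class within each group.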