The repetitive content of the genome, once considered to be “junk DNA”, is in fact an essential component of genomic architecture and evolution. In this study, we used the genomes of three varieties of Cannabis sativa, three varieties of Humulus lupulus and one genotype of Morus notabilis to characterize their repetitive content using a graph-based clustering method designed to explore and compare repeat content in genomes that have not been fully assembled.
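As a rough illustration of the idea behind graph-based clustering of unassembled reads, the following minimal Python sketch treats each read as a node, links reads that were judged similar, and reports each cluster's share of the sampled reads as an estimate of its genome proportion. It is a sketch under stated assumptions, not the pipeline used in this study: the function names, the toy read pairs, and the `min_size` cutoff are hypothetical, and plain connected components stand in for the community detection that dedicated repeat-clustering tools apply.

```python
from collections import defaultdict


def cluster_reads(num_reads, similar_pairs):
    """Group reads into clusters (connected components of the similarity graph)."""
    parent = list(range(num_reads))

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each pair is an edge between two reads that were judged similar.
    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for read in range(num_reads):
        clusters[find(read)].append(read)
    # Largest clusters first; these correspond to the most abundant repeats.
    return sorted(clusters.values(), key=len, reverse=True)


def genome_proportions(clusters, num_reads, min_size=2):
    """Fraction of sampled reads in each cluster of at least min_size reads."""
    return [len(c) / num_reads for c in clusters if len(c) >= min_size]


# Hypothetical toy input: 10 reads and the pairs found to be similar.
pairs = [(0, 1), (1, 2), (2, 3), (5, 6)]
clusters = cluster_reads(10, pairs)
proportions = genome_proportions(clusters, 10)
print(clusters[:2])   # the two multi-read clusters
print(proportions)    # their estimated genome proportions
```

Because the reads are a random low-coverage sample, the fraction of reads falling in a cluster approximates the fraction of the genome occupied by that repeat, which is why no assembly is required.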
The repetitive content in the C. sativa genome is mainly composed of the retrotransposons LTR/Copia and LTR/Gypsy (14% and 14.8%, respectively), ribosomal DNA (2%), and low-complexity sequences (29%). We observed a recent copy number expansion in some transposable element families. Simple repeats and low-complexity regions of the genome show higher intra- and inter-species variation.
As with other sequenced genomes, the repetitive content of the C. sativa genome exhibits a wide range of evolutionary patterns. Some repeat types have patterns of diversity consistent with expansions followed by losses in copy number, while others may have expanded more slowly and reached a steady state. Still other repetitive sequences, particularly ribosomal DNA (rDNA), show signs that concerted evolution plays a major role in homogenizing sequence variation.
Graph-based clustering; Next-generation sequencing; RepeatExplorer; Transposable elements
- PMID: 29466945
- DOI: 10.1186/s12864-018-4494-3