Plasmid genomics: gplas
Update 2020-04-11: Now published in Bioinformatics
gplas is a software tool for finding distinct plasmids in bacterial genomes sequenced with short reads. It bins contigs that are predicted to be plasmid-derived into separate entities, each corresponding to a different plasmid. To do that, it first uses k-mer information to determine which nodes are plasmid nodes, then uses the information in the assembly graph, i.e. the connectivity of the nodes, and finally takes the coverage of the individual nodes into account. It works best in combination with the predictions of mlplasmids, which uses species-specific classification models, but it can also use plasflow, which has a species-independent classification model.
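To make the three ingredients concrete (plasmid prediction per node, graph connectivity, coverage), here is a toy Python sketch. It is not the gplas algorithm itself: the function name, the thresholds, and the rule "link two neighbouring plasmid nodes only if their coverage is similar" are assumptions made purely for illustration, and the example data is invented.

```python
from collections import defaultdict


def bin_plasmid_contigs(nodes, edges, prob_threshold=0.5, max_cov_ratio=1.5):
    """Toy binning of plasmid-predicted nodes (illustrative, not gplas).

    nodes: dict mapping node name -> (plasmid_probability, coverage)
    edges: iterable of (node_a, node_b) links from the assembly graph
    Both thresholds are hypothetical values chosen for this sketch.
    """
    # Step 1: keep nodes that the classifier predicts as plasmid-derived.
    plasmid = {name for name, (prob, _) in nodes.items() if prob >= prob_threshold}

    # Steps 2 and 3: keep a graph link only if both ends are plasmid nodes
    # and their coverages are similar.
    adjacency = defaultdict(set)
    for a, b in edges:
        if a not in plasmid or b not in plasmid:
            continue
        cov_a, cov_b = nodes[a][1], nodes[b][1]
        if min(cov_a, cov_b) <= 0:
            continue  # avoid dividing by zero on uncovered nodes
        if max(cov_a, cov_b) / min(cov_a, cov_b) <= max_cov_ratio:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Each connected component of the filtered graph becomes one bin.
    bins, seen = [], set()
    for start in sorted(plasmid):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        bins.append(sorted(component))
    return bins


# Invented example: c4 is predicted chromosomal; c1/c2 and c3/c5 differ in coverage.
example_nodes = {
    "c1": (0.95, 30.0),
    "c2": (0.90, 28.0),
    "c3": (0.85, 90.0),
    "c4": (0.10, 10.0),
    "c5": (0.80, 95.0),
}
example_edges = [("c1", "c2"), ("c2", "c3"), ("c3", "c5"), ("c3", "c4")]

print(bin_plasmid_contigs(example_nodes, example_edges))
```

In this invented example, c2 and c3 are neighbours in the graph but end up in different bins because their coverage differs roughly threefold, which is the kind of signal that helps separate plasmids sharing a connection in the assembly graph.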
The preprint is available at bioRxiv and the algorithm can be retrieved from the GitLab repo.
Want to know more about it? There is a podcast about plasmid genomics, gplas and mlplasmids at the bioinformatics chat with Sergio Arredondo and me.