Plasmid genomics: gplas

Update 2020-04-11: Now published in Bioinformatics

gplas is a software tool that identifies individual plasmids in bacterial genomes sequenced with short reads. It bins contigs predicted to be plasmid-derived into separate groups, each corresponding to a different plasmid. To do so, it first uses k-mer information to identify plasmid nodes, then uses the information in the assembly graph, i.e. the connectivity of the nodes, and finally takes the coverage of the individual nodes into account. It works best in combination with predictions from mlplasmids, which relies on species-specific classification models, but it can also use PlasFlow, which has a species-independent classification model.
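To make the three ingredients described above concrete (plasmid prediction per node, graph connectivity, and coverage), here is a minimal Python sketch. It is not the gplas implementation: the function name `bin_plasmid_contigs`, the thresholds `min_prob` and `max_cov_ratio`, and the use of plain connected components are all assumptions chosen for illustration only.

```python
from collections import defaultdict


def bin_plasmid_contigs(nodes, edges, min_prob=0.5, max_cov_ratio=1.5):
    """Group plasmid-predicted contigs into bins (illustrative only).

    nodes: dict mapping node name -> (plasmid_probability, coverage)
    edges: iterable of (node_a, node_b) links from the assembly graph
    min_prob, max_cov_ratio: hypothetical thresholds, not gplas defaults
    """
    # Step 1: keep only the nodes the classifier predicts as plasmid.
    plasmid = {name for name, (prob, _) in nodes.items() if prob >= min_prob}

    # Step 2: keep graph links between plasmid nodes of similar coverage.
    adjacency = defaultdict(set)
    for a, b in edges:
        if a in plasmid and b in plasmid:
            cov_a, cov_b = nodes[a][1], nodes[b][1]
            if max(cov_a, cov_b) <= max_cov_ratio * min(cov_a, cov_b):
                adjacency[a].add(b)
                adjacency[b].add(a)

    # Step 3: each connected component of the filtered graph is one bin.
    bins, seen = [], set()
    for start in sorted(plasmid):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        bins.append(sorted(component))
    return bins
```

In this toy version, two plasmid-predicted contigs end up in the same bin only if they are linked in the graph and their coverages differ by at most the chosen ratio, so a low-copy and a high-copy plasmid that share a repeat would be split apart.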

The preprint is available at bioRxiv, and the algorithm can be retrieved from Sergio Arredondo’s GitLab repo.

