Introduction
|
After the first two days you will have some familiarity with working on the command line, data management, data cleaning and visualization, and automation and scripting.
|
Logging onto Cloud
|
|
Introducing the Shell
|
The shell gives you the ability to work more efficiently by using keyboard commands rather than a GUI.
Useful commands for navigating your file system include: ls , pwd , and cd .
Most commands take options (flags) which begin with a - .
Tab completion can reduce errors from mistyping and make work more efficient in the shell.
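As a rough illustration of the commands above, the following sketch runs `pwd`, `ls`, and `cd` in a throwaway directory. The directory and file names (`reads`, `sample_1.fastq`) are made up for the example and are not part of the lesson data.

```shell
# Work in a temporary directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"

pwd                         # print the current working directory

mkdir reads                 # hypothetical directory name
touch reads/sample_1.fastq  # hypothetical empty file

ls                          # list the contents of the current directory
ls -l reads                 # -l is an option (flag): long listing format

cd reads                    # move into the new directory
pwd                         # the path now ends in /reads
```

When typing interactively, `cd r` followed by the Tab key would complete to `cd reads/`, because only one entry in the directory starts with `r`.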
|
Navigating Files and Directories
|
The / , ~ , and .. characters represent important navigational shortcuts.
Hidden files and directories start with . and can be viewed using ls -a .
Relative paths specify a location starting from the current location, while absolute paths specify a location from the root of the file system.
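The shortcuts and path types above can be tried out in a small sketch like the one below. The `project/data` layout and the file names are assumptions chosen only for illustration.

```shell
# Build a small, hypothetical directory tree in a temporary location.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/project/data"
touch "$demo_dir/project/.hidden_notes" "$demo_dir/project/README.txt"

cd "$demo_dir/project/data"  # absolute path: starts from the root, /
cd ..                        # relative path: .. is the parent directory
ls                           # hidden entries are not shown
ls -a                        # -a also shows entries starting with .

cd data                      # relative path from the current location
cd ../..                     # go up two levels, back to the temporary directory

echo ~                       # ~ expands to your home directory
cd /                         # / on its own is the root of the file system
pwd
```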
|
Working with Files and Directories
|
You can view file contents using less , cat , head or tail .
The commands cp , mv , and mkdir are useful for manipulating existing files and creating new directories.
You can view file permissions using ls -l and change permissions using chmod .
The history command and the up arrow on your keyboard can be used to repeat recently used commands.
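A minimal sketch of these commands, using a made-up file called `notes.txt`, might look like this. `less` is left out because it opens an interactive pager.

```shell
# Create a small example file in a temporary directory.
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'line %s\n' 1 2 3 4 5 > notes.txt

cat notes.txt         # print the whole file
head -n 2 notes.txt   # print the first two lines
tail -n 1 notes.txt   # print the last line

mkdir backup                                       # create a new directory
cp notes.txt backup/notes_copy.txt                 # copy a file
mv backup/notes_copy.txt backup/notes_backup.txt   # rename (move) the copy

ls -l backup                        # long listing shows the permission string
chmod a-w backup/notes_backup.txt   # remove write permission for everyone
ls -l backup                        # the w flags are gone from the listing
```

Removing write permission from a backup is one way to guard against accidentally overwriting raw data.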
|
Redirection
|
grep is a powerful search tool with many options for customization.
> , >> , and | are different ways of redirecting output.
command > file redirects a command’s output to a file.
command >> file redirects a command’s output to a file without overwriting the existing contents of the file.
command_1 | command_2 redirects the output of the first command as input to the second command.
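The three forms of redirection can be compared side by side in a short sketch. The file `seqs.txt` and its contents are invented for this example.

```shell
# Create a small, hypothetical file of sequence records.
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'seq1 ACGT\nseq2 NNNN\nseq3 ACNN\n' > seqs.txt

# > writes the output to a file, replacing anything already in it.
grep NN seqs.txt > matches.txt

# >> appends to the file instead of overwriting it.
grep ACGT seqs.txt >> matches.txt

# | sends the output of grep to wc as its input.
grep NN seqs.txt | wc -l

cat matches.txt   # two NN matches followed by the appended ACGT match
```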
|
Writing Scripts
|
|
Project Organization
|
|
Assessing Read Quality
|
|
Trimming and Filtering
|
|
Variant Calling Workflow
|
Bioinformatics command line tools are collections of commands that can be used to carry out bioinformatics analyses.
To use most powerful bioinformatics tools, you’ll need to use the command line.
There are many different file formats for storing genomics data. It’s important to understand these file formats and know how to convert among them.
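As one small example of converting between formats, the sketch below turns a FASTQ file into FASTA using `awk`. It assumes every record occupies exactly four lines (header, sequence, `+`, quality), and the file names and reads are invented. Dedicated tools are usually a safer choice for real data.

```shell
# Create a tiny, hypothetical FASTQ file with two reads.
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n' > reads.fastq

# Line 1 of each 4-line record is the header: swap the leading @ for >.
# Line 2 is the sequence: print it unchanged.
# Lines 3 and 4 (separator and qualities) are dropped.
awk 'NR % 4 == 1 { print ">" substr($0, 2) }
     NR % 4 == 2 { print }' reads.fastq > reads.fasta

cat reads.fasta
```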
|
Automating a Variant Calling Workflow
|
|
R for microbial genomics
|
|
Introduction Day 3
|
|
Sequence assembly
|
|
Sequence Quality
|
|
Inspecting sequence graphs
|
|
Annotation
|
Genome annotation includes prediction of protein-coding genes, as well as other functional genome units.
It often starts by identifying open reading frames.
Predicted sequences are further analysed with BLAST.
Larger DNA sequences or genomes require automated prediction and annotation.
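To give a feel for what identifying open reading frames involves, here is a deliberately naive sketch using `awk`. It scans only the forward strand of a made-up sequence, reports each ATG that is followed in frame by a stop codon, and ignores minimum length, the reverse strand, and alternative start codons. It is not how any particular annotation tool works.

```shell
# Hypothetical toy sequence; real input would come from a FASTA file.
seq="CCATGAAATGATAGGC"

orfs=$(echo "$seq" | awk '{
  n = length($0)
  for (i = 1; i <= n - 2; i++) {
    if (substr($0, i, 3) != "ATG") continue       # look for a start codon
    for (j = i + 3; j <= n - 2; j += 3) {         # walk codon by codon
      codon = substr($0, j, 3)
      if (codon == "TAA" || codon == "TAG" || codon == "TGA") {
        # Print 1-based start, end, and the ORF including the stop codon.
        print i, j + 2, substr($0, i, j + 3 - i)
        break
      }
    }
  }
}')

echo "$orfs"
```

Each output line holds the start position, end position, and sequence of one candidate ORF. Start codons with no in-frame stop before the end of the sequence are skipped.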
|
Pangenome analysis
|
The microbial pangenome is the union of genes in genomes of interest.
The microbial core genome is the intersection of genes shared by genomes of interest.
Roary is a pipeline to determine genes of the pangenome.
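The union and intersection definitions can be made concrete with plain gene-name lists. The sketch below uses two invented genomes and standard `sort` and `comm`. It only compares names and is not a stand-in for Roary, which works from annotated assemblies rather than name lists.

```shell
# Use one fixed collation order so sort and comm agree with each other.
export LC_ALL=C

demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical gene lists, one gene name per line.
printf 'dnaA\ngyrB\nrecA\n' > genome_A.txt
printf 'dnaA\nrecA\nblaTEM\n' > genome_B.txt

# Pangenome: the union, every gene seen in at least one genome.
sort -u genome_A.txt genome_B.txt > pangenome.txt

# Core genome: the intersection, genes present in both genomes.
# comm requires sorted input; -12 keeps only lines common to both files.
sort genome_A.txt > genome_A.sorted
sort genome_B.txt > genome_B.sorted
comm -12 genome_A.sorted genome_B.sorted > core.txt

cat pangenome.txt
cat core.txt
```

Genes in `pangenome.txt` but not in `core.txt` are the accessory genes, present in only some of the genomes.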
|
Phylogenetic trees from the core genome
|
|
Bacterial GWAS
|
|
Wrapup Day 3 and 4
|
|