A sequence logo is a graphical representation of the sequence conservation of nucleotides (in a strand of DNA/RNA) or amino acids (in protein sequences). Sequence logos are useful in diverse omics fields, such as ChIP-seq and m6A-seq.
In R, there are many packages that can generate a beautiful sequence logo. Here I use ggseqlogo.
# First, install ggseqlogo from CRAN
install.packages("ggseqlogo")
# or from GitHub using the devtools package:
devtools::install_github("omarwagih/ggseqlogo")
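Assume you have a FASTA file that contains 100 short peptides of 15 amino acids each. I use seqinr to import the sequences into R.
library(seqinr)
# seqtype = "AA" tells read.fasta() that the file holds protein sequences
sample_seq <- read.fasta("~/Downloads/sample_seq.fasta", seqtype = "AA")
# Collapse each sequence into a single upper-case string, one row per peptide
sample_seq_df <- data.frame(
  IDs = names(sample_seq),
  Sequences = unlist(lapply(sample_seq, function(x) toupper(paste0(x, collapse = "")))),
  stringsAsFactors = FALSE
)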
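A logo is built position by position, so it is worth checking the imported sequences before plotting. The sketch below uses base R only. The vector `seqs` is a hypothetical stand-in for `sample_seq_df$Sequences`, and the check for non-standard letters is my own suggestion rather than a required step.

```r
# Hypothetical stand-in for sample_seq_df$Sequences
seqs <- c("MKTAYIAKQRQISFV", "MKSAYLAKQRNISFV", "MRTAYIGKQRQLSFI")

# All sequences should have the same length (15 in this example)
seq_lengths <- nchar(seqs)
all_same_length <- length(unique(seq_lengths)) == 1

# Flag sequences containing letters outside the 20 standard amino acids
# (for example X, B, Z, or gap characters)
standard_aa <- "ACDEFGHIKLMNPQRSTVWY"
has_nonstandard <- grepl(paste0("[^", standard_aa, "]"), seqs)

table(seq_lengths)       # distribution of sequence lengths
seqs[has_nonstandard]    # sequences that may need cleaning
```

If `all_same_length` is `FALSE`, trimming or aligning the peptides first is probably needed before drawing a logo.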
Plot sequence logo
You will generate a sequence logo as follows:
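The plotting call itself does not appear in the text above, so here is a minimal sketch of what it might look like. It assumes the ggseqlogo package is installed. The `peptides` vector is a made-up stand-in so the snippet runs on its own; in the workflow above you would pass `sample_seq_df$Sequences` instead.

```r
library(ggseqlogo)

# Hypothetical example peptides (15 amino acids each), for illustration only.
# Replace with sample_seq_df$Sequences when working from the FASTA file.
peptides <- c(
  "MKTAYIAKQRQISFV",
  "MKSAYLAKQRNISFV",
  "MRTAYIGKQRQLSFI",
  "MKTGYIAKERQISYV",
  "MKTAFIAKQKQISFV"
)

# seq_type = "aa" states explicitly that these are amino acids;
# method = "bits" plots information content per position
p <- ggseqlogo(peptides, seq_type = "aa", method = "bits")
print(p)
```

Setting `method = "probability"` instead plots letter frequencies rather than information content.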
You can easily customize the plots using annotation tools. Details can be found in the ggseqlogo documentation: https://omarwagih.github.io/ggseqlogo/
WebLogo (https://weblogo.berkeley.edu/logo.cgi) is also popular, and lets you create a sequence logo online.
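As one illustration of customization, the sketch below adds ggplot2 annotations to a logo built with `geom_logo()` and `theme_logo()` from ggseqlogo. The peptides, the highlighted positions, and the label text are all hypothetical choices made for this example.

```r
library(ggplot2)
library(ggseqlogo)

# Hypothetical example peptides (15 amino acids each), for illustration only
peptides <- c(
  "MKTAYIAKQRQISFV",
  "MKSAYLAKQRNISFV",
  "MRTAYIGKQRQLSFI",
  "MKTGYIAKERQISYV",
  "MKTAFIAKQKQISFV"
)

p <- ggplot() +
  # Draw the logo as a layer so other ggplot2 layers can be combined with it
  geom_logo(peptides, seq_type = "aa", method = "probability") +
  # Highlight positions 6 to 8 with a translucent rectangle (arbitrary choice)
  annotate("rect", xmin = 5.5, xmax = 8.5, ymin = -0.05, ymax = 1.05,
           alpha = 0.1, colour = "black", fill = "yellow") +
  # Add a text label above the highlighted region
  annotate("text", x = 7, y = 1.1, label = "positions 6-8") +
  theme_logo()
print(p)
```

Because the result is a regular ggplot object, other ggplot2 functions such as `ggtitle()` or `ggsave()` can be applied to it in the usual way.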