Get an alignment summary from the output of the STAR program.
Source: R/general_bioinfo.R
get_star_align_log_summary.Rd
STAR is a widely used alignment program for RNA-seq and other high-throughput genomics data. Once the alignment finishes, the obvious question is what percentage of the reads mapped to the reference. This information is logged in the *Log.final.out file, one of several files STAR generates at the end of an alignment. This function parses the *Log.final.out file and returns the mapping statistics as a data frame (tibble) with one row per statistic.
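The parsing step can be sketched in base R. This is only an illustration, not the parcutils implementation: `parse_star_log_sketch()` is a hypothetical helper, and it assumes each statistic in *Log.final.out is written as a `name | value` line, with section headers such as `UNIQUE READS:` carrying no `|`.

```r
# Hypothetical helper for illustration; NOT the function shipped in parcutils.
parse_star_log_sketch <- function(lines) {
  # Keep only "name | value" lines; section headers contain no "|".
  kv <- grep("|", lines, fixed = TRUE, value = TRUE)
  parts <- strsplit(kv, "|", fixed = TRUE)
  data.frame(
    type = trimws(vapply(parts, `[`, character(1), 1)),
    val  = trimws(vapply(parts, `[`, character(1), 2)),
    stringsAsFactors = FALSE
  )
}

# A few lines in the assumed Log.final.out layout (values taken from the example output).
demo_lines <- c(
  "                          Number of input reads |\t41936201",
  "                                  UNIQUE READS:",
  "                   Uniquely mapped reads number |\t40090105",
  "                        Uniquely mapped reads % |\t95.60%"
)
parsed <- parse_star_log_sketch(demo_lines)
print(parsed)
```

With a real log file, `readLines()` on the file path would supply the `lines` argument. Note that this sketch keeps every `name | value` line, whereas the function documented here returns a selected set of 13 statistics.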
Examples
star_align_log_file <- system.file("extdata", "a_Log.final.out", package = "parcutils")
x <- get_star_align_log_summary(log_file = star_align_log_file)
print(x)
#> # A tibble: 13 × 2
#>    type                                    val
#>    <chr>                                   <chr>
#>  1 Number of input reads                   41936201
#>  2 Uniquely mapped reads number            40090105
#>  3 Uniquely mapped reads %                 95.60%
#>  4 Average input read length               300
#>  5 Average mapped length                   298.24
#>  6 Number of reads mapped to multiple loci 900879
#>  7 % of reads mapped to multiple loci      2.15%
#>  8 Number of reads mapped to too many loci 9335
#>  9 % of reads mapped to too many loci      0.02%
#> 10 Number of reads unmapped: too short     921464
#> 11 % of reads unmapped: too short          2.20%
#> 12 Number of reads unmapped: other         14418
#> 13 % of reads unmapped: other              0.03%
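
Both columns are character, so the values need converting before any arithmetic or plotting. A minimal sketch, assuming percentages should stay on the 0-100 scale; `x_demo` and `val_num` are hypothetical names, and a few rows are rebuilt by hand so the snippet runs without parcutils installed:

```r
# Hand-built stand-in for a few rows of the summary printed above.
x_demo <- data.frame(
  type = c("Number of input reads", "Uniquely mapped reads %", "Average mapped length"),
  val  = c("41936201", "95.60%", "298.24"),
  stringsAsFactors = FALSE
)

# Strip a trailing "%" if present, then convert to numeric.
x_demo$val_num <- as.numeric(sub("%$", "", x_demo$val))
print(x_demo)
```

The same `sub()` and `as.numeric()` step should apply to the `val` column of `x` from the example, since every value shown there is either a plain number or a number followed by `%`.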