README.Rmd

---
output: github_document
author: Marco Cacciabue, Melina Obregón, Axel N. Fenoglio
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}

app_name <- "voRtex"
```

# **`r app_name `** <img src='man/Figures/hex2.png' style="float:right; height:200px;"/> 


## **`r app_name `** is an R package

voRtex is a package created with the purpose to help in the data analysis of large groups of foot and mouth desease rna sequences.

this package contains a collection of functions designed to manage and analyze sample data in an efficient way.
It manages VCF files, bed files, fasta files, DNAStringSet and else.

## Examples

#### VCFToDataFrame
VCFToDataFrame.R creates a data frame off a vcf file, containing the information
of allel position, allel frecuency and depth coverage 

``` r

  file <- system.file("extdata", "variant_file.vcf", package = "voRtex", mustWork = TRUE)
  vcf_data <- VariantAnnotation::readVcf(file)
  vcfdataframe<-VCFToDataFrame(vcf_data)
  vcfdataframe

```
#### Compute_coverage
With compute_coverage.R we can create a dataframe containing the average position and coverage of a sequence, using a .bed file and giving the function a window size of choosing

``` r

FilePath <- system.file("extdata", "SRR12664421_full_coverage.bed",
                          package = "voRtex", mustWork = TRUE)

data <- read.table(FilePath, col.names = c("reference", "startpos", "endpos", "coverage"))

data_processed<-compute_coverage(data, 50,TRUE)


```

#### ggplot_heatmap
Then with ggplot_heatmap.R we can create a heatmap based on the data frame created with compute_coverage

``` r

color  <- c("#D53E4F","#F46D43","#FDAE61","#FEE08B","#E6F598","#ABDDA4","#66C2A5","#3288BD")

Rplot<-ggplot_heatmap(inputdata=data_processed,
               color_pal = color)

Rplot
```
Resulting in this beautiful heat map

![SRR12664421_Heatmap](Rplot.png)

#### NreadFilter
With NreadFilter we use a .bed file containing the coverage of a sequence, and it creates a dateframe with that information and then filters the rows based on a filter value, and returns a list containing the filtered dataframe and the percentage of bases that passed the filter.

``` r
 FilePath <- system.file("extdata", "SRR12664421_full_coverage.bed",
                         package = "voRtex", mustWork = TRUE)
 OGDataFrame <- read.table(FilePath,
                          col.names = c("reference","startpos","endpos","nreads"))
 salida <- NreadFilter(OGDataFrame,5000)
 
```

#### NCountFilter
With NCountFilter we take a DNAStringSet file and filter it according to the minimum number of "N" (base not readed) we want.

``` r
  FilePath <- system.file("extdata",
                         "renamed_all.fasta",
                         package = "voRtex",
                         mustWork = TRUE)

 DNASequence <- Biostrings::readDNAStringSet(FilePath)

 NCountFilter(DNASequence,1000)
 
```