Skip to content

Commit

Permalink
Fixing remaining BiocCheck and readme
Browse files Browse the repository at this point in the history
  • Loading branch information
qpmnguyen committed Mar 3, 2022
1 parent 9a6652c commit 3b93776
Show file tree
Hide file tree
Showing 5 changed files with 58 additions and 28 deletions.
13 changes: 8 additions & 5 deletions R/cbea_internals.R
Original file line number Diff line number Diff line change
Expand Up @@ -163,21 +163,23 @@ get_diagnostics <- function(env = caller_env()){

req_objs <- c("output", "distr", "parametric", "raw_scores", "perm_scores", "adj")
obj_names <- env_names(env)
sapply(req_objs, function(x){
vapply(req_objs, function(x){
if(!x %in% obj_names){
stop(x, " not found")
}
})
return(0)
}, FUN.VALUE = 0)
if (env$parametric == TRUE){
add_objs <- c("final_distr", "perm_distr")
if (env$adj == TRUE){
add_objs <- c(add_objs, "unperm_distr")
}
sapply(add_objs, function(x){
vapply(add_objs, function(x){
if(!x %in% obj_names){
stop(x, " not found")
}
})
return(0)
}, FUN.VALUE = 0)

}
if (env$output == "raw" | env$parametric == FALSE){
Expand Down Expand Up @@ -544,7 +546,8 @@ get_adj_mnorm <- function(perm, unperm, verbose = FALSE, fix_comp = "none") {
mu = perm$mu, mean = perm_mean
)
if (verbose == TRUE) {
print(paste("Total sd is", unperm_sd, "and estimated sd is", estim_sd))
message("Total sd is ", unperm_sd, " and estimated sd is ",
estim_sd)
}
param <- list(mu = perm$mu, sigma = est_sig, lambda = perm$lambda)
return(param)
Expand Down
4 changes: 2 additions & 2 deletions R/utils.R
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ pmnorm <- function(q, mu, sigma, lambda, log = FALSE, verbose = FALSE) {
q <- as.vector(q)
n_components <- length(sigma)
if (verbose == TRUE) {
message(paste(n_components, "components!", "\n"))
message(n_components, " components!", "\n")
}
comp <- vector(mode = "list", length = n_components)
for (i in seq_len(n_components)) {
Expand All @@ -53,7 +53,7 @@ dmnorm <- function(x, mu, sigma, lambda, log = FALSE, verbose = FALSE) {
x <- as.vector(x)
n_components <- length(sigma)
if (verbose == TRUE) {
message(paste(n_components, "components!", "\n"))
message(n_components, "components!", "\n")
}
comp <- vector(mode = "list", length = n_components)
for (i in seq_len(n_components)) {
Expand Down
19 changes: 8 additions & 11 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -18,15 +18,13 @@ knitr::opts_chunk$set(
[![Codecov test coverage](https://codecov.io/gh/qpmnguyen/CBEA/branch/master/graph/badge.svg)](https://codecov.io/gh/qpmnguyen/CBEA?branch=master)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![R-CMD-check](https://github.com/qpmnguyen/CBEA/workflows/R-CMD-check-bioc/badge.svg)](https://github.com/qpmnguyen/CBEA/actions)
[![BioC status](http://www.bioconductor.org/shields/build/release/bioc/CBEA.svg)](https://bioconductor.org/checkResults/release/bioc-LATEST/CBEA)
<!-- [![BioC status](http://www.bioconductor.org/shields/build/release/bioc/CBEA.svg)](https://bioconductor.org/checkResults/release/bioc-LATEST/CBEA) -->
<!-- badges: end -->

### Quang Nguyen

The `CBEA` package provides basic functionality to perform taxonomic enrichment analysis in R. This package mainly supports the `CBEA` method, and provides additional support for generating sets for analyses using approaches commonly used in the gene set testing literature.

**This package is under ongoing development and might not be stable at the moment. Only install the development version if R CMD CHECK badge is green (passed) or if the error is due to dependency installation.**

### Installation

And the development version from [GitHub](https://github.com/) with:
Expand All @@ -38,13 +36,12 @@ devtools::install_github("qpmnguyen/CBEA")

### Features

This package supports the implementation of the CBEA approach (formerly known as cILR) for taxonomic enrichment analysis.

### Dependency Graph
This package implements the CBEA approach for performing set-based enrichment analysis for microbiome relative abundance data. A preprint of the package can be found [on bioXriv](https://www.biorxiv.org/content/10.1101/2021.09.07.459294v1.full). In summary, CBEA (Competitive Balances for taxonomic Enrichment Analysis) provides an estimate of the activity of a set by transforming an input taxa-by-sample data matrix into a corresponding set-by-sample data matrix. The resulting output can be used for additional downstream analyses such as differential abundance, classification, clustering, etc. using set-based features instead of the original units.

```{r, include = TRUE, echo=FALSE}
#dd <- deepdep::deepdep(package = "CBEA", local = TRUE, depth = 2, dependency_type = "Imports")
#ggplot2::ggsave(plot = deepdep::plot_dependencies(dd), filename = "man/figures/dep_new.png")
The transformation that CBEA applies is based on the isometric log ratio transformation:
$$
CBEA_{i,\mathbb{S}} = \sqrt{\frac{|\mathbb{S}||\mathbb{S_c}|}{|\mathbb{S}| + |\mathbb{S_c}|}} \ln \frac{g(X_{i,j | j\in \mathbb{S}})}{g(X_{i,j | j \notin \mathbb{S}})}
$$
Where $\mathbb{S}$ is the set of interest, $\mathbb{S}_C$ is it's complement, $g()$ is the geometric mean operation, and $X$ is the original data matrix where $i$ is the index representing samples and $j$ is the index representing variables (or taxa).

knitr::include_graphics("man/figures/dep_new.png")
```
The inference procedure is performed through estimating the null distribution of the test statistic. This can be done either via permutations or a parametric fit of a distributional form on the permuted scores. Users can also adjust for variance inflation due to inter-taxa correlation. Please refer to the main manuscript for any additional details.
49 changes: 39 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,8 +11,7 @@ coverage](https://codecov.io/gh/qpmnguyen/CBEA/branch/master/graph/badge.svg)](h
state and is being actively
developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![R-CMD-check](https://github.com/qpmnguyen/CBEA/workflows/R-CMD-check-bioc/badge.svg)](https://github.com/qpmnguyen/CBEA/actions)
[![BioC
status](http://www.bioconductor.org/shields/build/release/bioc/CBEA.svg)](https://bioconductor.org/checkResults/release/bioc-LATEST/CBEA)
<!-- [![BioC status](http://www.bioconductor.org/shields/build/release/bioc/CBEA.svg)](https://bioconductor.org/checkResults/release/bioc-LATEST/CBEA) -->
<!-- badges: end -->

### Quang Nguyen
Expand All @@ -22,10 +21,6 @@ enrichment analysis in R. This package mainly supports the `CBEA`
method, and provides additional support for generating sets for analyses
using approaches commonly used in the gene set testing literature.

**This package is under ongoing development and might not be stable at
the moment. Only install the development version if R CMD CHECK badge is
green (passed) or if the error is due to dependency installation.**

### Installation

And the development version from [GitHub](https://github.com/) with:
Expand All @@ -37,9 +32,43 @@ devtools::install_github("qpmnguyen/CBEA")

### Features

This package supports the implementation of the CBEA approach (formerly
known as cILR) for taxonomic enrichment analysis.
This package implements the CBEA approach for performing set-based
enrichment analysis for microbiome relative abundance data. A preprint
of the package can be found [on
bioXriv](https://www.biorxiv.org/content/10.1101/2021.09.07.459294v1.full).
In summary, CBEA (Competitive Balances for taxonomic Enrichment
Analysis) provides an estimate of the activity of a set by transforming
an input taxa-by-sample data matrix into a corresponding set-by-sample
data matrix. The resulting output can be used for additional downstream
analyses such as differential abundance, classification, clustering,
etc. using set-based features instead of the original units.

The transformation that CBEA applies is based on the isometric log ratio
transformation:

![
CBEA\_{i,\\mathbb{S}} = \\sqrt{\\frac{\|\\mathbb{S}\|\|\\mathbb{S\_c}\|}{\|\\mathbb{S}\| + \|\\mathbb{S\_c}\|}} \\ln \\frac{g(X\_{i,j \| j\\in \\mathbb{S}})}{g(X\_{i,j \| j \\notin \\mathbb{S}})}
](https://latex.codecogs.com/png.image?%5Cdpi%7B110%7D&space;%5Cbg_white&space;%0ACBEA_%7Bi%2C%5Cmathbb%7BS%7D%7D%20%3D%20%5Csqrt%7B%5Cfrac%7B%7C%5Cmathbb%7BS%7D%7C%7C%5Cmathbb%7BS_c%7D%7C%7D%7B%7C%5Cmathbb%7BS%7D%7C%20%2B%20%7C%5Cmathbb%7BS_c%7D%7C%7D%7D%20%5Cln%20%5Cfrac%7Bg%28X_%7Bi%2Cj%20%7C%20j%5Cin%20%5Cmathbb%7BS%7D%7D%29%7D%7Bg%28X_%7Bi%2Cj%20%7C%20j%20%5Cnotin%20%5Cmathbb%7BS%7D%7D%29%7D%0A "
CBEA_{i,\mathbb{S}} = \sqrt{\frac{|\mathbb{S}||\mathbb{S_c}|}{|\mathbb{S}| + |\mathbb{S_c}|}} \ln \frac{g(X_{i,j | j\in \mathbb{S}})}{g(X_{i,j | j \notin \mathbb{S}})}
")

### Dependency Graph
Where
![\\mathbb{S}](https://latex.codecogs.com/png.image?%5Cdpi%7B110%7D&space;%5Cbg_white&space;%5Cmathbb%7BS%7D "\mathbb{S}")
is the set of interest,
![\\mathbb{S}\_C](https://latex.codecogs.com/png.image?%5Cdpi%7B110%7D&space;%5Cbg_white&space;%5Cmathbb%7BS%7D_C "\mathbb{S}_C")
is it’s complement,
![g()](https://latex.codecogs.com/png.image?%5Cdpi%7B110%7D&space;%5Cbg_white&space;g%28%29 "g()")
is the geometric mean operation, and
![X](https://latex.codecogs.com/png.image?%5Cdpi%7B110%7D&space;%5Cbg_white&space;X "X")
is the original data matrix where
![i](https://latex.codecogs.com/png.image?%5Cdpi%7B110%7D&space;%5Cbg_white&space;i "i")
is the index representing samples and
![j](https://latex.codecogs.com/png.image?%5Cdpi%7B110%7D&space;%5Cbg_white&space;j "j")
is the index representing variables (or taxa).

<img src="man/figures/dep_new.png" width="100%" />
The inference procedure is performed through estimating the null
distribution of the test statistic. This can be done either via
permutations or a parametric fit of a distributional form on the
permuted scores. Users can also adjust for variance inflation due to
inter-taxa correlation. Please refer to the main manuscript for any
additional details.
1 change: 1 addition & 0 deletions vignettes/basic_usage.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,7 @@ If there are any issues with the installation procedure or package features, the
library("CBEA")
library(BiocSet)
library(tidyverse)
set.seed(1020)
```

## Loading sample data
Expand Down

0 comments on commit 3b93776

Please sign in to comment.