Commit

Merge pull request #9 from mrc-ide/develop
added function to subset based on a position string
bobverity authored Jan 16, 2025
2 parents 8750e55 + 2d649bd commit a00e521
Showing 5 changed files with 63 additions and 2 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Package: variantstring
Type: Package
Title: Functions for working with variant string format
-Version: 1.5.1
+Version: 1.6.0
Authors@R: c(
person("Bob", "Verity", email = "r.verity@imperial.ac.uk", role = c("aut", "cre"))
)
1 change: 1 addition & 0 deletions NAMESPACE
@@ -15,6 +15,7 @@ export(order_position_string)
export(order_variant_string)
export(position_from_variant_string)
export(position_to_long)
export(subset_position)
export(variant_to_long)
import(dplyr)
importFrom(stats,na.omit)
41 changes: 41 additions & 0 deletions R/main.R
@@ -7,6 +7,7 @@
# position_to_long
# long_to_position
# position_from_variant_string
# subset_position
# order_variant_string
# order_position_string
# count_het_loci
@@ -730,6 +731,46 @@ position_from_variant_string <- function(x) {

}

#------------------------------------------------
#' @title Subset position of a variant string
#'
#' @description
#' Given a vector of variant strings and a single position string, subsets all
#' variant strings to only the genes and codons in the position string. Retains
#' read counts at these positions if present.
#'
#' @param position_string a single position string.
#' @param variant_strings a variant string or vector of variant strings.
#'
#' @import dplyr
#'
#' @export

subset_position <- function(position_string, variant_strings) {

  # checks
  check_position_string(position_string)
  stopifnot(length(position_string) == 1)
  check_variant_string(variant_strings)

  # get position string in long form
  df_position <- position_to_long(position_string)[[1]]

  ret <- mapply(function(x) {
    df_diff <- anti_join(df_position, x, by = c("gene", "pos"))
    if (nrow(df_diff) == 0) {
      df_sub <- semi_join(x, df_position, by = c("gene", "pos"))
      ret <- long_to_variant(list(df_sub))
    } else {
      ret <- NA
    }
    ret
  }, variant_to_long(variant_strings), SIMPLIFY = FALSE) |>
    unlist()

  return(ret)
}

#------------------------------------------------
#' @title Reorders a variant string
#'
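
The coverage check used by `subset_position()` above can be illustrated on toy long-form data frames. This is a minimal sketch rather than the package's own code path: the `gene` and `pos` columns match the join keys in the diff, but the `aa` column, the helper name `subset_long()`, and all example values are assumptions made up for illustration.

```r
library(dplyr)

# Positions requested (long form of a hypothetical position string)
df_position <- data.frame(gene = c("crt", "crt"), pos = c(72, 73))

# Two hypothetical variants in long form; `aa` is an assumed column name
variant_full <- data.frame(gene = c("crt", "crt", "crt"),
                           pos  = c(72, 73, 74),
                           aa   = c("C", "V", "M"))
variant_partial <- data.frame(gene = "crt", pos = 72, aa = "C")

# Hypothetical helper: returns the rows of x at the requested positions,
# or NULL if any requested position is missing from x
subset_long <- function(x, df_position) {
  # requested positions that x does not cover
  df_missing <- anti_join(df_position, x, by = c("gene", "pos"))
  if (nrow(df_missing) > 0) {
    return(NULL)
  }
  # keep only the rows of x that match a requested position
  semi_join(x, df_position, by = c("gene", "pos"))
}

res_full <- subset_long(variant_full, df_position)
res_partial <- subset_long(variant_partial, df_position)
```

Here `res_full` keeps only the rows at positions 72 and 73, while `res_partial` is `NULL` because position 73 is absent from that variant. The function in the diff returns `NA` in the same situation.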
3 changes: 2 additions & 1 deletion README.md
@@ -12,6 +12,7 @@ In brief, this package contains functions that...

- Check for correctly formatted variant and position strings.
- Extract a position string from a variant string.
- Subset a variant string based on a position string.
- Compare two variant strings to look for a match (useful in numerator of prevalence calculation). Reports if this is an exact match or an ambiguous match.
- Compare a position string against a variant string to look for a match (useful in denominator of prevalence calculation).
- Convert between string format and a long-form data.frame format.
@@ -20,4 +21,4 @@ There are also a few more utility functions not listed here - see the package he

## Release history

-The current version is 1.5.1, released 16 Jan 2025.
+The current version is 1.6.0, released 16 Jan 2025.
18 changes: 18 additions & 0 deletions man/subset_position.Rd

Some generated files are not rendered by default.