This function removes positions containing specified ambiguous residues (e.g., "X") from a character vector of aligned sequences. Removal is performed listwise: any position that contains an ambiguous residue in at least one sequence is removed from every sequence, so all sequences remain aligned.
Value
A character vector of the same length as `seqs`, with the same names (if any), where all positions containing ambiguous residues in any sequence have been removed.
Details
If `ambiguous_residues` is an empty string `""`, no positions will be removed.
Some distance metrics also require gapped positions to be dropped. A common modification in that case is adding a gap character to `ambiguous_residues`, e.g., `"xX?-"`.
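The listwise removal described above can be illustrated with a minimal base R sketch. This is not the package's implementation: `drop_ambiguous_positions_sketch()` is a hypothetical helper, and its default of `"xX?"` is an assumption inferred from the gap example above. It also assumes all sequences are non-empty and of equal width.

```r
# Hypothetical helper for illustration only; not the package's implementation.
drop_ambiguous_positions_sketch <- function(seqs, ambiguous_residues = "xX?") {
  # Assumption: sequences are aligned, i.e., they all have the same width.
  stopifnot(is.character(seqs), length(unique(nchar(seqs))) <= 1)

  # Split the residue string into single characters; "" yields none.
  bad <- strsplit(ambiguous_residues, "", fixed = TRUE)[[1]]
  if (length(bad) == 0 || length(seqs) == 0) return(seqs)

  # One row per sequence, one column per alignment position.
  residues <- do.call(rbind, unname(strsplit(seqs, "", fixed = TRUE)))

  # Keep a column only if no sequence has an ambiguous residue there.
  keep <- !apply(residues, 2, function(column) any(column %in% bad))

  # Paste the surviving columns back into one string per sequence.
  out <- vapply(
    seq_len(nrow(residues)),
    function(i) paste(residues[i, keep], collapse = ""),
    character(1)
  )
  names(out) <- names(seqs)
  out
}

# Treating gaps as ambiguous drops position 3 from both sequences.
gapped <- c(a = "AC-DFG", b = "ACEDFG")
drop_ambiguous_positions_sketch(gapped, ambiguous_residues = "xX?-")
```

Because the column mask `keep` is computed once and applied to every row, the output sequences always have equal width, which is what keeps them aligned.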
Examples
seqs <- c(a = "ACDXFG", b = "AXCXFG", c = "ACDYFG")
remove_ambiguous_residues(seqs)
#> a b c
#> "ADFG" "ACFG" "ADFG"
# Positions 2 and 4 contain "X" in at least one sequence, so both are
# removed from every sequence: c(a = "ADFG", b = "ACFG", c = "ADFG")