
Calculate Substitution Distance Between Amino Acid Sequences
Source: R/substitution.R
substitution.Rd
Computes the average substitution distance between two aligned amino acid sequences based on a specified substitution matrix. Currently supports the Grantham and FLU matrices.
Arguments
- seq1
A character string representing the first aligned amino acid sequence.
- seq2
A character string representing the second aligned amino acid sequence.
- method
Character string specifying the substitution matrix to use. Supported values are "grantham" and "flu" (case-insensitive).
- ambiguous_residues
A character string of ambiguous residues to remove before computing distance.
Details
This function first removes ambiguous residues from both sequences using remove_ambiguous_residues, validates the remaining residues, and computes pairwise distances using a substitution matrix. The result is normalized by sequence length.
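The steps described above can be illustrated with a minimal, self-contained sketch. It is not this package's implementation: the helper name sketch_distance, the default set of ambiguous residues, the rule that a position is dropped when either sequence is ambiguous there, and the toy matrix values are all assumptions made for illustration. The toy values are placeholders, not the published Grantham or FLU entries.

```r
# Toy symmetric substitution matrix over four residues.
# Values are illustrative placeholders, NOT Grantham or FLU entries.
residues <- c("A", "L", "I", "V")
toy_matrix <- matrix(
  c(0, 4, 4, 2,
    4, 0, 1, 2,
    4, 1, 0, 1,
    2, 2, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(residues, residues)
)

# Hypothetical helper: mean per-position distance between two aligned sequences.
sketch_distance <- function(seq1, seq2, sub_matrix, ambiguous_residues = "X*-") {
  a <- strsplit(toupper(seq1), "")[[1]]
  b <- strsplit(toupper(seq2), "")[[1]]
  if (length(a) != length(b)) {
    stop("Sequences must be aligned to the same length.")
  }

  # Assumption: drop a position if either sequence is ambiguous there,
  # so the two sequences stay aligned after filtering.
  ambiguous <- strsplit(ambiguous_residues, "")[[1]]
  keep <- !(a %in% ambiguous) & !(b %in% ambiguous)
  a <- a[keep]
  b <- b[keep]
  if (length(a) == 0) {
    return(NA_real_)
  }

  # Validate that every remaining residue is covered by the matrix.
  unknown <- setdiff(c(a, b), rownames(sub_matrix))
  if (length(unknown) > 0) {
    stop("Unsupported residues: ", paste(unknown, collapse = ", "))
  }

  # Look up one distance per aligned position, then normalize by length.
  sum(sub_matrix[cbind(a, b)]) / length(a)
}

sketch_distance("ALIV", "ALVV", toy_matrix)  # 0.25 (one I/V mismatch over 4 positions)
sketch_distance("ALXV", "AIXV", toy_matrix)  # 1/3 (the X position is dropped first)
```

Indexing the matrix with a two-column character matrix, cbind(a, b), retrieves one entry per aligned position in a single vectorized lookup, which avoids an explicit loop over positions.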
Support for additional matrices, such as BLOSUM and Sneath's index, is planned. If you need a specific substitution matrix, please let us know.
References
- Grantham, R. (1974). Amino acid difference formula to help explain protein evolution. Science, 185(4154), 862-864. doi:10.1126/science.185.4154.862
- Dang, C.C., Le, Q.S., Gascuel, O., & Le, V.S. (2010). FLU, an amino acid substitution model for influenza proteins. BMC Evolutionary Biology, 10, 99. doi:10.1186/1471-2148-10-99