double | log_2 (double a) |
long | seq_limit (Bmolgroup *molgroup, Bstring &refseq) | Limits the selection to the reference sequence in an aligned set.
Matrix | seq_aligned_identity (Bmolgroup *molgroup) | Calculates the pairwise identities between aligned sequences.
Matrix | seq_aligned_similarity (Bmolgroup *molgroup, double threshold, Bresidue_matrix *simat) | Calculates the pairwise similarities between aligned sequences.
long | seq_select (Bmolgroup *molgroup, long minlen, long maxlen) | Selects sequences within a range of lengths.
long | seq_select (Bmolgroup *molgroup, Matrix mat, long ref, double cutoff) | Selects sequences based on a comparison matrix of aligned sequences.
long | seq_delete (Bmolgroup *molgroup, Matrix mat) | Deletes non-selected sequences and the corresponding elements of a comparison matrix.
string | seq_aligned_profile (Bmolgroup *molgroup) | Generates a PROSITE format profile from an aligned set of sequences.
int | seq_aligned_information (Bmolgroup *molgroup, int window, Bstring &psfile) | Calculates the sequence logo representation for an alignment.
int | seq_aligned_hydrophobicity (Bmolgroup *molgroup, int window, double threshold, Bstring &hphobfile, Bstring &psfile) | Calculates the average hydrophobicity at every position in an alignment.
vector< Complex< float > > | seq_frequency_analysis (long win, long start, long end, vector< double > &data) | Fourier transforms a vector for frequency analysis.
vector< double > | seq_aligned_weight (Bmolgroup *molgroup) |
Matrix | seq_correlated_mutation (Bmolgroup *molgroup, Bstring &refseqid, double cutoff, Bstring &simfile) | Correlated mutation analysis of an alignment.
Analysis of protein sequences.
- Author
- Bernard Heymann
- Date
- Created: 19990123
- Modified: 20210426
string seq_aligned_profile (Bmolgroup *molgroup)
Generates a PROSITE format profile from an aligned set of sequences.
- Parameters
-
*molgroup | the set of sequences. |
- Returns
- string profile in PROSITE format.
At each position in the alignment, the number of distinct residue types is counted. If more than 3 residue types are represented at a position, or there is a gap, the position is designated as variable by an "x". The final profile contains 1-3 residue type possibilities at highly conserved positions, interspersed with variable-length gaps.
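The column rule described above can be sketched as follows. This is a minimal illustration, not the Bsoft implementation: the function name sketch_profile, the use of plain strings with '-' as the gap character, and the output formatting (elements joined by '-', alternatives in square brackets, runs of variable positions collapsed to x(n)) are all assumptions made for the example.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not part of Bsoft: builds a PROSITE-like pattern
// from equal-length aligned sequences, with '-' marking a gap.
std::string sketch_profile(const std::vector<std::string>& aln)
{
    assert(!aln.empty());
    const std::size_t len = aln[0].size();
    std::string profile;
    std::size_t nx = 0;                 // length of the current run of variable columns

    // Append one pattern element, separated from the previous one by '-'.
    auto append = [&profile](const std::string& element) {
        if (!profile.empty()) profile += '-';
        profile += element;
    };
    // Emit the pending run of variable columns as "x" or "x(n)".
    auto flush_run = [&]() {
        if (nx == 1) append("x");
        else if (nx > 1) append("x(" + std::to_string(nx) + ")");
        nx = 0;
    };

    for (std::size_t i = 0; i < len; ++i) {
        std::set<char> types;           // distinct residue types in column i
        bool gap = false;
        for (const auto& seq : aln) {
            assert(seq.size() == len);  // all sequences must be aligned
            if (seq[i] == '-') gap = true;
            else types.insert(seq[i]);
        }
        if (gap || types.size() > 3) {  // variable position
            ++nx;
            continue;
        }
        flush_run();
        std::string element(types.begin(), types.end());
        if (element.size() > 1) element = "[" + element + "]";
        append(element);
    }
    flush_run();
    return profile;
}
```

For the four aligned sequences ACDE, ACEF, AC-G and ASHH, this sketch returns A-[CS]-x(2): the first column is fully conserved, the second has two residue types, the third contains a gap and the fourth has four residue types.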
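Matrix seq_correlated_mutation (Bmolgroup *molgroup, Bstring &refseqid, double cutoff, Bstring &simfile)
Correlated mutation analysis of an alignment.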
- Parameters
-
*molgroup | the set of aligned sequences. |
refseqid | reference sequence to report on. |
cutoff | cutoff for reporting correlated mutations. |
&simfile | similarity matrix file. |
- Returns
- Matrix the analysis result matrix.
Reference: Gobel, Sander & Schneider (1994) Proteins 18, 309-317.
Mutation (residue variation) correlation is defined as:
\[
r(i,j) = \frac{1}{m^2 \, o(i) \, o(j)} \sum_{k,l} w(k,l) \, \bigl( s(i,k,l) - \langle s(i) \rangle \bigr) \bigl( s(j,k,l) - \langle s(j) \rangle \bigr)
\]
where:
    m: number of sequences
    o(i): standard deviation of similarities at alignment position i
    w(k,l): weight for sequences k and l (1 - fractional identity: see function seq_aligned_identity)
    s(i,k,l): similarity for alignment position i between sequences k and l
    <s(i)>: average similarity at alignment position i
Individual high-scoring correlations (using the given cutoff value) are reported as follows:
    Res1 Num1 Res2 Num2 Total Corr
    T    9    I    17   210   0.631
    TAIIIVVVIVVVIVIIIIIII
    IILLLLLLLLLLLLLLLLLLL
The first four values give the type and alignment position of the two correlating residues. The total is the number of comparisons made, at most m*(m-1)/2. The last number is the correlation coefficient. The following two lines give the residues at the two alignment positions for all the sequences, so the user can see on what basis the correlation is high.
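The correlation for a single pair of alignment positions can be sketched as below. The names max_comparisons and pair_correlation are hypothetical and not part of Bsoft. The sketch assumes that the similarities s(i,k,l) and s(j,k,l) and the weights w(k,l) have already been flattened into vectors with one entry per sequence pair, and that the average and standard deviation are unweighted and taken over those pairs; the library may compute them differently.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helpers for illustration; not the Bsoft implementation.

// Maximum number of sequence pairs (k,l) with k < l among m sequences.
long max_comparisons(long m) { return m * (m - 1) / 2; }

// Correlation r(i,j) between two alignment positions.
//   si, sj: similarities at positions i and j, one entry per sequence pair
//   w:      weight per sequence pair (1 - fractional identity)
//   m:      number of sequences
double pair_correlation(const std::vector<double>& si,
                        const std::vector<double>& sj,
                        const std::vector<double>& w, long m)
{
    assert(!si.empty() && si.size() == sj.size() && si.size() == w.size());
    const double n = static_cast<double>(si.size());

    // Average similarity at each position (assumed unweighted).
    double mean_i = 0, mean_j = 0;
    for (std::size_t p = 0; p < si.size(); ++p) {
        mean_i += si[p];
        mean_j += sj[p];
    }
    mean_i /= n;
    mean_j /= n;

    // Weighted sum of deviation products and unweighted variance sums.
    double sum = 0, var_i = 0, var_j = 0;
    for (std::size_t p = 0; p < si.size(); ++p) {
        const double di = si[p] - mean_i;
        const double dj = sj[p] - mean_j;
        sum += w[p] * di * dj;
        var_i += di * di;
        var_j += dj * dj;
    }
    const double oi = std::sqrt(var_i / n);   // o(i)
    const double oj = std::sqrt(var_j / n);   // o(j)

    // A fully conserved position has no variation to correlate.
    if (oi == 0 || oj == 0) return 0;
    return sum / (static_cast<double>(m) * m * oi * oj);
}
```

With 21 sequences, max_comparisons(21) gives 210, which matches the Total column in the report shown above.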