Amino acid replacement

Amino acid replacement is a change from one amino acid to a different amino acid in a protein due to a point mutation in the corresponding DNA sequence. It is caused by a nonsynonymous missense mutation, which changes the codon so that it codes for a different amino acid instead of the original.

Notable mutations

Conservative and radical replacements

Not all amino acid replacements have the same effect on the function or structure of a protein. The magnitude of the effect varies depending on how similar or dissimilar the exchanged amino acids are, as well as on their position in the sequence or the structure. Similarity between amino acids can be calculated based on substitution matrices, physico-chemical distance, or simple properties such as amino acid size or charge[1] (see also amino acid chemical properties). Amino acid replacements are thus usually classified into two types:[2]

  • Conservative replacement - an amino acid is exchanged for another that has similar properties. This type of replacement is expected to rarely result in dysfunction of the corresponding protein [citation needed].
  • Radical replacement - an amino acid is exchanged for another with different properties. This can lead to changes in protein structure or function, which can in turn lead to changes in phenotype, sometimes pathogenic. A well-known example in humans is sickle cell anemia, caused by a mutation in beta globin in which glutamic acid (negatively charged) at position 6 is exchanged for valine (uncharged).
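
As a minimal sketch of the distinction, the snippet below classifies a replacement by a single simple property, side-chain charge. The charge classes (histidine grouped with the positively charged residues) and the function names are assumptions made for illustration; real classifications may use other properties or a distance threshold.

```python
# Illustrative charge classes (an assumption of this sketch):
# histidine is grouped with the positively charged residues.
POSITIVE = set("RKH")
NEGATIVE = set("DE")


def charge_class(aa: str) -> str:
    """Return the charge class of a one-letter amino acid code."""
    if aa in POSITIVE:
        return "positive"
    if aa in NEGATIVE:
        return "negative"
    return "neutral"


def classify_replacement(original: str, replacement: str) -> str:
    """Call a replacement conservative if the charge class is unchanged."""
    same_class = charge_class(original) == charge_class(replacement)
    return "conservative" if same_class else "radical"


print(classify_replacement("E", "V"))  # glutamic acid -> valine: radical
print(classify_replacement("L", "I"))  # leucine -> isoleucine: conservative
```

Under this charge-only criterion the sickle cell replacement (glutamic acid to valine) comes out as radical, while leucine to isoleucine is conservative.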

Physicochemical distances

Physicochemical distance is a measure of the difference between the exchanged amino acids. Its value is based on properties of the amino acids. There are 134 physicochemical properties that can be used to estimate similarity between amino acids.[3] Each physicochemical distance is based on a different combination of properties.

Properties of amino acids employed for estimating overall similarity[3]
Two-state characters Properties
1-5 Presence respectively of: β―CH2, γ―CH2, δ―CH2 (proline scored as positive), ε―CH2 group and a―CH3 group
6-10 Presence respectively of: ω―SH, ω―COOH, ω―NH2 (basic), ω―CONH2 and ―CHOH groups
11-15 Presence respectively of: benzene ring (including tryptophan as positive), branching in side chain by a CH group, a second CH3 group, two but not three ―H groups at the ends of the side chain (proline scored as positive) and a C―S―C group
16-20 Presence respectively of: guanido group, α―NH2, α―NH group in ring, δ―NH group in ring, ―N= group in ring
21-25 Presence respectively of: ―CH=N, indolyl group, imidazole group, C=O group in side chain, and configuration at α―C potentially changing direction of the peptide chain (only proline scores positive)
26-30 Presence respectively of: sulphur atom, primary aliphatic ―OH group, secondary aliphatic ―OH group, phenolic ―OH group, ability to form S―S bridges
31-35 Presence respectively of: imidazole ―NH group, indolyl ―NH group, ―SCH3 group, a second optical centre, the N=CR―NH group
36-40 Presence respectively of: isopropyl group, distinct aromatic reactivity, strong aromatic reactivity, terminal positive charge, negative charge at high pH (tyrosine scored positive)
41 Presence of pyrrolidine ring
42-53 Molecular weight (approximate) of side chain, scored in 12 additive steps (sulphur counted as the equivalent of two carbon, nitrogen or oxygen atoms)
54-56 Presence, respectively, of: flat 5-, 6- and 9-membered ring system
57-64 pK at isoelectric point, scored additively in steps of 1 pH
65-68 Logarithm of solubility in water of the ʟ-isomer in mg/100 ml., scored additively
69-70 Optical rotation in 5 ɴ-HCl, [α]D 0 to -25, and over -25, respectively
71-72 Optical rotation in 5 ɴ-HCl, [α]D 0 to +25, and over +25, respectively (values for glutamine and tryptophan with water as solvent, and for asparagine 3·4 ɴ-HCl)
73-74 Side-chain hydrogen bonding (ionic type), strong donor and strong acceptor, respectively
75-76 Side-chain hydrogen bonding (neutral type), strong donor and strong acceptor, respectively
77-78 Water structure former, respectively moderate and strong
79 Water structure breaker
80-82 Mobile electrons few, moderate and many, respectively (scored additively)
83-85 Heat and age stability moderate, high and very high, respectively (scored additively)
86-89 RF in phenol-water paper chromatography in steps of 0·2 (scored additively)
90-93 RF in toluene-pyridine-glycolchlorhydrin (paper chromatography of DNP-derivative) in steps of 0·2 (scored additively: for lysine the di-DNP derivative)
94-97 Ninhydrin colour after collidine-lutidine chromatography and heating 5 min at 100 °C, respectively purple, pink, brown and yellow
98 End of side-chain furcated
99-101 Number of substituents on the β-carbon atom, respectively 1, 2 or 3 (scored additively)
102-111 The mean number of lone pair electrons on the side-chain (scored additively)
112-115 Number of bonds in the side-chain allowing rotation (scored additively)
116-117 Ionic volume within rings slight, or moderate (scored additively)
118-124 Maximum moment of inertia for rotation at the α―β bond (scored additively in seven approximate steps)
125-131 Maximum moment of inertia for rotation at the β―γ bond (scored additively in seven approximate steps)
132-134 Maximum moment of inertia for rotation at the γ―δ bond (scored additively in three approximate steps)

Grantham's distance

Grantham's distance depends on three properties: composition, polarity and molecular volume.[4]

The distance D between each pair of amino acids i and j is calculated as:

$$D_{ij} = \left[\alpha (c_i - c_j)^2 + \beta (p_i - p_j)^2 + \gamma (v_i - v_j)^2\right]^{1/2}$$

where c = composition, p = polarity, and v = molecular volume; α, β and γ are constants, the squares of the inverses of the mean distance for each property, equal to 1.833, 0.1018 and 0.000399, respectively. According to Grantham's distance, the most similar amino acids are leucine and isoleucine (D = 5) and the most distant are cysteine and tryptophan (D = 215).
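
The weighted sum of squared property differences can be sketched in a few lines. The three constants are the ones quoted in the text; the function name and the two property triples are hypothetical placeholders, not Grantham's published values for any amino acid.

```python
import math

# Weights for composition, polarity and molecular volume, as quoted in the text.
ALPHA, BETA, GAMMA = 1.833, 0.1018, 0.000399


def grantham_raw(props_i, props_j):
    """Weighted Euclidean distance between two (c, p, v) property triples."""
    c_i, p_i, v_i = props_i
    c_j, p_j, v_j = props_j
    return math.sqrt(
        ALPHA * (c_i - c_j) ** 2
        + BETA * (p_i - p_j) ** 2
        + GAMMA * (v_i - v_j) ** 2
    )


# Hypothetical (composition, polarity, volume) triples, for illustration only.
residue_a = (0.0, 5.0, 100.0)
residue_b = (1.0, 9.0, 50.0)
print(round(grantham_raw(residue_a, residue_b), 3))
```

The function returns the value of the formula as written; whether the tabulated D values use an additional overall scaling should be checked against the original paper before comparing the two.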

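Difference D for amino acids[4] (upper triangle: each row is labelled at its right-hand end, and its values correspond to the last columns of the header)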
Arg Leu Pro Thr Ala Val Gly Ile Phe Tyr Cys His Gln Asn Lys Asp Glu Met Trp
110 145 74 58 99 124 56 142 155 144 112 89 68 46 121 65 80 135 177 Ser
102 103 71 112 96 125 97 97 77 180 29 43 86 26 96 54 91 101 Arg
98 92 96 32 138 5 22 36 198 99 113 153 107 172 138 15 61 Leu
38 27 68 42 95 114 110 169 77 76 91 103 108 93 87 147 Pro
58 69 59 89 103 92 149 47 42 65 78 85 65 81 128 Thr
64 60 94 113 112 195 86 91 111 106 126 107 84 148 Ala
109 29 50 55 192 84 96 133 97 152 121 21 88 Val
135 153 147 159 98 87 80 127 94 98 127 184 Gly
21 33 198 94 109 149 102 168 134 10 61 Ile
22 205 100 116 158 102 177 140 28 40 Phe
194 83 99 143 85 160 122 36 37 Tyr
174 154 139 202 154 170 196 215 Cys
24 68 32 81 40 87 115 His
46 53 61 29 101 130 Gln
94 23 42 142 174 Asn
101 56 95 110 Lys
45 160 181 Asp
126 152 Glu
67 Met

Sneath's index

Sneath's index takes into account 134 categories of activity and structure.[3] The dissimilarity index D is the percentage of all properties that are not shared between the two exchanged amino acids. It is expressed as $D = 1 - S$, where S is the similarity.
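
The idea of a percentage of unshared characters can be sketched as follows. This is an illustration under stated assumptions: each amino acid is represented by a vector of two-state characters, the denominator is the total number of characters scored, and the ten-character vectors are made up for the example rather than taken from Sneath's 134 characters.

```python
def sneath_dissimilarity(chars_a, chars_b):
    """Percentage of two-state characters on which two amino acids differ.

    Assumes both vectors score the same characters in the same order and
    that the denominator is the total number of characters.
    """
    if len(chars_a) != len(chars_b):
        raise ValueError("character vectors must have the same length")
    n = len(chars_a)
    not_shared = sum(1 for x, y in zip(chars_a, chars_b) if x != y)
    return 100 * not_shared / n


# Made-up character vectors (1 = property present, 0 = absent).
aa_1 = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
aa_2 = [1, 0, 0, 0, 1, 1, 0, 1, 0, 0]
print(sneath_dissimilarity(aa_1, aa_2))  # 20.0
```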

Dissimilarity D between amino acids[3]
Leu Ile Val Gly Ala Pro Gln Asn Met Thr Ser Cys Glu Asp Lys Arg Tyr Phe Trp
Ile 5
Val 9 7
Gly 24 25 19
Ala 15 17 12 9
Pro 23 24 20 17 16
Gln 22 24 25 32 26 33
Asn 20 23 23 26 25 31 10
Met 20 22 23 34 25 31 13 21
Thr 23 21 17 20 20 25 24 19 25
Ser 23 25 20 19 16 24 21 15 22 12
Cys 24 26 21 21 13 25 22 19 17 19 13
Glu 30 31 31 37 34 43 14 19 26 34 29 33
Asp 25 28 28 33 30 40 22 14 31 29 25 28 7
Lys 23 24 26 31 26 31 21 27 24 34 31 32 26 34
Arg 33 34 36 43 37 43 23 31 28 38 37 36 31 39 14
Tyr 30 34 36 36 34 37 29 28 32 32 29 34 34 34 34 36
Phe 19 22 26 29 26 27 24 24 24 28 25 29 35 35 28 34 13
Trp 30 34 37 39 36 37 31 32 31 38 35 37 43 45 34 36 21 13
His 25 28 31 34 29 36 27 24 30 34 28 31 27 35 27 31 23 18 25

Epstein's coefficient of difference

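Epstein's coefficient of difference is based on the differences in polarity and size between the exchanged amino acids.[5] This index distinguishes the direction of the exchange and is described by two equations:

$$\Delta_{a \to b} = \left(\delta_{\text{polarity}}^2 + \delta_{\text{size}}^2\right)^{1/2}$$

when a smaller hydrophobic residue is replaced by a larger hydrophobic or polar residue, and

$$\Delta_{a \to b} = \left(\delta_{\text{polarity}}^2 + [0.5\,\delta_{\text{size}}]^2\right)^{1/2}$$

when a polar residue is exchanged or a larger residue is replaced by a smaller one.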
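
The directional behaviour of the coefficient can be illustrated with a short sketch. It assumes the commonly cited form of Epstein's equations, in which the size difference counts in full for the first case and is halved for the second; the function name, the boolean flag and the input values are hypothetical.

```python
import math


def epstein_coefficient(delta_polarity, delta_size, full_size_weight):
    """Coefficient of difference for one direction of exchange.

    full_size_weight=True : a smaller hydrophobic residue is replaced by a
                            larger hydrophobic or polar residue.
    full_size_weight=False: a polar residue is exchanged, or a larger
                            residue is replaced by a smaller one
                            (size difference weighted by 0.5).
    """
    size_term = delta_size if full_size_weight else 0.5 * delta_size
    return math.hypot(delta_polarity, size_term)


# Hypothetical differences: same polarity, size difference of 0.1.
forward = epstein_coefficient(0.0, 0.1, full_size_weight=True)
reverse = epstein_coefficient(0.0, 0.1, full_size_weight=False)
print(round(forward, 2), round(reverse, 2))  # 0.1 0.05
```

Because the two directions weight the size difference differently, the resulting matrix is not symmetric, which is why the table lists a separate value for each direction of exchange.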

Coefficient of difference [5]
Phe Met Leu Ile Val Pro Tyr Trp Cys Ala Gly Ser Thr His Glu Gln Asp Asn Lys Arg
Phe 0.05 0.08 0.08 0.1 0.1 0.21 0.25 0.22 0.43 0.53 0.81 0.81 0.8 1 1 1 1 1 1
Met 0.1 0.03 0.03 0.1 0.1 0.25 0.32 0.21 0.41 0.42 0.8 0.8 0.8 1 1 1 1 1 1
Leu 0.15 0.05 0 0.03 0.03 0.28 0.36 0.2 0.43 0.51 0.8 0.8 0.81 1 1 1 1 1 1.01
Ile 0.15 0.05 0 0.03 0.03 0.28 0.36 0.2 0.43 0.51 0.8 0.8 0.81 1 1 1 1 1 1.01
Val 0.2 0.1 0.05 0.05 0 0.32 0.4 0.2 0.4 0.5 0.8 0.8 0.81 1 1 1 1 1 1.02
Pro 0.2 0.1 0.05 0.05 0 0.32 0.4 0.2 0.4 0.5 0.8 0.8 0.81 1 1 1 1 1 1.02
Tyr 0.2 0.22 0.22 0.22 0.24 0.24 0.1 0.13 0.27 0.36 0.62 0.61 0.6 0.8 0.8 0.81 0.81 0.8 0.8
Trp 0.21 0.24 0.25 0.25 0.27 0.27 0.05 0.18 0.3 0.39 0.63 0.63 0.61 0.81 0.81 0.81 0.81 0.81 0.8
Cys 0.28 0.22 0.21 0.21 0.2 0.2 0.25 0.35 0.25 0.31 0.6 0.6 0.62 0.81 0.81 0.8 0.8 0.81 0.82
Ala 0.5 0.45 0.43 0.43 0.41 0.41 0.4 0.49 0.22 0.1 0.4 0.41 0.47 0.63 0.63 0.62 0.62 0.63 0.67
Gly 0.61 0.56 0.54 0.54 0.52 0.52 0.5 0.58 0.34 0.1 0.32 0.34 0.42 0.56 0.56 0.54 0.54 0.56 0.61
Ser 0.81 0.8 0.8 0.8 0.8 0.8 0.62 0.63 0.6 0.4 0.3 0.03 0.1 0.21 0.21 0.2 0.2 0.21 0.24
Thr 0.81 0.8 0.8 0.8 0.8 0.8 0.61 0.63 0.6 0.4 0.31 0.03 0.08 0.21 0.21 0.2 0.2 0.21 0.22
His 0.8 0.8 1 1 0.8 0.8 0.6 0.61 0.61 0.42 0.34 0.1 0.08 0.2 0.2 0.21 0.21 0.2 0.2
Glu 1 1 1 1 1 1 0.8 0.81 0.8 0.61 0.52 0.22 0.21 0.2 0 0.03 0.03 0 0.05
Gln 1 1 1 1 1 1 0.8 0.81 0.8 0.61 0.52 0.22 0.21 0.2 0 0.03 0.03 0 0.05
Asp 1 1 1 1 1 1 0.81 0.81 0.8 0.61 0.51 0.21 0.2 0.21 0.03 0.03 0 0.03 0.08
Asn 1 1 1 1 1 1 0.81 0.81 0.8 0.61 0.51 0.21 0.2 0.21 0.03 0.03 0 0.03 0.08
Lys 1 1 1 1 1 1 0.8 0.81 0.8 0.61 0.52 0.22 0.21 0.2 0 0 0.03 0.03 0.05
Arg 1 1 1 1 1.01 1.01 0.8 0.8 0.81 0.62 0.53 0.24 0.22 0.2 0.05 0.05 0.08 0.08 0.05

Miyata's distance

Miyata's distance is based on 2 physicochemical properties: volume and polarity.[6]

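The distance between amino acids a_i and a_j is calculated as

$$d_{ij} = \sqrt{\left(\frac{\Delta p_{ij}}{\sigma_p}\right)^2 + \left(\frac{\Delta v_{ij}}{\sigma_v}\right)^2}$$

where $\Delta p_{ij}$ is the polarity difference between the exchanged amino acids and $\Delta v_{ij}$ is the volume difference; $\sigma_p$ and $\sigma_v$ are the standard deviations of $\Delta p_{ij}$ and $\Delta v_{ij}$.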
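
A minimal sketch of this calculation, in which each difference is divided by its standard deviation before the two are combined, is given here. The function name and the numerical inputs are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def miyata_distance(delta_p, delta_v, sigma_p, sigma_v):
    """Euclidean distance in standardised polarity and volume differences."""
    return math.sqrt((delta_p / sigma_p) ** 2 + (delta_v / sigma_v) ** 2)


# Hypothetical inputs: standardised differences of 3 and 4 give a distance of 5.
print(miyata_distance(delta_p=3.0, delta_v=8.0, sigma_p=1.0, sigma_v=2.0))  # 5.0
```

Dividing by the standard deviations puts polarity and volume on a comparable scale, so neither property dominates the distance merely because of its units.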

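Amino acid pair distance (dij)[6] (upper triangle: each row is labelled at its right-hand end, and its values correspond to the last columns of the header)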
Cys Pro Ala Gly Ser Thr Gln Glu Asn Asp His Lys Arg Val Leu Ile Met Phe Tyr Trp
1.33 1.39 2.22 2.84 1.45 2.48 3.26 2.83 3.48 2.56 3.27 3.06 0.86 1.65 1.63 1.46 2.24 2.38 3.34 Cys
0.06 0.97 0.56 0.87 1.92 2.48 1.8 2.4 2.15 2.94 2.9 1.79 2.7 2.62 2.36 3.17 3.12 4.17 Pro
0.91 0.51 0.9 1.92 2.46 1.78 2.37 2.17 2.96 2.92 1.85 2.76 2.69 2.42 3.23 3.18 4.23 Ala
0.85 1.7 2.48 2.78 1.96 2.37 2.78 3.54 3.58 2.76 3.67 3.6 3.34 4.14 4.08 5.13 Gly
0.89 1.65 2.06 1.31 1.87 1.94 2.71 2.74 2.15 3.04 2.95 2.67 3.45 3.33 4.38 Ser
1.12 1.83 1.4 2.05 1.32 2.1 2.03 1.42 2.25 2.14 1.86 2.6 2.45 3.5 Thr
0.84 0.99 1.47 0.32 1.06 1.13 2.13 2.7 2.57 2.3 2.81 2.48 3.42 Gln
0.85 0.9 0.96 1.14 1.45 2.97 3.53 3.39 3.13 3.59 3.22 4.08 Glu
0.65 1.29 1.84 2.04 2.76 3.49 3.37 3.08 3.7 3.42 4.39 Asn
1.72 2.05 2.34 3.4 4.1 3.98 3.69 4.27 3.95 4.88 Asp
0.79 0.82 2.11 2.59 2.45 2.19 2.63 2.27 3.16 His
0.4 2.7 2.98 2.84 2.63 2.85 2.42 3.11 Lys
2.43 2.62 2.49 2.29 2.47 2.02 2.72 Arg
0.91 0.85 0.62 1.43 1.52 2.51 Val
0.14 0.41 0.63 0.94 1.73 Leu
0.29 0.61 0.86 1.72 Ile
0.82 0.93 1.89 Met
0.48 1.11 Phe
1.06 Tyr
Trp

Experimental Exchangeability

Experimental Exchangeability was devised by Yampolsky and Stoltzfus.[7] It is a measure of the mean effect of exchanging one amino acid for a different amino acid.

It is based on an analysis of experimental studies in which 9671 amino acid replacements from different proteins were compared for their effect on protein activity.

Exchangeability (x1000) by source (row) and destination (column)[7]
Cys Ser Thr Pro Ala Gly Asn Asp Glu Gln His Arg Lys Met Ile Leu Val Phe Tyr Trp Exsrc
Cys . 258 121 201 334 288 109 109 270 383 258 306 252 169 109 347 89 349 349 139 280
Ser 373 . 481 249 490 418 390 314 343 352 353 363 275 321 270 295 358 334 294 160 351
Thr 325 408 . 164 402 332 240 190 212 308 246 299 256 152 198 271 362 273 260 66 287
Pro 345 392 286 . 454 404 352 254 346 384 369 254 231 257 204 258 421 339 298 305 335
Ala 393 384 312 243 . 387 430 193 275 320 301 295 225 549 245 313 319 305 286 165 312
Gly 267 304 187 140 369 . 210 188 206 272 235 178 219 197 110 193 208 168 188 173 228
Asn 234 355 329 275 400 391 . 208 257 298 248 252 183 236 184 233 233 210 251 120 272
Asp 285 275 245 220 293 264 201 . 344 263 298 252 208 245 299 236 175 233 227 103 258
Glu 332 355 292 216 520 407 258 533 . 341 380 279 323 219 450 321 351 342 348 145 363
Gln 383 443 361 212 499 406 338 68 439 . 396 366 354 504 467 391 603 383 361 159 386
His 331 365 205 220 462 370 225 141 319 301 . 275 332 315 205 364 255 328 260 72 303
Arg 225 270 199 145 459 251 67 124 250 288 263 . 306 68 139 242 189 213 272 63 259
Lys 331 376 476 252 600 492 457 465 272 441 362 440 . 414 491 301 487 360 343 218 409
Met 347 353 261 85 357 218 544 392 287 394 278 112 135 . 612 513 354 330 308 633 307
Ile 362 196 193 145 326 160 172 27 197 191 221 124 121 279 . 417 494 331 323 73 252
Leu 366 212 165 146 343 201 162 112 199 250 288 185 171 367 301 . 275 336 295 152 248
Val 382 326 398 201 389 269 108 228 192 280 253 190 197 562 537 333 . 207 209 286 277
Phe 176 152 257 112 236 94 136 90 62 216 237 122 85 255 181 296 291 . 332 232 193
Tyr 142 173 . 194 402 357 129 87 176 369 197 340 171 392 . 362 . 360 . 303 258
Trp 137 92 17 66 63 162 . . 65 61 239 103 54 110 . 177 110 364 281 . 142
Exdest 315 311 293 192 411 321 258 225 262 305 290 255 225 314 293 307 305 294 279 172 291

Typical and idiosyncratic amino acids

Amino acids can also be classified according to how many different amino acids they can be exchanged for through a single nucleotide substitution.

  • Typical amino acids - there are several other amino acids into which they can change through a single nucleotide substitution. Typical amino acids and their alternatives usually have similar physicochemical properties. Leucine is an example of a typical amino acid.
  • Idiosyncratic amino acids - there are few similar amino acids to which they can mutate through a single nucleotide substitution. In this case most amino acid replacements will be disruptive for protein function. Tryptophan is an example of an idiosyncratic amino acid.[8]
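
The set of amino acids reachable by a single nucleotide substitution can be enumerated directly from the standard genetic code. The sketch below does this for leucine and tryptophan; it uses the DNA alphabet (T instead of U), ignores stop codons, and the function name is an illustrative choice. It only counts the alternatives and does not assess how similar they are, which the classification above also takes into account.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def single_step_neighbours(aa):
    """Amino acids reachable from `aa` by one nucleotide substitution."""
    reachable = set()
    for codon, coded in CODON_TABLE.items():
        if coded != aa:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa not in (aa, "*"):  # skip synonymous and stop codons
                    reachable.add(new_aa)
    return reachable


for aa in "LW":
    print(aa, sorted(single_step_neighbours(aa)))
```

With the standard code, leucine (six codons) can reach ten different amino acids in one step, whereas tryptophan (one codon) can reach only five.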

Tendency to undergo amino acid replacement

Some amino acids are more likely to be replaced than others. One of the factors that influences this tendency is physicochemical distance. One measure of an amino acid's tendency to be replaced is Graur's stability index.[9] This measure assumes that the rate of amino acid replacement, and hence the rate of protein evolution, depends on the amino acid composition of the protein. The stability index S of an amino acid is calculated from the physicochemical distances between this amino acid and the alternatives into which it can mutate through a single nucleotide substitution, weighted by the probabilities of replacement by each of these amino acids. Based on Grantham's distance, the most immutable amino acid is cysteine, and the most prone to exchange is methionine.

Example of calculating stability index[9] for Methionine coded by AUG based on Grantham's physicochemical distance
Alternative codons Alternative amino acids Probabilities Grantham's distances[4] Average distance
AUU, AUC, AUA Isoleucine 1/3 10 3.33
ACG Threonine 1/9 81 9.00
AAG Lysine 1/9 95 10.56
AGG Arginine 1/9 91 10.11
UUG, CUG Leucine 2/9 15 3.33
GUG Valine 1/9 21 2.33
Stability index[9] 38.67
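
The arithmetic in the methionine example can be reproduced with a short sketch. It assumes, as the probabilities in the table imply, that each of the nine single-nucleotide neighbours of AUG is equally likely; the distances are the Grantham values listed in the table, and the variable and function names are illustrative.

```python
from fractions import Fraction

# Amino acid encoded by each of the nine single-nucleotide neighbours of AUG.
NEIGHBOURS_OF_AUG = {
    "AUU": "Ile", "AUC": "Ile", "AUA": "Ile",
    "ACG": "Thr", "AAG": "Lys", "AGG": "Arg",
    "UUG": "Leu", "CUG": "Leu", "GUG": "Val",
}

# Grantham's distances between methionine and each alternative amino acid.
GRANTHAM_FROM_MET = {
    "Ile": 10, "Thr": 81, "Lys": 95, "Arg": 91, "Leu": 15, "Val": 21,
}


def stability_index(neighbours, distances):
    """Probability-weighted mean distance to the alternative amino acids."""
    p = Fraction(1, len(neighbours))  # each neighbouring codon equally likely
    return sum(p * distances[aa] for aa in neighbours.values())


s = stability_index(NEIGHBOURS_OF_AUG, GRANTHAM_FROM_MET)
print(round(float(s), 2))  # 38.67
```

Summing over codons rather than amino acids gives isoleucine a weight of 3/9 and leucine a weight of 2/9 automatically, matching the probabilities in the table.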

Patterns of amino acid replacement

Proteins evolve more slowly than DNA, since only nonsynonymous mutations in DNA can result in amino acid replacements. Most replacements are neutral, preserving protein function and structure. Therefore, the more similar two amino acids are, the more probable it is that one will replace the other. Conservative replacements are more common than radical replacements, since they result in less pronounced phenotypic changes.[10] On the other hand, beneficial mutations that enhance protein function are most likely to be radical replacements.[11] In addition, physicochemical distances, which are based on amino acid properties, are negatively correlated with the probability of amino acid substitution: the smaller the distance between two amino acids, the more likely they are to undergo replacement.

References

  1. ^ Dagan, Tal; Talmor, Yael; Graur, Dan (July 2002). "Ratios of Radical to Conservative Amino Acid Replacement are Affected by Mutational and Compositional Factors and May Not Be Indicative of Positive Darwinian Selection". Molecular Biology and Evolution. 19 (7): 1022–1025. doi:10.1093/oxfordjournals.molbev.a004161. PMID 12082122.
  2. ^ Graur, Dan (2015-01-01). Molecular and Genome Evolution. Sinauer. ISBN 9781605354699.
  3. ^ a b c d Sneath, P. H. (1966-11-01). "Relations between chemical structure and biological activity in peptides". Journal of Theoretical Biology. 12 (2): 157–195. Bibcode:1966JThBi..12..157S. doi:10.1016/0022-5193(66)90112-3. ISSN 0022-5193. PMID 4291386.
  4. ^ a b c Grantham, R. (1974-09-06). "Amino acid difference formula to help explain protein evolution". Science. 185 (4154): 862–864. Bibcode:1974Sci...185..862G. doi:10.1126/science.185.4154.862. ISSN 0036-8075. PMID 4843792. S2CID 35388307.
  5. ^ a b Epstein, Charles J. (1967-07-22). "Non-randomness of Amino-acid Changes in the Evolution of Homologous Proteins". Nature. 215 (5099): 355–359. Bibcode:1967Natur.215..355E. doi:10.1038/215355a0. PMID 4964553. S2CID 38859723.
  6. ^ a b Miyata, T.; Miyazawa, S.; Yasunaga, T. (1979-03-15). "Two types of amino acid substitutions in protein evolution". Journal of Molecular Evolution. 12 (3): 219–236. Bibcode:1979JMolE..12..219M. doi:10.1007/BF01732340. ISSN 0022-2844. PMID 439147. S2CID 20978738.
  7. ^ a b Yampolsky, Lev Y.; Stoltzfus, Arlin (2005-08-01). "The Exchangeability of Amino Acids in Proteins". Genetics. 170 (4): 1459–1472. doi:10.1534/genetics.104.039107. ISSN 0016-6731. PMC 1449787. PMID 15944362.
  8. ^ Xia, Xuhua (2000-03-31). Data Analysis in Molecular Biology and Evolution. Springer Science & Business Media. ISBN 9780792377672.
  9. ^ a b c Graur, D. (1985-01-01). "Amino acid composition and the evolutionary rates of protein-coding genes". Journal of Molecular Evolution. 22 (1): 53–62. Bibcode:1985JMolE..22...53G. doi:10.1007/BF02105805. ISSN 0022-2844. PMID 3932664. S2CID 23374899.
  10. ^ Zuckerkandl; Pauling (1965). "Evolutionary divergence and convergence in proteins". New York: Academic Press: 97–166.
  11. ^ Dagan, Tal; Talmor, Yael; Graur, Dan (2002-07-01). "Ratios of radical to conservative amino acid replacement are affected by mutational and compositional factors and may not be indicative of positive Darwinian selection". Molecular Biology and Evolution. 19 (7): 1022–1025. doi:10.1093/oxfordjournals.molbev.a004161. ISSN 0737-4038. PMID 12082122.