Edward Nikolayevich Trifonov (Hebrew: אדוארד טריפונוב, Russian: Эдуард Трифонов; b. March 31, 1937) is a Russian-born Israeli molecular biophysicist and a founder of Israeli bioinformatics. In his research, he specializes in the recognition of weak signal patterns in biological sequences and is known for his unorthodox scientific methods.
He discovered the 3-bp and 10-bp periodicities in DNA sequences, as well as the rules determining the curvature of DNA molecules and their bending within nucleosomes. Trifonov unveiled multiple novel codes in biological sequences and the modular structure of proteins. He proposed an abiogenic theory of the origin of life and of molecular evolution from single nucleotides and amino acids to present-day DNA and protein sequences.
Biography
Trifonov was born in Leningrad (now Saint Petersburg), USSR in 1937. He was raised by his mother, Riva, and his step-father, Nikolay Nikolayevich Trifonov. In his school years, he became interested in medicine and physics.[1] As a result, he went to study biophysics in Moscow. He started his scientific career in the USSR. In 1976, he made aliyah (immigrated as a Jew) to Israel.[2] His role model is Gregor Mendel.[1][3]
Education and scientific career
Trifonov graduated[4] in biophysics from the Moscow Institute of Physics and Technology in 1961 and earned his PhD degree in molecular biophysics there in 1970. He worked as a researcher at the Moscow Physico-Technical Institute from 1961 to 1964. He then moved to the Biological Department at the I. V. Kurchatov Institute of Atomic Energy in Moscow, staying there until 1975. After immigrating to Israel, he joined the Department of Polymer Research at the Weizmann Institute of Science as an associate professor. He worked there from 1976 to 1991 before moving to the Department of Structural Biology as a full professor in 1992. He was appointed professor emeritus in 2003. During that time, he was also head of the Center for Genome Structure and Evolution at the Institute of Molecular Sciences in Palo Alto, California (1992–1995).
Trifonov has been head of the Genome Diversity Center at the Institute of Evolution at the University of Haifa in Israel since 2002, and a professor at Masaryk University in Brno, Czech Republic since 2007.
Membership of learned societies
USSR Biochemical Society (1970)
The Israel National Committee for CODATA (1987)
International Society of Molecular Evolution (1993)
International Society of Gene Therapy and Molecular Biology (1997)
Editorial and advisory boards
Editor, microbiology and biochemistry sections of Russian "Biological Abstracts" (1970–1975)
Editor, Journal of Biomolecular Structure and Dynamics (1988–1995)
Editorial board and associate editor, Journal of Molecular Evolution (1993–2004)
Editorial board of Gene Therapy and Molecular Biology (since 1997)
Editorial board of OMICS, Journal of Integrative Biology (since 2006)
Editorial advisory board, Journal of Biomolecular Structure and Dynamics (since 2010)
Research
At the beginning of his scientific career, Trifonov studied the characteristics of DNA with biophysical methods. After his relocation to Israel in 1976, he switched to bioinformatics and established the country's first research group in that discipline.[2] He is known for his innovative insights into the world of biological sequences.[5]
Research areas
Periodicity in biological sequences
Trifonov pioneered the application of digital signal processing techniques to biological sequences. In 1980, he and Joel Sussman used autocorrelation to analyse chromatin DNA sequences.[6] They were the first to discover two periodic patterns in DNA sequences, namely the 3 bp and the 10–11 bp (10.4 bp) periodicities.[7]
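The idea of looking for periodicity through autocorrelation can be illustrated with a minimal sketch. It assumes a simplified positional autocorrelation: for a chosen motif, count how many pairs of its occurrences are separated by each distance. The function name `distance_counts`, the motif and the synthetic sequence are illustrative assumptions, not the procedure or data of the 1980 study.

```python
def distance_counts(seq, motif, max_lag):
    """Count pairs of motif occurrences separated by each distance 1..max_lag."""
    # Start positions of every (possibly overlapping) occurrence of the motif.
    positions = [i for i in range(len(seq) - len(motif) + 1)
                 if seq.startswith(motif, i)]
    present = set(positions)
    counts = [0] * (max_lag + 1)  # counts[0] is unused
    for p in positions:
        for lag in range(1, max_lag + 1):
            if p + lag in present:
                counts[lag] += 1
    return counts


# Synthetic sequence (made up for illustration) with "AA" placed every 10 bases.
synthetic = ("AA" + "CGTCGTCG") * 20
counts = distance_counts(synthetic, "AA", max_lag=30)
best_lag = max(range(1, 31), key=lambda lag: counts[lag])
print(best_lag, counts[10], counts[20])  # prints: 10 19 18
```

In real sequences the signal is far weaker than in this toy example, so the counts would have to be compared with what is expected by chance.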
Chromatin structure
Since the beginning of his Israeli scientific period, Trifonov has studied chromatin structure,[8] investigating how certain segments of DNA are packed inside cells in protein-DNA complexes called nucleosomes. In a nucleosome, the DNA winds around the histone protein component. The principle of this winding (and thus the rules determining nucleosome positions) was not known at the beginning of the 1980s, although multiple models had been suggested.[9] These included:
The "hinge" model: the DNA molecule was assumed to be a rigid rod-like structure interrupted by sharp kinks (up to 90°), with the straight segments being a multiple of 10 bp long.
The "isotropic" model: the DNA molecule is bent smoothly along its length, with the same angle between every two base pairs.
The "mini-kinks" model: Similar to the hinge model, but with smoother kinks every 5 bp.
Trifonov supported the concept of smooth bending of the DNA.[10] However, he proposed that the angles between base pairs are not equal but depend on the particular neighboring base pairs, thus introducing an "anisotropic" or "wedge" model.
This model was based on the work of Trifonov and Joel Sussman, who had shown[11] in 1980 that some dinucleotides (nucleotide dimers) frequently occur at regular (periodic) distances from each other in chromatin DNA. This was a breakthrough discovery[11] that initiated a search for sequence patterns in chromatin DNA. They had also pointed out that those dinucleotides repeat with the same period as the estimated pitch (the length of one DNA helix repeat) of chromatin DNA (10.4 bp).
Thus, in his wedge model, Trifonov supposed that each combination of neighboring base pairs forms a certain angle specific to those base pairs. He called this feature curvature.[12] He further suggested that, in addition to curvature, each base-pair step can be deformed to a different extent when the DNA is bound to the histone octamer; he called this bending.[13] These two features of nucleosomal DNA, curvature and bending, are now considered major factors in nucleosome positioning.[14]: 41 The periodicity of other dinucleotides was confirmed later by Alexander Bolshoy and co-workers.[15] Finally, an ideal sequence of nucleosomal DNA was derived in 2009 by Gabdank, Barash and Trifonov.[16] The proposed sequence CGRAAATTTYCG (R standing for a purine: A or G, Y for a pyrimidine: C or T) expresses the preferential order of the dinucleotides in nucleosomal DNA. However, these inferences are disputed by some scientists.[17]
Another question closely related to chromatin structure that Trifonov sought to answer was the length of the DNA helical repeat (turn) within nucleosomes.[14]: 42 It is known that in free DNA (i.e. DNA which is not part of a nucleosome), the DNA helix twists 360° per approximately 10.5 bp. In 1979, Trifonov and Thomas Bettecken estimated[18] the length of a nucleosomal DNA repeat to be 10.33–10.4 bp. This value was finally confirmed and refined to 10.4 bp with crystallographic analysis in 2006.[19]
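As a small illustration of the degenerate notation, the sketch below expands the pattern CGRAAATTTYCG into the concrete sequences it stands for, using only the meanings of R and Y given above. The helper name `expand` is an assumption made for this example.

```python
from itertools import product

# IUPAC nucleotide codes used in the pattern; R and Y are defined in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def expand(pattern):
    """Return every concrete sequence matching a degenerate IUPAC pattern."""
    choices = [IUPAC[symbol] for symbol in pattern]
    return ["".join(bases) for bases in product(*choices)]


variants = expand("CGRAAATTTYCG")
print(variants)
# ['CGAAAATTTCCG', 'CGAAAATTTTCG', 'CGGAAATTTCCG', 'CGGAAATTTTCG']
```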
Multiple genetic codes
Trifonov advocates[20]: 4 the notion that biological sequences carry many codes, contrary to the generally recognized single genetic code (which encodes the order of amino acids). He was also the first to demonstrate[21] that multiple codes are present in DNA. He points out that even so-called non-coding DNA has a function, i.e. it contains codes, although these differ from the triplet code.
Trifonov recognizes[20]: 5–10 specific codes in the DNA, RNA and proteins:
Clusters of rare codons are spaced 150 bp apart.
The translation time of these codons is longer than that of their synonymous counterparts, which slows down translation and thus gives the freshly synthesized segment of a protein time to fold properly.
The first ancient codons were GGC and GCC, from which the other codons were derived by a series of point mutations. Traces of this remain in modern genes as "mini-genes": segments whose codons have a purine at the middle position alternate with segments whose codons have a pyrimidine there.
Methionines tend to occur every 400 bp in modern DNA sequences as a result of the fusion of ancient independent sequences.
The codes can overlap,[20]: 10 so that up to four different codes can be identified in one DNA sequence (specifically, a sequence involved in a nucleosome). According to Trifonov, other codes are yet to be discovered.
Modular structure of proteins
Trifonov's concept of protein modules tries to address the questions of protein evolution and protein folding. In 2000, Trifonov, Berezovsky and Grosberg studied[22] protein sequences and tried to identify simple sequential elements in proteins. They postulated that structurally diverse closed loops of 25–30 amino acid residues are universal building blocks of protein folds.
They speculated that early in evolution there were short polypeptide chains which later formed these closed loops. They supposed[23] that the loop structure gave the sequence more stability and was therefore favored in evolution. Modern proteins are probably groups of closed loops fused together.
To trace the evolution of sequences, Trifonov and Zakharia Frenkel introduced[24][25] the concept of a protein sequence space based on the protein modules. It is a network arrangement of sequence fragments 20 amino acids long, obtained from a collection of fully sequenced genomes. Each fragment is represented as a node. Two fragments with a certain level of similarity to each other are connected by an edge. This approach should make it possible to determine the function of uncharacterized proteins.
Protein modularity could also provide an answer to Levinthal's paradox, i.e. the question of how a protein sequence can fold in a very short time.[26]
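The network construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: similarity is taken to be the fraction of identical positions, and the threshold `min_identity` is a hypothetical parameter; the source does not specify the similarity measure or cut-off actually used.

```python
from itertools import combinations


def fragments(seq, length=20):
    """All overlapping fragments of the given length from one protein sequence."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def identity(a, b):
    """Fraction of positions at which two equal-length fragments agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def build_network(frags, min_identity=0.6):
    """Nodes are unique fragments; an edge joins two sufficiently similar ones."""
    nodes = sorted(set(frags))
    edges = {(a, b) for a, b in combinations(nodes, 2)
             if identity(a, b) >= min_identity}
    return nodes, edges


# Toy fragments of length 4 (made up) to keep the example readable.
nodes, edges = build_network(["ACDE", "ACDF", "WWWW"])
print(edges)  # {('ACDE', 'ACDF')}
```

Comparing every pair of fragments scales quadratically, so an analysis of whole genomes would need indexing or a faster similarity search than this sketch uses.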
Molecular evolution and the origin of life
In 1996, Thomas Bettecken, a German geneticist, noticed[27]: 108 that most of the triplet expansion diseases can be attributed to only two triplets, GCU and GCC, the rest being their permutations or complementary counterparts. He discussed this finding with Trifonov, his friend and colleague. Trifonov had earlier discovered (GCU)n to be a hidden mRNA consensus sequence. The combination of these two facts led them to the idea that (GCU)n could reflect a pattern of ancient mRNA sequences.
The first triplets
Since GCU and GCC appeared to be the most expandable (or the most "aggressive") triplets, Trifonov and Bettecken inferred that they could be the first two codons. Their ability to expand rapidly compared to other triplets would give them an evolutionary advantage.[28]: 123 Single point mutations of these two would give rise to 14 other triplets.
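The count of 14 can be checked by enumeration. The sketch below generates every triplet that differs from GCU or GCC at exactly one position and excludes the two starting triplets; the helper name `point_mutants` is an assumption made for this example.

```python
BASES = "ACGU"


def point_mutants(triplet):
    """Return all triplets differing from `triplet` at exactly one position."""
    mutants = set()
    for i, original in enumerate(triplet):
        for base in BASES:
            if base != original:
                mutants.add(triplet[:i] + base + triplet[i + 1:])
    return mutants


founders = {"GCU", "GCC"}
derived = set().union(*(point_mutants(t) for t in founders)) - founders
print(len(derived), sorted(derived))  # 14 distinct triplets
```

Each triplet has nine single-point mutants; the two sets share GCA and GCG and each contains the other founder, which leaves 14.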
Consensus temporal order of amino acids
Having identified the suspected first two triplets, they asked which amino acids appeared first or, more generally, in which order all the proteinogenic amino acids emerged. To address this question, they relied[27]: 108 on three hypotheses that they considered the most natural:
The earliest amino acids were chemically the simplest.
Later on, Trifonov collected as many as 101 criteria[20]: 123 for the order of the amino acids. Each criterion can be represented as a vector of length 20 (for the 20 basic amino acids). Trifonov averaged over them and obtained the proposed temporal order of amino acid emergence, with glycine and alanine being the first two.
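The averaging step can be sketched in a few lines. This is an illustration only: it assumes each criterion assigns a rank to every amino acid (1 = earliest) and that the consensus is the order of the mean ranks. The three toy criteria below are made up and cover only three amino acids; they are not Trifonov's data.

```python
def consensus_order(criteria):
    """Average per-criterion ranks and sort amino acids by the mean rank."""
    amino_acids = list(criteria[0])
    mean_rank = {
        aa: sum(c[aa] for c in criteria) / len(criteria) for aa in amino_acids
    }
    return sorted(amino_acids, key=lambda aa: mean_rank[aa])


# Made-up toy ranks for three amino acids only (1 = earliest); not real data.
toy_criteria = [
    {"Gly": 1, "Ala": 2, "Val": 3},
    {"Gly": 2, "Ala": 1, "Val": 3},
    {"Gly": 1, "Ala": 2, "Val": 3},
]
print(consensus_order(toy_criteria))  # ['Gly', 'Ala', 'Val']
```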
Results and predictions
Trifonov elaborated these concepts further and proposed[27]: 110–115 these notions:
Glycine-content of a protein can be used as a measure of the respective protein age (Glycine clock).[29]
Proteins are composed of short oligopeptides derived from ancient sequences that were either oligoalanines or oligoglycines (thus two "alphabets").
These two alphabets, distinguished by the type of nucleotide in the middle position of the triplets (purine or pyrimidine), provide a "binary code" which can be used for more accurate analyses of protein relatedness.
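The raw quantities behind two of these notions are simple to compute. The sketch below assumes a protein given in one-letter code and a coding sequence read in frame from its first base; the function names are illustrative, and how the glycine content would be calibrated into an age is not specified here.

```python
PURINES = {"A", "G"}


def glycine_fraction(protein):
    """Fraction of residues that are glycine (one-letter code G)."""
    return protein.count("G") / len(protein)


def middle_base_code(coding_seq):
    """Encode each codon as 'R' (purine in the middle) or 'Y' (pyrimidine)."""
    # Step through complete codons only; trailing bases are ignored.
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq) - 2, 3)]
    return "".join("R" if codon[1] in PURINES else "Y" for codon in codons)


print(glycine_fraction("GAGA"))    # 0.5
print(middle_base_code("GGCGCC"))  # RY  (GGC has G in the middle, GCC has C)
```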
Definition of life
Part of Trifonov's work on molecular evolution is his aim to find a concise definition of life. He collected[30] 123 definitions by other authors. Instead of dealing with logical or philosophical arguments, he analyzed the vocabulary of the existing definitions. Using an approach close to principal component analysis, he derived a consensus definition: "Life is self-reproduction with variations". This work drew multiple critical comments.[31]
Linguistic sequence complexity[32] (LC) is a measure introduced by Trifonov in 1990. It is used for the analysis and characterization of biological sequences. The LC of a sequence is defined as the "richness" of its vocabulary, i.e. how many different substrings of a certain length are present in the sequence.
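A minimal sketch of the vocabulary-richness idea follows. It assumes one particular normalization: the number of distinct substrings of length k is divided by the largest number that a sequence of that length could contain. The function name and this normalization are assumptions for illustration, not necessarily the exact published formula.

```python
def vocabulary_usage(seq, k, alphabet_size=4):
    """Distinct k-mers observed, divided by the maximum possible for this length."""
    windows = len(seq) - k + 1
    observed = {seq[i:i + k] for i in range(windows)}
    # A sequence cannot hold more distinct k-mers than it has windows,
    # nor more than the alphabet allows.
    maximum = min(alphabet_size ** k, windows)
    return len(observed) / maximum


print(vocabulary_usage("ACGTACGT", 1))  # 1.0: all four letters are used
print(vocabulary_usage("AAAA", 2))      # one distinct 2-mer out of three windows
```

A repetitive sequence such as "AAAA" scores low because its few windows all contain the same substring.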
Terminology
DNA curvature vs. DNA bending
Trifonov strictly differentiates[14]: 47 between two notions:
curvature
a property of free DNA which has curvilinear shape due to slight differences in the angles between neighboring base pairs
bending
a deformation of DNA as a result of binding to proteins (e.g. to the histone octamer)
Both of these features are directed by the particular DNA sequence.
(Multiple) Genetic codes
While the scientific community recognizes one genetic code,[20]: 4 Trifonov promotes the idea of multiple genetic codes. He points to the recurring announcements of the discovery of yet another "second" genetic code.
Honors
Kurchatov Prize for Young Scientists (1969)
Kurchatov Prize for Basic Research (1971)
Kleeman Professor of Molecular Biophysics (1982–2002)
Trifonov, Edward N. (2008a). "Codes of biosequences". In Barbieri, Marcello (ed.). The Codes of Life. Biosemiotics. Vol. 1. Springer (published 2008). pp. 3–14. doi:10.1007/978-1-4020-6340-4_1. ISBN 978-1-4020-6339-8.
Poptsova, Maria S. (2014). Poptsova, Maria S. (ed.). Genome analysis: current procedures and applications. Norfolk: Caister Academic Press. ISBN 9781908230294.
Vaidyanathan, P.P.; Yoon, Byung-Jun (2004). "The role of signal-processing concepts in genomics and proteomics". Journal of the Franklin Institute. 341 (1–2): 111–135. CiteSeerX 10.1.1.72.6984. doi:10.1016/j.jfranklin.2003.12.001.
Trifonov, Edward N. (1987). "Translation framing code and frame-monitoring mechanism as suggested by the analysis of mRNA and 16 S rRNA nucleotide sequences". Journal of Molecular Biology. 194 (4): 643–652. doi:10.1016/0022-2836(87)90241-5. ISSN 0022-2836. PMID 2443708.
Trifonov, Edward N. (1990). "Making sense of the human genome". Structure and Methods, Vol. 1. Human Genome Initiative and DNA Recombination; Proceedings of the Sixth Conversation in the Discipline Biomolecular Stereodynamics. Albany, New York, USA: Adenine Press. pp. 69–78. ISBN 0-940030-29-2.
Trifonov, Edward N. (1999). "Glycine clock: eubacteria first, archaea next, protoctista, fungi, planta and animalia at last". Gene Therapy and Molecular Biology. 4: 313–322.
Berezovsky, Igor N.; Grosberg, Alexander Y.; Trifonov, Edward N. (2000). "Closed loops of nearly standard size: common basic element of protein structure". FEBS Letters. 466 (2–3): 283–286. doi:10.1016/S0014-5793(00)01091-7. ISSN 0014-5793. PMID 10682844.
Trifonov, Edward N.; Berezovsky, Igor N. (2003). "Evolutionary aspects of protein structure and folding". Current Opinion in Structural Biology. 13 (1): 110–114. doi:10.1016/S0959-440X(03)00005-8. ISSN 0959-440X. PMID 12581667. ISI:000181133300015.
Trifonov, Edward N. "Edward N. Trifonov Ph.D." The Institute of Evolution, University of Haifa. Archived from the original on 20 February 2012. Retrieved 20 March 2012.
Trifonov, Edward N. "(Curriculum Vitae)". The Institute of Evolution, University of Haifa. Archived from the original on 28 September 2013. Retrieved 20 March 2012.