In many cases the trefoil knot is part of the active site or a ligand-binding site and is critical to the activity of the enzyme in which it appears. Before the discovery of the first knotted protein, it was believed that the process of protein folding could not efficiently produce deep knots in protein backbones. Studies of the folding kinetics of a dimeric protein from Haemophilus influenzae have revealed that the folding of trefoil knot proteins may depend on proline isomerization.[5] Computational algorithms have been developed to identify knotted protein structures, both to canvass the Protein Data Bank for previously undetected natural knots and to identify knots in protein structure predictions, where they are unlikely to accurately reproduce the native-state structure due to the rarity of knots in known proteins.[6]
Knottins are small, diverse and stable proteins with important drug design potential. They can be classified into 30 families which cover a wide range of sequences (1621 sequenced), three-dimensional structures (155 solved) and functions (> 10). Inter-knottin similarity lies mainly between 20% and 40% sequence identity and 1.5 to 4 Å backbone deviations, although they all share a tightly knotted disulfide core. This important variability is likely to arise from the highly diverse loops which connect the successive knotted cysteines. The prediction of structural models for all knottin sequences would open new directions for the analysis of interaction sites and provide a better understanding of the structural and functional organization of proteins sharing this scaffold.[7]
Trefoil domain
Structure of pancreatic spasmolytic polypeptide.[8]
The trefoil (P-type) domain is a cysteine-rich domain of approximately forty-five amino-acid residues that has been found in some extracellular eukaryotic proteins.[9][10][11][12] It is known as the 'P', 'trefoil' or 'TFF' domain, and contains six cysteines linked by three disulphide bonds with connectivity 1–5, 2–4, 3–6.
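The 1–5, 2–4, 3–6 connectivity can be read as a simple mapping from cysteine order to residue positions. The following Python sketch illustrates that mapping only; the function names and the toy sequence are hypothetical and chosen for illustration, and the code assumes a sequence with exactly six cysteines rather than detecting a trefoil domain.

```python
# Disulphide connectivity of the trefoil (P-type) domain as stated in the text:
# cysteines are numbered 1..6 in sequence order and paired 1-5, 2-4, 3-6.
TREFOIL_CONNECTIVITY = [(1, 5), (2, 4), (3, 6)]


def cysteine_positions(sequence):
    """Return the 1-based residue positions of every cysteine ('C')."""
    return [i for i, residue in enumerate(sequence.upper(), start=1) if residue == "C"]


def trefoil_disulphide_pairs(sequence):
    """Map the 1-5, 2-4, 3-6 pattern onto residue positions.

    Raises ValueError unless the sequence contains exactly six cysteines.
    """
    positions = cysteine_positions(sequence)
    if len(positions) != 6:
        raise ValueError(f"expected 6 cysteines, found {len(positions)}")
    return [(positions[a - 1], positions[b - 1]) for a, b in TREFOIL_CONNECTIVITY]


# Hypothetical toy sequence (NOT a real trefoil domain), used only to show the mapping.
toy_sequence = "AACAAACAACAAAACACAAAC"
print(trefoil_disulphide_pairs(toy_sequence))  # [(3, 17), (7, 15), (10, 21)]
```

In the toy sequence the cysteines fall at positions 3, 7, 10, 15, 17 and 21, so the first cysteine pairs with the fifth (3 with 17), the second with the fourth (7 with 15), and the third with the sixth (10 with 21).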
The domain has been found in a variety of extracellular eukaryotic proteins,[9][11][12] including protein pS2 (TFF1), a protein secreted by the stomach mucosa; spasmolytic polypeptide (SP) (TFF2), a protein of about 115 residues that inhibits gastrointestinal motility and gastric acid secretion; intestinal trefoil factor (ITF) (TFF3); Xenopus laevis stomach proteins xP1 and xP4; Xenopus integumentary mucins A.1 (preprospasmolysin) and C.1, proteins which may be involved in defense against microbial infections by protecting the epithelia from the external environment; Xenopus skin protein xP2 (or APEG); zona pellucida sperm-binding protein B (ZP-B); intestinal sucrase-isomaltase (EC 3.2.1.48 / EC 3.2.1.10), a vertebrate membrane-bound, multifunctional enzyme complex which hydrolyzes sucrose, maltose and isomaltose; and lysosomal alpha-glucosidase (EC 3.2.1.20).
Examples
Human genes encoding proteins containing the trefoil domain include:
Mallam, Anna L.; Jackson, Sophie E. (2006). "Probing Nature's Knots: The Folding Pathway of a Knotted Homodimeric Protein". Journal of Molecular Biology. 359 (5): 1420–1436. doi:10.1016/j.jmb.2006.04.032. PMID 16787779.
Gajhede M, Petersen TN, Henriksen A, et al. (December 1993). "Pancreatic spasmolytic polypeptide: first three-dimensional structure of a member of the mammalian trefoil family of peptides". Structure. 1 (4): 253–62. doi:10.1016/0969-2126(93)90014-8. PMID 8081739.
Hoffmann W, Hauser F (1993). "The P-domain or trefoil motif: a role in renewal and pathology of mucous epithelia?". Trends in Biochemical Sciences. 18 (7): 239–243. doi:10.1016/0968-0004(93)90170-R. PMID 8267796.
Tkaczuk KL, Dunin-Horkawicz S, Purta E, Bujnicki JM (2007). "Structural and evolutionary bioinformatics of the SPOUT superfamily of methyltransferases". BMC Bioinformatics. 8: 73.