A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structural unit are widespread in nature and play critical roles in molecular recognition, signaling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. Here we use computational protein design to investigate the space of folded structures that can be generated by tandem repetition of a simple helix-loop-helix-loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 were monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra closely consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models, with root-mean-square deviations (RMSDs) ranging from 0.7 to 2.5 Å. Our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.
This figure shows the helical repeat protein universe. a, The geometry of a repeat protein can be described by helical parameters: the axial displacement (z), the radius of the helix (r) and the angular displacement, or twist (ω), between repeat units are depicted. b, Designed repeat proteins (grey) cover regions of radius and twist space not found in native repeat protein families (colored). Positive ω values indicate designs forming right-handed helices; negative values indicate left-handed helices. Native families are: ANK, ankyrin; ARM, armadillo; TPR, tetratricopeptide repeat; HAT, half TPR; PPR, pentatricopeptide repeat; HEAT, HEAT repeat; PUM, pumilio homology domain; mTERF, mitochondrial termination factor; TAL, transcription activator-like effector; OTHER, α-helical repeat proteins not in the other families. Designs structurally validated by small-angle X-ray scattering (SAXS) (black) or crystallography (black with red circle) are distributed throughout the space. Top, representative experimentally validated designs with a variety of shapes.
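To make the helical parameters in panel a concrete, the following is a minimal sketch of how z, r and ω could be recovered from the rigid-body transform that superimposes one repeat unit onto the next. It is an illustration under stated assumptions, not the procedure used to produce the figure: the function names `helical_parameters` and `rotation_about_z`, the choice of a single reference point (for example a repeat centroid) to define the radius, the use of radians for ω, and the convention of orienting the axis so that z is non-negative are all introduced here for the example.

```python
import math


def helical_parameters(R, t, p):
    """Return (z, r, omega) for the rigid transform x -> R x + t.

    R is a 3x3 rotation matrix (nested lists), t a translation vector and
    p a reference point in the first repeat unit (assumed here to be its
    centroid). omega is in radians; with the axis oriented so that z >= 0,
    omega > 0 corresponds to a right-handed helix.
    """
    trace = R[0][0] + R[1][1] + R[2][2]
    cos_w = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    omega = math.acos(cos_w)
    sin_w = math.sin(omega)
    if sin_w < 1e-8:
        # Twist of 0 or pi: this simple sketch does not handle these cases.
        raise ValueError("rotation angle near 0 or pi is not supported")

    # Rotation axis from the antisymmetric part of R.
    axis = [
        (R[2][1] - R[1][2]) / (2.0 * sin_w),
        (R[0][2] - R[2][0]) / (2.0 * sin_w),
        (R[1][0] - R[0][1]) / (2.0 * sin_w),
    ]

    # Axial displacement: component of the translation along the axis.
    z = sum(ti * ai for ti, ai in zip(t, axis))
    if z < 0.0:
        # Orient the axis along the direction of travel; the sign of omega
        # then carries the handedness.
        axis = [-a for a in axis]
        z, omega = -z, -omega

    # Displacement of the reference point under the transform.
    moved = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
    d = [m - q for m, q in zip(moved, p)]
    # Its component perpendicular to the axis is a chord of a circle of
    # radius r subtending the angle omega: |d_perp| = 2 r |sin(omega / 2)|.
    d_perp = [di - z * ai for di, ai in zip(d, axis)]
    chord = math.sqrt(sum(c * c for c in d_perp))
    r = chord / (2.0 * abs(math.sin(omega / 2.0)))
    return z, r, omega


def rotation_about_z(angle):
    """Rotation matrix for a counterclockwise turn of `angle` about +z."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Example: a 0.5 rad twist and a 3 Å rise per repeat, with the reference
# point 10 Å from the axis.
z, r, omega = helical_parameters(rotation_about_z(0.5), [0.0, 0.0, 3.0],
                                 [10.0, 0.0, 0.0])
```

Under the sign convention assumed here, applying the same function to a transform with the opposite sense of rotation returns a negative ω, matching the left-handed designs in panel b.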