Dec 5, 2024
2:30pm - 2:45pm
Hynes, Level 1, Room 103
Thalyta Santiago1,2, Brian Carrick2, Melody Morris2, Marisa Beppu1, Bradley Olsen2
Universidade Estadual de Campinas1, Massachusetts Institute of Technology2
Intrinsically disordered proteins (IDPs) can undergo liquid-liquid phase separation (LLPS) in cells, forming biomolecular coacervates that facilitate complex biological processes. The ability of an IDP to undergo LLPS is frequently associated with low-complexity regions, which are characterized by low amino acid diversity and often by repetitive sequences. Motivated by this, candidate consensus repeat sequences of IDPs were identified from the DisProt database. Amino acids were first grouped into four classes based on charge and hydrophilicity, giving a simplified representation of each protein. The simplified sequences were then aligned at various repeat unit cassette lengths and compared using a one-hot encoding strategy, revealing consensus repeat units intrinsic to the protein sequence. These reduced motifs were manually transcribed back to base amino acids and expressed in <i>E. coli</i>. Phase separation of the expressed proteins was confirmed <i>via</i> turbidimetry, and the engineered proteins exhibited coacervation behavior similar to that of the native protein. We propose that additional consensus repeat sequences can be identified and extracted in the same way to further develop a platform of engineered protein materials based on these systems.
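The sequence-reduction and repeat-comparison steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which residues fall into which of the four classes, how cassettes are aligned, or how the consensus is scored, so the class table `CLASS_OF`, the non-overlapping tiling, the per-position majority vote, and the function names (`reduce_sequence`, `one_hot`, `consensus_repeat`) are all assumptions made for illustration.

```python
import numpy as np

# ASSUMPTION: a hypothetical four-class grouping by charge and hydrophilicity.
# The abstract does not specify the actual residue-to-class assignment.
CLASSES = "+-ph"  # positive, negative, polar uncharged, hydrophobic
CLASS_OF = {}
for residues, label in (("KRH", "+"), ("DE", "-"),
                        ("STNQGPCY", "p"), ("AVLIMFW", "h")):
    for aa in residues:
        CLASS_OF[aa] = label


def reduce_sequence(seq):
    """Map each amino acid to its class symbol (simplified representation)."""
    return "".join(CLASS_OF[aa] for aa in seq.upper())


def one_hot(reduced):
    """One-hot encode a reduced sequence as an array of shape (len, 4)."""
    idx = np.array([CLASSES.index(c) for c in reduced], dtype=int)
    return np.eye(len(CLASSES), dtype=int)[idx]


def consensus_repeat(seq, length):
    """Tile the reduced sequence into non-overlapping cassettes of `length`
    and take a per-position majority vote over the one-hot encodings.

    Returns (consensus, score), where score is the fraction of positions
    across all cassettes that agree with the consensus (1.0 = perfect repeat).
    ASSUMPTION: tiling from position 0 and majority voting are stand-ins for
    the alignment and comparison described in the abstract.
    """
    reduced = reduce_sequence(seq)
    n = len(reduced) // length
    if n == 0:
        raise ValueError("cassette length exceeds sequence length")
    encoded = one_hot(reduced[: n * length]).reshape(n, length, len(CLASSES))
    counts = encoded.sum(axis=0)  # class counts per cassette position
    consensus = "".join(CLASSES[i] for i in counts.argmax(axis=1))
    score = counts.max(axis=1).sum() / (n * length)
    return consensus, score


if __name__ == "__main__":
    toy = "GRGDSP" * 4  # toy repeat, not a sequence from the study
    for k in range(2, 8):
        print(k, consensus_repeat(toy, k))
```

Scanning a range of cassette lengths and keeping those with the highest agreement score is one simple way to surface a candidate repeat unit. Converting the reduced motif back to specific amino acids was done manually in the work described, so that step is not sketched here.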