Excellence in Research: De Novo Protein Modeling, Beyond Topology Determination Problem

  • Al Nasr, Kamal Hasan (PI)
  • Chen, Wei (CoPI)

Project: Research project

Project Details

Description

Three-dimensional (3D) structure has a large impact on the behavior of biological systems. For example, proteins can speed up reactions, recognize and kill pathogens, or shuttle material into and out of a cell. A protein's structure helps determine its ultimate function. Viruses are protected by proteins that form an outer coating, creating a viral particle. The structure of viral particles helps determine how effective a virus is in slipping past the immune system and infecting cells. Machine learning will be employed to try to understand the relationship between a protein's 3D structure and its function. This project is expected to contribute to protein modeling and drug design. It will also provide research and educational opportunities for students at Tennessee State University (TSU), an HBCU, increasing the diversity of the workforce prepared to contribute to the emerging biotechnology economy.

Most pharmaceutical drugs target membrane proteins. Conventional structure determination techniques such as X-ray crystallography, as well as traditional computational techniques, are unsuccessful with many types of proteins, such as membrane proteins, proteins that are hard to crystallize, and macromolecular proteins. Cryo-electron microscopy (cryo-EM) is a relatively new biophysics technique with the capability to generate volumetric images of macromolecular complexes and ensembles like ribosomes and viral capsids. Cryo-EM images are typically large and noisy. The project will investigate a novel idea to reduce the size of the image and build volume components using a geometric graph model that can drastically reduce the complexity of the problem. The volume components will be analyzed and classified to identify the secondary structure elements (SSEs) of proteins using advanced machine learning and geometric properties of the volume components. The goal of this project is to develop efficient algorithms in a fully automatic framework to identify the spatial information of the SSEs. The project has three objectives: 1) Build a new mathematical model based on geometric graph theory to reduce the size of cryo-EM images. 2) Develop efficient algorithms, including convolutional neural networks and other machine learning approaches, to process, analyze, and extract the SSEs from the volumetric images of protein macromolecules. 3) Integrate the new methods with an existing molecular visualization and modeling platform such as Chimera. If successful, this project could advance the state-of-the-art methods recently developed in protein modeling using cryo-EM by providing an automated framework that is less sensitive to noise and is segmentation-free.
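
As a rough illustration of the general idea of turning a volumetric image into a smaller graph of volume components, the following minimal sketch may help. It is not the project's model: the density threshold, the face-adjacency rule, the bounding-box descriptor, and all function names are assumptions chosen only for illustration, and the toy volume is made-up data.

```python
from collections import deque


def density_to_graph(volume, threshold):
    """Keep voxels whose density exceeds `threshold` as graph nodes and
    connect face-adjacent nodes (hypothetical reduction rule)."""
    nodes = {
        (x, y, z)
        for x, plane in enumerate(volume)
        for y, row in enumerate(plane)
        for z, value in enumerate(row)
        if value > threshold
    }
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    graph = {node: [] for node in nodes}
    for (x, y, z) in nodes:
        for dx, dy, dz in steps:
            neighbor = (x + dx, y + dy, z + dz)
            if neighbor in nodes:
                graph[(x, y, z)].append(neighbor)
    return graph


def connected_components(graph):
    """Group nodes into connected 'volume components' with breadth-first search."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def bounding_box_extent(component):
    """A crude geometric descriptor: voxel extent of the component along x, y, z."""
    return tuple(
        max(c[i] for c in component) - min(c[i] for c in component) + 1
        for i in range(3)
    )


# Toy 4x4x4 volume (made-up data): a rod along z, a small blob, and weak noise.
size = 4
volume = [[[0.0] * size for _ in range(size)] for _ in range(size)]
for z in range(size):
    volume[0][0][z] = 1.0
volume[3][3][2] = 0.8
volume[3][3][3] = 0.9
volume[2][2][0] = 0.1  # below the threshold, so it is dropped

graph = density_to_graph(volume, threshold=0.5)
components = connected_components(graph)
extents = [bounding_box_extent(c) for c in components]
```

In a sketch like this, an elongated component (here the rod) could later be passed to a classifier as a candidate SSE; the actual features and learning methods used by the project are not specified in this description.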

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Status: Finished
Effective start/end date: 9/1/22 to 8/31/25

Funding

  • National Science Foundation: $535,066.00
