

Poster

Protein Fitness Landscape: Spectral Graph Theory Perspective

Danqi Liao · Daniel Steinberg


Abstract:

In this work, we present a novel theoretical framework for analyzing and modeling protein fitness landscapes using spectral graph theory. By representing the protein sequence space as a generalized Hamming graph and studying its spectral properties, we derive a set of tools for quantifying the ruggedness, epistasis, and other key characteristics of the landscape. We prove approximation and sampling results showing that the landscape can be efficiently learned and optimized from limited and noisy data. Building on this foundation, we introduce Propagational Convolutional Neural Networks (PCNNs), a new class of inductive surrogate oracles. We provide theoretical guarantees on the generalization and convergence properties of PCNNs, using techniques from the Neural Tangent Kernel framework. Experiments on real-world protein engineering tasks show that PCNNs outperform state-of-the-art methods, achieving higher fitness and better generalization from limited data.
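
To make the Hamming-graph view concrete, here is a minimal sketch in Python. It is not the authors' code and does not implement PCNNs. It assumes the simplest special case, the ordinary Hamming graph H(L, q) with the same alphabet size q at each of L sites, and it uses the Rayleigh quotient of the graph Laplacian as an illustrative ruggedness proxy, which is an assumption rather than the definition used in the paper. The names `hamming_laplacian` and `rayleigh_quotient` are hypothetical.

```python
import itertools
from math import comb

import numpy as np


def hamming_laplacian(length, alphabet_size):
    """Return all sequences and the Laplacian of the Hamming graph H(length, alphabet_size)."""
    seqs = np.array(list(itertools.product(range(alphabet_size), repeat=length)))
    # Pairwise Hamming distances; two sequences are adjacent iff they differ at one site.
    dist = (seqs[:, None, :] != seqs[None, :, :]).sum(axis=2)
    adjacency = (dist == 1).astype(float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return seqs, laplacian


def rayleigh_quotient(laplacian, fitness):
    """Illustrative ruggedness proxy (an assumption): f^T L f / f^T f for mean-centered f."""
    centered = fitness - fitness.mean()
    return float(centered @ laplacian @ centered / (centered @ centered))


L_SITES, Q = 3, 4  # toy sizes; real proteins use Q = 20 and far larger L
seqs, lap = hamming_laplacian(L_SITES, Q)

# The Laplacian spectrum of H(L, q) is q*k with multiplicity C(L, k) * (q - 1)**k.
eigvals = np.rint(np.linalg.eigvalsh(lap)).astype(int)
observed = {int(v): int(c) for v, c in zip(*np.unique(eigvals, return_counts=True))}
expected = {Q * k: comb(L_SITES, k) * (Q - 1) ** k for k in range(L_SITES + 1)}

rng = np.random.default_rng(0)

# Additive landscape (no epistasis): fitness is a sum of independent per-site effects.
site_effects = rng.normal(size=(L_SITES, Q))
additive = site_effects[np.arange(L_SITES), seqs].sum(axis=1)

# Fully random landscape: every sequence gets an independent fitness value.
rugged = rng.normal(size=len(seqs))

additive_rq = rayleigh_quotient(lap, additive)
rugged_rq = rayleigh_quotient(lap, rugged)
print(observed, additive_rq, rugged_rq)
```

In this toy setting, a mean-centered additive landscape lies entirely in the lowest nonzero eigenspace, so its quotient equals q, while a landscape with higher-order interactions spreads energy over larger eigenvalues and scores higher. The dense construction only scales to tiny L and q; it is meant to illustrate the spectrum, not to be used on real sequence spaces.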
