Abnormal protein–membrane attachment is implicated in deregulated cellular pathways and in disease. The ability to modulate protein–membrane interactions therefore represents a promising new therapeutic strategy. A major obstacle to this drug design strategy is that the membrane-binding domains of peripheral membrane proteins are usually unknown. Fast and efficient algorithms that predict the protein–membrane interface would shed light on the accessibility of membrane–protein interfaces to drug-like molecules. Herein, we describe an ensemble machine learning methodology and algorithm for predicting membrane-penetrating amino acids. We use available experimental data from the literature to train 21 machine learning classifiers and meta-classifiers. The best ensemble classifier achieves a macro-averaged F1 score of 0.92 and a Matthews correlation coefficient of 0.84 for correctly predicting membrane-penetrating amino acids in unseen proteins of a validation set. The Python code for predicting protein–membrane interfaces of peripheral membrane proteins is available at https://github.com/zoecournia/DREAMM.
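For readers unfamiliar with the two reported metrics, the following minimal sketch shows how a macro-averaged F1 score and a Matthews correlation coefficient can be computed for a binary per-residue labeling (1 = membrane-penetrating, 0 = not). It is an illustration only and is not taken from the DREAMM code base; the function names and the toy labels are hypothetical.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores of both classes."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    f1_pos = f1_from_counts(tp, fp, fn)
    # For the negative class the roles swap: its "TP" is tn,
    # its "FP" is fn, and its "FN" is fp.
    f1_neg = f1_from_counts(tn, fn, fp)
    return (f1_pos + f1_neg) / 2


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is 0."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy labels for eight residues (not data from the paper).
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(macro_f1(toy_true, toy_pred), mcc(toy_true, toy_pred))
```

Both metrics are commonly preferred over plain accuracy when one class is much rarer than the other, which is likely the case here, since membrane-penetrating residues typically make up a small fraction of a protein.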
Predicting protein-membrane interfaces of peripheral membrane proteins using ensemble machine learning.
Chatzigoulas A, Cournia Z. Brief Bioinform. 2022 Mar 10;23(2):bbab518. doi: 10.1093/bib/bbab518.