Benchmarking of AlphaFold2 accuracy self-estimates as indicators of empirical model quality and ranking: a comparison with independent model quality assessment programmes

[thumbnail of Open Access]
Preview
Text (Open Access) - Published Version
· Available under License Creative Commons Attribution.
· Please see our End User Agreement before downloading.
| Preview
Available under license: Creative Commons Attribution

Please see our End User Agreement.

It is advisable to refer to the publisher's version if you intend to cite from this work. See Guidance on citing.

Add to AnyAdd to TwitterAdd to FacebookAdd to LinkedinAdd to PinterestAdd to Email

Edmunds, N. S., Genc, A. G. and McGuffin, L. J. orcid id iconORCID: https://orcid.org/0000-0003-4501-4767 (2024) Benchmarking of AlphaFold2 accuracy self-estimates as indicators of empirical model quality and ranking: a comparison with independent model quality assessment programmes. Bioinformatics, 40 (8). btae491. ISSN 1460-2059 doi: 10.1093/bioinformatics/btae491

Abstract/Summary

Motivation Despite an increase in protein modelling accuracy following the development of AlphaFold2, there remains an accuracy gap between predicted and observed model quality assessment (MQA) scores. In CASP15, variations in AlphaFold2 model accuracy prediction were noticed for quaternary models of very similar observed quality. In this study, we compare plDDT and pTM to their observed counterparts the local distance difference test (lDDT) and TM-score for both tertiary and quaternary models to examine whether reliability is retained across the scoring range under normal modelling conditions and in situations where AlphaFold2 functionality is customized. We also explore plDDT and pTM ranking accuracy in comparison with the published independent MQA programmes ModFOLD9 and ModFOLDdock. Results plDDT was found to be an accurate descriptor of tertiary model quality compared to observed lDDT-Cα scores (Pearson r = 0.97), and achieved a ranking agreement true positive rate (TPR) of 0.34 with observed scores, which ModFOLD9 could not improve. However, quaternary structure accuracy was reduced (plDDT r = 0.67, pTM r = 0.70) and significant overprediction was seen with both scores for some lower quality models. Additionally, ModFOLDdock was able to improve upon AF2-Multimer model ranking compared to TM-score (TPR 0.34) and oligo-lDDT score (TPR 0.43). Finally, evidence is presented for increased variability in plDDT and pTM when using custom template recycling, which is more pronounced for quaternary structures.

Altmetric Badge

Item Type Article
URI https://reading-clone.eprints-hosting.org/id/eprint/117806
Identification Number/DOI 10.1093/bioinformatics/btae491
Refereed Yes
Divisions Interdisciplinary centres and themes > Institute for Cardiovascular and Metabolic Research (ICMR)
Life Sciences > School of Biological Sciences > Biomedical Sciences
Publisher Oxford University Press
Download/View statistics View download statistics for this item

Downloads

Downloads per month over past year

University Staff: Request a correction | Centaur Editors: Update this record

Search Google Scholar