Benchmarking of AlphaFold2 accuracy self-estimates as indicators of empirical model quality and ranking: a comparison with independent model quality assessment programmes

Text (Open Access) - Published Version
· Available under License Creative Commons Attribution.

It is advisable to refer to the publisher's version if you intend to cite from this work. See Guidance on citing.

Edmunds, N. S., Genc, A. G. and McGuffin, L. J. ORCID: https://orcid.org/0000-0003-4501-4767 (2024) Benchmarking of AlphaFold2 accuracy self-estimates as indicators of empirical model quality and ranking: a comparison with independent model quality assessment programmes. Bioinformatics, 40 (8). btae491. ISSN 1460-2059 doi: 10.1093/bioinformatics/btae491

Abstract/Summary

Motivation: Despite an increase in protein modelling accuracy following the development of AlphaFold2, there remains an accuracy gap between predicted and observed model quality assessment (MQA) scores. In CASP15, variations in AlphaFold2 model accuracy prediction were noticed for quaternary models of very similar observed quality. In this study, we compare plDDT and pTM to their observed counterparts the local distance difference test (lDDT) and TM-score for both tertiary and quaternary models to examine whether reliability is retained across the scoring range under normal modelling conditions and in situations where AlphaFold2 functionality is customized. We also explore plDDT and pTM ranking accuracy in comparison with the published independent MQA programmes ModFOLD9 and ModFOLDdock.

Results: plDDT was found to be an accurate descriptor of tertiary model quality compared to observed lDDT-Cα scores (Pearson r = 0.97), and achieved a ranking agreement true positive rate (TPR) of 0.34 with observed scores, which ModFOLD9 could not improve. However, quaternary structure accuracy was reduced (plDDT r = 0.67, pTM r = 0.70) and significant overprediction was seen with both scores for some lower quality models. Additionally, ModFOLDdock was able to improve upon AF2-Multimer model ranking compared to TM-score (TPR 0.34) and oligo-lDDT score (TPR 0.43). Finally, evidence is presented for increased variability in plDDT and pTM when using custom template recycling, which is more pronounced for quaternary structures.
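
The two kinds of comparison reported in the abstract, correlation between predicted and observed scores and agreement between predicted and observed rankings, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names are hypothetical, and the abstract does not define how the ranking agreement TPR is calculated, so the top-k overlap used here is only an assumed stand-in for a TPR-style measure.

```python
import math


def pearson_r(predicted, observed):
    """Pearson correlation between predicted and observed scores."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_p * var_o)


def top_k_agreement(predicted, observed, k):
    """Fraction of the k best models by observed score that are also among
    the k best by predicted score.

    ASSUMPTION: an illustrative TPR-style measure, not necessarily the
    definition used in the paper.
    """
    if len(predicted) != len(observed) or not 0 < k <= len(predicted):
        raise ValueError("k must be between 1 and the number of models")
    indices = range(len(predicted))
    top_pred = set(sorted(indices, key=lambda i: predicted[i], reverse=True)[:k])
    top_obs = set(sorted(indices, key=lambda i: observed[i], reverse=True)[:k])
    return len(top_pred & top_obs) / k


if __name__ == "__main__":
    # Hypothetical scores for five models of one target (not data from the paper).
    plddt = [91.0, 84.5, 77.2, 63.0, 55.8]  # predicted, 0-100 scale
    lddt = [88.1, 86.0, 70.4, 66.3, 41.9]   # observed, rescaled to 0-100
    print(f"Pearson r: {pearson_r(plddt, lddt):.2f}")
    print(f"Top-2 agreement: {top_k_agreement(plddt, lddt, 2):.2f}")
```

A correlation summarises agreement across the whole scoring range, whereas a ranking measure asks only whether the best models are identified, which is why the two can diverge.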

Item Type	Article
URI	https://reading-clone.eprints-hosting.org/id/eprint/117806
Identification Number/DOI	10.1093/bioinformatics/btae491
Refereed	Yes
Divisions	Interdisciplinary centres and themes > Institute for Cardiovascular and Metabolic Research (ICMR); Life Sciences > School of Biological Sciences > Biomedical Sciences
Publisher	Oxford University Press

Deposit Details

Date Deposited:	28 Aug 2024 09:55
Last Modified:	23 Mar 2025 02:30