TY - JOUR
T1 - Structure-based learning to predict and model protein-DNA interactions and transcription-factor co-operativity in cis-regulatory elements
AU - Fornes, Oriol
AU - Meseguer, Alberto
AU - Aguirre-Plans, Joaquim
AU - Gohl, Patrick
AU - Bota, Patricia M.
AU - Molina Fernández, Ruben
AU - Bonet, Jaume
AU - Chinchilla Hernandez, Altair
AU - Pegenaute, Ferran
AU - Gallego, Oriol
AU - Fernandez Fuentes, Narcis
AU - Oliva, Baldo
N1 - © The Author(s) 2024. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
PY - 2024/6/12
Y1 - 2024/6/12
AB - Transcription factor (TF) binding is a key component of genomic regulation. There are numerous high-throughput experimental methods to characterize TF-DNA binding specificities. Their application, however, is both laborious and expensive, which makes profiling all TFs challenging. For instance, the binding preferences of ∼25% of human TFs remain unknown; they have neither been determined experimentally nor inferred computationally. We introduce a structure-based learning approach to predict the binding preferences of TFs and to automate the modelling of TF regulatory complexes. We show the advantage of our approach over the classical nearest-neighbor prediction in the limits of remote homology. Starting from a TF sequence or structure, we predict binding preferences in the form of motifs that are then used to scan a DNA sequence for occurrences. The best matches are either profiled with a binding score or collected for subsequent modelling into a higher-order regulatory complex with DNA. Co-operativity is modelled by (i) the co-localization of TFs and (ii) the structural modelling of protein-protein interactions between TFs and with co-factors. We have applied our approach to automatically model the interferon-β enhanceosome and the pioneering complexes of OCT4, SOX2 (or SOX11) and KLF4 with a nucleosome, which are compared with the experimentally known structures.
UR - http://www.scopus.com/inward/record.url?scp=85196121835&partnerID=8YFLogxK
U2 - 10.1093/nargab/lqae068
DO - 10.1093/nargab/lqae068
M3 - Article
C2 - 38867914
SN - 2631-9268
VL - 6
JO - NAR Genomics and Bioinformatics
JF - NAR Genomics and Bioinformatics
IS - 2
M1 - lqae068
ER -