TY - JOUR
T1 - Leveraging transformers-based language models in proteome bioinformatics
AU - Le, Nguyen Quoc Khanh
N1 - Publisher Copyright: © 2023 Wiley-VCH GmbH.
PY - 2023/12
Y1 - 2023/12
AB - In recent years, the rapid growth of biological data has increased interest in using bioinformatics to analyze and interpret this data. Proteomics, which studies the structure, function, and interactions of proteins, is a crucial area of bioinformatics. Using natural language processing (NLP) techniques in proteomics is an emerging field that combines machine learning and text mining to analyze biological data. Recently, transformer-based NLP models have gained significant attention for their ability to process variable-length input sequences in parallel, using self-attention mechanisms to capture long-range dependencies. In this review paper, we discuss the recent advancements in transformer-based NLP models in proteome bioinformatics and examine their advantages, limitations, and potential applications to improve the accuracy and efficiency of various tasks. Additionally, we highlight the challenges and future directions of using these models in proteome bioinformatics research. Overall, this review provides valuable insights into the potential of transformer-based NLP models to revolutionize proteome bioinformatics.
KW - bioinformatics
KW - deep learning
KW - drug discovery
KW - explainable artificial intelligence
KW - natural language processing
KW - protein expression
KW - protein function prediction
KW - transformer attention
UR - http://www.scopus.com/inward/record.url?scp=85163661049&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85163661049&partnerID=8YFLogxK
U2 - 10.1002/pmic.202300011
DO - 10.1002/pmic.202300011
M3 - Review article
C2 - 37381841
AN - SCOPUS:85163661049
SN - 1615-9853
VL - 23
JO - Proteomics
JF - Proteomics
IS - 23-24
M1 - 2300011
ER -