The influence of prompt engineering on large language models for protein–protein interaction identification in biomedical literature

Yung Chun Chang, Ming Siang Huang, Yi Hsuan Huang, Yi Hsuan Lin

Research output: Contribution to journal, Article, peer-review

Abstract

Identifying protein–protein interactions (PPIs) is a foundational task in biomedical natural language processing. While specialized models have been developed, the potential of general-domain large language models (LLMs) in PPI extraction, particularly for researchers without computational expertise, remains unexplored. This study evaluates the effectiveness of proprietary LLMs (GPT-3.5, GPT-4, and Google Gemini) in PPI prediction through systematic prompt engineering. We designed six prompting scenarios of increasing complexity, from basic interaction queries to sophisticated entity-tagged formats, and assessed model performance across multiple benchmark datasets (LLL, IEPA, HPRD50, AIMed, BioInfer, and PEDD). Carefully designed prompts effectively guided LLMs in PPI prediction. Gemini 1.5 Pro achieved the highest performance across most datasets, with notable F1-scores in LLL (90.3%), IEPA (68.2%), HPRD50 (67.5%), and PEDD (70.2%). GPT-4 showed competitive performance, particularly in the LLL dataset (87.3%). We identified and addressed a positive prediction bias, demonstrating improved performance after evaluation refinement. While not surpassing specialized models, general-purpose LLMs with appropriate prompting strategies can effectively perform PPI prediction tasks, offering valuable tools for biomedical researchers without extensive computational expertise.
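As an illustration of the entity-tagged prompting and F1 evaluation the abstract describes, here is a minimal sketch. It is not the authors' code: the `<P1>`/`<P2>` tags, the prompt wording, the function names, and the example sentence are all hypothetical, chosen only to show the general shape of the approach. The last lines show how a model biased toward answering "Yes" can reach perfect recall while precision, and therefore F1, suffers.

```python
from typing import List, Tuple


def tag_entities(sentence: str, protein_a: str, protein_b: str) -> str:
    """Wrap the first mention of each candidate protein in marker tags.

    The <P1>/<P2> tags are a hypothetical format, not the one used in the study.
    """
    tagged = sentence.replace(protein_a, f"<P1>{protein_a}</P1>", 1)
    return tagged.replace(protein_b, f"<P2>{protein_b}</P2>", 1)


def build_prompt(sentence: str, protein_a: str, protein_b: str) -> str:
    """Build a yes/no interaction query around the tagged sentence (assumed wording)."""
    tagged = tag_entities(sentence, protein_a, protein_b)
    return (
        "Does the sentence state that the proteins marked <P1> and <P2> interact? "
        "Answer Yes or No.\n"
        f"Sentence: {tagged}"
    )


def precision_recall_f1(gold: List[bool], predicted: List[bool]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for binary interaction labels."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


prompt = build_prompt("RAD51 binds BRCA2 in vitro.", "RAD51", "BRCA2")

# A model that always answers "Yes" finds every true pair but also
# accepts every non-interacting pair.
gold = [True, False, False, True]
always_yes = [True, True, True, True]
precision, recall, f1 = precision_recall_f1(gold, always_yes)
```

Scoring precision and recall separately, rather than F1 alone, makes this kind of bias visible before any refinement of the evaluation.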

Original language: English
Article number: 15493
Journal: Scientific Reports
Volume: 15
Issue number: 1
Publication status: Published - Dec 2025

Keywords

  • Large language model
  • Natural language processing
  • Protein–protein interaction
  • Relation extraction

ASJC Scopus subject areas

  • General
