2023 LLMInstructionExampleAdaptivePr


Subject Headings: Biomedical Relation Extraction, LEAP Framework, Complex Relation Extraction.

Notes

Cited By

Quotes

Abstract

Objective
To investigate demonstration-based prompting in Large Language Models (LLMs) for clinical relation extraction. We focus on examining two types of adaptive demonstration, instruction adaptive prompting and example adaptive prompting, to understand their impact and effectiveness.
Materials and Methods
The study unfolds in two stages. Initially, we explored a range of demonstration components vital to LLMs’ clinical data extraction, such as task descriptions and examples, and tested their combinations. Subsequently, we introduced the Instruction-Example Adaptive Prompting (LEAP) Framework, a system that integrates two types of adaptive prompts: one preceding the task instruction and another preceding the examples. The framework is designed to systematically explore both adaptive task descriptions and adaptive examples within the demonstration. We evaluated the LEAP framework’s performance on the DDI and BC5CDR chemical interaction datasets, applying it across LLMs such as Llama2-7b, Llama2-13b, and MedLLaMA_13B.
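Below is a minimal Python sketch of how such a demonstration might be assembled, assuming an “Instruction + Options + Examples” layout with the framework’s two insertion points. This is not the authors’ released code; the function name build_leap_prompt, the adaptive texts, and the DDI-style option labels are illustrative assumptions.

```python
# Minimal sketch (not the authors' released code) of composing an
# "Instruction + Options + Examples" demonstration with two adaptive
# insertion points. All names and texts here are illustrative.

def build_leap_prompt(task_instruction: str,
                      options: list[str],
                      examples: list[tuple[str, str]],
                      query: str,
                      adaptive_instruction: str = "",
                      adaptive_example_prefix: str = "") -> str:
    """Compose a prompt with optional adaptive prompts: one preceding
    the task instruction and another preceding the examples."""
    parts = []
    if adaptive_instruction:           # instruction adaptive prompting
        parts.append(adaptive_instruction)
    parts.append(task_instruction)
    parts.append("Options: " + ", ".join(options))
    if adaptive_example_prefix:        # example adaptive prompting
        parts.append(adaptive_example_prefix)
    for sentence, label in examples:   # in-context demonstrations
        parts.append(f"Input: {sentence}\nRelation: {label}")
    parts.append(f"Input: {query}\nRelation:")  # the instance to label
    return "\n\n".join(parts)


# Hypothetical usage with DDI-style relation options.
prompt = build_leap_prompt(
    task_instruction="Classify the drug-drug interaction expressed in the sentence.",
    options=["mechanism", "effect", "advise", "int", "none"],
    examples=[("Aspirin may increase the anticoagulant effect of warfarin.", "effect")],
    query="Ketoconazole inhibits the metabolism of terfenadine.",
    adaptive_instruction="You are an expert in biomedical relation extraction.",
    adaptive_example_prefix="The following solved cases illustrate the task.",
)
print(prompt)
```

The two optional arguments mirror the two adaptive placements the abstract describes: adaptive_instruction precedes the task instruction (instruction adaptive prompting), while adaptive_example_prefix precedes the in-context examples (example adaptive prompting).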
Results
The study revealed that the Instruction + Options + Examples mode and its expanded form substantially raised F1-scores over the standard Instruction + Options mode. The LEAP framework excelled, especially with example adaptive prompting, which outperformed traditional instruction tuning across models. Notably, the MedLLaMA_13B model scored an impressive 95.13 F1 on the BC5CDR dataset with this method. Significant improvements were also seen on the DDI 2013 dataset, confirming the method’s robustness in sophisticated data extraction.
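For context on the scores above, relation extraction results on corpora such as DDI and BC5CDR are conventionally reported as micro-averaged F1 over the non-negative relation labels. A minimal sketch of that standard computation follows; it is not the paper’s own scorer, and the negative-label name "none" is an assumption.

```python
# Standard micro-averaged F1 over non-negative relation labels, as is
# typical for DDI/BC5CDR-style scoring (a sketch, not the paper's scorer).

def micro_f1(gold: list[str], pred: list[str], negative: str = "none") -> float:
    """Micro F1 in which a misclassified non-negative relation counts as
    both a false positive and a false negative."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p and g != negative)
    fp = sum(1 for g, p in zip(gold, pred) if p != negative and p != g)
    fn = sum(1 for g, p in zip(gold, pred) if g != negative and g != p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)


# Example: one correct non-negative label, one miss -> F1 = 0.5.
print(micro_f1(["effect", "none", "advise"], ["effect", "none", "int"]))
```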
Conclusion
The LEAP framework presents a promising avenue for refining LLM training strategies, steering away from extensive fine-tuning towards more contextually rich and dynamic prompting methodologies.

References

  1. (Y. Wang et al., 2018) ⇒ Y. Wang, L. Wang, M. Rastegar-Mojarad, S. Moon, F. Shen, N. Afzal, S. Liu, Y. Zeng, S. Mehrabi, S. Sohn, and H. Liu. (2018). “Clinical information extraction applications: A literature review.” In: Journal of Biomedical Informatics, 77, 34–49.
  2. (Y. Gu et al., 2022) ⇒ Y. Gu, R. Tinn, H. Cheng, M. Lucas, N. Usuyama, X. Liu, T. Naumann, J. Gao, and H. Poon. (2022). “Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing.” In: ACM Transactions on Computing for Healthcare, 3(1), 1–23.
  3. (J. Lee et al., 2020) ⇒ J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang. (2020). “BioBERT: A pre-trained biomedical language representation model for biomedical text mining.” In: Bioinformatics, 36(4), 1234–1240.
  4. (A. Roy & S. Pan, 2021) ⇒ A. Roy and S. Pan. (2021). “Incorporating medical knowledge in BERT for clinical relation extraction.” In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 5357–5366.
  5. (I. Beltagy et al., 2019) ⇒ I. Beltagy, K. Lo, and A. Cohan. (2019). “SciBERT: A Pretrained Language Model for Scientific Text.” In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 3613–3618.
  6. (M. Yasunaga et al., 2022) ⇒ M. Yasunaga, J. Leskovec, and P. Liang. (2022). “LinkBERT: Pretraining Language Models with Document Links.” In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 8003–8016.
  7. (H. Zhou et al., 2023) ⇒ H. Zhou, R. Austin, S.-C. Lu, G. M. Silverman, Y. Zhou, H. Kilicoglu, H. Xu, and R. Zhang. (2023). “Complementary and Integrative Health Information in the literature: Its lexicon and named entity recognition.” In: Journal of the American Medical Informatics Association, ocad216.
  8. (J. Clusmann et al., 2023) ⇒ J. Clusmann, F. R. Kolbinger, H. S. Muti, Z. I. Carrero, J.-N. Eckardt, N. G. Laleh, C. M. L. Löffler, S.-C. Schwarzkopf, M. Unger, G. P. Veldhuizen, S. J. Wagner, and J. N. Kather. (2023). “The future landscape of large language models in medicine.” In: Communications Medicine, 3(1), 141.
  9. (D. Demszky et al., 2023) ⇒ D. Demszky, D. Yang, D. S. Yeager, C. J. Bryan, M. Clapper, S. Chandhok, J. C. Eichstaedt, C. Hecht, J. Jamieson, M. Johnson, M. Jones, D. Krettek-Cobb, L. Lai, N. JonesMitchell, D. C. Ong, C. S. Dweck, J. J. Gross, J. W. Pennebaker, and others. (2023). “Using large language models in psychology.” In: Nature Reviews Psychology, 2(11), 688–701.
  10. (M. Li et al., 2023) ⇒ M. Li, M. Chen, H. Zhou, and R. Zhang. (2023). “PeTailor: Improving Large Language Model by Tailored Chunk Scorer in Biomedical Triple Extraction.”
  11. (A. B. Mbakwe et al., 2023) ⇒ A. B. Mbakwe, I. Lourentzou, L. A. Celi, O. J. Mechanic, and A. Dagan. (2023). “ChatGPT passing USMLE shines a spotlight on the flaws of medical education.” In: PLOS Digital Health, 2(2), e0000205.
  12. (K. Singhal et al., 2023) ⇒ K. Singhal, S. Azizi, T. Tu, S. S. Mahdavi, J. Wei, H. W. Chung, N. Scales, A. Tanwani, H. Cole-Lewis, S. Pfohl, P. Payne, M. Seneviratne, P. Gamble, C. Kelly, A. Babiker, N. Schärli, A. Chowdhery, P. Mansfield, D. Demner-Fushman, and others. (2023). “Large language models encode clinical knowledge.” In: Nature, 620(7972), 172–180.
  13. (L. Tang et al., 2023) ⇒ L. Tang, Z. Sun, B. Idnay, J. G. Nestor, A. Soroush, P. A. Elias, Z. Xu, Y. Ding, G. Durrett, J. F. Rousseau, C. Weng, and Y. Peng. (2023). “Evaluating large language models on medical evidence summarization.” In: Npj Digital Medicine, 6(1), 158.
  14. (A. J. Thirunavukarasu et al., 2023) ⇒ A. J. Thirunavukarasu, D. S. J. Ting, K. Elangovan, L. Gutierrez, T. F. Tan, and D. S. W. Ting. (2023). “Large language models in medicine.” In: Nature Medicine, 29(8), 1930–1940.
  15. (S. Zhang et al., 2023) ⇒ S. Zhang, L. Dong, X. Li, S. Zhang, X. Sun, S. Wang, J. Li, R. Hu, T. Zhang, F. Wu, and G. Wang. (2023). “Instruction Tuning for Large Language Models: A Survey.”
  16. (R. Lou et al., 2023) ⇒ R. Lou, K. Zhang, and W. Yin. (2023). “Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning.”
  17. (J. Wei et al., 2021) ⇒ J. Wei, M. Bosma, V. Y. Zhao, K. Guu, A. W. Yu, B. Lester, N. Du, A. M. Dai, and Q. V. Le. (2021). “Finetuned Language Models Are Zero-Shot Learners.”
  18. (M. Li & L. Huang, 2023) ⇒ M. Li and L. Huang. (2023). “Understand the Dynamic World: An End-to-End Knowledge Informed Framework for Open Domain Entity State Tracking.”
  19. (A. Prasad et al., 2023) ⇒ A. Prasad, P. Hase, X. Zhou, and M. Bansal. (2023). “GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models.” In: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 3845–3864.
  20. (Y. Zhou et al., 2022) ⇒ Y. Zhou, A. I. Muresanu, Z. Han, K. Paster, S. Pitis, H. Chan, and J. Ba. (2022). “Large Language Models Are Human-Level Prompt Engineers.”
  21. (J. Wei et al., 2022) ⇒ J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, and D. Zhou. (2022). “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.”
  22. (Z. Wan et al., 2023) ⇒ Z. Wan, F. Cheng, Z. Mao, Q. Liu, H. Song, J. Li, and S. Kurohashi. (2023). “GPT-RE: In-context Learning for Relation Extraction using Large Language Models.”
  23. (F. Chen & Y. Feng, 2023) ⇒ F. Chen and Y. Feng. (2023). “Chain-of-Thought Prompt Distillation for Multimodal Named Entity Recognition and Multimodal Relation Extraction.”
  24. (S. Wadhwa et al., 2023) ⇒ S. Wadhwa, S. Amir, and B. C. Wallace. (2023). “Revisiting Relation Extraction in the era of Large Language Models.”
  25. (S. Meng et al., 2023) ⇒ S. Meng, X. Hu, A. Liu, S. Li, F. Ma, Y. Yang, and L. Wen. (2023). “RAPL: A Relation-Aware Prototype Learning Approach for Few-Shot Document-Level Relation Extraction (arXiv:2310.15743).” In: arXiv.
  26. (X. Xu et al., 2023) ⇒ X. Xu, Y. Zhu, X. Wang, and N. Zhang. (2023). “How to Unleash the Power of Large Language Models for Few-shot Relation Extraction?”
  27. (M. Li et al., 2023) ⇒ M. Li, M. Chen, H. Zhou, and R. Zhang. (2023). “PeTailor: Improving Large Language Model by Tailored Chunk Scorer in Biomedical Triple Extraction.”
  28. (C. Gao et al., 2023) ⇒ C. Gao, X. Fan, J. Sun, and X. Wang. (2023). “PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Programming.”
  29. (O. Rubin et al., 2021) ⇒ O. Rubin, J. Herzig, and J. Berant. (2021). “Learning To Retrieve Prompts for In-Context Learning.”
  30. (S. Zhang et al., 2023) ⇒ S. Zhang, L. Dong, X. Li, S. Zhang, X. Sun, S. Wang, J. Li, R. Hu, T. Zhang, F. Wu, and G. Wang. (2023). “Instruction Tuning for Large Language Models: A Survey.”
  31. (X. Liu et al., 2022) ⇒ X. Liu, K. Ji, Y. Fu, W. Tam, Z. Du, Z. Yang, and J. Tang. (2022). “P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks.” In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 61–68.
  32. (B. Lester et al., 2021) ⇒ B. Lester, R. Al-Rfou, and N. Constant. (2021). “The Power of Scale for Parameter-Efficient Prompt Tuning.”
  33. (X. Liu et al., 2021) ⇒ X. Liu, K. Ji, Y. Fu, W. L. Tam, Z. Du, Z. Yang, and J. Tang. (2021). “P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks.”
  34. (T. Schick & H. Schütze, 2021) ⇒ T. Schick and H. Schütze. (2021). “Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference.” In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 255–269.
  35. (T. B. Brown et al., 2020) ⇒ T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, and others. (2020). “Language Models are Few-Shot Learners.”
  36. (M. Herrero-Zazo et al., 2013) ⇒ M. Herrero-Zazo, I. Segura-Bedmar, P. Martínez, and T. Declerck. (2013). “The DDI corpus: An annotated corpus with pharmacological substances and drug–drug interactions.” In: Journal of Biomedical Informatics, 46(5), 914–920.
  37. (O. Taboureau et al., 2011) ⇒ O. Taboureau, S. K. Nielsen, K. Audouze, N. Weinhold, D. Edsgard, F. S. Roque, I. Kouskoumvekaki, A. Bora, R. Curpan, T. S. Jensen, S. Brunak, and T. I. Oprea. (2011). “ChemProt: A disease chemical biology database.” In: Nucleic Acids Research, 39(Database), D367–D372.
  38. (H. Touvron et al., 2023) ⇒ H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, D. Bikel, L. Blecher, C. C. Ferrer, M. Chen, G. Cucurull, D. Esiobu, J. Fernandes, J. Fu, W. Fu, and others. (2023). “Llama 2: Open Foundation and Fine-Tuned Chat Models.”
  39. (Chaoyi-wu/MedLLaMA_13B, n.d.) ⇒ Chaoyi-wu/MedLLaMA_13B. (n.d.). “MedLLaMA_13B.” Retrieved December 13, 2023, from https://huggingface.co/chaoyi-wu/MedLLaMA_13B
  40. (C. Peng et al., 2023) ⇒ C. Peng, X. Yang, K. E. Smith, Z. Yu, A. Chen, J. Bian, and Y. Wu. (2023). “Model Tuning or Prompt Tuning? A Study of Large Language Models for Clinical Concept and Relation Extraction.”


Author: Huixue Zhou, Mingchen Li, Yongkang Xiao, Han Yang, Rui Zhang
Title: LLM Instruction-Example Adaptive Prompting (LEAP) Framework for Clinical Relation Extraction
DOI: 10.1101/2023.12.15.23300059
Year: 2023