A Comparative Evaluation of QLoRA and AdaLoRA for Parameter-Efficient Fine-Tuning of Large Language Models on Medical Textbook Question Answering

Keywords

Parameter-efficient fine-tuning
QLoRA
AdaLoRA
Large language models
Medical question answering

How to Cite

A Comparative Evaluation of QLoRA and AdaLoRA for Parameter-Efficient Fine-Tuning of Large Language Models on Medical Textbook Question Answering. (2026). Artificial Intelligence in Applied Sciences, 2(1), 27-31. https://doi.org/10.69882/adba.ai.2026014

Abstract

Parameter-efficient fine-tuning methods have emerged as practical solutions for adapting large language models to specialized domains while minimizing computational overhead. This study presents a systematic comparison of two prominent approaches, QLoRA and AdaLoRA, for fine-tuning instruction-tuned language models on medical textbook question answering. We evaluated both methods using two backbone architectures, Llama-3-8B-Instruct and Qwen2-7B-Instruct, on a dataset comprising 6,500 question-answer pairs derived from 13 authoritative medical textbooks spanning diverse clinical and biomedical disciplines. Our experiments demonstrate that QLoRA consistently outperforms AdaLoRA under single-epoch training conditions, achieving validation perplexity values of 1.085 and 1.086 for Llama-3 and Qwen2, respectively, compared to AdaLoRA’s 1.125 and 1.169. These results correspond to relative validation loss reductions of 30.8% for Llama-3 and 47.5% for Qwen2 when using QLoRA over AdaLoRA. Both methods maintained comparable trainable parameter counts, approximately 167 million for Llama-3 and 161 million for Qwen2, representing roughly 3.5% of total model parameters. Our findings indicate that QLoRA provides more stable convergence behavior within limited training budgets, while AdaLoRA’s adaptive rank allocation mechanism may require extended training schedules to realize its theoretical advantages. These results offer practical guidance for deploying parameter-efficient fine-tuning in medical natural language processing applications where computational resources are constrained.
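The relative loss reductions quoted above follow directly from the perplexities, since validation perplexity is the exponential of the mean cross-entropy loss. A minimal sketch of that arithmetic (not from the paper's code; recomputing from the rounded perplexities yields roughly 30.7% and 47.2%, consistent up to rounding with the 30.8% and 47.5% reported, which were presumably derived from unrounded losses):

```python
import math

def relative_loss_reduction(ppl_better: float, ppl_worse: float) -> float:
    """Relative drop in validation loss implied by moving from ppl_worse
    to ppl_better, using loss = log(perplexity)."""
    worse, better = math.log(ppl_worse), math.log(ppl_better)
    return (worse - better) / worse

# Validation perplexities reported in the abstract (QLoRA vs. AdaLoRA)
llama3 = relative_loss_reduction(1.085, 1.125)  # Llama-3-8B-Instruct
qwen2 = relative_loss_reduction(1.086, 1.169)   # Qwen2-7B-Instruct

print(f"Llama-3: {llama3:.1%}, Qwen2: {qwen2:.1%}")
```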



Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.