Abstract
The identification of effective inhibitors targeting β-site amyloid precursor protein cleaving enzyme-1 (BACE-1) is crucial for developing therapeutic strategies for Alzheimer’s disease. This study developed a structure-based computational framework for predicting BACE-1 inhibitory activity using both deep learning and conventional machine learning techniques. A publicly available BACE-1 dataset with chemical structures encoded in SMILES (Simplified Molecular Input Line Entry System) format was subjected to feature extraction using the RDKit toolkit. Global molecular characteristics and substructural information were captured using both molecular fingerprint representations and physicochemical descriptors. Circular (Morgan/ECFP4) fingerprints, RDKit fingerprints, and MACCS keys were used to encode molecular substructures into binary vectors. Subsequently, Support Vector Machines (SVM), k-Nearest Neighbors (kNN), deep neural networks (DNN), and enhanced deep neural networks were trained and validated on these features under identical experimental conditions. Model performance was assessed with confusion-matrix analysis and standard classification metrics (accuracy, precision, recall, and F1-score). Comparative experiments showed that the deep learning models outperformed the traditional machine learning techniques in capturing intricate nonlinear structure–activity relationships. The proposed enhanced DNN demonstrated balanced precision and recall across both classes and achieved an accuracy of 0.99 on a test set of 303 molecules, comprising 138 active inhibitors and 165 inactive non-inhibitors. Overall, these results suggest that deep learning models, combined with molecular fingerprints, offer a robust and reliable approach to BACE-1 inhibitor prediction and could accelerate early-stage virtual screening. All experiments were conducted using a fixed random seed and a held-out random split to ensure reproducibility.
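The evaluation metrics named above follow directly from confusion-matrix counts; a minimal sketch is given below. The counts in the usage example are illustrative placeholders chosen to match the test-set size, not results reported by the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives, with 'active inhibitor' taken as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)          # of predicted actives, fraction correct
    recall = tp / (tp + fn)             # of true actives, fraction recovered
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Illustrative counts only (NOT the study's reported results):
# 303 test molecules, 138 actives and 165 inactives, scored by a
# hypothetical classifier.
metrics = classification_metrics(tp=130, fp=5, fn=8, tn=160)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting precision and recall per class alongside accuracy, as the study does, guards against the class imbalance (138 vs. 165) silently inflating a single headline number.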

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
