1
Wang X, Zhou J, Mueller J, Quinn D, Carvalho A, Moody TS, Huang M. BioStructNet: Structure-Based Network with Transfer Learning for Predicting Biocatalyst Functions. J Chem Theory Comput 2025; 21:474-490. [PMID: 39705058 PMCID: PMC11736791 DOI: 10.1021/acs.jctc.4c01391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2024] [Revised: 12/03/2024] [Accepted: 12/12/2024] [Indexed: 12/21/2024]
Abstract
Enzyme-substrate interactions are essential to both biological processes and industrial applications. Advanced machine learning techniques have significantly accelerated biocatalysis research, revolutionizing the prediction of biocatalytic activities and facilitating the discovery of novel biocatalysts. However, the limited availability of data for specific enzyme functions, such as conversion efficiency and stereoselectivity, presents challenges for prediction accuracy. In this study, we developed BioStructNet, a structure-based deep learning network that integrates both protein and ligand structural data to capture the complexity of enzyme-substrate interactions. Benchmarking studies with different algorithms showed the enhanced predictive accuracy of BioStructNet. To further optimize the prediction accuracy for the small data set, we implemented transfer learning in the framework, training a source model on a large data set and fine-tuning it on a small, function-specific data set, using the CalB data set as a case study. The model performance was validated by comparing the attention heat maps generated by the BioStructNet interaction module with the enzyme-substrate interactions revealed from molecular dynamics simulations of enzyme-substrate complexes. BioStructNet would accelerate the discovery of functional enzymes for industrial use, particularly in cases where the training data sets for machine learning are small.
Affiliation(s)
- Xiangwen Wang
  - School of Chemistry and Chemical Engineering, Queen’s University Belfast, BT9 5AG Belfast, Northern Ireland, U.K.
  - Department of Biocatalysis and Isotope Chemistry, Almac Sciences, BT63 5QD Craigavon, Northern Ireland, U.K.
- Jiahui Zhou
  - School of Chemistry and Chemical Engineering, Queen’s University Belfast, BT9 5AG Belfast, Northern Ireland, U.K.
- Jane Mueller
  - Department of Biocatalysis and Isotope Chemistry, Almac Sciences, BT63 5QD Craigavon, Northern Ireland, U.K.
- Derek Quinn
  - Department of Biocatalysis and Isotope Chemistry, Almac Sciences, BT63 5QD Craigavon, Northern Ireland, U.K.
- Alexandra Carvalho
  - Department of Biocatalysis and Isotope Chemistry, Almac Sciences, BT63 5QD Craigavon, Northern Ireland, U.K.
- Thomas S. Moody
  - Department of Biocatalysis and Isotope Chemistry, Almac Sciences, BT63 5QD Craigavon, Northern Ireland, U.K.
  - Arran Chemical Company Limited, Unit 1 Monksland Industrial Estate, Athlone, Co. Roscommon N37 DN24, Ireland
- Meilan Huang
  - School of Chemistry and Chemical Engineering, Queen’s University Belfast, BT9 5AG Belfast, Northern Ireland, U.K.
2
Hashemi M, Zabihian A, Hajsaeedi M, Hooshmand M. Antivirals for monkeypox virus: Proposing an effective machine/deep learning framework. PLoS One 2024; 19:e0299342. [PMID: 39264896 DOI: 10.1371/journal.pone.0299342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Accepted: 07/07/2024] [Indexed: 09/14/2024] Open
Abstract
Monkeypox virus (MPXV) is an infectious virus that has caused morbidity and mortality in recent years. Despite its danger to public health, there is no approved drug to treat MPXV. Drug repurposing, which uses computational methods to screen approved drugs against emerging diseases and viruses at low cost, is therefore a promising approach to suggesting approved drugs for MPXV. This paper proposes a computational framework for MPXV antiviral prediction. To do this, we generated a new virus-antiviral dataset and applied several machine learning methods and one deep learning method for virus-antiviral prediction. The drugs suggested by the learning methods were investigated using docking studies. The target protein structure was modeled using homology modeling and then refined and validated. To the best of our knowledge, this is the first work to study deep learning methods for the prediction of MPXV antivirals. The screening results confirm that Tilorone, Valacyclovir, Ribavirin, Favipiravir, and Baloxavir marboxil are effective drugs for MPXV treatment.
Affiliation(s)
- Morteza Hashemi
  - Department of Computer Science, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran
- Arash Zabihian
  - Department of QA, Kimia Zist Parsian Pharmaceutical Company, Zanjan, Iran
- Masih Hajsaeedi
  - Department of Computer Science, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran
- Mohsen Hooshmand
  - Department of Computer Science, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran
3
Majidifar S, Zabihian A, Hooshmand M. Combination therapy synergism prediction for virus treatment using machine learning models. PLoS One 2024; 19:e0309733. [PMID: 39231124 PMCID: PMC11373828 DOI: 10.1371/journal.pone.0309733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2024] [Accepted: 08/16/2024] [Indexed: 09/06/2024] Open
Abstract
Combining different drugs synergistically is an essential aspect of developing effective treatments. Although there is a plethora of research on computational prediction for new combination therapies, there is limited to no research on combination therapies in the treatment of viral diseases. This paper proposes AI-based models for predicting novel antiviral combinations to treat virus diseases synergistically. To do this, we assembled a comprehensive dataset comprising information on viral strains, drug compounds, and their known interactions. As far as we know, this is the first dataset and learning model on combination therapy for viruses. Our proposal includes using a random forest model, an SVM model, and a deep model to train viral combination therapy. The machine learning models showed the highest performance, and the predicted values were validated by a t-test, indicating the effectiveness of the proposed methods. One of the predicted combinations of acyclovir and ribavirin has been experimentally confirmed to have a synergistic antiviral effect against herpes simplex type-1 virus, as described in the literature.
Affiliation(s)
- Shayan Majidifar
  - Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran
- Arash Zabihian
  - Department of QA, Kimia Zist Parsian Pharmaceutical Company, Zanjan, Iran
- Mohsen Hooshmand
  - Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran
4
Zabihian A, Asghari J, Hooshmand M, Gharaghani S. A comparative analysis of computational drug repurposing approaches: proposing a novel tensor-matrix-tensor factorization method. Mol Divers 2024; 28:2177-2196. [PMID: 38683487 DOI: 10.1007/s11030-024-10851-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Efficient drug discovery relies on drug repurposing, an important and open research field. This work presents a novel factorization method and a practical comparison of different approaches for drug repurposing. First, we propose a novel tensor-matrix-tensor (TMT) formulation as a new data array method with a gradient-based factorization procedure. Second, this paper examines and contrasts four computational drug repurposing approaches: factorization-based methods, machine learning methods, deep learning methods, and graph neural networks. We test the strategies on two datasets and assess each approach's performance, drawbacks, problems, and benefits based on the results. The results demonstrate that deep learning techniques work better than other strategies and that their results might be more reliable. Ultimately, graph neural methods need to operate in an inductive manner to give reliable predictions.
Affiliation(s)
- Arash Zabihian
  - Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, Iran
- Javad Asghari
  - Department of Computer Science and Information Technology, Institute of Advanced Studies in Basic Sciences, Zanjan, Iran
- Mohsen Hooshmand
  - Department of Computer Science and Information Technology, Institute of Advanced Studies in Basic Sciences, Zanjan, Iran
- Sajjad Gharaghani
  - Laboratory of Bioinformatics and Drug Design, University of Tehran, Tehran, Iran
5
Bian J, Lu H, Dong G, Wang G. Hierarchical multimodal self-attention-based graph neural network for DTI prediction. Brief Bioinform 2024; 25:bbae293. [PMID: 38920341 PMCID: PMC11200190 DOI: 10.1093/bib/bbae293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Revised: 05/17/2024] [Accepted: 06/06/2024] [Indexed: 06/27/2024] Open
Abstract
Drug-target interactions (DTIs) are a key part of the drug development process, and their accurate and efficient prediction can significantly boost development efficiency and reduce development time. Recent years have witnessed the rapid advancement of deep learning, resulting in an abundance of deep learning-based models for DTI prediction. However, most of these models use a single representation of drugs and proteins, making it difficult to represent their characteristics comprehensively. Multimodal data fusion can effectively compensate for the limitations of single-modal data. However, existing multimodal models for DTI prediction do not take into account both intra- and inter-modal interactions simultaneously, which limits the representation capability of the fused features and reduces DTI prediction accuracy. A hierarchical multimodal self-attention-based graph neural network for DTI prediction, called HMSA-DTI, is proposed to address multimodal feature fusion. HMSA-DTI takes drug SMILES, drug molecular graphs, protein sequences, and protein 2-mer sequences as inputs, and uses a hierarchical multimodal self-attention mechanism to achieve deep fusion of the multimodal features of drugs and proteins, enabling the capture of intra- and inter-modal interactions between drugs and proteins. HMSA-DTI is shown to have significant advantages over other baseline methods on multiple evaluation metrics across five benchmark datasets.
Affiliation(s)
- Jilong Bian
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin, Heilongjiang 150040, China
- Hao Lu
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin, Heilongjiang 150040, China
- Guanghui Dong
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin, Heilongjiang 150040, China
- Guohua Wang
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin, Heilongjiang 150040, China
6
Pillai M, Wu D. Validation approaches for computational drug repurposing: a review. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024; 2023:559-568. [PMID: 38222367 PMCID: PMC10785886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Affiliation(s)
- Malvika Pillai
  - Stanford University, Stanford, CA
  - University of North Carolina, Chapel Hill, NC
- Di Wu
  - University of North Carolina, Chapel Hill, NC
7
Li Y, Fan Z, Rao J, Chen Z, Chu Q, Zheng M, Li X. An overview of recent advances and challenges in predicting compound-protein interaction (CPI). MEDICAL REVIEW (2021) 2023; 3:465-486. [PMID: 38282802 PMCID: PMC10808869 DOI: 10.1515/mr-2023-0030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 08/30/2023] [Indexed: 01/30/2024]
Abstract
Compound-protein interactions (CPIs) are critical in drug discovery for identifying therapeutic targets, drug side effects, and repurposing existing drugs. Machine learning (ML) algorithms have emerged as powerful tools for CPI prediction, offering notable advantages in cost-effectiveness and efficiency. This review provides an overview of recent advances in both structure-based and non-structure-based CPI prediction ML models, highlighting their performance and achievements. It also offers insights into CPI prediction-related datasets and evaluation benchmarks. Lastly, the article presents a comprehensive assessment of the current landscape of CPI prediction, elucidating the challenges faced and outlining emerging trends to advance the field.
Affiliation(s)
- Yanbei Li
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Zhehuan Fan
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Jingxin Rao
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Zhiyi Chen
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Qinyu Chu
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Mingyue Zheng
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Xutong Li
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
8
DRaW: prediction of COVID-19 antivirals by deep learning-an objection on using matrix factorization. BMC Bioinformatics 2023; 24:52. [PMID: 36793010 PMCID: PMC9931173 DOI: 10.1186/s12859-023-05181-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 02/09/2023] [Indexed: 02/17/2023] Open
Abstract
BACKGROUND Due to the high resource consumption of introducing a new drug, drug repurposing plays an essential role in drug discovery. To do this, researchers examine current drug-target interactions (DTIs) to predict new interactions for approved drugs. Matrix factorization methods have received much attention and are widely used in DTI prediction. However, they suffer from some drawbacks. METHODS We explain why matrix factorization is not the best choice for DTI prediction. Then, we propose a deep learning model (DRaW) to predict DTIs without input data leakage. We compare our model with several matrix factorization methods and a deep model on three COVID-19 datasets. In addition, to validate DRaW, we evaluate it on benchmark datasets. Furthermore, as an external validation, we conduct a docking study on the drugs recommended for COVID-19. RESULTS In all cases, the results confirm that DRaW outperforms matrix factorization and deep models. The docking results support the top-ranked drugs recommended for COVID-19. CONCLUSIONS In this paper, we show that matrix factorization may not be the best choice for DTI prediction. Matrix factorization methods suffer from some intrinsic issues, e.g., sparsity in bioinformatics applications and the fixed, unchanging size of the matrix-based paradigm. Therefore, we propose an alternative method (DRaW) that uses feature vectors rather than matrix factorization and demonstrates better performance than other well-known methods on three COVID-19 and four benchmark datasets.
9
Hu L, Fu C, Ren Z, Cai Y, Yang J, Xu S, Xu W, Tang D. SSELM-neg: spherical search-based extreme learning machine for drug-target interaction prediction. BMC Bioinformatics 2023; 24:38. [PMID: 36737694 PMCID: PMC9896467 DOI: 10.1186/s12859-023-05153-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/03/2022] [Accepted: 01/18/2023] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND The experimental verification of a drug discovery process is expensive and time-consuming. Therefore, efficiently and effectively identifying drug-target interactions (DTIs) has been the focus of research. At present, many machine learning algorithms are used for predicting DTIs. The key idea is to train the classifier using an existing DTI to predict a new or unknown DTI. However, there are various challenges, such as class imbalance and the parameter optimization of many classifiers, that need to be solved before an optimal DTI model is developed. METHODS In this study, we propose a framework called SSELM-neg for DTI prediction, in which we use a screening approach to choose high-quality negative samples and a spherical search approach to optimize the parameters of the extreme learning machine. RESULTS The results demonstrated that the proposed technique outperformed other state-of-the-art methods in 10-fold cross-validation experiments in terms of the area under the receiver operating characteristic curve (0.986, 0.993, 0.988, and 0.969) and AUPR (0.982, 0.991, 0.982, and 0.946) for the enzyme dataset, G-protein coupled receptor dataset, ion channel dataset, and nuclear receptor dataset, respectively. CONCLUSION The screening approach produced high-quality negative samples with the same number of positive samples, which solved the class imbalance problem. We optimized an extreme learning machine using a spherical search approach to identify DTIs. Therefore, our models performed better than other state-of-the-art methods.
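The AUROC values quoted in this abstract can be read as the probability that a randomly chosen interacting pair is scored above a randomly chosen non-interacting pair. The following is a minimal Python sketch of that pairwise definition only; the function name `auroc` and the toy labels and scores are illustrative assumptions and are not taken from the SSELM-neg paper.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: iterable of 0/1 ground-truth interaction labels.
    scores: iterable of predicted scores, higher meaning "more likely to interact".
    Tied positive/negative scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three interacting pairs, three non-interacting pairs.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(auroc(labels, scores))
```

The quadratic loop is chosen for readability; on full benchmark datasets a rank-based implementation would normally be preferred.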
Affiliation(s)
- Lingzhi Hu
  - School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
- Chengzhou Fu
  - School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
  - Guangdong Province Precise Medicine Big Data of Traditional Chinese Medicine Engineering Technology Research Center, Guangzhou, People’s Republic of China
- Zhonglu Ren
  - School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
- Yongming Cai
  - School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
  - Guangdong Province Precise Medicine Big Data of Traditional Chinese Medicine Engineering Technology Research Center, Guangzhou, People’s Republic of China
- Jin Yang
  - School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
  - Guangdong Province Precise Medicine Big Data of Traditional Chinese Medicine Engineering Technology Research Center, Guangzhou, People’s Republic of China
- Siwen Xu
  - School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
- Wenhua Xu
  - School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
- Deyu Tang
  - School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
  - School of Computer Science and Engineering, South China University of Technology, Guangzhou, People’s Republic of China
  - Guangdong Province Precise Medicine Big Data of Traditional Chinese Medicine Engineering Technology Research Center, Guangzhou, People’s Republic of China
10
McNair D. Artificial Intelligence and Machine Learning for Lead-to-Candidate Decision-Making and Beyond. Annu Rev Pharmacol Toxicol 2023; 63:77-97. [PMID: 35679624 DOI: 10.1146/annurev-pharmtox-051921-023255] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 01/25/2023]
Abstract
The use of artificial intelligence (AI) and machine learning (ML) in pharmaceutical research and development has to date focused on research: target identification; docking-, fragment-, and motif-based generation of compound libraries; modeling of synthesis feasibility; rank-ordering likely hits according to structural and chemometric similarity to compounds having known activity and affinity to the target(s); optimizing a smaller library for synthesis and high-throughput screening; and combining evidence from screening to support hit-to-lead decisions. Applying AI/ML methods to lead optimization and lead-to-candidate (L2C) decision-making has shown slower progress, especially regarding predicting absorption, distribution, metabolism, excretion, and toxicology properties. The present review surveys reasons why this is so, reports progress that has occurred in recent years, and summarizes some of the issues that remain. Effective AI/ML tools to derisk L2C and later phases of development are important to accelerate the pharmaceutical development process, ameliorate escalating development costs, and achieve greater success rates.
Affiliation(s)
- Douglas McNair
  - Global Health, Integrated Development, Bill & Melinda Gates Foundation, Seattle, Washington, USA
11
Johnson TO, Akinsanmi AO, Ejembi SA, Adeyemi OE, Oche JR, Johnson GI, Adegboyega AE. Modern drug discovery for inflammatory bowel disease: The role of computational methods. World J Gastroenterol 2023; 29:310-331. [PMID: 36687123 PMCID: PMC9846937 DOI: 10.3748/wjg.v29.i2.310] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/22/2022] [Revised: 11/02/2022] [Accepted: 12/21/2022] [Indexed: 01/06/2023] Open
Abstract
Inflammatory bowel diseases (IBDs) comprising ulcerative colitis, Crohn’s disease and microscopic colitis are characterized by chronic inflammation of the gastrointestinal tract. IBD has spread around the world and is becoming more prevalent at an alarming rate in developing countries whose societies have become more westernized. Cell therapy, intestinal microecology, apheresis therapy, exosome therapy and small molecules are emerging therapeutic options for IBD. Currently, it is thought that low-molecular-mass substances with good oral bio-availability and the ability to permeate the cell membrane to regulate the action of elements of the inflammatory signaling pathway are effective therapeutic options for the treatment of IBD. Several small molecule inhibitors are being developed as a promising alternative for IBD therapy. The use of highly efficient and time-saving techniques, such as computational methods, is still a viable option for the development of these small molecule drugs. The computer-aided (in silico) discovery approach is one drug development technique that has mostly proven efficacy. Computational approaches when combined with traditional drug development methodology dramatically boost the likelihood of drug discovery in a sustainable and cost-effective manner. This review focuses on the modern drug discovery approaches for the design of novel IBD drugs with an emphasis on the role of computational methods. Some computational approaches to IBD genomic studies, target identification, and virtual screening for the discovery of new drugs and in the repurposing of existing drugs are discussed.
Affiliation(s)
- Jane-Rose Oche
  - Department of Biochemistry, University of Jos, Jos 930222, Plateau, Nigeria
- Grace Inioluwa Johnson
  - Faculty of Clinical Sciences, College of Health Sciences, University of Jos, Jos 930222, Plateau, Nigeria
12
Askr H, Elgeldawi E, Aboul Ella H, Elshaier YAMM, Gomaa MM, Hassanien AE. Deep learning in drug discovery: an integrative review and future challenges. Artif Intell Rev 2022; 56:5975-6037. [PMID: 36415536 PMCID: PMC9669545 DOI: 10.1007/s10462-022-10306-1] [Citation(s) in RCA: 85] [Impact Index Per Article: 28.3] [Accepted: 10/24/2022] [Indexed: 11/18/2022]
Abstract
Recently, using artificial intelligence (AI) in drug discovery has received much attention since it significantly shortens the time and cost of developing new drugs. Deep learning (DL)-based approaches are increasingly being used in all stages of drug development as DL technology advances and drug-related data grows. Therefore, this paper presents a systematic literature review (SLR) that integrates recent DL technologies and applications in drug discovery, including drug-target interactions (DTIs), drug-drug similarity interactions (DDIs), drug sensitivity and responsiveness, and drug-side effect predictions. We present a review of more than 300 articles published between 2000 and 2022. The benchmark data sets, the databases, and the evaluation measures are also presented. In addition, this paper provides an overview of how explainable AI (XAI) supports drug discovery problems. Drug dosing optimization and success stories are discussed as well. Finally, digital twinning (DT) and open issues are suggested as future research challenges for drug discovery problems. Challenges to be addressed and future research directions are identified, and an extensive bibliography is also included.
Affiliation(s)
- Heba Askr
  - Faculty of Computers and Artificial Intelligence, University of Sadat City, Sadat City, Egypt
- Enas Elgeldawi
  - Computer Science Department, Faculty of Science, Minia University, Minia, Egypt
- Heba Aboul Ella
  - Faculty of Pharmacy and Drug Technology, Chinese University in Egypt (CUE), Cairo, Egypt
- Mamdouh M. Gomaa
  - Computer Science Department, Faculty of Science, Minia University, Minia, Egypt
- Aboul Ella Hassanien
  - Faculty of Computers and Artificial Intelligence, Cairo University, Cairo, Egypt
13
Zhang Y, Hu Y, Li H, Liu X. Drug-protein interaction prediction via variational autoencoders and attention mechanisms. Front Genet 2022; 13:1032779. [PMID: 36313473 PMCID: PMC9614151 DOI: 10.3389/fgene.2022.1032779] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/31/2022] [Accepted: 09/30/2022] [Indexed: 09/29/2023] Open
Abstract
During the process of drug discovery, exploring drug-protein interactions (DPIs) is a key step. With the rapid growth of biological data, computer-aided methods are much faster than biological experiments. Deep learning methods have become popular and are mainly used to extract the characteristics of drugs and proteins for further DPI prediction. Since DPI prediction through machine learning cannot fully extract effective features, we propose a deep learning framework that uses variational autoencoders and attention mechanisms; it uses convolutional neural networks (CNNs) to obtain local features and attention mechanisms to obtain important information about drugs and proteins, which is very important for predicting DPIs. Compared with some machine learning methods on the C. elegans and human datasets, our approach performs better. On the BindingDB dataset, its accuracy (ACC) and area under the curve (AUC) reach 0.862 and 0.913, respectively. To verify the robustness of the model, multiclass classification tasks are performed on the Davis and KIBA datasets, and the ACC values reach 0.850 and 0.841, respectively, further demonstrating the effectiveness of the model.
Affiliation(s)
- Yue Zhang
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
14
Pu Y, Li J, Tang J, Guo F. DeepFusionDTA: Drug-Target Binding Affinity Prediction With Information Fusion and Hybrid Deep-Learning Ensemble Model. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2760-2769. [PMID: 34379594 DOI: 10.1109/tcbb.2021.3103966] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 06/13/2023]
Abstract
Identification of drug-target interaction (DTI) is the most important issue in the broad field of drug discovery. Using purely biological experiments to verify drug-target binding profiles takes a great deal of time and effort, so computational technologies for this task have clear benefits in reducing the drug search space. Most computational methods for predicting DTI are framed as a binary classification problem, which ignores the influence of binding strength. Therefore, drug-target binding affinity prediction is still a challenging issue. Currently, many studies extract only sequence information, which lacks a feature-rich representation, whereas we consider more spatial features in order to merge various data in the drug and target spaces. In this study, we propose a two-stage deep neural network ensemble model for predicting drug-target binding affinity, called DeepFusionDTA, built from several information analysis modules. The first stage uses sequence and structure information to generate a fusion feature map of a candidate protein and drug pair through various deep learning-based analysis modules. The second stage applies a bagging-based ensemble learning strategy for regression prediction, and we obtain outstanding results by combining the advantages of various algorithms in efficient feature abstraction and regression calculation. Importantly, compared with the existing prediction tool DeepDTA, DeepFusionDTA delivers a 1.5 percent CI increase on the KIBA dataset and a 1.0 percent increase on the Davis dataset. Furthermore, the ideas we have offered can be applied to in-silico screening of the interaction space, to provide novel DTIs which can be experimentally pursued. The codes and data are available from https://github.com/guofei-tju/DeepFusionDTA.
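The "CI" reported for KIBA and Davis is assumed here to be the concordance index, the ranking metric commonly used for binding affinity regression: the fraction of comparable pairs whose predicted affinities are ordered the same way as their measured affinities. A minimal Python sketch of that metric follows; the function name `concordance_index` and the toy affinity values are hypothetical and are not taken from the DeepFusionDTA code.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predictions are correctly ordered.

    A pair is comparable when its two true affinities differ.
    Tied predictions on a comparable pair count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # equal true affinities carry no ordering information
            comparable += 1
            # hi is the index of the pair member with the larger true affinity
            hi, lo = (i, j) if y_true[i] > y_true[j] else (j, i)
            if y_pred[hi] > y_pred[lo]:
                concordant += 1.0
            elif y_pred[hi] == y_pred[lo]:
                concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable


# Hypothetical measured vs. predicted affinities for four drug-target pairs.
measured = [5.0, 6.2, 7.1, 8.0]
predicted = [5.1, 6.0, 7.5, 7.0]
print(concordance_index(measured, predicted))
```

A value of 1.0 means every comparable pair is ranked correctly, and 0.5 corresponds to random ordering.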
|
15
|
El-Behery H, Attia AF, El-Fishawy N, Torkey H. An ensemble-based drug-target interaction prediction approach using multiple feature information with data balancing. J Biol Eng 2022; 16:21. [PMID: 35941686 PMCID: PMC9361677 DOI: 10.1186/s13036-022-00296-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Accepted: 06/02/2022] [Indexed: 11/16/2022]
Abstract
Background: Recently, drug repositioning has received considerable attention for its advantage to pharmaceutical industries in drug development. Artificial intelligence techniques have greatly enhanced drug reproduction by discovering therapeutic drug profiles, side effects, and new target proteins. However, as the number of drugs increases, their targets and enormous numbers of interactions produce imbalanced data that may not be suitable as direct input to a prediction model. Methods: This paper proposes a novel scheme for predicting drug–target interactions (DTIs) based on drug chemical structures and protein sequences. The drug Morgan fingerprint, drug constitutional descriptors, protein amino acid composition, and protein dipeptide composition were employed to extract the characteristics of the drugs and proteins. Then, an approach for extracting negative samples using a one-class support vector machine classifier was developed to tackle the imbalanced data problem in the feature sets of the drug–target dataset. Negative and positive samples were constructed and fed into different prediction algorithms to identify DTIs. A 10-fold cross-validation procedure was applied to assess the predictive ability of the proposed method, in addition to a study of the effectiveness of the chemical and physical features in the evaluation and discovery of drug–target interactions. Results: Our experimental model outperformed existing techniques in terms of area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, F-score, mean square error, and MCC. The results obtained by the AdaBoost classifier enhanced prediction accuracy by 2.74%, precision by 1.98%, AUC by 1.14%, F-score by 3.53%, and MCC by 4.54% over existing methods.
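Two of the protein descriptors named in this abstract, amino acid composition and dipeptide composition, are simple frequency vectors over a sequence. The sketch below illustrates the usual definitions (20 and 400 features respectively); it is not taken from the paper, and the function names and the example sequence are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(sequence):
    """Return 20 fractions: occurrences of each residue divided by sequence length."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return [sequence.count(residue) / len(sequence) for residue in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return 400 fractions: occurrences of each adjacent residue pair
    divided by the number of adjacent pairs (length - 1)."""
    sequence = sequence.upper()
    if len(sequence) < 2:
        raise ValueError("sequence must contain at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are ignored
            counts[pair] += 1
    total_pairs = len(sequence) - 1
    return [counts[pair] / total_pairs for pair in DIPEPTIDES]


# Hypothetical short peptide used only to show the output shapes.
features = amino_acid_composition("ACDA") + dipeptide_composition("ACDA")
print(len(features))
```

Concatenating the two vectors gives a fixed-length, 420-dimensional protein representation regardless of sequence length, which is what makes these descriptors convenient inputs for classical classifiers.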
Affiliation(s)
- Heba El-Behery: Department of Computer Science and Engineering, Faculty of Engineering, Kafrelsheikh University, Kafr_El_Sheikh, Egypt
- Abdel-Fattah Attia: Department of Computer Science and Engineering, Faculty of Engineering, Kafrelsheikh University, Kafr_El_Sheikh, Egypt
- Nawal El-Fishawy: Computer Science & Engineering Department, Faculty of Electronic Engineering, Menoufia University, Menouf, Egypt
- Hanaa Torkey: Computer Science & Engineering Department, Faculty of Electronic Engineering, Menoufia University, Menouf, Egypt
|
16
|
Lian M, Wang X, Du W. Integrated multi-similarity fusion and heterogeneous graph inference for drug-target interaction prediction. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
17
|
Sharifabad MM, Sheikhpour R, Gharaghani S. Drug-target interaction prediction using reliable negative samples and effective feature selection methods. J Pharmacol Toxicol Methods 2022; 116:107191. [PMID: 35738316 DOI: 10.1016/j.vascn.2022.107191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2021] [Revised: 06/04/2022] [Accepted: 06/14/2022] [Indexed: 11/28/2022]
Abstract
Machine learning-based approaches in the field of drug discovery have dramatically reduced the time and cost of the laboratory process of detecting potential drug-target interactions (DTIs). Standard binary classifiers require both positive and negative samples in the training and validation phases. One of the major challenges in the DTI context is the lack of access to non-interacting pairs as negative samples in the learning process. Many recent studies in this field have randomly selected negative samples from unlabeled drug-target pairs. Because a set treated as negative may contain unknown positive samples, the model results may be affected and show a high rate of false positives. In this study, an algorithm called Reliable Non-Interacting Drug-Target Pairs (RNIDTP) is proposed to select reliable negative samples, together with an efficient algorithm to select relevant features for drug-target interaction prediction. To validate the performance of the proposed RNIDTP algorithm in the selection of negative samples, a benchmark drug-target interaction dataset is used. The results demonstrate the superiority of the proposed algorithm compared with other algorithms in most cases. The results also indicate that using an appropriate algorithm for the selection of negative samples significantly improves the performance of the learning process compared to random selection.
Affiliation(s)
- Mohammad Morovvati Sharifabad: Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Razieh Sheikhpour: Department of Computer Engineering, Faculty of Engineering, Ardakan University, P.O. Box 184, Ardakan, Iran
- Sajjad Gharaghani: Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
|
18
|
Xuan P, Zhang X, Zhang Y, Hu K, Nakaguchi T, Zhang T. Multi-type neighbors enhanced global topology and pairwise attribute learning for drug-protein interaction prediction. Brief Bioinform 2022; 23:6581435. [PMID: 35514190 DOI: 10.1093/bib/bbac120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/03/2022] [Revised: 03/07/2022] [Accepted: 03/15/2022] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Accurate identification of proteins that interact with drugs helps reduce the time and cost of drug development. Most previous methods focused on integrating multisource data about drugs and proteins for predicting drug-target interactions (DTIs). Two drugs can be linked by both similarity connections and interaction connections, and these connections reflect their relationships from different perspectives. Similarly, two proteins have various connections from multiple perspectives. However, most previous methods failed to deeply integrate these connections. In addition, multiple drug-protein heterogeneous networks can be constructed based on multiple kinds of connections. The diverse topological structures of these networks are still not exploited completely. RESULTS: We propose a novel model to extract and integrate multi-type neighbor topology information, diverse similarities and interactions related to drugs and proteins. Firstly, multiple drug-protein heterogeneous networks are constructed according to multiple kinds of connections among drugs and those among proteins. The multi-type neighbor node sequences of a drug node (or a protein node) are formed by random walks on each network and they reflect the hidden neighbor topological structure of the node. Secondly, a module based on graph neural network (GNN) is proposed to learn the multi-type neighbor topologies of each node. We propose attention mechanisms at neighbor node level and at neighbor type level to learn more informative neighbor nodes and neighbor types. A network-level attention is also designed to enhance the context dependency among multiple neighbor topologies of a pair of drug and protein nodes. Finally, the attribute embedding of the drug-protein pair is formulated by a proposed embedding strategy, and the embedding covers the similarities and interactions about the pair. A module based on three-dimensional convolutional neural networks (CNN) is constructed to deeply integrate pairwise attributes. Extensive experiments have been performed and the results indicate GCDTI outperforms several state-of-the-art prediction methods. The recall rate estimation over the top-ranked candidates and case studies on 5 drugs further demonstrate GCDTI's ability in discovering potential drug-protein interactions.
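The abstract describes forming neighbor node sequences by random walks on a drug-protein network. A minimal sketch of uniform random walks over an adjacency list follows; it only illustrates the general idea and makes no claim about the walk strategy, walk length, or network construction used in GCDTI. The toy network, node names, and function names are all hypothetical.

```python
import random

# Hypothetical toy heterogeneous network: drug and protein nodes with undirected edges.
TOY_NETWORK = {
    "drug_A": ["drug_B", "protein_X"],
    "drug_B": ["drug_A", "protein_X", "protein_Y"],
    "protein_X": ["drug_A", "drug_B", "protein_Y"],
    "protein_Y": ["drug_B", "protein_X"],
}


def random_walk(adjacency, start, length, rng):
    """Walk up to `length` nodes from `start`, picking a uniform random neighbor each step."""
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency.get(walk[-1], [])
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbors))
    return walk


def neighbor_sequences(adjacency, start, num_walks, length, seed=0):
    """Collect several walks from one node; each walk is one neighbor node sequence."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same walks
    return [random_walk(adjacency, start, length, rng) for _ in range(num_walks)]


for sequence in neighbor_sequences(TOY_NETWORK, "drug_A", num_walks=3, length=5):
    print(" -> ".join(sequence))
```

Running the same procedure on each network built from a different connection type would yield one set of sequences per neighbor type, which a downstream sequence or graph model can then encode.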
Affiliation(s)
- Ping Xuan: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; School of Computer Science, Shaanxi Normal University, Xi'an 710062, China
- Xiaowen Zhang: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Yu Zhang: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Kaimiao Hu: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Toshiya Nakaguchi: Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Tiangang Zhang: School of Mathematical Science, Heilongjiang University, Harbin 150080, China
|
19
|
DTIP-TC2A: An analytical framework for drug-target interactions prediction methods. Comput Biol Chem 2022; 99:107707. [DOI: 10.1016/j.compbiolchem.2022.107707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Revised: 05/01/2022] [Accepted: 05/26/2022] [Indexed: 11/18/2022]
|
20
|
Sinha K, Ghosh J, Sil PC. Machine Learning in Drug Metabolism Study. Curr Drug Metab 2022; 23:1012-1026. [PMID: 36578255 DOI: 10.2174/1389200224666221227094144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Revised: 10/27/2022] [Accepted: 11/01/2022] [Indexed: 12/30/2022]
Abstract
Metabolic reactions in the body transform the administered drug into metabolites. These metabolites exhibit diverse biological activities. Drug metabolism is the major underlying cause of drug overdose-related toxicity, adverse drug effects and the drug's reduced efficacy. Though metabolic reactions deactivate a drug, drug metabolites are often considered pivotal agents for off-target effects or toxicity. On the other hand, in combination drug therapy, one drug may influence another drug's metabolism and clearance, which is considered one of the primary causes of drug-drug interactions. Today, with the advancement of machine learning, the metabolic fate of a drug candidate can be comprehensively studied throughout the drug development procedure. Naïve Bayes, Logistic Regression, k-Nearest Neighbours, Decision Trees, different Boosting and Ensemble methods, Support Vector Machines and Artificial Neural Network boosted Deep Learning are some of the machine learning algorithms being used extensively in such studies. Such tools cover several attributes of drug metabolism, with an emphasis on the prediction of drug-drug interactions, drug-target interactions, clinical drug responses, metabolite predictions, sites of metabolism, etc. These reports are crucial for evaluating metabolic stability and predicting prospective drug-drug interactions, and can help pharmaceutical companies accelerate the drug development process in a less resource-demanding manner than in vitro studies allow. They could also help medical practitioners use combinatorial drug therapy in a more resourceful manner. Also, with the help of the enormous growth of deep learning, traditional fields of computational drug development such as molecular interaction fields, molecular docking, quantitative structure-to-activity relationship (QSAR) studies and quantum mechanical simulations are producing results that were unimaginable a couple of years ago. This review provides a glimpse of a few contextually relevant machine learning algorithms and then focuses on their outcomes in different studies.
Affiliation(s)
- Krishnendu Sinha: Department of Zoology, Jhargram Raj College, Jhargram-721507, India
- Jyotirmoy Ghosh: Department of Chemistry, Banwarilal Bhalotia College, Asansol-713303, India
- Parames Chandra Sil: Department of Division of Molecular Medicine, Bose Institute, Kolkata-700054, India
|
21
|
Venkateswaran MR, Vadivel TE, Jayabal S, Murugesan S, Rajasekaran S, Periyasamy S. A review on network pharmacology based phytotherapy in treating diabetes- An environmental perspective. Environ Res 2021; 202:111656. [PMID: 34265348 DOI: 10.1016/j.envres.2021.111656] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/13/2021] [Revised: 06/19/2021] [Accepted: 07/04/2021] [Indexed: 06/13/2023]
Abstract
Diabetes has become a common lifestyle disorder associated with obesity and cardiovascular diseases. Environmental factors like physical inactivity, polluted surroundings and unhealthy dieting also play a vital role in diabetes pathogenesis. As the current anti-diabetic drugs possess unprecedented side effects, traditional herbal medicine can be used as an alternative therapy. The paramount challenge in using herbal formulations is the lack of a standardized procedure, compounded by limited knowledge of drug safety and mechanism of drug action. Heavy metal contamination is a major environmental hazard in which plants tend to accumulate toxic metals like nickel, chromium and lead through industrial and agricultural activities. Such plants become inappropriate for phytotherapy, as long-term consumption may affect human health. This review discusses the environmental risk factors related to diabetes and the better application of medicinal plants in anti-diabetic therapy using network pharmacology, an in silico analytical tool that helps to unravel the multi-targeted action of herbal formulations rich in secondary metabolites. A special focus is also placed on pooling the databases regarding medicinal plants for diabetes and associated diseases, their bioactive compounds, possible diabetic targets, drug-target interactions and toxicology reports, which may open a path toward safer, effective and toxicity-free drug discovery.
Affiliation(s)
- Meenakshi R Venkateswaran: Department of Biotechnology, Anna University, BIT-Campus, Tiruchirappalli, 620024, Tamil Nadu, India
- Tamil Elakkiya Vadivel: Department of Biotechnology, Anna University, BIT-Campus, Tiruchirappalli, 620024, Tamil Nadu, India
- Sasidharan Jayabal: Department of Biotechnology, Anna University, BIT-Campus, Tiruchirappalli, 620024, Tamil Nadu, India
- Selvakumar Murugesan: Department of Biotechnology, Anna University, BIT-Campus, Tiruchirappalli, 620024, Tamil Nadu, India
- Subbiah Rajasekaran: Department of Biochemistry, ICMR-National Institute for Research in Environmental Health, Bhopal, India
- Sureshkumar Periyasamy: Department of Biotechnology, Anna University, BIT-Campus, Tiruchirappalli, 620024, Tamil Nadu, India
|
22
|
Xuan P, Hu K, Cui H, Zhang T, Nakaguchi T. Learning multi-scale heterogeneous representations and global topology for drug-target interaction prediction. IEEE J Biomed Health Inform 2021; 26:1891-1902. [PMID: 34673498 DOI: 10.1109/jbhi.2021.3121798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Identification of drug-target interactions (DTIs) plays a critical role in drug discovery and repositioning. Deep integration of inter-connections and intra-similarities between heterogeneous multi-source data related to drugs and targets, however, is a challenging issue. We propose a DTI prediction model by learning from drug and protein related multi-scale attributes and global topology formed by heterogeneous connections. A drug-protein-disease heterogeneous network (RPD-Net) is firstly constructed to associate diverse similarities, interactions and associations across nodes. Secondly, we propose a multi-scale pairwise deep representation learning module consisting of a new embedding strategy to integrate diverse inter-relations and intra-relations, and dilation convolutions for multi-scale deep representation extraction. A global topology learning module is proposed which is composed of strategy based on non-negative matrix factorization (NMF) to extract topology from RPD-Net, and a new relational-level attention mechanism for discriminative topology embedding. Experimental results using public dataset demonstrate improved performance over state-of-the-art methods and contributions of our major innovations. Evaluation results by top k recall rates and case studies on five drugs further show the effectiveness in retrieving potential target candidates for drugs.
|
23
|
Kim J, Park S, Min D, Kim W. Comprehensive Survey of Recent Drug Discovery Using Deep Learning. Int J Mol Sci 2021; 22:9983. [PMID: 34576146 PMCID: PMC8470987 DOI: 10.3390/ijms22189983] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 08/24/2021] [Revised: 09/09/2021] [Accepted: 09/10/2021] [Indexed: 02/07/2023]
Abstract
Drug discovery based on artificial intelligence has been in the spotlight recently as it significantly reduces the time and cost required for developing novel drugs. With the advancement of deep learning (DL) technology and the growth of drug-related data, numerous deep-learning-based methodologies are emerging at all steps of drug development processes. In particular, pharmaceutical chemists have faced significant issues with regard to selecting and designing potential drugs for a target of interest to enter preclinical testing. The two major challenges are prediction of interactions between drugs and druggable targets and generation of novel molecular structures suitable for a target of interest. Therefore, we reviewed recent deep-learning applications in drug-target interaction (DTI) prediction and de novo drug design. In addition, we introduce a comprehensive summary of a variety of drug and protein representations, DL models, and commonly used benchmark datasets or tools for model training and testing. Finally, we present the remaining challenges for the promising future of DL-based DTI prediction and de novo drug design.
Affiliation(s)
- Jintae Kim: KaiPharm Co., Ltd., Seoul 03759, Korea
- Sera Park: KaiPharm Co., Ltd., Seoul 03759, Korea
- Dongbo Min: Computer Vision Lab, Department of Computer Science and Engineering, Ewha Womans University, Seoul 03760, Korea
- Wankyu Kim: KaiPharm Co., Ltd., Seoul 03759, Korea; System Pharmacology Lab, Department of Life Sciences, Ewha Womans University, Seoul 03760, Korea
|
24
|
Binding affinity prediction for binary drug-target interactions using semi-supervised transfer learning. J Comput Aided Mol Des 2021; 35:883-900. [PMID: 34189637 DOI: 10.1007/s10822-021-00404-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/22/2021] [Accepted: 06/18/2021] [Indexed: 10/21/2022]
Abstract
In the field of drug-target interaction prediction, the majority of approaches formulate the problem as a simple binary classification task and train their models on binary drug-target interaction datasets. The prediction of drug-target interactions is, however, inherently a regression problem, since interactions are identified according to the binding affinity between drugs and targets. This paper deals with binary drug-target interactions and tries to identify them based on the binding strength of a drug and its target. To this end, we propose a semi-supervised transfer learning approach to predict the binding affinity in a continuous spectrum for binary interactions. Due to the lack of training data with continuous binding affinity in the target domain, the proposed method makes use of the information available in other domains (i.e., the source domain) via transfer learning. The general framework of our algorithm is based on an objective function that considers the performance in both source and target domains as well as the unlabeled data in the target domain via a regularization term. To optimize this objective function, we use a gradient boosting machine, which constructs the final model. To assess the performance of the proposed method, we used benchmark datasets with binary interactions for four classes of human proteins. Our algorithm identifies interactions in a more realistic setting. According to the experimental results, our regression model performs better than the state-of-the-art methods in some procedures.
|
25
|
Xuan P, Zhang Y, Cui H, Zhang T, Guo M, Nakaguchi T. Integrating multi-scale neighbouring topologies and cross-modal similarities for drug-protein interaction prediction. Brief Bioinform 2021; 22:6220173. [PMID: 33839743 DOI: 10.1093/bib/bbab119] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/09/2021] [Revised: 02/15/2021] [Accepted: 03/12/2021] [Indexed: 01/02/2023]
Abstract
MOTIVATION Identifying the proteins that interact with drugs can reduce the cost and time of drug development. Existing computerized methods focus on integrating drug-related and protein-related data from multiple sources to predict candidate drug-target interactions (DTIs). However, multi-scale neighboring node sequences and various kinds of drug and protein similarities are neither fully explored nor considered in decision making. RESULTS We propose a drug-target interaction prediction method, DTIP, to encode and integrate multi-scale neighbouring topologies, multiple kinds of similarities, associations, interactions related to drugs and proteins. We firstly construct a three-layer heterogeneous network to represent interactions and associations across drug, protein, and disease nodes. Then a learning framework based on fully-connected autoencoder is proposed to learn the nodes' low-dimensional feature representations within the heterogeneous network. Secondly, multi-scale neighbouring sequences of drug and protein nodes are formulated by random walks. A module based on bidirectional gated recurrent unit is designed to learn the neighbouring sequential information and integrate the low-dimensional features of nodes. Finally, we propose attention mechanisms at feature level, neighbouring topological level and similarity level to learn more informative features, topologies and similarities. The prediction results are obtained by integrating neighbouring topologies, similarities and feature attributes using a multiple layer CNN. Comprehensive experimental results over public dataset demonstrated the effectiveness of our innovative features and modules. Comparison with other state-of-the-art methods and case studies of five drugs further validated DTIP's ability in discovering the potential candidate drug-related proteins.
Affiliation(s)
- Ping Xuan: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Yu Zhang: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Hui Cui: Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Tiangang Zhang: School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Maozu Guo: School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
- Toshiya Nakaguchi: Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
|
26
|
Wang C, Kurgan L. Survey of Similarity-Based Prediction of Drug-Protein Interactions. Curr Med Chem 2021; 27:5856-5886. [PMID: 31393241 DOI: 10.2174/0929867326666190808154841] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/07/2017] [Revised: 04/16/2018] [Accepted: 10/23/2018] [Indexed: 12/20/2022]
Abstract
Therapeutic activity of a significant majority of drugs is determined by their interactions with proteins. Databases of drug-protein interactions (DPIs) primarily focus on the therapeutic protein targets, while the knowledge of the off-targets is fragmented and partial. One way to bridge this knowledge gap is to employ computational methods to predict protein targets for a given drug molecule, or interacting drugs for given protein targets. We survey a comprehensive set of 35 methods that were published in high-impact venues and that predict DPIs based on similarity between drugs and similarity between protein targets. We analyze the internal databases of known DPIs that these methods utilize to compute similarities, and investigate how they are linked to the 12 publicly available source databases. We discuss the contents, impact and relationships between these internal and source databases, as well as the timeline of their releases and publications. The 35 predictors exploit and often combine three types of similarities that consider drug structures, drug profiles, and target sequences. We review the predictive architectures of these methods and their impact, and we explain how their internal DPI databases are linked to the source databases. We also include a detailed timeline of the development of these predictors and discuss the underlying limitations of the current resources and predictive tools. Finally, we provide several recommendations concerning the future development of the related databases and methods.
Affiliation(s)
- Chen Wang: Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
- Lukasz Kurgan: Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
|
27
|
Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. Biotechnol Bioproc E 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data, such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of the recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, an in-depth analysis of the remaining challenges and limitations is provided, along with proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Eunyoung Kim: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Ingoo Lee: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Bongsung Bae: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Minsu Park: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Hojung Nam: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
|
28
|
Chu Y, Shan X, Chen T, Jiang M, Wang Y, Wang Q, Salahub DR, Xiong Y, Wei DQ. DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method. Brief Bioinform 2020; 22:5910189. [PMID: 32964234 DOI: 10.1093/bib/bbaa205] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 05/23/2020] [Revised: 08/06/2020] [Accepted: 08/10/2020] [Indexed: 12/20/2022]
Abstract
Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. Machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning in order to improve the predictive performance. The key challenge faced by multi-label learning is the exponential-sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification by introducing community detection methods for DTI prediction, in a method named DTI-MLCD. Moreover, we updated the gold standard data set by adding 15,000 more positive DTI samples compared with the data set that has been widely used by most previously published DTI prediction methods since 2008. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.
Affiliation(s)
- Yanyi Chu: School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiaoqi Shan: School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Tianhang Chen: School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Mingming Jiang: School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yanjing Wang: School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Qiankun Wang: School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong: School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei: School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
|
29
|
Jiang M, Li Z, Zhang S, Wang S, Wang X, Yuan Q, Wei Z. Drug-target affinity prediction using graph neural network and contact maps. RSC Adv 2020; 10:20701-20712. [PMID: 35517730 PMCID: PMC9054320 DOI: 10.1039/d0ra02297g] [Citation(s) in RCA: 166] [Impact Index Per Article: 33.2] [Received: 03/11/2020] [Accepted: 05/07/2020] [Indexed: 02/01/2023]
Abstract
Computer-aided drug design uses high-performance computers to simulate the tasks in drug design, which is a promising research area. Drug-target affinity (DTA) prediction is the most important step of computer-aided drug design, which could speed up drug development and reduce resource consumption. With the development of deep learning, the introduction of deep learning to DTA prediction and improving the accuracy have become a focus of research. In this paper, utilizing the structural information of molecules and proteins, two graphs of drug molecules and proteins are built up respectively. Graph neural networks are introduced to obtain their representations, and a method called DGraphDTA is proposed for DTA prediction. Specifically, the protein graph is constructed based on the contact map output from the prediction method, which could predict the structural characteristics of the protein according to its sequence. It can be seen from the test of various metrics on benchmark datasets that the method proposed in this paper has strong robustness and generalizability.
Affiliation(s)
- Mingjian Jiang
- Department of Computer Science and Technology, Ocean University of China, China
- Zhen Li
- Department of Computer Science and Technology, Ocean University of China, China
- Shugang Zhang
- Department of Computer Science and Technology, Ocean University of China, China
- Shuang Wang
- Department of Computer Science and Technology, Ocean University of China, China
- Xiaofeng Wang
- Department of Computer Science and Technology, Ocean University of China, China
- Qing Yuan
- Department of Computer Science and Technology, Ocean University of China, China
- Zhiqiang Wei
- Department of Computer Science and Technology, Ocean University of China, China
30
Mathai N, Kirchmair J. Similarity-Based Methods and Machine Learning Approaches for Target Prediction in Early Drug Discovery: Performance and Scope. Int J Mol Sci 2020; 21:ijms21103585. [PMID: 32438666 PMCID: PMC7279241 DOI: 10.3390/ijms21103585] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 04/27/2020] [Revised: 05/13/2020] [Accepted: 05/16/2020] [Indexed: 12/20/2022]
Abstract
Computational methods for predicting the macromolecular targets of drugs and drug-like compounds have evolved as a key technology in drug discovery. However, the established validation protocols leave several key questions regarding the performance and scope of methods unaddressed. For example, prediction success rates are commonly reported as averages over all compounds of a test set and do not consider the structural relationship between the individual test compounds and the training instances. In order to obtain a better understanding of the value of ligand-based methods for target prediction, we benchmarked a similarity-based method and a random forest based machine learning approach (both employing 2D molecular fingerprints) under three testing scenarios: a standard testing scenario with external data, a standard time-split scenario, and a scenario that is designed to most closely resemble real-world conditions. In addition, we deconvoluted the results based on the distances of the individual test molecules from the training data. We found that, surprisingly, the similarity-based approach generally outperformed the machine learning approach in all testing scenarios, even in cases where queries were structurally clearly distinct from the instances in the training (or reference) data, and despite a much higher coverage of the known target space.
Affiliation(s)
- Neann Mathai
- Department of Chemistry and Computational Biology Unit (CBU), University of Bergen, N-5020 Bergen, Norway
- Johannes Kirchmair
- Department of Chemistry and Computational Biology Unit (CBU), University of Bergen, N-5020 Bergen, Norway
- Department of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, 1090 Vienna, Austria
31
Wang X, Liu Y, Lu F, Li H, Gao P, Wei D. Dipeptide Frequency of Word Frequency and Graph Convolutional Networks for DTA Prediction. Front Bioeng Biotechnol 2020; 8:267. [PMID: 32318557 PMCID: PMC7147459 DOI: 10.3389/fbioe.2020.00267] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/30/2020] [Accepted: 03/13/2020] [Indexed: 11/13/2022]
Abstract
Deep learning is an effective method to capture drug-target binding affinity, but low accuracy is still an obstacle to be overcome. Thus, we propose a novel predictor for drug-target binding affinity based on dipeptide frequency of word frequency encoding and a hybrid graph convolutional network. Word frequency characteristics of natural language are used to improve the frequency characteristics of peptides to express target proteins. For each drug molecule, the five different features of drug atoms and the atomic bond relationships are expressed as graphs. The obtained protein features and graph structure are used as the input of a convolutional neural network and the input of a graph convolutional neural network, respectively. A prediction model is established to predict the drug affinity by calculating the hidden relationship. In the KIBA data set test experiment, the consistency coefficient of the model is 0.901, which is 0.01 higher than the existing model, and the MSE (mean square error) of the model is 0.126, which is 5% lower than the existing model. In the Davis data set test experiment, the consistency coefficient of the model is 0.895, which is 0.006 higher than the existing model, and the MSE of the model is 0.220, which is 4% lower than the existing model. These results show that our proposed method can not only predict the affinity better than those existing models, but also outperform unitary deep learning approaches.
Affiliation(s)
- Xianfang Wang
- School of Computer Science and Technology, Henan Institute of Technology, Xinxiang, China
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Yifeng Liu
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Fan Lu
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Hongfei Li
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Peng Gao
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Dongqing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
32
Rayhan F, Ahmed S, Mousavian Z, Farid DM, Shatabda S. FRnet-DTI: Deep convolutional neural network for drug-target interaction prediction. Heliyon 2020; 6:e03444. [PMID: 32154410 PMCID: PMC7052404 DOI: 10.1016/j.heliyon.2020.e03444] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 12/10/2018] [Revised: 06/16/2019] [Accepted: 02/14/2020] [Indexed: 01/09/2023]
Abstract
The task of drug-target interaction prediction holds significant importance in pharmacology and therapeutic drug design. In this paper, we present FRnet-DTI, an auto-encoder based feature manipulation and a convolutional neural network based classifier for drug-target interaction prediction. Two convolutional neural networks are proposed: FRnet-Encode and FRnet-Predict. Here, one model is used for feature manipulation and the other one for classification. Using the first method, FRnet-Encode, we generate 4096 features for each of the instances in each of the datasets and use the second method, FRnet-Predict, to identify interaction probability employing those features. We have tested our method on four gold standard datasets extensively used by other researchers. Experimental results show that our method significantly improves over the state-of-the-art method on three out of four drug-target interaction gold standard datasets on both the area under the Receiver Operating Characteristic curve (auROC) and the area under the Precision-Recall curve (auPR) metrics. We also introduce twenty new potential drug-target pairs for interaction based on high prediction scores. The source code and implementation details of our methods are available from https://github.com/farshidrayhanuiu/FRnet-DTI/ and also readily available to use as a web application from http://farshidrayhan.pythonanywhere.com/FRnet-DTI/.
Affiliation(s)
- Farshid Rayhan
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
- Sajid Ahmed
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
- Zaynab Mousavian
- School of Mathematics, Statistics, and Computer Science, College of Science, University of Tehran, Tehran, Iran
- Dewan Md Farid
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
- Swakkhar Shatabda
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
33
Zhao L, Wang J, Pang L, Liu Y, Zhang J. GANsDTA: Predicting Drug-Target Binding Affinity Using GANs. Front Genet 2020; 10:1243. [PMID: 31993067 PMCID: PMC6962343 DOI: 10.3389/fgene.2019.01243] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 08/23/2019] [Accepted: 11/11/2019] [Indexed: 01/09/2023]
Abstract
The computational prediction of interactions between drugs and targets is a standing challenge in drug discovery. State-of-the-art methods for drug-target interaction prediction are primarily based on supervised machine learning with known label information. However, in biomedicine, obtaining labeled training data is an expensive and laborious process. This paper proposes a semi-supervised generative adversarial networks (GANs)-based method to predict binding affinity. Our method comprises two parts: two GANs for feature extraction and a regression network for prediction. The semi-supervised mechanism allows our model to learn protein and drug features from both labeled and unlabeled data. We evaluate the performance of our method using multiple public datasets. Experimental results demonstrate that our method achieves competitive performance while utilizing freely available unlabeled data. Our results suggest that utilizing such unlabeled data can considerably help improve performance in various biomedical relation extraction processes, for example, drug-target interaction and protein-protein interaction, particularly when only limited labeled data are available in such tasks. To the best of our knowledge, this is the first semi-supervised GANs-based method to predict binding affinity.
Affiliation(s)
- Lingling Zhao
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Junjie Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Long Pang
- Institute of Space Environment and Material Science, Harbin Institute of Technology, Harbin, China
- Yang Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Jun Zhang
- Department of Rehabilitation, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
34
Chu Y, Kaushik AC, Wang X, Wang W, Zhang Y, Shan X, Salahub DR, Xiong Y, Wei DQ. DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features. Brief Bioinform 2019; 22:451-462. [PMID: 31885041 DOI: 10.1093/bib/bbz152] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Received: 06/16/2019] [Revised: 11/01/2019] [Accepted: 11/04/2019] [Indexed: 12/18/2022]
Abstract
Drug-target interactions (DTIs) play a crucial role in target-based drug discovery and development. Computational prediction of DTIs can effectively complement experimental wet-lab techniques for the identification of DTIs, which are typically time- and resource-consuming. However, current DTI prediction approaches suffer from low precision and a high false-positive rate. In this study, we aim to develop a novel DTI prediction method for improving the prediction performance based on a cascade deep forest (CDF) model, named DTI-CDF, with multiple similarity-based features between drugs and similarity-based features between target proteins extracted from the heterogeneous graph, which contains known DTIs. In the experiments, we built five replicates of 10-fold cross-validation under three different experimental settings of data sets, in which the DTI values of certain drugs (SD), targets (ST), or drug-target pairs (SP) are missing from the training sets but present in the test sets. The experimental results demonstrate that our proposed approach DTI-CDF achieves a significantly higher performance than traditional ensemble learning-based methods such as random forest and XGBoost, deep neural networks, and state-of-the-art methods such as DDR. Furthermore, there are 1352 newly predicted DTIs which are proved to be correct by the KEGG and DrugBank databases. The data sets and source code are freely available at https://github.com//a96123155/DTI-CDF.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiangeng Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Wei Wang
- Mathematical Sciences, Shanghai Jiao Tong University
- Yufang Zhang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
35
Zhang W, Lin W, Zhang D, Wang S, Shi J, Niu Y. Recent Advances in the Machine Learning-Based Drug-Target Interaction Prediction. Curr Drug Metab 2019; 20:194-202. [PMID: 30129407 DOI: 10.2174/1389200219666180821094047] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 10/16/2017] [Revised: 01/18/2018] [Accepted: 03/19/2018] [Indexed: 12/28/2022]
Abstract
BACKGROUND The identification of drug-target interactions is a crucial issue in drug discovery. In recent years, researchers have made great efforts in drug-target interaction prediction, and developed databases, software and computational methods. RESULTS In this paper, we review the recent advances in machine learning-based drug-target interaction prediction. First, we briefly introduce the datasets and data, and summarize features for drugs and targets which can be extracted from different data. Since drug-drug similarity and target-target similarity are important for many machine learning prediction models, we introduce how to calculate similarities based on data or features. Different machine learning-based drug-target interaction prediction methods can be proposed by using different features or information. Thus, we summarize, analyze and compare different machine learning-based prediction methods. CONCLUSION This study provides a guide to the development of computational methods for drug-target interaction prediction.
Affiliation(s)
- Wen Zhang
- School of Computer Science, Wuhan University, Wuhan 430072, China
- Weiran Lin
- School of Computer Science, Wuhan University, Wuhan 430072, China
- Ding Zhang
- School of Computer Science, Wuhan University, Wuhan 430072, China
- Siman Wang
- School of Computer Science, Wuhan University, Wuhan 430072, China
- Jingwen Shi
- School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
- Yanqing Niu
- School of Mathematics and Statistics, South-Central University for Nationalities, Wuhan 430074, China
36
Mahmud SMH, Chen W, Meng H, Jahan H, Liu Y, Hasan SMM. Prediction of drug-target interaction based on protein features using undersampling and feature selection techniques with boosting. Anal Biochem 2019; 589:113507. [PMID: 31734254 DOI: 10.1016/j.ab.2019.113507] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 08/21/2019] [Revised: 11/05/2019] [Accepted: 11/08/2019] [Indexed: 12/29/2022]
Abstract
Accurate identification of drug-target interaction (DTI) is a crucial and challenging task in the drug discovery process, with enormous benefit to patients and pharmaceutical companies. Traditional wet-lab experiments for DTI are expensive, time-consuming, and labor-intensive. Therefore, many computational techniques have been established for this purpose, although a huge number of interactions are still undiscovered. Here, we present pdti-EssB, a new computational model for identification of DTI using protein sequence and drug molecular structure. More specifically, each drug molecule is transformed into a molecular substructure fingerprint. For a protein sequence, different descriptors are utilized to represent its evolutionary, sequence, and structural information. In addition, our proposed method uses data balancing techniques to handle the imbalance problem and applies a novel feature eliminator to extract the optimal features for accurate prediction. In this paper, four classes of DTI benchmark datasets are used to construct a predictive model with XGBoost. Here, the auROC is utilized as an evaluation metric to compare the performance of the pdti-EssB method with recent methods, applying five-fold cross-validation. Finally, the experimental results indicate that our proposed method is able to outperform other approaches in predicting DTI, and introduces new drug-target interaction samples based on prediction probability scores. The pdti-EssB webserver is available online at http://pdtiessb-uestc.com/.
Affiliation(s)
- S M Hasan Mahmud
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
- Wenyu Chen
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
- Han Meng
- School of Political Science and Public Administration, University of Electronic Science and Technology of China, Chengdu, 611731, China.
- Hosney Jahan
- College of Computer Science, Sichuan University, Chengdu, 610065, China.
- Yongsheng Liu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
- S M Mamun Hasan
- Department of Internal Medicine, Rangpur Medical College, Rangpur, 5400, Bangladesh.
37
A Multi-Label Learning Framework for Drug Repurposing. Pharmaceutics 2019; 11:pharmaceutics11090466. [PMID: 31505805 PMCID: PMC6781509 DOI: 10.3390/pharmaceutics11090466] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/27/2019] [Revised: 08/22/2019] [Accepted: 09/05/2019] [Indexed: 01/10/2023]
Abstract
Drug repurposing plays an important role in screening old drugs for new therapeutic efficacy. The existing methods commonly treat prediction of drug-target interaction as a problem of binary classification, in which a large number of randomly sampled drug-target pairs accounting for over 50% of the entire training dataset are necessarily required. Such a large number of negative examples that do not come from experimental observations inevitably decrease the credibility of predictions. In this study, we propose a multi-label learning framework to find new uses for old drugs and discover new drugs for known target genes. In the framework, each drug is treated as a class label and its target genes are treated as the class-specific training data to train a supervised learning model of l2-regularized logistic regression. As such, the inter-drug associations are explicitly modelled into the framework and all the class-specific training data come from experimental observations. In addition, the data constraint is less demanding, for instance, the chemical substructures of a drug are no longer needed and the novel target genes are inferred only from the underlying patterns of the known genes targeted by the drug. Stratified multi-label cross-validation shows that 84.9% of known target genes have at least one drug correctly recognized, and the proposed framework correctly recognizes 86.73% of the independent test drug-target interactions (DTIs) from DrugBank. These results show that the proposed framework could generalize well in the large drug/class space without the information of drug chemical structures and target protein structures. Furthermore, we use the trained model to predict new drugs for the known target genes, identify new genes for the old drugs, and infer new associations between old drugs and new disease phenotypes via the OMIM database. 
Gene ontology (GO) enrichment analyses and the disease associations reported in recent literature provide supporting evidence for the computational results, which potentially shed light on new clinical therapies for new and/or old disease phenotypes.
38
Robert BM, Brindha GR, Santhi B, Kanimozhi G, Prasad NR. Computational models for predicting anticancer drug efficacy: A multi linear regression analysis based on molecular, cellular and clinical data of oral squamous cell carcinoma cohort. Comput Methods Programs Biomed 2019; 178:105-112. [PMID: 31416538 DOI: 10.1016/j.cmpb.2019.06.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 02/25/2019] [Revised: 04/15/2019] [Accepted: 06/11/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND AND OBJECTIVES The computational prediction of drug responses based on the analysis of multiple clinical features of the tumor will be a novel strategy for accomplishing the long-term goal of precision medicine in oncology. Cancer patients will benefit if all the tumor characteristics (data) are computationally accounted for in the selection of the most effective and precise therapeutic drug. In this study, we developed and validated a few computational models to predict anticancer drug efficacy based on molecular, cellular and clinical features of a cohort of 31 oral squamous cell carcinoma (OSCC) cases. METHODS We developed drug efficacy prediction models using multiple tumor features by employing statistical methods such as multi linear regression (MLR), modified MLR-weighted least square (MLR-WLS) and enhanced MLR-WLS. All three drug efficacy prediction models were then validated using the data of actual OSCC samples (train-test ratio 31:31) and actual vs. hypothetical samples (train-test ratio 31:30). The best statistical model selected, i.e., enhanced MLR-WLS, was then cross-validated (CV) using 341 theoretical tumor data. Finally, the performances of the models were assessed by the level of learning confidence, significance, accuracy and error terms. RESULTS The train-test process for the real tumor samples with the MLR-WLS method revealed an enhancement in drug efficacy prediction, and we observed very little difference between actual and predicted priming. Furthermore, we found only a small difference between actual and predicted apoptotic priming for tumors 6, 8, 21 and 30, whereas for the remaining tumors there were no differences between predicted and actual priming data. The error terms (actual vs. predicted) also revealed the reliability of the enhanced MLR-WLS model for drug efficacy prediction.
CONCLUSION We developed effective computational prediction models using MLR analysis for anticancer drug efficacy, which will be useful in the field of precision medicine for choosing drugs in a personalized manner. We observed that the enhanced MLR-WLS model was the best fit to predict anticancer drug efficacy, which may have translational applications.
Affiliation(s)
- Beaulah Mary Robert
- Department of Biochemistry and Biotechnology, Annamalai University, Annamalainagar 608 002, Tamilnadu, India
- G R Brindha
- School of Computing, SASTRA Deemed to be University, Tirumalaisamudram, Thanjavur 613401, Tamilnadu, India.
- B Santhi
- School of Computing, SASTRA Deemed to be University, Tirumalaisamudram, Thanjavur 613401, Tamilnadu, India
- G Kanimozhi
- Department of Biochemistry, Dharmapuramn Gnanambigai Government Arts and Science College for Women, Mayiladuthurai, Tamilnadu, India
- Nagarajan Rajendra Prasad
- Department of Biochemistry and Biotechnology, Annamalai University, Annamalainagar 608 002, Tamilnadu, India.
39
Nguyen AH, Marsh P, Schmiess-Heine L, Burke PJ, Lee A, Lee J, Cao H. Cardiac tissue engineering: state-of-the-art methods and outlook. J Biol Eng 2019; 13:57. [PMID: 31297148 PMCID: PMC6599291 DOI: 10.1186/s13036-019-0185-0] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 02/06/2019] [Accepted: 06/03/2019] [Indexed: 12/17/2022]
Abstract
The purpose of this review is to assess the state-of-the-art fabrication methods, advances in genome editing, and the use of machine learning to shape the prospective growth in cardiac tissue engineering. Those interdisciplinary emerging innovations would move forward basic research in this field and their clinical applications. The long-entrenched challenges in this field could be addressed by novel 3-dimensional (3D) scaffold substrates for cardiomyocyte (CM) growth and maturation. Stem cell-based therapy through genome editing techniques can repair gene mutation, control better maturation of CMs or even reveal its molecular clock. Finally, machine learning and precision control for improvements of the construct fabrication process and optimization in tissue-specific clonal selections with an outlook of cardiac tissue engineering are also presented.
Affiliation(s)
- Anh H. Nguyen
- Electrical and Computer Engineering Department, University of Alberta, Edmonton, Alberta, Canada
- Electrical Engineering and Computer Science Department, University of California Irvine, Irvine, CA, USA
- Paul Marsh
- Electrical Engineering and Computer Science Department, University of California Irvine, Irvine, CA, USA
- Lauren Schmiess-Heine
- Electrical Engineering and Computer Science Department, University of California Irvine, Irvine, CA, USA
- Peter J. Burke
- Electrical Engineering and Computer Science Department, University of California Irvine, Irvine, CA, USA
- Biomedical Engineering Department, University of California Irvine, Irvine, CA, USA
- Chemical Engineering and Materials Science Department, University of California Irvine, Irvine, CA, USA
- Abraham Lee
- Biomedical Engineering Department, University of California Irvine, Irvine, CA, USA
- Mechanical and Aerospace Engineering Department, University of California Irvine, Irvine, CA, USA
- Juhyun Lee
- Bioengineering Department, University of Texas at Arlington, Arlington, TX, USA
- Hung Cao
- Electrical Engineering and Computer Science Department, University of California Irvine, Irvine, CA, USA
- Biomedical Engineering Department, University of California Irvine, Irvine, CA, USA
- Henry Samueli School of Engineering, University of California, Irvine, USA
40
Masoudi-Sobhanzadeh Y, Omidi Y, Amanlou M, Masoudi-Nejad A. Trader as a new optimization algorithm predicts drug-target interactions efficiently. Sci Rep 2019; 9:9348. [PMID: 31249365 PMCID: PMC6597553 DOI: 10.1038/s41598-019-45814-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 01/18/2019] [Accepted: 06/17/2019] [Indexed: 12/29/2022]
Abstract
Several machine learning approaches have been proposed for predicting new benefits of the existing drugs. Although these methods have introduced new usage(s) of some medications, efficient methods can lead to more accurate predictions. To this end, we proposed a novel machine learning method which is based on a new optimization algorithm, named Trader. To show the capabilities of the proposed algorithm which can be applied to the different scope of science, it was compared with ten other state-of-the-art optimization algorithms based on the standard and advanced benchmark functions. Next, a multi-layer artificial neural network was designed and trained by Trader to predict drug-target interactions (DTIs). Finally, the functionality of the proposed method was investigated on some DTIs datasets and compared with other methods. The data obtained by Trader showed that it eliminates the disadvantages of different optimization algorithms, resulting in a better outcome. Further, the proposed machine learning method was found to achieve a significant level of performance compared to the other popular and efficient approaches in predicting unknown DTIs. All the implemented source codes are freely available at https://github.com/LBBSoft/Trader .
Collapse
Affiliation(s)
- Yosef Masoudi-Sobhanzadeh
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Yadollah Omidi
  - Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Massoud Amanlou
  - Drug Design and Development Research Center, The Institute of Pharmaceutical Sciences (TIPS), Tehran University of Medical Sciences, Tehran, 14176-53955, Iran
- Ali Masoudi-Nejad
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
|
41
|
CFSBoost: Cumulative feature subspace boosting for drug-target interaction prediction. J Theor Biol 2019; 464:1-8. [DOI: 10.1016/j.jtbi.2018.12.024] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 03/23/2018] [Revised: 12/13/2018] [Accepted: 12/18/2018] [Indexed: 01/12/2023]
|
42
|
Abstract
Pharmacological science is trying to establish the link between chemicals, targets, and disease-related phenotypes. A plethora of chemical proteomics and structural data have been generated, thanks to the target-based approach that has dominated drug discovery at the turn of the century. This is an invaluable source of information for in silico target profiling. Prediction is based on the principle of chemical similarity (similar drugs bind similar targets) or on first principles from the biophysics of molecular interactions. In the first case, compound comparison is made through ligand-based chemical similarity search or through classifier-based machine learning approaches. The 3D techniques are based on 3D structural descriptors or energy-based scoring schemes to infer the binding affinity of a compound with its putative target. More recently, a new approach based on a compound set metric has been proposed in which a query compound is compared with the whole set of compounds associated with a target or a family of targets. This chapter reviews the different techniques of in silico target profiling and their main applications such as inference of unwanted targets, drug repurposing, or compound prioritization after phenotypic-based screening campaigns.
|
43
|
Alberga D, Trisciuzzi D, Montaruli M, Leonetti F, Mangiatordi GF, Nicolotti O. A New Approach for Drug Target and Bioactivity Prediction: The Multifingerprint Similarity Search Algorithm (MuSSeL). J Chem Inf Model 2018; 59:586-596. [PMID: 30485097 DOI: 10.1021/acs.jcim.8b00698] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Indexed: 01/13/2023]
Abstract
We present MuSSeL, a multifingerprint similarity search algorithm, able to predict putative drug targets for a given query small molecule as well as to return a quantitative assessment of its bioactivity in terms of Ki or IC50 values. Predictions are made automatically by exploiting a large collection of high-quality experimental bioactivity data available from ChEMBL (version 22.1) and combining, in a consensus-like approach, predictions resulting from a similarity search performed using 13 different fingerprint definitions. Importantly, the proposed algorithm is also effective in detecting and handling activity cliffs. A calibration set including small molecules present in the last updated version of ChEMBL (version 23) was employed to properly tune the algorithm parameters. Three randomly built external sets were instead used to challenge model performance. The potential use of MuSSeL was also challenged by a prospective exercise for the prediction of five bioactive compounds taken from articles published in the Journal of Medicinal Chemistry just a few months ago. The paper emphasizes the importance of implementing multifingerprint consensus strategies to increase the confidence in prediction of similarity search algorithms and provides a fast and easy-to-run tool for drug target and bioactivity prediction.
Affiliation(s)
- Domenico Alberga
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Daniela Trisciuzzi
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Michele Montaruli
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Francesco Leonetti
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Giuseppe Felice Mangiatordi
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Orazio Nicolotti
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
|
44
|
Yang J, Li A, Li Y, Guo X, Wang M. A novel approach for drug response prediction in cancer cell lines via network representation learning. Bioinformatics 2018; 35:1527-1535. [DOI: 10.1093/bioinformatics/bty848] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 03/28/2018] [Revised: 09/09/2018] [Accepted: 10/09/2018] [Indexed: 11/13/2022]
Affiliation(s)
- Jianghong Yang
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH230037, China
- Ao Li
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH230037, China
  - Centers for Biomedical Engineering, University of Science and Technology of China, Hefei AH230037, China
- Yongqiang Li
  - Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, School of Basic Medical Sciences, Henan University, Kaifeng, China
- Xiangqian Guo
  - Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, School of Basic Medical Sciences, Henan University, Kaifeng, China
- Minghui Wang
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH230037, China
  - Centers for Biomedical Engineering, University of Science and Technology of China, Hefei AH230037, China
|
45
|
iDTI-ESBoost: Identification of Drug Target Interaction Using Evolutionary and Structural Features with Boosting. Sci Rep 2017; 7:17731. [PMID: 29255285 PMCID: PMC5735173 DOI: 10.1038/s41598-017-18025-2] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 07/21/2017] [Accepted: 12/05/2017] [Indexed: 02/07/2023]
Abstract
Prediction of new drug-target interactions is critically important as it can lead researchers to find new uses for old drugs and to disclose their therapeutic profiles or side effects. However, experimental prediction of drug-target interactions is expensive and time-consuming. As a result, computational methods for predicting new drug-target interactions have gained tremendous interest in recent times. Here we present iDTI-ESBoost, a prediction model for identification of drug-target interactions using evolutionary and structural features. Our proposed method uses a novel data balancing and boosting technique to predict drug-target interactions. On four benchmark datasets taken from gold standard data, iDTI-ESBoost outperforms the state-of-the-art methods in terms of area under the receiver operating characteristic (auROC) curve. iDTI-ESBoost also outperforms the latest and best-performing method found in the literature in terms of area under the precision-recall (auPR) curve. This is significant as auPR curves are argued to be a suitable metric for comparison on imbalanced datasets similar to the one studied here. Our reported results show the effectiveness of the classifier, balancing methods and the novel features incorporated in iDTI-ESBoost. iDTI-ESBoost is a novel prediction method that has for the first time exploited structural features along with evolutionary features to predict drug-protein interactions. We believe the excellent performance of iDTI-ESBoost in terms of both auROC and auPR would motivate researchers and practitioners to use it to predict drug-target interactions. To facilitate that, iDTI-ESBoost is implemented and made publicly available at: http://farshidrayhan.pythonanywhere.com/iDTI-ESBoost/.
|
46
|
Bolgár B, Antal P. VB-MK-LMF: fusion of drugs, targets and interactions using variational Bayesian multiple kernel logistic matrix factorization. BMC Bioinformatics 2017; 18:440. [PMID: 28978313 PMCID: PMC5628496 DOI: 10.1186/s12859-017-1845-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 06/30/2017] [Accepted: 09/21/2017] [Indexed: 12/20/2022]
Abstract
BACKGROUND Computational fusion approaches to drug-target interaction (DTI) prediction, capable of utilizing multiple sources of background knowledge, were reported to achieve superior predictive performance in multiple studies. Other studies showed that specificities of the DTI task, such as weighting the observations and focusing the side information, are also vital for reaching top performance. METHOD We present Variational Bayesian Multiple Kernel Logistic Matrix Factorization (VB-MK-LMF), which unifies the advantages of (1) multiple kernel learning, (2) weighted observations, (3) graph Laplacian regularization, and (4) explicit modeling of probabilities of binary drug-target interactions. RESULTS VB-MK-LMF achieves significantly better predictive performance in standard benchmarks compared to state-of-the-art methods, which can be traced back to multiple factors. The systematic evaluation of the effect of multiple kernels confirms their benefits, but also highlights the limitations of linear kernel combinations, already recognized in other fields. The analysis of the effect of prior kernels using varying sample sizes sheds light on the balance of data and knowledge in DTI tasks and on the rate at which the effect of priors vanishes. This also shows the existence of "small sample size" regions where using side information offers significant gains. Alongside favorable predictive performance, a notable property of MF methods is that they provide a unified space for drugs and targets using latent representations. Compared to earlier studies, the dimensionality of this space proved to be surprisingly low, which makes the latent representations constructed by VB-MK-LMF especially well-suited for visual analytics. The probabilistic nature of the predictions allows the calculation of the expected values of hits in functionally relevant sets, which we demonstrate by predicting drug promiscuity. The variational Bayesian approximation is also implemented for general-purpose graphics processing units, yielding significantly improved computational time. CONCLUSION In standard benchmarks, VB-MK-LMF shows significantly improved predictive performance in a wide range of settings. Beyond these benchmarks, another contribution of our work is highlighting and providing estimates for further pharmaceutically relevant quantities, such as promiscuity, druggability and total number of interactions.
Affiliation(s)
- Bence Bolgár
  - Department of Measurement and Information Systems, Budapest University of Technology and Economics, Magyar tudósok krt. 2., Budapest, 1117 Hungary
- Péter Antal
  - Department of Measurement and Information Systems, Budapest University of Technology and Economics, Magyar tudósok krt. 2., Budapest, 1117 Hungary
|