Biomarker Discovery for an Identified Target / Disease Indication

Objective

Biomarkers have a wide variety of therapeutic applications: diagnosing a disease or its subtypes, monitoring clinical response to a therapeutic, detecting disease recurrence or assessing prognosis, and distinguishing the disease subtypes that respond to a drug. Identifying biomarkers can therefore be crucial to a drug's success in clinical trials: it supports selecting patients with the right disease subtype, identifying responding subpopulations in ongoing trials, and monitoring the clinical efficacy of the therapeutic.

Methodology

  • Apply multiple machine learning (ML) methods, including Natural Language Processing (NLP) and Entity Linking, to vast scientific literature databases (e.g., PubMed, Semantic Scholar, Google Scholar).
  • Extract all biomarker data related to a specific disease indication from these literature sources.
  • Collaborate with biology scientists and data scientists to integrate publication data with multi-omics datasets (genomic, transcriptomic, proteomic, metabolomic).
  • Use diverse ML and deep learning (DL) models to analyze these data types together and identify novel prognostic, predictive, and diagnostic biomarkers specific to disease subtypes.
  • Generate customized dashboards for visualizing and selecting biomarker data using specific filters/criteria, including various types and levels of evidence.
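
The literature-mining and filtering steps above can be illustrated with a minimal sketch. It is not the actual pipeline: a small hand-written lexicon and regular-expression matching stand in for the NLP and entity-linking models, and the abstracts, document IDs, type labels, and the function names `extract_biomarkers` and `filter_evidence` are all hypothetical, introduced here only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical lexicon: surface form -> (canonical gene symbol, biomarker type).
# The type labels are illustrative; a real entity-linking step would map
# mentions to a curated ontology rather than a hand-written dictionary.
LEXICON = {
    "her2": ("ERBB2", "predictive"),
    "erbb2": ("ERBB2", "predictive"),
    "ki-67": ("MKI67", "prognostic"),
    "pd-l1": ("CD274", "predictive"),
}

# Invented example abstracts keyed by placeholder document IDs.
ABSTRACTS = {
    "doc1": "HER2 amplification was associated with response to therapy.",
    "doc2": "ERBB2 and Ki-67 expression were evaluated in tumor samples.",
    "doc3": "This abstract mentions no marker from the lexicon.",
}


@dataclass
class Evidence:
    biomarker: str  # canonical identifier after linking
    kind: str       # prognostic, predictive, or diagnostic
    sources: set    # IDs of the documents that mention the biomarker


def extract_biomarkers(abstracts):
    """Link lexicon mentions in each abstract to a canonical biomarker."""
    found = {}
    for doc_id, text in abstracts.items():
        lowered = text.lower()
        for surface, (canonical, kind) in LEXICON.items():
            # Require non-word characters on both sides of the mention so a
            # surface form is not matched inside a longer token.
            pattern = r"(?<!\w)" + re.escape(surface) + r"(?!\w)"
            if re.search(pattern, lowered):
                ev = found.setdefault(canonical, Evidence(canonical, kind, set()))
                ev.sources.add(doc_id)
    return found


def filter_evidence(found, kind=None, min_sources=1):
    """Apply dashboard-style filters: biomarker type and level of evidence."""
    rows = [
        ev for ev in found.values()
        if (kind is None or ev.kind == kind) and len(ev.sources) >= min_sources
    ]
    # Best-supported biomarkers first, ties broken alphabetically.
    return sorted(rows, key=lambda ev: (-len(ev.sources), ev.biomarker))


found = extract_biomarkers(ABSTRACTS)
for ev in filter_evidence(found, kind="predictive"):
    print(ev.biomarker, ev.kind, sorted(ev.sources))
```

Counting the distinct documents that mention a biomarker is only a crude proxy for a level of evidence. In practice the filter criteria would more likely draw on study type, cohort size, or agreement with the multi-omics data.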

Outcomes / Impact

Applying ML methods to multi-modal datasets replaces very time-consuming manual literature searches with a time- and cost-effective approach to biomarker discovery. Integrating these AI-based methods across multiple data types provides a faster and more effective way to identify disease-appropriate biomarkers or biomarker panels that laborious manual exploration might otherwise miss. This automated biomarker search can greatly accelerate personalized-medicine research by supporting early diagnosis, disease subclassification, patient stratification, and identification of responding patient populations in clinical trials.