Before joining VCU, I earned my BS in Mathematics from the University of Cincinnati, where I was advised by Prof. Wang Xia. My thesis work focused on generalized linear regression models and the least absolute shrinkage and selection operator (LASSO).
My professional interests center on clustering, machine learning, optimization, and regression analysis. A significant portion of my research is dedicated to developing specialized, automated tools for preprocessing and analyzing complex, highly multiplexed, multimodal imaging data, particularly in single-cell studies. This includes work with platforms such as PhenoCycler, MERSCOPE, and Xenium. I also work on combining methylation and fragmentomic signatures from blood-based circulating tumor DNA. By integrating these signatures with multimodal deep-learning analyses, my work contributes to advancing early cancer detection methods.
Please feel free to connect if you are interested in spatial omics and early cancer detection, or would like to discuss my current and previous studies.
Below are my publications:
Contributions: statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Integrated Periodontal Meta-Atlas: We created the first integrated periodontal meta-atlas by combining four single-cell RNA sequencing (scRNA-seq) datasets with the new open-source software Cellenics, with a public database link (cellxgene). The atlas provides harmonized annotation of 32 unique cell types, including previously undiscovered populations of oral epithelial cells.
Spatial Segregation in Immune Responses: Our findings reveal spatial segregation between innate and adaptive immune responses in periodontitis. The tissue closest to the tooth, which supports facultative anaerobes, shows significant upregulation of immune checkpoint molecules such as PD-L1.
Differential Gene Expression and Cellular Activity: Approximately two-thirds of differentially expressed genes in periodontitis originate from keratinocytes, fibroblasts, and vascular endothelial cells. Keratinocytes near the tooth exhibit high activity in receptor-ligand interactions, even in healthy conditions.
Discovery of Intracellular Polybacterial Associations: We discovered intracellular polybacterial associations (PIC) in human tissues for the first time, particularly in keratinocytes. This includes a novel finding of an epithelial stem cell containing at least four intracellular microbes, a previously unknown phenomenon in human cells.
Contributions: data collection, algorithm design and implementation, benchmarking experiments, statistical analysis, data visualization, original draft, critical revision of the manuscript
Key findings:
Versatile and scalable in various contexts: disease, tissue, assay, modality, and species.
Superior accuracy and scalability compared to existing unsupervised cell type annotation methods.
Enhanced cell type annotation for spatial transcriptomics data via label transfer from single-cell RNA-seq.
Revealed new cellular associations in inflammatory salivary gland diseases, demonstrating its utility for deep phenotyping and clinical applications.
Consistent cell typing results across spatial transcriptomics and proteomics, supporting joint multiomics approaches.
Essential for advancing spatial biology towards translational and clinical research, necessitating multimodal panel designs and flexible analysis pipelines.
Aligned matching cell IDs between spatial proteomics and spatial transcriptomics data, facilitating integrated multiomic analyses.
Contributions: statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Comprehensive Cellular Mapping: Through the use of unbiased single-cell and spatial transcriptomics, a comprehensive understanding of the cellular landscape of healthy salivary glands was developed, along with how this landscape changes in patients with Sjögren's Disease.
Identification of Novel Cell Types: The study identified novel seromucous acinar cell types, including a specific population of PRR4+CST3+WFDC2- seromucous acinar cells that are targeted in SjD, suggesting potential biomarkers or therapeutic targets.
Cytotoxic T Cells in SjD: Notably, the research highlighted the presence of GZMK+CD8 T cells that are enriched in SjD, exhibit a cytotoxic phenotype, and are physically associated with immune-engaged epithelial cells, indicating a direct role in the disease pathology.
Impact on Acinar Cells and Disease Pathology: The findings elucidate the immune response’s impact on transitioning acinar cells with high secretion levels and explain the loss of these cells in Sjögren's Disease, offering insights into the complex interplay of various cell types in the salivary glands and their contributions to the disease’s pathology.
Contributions: Implemented the machine learning algorithm, statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Development of SPOT-MAS Assay: We developed the SPOT-MAS (Screening for the Presence Of Tumor by Methylation And Size) assay, which uses targeted and shallow genome-wide sequencing (∼0.55X) to profile methylomics, fragmentomics, copy number, and end motifs in cell-free DNA in a single workflow.
Application and Test Performance: The assay was applied to 738 nonmetastatic cancer patients and 1,550 healthy controls, demonstrating a sensitivity of 72.4% at 97.0% specificity for detecting five types of cancer, including breast, colorectal, gastric, lung, and liver cancers. Sensitivity was higher in later stages, reaching 88.3% for stage IIIA cancers.
Machine Learning Integration: Machine learning techniques were employed to extract multiple cancer and tissue-specific signatures, enhancing the assay's capability to detect and accurately determine the tumor origin, achieving an accuracy of 0.7.
eLife (impact factor: 8.7 | updated in 2023) arxiv
Contributions: Implemented the machine learning algorithm and conducted benchmarking experiments, statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Developed the SPOT-MAS assay for early CRC detection, integrating DNA methylation and fragment size analysis in a single test.
Utilized plasma cell-free DNA from 159 CRC patients and 158 healthy controls, analyzed with a deep neural network to differentiate between cases and controls.
Achieved a high diagnostic performance with an area under the curve (AUC) of 0.989, sensitivity of 96.8%, and specificity of 97% in detecting CRC.
External validation confirmed robust performance with an AUC of 0.96, underscoring the assay’s potential for clinical application in early cancer detection.
Future Oncology (impact factor: 3.3 | updated in 2023) arxiv
Contributions: Statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Developed a classification model based on cancer-related mutations and ctDNA fragment length profiles, employing deep sequencing of thirteen HCC-associated genes.
Achieved a diagnostic performance with an area under the curve (AUC) of 0.88, sensitivity of 89%, and specificity of 82% in the initial discovery cohort of 55 persons with HCC and 55 healthy participants.
Validation in an independent cohort maintained robust performance, achieving an AUC of 0.86 with sensitivity and specificity of 81%.
Provides a strong rationale for further clinical evaluation in a larger-scale prospective study to confirm the assay's effectiveness for early HCC detection.
BMC Cancer (impact factor: 3.8 | updated in 2023) arxiv
Contributions: Implemented the machine learning algorithm, statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Utilized a multimodal approach, SPOT-MAS, to analyze methylation changes, copy number alterations (CNA), and 4-nucleotide oligomer end motifs (EM) in cell-free DNA (cfDNA) from 239 nonmetastatic breast cancer patients and 278 healthy subjects.
Identified distinct profiles of genome-wide methylation changes, copy number alterations, and end motifs in cfDNA, which are significant for breast cancer detection.
Developed a multi-featured machine learning model integrating all three signatures (methylation, CNA, and EM), which significantly outperformed models based on individual features.
Achieved an area under the curve (AUC) of 0.91 with a sensitivity of 65% at 96% specificity, demonstrating enhanced accuracy for detecting early-stage breast cancer.
Frontiers in Oncology (impact factor: 4.7 | updated in 2023) arxiv
Contributions: Statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Developed a tumor-specific methylation atlas (TSMA) using whole-genome bisulfite sequencing data from five types of tumor tissues (breast, colorectal, gastric, liver, and lung cancers) and paired white blood cells (WBC).
Implemented a non-negative least squares (NNLS) deconvolution algorithm with TSMA to identify tumor tissue types in WGBS samples, though it faced challenges with cfDNA samples because of their high proportion of WBC-derived DNA.
Enhanced tissue-of-origin (TOO) detection model by integrating deconvolution scores from TSMA with other cfDNA features, using a multi-modal strategy.
Achieved an accuracy of 69% in determining TOO in a validation dataset of 239 low-depth cfDNA samples, utilizing a graph convolutional neural network that combines methylation density features with deconvolution scores.
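To illustrate the idea behind the NNLS deconvolution step mentioned above, here is a minimal sketch in Python. It is not the published pipeline: the atlas values are random placeholders, the mixture is simulated without noise, and the names `deconvolve`, `atlas`, and `sample` are hypothetical. It only assumes that an observed methylation profile can be approximated as a non-negative mixture of reference profiles, which is what `scipy.optimize.nnls` solves.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical sources: five tumor types plus white blood cells (WBC).
SOURCES = ["breast", "colorectal", "gastric", "liver", "lung", "WBC"]


def deconvolve(atlas: np.ndarray, sample: np.ndarray) -> np.ndarray:
    """Estimate non-negative mixing fractions of each atlas column in a sample.

    atlas:  (n_regions, n_sources) reference methylation levels.
    sample: (n_regions,) observed methylation levels.
    """
    coefficients, _residual_norm = nnls(atlas, sample)
    total = coefficients.sum()
    if total == 0:
        raise ValueError("sample could not be explained by any atlas column")
    return coefficients / total  # rescale so the fractions sum to 1


# Placeholder atlas (random values, for illustration only).
rng = np.random.default_rng(0)
atlas = rng.uniform(0.0, 1.0, size=(200, len(SOURCES)))

# Simulated cfDNA-like mixture: mostly WBC with a small liver contribution.
true_fractions = np.array([0.0, 0.0, 0.0, 0.10, 0.0, 0.90])
sample = atlas @ true_fractions

fractions = deconvolve(atlas, sample)
for name, value in zip(SOURCES, fractions):
    print(f"{name:>10s}: {value:.3f}")
```

In this simulated mixture the WBC column dominates, which mirrors the difficulty noted above for cfDNA samples: a small tumor-derived fraction has to be separated from a large WBC background.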
Contributions: Statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Conducted a web-based survey in 13 languages to assess anxiety levels among non-Japanese residents in Japan during the COVID-19 pandemic, using the State-Trait Anxiety Inventory.
Analyzed 357 valid responses from January to March 2021, finding that 54.6% of participants suffered from clinically significant anxiety (CSA).
Identified three significant risk factors associated with higher levels of anxiety: troubles or difficulties in learning or working, decreased sleep duration, and deteriorating overall physical health.
The study underscores the impact of COVID-19 on mental health among foreign communities in Japan, suggesting targeted support for these identified risk factors to mitigate anxiety.
PLOS ONE (impact factor: 3.7 | updated in 2023) arxiv
Contributions: Statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Conducted comprehensive analytical validation of the SPOT-MAS test on a retrospective cohort of 751 participants (290 healthy and 461 with confirmed cancer) to determine detection limits, repeatability, reproducibility, and resistance to potential interferents like hemoglobin and genomic DNA contamination.
In analytical tests, SPOT-MAS detected 50% of cancer cases at a tumor fraction of 0.049 with 98% specificity, demonstrating consistent results across intra- and inter-batch analyses.
Launched a large-scale multi-center prospective trial (K-DETEK) involving 9,057 asymptomatic participants in Vietnam, where the test achieved a positive predictive value of 58.14%, accuracy of 84.00% in tumor location prediction, a negative predictive value of 99.92%, overall sensitivity of 78.13%, and specificity of 99.80%.
This study marks the first extensive prospective validation of an MCED test in Asia, highlighting SPOT-MAS’s potential for early cancer detection in settings with limited resources and no existing nationwide cancer screening programs.
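For readers less familiar with the screening metrics quoted above, the sketch below shows how sensitivity, specificity, positive predictive value, and negative predictive value are derived from a confusion table. The counts are purely illustrative and are not the K-DETEK data; `ConfusionCounts` and `screening_metrics` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # cancer present, test positive
    fp: int  # cancer absent, test positive
    fn: int  # cancer present, test negative
    tn: int  # cancer absent, test negative


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute standard screening-test metrics from confusion counts."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of cancers detected
        "specificity": c.tn / (c.tn + c.fp),  # share of non-cancers cleared
        "ppv": c.tp / (c.tp + c.fp),          # share of positives that are cancer
        "npv": c.tn / (c.tn + c.fn),          # share of negatives that are cancer-free
    }


# Illustrative counts only (not taken from any study).
example = ConfusionCounts(tp=40, fp=10, fn=10, tn=940)
metrics = screening_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because cancer is rare among asymptomatic participants, the predictive values depend strongly on prevalence: a test can combine very high specificity and NPV with a much lower PPV.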
Contributions: Statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Analyzed 199 patients with AMI who underwent primary percutaneous coronary intervention, finding that 68.3% (136 patients) had MetS.
Patients with MetS were more likely to be female and had higher body mass indices, larger waist circumferences, and a higher prevalence of hypertension and diabetes compared to those without MetS.
Major in-hospital complications such as cardiogenic shock, heart failure, mechanical complications, and arrhythmias showed no significant difference between patients with and without MetS.
While MetS was not associated with increased all-cause in-hospital mortality (OR 4.92, 95% CI 0.62-39.31, P = .13), it was significantly associated with higher cardiovascular mortality, highlighting increased waist circumference as a critical factor for increased all-cause mortality.
Medicine (impact factor: 1.55 | updated in 2023) arxiv
Contributions: Statistical analysis, data visualization, critical revision of the manuscript
Key findings:
Conducted a retrospective cohort study at a safety-net hospital in Virginia, comparing disease severity between 500 Delta and 500 Omicron variant-infected adults using electronic medical record data.
Identified 279 propensity score-matched pairs from the cohort, primarily consisting of unvaccinated individuals with medical comorbidities who self-identified as Black.
Found that individuals infected with the Delta variant exhibited more severe disease compared to those infected with the Omicron variant, independent of vaccination status.
Highlighted that patients with kidney, liver, respiratory diseases, and cancer are at higher risk for severe COVID-19, while those with 2 doses of COVID-19 vaccine tended to have less severe disease.
Contributions: Statistical analysis, data visualization, critical revision of the manuscript
The Lancet Regional Health-Southeast Asia (impact factor: 2.2 | updated in 2023) arxiv
Awards and Recognitions
Jacob B. and Veronica Schmitt Scholarship - 2017, 2018, 2019
Harry S. Kieval Mathematics - 2017, 2018, 2019
Undergraduate Research Award - 2018, 2020
Special thanks to Jon Barron for providing the HTML code that enhanced this page.