I have broad research interests and experience in bioinformatics, cancer genomics and data analytics. These research areas mainly involve developing and applying bioinformatics and computational approaches to analyse large-scale cancer datasets to uncover novel diagnostic and prognostic biomarkers. I also lead the Cancer Research UK Barts Centre Bioinformatics Core Facility.
Transcriptomic analysis of cutaneous squamous cell carcinoma reveals a multigene prognostic signature associated with metastasis. J Am Acad Dermatol. 2023 S0190-9622(23)02504-5. PMID: 37586461
ACSNI: An unsupervised machine-learning tool for prediction of tissue-specific pathway components using gene expression profiles. Patterns (2021) 2(6), 100270. PMID: 34179848
The genomic landscape of actinic keratoses. J Invest Dermatol (2021) 141(7):1664-1674.e7. PMID: 33482222
Applications and analysis of targeted genomic sequencing in cancer studies. Computational and Structural Biotechnology Journal (2019) 17, 1348-1359. PMID: 31762958
The genomic landscape of cutaneous SCC reveals drivers and a novel azathioprine associated mutational signature. Nature Commun (2018) 9(2) 3667. PMID: 30202019
My main research interests lie in developing and applying bioinformatics and computational approaches to analyse large-scale cancer datasets to uncover novel diagnostic and prognostic features. In particular, I am interested in applying machine learning / AI algorithms to integrate multi-omics and clinicopathological data to derive diagnostic and prognostic tools for patient stratification.
I also lead the CRUK Barts Centre Bioinformatics Core Service.
Biomedical science, especially cancer research, is increasingly data driven, as new bioanalytical techniques deliver ever more data about DNA, RNA, proteins, metabolites and the interactions between them, at both the whole-tissue and single-cell levels. Given the growing volume of omics datasets (big data), the challenges lie in analysing large-scale datasets, interpreting the results accurately and thoroughly, and identifying “driver” events and predictive biomarkers in tumour development and progression.
Our research interests include the following:
Cancer genomics and evolution
Focusing on large-scale multi-omics datasets, we develop analytic pipelines and identify novel driver events, molecular subtypes, and diagnostic / prognostic signatures in cancer development and progression based on machine learning and data integration techniques. Using bulk tissue RNA-seq data, we also investigate the immune and stromal landscape and derive signatures for patient subgrouping and stratification. Currently we are working on multi-omics datasets of cutaneous and oesophageal squamous cell carcinoma. We also investigate the clonal evolutionary patterns of these tumours to understand how clonal / subclonal architecture affects the clinical features of patients.
Noncoding sequence variants and RNA genes in cancer
Using publicly available whole-genome, ChIP-seq and RNA-seq data, we investigate functionally important noncoding mutations and dysregulated long noncoding RNAs in pancreatic and ovarian cancer. Using big-data and bioinformatic approaches, we first identify the top novel candidates, which are then taken to the lab for further in vitro validation using high-throughput screening (e.g., STARR-seq) and CRISPR/Cas9.
Single cell analytics
We have constructed a cross-package toolkit, named IBRAP (https://github.com/connorhknight/IBRAP), that provides a comprehensive workflow from data pre-processing to automatic annotation of cell types, and enables users to interchange analytical components and individual methods. Benchmarking metrics are provided to distinguish pipeline performance, allowing a pipeline to be tailored to each dataset in single-cell studies. Currently, we are using IBRAP to construct normal reference maps from publicly available single-cell data.
Computational histopathology and imaging analysis using AI
Despite recent advances in understanding the molecular pathogenesis of many cancers, disease assessment is still based on clinical and histopathological staging, with few objective prognostic biomarkers. A rapid, simple and cost-effective tool that augments clinicopathologic staging and allows clinicians to stratify patients according to their risk of progression is a priority for translational research.
Currently we are developing deep learning-based resources to automatically extract core histological features from digitised whole slide images and map these to molecular and clinical features in cutaneous and oesophageal squamous cell carcinoma. We aim to create a risk stratification tool that can be incorporated into the routine pathology workflow, with the goal of improving patient outcomes.
Allele-Specific Expression of Leukemia Genes Is Associated with Pathogenicity in Poor Risk AML Zheng J, Bewicke-Copley F, Vermeulen C et al. Blood (2023) 142(10) 1389
The XPO1-FOXC1-HOX Functional Axis Opens New Therapeutic Avenues to Treat DEK-NUP214 AML Patients Kaya F, Bewicke-Copley F, Izquierdo PC et al. Blood (2023) 142(10) 4302
LB1663 A multi-gene prognostic signature associated with cutaneous squamous cell carcinoma metastasis Wang J, Harwood C, Bailey E et al. Journal of Investigative Dermatology (2023) 143(10) b8
Driver gene combinations dictate cutaneous squamous cell carcinoma disease continuum progression Bailey P, Ridgway RA, Cammareri P et al. Nature Communications 14(10) 5211
Personalized neoantigen viro-immunotherapy platform for triple-negative breast cancer Baleeiro RB, Liu P, Dunmall LSC et al. Journal for ImmunoTherapy of Cancer (2023) 11(10) e007336
Transcriptomic analysis of cutaneous squamous cell carcinoma reveals a multigene prognostic signature associated with metastasis Wang J, Harwood CA, Bailey E et al. Journal of the American Academy of Dermatology (2023) 89(10) 1159-1166
O02 Neoantigens from actinic keratosis are predicted to be more immunogenic than those from cutaneous squamous cell carcinoma – a strategy for immune escape? Thomson J, Harwood C, Strid J et al. British Journal of Dermatology (2023) 189(10) e4-e5
010 Enhanced outcome prediction in cutaneous squamous cell carcinoma using deep-learning and computational histopathology (cSCCnet) Peleva E, Chen Y, Rizvi H et al. British Journal of Dermatology (2023) 188(10)
274 Homologous recombination deficiency scores in AK and cSCC are associated with tumor-immune phenotype Thomson J, Healy E, Strid J et al. Journal of Investigative Dermatology (2023) 143(10) s47
O20 TARGETING THE DEFECTIVE COA PATHWAY TO IMPROVE ERYTHROPOIESIS IN SF3B1-MUTANT MDS-RS PATIENTS Philippe C, Mian S, Maniati E et al. Leukemia Research (2023) 128(10) 107133
For additional publications, please click here
Postdoctoral Bioinformaticians
PhD Students
Academic Clinical Fellow
Former lab members
I received my first degree in biological engineering at Shanghai Jiao Tong University. This was followed by an MSc in quantitative genetics and genome analysis, and a PhD in evolutionary genetics studying the comparative genomics and evolution of noncoding sequences in Drosophila, both at the University of Edinburgh. I then joined Rothamsted Research as a postdoc working on plant genomics and genetic linkage mapping as part of the international Brassica rapa genome project. I moved to Barts Cancer Institute, Queen Mary University of London, as a bioinformatician in 2010 to work on cancer genomics and biomarker discovery as part of the bioinformatics core. I became a Lecturer in Bioinformatics and group leader in 2016, and have led the CRUK Barts Centre Bioinformatics Core Facility since 2018. I was promoted to Senior Lecturer in 2019.
I am Programme Director for the Cancer Genomics & Data Sciences MSc Programme at BCI, Queen Mary University of London.
Find out more about the programme.