About

Welcome! I’m Xiangyu (you can also call me Sean), a Ph.D. candidate in Data Science at New Jersey Institute of Technology, advised by Professor Yi Chen. Over the past few years, I have participated in multiple projects that develop end-to-end data mining and machine learning systems for healthcare applications. From these projects, I have accumulated extensive experience in various data science tasks, including data preprocessing, feature engineering, modeling, information extraction, and natural language processing. I am highly proficient in Python and R and have considerable SQL and SAS experience. In the summer of 2022, I interned as a Data Scientist at Amazon’s Supply Chain Optimization Technologies (SCOT). In the summer of 2021, I worked as an AI Research Associate at the Center for Health Information & Decision Systems (CHIDS), University of Maryland.

I received my Master of Science in STEM-designated Marketing Intelligence from Fordham University Gabelli School of Business and my Bachelor of Science in Business Administration from New York Institute of Technology.

I’m currently seeking a full-time Data Scientist, Machine Learning Engineer, or Applied Scientist position.

Experience

  • Data Scientist Intern, Amazon (May 2022 - September 2022)
  • AI Research Associate, CHIDS, University of Maryland (June 2021 - August 2021)
  • Data Category Manager, Standard Media Index (October 2015 - August 2018)

Research Experience

  • Research Assistant, New Jersey Institute of Technology
    • Led a project to increase the likelihood that patients answer phone calls from their healthcare provider, using machine learning and deep learning techniques that include a graph-based call time prediction model and a hybrid patient reachability prediction model
    • Led the design of a novel hybrid NLP approach and the development of a prototype system for extracting cancer biomarkers from free-text pathology reports
    • Developed a hierarchical time-aware neural network for personalized risk prediction of adverse drug events using claims data
    • Contributed to the implementation of an advanced deep learning model for extracting patient conditions from clinical notes

Teaching Experience

  • Instructor, New Jersey Institute of Technology
    • Instructor of MIS 363 Project Management for Managers (Summer 2020)
    • Instructor of MIS 445 Decision Support Tool and Technology for Managers (Spring 2020)
  • Teaching Assistant, New Jersey Institute of Technology
    • Teaching Assistant of MIS 385 Database Systems for Managers (Fall 2019)
    • Lab Assistant at NJIT Business Analytics Lab (Spring 2019)

Publications

  1. Weiting Gao, Xiangyu Gao, Wenjin Chen, David J Foran, Yi Chen, BioReX: Biomarker Information Extraction Inspired by Aspect-Based Sentiment Analysis, 2024 Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2024 [code]
  2. Jinhe Shi, Xiangyu Gao, William C Kinsman, Chenyu Ha, Guodong Gordon Gao, Yi Chen, DI++: A Deep Learning System for Patient Condition Identification in Clinical Notes, Artificial Intelligence in Medicine, 2022
  3. Jinhe Shi, Xiangyu Gao, Chenyu Ha, Yage Wang, Guodong Gao, Yi Chen, Patient ADE Risk Prediction through Hierarchical Time-Aware Neural Network Using Claim Codes, 2020 IEEE International Conference on Big Data (Big Data), 2020
  4. Xiangyu Gao, Jinhe Shi, Wenjin Chen, Nancy Sazo, Huiqi Chu, Evita Sadimin, David J Foran, Yi Chen, CBEx: A Hybrid Approach for Cancer Biomarker Extraction, 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2020

Working Papers

  1. Xiangyu Gao, Jinhe Shi, Junjie Luo, Guodong (Gordon) Gao, and Yi Chen, A Hybrid Deep Learning Model for Patient Reachability Prediction
  2. Xiangyu Gao, Jinhe Shi, Guodong (Gordon) Gao, and Yi Chen, Personalized Phone Call Time Prediction Using Graph Embedding