About
Welcome! I’m Xiangyu (you can also call me Sean), a Ph.D. candidate in Data Science at New Jersey Institute of Technology, advised by Professor Yi Chen. Over the past few years, I have participated in multiple projects that develop end-to-end data mining and machine learning systems for healthcare applications. From these projects, I have accumulated extensive experience in various data science tasks, including data preprocessing, feature engineering, modeling, information extraction, and natural language processing. I am highly proficient in Python and R and have considerable SQL and SAS experience. In the summer of 2022, I interned as a Data Scientist at Amazon’s Supply Chain Optimization Technologies (SCOT). In the summer of 2021, I worked as an AI Research Associate at the Center for Health Information & Decision Systems (CHIDS), University of Maryland.
I received my Master of Science in STEM-designated Marketing Intelligence from Fordham University Gabelli School of Business and my Bachelor of Science in Business Administration from New York Institute of Technology.
I’m currently seeking a full-time DS/MLE/Applied Scientist position.
Experience
- Data Scientist Intern, Amazon (May 2022 - September 2022)
- AI Research Associate, CHIDS, University of Maryland (June 2021 - August 2021)
- Data Category Manager, Standard Media Index (October 2015 - August 2018)
Research Experience
- Research Assistant, New Jersey Institute of Technology
  - Led a project that applies machine learning and deep learning to increase the likelihood that patients answer phone calls from their healthcare provider, including a graph-based call time prediction model and a hybrid patient reachability prediction model
  - Led the design of a novel hybrid NLP approach and the development of a prototype system for extracting cancer biomarkers from free-text pathology reports
  - Developed a hierarchical time-aware neural network for personalized risk prediction of adverse drug events using claims data
  - Contributed to the implementation of an advanced deep learning model for extracting patient conditions from clinical notes
Teaching Experience
- Instructor, New Jersey Institute of Technology
  - Instructor of MIS 363 Project Management for Managers (Summer 2020)
  - Instructor of MIS 445 Decision Support Tool and Technology for Managers (Spring 2020)
- Teaching Assistant, New Jersey Institute of Technology
  - Teaching Assistant of MIS 385 Database Systems for Managers (Fall 2019)
  - Lab Assistant at NJIT Business Analytics Lab (Spring 2019)
Publications
- Weiting Gao, Xiangyu Gao, Wenjin Chen, David J Foran, Yi Chen, BioReX: Biomarker Information Extraction Inspired by Aspect-Based Sentiment Analysis, 2024 Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2024 [code]
- Jinhe Shi, Xiangyu Gao, William C Kinsman, Chenyu Ha, Guodong Gordon Gao, Yi Chen, DI++: A Deep Learning System for Patient Condition Identification in Clinical Notes, Artificial Intelligence in Medicine, 2022
- Jinhe Shi, Xiangyu Gao, Chenyu Ha, Yage Wang, Guodong Gao, Yi Chen, Patient ADE Risk Prediction through Hierarchical Time-Aware Neural Network Using Claim Codes, 2020 IEEE International Conference on Big Data (Big Data), 2020
- Xiangyu Gao, Jinhe Shi, Wenjin Chen, Nancy Sazo, Huiqi Chu, Evita Sadimin, David J Foran, Yi Chen, CBEx: A Hybrid Approach for Cancer Biomarker Extraction, 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2020
Working Papers
- Xiangyu Gao, Jinhe Shi, Junjie Luo, Guodong (Gordon) Gao, and Yi Chen, A Hybrid Deep Learning Model for Patient Reachability Prediction
- Xiangyu Gao, Jinhe Shi, Guodong (Gordon) Gao, and Yi Chen, Personalized Phone Call Time Prediction Using Graph Embedding