Currently
Senior Data Scientist & Product Owner in Healthcare
Education
2016 - 2018
George Washington University
- M.S. in Data Science
2011 - 2015
Sichuan University
- B.S. in Mathematics, concentration in Statistics
Publications
- Wang S, Han J, Jung SY, et al. Development and implementation of patient-level prediction models of end-stage renal disease for type 2 diabetes patients using fast healthcare interoperability resources. Sci Rep. 2022;12(1):11232. Published 2022 Jul 4. doi:10.1038/s41598-022-15036-6
- Koker TE, Chintapalli SS, Wang S, et al. On Identification and Retrieval of Near-Duplicate Biological Images: a New Dataset and Protocol. Published online January 10, 2021. doi:10.1109/icpr48806.2021.9412849
Work
2019 - present
Senior Data Scientist & Product Owner, Enolink
2019
Computer Vision Research Associate, Harvard Medical School
2017
NLP Research Assistant, George Washington University
Projects
Fashion Product Discovery App [CV/NLP]
- Equipped the Glancer app with machine learning solutions for indexing products, generating titles for mobile display, and exploring trending new products
- Developed product tagging pipelines that use multimodal learning to analyze product images and description text
- Improved tagging performance by combining deep learning models with a keyword-matching agent
Movie Recommendation System [MySQL/Spark/Flask]
Link to Demo
- Built a Flask demo backed by MySQL and integrated recommendation models to provide live movie recommendations
- Created ETL pipelines for online analytical processing (OLAP) with Spark SQL to analyze user behavior and trending patterns
- Trained collaborative filtering (CF) models for personalized recommendation using ALS matrix factorization from Spark MLlib, and provided a user-based CF model to handle item cold start
Product Demand Analysis [CV/NLP/ML]
- Predicted product demand for an online ads platform by analyzing unstructured data (titles and descriptions in Russian, images) and structured data (price, category, location, and time) on users and products, using VGG16, Random Forest, LightGBM, and logistic regression
Master Capstone, German Traffic Sign Classification [Caffe/TensorFlow]
Blog for Caffe
Blog for TensorFlow
- Classified 39K traffic sign images in 43 categories using a convolutional neural network (CNN) and benchmarked results across two deep learning frameworks (TensorFlow and Caffe)
- Analyzed CNN models by building customized visualizations in TensorBoard to show how kernels change during training
Bachelor Thesis, Treatment Effect Prediction [SAS/SPSS]
- Applied a logistic regression model to predict NIPPV treatment effect for patients with respiratory failure using SPSS and SAS
Tableau Portfolio
Honors and Awards
05/2015
Outstanding Undergraduate Thesis (top 5%)
01/2015
Excellent Volunteer of SK Sunny Undergraduates Volunteer Service
11/2014
Second Class Scholarship
05/2013
Active Volunteer in Love Passing Voluntary Service of SCU
04/2013
Second Prize in Undergraduates Tennis Championship of SCU
Conference
Cambridge, 2019, 2020
Women in Data Science (WiDS) Cambridge
Ann Arbor, 08/08/2019-08/10/2019
Machine Learning for Healthcare
DC, 10/09/2017
DevFest DC 2017
DC, 05/15/2017-05/17/2017
Know Identity Conference
DC, 05/05/2017-05/06/2017
DevFest DC 2017
DC, 12/03/2016
GW DATA Data Driven Insights Conference: Extract, Transform, Learn
DC, 11/29/2016
Exploring some of the latest and greatest tools in Data Science
DC, 09/28/2016
Data Transparency 2016 with Open Data Innovation Summit
DC, 06/30/2016
ATARC Federal Big Data Summit
DC, 03/04/2016-03/05/2016
Open Data Day DC 2016
Community Involvement
Boston, 2021-2024
Athlete, CYPN STORM Dragon Boat Club
- 2023 Rhode Island Race: Mixed Division 3rd Place
- 2023 Boston Dragon Boat Festival: Club Division 2nd Place; A Major Division 3rd Place
- 2022 Rhode Island Race (Captain): Mixed Division 1st Place
- 2022 Mercer GWN Race: Sport Women Division 3rd Place
- 2022 Riverfront Hartford Race: A Division 2nd Place
- 2022 IDBF 13th Club Crew World Championship: Premier Mixed Division Participant
- 2022 Boston Dragon Boat Festival: Club Division 1st Place; Women Division 1st Place; A Major Division 3rd Place
- 2021 USDBF Club Crew National Championships: Premier Mixed Division Participant
- 2021 Mercer GWN Race: Sport Mixed Division 2nd Place
- 2021 ERDBA Regional Championship: Premier Mixed Division 2nd Place
New Orleans, LA, 03/2017
Habitat for Humanity
Arlington, VA, 10/2016
Marine Corps Marathon
Washington Monument Grounds, DC, 06/2016
Moving Day DC (National Parkinson Fundraising)
DC, 04/2016
DC Central Kitchen
Chengdu, China, 03/2014-06/2014
SK Sunny Undergraduates Volunteer Service
Chengdu, China, 05/2013
Love Passing Voluntary Service
Chengdu, China, 04/2013
Ya’an Earthquake Volunteer
Chengdu, China, 04/2013-05/2013
The Love-Package Volunteer Service