Career Profile

  • 4 years of experience executing data-driven solutions to increase the efficiency, accuracy, and utility of internal data processing.

  • 3+ years of experience strategizing, interpreting, and analyzing assorted data attributes to streamline the insight conversion process, using data languages such as R and Python.

  • Experienced in creating regression models using predictive data modeling and in analyzing data mining algorithms to deliver insights and implement action-oriented solutions to complex business problems.

  • Exceptional ability to identify data attributes based on business needs while extracting and manipulating large sets of raw data from multiple platforms to derive meaningful insights.

  • Results-oriented team player who provides data-analytic support in achieving goals that align with the overall organizational mission.

  • Pursues continuous self-improvement and professional development by proactively learning new data analysis techniques.

Education

Master of Science in Biostatistics

2023 - 2024
University of Toronto - St. George Campus
  • Laboratory in Statistical Design and Analysis
  • Categorical Data Analysis
  • Mathematical Foundations of Biostatistics
  • Introduction to Statistical Methods for Clinical Trials
  • Applied Machine Learning for Health Data
  • Introduction to Public Health

BSc Applied Mathematics Specialist & Statistics Major

2019 - 2023
University of Toronto - St. George Campus
  • Honourable Mention, Best Statistical Analysis, at DataFest@UofT 2022
  • cGPA: 3.63/4.0 (High Distinction)
  • Dean’s List in 2022
  • Dean’s List in 2021
  • Dean’s List in 2020
    • Methods for Multivariate Data (GPA 4.0/4.0)
    • Time Series Analysis (GPA 4.0/4.0)
    • Data Analysis II (GPA 4.0/4.0)
    • Probability I (GPA 4.0/4.0)
    • Methods of Data Analysis I (GPA 4.0/4.0)
    • Probability and Statistics I (GPA 3.7/4.0)
    • STAT Reasoning (GPA 3.7/4.0)
    • Surveys, Sampling and Observational Data (GPA 3.7/4.0)
    • Introduction to Real Analysis (GPA 4.0/4.0)
    • Linear Algebra II (GPA 4.0/4.0)
    • Calculus Sci II (GPA 4.0/4.0)
    • Abstract Mathematics (GPA 4.0/4.0)
    • Groups and Symmetries (GPA 4.0/4.0)
    • Ordinary Differential Equations (GPA 4.0/4.0)
    • Complex Variables (GPA 3.7/4.0)

Experience

Research Methods Specialist

Feb 2025 - Present
The Centre for Addiction and Mental Health (CAMH)
  • Statistical Modeling: Performed advanced analyses in R, Python, and SPSS, applying linear mixed models (LMM), generalized estimating equations (GEE), and survival analysis to assess treatment effects, behavioral outcomes, and biomarker trajectories in longitudinal clinical datasets.
  • Data Preparation, Missing Data & Exploratory Analysis: Implemented multiple imputation to address missing data, conducted exploratory data analysis (EDA), and carried out rigorous quality checks to ensure validity and robustness of results in both clinical trials and observational studies; applied regression tree approaches for exploratory subgroup identification.
  • REDCap Database Design & QC: Designed and maintained REDCap databases, including form development, validation rules, and workflow management, with ongoing quality control (QC) to safeguard data accuracy, integrity, and completeness.
  • Automation & Dashboards: Leveraged the REDCap API and large language models (LLMs) with R Shiny dashboards to automate data updates and monitoring from REDCap, enabling real-time tracking of study progress, completeness, and quality indicators.
  • SOP & Collaboration: Authored Standard Operating Procedures (SOPs) to establish standardized, reproducible analytic workflows in compliance with institutional standards. Collaborated with clinicians, statisticians, and epidemiologists to translate complex results into clinically interpretable insights supporting peer-reviewed publications, grant applications, and scientific presentations.

Data Research Analyst

Jun 2024 - Feb 2025
Lunenfeld-Tanenbaum Research Institute, Sinai Health
  • Conducted comparative effectiveness analyses of Trimodality Therapy (TMT) and Radical Cystectomy (RadC) for muscle-invasive bladder cancer using SAS and R, applying doubly robust multivariable models to adjust for confounding.
  • Applied Fine and Gray regression with doubly robust estimation to account for competing risks, yielding precise estimates of treatment effects on overall and cancer-specific mortality.
  • Implemented inverse probability treatment weighting (IPTW) and propensity score matching (PSM) to balance baseline covariates and reduce selection bias in treatment comparisons.
  • Designed and executed trial emulation approaches to address immortal time bias, ensuring valid and unbiased evaluation of treatment efficacy.

Practicum Student

Oct 2023 - Jun 2024
Lunenfeld-Tanenbaum Research Institute, Sinai Health
  • Performed advanced longitudinal analyses of serological profiles using R and Python, applying linear mixed-effects models to characterize antibody response dynamics to vaccination.
  • Applied quantile regression to assess heterogeneous vaccine effects across antibody distributions, generating insights that informed public health vaccination strategies.
  • Analyzed longitudinal antibody trajectories across different vaccination groups using mixed-effects modeling, revealing dose-dependent effects on antibody levels and supporting personalized vaccination strategies.

Consultant Intern

May 2021 - Aug 2021
Shanghai Pera Global Corporation Ltd
  • Explored and collected information on 17,444 enterprises operating in 17 industries, such as automobiles and fuel cells, through web crawling, cold calls, and other means. Screened companies using R and Excel (VLOOKUP, pivot tables, VBA) according to the marketing department's requirements.
  • Analyzed 100+ semiconductor industry reports, produced a 74-page overview, and further investigated the most profitable sub-field in the industry and its correlation with simulation.
  • Visualized project results in Excel and Tableau and identified 8,848 priority target customer companies for the marketing department.
  • Conducted research on the market size, value chain, and competitive environment of 30+ industries of interest and presented reports.

Marketing Analyst Intern

Apr 2020 - Feb 2021
SavvyPro Education
  • Identified and analyzed trends in 1,000+ collected qualitative data points to assess the potential sales of services.
  • Provided statistical support for strategies that doubled the number of reads and retweets of social media promotions.

IELTS Tutor

Jun 2018 - Aug 2018 & Jun 2019 - Sep 2019
New Oriental Education & Technology Group Inc
  • Collected students’ information to make teaching plans for the IELTS Program.
  • Delivered prepared lessons, exercises, and examinations to students, covering listening, speaking, reading, and writing.

Vice President

Aug 2020 - Jun 2021
Chinese Children Hope Offered (ECCHO)
  • Developed and executed marketing strategies based on evolving market needs, backed by statistical information sourced through analytical initiatives.
  • Coordinated with internal and external stakeholders on an international scale to effectively carry out marketing initiatives with demonstrated success in achieving set KPIs.
  • Provided analytical findings by identifying key data attributes and converting data to insights for growth and scaling opportunities.

Projects

Diabetes and Depression Prediction, A Comparison of Various Machine Learning Techniques (Group Work) - Employed logistic regression, neural network, and XGBoost models to attain precise predictions, advancing the landscape of proactive healthcare interventions.
Machine Learning for Human Movement Classification Using Wearable Sensors (Group Work) - Developed machine learning algorithms for classifying human movements based on high-frequency sensor data from wearable devices, employing recurrent neural network (RNN) and long short-term memory (LSTM) models.
Neural Network vs. Traditional Models in ICU Death Prediction, A Comprehensive Analysis (Group Work) - Directed a research initiative evaluating the predictive effectiveness of a neural network model versus traditional methods (logistic regression and Bernoulli Naïve Bayes classifier) in forecasting ICU mortality. The project focused on meticulous feature curation and tackled class imbalance, offering a nuanced insight into the performance of each model.
Integrated Analysis of Cardiovascular Health Metrics, Predictive Modeling and Regression Insights (Group Work) - Led a comprehensive research project on cardiovascular health, addressing three key questions related to mortality prediction, stroke risk factors, and average glucose levels using logistic and linear regression models.
Exploring Obesity Risk Factors, KNN Classification and K-Means Clustering on a Large Public Health Dataset (Group Work) - Spearheaded a comprehensive obesity research project utilizing KNN classification and K-Means clustering on a Canadian public health dataset, revealing nuanced insights into the predictive factors and clustering patterns related to BMI, with a focus on gender-stratified analyses.
Attorneys and Clients Data Analysis for American Bar Association with NLP (Group Work) - Constructed sentiment analysis and demographic analysis to provide actionable insights and recommendations to the ABA, aiming to enhance communication between clients and attorneys, and optimize human resource allocation.
Exploration of A Relationship Between Recall and White Matter (Individual) - Constructed robust, multiple linear regression models and linear mixed models to explore the relationship between structural and functional connectivity in the cingulum, fornix and episodic memory using fMRI and diffusion-weighted imaging in a sample of 40 healthy adults.
Investigated Typing Speed of Youth on Mobile vs. Keyboards (Individual) - Compared typing speed on mobile phones vs. physical keyboards using linear mixed models with 39 students in a 1-minute typing test, and discovered positive correlations between mobile typing speed and English level, age, and physical typing speed.
Tweet Emotion Recognition with TensorFlow (Guided Project) - Created a recurrent neural network, trained it on a tweet emotion dataset to recognize emotions in tweets, and evaluated the model with a confusion matrix.
Store Sales Forecast (Individual) - Constructed ARIMA and seasonal ARIMA models to forecast sales of various products over the following 15 days in different locations, based on a dataset with 1,064,613 observations.
Atmospheric Carbon Dioxide Concentration Forecast (Individual) - Constructed seasonal ARIMA models to forecast trends in atmospheric carbon dioxide concentrations over 10 months.
Data Analysis for Marketing Campaign (Group Work) - Constructed a generalized linear mixed model to verify the authenticity of feedback and quantify the severity of the problem.
Data Analysis of Pursuing Higher Education (Individual) - Constructed a model of the influence of different neighborhoods on students' thoughts about pursuing higher education.
Smartphone Addiction and Drug Addiction (Individual) - Explored whether smartphone addiction shares characteristics with drug addiction.
2025 Canada Election Forecast (Group Work) - Constructed a logistic regression model with post-stratification to forecast polling and track the popularity of election candidates and members of government.
Explore The Impact of Smoking on The Probability of Stroke (Individual) - Investigated which of the factors/variables in the provided dataset best explains the variation observed in probability of stroke.
Data Analysis of Toronto Health (Individual) - Explored the linear relationship between premature mortality and other information such as teenage pregnancy in Toronto communities.
Dognition Market Data Analysis with Tableau (Individual) - Investigated and visualized market data through Tableau and converted raw data to meaningful insights to make professional recommendations.
2022 Summer Make Insurance Better Bootcamp (Group Work) - Explored measures with design thinking to make insurance more attractive and lucrative by working in a group.

Skills & Proficiency

R

Python

SQL

SAS

Tableau

VBA