available for summer 2026

Zaiyi (Derek) Kuang

Data science & ML engineer at UC San Diego. Double major in Data Science + Math–CS. Building things with Python, TypeScript, and AWS.

01

About

I'm a double major in Data Science and Math–CS at UC San Diego, focused on building end-to-end ML systems that go from raw data to deployed products. My coursework sits at the intersection of statistics, algorithms, and applied machine learning.

Outside of coursework, I work as an Instructional Assistant for Intro to Data Science and Python Programming, hosting weekly office hours for 60+ students. I find that explaining concepts is one of the fastest ways to sharpen them.

University: UC San Diego
GPA: 3.88 / 4.00
Graduation: June 2027
Seeking: Summer 2026 internship
Languages: Python, TypeScript, SQL
Languages
Python, SQL, R, Java, JavaScript, TypeScript
ML & Data
Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch, Plotly, D3.js
Certifications
AWS Cloud Practitioner, GitHub Foundations
02

Experience

Next Level Data · July – Sept 2025

Data Science Intern · Remote

// end-to-end predictive modeling pipeline · deployed & monitored on AWS

UCSD Capital Program Management · Nov 2024 – May 2025

Data Science Intern · La Jolla, CA

// LLM chat interface (OpenAI API) · 10K+ time-series records processed

Agricultural Bank of China · June – Aug 2024

Data Analyst Intern · Chengdu, China

// SQL analysis on 1M+ transactions · Tableau dashboards across 50K+ profiles

03

Projects

NLP · Finance

WallStreetBets Sentiment & Market Analysis

Built a pipeline combining Reddit sentiment with stock market history into a 16.6K-row ticker-day dataset. Logistic regression model predicts extreme next-day price moves.


Python, VADER, TF-IDF

Data Viz · Healthcare

Vitals Unveiled: Surgical Risk Dashboard

Interactive cohort dashboards built on 557K+ medical records across 6,388 surgeries. Awarded best project in a class of 180 students.


TypeScript, D3.js, CSS

ML · Infrastructure

Major Power Outage Analysis

Analyzed U.S. power outage records to identify causes and predict severity. A Random Forest classifier improved test accuracy over a 60% baseline.


Python, Pandas, Scikit-learn