Project Overview
Collaborated with Cal Club Field Hockey to design and implement a comprehensive database solution that modernized data management across the club. The system spans player information, event management, merchandise tracking, and predictive analytics for player evaluation.
Database Relation and Schema
The schema follows strict normalization (BCNF) to ensure integrity, minimize redundancy, and optimize queries. Relations were first specified in detail, then validated through EER modeling before implementation in MySQL Workbench.
Synthetic Data Generation
To protect sensitive information while demonstrating functionality, I generated synthetic data based on real patterns. Logic constraints maintained realism (e.g., goals never exceeding shots), enabling realistic demos without exposing private data.
Machine Learning Pipeline
- Performance metrics: goals, assists, minutes played, ratings
- Derived features: per‑90 stats, consistency scores
- Modeling: Random Forest (100 estimators) with 5‑fold cross‑validation
- Output: Player ranking and recognition support
Results & Impact
Delivered a functional, production‑ready database replacing spreadsheets and paper workflows. The system enables efficient entry, retrieval, and analytics across the club, while the ML model provides an objective perspective on player evaluation and awards.
Technologies Used
- MySQL Workbench
- Python
- Pandas & NumPy
- Scikit‑Learn
- Git & GitHub
Key Achievements
- 62 normalized tables
- BCNF normalization
- 92.5% model accuracy (internal validation)
- Production‑ready deployment