Joshua D. Piña
Data Scientist & Program Manager
| Type of site | Virtual portfolio |
|---|---|
| Available in | English, Spanish, French |
| Creator | Joshua Piña |
| URL | joshuapina |
| Initial Launch | 15 August 2025 |
| Wiki Launch | 8 February 2026 |
| Written in | HTML, CSS, & JS |
Joshua Darius Piña is an American Data Scientist, Program Manager, and U.S. Army Veteran. A recent Georgia State University graduate, he is credited with leading his senior Capstone projects in Data Science and Machine Learning. Joshua holds a current Secret security clearance and is a strong advocate for ethical data practices and the implementation of robust, end-to-end ML pipelines to ensure application efficacy.
Early Life
Joshua D. Piña was born in Coral Springs, Florida in the late 20th century. He was raised with an emphasis on disciplined and selfless service, open-ended thinking, and evidence-based problem-solving.
As a gifted student, he was awarded the prestigious Accelerated Reader award in 7th grade.
While attending Coconut Creek High School, he joined the National Honor Society, the Animal Rights Club, and HOSA (Health Occupations Students of America), and was a member of the baseball team.
Before his academic career in data analytics and computational sciences, he was a pre-med student at Broward College, a background that led him to enlist in the Army as a 68W (Combat Medic/Healthcare Specialist).
Military Career
Piña served as a Healthcare Specialist (Combat Medic) in the United States Army from January 2017 to April 2022.
He successfully completed Basic Combat Training (BCT) at Fort Leonard Wood, Missouri and Initial Entry Training (IET) at Fort Sam Houston, Texas.
Stationed primarily at Fort Benning, Georgia, Piña served as a combat medic with 4th Ranger Training Battalion (4th RTB), 316th Cavalry Brigade (316th Cav), and the 197th Infantry Brigade (197th Inf).
He also spent three months working as the lead medic for the Miami Military Entrance Processing Station (MEPS). During his service, he served as both a team leader and a squad leader. In March 2021, after completing the U.S. Army Basic Leader Course (BLC), he was promoted to Sergeant.
Piña was responsible for managing medical records for over 5,000 personnel and oversaw more than $100,000 in critical medical supplies and logistics while working directly with command teams and Nurse Practitioners in the garrison setting.
His time as a medical Non-commissioned Officer emphasized command-level reporting and operational coordination, skills he later applied to program management in the public sector.
This period of his life provided the groundwork for his eventual transition into Data Science, where he advocates for ethical AI development and rigorous documentation standards.
Education
circa early 21st century
Piña holds a Bachelor of Science in Data Science from Georgia State University (GSU), where he graduated in December 2025.
While at GSU, he excelled at courses based on Applied Statistics, Data Mining, Big Data Programming, Machine Learning, and Database Systems.
As a proud polymath, he chose electives geared toward broadening his knowledge base, such as Web Development, Digital Image Processing, and System-Level Programming.
A multi-term recipient of the Dean's List award (Spring 2024–Fall 2025)[1], he was also a member of ColorStack and the National Student Data Corps (NSDC).
During his senior year, he founded the GSU-DS Organization GitHub page to centralize and archive capstone projects for the department. This organization was handed off to Dr. Kuzmin upon graduation, to ensure that future students have a blueprint for projects and somewhere to showcase their capstones.
In late 2025, he was a participant in the Syracuse University O2O Cohort and a Technical Mentor for CodePath, assisting students in mastering data structures.
Projects
Database Creation & Ingestion Pipeline
As of late February 2026, Piña has engineered a comprehensive data pipeline that automates the ingestion of webinar and meeting data from multiple sources.
The system utilizes SharePoint to capture GoToWebinar and Microsoft Teams attendee reports, two completely different data sources with incompatible formats, which are then standardized and cleaned using Python.
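The standardization step can be illustrated with a minimal sketch. The column headers and the `standardize` helper below are hypothetical, since the actual export formats are not described here; the sketch only shows the general idea of mapping two incompatible exports onto one shared schema.

```python
import csv
import io

# Hypothetical column mappings: the real export headers are assumptions.
COLUMN_MAPS = {
    "gotowebinar": {
        "Attendee Email": "email",
        "Full Name": "name",
        "Time in Session (min)": "minutes",
    },
    "teams": {
        "Email": "email",
        "Participant": "name",
        "Duration (min)": "minutes",
    },
}


def standardize(raw_csv: str, source: str) -> list:
    """Map one source's CSV export onto a shared schema and clean the values."""
    mapping = COLUMN_MAPS[source]
    rows = []
    for record in csv.DictReader(io.StringIO(raw_csv)):
        # Rename each source column to its shared name and trim whitespace.
        row = {target: record[original].strip() for original, target in mapping.items()}
        row["email"] = row["email"].lower()
        row["minutes"] = int(row["minutes"])
        row["source"] = source
        rows.append(row)
    return rows


# Two tiny made-up exports with different headers and column order.
gtw = "Full Name,Attendee Email,Time in Session (min)\nAda Lovelace, ADA@example.com ,42\n"
teams = "Participant,Email,Duration (min)\nAlan Turing,alan@example.com,37\n"
attendees = standardize(gtw, "gotowebinar") + standardize(teams, "teams")
```

Keeping the per-source differences in a mapping table means a new report format would only need a new mapping entry, not new cleaning logic.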
The solution involves extensive terminal output with detailed logging and error handling through the Rich library. External CSVs are utilized to fill in missing data to ensure attendee reports, training schedules, and Microsoft polls and forms can be accurately matched.
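Filling gaps from an external lookup can be sketched as follows. The field names, the `fill_missing` helper, and the sample records are all hypothetical; in practice the lookup would presumably be loaded from one of the external CSVs mentioned above.

```python
def fill_missing(rows, lookup, key="email"):
    """Return new rows where empty fields are filled from a lookup keyed by `key`."""
    filled = []
    for row in rows:
        extra = lookup.get(row.get(key), {})
        merged = dict(row)  # copy, so the input rows are left untouched
        for field, value in extra.items():
            if not merged.get(field):  # only fill blanks, never overwrite
                merged[field] = value
        filled.append(merged)
    return filled


# Hypothetical attendee rows with gaps, and a lookup table to fill them.
rows = [
    {"email": "ada@example.com", "name": "Ada Lovelace", "department": ""},
    {"email": "alan@example.com", "name": "", "department": "Research"},
]
lookup = {
    "ada@example.com": {"department": "Engineering"},
    "alan@example.com": {"name": "Alan Turing", "department": "Operations"},
}
completed = fill_missing(rows, lookup)
```

Existing values take precedence over the lookup here, which is one reasonable design choice when the attendee report is treated as the primary record.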
Through this system, a relational database was created to analyze the large volume of data the company had accumulated but not yet put to use.
This end-to-end pipeline facilitates seamless ingestion into Supabase, enabling non-technical stakeholders to perform complex queries directly through the Supabase interface or via PowerBI integration within Microsoft Teams.
The script follows the Cookiecutter Data Science project template for data lineage and structure, ensuring that raw data is never directly modified.
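The "raw data is never modified" convention can be sketched as below. The `data/raw` and `data/processed` folders follow the Cookiecutter Data Science layout; the `process` function, the file name, and the cleaning step are hypothetical stand-ins, not the pipeline's actual code.

```python
import tempfile
from pathlib import Path


def process(project_root: Path, filename: str) -> Path:
    """Read a file from data/raw and write a cleaned copy to data/processed."""
    raw_path = project_root / "data" / "raw" / filename
    out_dir = project_root / "data" / "processed"
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical cleaning step: trim whitespace and drop blank lines.
    lines = raw_path.read_text(encoding="utf-8").splitlines()
    cleaned = [line.strip() for line in lines if line.strip()]
    out_path = out_dir / filename
    out_path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return out_path


# Demonstration in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    raw_dir = root / "data" / "raw"
    raw_dir.mkdir(parents=True)
    original = "name,minutes\n\n  Ada,42  \n"
    (raw_dir / "report.csv").write_text(original, encoding="utf-8")
    processed = process(root, "report.csv")
    # The raw file is only ever read; the cleaned output lives elsewhere.
    raw_unchanged = (raw_dir / "report.csv").read_text(encoding="utf-8") == original
    processed_text = processed.read_text(encoding="utf-8")
```

Because every transformation writes to a new location, any processed file can be regenerated from the raw inputs.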
Burglary Risk Prediction
Leading a four-person team, Piña spearheaded the creation of a spatiotemporal forecasting pipeline for urban burglary risk in Atlanta.
The study utilized FBI NIBRS data, the Atlanta Police Department Open Data Portal, Census data, GIS shapefiles, and meteorological records (through Open-Meteo) to create four distinct, multi-faceted data panels used to predict risk zones by Neighborhood Planning Unit (NPU).
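A spatiotemporal panel of this kind can be sketched as a count of incidents per area and time period. The sketch below assumes weekly periods and uses made-up incident records; the `build_panel` helper is hypothetical, and the project's actual panel definitions are not described here.

```python
from collections import Counter
from datetime import date
from itertools import product


def build_panel(incidents, npus, weeks):
    """Count incidents per (NPU, ISO year-week), keeping zero-count cells."""
    counts = Counter((npu, tuple(day.isocalendar()[:2])) for npu, day in incidents)
    # Iterate over every NPU-week pair so that quiet weeks appear as explicit zeros.
    return {(npu, week): counts.get((npu, week), 0) for npu, week in product(npus, weeks)}


# Hypothetical incident records: (NPU label, date of report).
incidents = [
    ("M", date(2024, 1, 2)),
    ("M", date(2024, 1, 4)),
    ("E", date(2024, 1, 9)),
]
weeks = [(2024, 1), (2024, 2)]
panel = build_panel(incidents, ["E", "M"], weeks)
```

Keeping the zero-count cells matters for forecasting: a model needs to see the weeks in which nothing happened, not only the weeks with reported incidents.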
The team employed Cookiecutter Data Science for structure and reproducibility with Weights & Biases (WandB) for robust model lineage, performance tracking, and artifact handling.
The Living Library
A full-stack study resource database designed for data science students. Built with FastAPI and Supabase, the platform features a PostgreSQL-backed semantic search engine that uses vector embeddings.
It provides a centralized and isolated interface for querying a curated collection of technical textbooks and documentation.
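The idea behind embedding-based semantic search can be shown with a small in-memory sketch. This is an illustration only: the platform performs its search in the database, whereas the code below ranks toy three-dimensional vectors in plain Python. The titles, vectors, and function names are hypothetical, and cosine similarity is assumed as the distance measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, documents, top_k=2):
    """Return the titles of the top_k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in documents.items()]
    return [title for _, title in sorted(scored, reverse=True)[:top_k]]


# Toy embeddings; real ones would come from an embedding model.
documents = {
    "Intro to Statistics": [0.9, 0.1, 0.0],
    "Deep Learning Basics": [0.1, 0.9, 0.2],
    "SQL Cookbook": [0.0, 0.2, 0.9],
}
results = search([0.2, 0.8, 0.1], documents)
```

The query vector points mostly along the second dimension, so the document whose embedding does the same ranks first, regardless of whether the two share any keywords.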
ML-HNSCC Study
This Machine Learning capstone focused on the automated segmentation and classification of Head and Neck Squamous Cell Carcinoma (HNSCC) using axial CT scans. Due to the large dataset size, typical for image-based medical research, Piña implemented 90%+ data compression and addressed class imbalances using specialized loss functions, optimizing model generalization across diverse patient datasets.
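The specific loss functions used in the study are not named here. As one common example of a loss suited to imbalanced segmentation, the sketch below implements a soft Dice loss in plain Python; the choice of Dice loss, the `dice_loss` function, and the toy mask are assumptions for illustration.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for flat lists of predicted probabilities and 0/1 labels."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps avoids division by zero when both prediction and target are empty.
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


# Hypothetical 8-pixel mask with a single tumor pixel (heavy class imbalance).
target = [0, 0, 0, 0, 0, 0, 0, 1]
all_background = [0.0] * 8           # predicts "no tumor" everywhere
good_prediction = [0.0] * 7 + [0.9]  # finds the tumor pixel

loss_background = dice_loss(all_background, target)
loss_good = dice_loss(good_prediction, target)
```

Predicting background everywhere would be correct on seven of the eight pixels, yet its Dice loss is close to the maximum of 1, because the loss measures overlap with the rare positive class, not per-pixel accuracy.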
Repository: ML-HNSCC-Study
Ticket-Heroes
A price optimization analysis tool developed to predict the optimal time to purchase concert tickets. The project involved web scraping and sourcing historical pricing data to provide data-driven insights into ticket market volatility. Because of the scarcity of data available, the project involved a significant amount of data cleaning and feature engineering to create multiple usable datasets for modeling and testing, all hosted on Kaggle for future Data Scientists.
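A simple descriptive baseline for this kind of analysis is to average historical prices by the number of days before the event and pick the cheapest window. The sketch below is not the project's model; the `best_days_out` helper and the price observations are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def best_days_out(observations):
    """Return (days_before_event, average_price) with the lowest average price."""
    by_day = defaultdict(list)
    for days_out, price in observations:
        by_day[days_out].append(price)
    averages = {days: mean(prices) for days, prices in by_day.items()}
    best = min(averages, key=averages.get)
    return best, averages[best]


# Hypothetical scraped observations: (days before the event, listed price in USD).
observations = [
    (30, 120.0), (30, 110.0),
    (14, 95.0), (14, 105.0),
    (3, 130.0), (3, 150.0),
]
best_day, best_price = best_days_out(observations)
```

A baseline like this gives a predictive model something to beat: if a learned model cannot outperform "buy at the historically cheapest lead time", the added complexity is not paying off.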
Repository: gsu-ds/ticket-heroes
Personal Life
Joshua Piña currently resides in Columbus, Georgia. He is a dedicated family man, citing his roles as a husband and father as the primary drivers of his professional development and work ethic. He is married to his high school sweetheart, and they maintain a household with one child, two puppies, and a (sporadically) grumpy cat.
Hobbies
Piña is an active member of the tech mentorship community, frequently contributing to ColorStack and CodePath. He is an advocate for data ethics and explores the intersection of artificial intelligence and healthcare advancements. Piña is an avid runner and enjoys weightlifting to recenter, refocus, and decompress. He is a 1x winner of his Fantasy Football League (2017) and a 2x runner-up (2022, 2025).
Family
Piña's personal documentation emphasizes his commitment to his wife and children, maintaining a balance between his technical pursuits and family life in Florida and Georgia.
Professional Certifications
As of February 2026, Piña is a member of CodePath's inaugural AI110 (Foundations of AI Engineering) course.
On February 23, 2026, Piña earned the new Google AI Professional certification.[4]
He also holds a certification from CodePath's Advanced TIP (Technical Interview Prep) course[3] and the IBM Data Analyst Professional Certificate.
Piña is an alumnus of the FourBlock Veteran Development Cohort[2], and holds a certificate for the Fall 2025 Atlanta Cohort.
Notes
1. Georgia State University. "Joshua Piña: Dean's List (2024–2025)". GSU Merit.
2. FourBlock. (2025). "Professional Development Career Readiness Completion".
3. CodePath. (2025). "Technical Mentorship and Interview Prep Certification".
4. Google. (2026). "Google AI Professional Certificate". Completion Date: February 23, 2026.
Further repositories
- The Living Library: Semantic search database for data science resources (FastAPI, Supabase)
- ML-HNSCC Study: ML segmentation and classification of head and neck carcinoma from CT scans
- Graph-Cut Imaging: Graph-cut algorithms for image segmentation
- Digital Image Processing Project: Coursework covering filtering, transforms, and segmentation
- Campus Burglary Risk Prediction: Spatiotemporal burglary risk forecasting by NPU across Atlanta
- Ticket Heroes: Optimal concert ticket purchase timing using historical price data
- Fly Like a Bird: GSU Data Science capstone project
- GSU-DS Organization: Archive of GSU Data Science capstone projects
- Multi-Purpose Conversion Tool: Utility gist for common data type conversions
Further reading and resources
- KDnuggets. "Data Science, Machine Learning, and AI Research".
- Towards Data Science. "Technical Data Science Implementations".
- RealPython. "Python Development and Documentation".
- 3Blue1Brown. "Mathematical Visualizations and Intuition".