How Our Student Aryaman Built a Stroke-Detection AI Project in High School
BetterMind Labs · Sep 16 · 7 min read · Updated: Oct 28
Introduction: How a High School Student Built a Stroke-Detection AI Project

If you ask most parents and students whether a high-schooler can build an AI model that flags possible stroke cases, they’ll assume it’s college-level work. Aryaman proved otherwise. With thoughtful mentorship, the right open datasets, and an emphasis on ethics and validation, he built a learning-grade stroke-detection AI that he could demo at fairs, discuss in scholarship essays, and expand into a research portfolio.
In this case study, I’ll show you exactly how Aryaman did it—from idea to prototype—so you can understand the path, the pitfalls, and how to replicate it responsibly. I’ll also point you to the same public resources and explain where BetterMindLabs.org fit in to keep everything structured and safe.
Quick note for families: this is not medical advice and Aryaman’s project was educational only, not a clinical device. We emphasize this throughout and include the disclaimers he used.
Why Stroke Detection? (The “Why” that powers the “How”)

Two facts shaped Aryaman’s choice:
Every minute matters. Stroke treatment windows are brutally short; faster triage and diagnosis can protect brain function. Leading centers are using AI to speed up detection, coordination, and decision-making in real care settings. That context helped Aryaman frame a socially meaningful student project (and shaped his literature review).
There’s active, credible research to learn from. Universities and hospitals (e.g., University of Alberta; UC Davis; Mayo Clinic) continue exploring AI for quicker, more accurate stroke workflows—perfect fodder for a high-school literature review and for setting realistic expectations about scope.
The Project Summarized
Aryaman built the project in phases: a baseline risk-prediction model from tabular health data, an imaging prototype built with transfer learning, and a small speech-cues experiment, all wrapped in a clearly labeled demo app. Much of this plan follows best-practice pathways for student-friendly stroke-AI builds.
Phase 0 — Planning the Build (with Parents in the Loop)

Family conversation & guardrails. Before touching data, Aryaman and his parents met with mentors to set non-negotiables: no protected health information (PHI), use public/consented datasets only, no claims of clinical performance, and clearly communicate limitations in any demo. We also agreed on “portfolio-grade” goals: learn core ML, show responsible design, and document everything (for fairs, essays, and interviews).
Phase 1 — Risk-Prediction Baseline (Tabular ML)
Dataset. Aryaman started with the widely used Kaggle Stroke Prediction Dataset (11 clinical features such as age, hypertension, heart disease, glucose, smoking status). It’s a classic starter set for classification and for learning about class imbalance.
What he built. A notebook that (sketched in Python after this list):
Cleans data, handles missingness, and encodes categories.
Tackles imbalance with class weights and simple resampling.
Trains Logistic Regression, Random Forest, and SVM, comparing metrics like ROC-AUC, precision/recall, and F1.
Uses explainability (e.g., feature importance and SHAP-style reasoning) to show which inputs most influence predictions—crucial for reflective learning and write-ups. (Explainable-AI approaches are a strong trend in recent stroke-AI work.)
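For readers who want to see the shape of that notebook, here is a minimal scikit-learn sketch. The column names follow the public Kaggle dataset; the file path, model settings, and the exact choice of classifiers are illustrative assumptions, not Aryaman's actual code.

```python
# Minimal sketch of the Phase 1 baseline, assuming the Kaggle Stroke
# Prediction Dataset saved locally (hypothetical path below).
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("healthcare-dataset-stroke-data.csv")  # hypothetical local path
X = df.drop(columns=["id", "stroke"])
y = df["stroke"]

numeric = ["age", "avg_glucose_level", "bmi"]
categorical = ["gender", "hypertension", "heart_disease", "ever_married",
               "work_type", "Residence_type", "smoking_status"]

# Impute missing BMI values and scale numerics; one-hot encode categories.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# class_weight="balanced" is the simplest counter to heavy class imbalance.
models = {
    "logistic regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "random forest": RandomForestClassifier(n_estimators=300, class_weight="balanced"),
    "svm": SVC(probability=True, class_weight="balanced"),
}

for name, model in models.items():
    clf = Pipeline([("prep", preprocess), ("model", model)])
    clf.fit(X_train, y_train)
    proba = clf.predict_proba(X_test)[:, 1]
    print(f"{name}: ROC-AUC={roc_auc_score(y_test, proba):.3f} "
          f"F1={f1_score(y_test, clf.predict(X_test)):.3f}")
```

Reporting ROC-AUC and F1 side by side, rather than accuracy alone, is exactly the habit the imbalanced stroke table is meant to teach.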
Why this matters. It’s the safest on-ramp: no images, no clinical claims, but rich opportunities to discuss bias, false positives/negatives, and the ethics of prediction in healthcare.
Phase 2 — Imaging Prototype (Vision with Transfer Learning)

The idea. Move from tabular signals to medical imaging, where many stroke-care breakthroughs originate. Literature shows a range of results (e.g., sensitivities and accuracies in the 80–90% range and above in research contexts, depending on data and task), making this a rigorous but achievable learning challenge for a teen under guidance.
How Aryaman scoped it.
Start tiny: simulate a binary classifier (“stroke-suggestive vs. control”) using public imaging samples (de-identified; educational repositories).
Use transfer learning (e.g., a pre-trained CNN backbone) to minimize compute and data needs (see the sketch after this list).
Emphasize evaluation discipline (train/val/test splits; no leakage) and report multiple metrics, not just accuracy.
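A compact PyTorch sketch of what such a transfer-learning prototype can look like. The folder layout, the ResNet-18 backbone, and the training settings are assumptions for illustration; any pre-trained CNN over a public, de-identified image set would serve the same purpose.

```python
# Educational transfer-learning sketch, assuming de-identified public
# images sorted into "data/train/<class>" folders (hypothetical layout).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Grayscale(num_output_channels=3),  # many medical images are grayscale
    transforms.ToTensor(),
])

train_set = datasets.ImageFolder("data/train", transform=transform)
loader = DataLoader(train_set, batch_size=16, shuffle=True)

# Start from an ImageNet-pretrained backbone and freeze it, which keeps
# both compute and data requirements small.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # binary head: stroke-suggestive vs. control

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Freezing the backbone and training only the final layer is what makes this feasible on a laptop; held-out validation and test splits (not shown) are where the evaluation discipline comes in.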
Why this matters. Parents often ask, “Isn’t medical imaging off-limits?” With public datasets and strict ethical framing, students can learn imaging ML while staying far away from real patient data. Hospitals and universities are publishing research (Mayo/UC Davis/U. Alberta) that students can cite to explain the clinical context without claiming equivalence to hospital-grade tools.
Phase 3 — Speech Cues Mini-Study (Signals & Features)
The idea. Some research explores multimodal stroke cues (e.g., facial droop, slurred speech) for faster pre-hospital screening. Aryaman ran a small audio classification experiment (MFCC features → classical ML) to detect dysarthria-like signals using non-patient audio and/or synthetic data, purely as a concept demo. Emerging literature shows phone-grade tools and multimodal screening models are active research areas.
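A concept-level sketch of that experiment: extract MFCCs with librosa and feed summary statistics to a classical classifier. The `clips.csv` metadata file and its columns are hypothetical, and the audio is assumed to be non-patient or synthetic, as in Aryaman's demo.

```python
# MFCC features -> classical ML, as a concept demo only.
import librosa
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def mfcc_features(path):
    """Load a clip and summarize 13 MFCCs as per-coefficient means and stds."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# clips.csv (hypothetical) lists audio file paths and a 0/1 label.
meta = pd.read_csv("clips.csv")
X = np.vstack([mfcc_features(p) for p in meta["path"]])
y = meta["label"].to_numpy()

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(f"CV ROC-AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```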
Why this matters. It broadens the project beyond images and teaches audio pipelines, feature extraction, and model comparison—useful skills for any aspiring engineer.
Phase 4 — A Safe Demo App (and What He Never Claimed)
Aryaman wrapped the three pieces into a Streamlit-style demo (a minimal skeleton follows the list):
Choose a mode (risk, imaging, or speech).
Show inputs, model output, and plain-English explanations.
Prominent banner: “Educational research demo. Not for medical use.”
A section explaining limitations, data sources, and how clinical tools are validated—grounded in the literature he read.
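Here is a minimal Streamlit skeleton showing that framing, with the disclaimer rendered before anything else. Widget labels and the expander text are illustrative assumptions; the model wiring is deliberately omitted.

```python
# Skeleton of the demo's structure, not Aryaman's exact app.
import streamlit as st

st.title("Stroke-AI Learning Demo")
st.warning("Educational research demo. Not for medical use.")  # banner first

mode = st.sidebar.selectbox("Mode", ["Risk (tabular)", "Imaging", "Speech"])

if mode == "Risk (tabular)":
    age = st.number_input("Age", min_value=0, max_value=120, value=45)
    glucose = st.number_input("Avg. glucose level", value=100.0)
    st.write("A trained model would score these inputs here, with a "
             "plain-English explanation of the most influential features.")
elif mode == "Imaging":
    st.file_uploader("Upload a de-identified sample image", type=["png", "jpg"])
else:
    st.file_uploader("Upload a non-patient audio clip", type=["wav"])

with st.expander("Limitations and data sources"):
    st.write("Small public datasets; no clinical validation; see the "
             "bibliography for how real clinical tools are evaluated.")
```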
Parents: The most important thing wasn't the UI; it was the responsible framing. He learned to separate learning prototypes from clinical software, and to point to real hospital deployments for context rather than hype (Mayo Clinic Magazine; UC Davis Health).
What the Research Says (and How Aryaman Used It)
Hospitals are already using AI to speed triage and treatment decisions in stroke networks—e.g., Viz-style tools reducing transfer and treatment times, with clinical teams in the loop. Aryaman cited this to explain why speed matters in stroke care.
Academic labs are publishing new stroke-AI models (CT/MRI and multimodal). Aryaman summarized recent results (sensitivities/accuracies often >80% in research settings), while clearly stating that his model is for learning, not medicine.
Public datasets (like Kaggle’s stroke table) are a legitimate entry point for students, with clear documentation and community examples to learn from.
For his planning notes, he relied on a high-school-friendly stroke-AI guide that collects approaches (risk prediction, imaging, speech), common pitfalls (class imbalance), and starter resources. We used this to shape milestones and teaching moments.
Results & Reflection (What “Good” Looked Like)
Aryaman did not lead with a single headline number. Instead, he learned to report (sketched in code after this list):
ROC-AUC alongside precision/recall (since stroke classes are imbalanced).
A confusion matrix to discuss trade-offs (missing a positive vs. false alarms).
Model explanation visuals (feature importance, saliency/attribution caveats) to reflect on why the model behaved as it did.
A plain-English limitations section: small data, potential dataset bias, no clinical validation, and never for diagnosis/treatment.
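A short sketch of that style of reporting, assuming `clf`, `X_test`, and `y_test` come from a fitted pipeline like the Phase 1 example. Permutation importance stands in here for the feature-importance visuals; SHAP would be a drop-in alternative.

```python
# Report several views of performance, not one headline number.
import matplotlib.pyplot as plt
from sklearn.inspection import permutation_importance
from sklearn.metrics import ConfusionMatrixDisplay, classification_report

print(classification_report(y_test, clf.predict(X_test),
                            target_names=["no stroke", "stroke"]))

# The confusion matrix makes the trade-off concrete: every missed positive
# and every false alarm is a visible cell, not a buried statistic.
ConfusionMatrixDisplay.from_estimator(clf, X_test, y_test)
plt.show()

# Permutation importance: how much does shuffling each input column hurt ROC-AUC?
result = permutation_importance(clf, X_test, y_test, n_repeats=10,
                                random_state=0, scoring="roc_auc")
ranked = sorted(zip(X_test.columns, result.importances_mean), key=lambda t: -t[1])
for name, importance in ranked[:5]:
    print(f"{name}: {importance:.3f}")
```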
He compared his student-grade metrics to ranges found in research papers and hospital write-ups—without implying parity. This gave him a credible, humble tone in presentations and essays.
The Role of BetterMind Labs (and Why It Helped)

Aryaman built his project within a structured mentorship track at BetterMindLabs.org:
Scoping & ethics. Mentors helped pick a safe scope, add disclaimers, and choose appropriate public datasets.
Technical checkpoints. Weekly code reviews on data splits, leakage prevention, and evaluation.
Research literacy. We taught him how to cite real clinical deployments (e.g., UC Davis/Mayo) as context, not as claims.
Portfolio polish. Write-ups, annotated notebooks, and a demo page that college readers and scholarship committees can follow.
This structure turned a big idea into a repeatable learning journey rather than a one-off hack.
How You (and Your Teen) Can Follow the Same Path
8-Week Roadmap (Parent-Friendly)
| Week | Focus | What You'll Produce | Notes for Parents |
| --- | --- | --- | --- |
| 1 | Literature & ethics | One-page brief: what stroke is, why speed matters, key AI contexts | Look for sources from hospitals/universities; avoid hype (Mayo Clinic Magazine; UC Davis Health). |
| 2 | Data 101 | Kaggle dataset exploration; imbalance notes | No PHI. Public/consented data only (Kaggle). |
| 3 | Baseline ML | LR/Random Forest/SVM with ROC-AUC & F1 | Teach false-positive/negative trade-offs. |
| 4 | Explainability | Feature importance + narrative | Emphasize interpretation ≠ truth (Nature). |
| 5 | Imaging intro | Small transfer-learning prototype | Keep it educational; cite clinical context separately (Nature). |
| 6 | Signals/audio | MFCC mini-classifier for speech cues | Non-patient or synthetic audio only (Nature). |
| 7 | Demo app | Streamlit-style UI + NOT FOR MEDICAL USE banners | Parents review wording before sharing. |
| 8 | Portfolio | README, metrics table, limitations, bibliography | Tie learning to impact (fairs, essays). |
What Colleges & Scholarships See

Intellectual curiosity: You studied a high-stakes medical domain through peer-reviewed and hospital sources, not just blogs.
Technical growth: Tabular ML → imaging → signals, with proper metrics and explainability.
Ethical maturity: Clear disclaimers, data provenance, and limits.
Communication: A portfolio that non-engineers can understand.
Families often ask if this helps for programs like the Gates Scholarship or Cooke Scholarship. The short answer: yes—as part of a broader profile showing service, leadership, and resilience. The project becomes a story about learning responsibly and connecting tech to human outcomes.
Key Lessons Aryaman Wants You to Know
Pick an impact area you care about, then define an educational scope. Stroke is a great example because credible sources and open datasets exist.
Don't chase accuracy alone. Learn to read ROC-AUC and precision/recall, and to explain why they matter.
Cite the real world. Hospitals and universities are integrating AI to speed decisions; reference them to explain context—never as proof your student model is clinical-grade.
Ship a demo with guardrails. Prominent NOT FOR MEDICAL USE banners, a “limitations” page, and links to your sources.
Get feedback. Mentors at BetterMindLabs.org stress peer review, code checks, and presentation coaching so your teen can confidently talk about the project with teachers and selection committees.
Bonus: Metrics Cheat-Sheet
| Metric | What it Means | Why it Matters in Stroke-like Tasks |
| --- | --- | --- |
| Accuracy | % of correct predictions | Can be misleading with imbalanced data (few "stroke" cases). |
| Precision | Of predicted positives, how many were correct | High precision = fewer false alarms. |
| Recall (Sensitivity) | Of actual positives, how many you caught | High recall = fewer missed cases (important in screening). |
| F1 Score | Harmonic mean of precision & recall | Good single number for imbalanced classes. |
| ROC-AUC | Probability the model ranks a random positive above a random negative | Threshold-independent view of separability; great for model comparison. |
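A tiny experiment makes the first row concrete: on a toy set with 5% positives, a model that always predicts "no stroke" still looks accurate while catching nothing. The class balance and the "always negative" model here are illustrative assumptions.

```python
# Why accuracy misleads on imbalanced data: a do-nothing model scores 95%.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [0] * 95 + [1] * 5   # 5% positive rate, like many stroke tables
y_pred = [0] * 100            # model that predicts "no stroke" for everyone

print("accuracy: ", accuracy_score(y_true, y_pred))                    # 0.95, looks great
print("recall:   ", recall_score(y_true, y_pred, zero_division=0))     # 0.0, misses every case
print("precision:", precision_score(y_true, y_pred, zero_division=0))  # undefined -> 0
print("f1:       ", f1_score(y_true, y_pred, zero_division=0))         # 0.0
```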
Staying Current (Why this topic isn’t going away)
The field is moving quickly. In the past year, national health systems have expanded AI support in stroke pathways and reported faster door-to-treatment times and improved functional outcomes after rollout—underscoring why learning this domain is timely (again, not for teens to deploy clinically, but to understand responsibly).
Final Thoughts (and Your Next Step)
Aryaman’s project worked not because he chased flashy claims—but because he learned like a scientist: small steps, careful validation, relentless documentation, and humility about limits. That’s what colleges, scholarship committees, and research mentors respect.
If you want this same structure and accountability, explore mentorship tracks at BetterMindLabs.org. We’ll help your family set ethical guardrails, pick the right scope, and turn curiosity into a polished, portfolio-grade project you can be proud of.