How Our Student Aryaman Built a Stroke-Detection AI Project in High School
BetterMind Labs · Sep 16 · 7 min read · Updated: Oct 28
Introduction: How a High School Student Built a Stroke-Detection AI Project

If you ask most parents and students whether a high-schooler can build an AI model that flags possible stroke cases, they’ll assume it’s college-level work. Aryaman proved otherwise. With thoughtful mentorship, the right open datasets, and an emphasis on ethics and validation, he built a learning-grade stroke-detection AI that he could demo at fairs, discuss in scholarship essays, and expand into a research portfolio.
In this case study, I’ll show you exactly how Aryaman did it—from idea to prototype—so you can understand the path, the pitfalls, and how to replicate it responsibly. I’ll also point you to the same public resources and explain where BetterMindLabs.org fit in to keep everything structured and safe.
Quick note for families: this is not medical advice and Aryaman’s project was educational only, not a clinical device. We emphasize this throughout and include the disclaimers he used.
Why Stroke Detection? (The “Why” that powers the “How”)

Two facts shaped Aryaman’s choice:
Every minute matters. Stroke treatment windows are brutally short; faster triage and diagnosis can protect brain function. Leading centers are using AI to speed up detection, coordination, and decision-making in real care settings. That context helped Aryaman frame a socially meaningful student project (and shaped his literature review).
There’s active, credible research to learn from. Universities and hospitals (e.g., University of Alberta; UC Davis; Mayo Clinic) continue exploring AI for quicker, more accurate stroke workflows—perfect fodder for a high-school literature review and for setting realistic expectations about scope.
The Project Summarized
Aryaman built the project in phases: a baseline risk-prediction model from tabular health data, an imaging prototype built with transfer learning, and a small speech-cues experiment, all wrapped in a clearly labeled demo app. Much of this plan follows best-practice pathways for student-friendly stroke-AI builds.
Phase 0 — Planning the Build (with Parents in the Loop)

Family conversation & guardrails. Before touching data, Aryaman and his parents met with mentors to set non-negotiables: no protected health information (PHI), use public/consented datasets only, no claims of clinical performance, and clearly communicate limitations in any demo. We also agreed on “portfolio-grade” goals: learn core ML, show responsible design, and document everything (for fairs, essays, and interviews).
Phase 1 — Risk-Prediction Baseline (Tabular ML)
Dataset. Aryaman started with the widely used Kaggle Stroke Prediction Dataset (11 clinical features such as age, hypertension, heart disease, glucose, smoking status). It’s a classic starter set for classification and for learning about class imbalance.
What he built. A notebook that (sketched in Python after this list):
Cleans data, handles missingness, and encodes categories.
Tackles imbalance with class weights and simple resampling.
Trains Logistic Regression, Random Forest, and SVM, comparing metrics like ROC-AUC, precision/recall, and F1.
Uses explainability (e.g., feature importance and SHAP-style reasoning) to show which inputs most influence predictions—crucial for reflective learning and write-ups. (Explainable-AI approaches are a strong trend in recent stroke-AI work.)
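For readers who want to see the shape of that notebook, here is a minimal scikit-learn sketch. The column names follow the public Kaggle dataset; the file path, model settings, and the exact choice of classifiers are illustrative assumptions, not Aryaman's actual code.

```python
# Minimal sketch of the Phase 1 baseline, assuming the Kaggle Stroke
# Prediction Dataset saved locally (hypothetical path below).
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("healthcare-dataset-stroke-data.csv")  # hypothetical local path
X = df.drop(columns=["id", "stroke"])
y = df["stroke"]

numeric = ["age", "avg_glucose_level", "bmi"]
categorical = ["gender", "hypertension", "heart_disease", "ever_married",
               "work_type", "Residence_type", "smoking_status"]

# Impute missing BMI values and scale numerics; one-hot encode categories.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# class_weight="balanced" is the simplest counter to heavy class imbalance.
models = {
    "logistic regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "random forest": RandomForestClassifier(n_estimators=300, class_weight="balanced"),
    "svm": SVC(probability=True, class_weight="balanced"),
}

for name, model in models.items():
    clf = Pipeline([("prep", preprocess), ("model", model)])
    clf.fit(X_train, y_train)
    proba = clf.predict_proba(X_test)[:, 1]
    print(f"{name}: ROC-AUC={roc_auc_score(y_test, proba):.3f} "
          f"F1={f1_score(y_test, clf.predict(X_test)):.3f}")
```

Reporting ROC-AUC and F1 side by side, rather than accuracy alone, is exactly the habit the imbalanced stroke table is meant to teach.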
Why this matters. It’s the safest on-ramp: no images, no clinical claims, but rich opportunities to discuss bias, false positives/negatives, and the ethics of prediction in healthcare.
Phase 2 — Imaging Prototype (Vision with Transfer Learning)

The idea. Move from tabular signals to medical imaging, where many stroke-care breakthroughs originate. Literature shows a range of results (e.g., sensitivities and accuracies in the 80–90% range and above in research contexts, depending on data and task), making this a rigorous but achievable learning challenge for a teen under guidance.
How Aryaman scoped it.
Start tiny: simulate a binary classifier (“stroke-suggestive vs. control”) using public imaging samples (de-identified; educational repositories).
Use transfer learning (e.g., a pre-trained CNN backbone) to minimize compute and data needs (see the sketch after this list).
Emphasize evaluation discipline (train/val/test splits; no leakage) and report multiple metrics, not just accuracy.
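A compact PyTorch sketch of what such a transfer-learning prototype can look like. The folder layout, the ResNet-18 backbone, and the training settings are assumptions for illustration; any pre-trained CNN over a public, de-identified image set would serve the same purpose.

```python
# Educational transfer-learning sketch, assuming de-identified public
# images sorted into "data/train/<class>" folders (hypothetical layout).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Grayscale(num_output_channels=3),  # many medical images are grayscale
    transforms.ToTensor(),
])

train_set = datasets.ImageFolder("data/train", transform=transform)
loader = DataLoader(train_set, batch_size=16, shuffle=True)

# Start from an ImageNet-pretrained backbone and freeze it, which keeps
# both compute and data requirements small.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # binary head: stroke-suggestive vs. control

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Freezing the backbone and training only the final layer is what makes this feasible on a laptop; held-out validation and test splits (not shown) are where the evaluation discipline comes in.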
Why this matters. Parents often ask, “Isn’t medical imaging off-limits?” With public datasets and strict ethical framing, students can learn imaging ML while staying far away from real patient data. Hospitals and universities are publishing research (Mayo/UC Davis/U. Alberta) that students can cite to explain the clinical context without claiming equivalence to hospital-grade tools.
Phase 3 — Speech Cues Mini-Study (Signals & Features)
The idea. Some research explores multimodal stroke cues (e.g., facial droop, slurred speech) for faster pre-hospital screening. Aryaman ran a small audio classification experiment (MFCC features → classical ML) to detect dysarthria-like signals using non-patient audio and/or synthetic data, purely as a concept demo. Emerging literature shows phone-grade tools and multimodal screening models are active research areas.
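A concept-level sketch of that experiment: extract MFCCs with librosa and feed summary statistics to a classical classifier. The `clips.csv` metadata file and its columns are hypothetical, and the audio is assumed to be non-patient or synthetic, as in Aryaman's demo.

```python
# MFCC features -> classical ML, as a concept demo only.
import librosa
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def mfcc_features(path):
    """Load a clip and summarize 13 MFCCs as per-coefficient means and stds."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# clips.csv (hypothetical) lists audio file paths and a 0/1 label.
meta = pd.read_csv("clips.csv")
X = np.vstack([mfcc_features(p) for p in meta["path"]])
y = meta["label"].to_numpy()

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(f"CV ROC-AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```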
Why this matters. It broadens the project beyond images and teaches audio pipelines, feature extraction, and model comparison—useful skills for any aspiring engineer.
Phase 4 — A Safe Demo App (and What He Never Claimed)
Aryaman wrapped the three pieces into a Streamlit-style demo (a minimal skeleton follows the list):
Choose a mode (risk, imaging, or speech).
Show inputs, model output, and plain-English explanations.
Prominent banner: “Educational research demo. Not for medical use.”
A section explaining limitations, data sources, and how clinical tools are validated—grounded in the literature he read.
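Here is a minimal Streamlit skeleton showing that framing, with the disclaimer rendered before anything else. Widget labels and the expander text are illustrative assumptions; the model wiring is deliberately omitted.

```python
# Skeleton of the demo's structure, not Aryaman's exact app.
import streamlit as st

st.title("Stroke-AI Learning Demo")
st.warning("Educational research demo. Not for medical use.")  # banner first

mode = st.sidebar.selectbox("Mode", ["Risk (tabular)", "Imaging", "Speech"])

if mode == "Risk (tabular)":
    age = st.number_input("Age", min_value=0, max_value=120, value=45)
    glucose = st.number_input("Avg. glucose level", value=100.0)
    st.write("A trained model would score these inputs here, with a "
             "plain-English explanation of the most influential features.")
elif mode == "Imaging":
    st.file_uploader("Upload a de-identified sample image", type=["png", "jpg"])
else:
    st.file_uploader("Upload a non-patient audio clip", type=["wav"])

with st.expander("Limitations and data sources"):
    st.write("Small public datasets; no clinical validation; see the "
             "bibliography for how real clinical tools are evaluated.")
```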
Parents: The most important thing wasn't the UI; it was the responsible framing. He learned to separate learning prototypes from clinical software, and to point to real hospital deployments for context rather than hype (Mayo Clinic Magazine; UC Davis Health).
What the Research Says (and How Aryaman Used It)
Hospitals are already using AI to speed triage and treatment decisions in stroke networks—e.g., Viz-style tools reducing transfer and treatment times, with clinical teams in the loop. Aryaman cited this to explain why speed matters in stroke care.
Academic labs are publishing new stroke-AI models (CT/MRI and multimodal). Aryaman summarized recent results (sensitivities/accuracies often >80% in research settings), while clearly stating that his model is for learning, not medicine.
Public datasets (like Kaggle’s stroke table) are a legitimate entry point for students, with clear documentation and community examples to learn from.
For his planning notes, he relied on a high-school-friendly stroke-AI guide that collects approaches (risk prediction, imaging, speech), common pitfalls (class imbalance), and starter resources. We used this to shape milestones and teaching moments.
Results & Reflection (What “Good” Looked Like)
Aryaman did not lead with a single headline number. Instead, he learned to report (sketched in code after this list):
ROC-AUC alongside precision/recall (since stroke classes are imbalanced).
A confusion matrix to discuss trade-offs (missing a positive vs. false alarms).
Model explanation visuals (feature importance, saliency/attribution caveats) to reflect on why the model behaved as it did.
A plain-English limitations section: small data, potential dataset bias, no clinical validation, and never for diagnosis/treatment.
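A short sketch of that style of reporting, assuming `clf`, `X_test`, and `y_test` come from a fitted pipeline like the Phase 1 example. Permutation importance stands in here for the feature-importance visuals; SHAP would be a drop-in alternative.

```python
# Report several views of performance, not one headline number.
import matplotlib.pyplot as plt
from sklearn.inspection import permutation_importance
from sklearn.metrics import ConfusionMatrixDisplay, classification_report

print(classification_report(y_test, clf.predict(X_test),
                            target_names=["no stroke", "stroke"]))

# The confusion matrix makes the trade-off concrete: every missed positive
# and every false alarm is a visible cell, not a buried statistic.
ConfusionMatrixDisplay.from_estimator(clf, X_test, y_test)
plt.show()

# Permutation importance: how much does shuffling each input column hurt ROC-AUC?
result = permutation_importance(clf, X_test, y_test, n_repeats=10,
                                random_state=0, scoring="roc_auc")
ranked = sorted(zip(X_test.columns, result.importances_mean), key=lambda t: -t[1])
for name, importance in ranked[:5]:
    print(f"{name}: {importance:.3f}")
```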
He compared his student-grade metrics to ranges found in research papers and hospital write-ups—without implying parity. This gave him a credible, humble tone in presentations and essays.
The Role of BetterMind Labs (and Why It Helped)

Aryaman built his project within a structured mentorship track at BetterMindLabs.org:
Scoping & ethics. Mentors helped pick a safe scope, add disclaimers, and choose appropriate public datasets.
Technical checkpoints. Weekly code reviews on data splits, leakage prevention, and evaluation.
Research literacy. We taught him how to cite real clinical deployments (e.g., UC Davis/Mayo) as context, not as claims.
Portfolio polish. Write-ups, annotated notebooks, and a demo page that college readers and scholarship committees can follow.
This structure turned a big idea into a repeatable learning journey rather than a one-off hack.
How You (and Your Teen) Can Follow the Same Path
8-Week Roadmap (Parent-Friendly)
| Week | Focus | What You'll Produce | Notes for Parents |
| --- | --- | --- | --- |
| 1 | Literature & ethics | One-page brief: what stroke is, why speed matters, key AI contexts | Look for sources from hospitals/universities; avoid hype (Mayo Clinic Magazine; UC Davis Health). |
| 2 | Data 101 | Kaggle dataset exploration; imbalance notes | No PHI. Public/consented data only (Kaggle). |
| 3 | Baseline ML | LR/Random Forest/SVM with ROC-AUC & F1 | Teach false-positive/negative trade-offs. |
| 4 | Explainability | Feature importance + narrative | Emphasize interpretation ≠ truth (Nature). |
| 5 | Imaging intro | Small transfer-learning prototype | Keep it educational; cite clinical context separately (Nature). |
| 6 | Signals/audio | MFCC mini-classifier for speech cues | Non-patient or synthetic audio only (Nature). |
| 7 | Demo app | Streamlit-style UI + NOT FOR MEDICAL USE banners | Parents review wording before sharing. |
| 8 | Portfolio | README, metrics table, limitations, bibliography | Tie learning to impact (fairs, essays). |
What Colleges & Scholarships See

Intellectual curiosity: You studied a high-stakes medical domain through peer-reviewed and hospital sources, not just blogs.
Technical growth: Tabular ML → imaging → signals, with proper metrics and explainability.
Ethical maturity: Clear disclaimers, data provenance, and limits.
Communication: A portfolio that non-engineers can understand.
Families often ask if this helps for programs like the Gates Scholarship or Cooke Scholarship. The short answer: yes—as part of a broader profile showing service, leadership, and resilience. The project becomes a story about learning responsibly and connecting tech to human outcomes.
Key Lessons Aryaman Wants You to Know
Pick an impact area you care about, then define an educational scope. Stroke is a great example because credible sources and open datasets exist.
Don't chase accuracy alone. Learn to read ROC-AUC and precision/recall, and to explain why they matter.
Cite the real world. Hospitals and universities are integrating AI to speed decisions; reference them to explain context—never as proof your student model is clinical-grade.
Ship a demo with guardrails. Prominent NOT FOR MEDICAL USE banners, a “limitations” page, and links to your sources.
Get feedback. Mentors at BetterMindLabs.org stress peer review, code checks, and presentation coaching so your teen can confidently talk about the project with teachers and selection committees.
Bonus: Metrics Cheat-Sheet
| Metric | What it Means | Why it Matters in Stroke-like Tasks |
| --- | --- | --- |
| Accuracy | % of correct predictions | Can be misleading with imbalanced data (few "stroke" cases). |
| Precision | Of predicted positives, how many were correct | High precision = fewer false alarms. |
| Recall (Sensitivity) | Of actual positives, how many you caught | High recall = fewer missed cases (important in screening). |
| F1 Score | Harmonic mean of precision & recall | Good single number for imbalanced classes. |
| ROC-AUC | Probability the model ranks a random positive above a random negative | Threshold-independent view of separability; great for model comparison. |
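A tiny experiment makes the first row concrete: on a toy set with 5% positives, a model that always predicts "no stroke" still looks accurate while catching nothing. The class balance and the "always negative" model here are illustrative assumptions.

```python
# Why accuracy misleads on imbalanced data: a do-nothing model scores 95%.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [0] * 95 + [1] * 5   # 5% positive rate, like many stroke tables
y_pred = [0] * 100            # model that predicts "no stroke" for everyone

print("accuracy: ", accuracy_score(y_true, y_pred))                    # 0.95, looks great
print("recall:   ", recall_score(y_true, y_pred, zero_division=0))     # 0.0, misses every case
print("precision:", precision_score(y_true, y_pred, zero_division=0))  # undefined -> 0
print("f1:       ", f1_score(y_true, y_pred, zero_division=0))         # 0.0
```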
Staying Current (Why this topic isn’t going away)
The field is moving quickly. In the past year, national health systems have expanded AI support in stroke pathways and reported faster door-to-treatment times and improved functional outcomes after rollout—underscoring why learning this domain is timely (again, not for teens to deploy clinically, but to understand responsibly).
Final Thoughts (and Your Next Step)
Aryaman’s project worked not because he chased flashy claims—but because he learned like a scientist: small steps, careful validation, relentless documentation, and humility about limits. That’s what colleges, scholarship committees, and research mentors respect.
If you want this same structure and accountability, explore mentorship tracks at BetterMindLabs.org. We’ll help your family set ethical guardrails, pick the right scope, and turn curiosity into a polished, portfolio-grade project you can be proud of.