Machine Learning for Healthcare
by David Sontag & Peter Szolovits · MIT OpenCourseWare
Our Verdict
Worth it, with caveats
MIT 6.S897/HST.956 Machine Learning for Healthcare (Spring 2019), co-taught by Professors David Sontag and Peter Szolovits, is one of the most authoritative free graduate courses on applying ML to real clinical problems. The full set of 25 lecture videos, plus lecture notes, readings, and project guidelines, is freely available on MIT OpenCourseWare under a Creative Commons (CC BY-NC-SA) license. Rather than re-teaching ML fundamentals, it focuses on what makes healthcare data hard: messy EHR and insurance-claims data, risk stratification, survival modeling, clinical NLP, causal inference, disease progression and subtyping, fairness, dataset shift, and regulation, anchored by guest lectures from practicing Boston-area clinicians. A prior ML course is a prerequisite, so it is genuinely advanced and not an entry point.
As a self-study resource it has real limitations: it is a 2019 archive (predating the modern LLM/foundation-model wave), there is no certificate, no graded feedback or autograder, and the original problem sets relied on restricted datasets (MIMIC, IBM MarketScan) that require separate credentialed access. For a machine-learning practitioner or clinician-researcher who already knows the math and wants domain depth, it remains an excellent, citable free resource; for beginners it is the wrong starting point.
Outstanding, free, expert-taught domain course on clinical ML, but only worthwhile if you already have ML fundamentals; it is a 2019 archive with no certificate, no graded feedback, and problem sets that depend on access-restricted clinical datasets.
Best for: ML engineers, data scientists, and graduate students who already understand supervised learning, probability, and (ideally) some deep learning, and who want a rigorous, application-driven understanding of how ML is actually used on clinical data. It is especially valuable for people moving into health-tech or clinical-AI research, and for clinicians or biomedical researchers with a strong quantitative/ML background who want to learn the modeling side and the failure modes (bias, confounding, dataset shift) specific to healthcare.
Skip if: You are a complete beginner or have not taken a machine learning course. MIT explicitly lists a prior ML class as a prerequisite (e.g., 6.036/6.862, 6.867, 9.520, 6.806/6.864, 6.438, or 6.034), and the lectures assume that background. It is also a poor fit if you want a certificate, hands-on graded labs with feedback, or up-to-date coverage of LLMs and foundation models in medicine, since this is a static Spring 2019 recording.
About This Course
An MIT graduate course on applying ML to healthcare problems, including clinical NLP, disease progression modeling, and causal inference.
Curriculum
- What makes healthcare unique, an overview of clinical care, and a deep dive into the nature of clinical data (EHRs, claims, coding, missingness).
- Risk stratification using EHRs and insurance claims, survival modeling, and physiological time-series analysis.
- Two lectures on clinical natural language processing for extracting information from unstructured clinical notes.
- Translating technology into the clinic, plus applied ML for cardiology, differential diagnosis, pathology, and mammography, with clinician guest lectures.
- Two lectures on causal inference and two on reinforcement learning applied to treatment decisions.
- Disease progression modeling and subtyping, precision medicine, and automating clinical workflows.
- Regulation of ML/AI in the US, algorithmic fairness, robustness to dataset shift, and interpretability.
Prerequisites
- A prior machine learning course (MIT lists 6.036/6.862 Intro/Applied ML, 6.867 Machine Learning, 9.520 Statistical Learning Theory, 6.806/6.864 Advanced NLP, 6.438 Algorithms for Inference, or 6.034 AI)
- Solid probability, statistics, and linear algebra
- Comfort reading machine-learning research papers (the course uses papers instead of a textbook)
- Python and general ML programming for the project work; basic familiarity with deep learning is helpful
Instructors
David Sontag & Peter Szolovits
Instructors · MIT OpenCourseWare
Pros & Cons
Pros
- Taught by leading clinical-ML researchers (David Sontag and Peter Szolovits) and built on real datasets (MIMIC critical-care data, IBM MarketScan claims) plus guest lectures from practicing clinicians, giving it strong authority and realism
- Completely free with full lecture videos (all 25), lecture notes, readings, and project guidelines on MIT OpenCourseWare under a CC BY-NC-SA license
- Goes well beyond generic 'AI in healthcare' overviews into rigorous, application-specific modeling: risk stratification, survival analysis, clinical NLP, causal inference, and imaging
- Strong, unusually deep coverage of the failure modes that matter in medicine: bias/fairness, confounding, dataset shift, interpretability, and US regulation
- Paper-driven rather than textbook-driven, which mirrors how the field actually evolves and exposes learners to primary research
Cons
- Advanced and unforgiving for beginners: it assumes a completed ML course and dives straight into clinical complexity, with no foundational ML ramp-up
- It is a Spring 2019 archive, so it predates the LLM/foundation-model era and the most recent clinical-AI and regulatory developments (note MIT now runs an updated version, 6.7930/HST.956)
- No certificate, no graded feedback, no autograder, and no interactive support; you watch recordings and read on your own
- The hands-on problem sets depend on access-restricted clinical datasets (MIMIC and IBM MarketScan) that require separate credentialing, so the full applied experience is hard to reproduce as a solo self-learner
Frequently Asked Questions
Is Machine Learning for Healthcare free?
Yes. All materials (25 lecture videos, lecture notes, readings, project guidelines) are free on MIT OpenCourseWare under a Creative Commons BY-NC-SA license. There is no certificate and no paid tier; this is the open-courseware archive, not the for-credit MIT class or MIT Professional Education's separate paid offering.
Who is Machine Learning for Healthcare for?
ML engineers, data scientists, and graduate students who already understand supervised learning, probability, and (ideally) some deep learning, and who want a rigorous, application-driven understanding of how ML is actually used on clinical data. It is especially valuable for people moving into health-tech or clinical-AI research, and for clinicians or biomedical researchers with a strong quantitative/ML background who want to learn the modeling side and the failure modes (bias, confounding, dataset shift) specific to healthcare.
What will you learn in Machine Learning for Healthcare?
- Why clinical data is uniquely difficult to work with (EHRs, insurance claims, physiological time-series, missingness, and bias) and how this shapes model design
- Risk stratification and survival/disease-progression modeling on electronic health records and claims data
- Clinical natural language processing for extracting structure from unstructured clinical text
- Causal inference for healthcare, including why correlation-vs-causation mistakes are dangerous in clinical settings
What are the prerequisites for Machine Learning for Healthcare?
- A prior machine learning course (MIT lists 6.036/6.862 Intro/Applied ML, 6.867 Machine Learning, 9.520 Statistical Learning Theory, 6.806/6.864 Advanced NLP, 6.438 Algorithms for Inference, or 6.034 AI)
- Solid probability, statistics, and linear algebra
- Comfort reading machine-learning research papers (the course uses papers instead of a textbook)
- Python and general ML programming for the project work; basic familiarity with deep learning is helpful
Is Machine Learning for Healthcare worth it?
Yes, with caveats. It is an outstanding, free, expert-taught domain course on clinical ML, but only worthwhile if you already have ML fundamentals; it is a 2019 archive with no certificate, no graded feedback, and problem sets that depend on access-restricted clinical datasets.
How we reviewed this course
This is an independent editorial assessment by Cursarium, based on MIT OpenCourseWare's published course materials and aggregated public learner feedback (last reviewed 2026-06). We have not independently completed the course. Links to providers are standard references, not paid placements.
Sources
- MIT OpenCourseWare - 6.S897 Machine Learning for Healthcare (Spring 2019)
- Official course site (mlhc19mit.github.io) - full syllabus, prerequisites, and grading
- MIT OCW - Lecture Videos gallery (all 25 lectures)
- MIT News - how the class is taught, datasets used, and student/faculty perspective
- Class Central listing (qualitative student reviews; no aggregate numeric rating)