Case study/2025/solo · streamlit · python

Student mental health, built like real software.

A Streamlit analytics tool for student mental health data, with a four-tier layered architecture and 191 tests written before most of the features. The interesting part is not what it shows on screen. It is what sits underneath.

Role
Solo. Architecture, TDD, data pipeline, UI, write-up.
Module
WM9QF Programming for AI · Warwick, 2025
Stack
python · streamlit · sqlite · pytest · plotly
191 Tests · written before features
12 Modules tested independently
4 Architectural layers
8 Bangladesh divisions · choropleth

01 The problem

Depression at 54.3%, anxiety at 64.8%, among Bangladeshi university students. The data exists. The tool to actually look at it does not.

Public health researchers and university policymakers need answers to specific questions. Are students in certain divisions more at risk? Does financial stress compound anxiety on top of depression? The data sits in survey CSVs that require an Excel degree to interrogate, and the existing analyses are PDFs that answer last year's question, not this week's.

The goal of the project was to ship something that lets a non-engineer ask those questions interactively, with filters, maps, and downloadable reports, without me being in the room. That sounds modest. The cost of building it that way is that almost every interesting design problem ends up below the UI.

02 Approach

I started from the user, not from the data. The user is a mental health researcher who needs to ask questions like "are students in certain divisions more at risk" or "does financial stress compound anxiety on top of depression." That user story drove the requirements, written up properly using FURPS+ and MoSCoW prioritisation rather than a wish list of features I thought would be cool.

The four Streamlit pages came out of those requirements, not the other way around:

  • Overview. Headline prevalence, demographic filters, choropleth maps across all eight Bangladesh divisions.
  • Data view + CRUD. Full create / read / update / delete with validation, so researchers can correct records rather than reload a CSV.
  • Detailed analysis. Per-variable breakdowns, distributions, cross-tabs.
  • Comparison explorer. Side-by-side groups, the question-driven view.

03 Architecture

The whole point of the project was to build a small analytics tool the way a real one should be built. That meant a four-tier layered architecture, with each layer testable in isolation and the data source replaceable without touching the UI.

Layer 4 · Presentation (UI). Streamlit pages, filters, plotly charts, choropleth maps. Knows nothing about SQL.
Layer 3 · Service. Business logic and analysis functions. Pure-Python, fully unit-tested, completely decoupled from Streamlit.
Layer 2 · Data access. SQLite via a Repository pattern. The service layer asks for "all records matching X" and does not know whether it came from SQLite, a Postgres replica, or a stubbed in-memory list.
Layer 1 · Utility. Logging, config, geographic name standardisation, validation helpers. Everyone uses these.

The pay-off is concrete: if I wanted to swap Streamlit for a Flask app or a desktop GUI, the analysis code is identical. If the dataset moved from SQLite to Postgres, the repository implementation changes and nothing above it does. For a coursework project that is over-engineering on paper. In practice it is the only reason the tests below were buildable at all.
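A minimal sketch of that seam between Layer 3 and Layer 2, not the project's actual code. The names `RecordRepository`, `InMemoryRepository`, `SQLiteRepository`, `depression_rate`, the `students` table, and the three sample rows are all assumptions made up for illustration. It shows one service function running unchanged against a stubbed list and against SQLite, including the zero-rows case.

```python
from __future__ import annotations

import sqlite3
from typing import Iterable, Protocol


class RecordRepository(Protocol):
    """Everything the service layer is allowed to know about storage."""

    def find(self, division: str | None = None) -> list[dict]: ...


class InMemoryRepository:
    """Stub for unit tests: a plain list, no database."""

    def __init__(self, records: Iterable[dict]) -> None:
        self._records = list(records)

    def find(self, division: str | None = None) -> list[dict]:
        if division is None:
            return list(self._records)
        return [r for r in self._records if r["division"] == division]


class SQLiteRepository:
    """Same interface, backed by SQLite (hypothetical `students` table)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.row_factory = sqlite3.Row

    def find(self, division: str | None = None) -> list[dict]:
        sql = "SELECT division, depression FROM students"
        params: tuple = ()
        if division is not None:
            sql += " WHERE division = ?"
            params = (division,)
        return [dict(row) for row in self._conn.execute(sql, params)]


def depression_rate(repo: RecordRepository, division: str | None = None) -> float | None:
    """Service-layer function: share of matching records flagged as depressed.

    Returns None when no rows match, so the UI can show "no data"
    instead of hitting a division by zero.
    """
    rows = repo.find(division)
    if not rows:
        return None
    return sum(r["depression"] for r in rows) / len(rows)


# Made-up rows, only to exercise both repositories.
records = [
    {"division": "Dhaka", "depression": 1},
    {"division": "Dhaka", "depression": 0},
    {"division": "Sylhet", "depression": 1},
]
stub = InMemoryRepository(records)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (division TEXT, depression INTEGER)")
conn.executemany("INSERT INTO students VALUES (:division, :depression)", records)
db = SQLiteRepository(conn)

print(depression_rate(stub, "Dhaka"), depression_rate(db, "Dhaka"))
print(depression_rate(stub, "Rangpur"))  # no matching rows
```

Because `depression_rate` only depends on the `find` method, a test can pass the stub and never open a database, which is the property the layering is meant to buy.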

04 Key decisions

Five choices made the project what it is. The first two are unfashionable, which is part of the point.

SQLite, not Postgres
Right tool for the size of the problem. The dataset is small, the tool is for a single researcher's exploratory use, and Postgres would be over-engineering. SQLite ships with Python, requires no infrastructure, and is easier to hand to someone else. The Repository pattern in Layer 2 means the choice is reversible if the scale changes.
4-layer for a solo project
Layered architecture is what makes proper unit tests possible. The service layer is decoupled from Streamlit, so I can test business logic without spinning up the UI. For a coursework project this looks like overkill until you try to write tests for a single-file Streamlit app and discover you cannot.
TDD on a dashboard
Tests before implementation. Unusual for analytics tools. The discipline of writing a failing test first forced me to define what each function should return before writing it. That caught several edge cases (what does the filter return when zero rows match?) that would otherwise have shown up only when a researcher hit them in the UI.
Drop missing outcomes
Imputing the primary outcome introduces bias into every downstream analysis. Missing values on the depression variable were dropped, not imputed. Binary indicators (anxiety, panic attacks, family history) were imputed with mode, where the bias cost is far lower. Different policies for different variables, with a written rationale, not a blanket "fill all NaNs."
Geographic standardisation
Bengali / English division names had to be reconciled before any choropleth would render. Boring data-pipeline work, called out here because a large share of "the data scientist's job" is reconciliation like this.
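The "drop missing outcomes" policy above can be sketched in a few lines. This is an illustration under assumptions, not the project's pipeline: the column names (`depression`, `anxiety`, `panic_attack`, `family_history`) and the function `apply_missing_policy` are hypothetical, missing values are represented as `None`, and the standard library is used in place of whatever dataframe tooling the real code relied on.

```python
from collections import Counter

OUTCOME = "depression"  # primary outcome: never imputed
BINARY_INDICATORS = ("anxiety", "panic_attack", "family_history")


def apply_missing_policy(records):
    """Drop rows missing the outcome, then mode-impute binary indicators."""
    # 1. Drop: a row without the primary outcome is removed, not filled in.
    #    Rows are copied so the caller's data is left untouched.
    kept = [dict(r) for r in records if r.get(OUTCOME) is not None]

    # 2. Impute: fill each binary indicator with its most common observed
    #    value, computed on the rows that survived step 1.
    for column in BINARY_INDICATORS:
        observed = [r[column] for r in kept if r.get(column) is not None]
        if not observed:
            continue  # no observed values to take a mode from; leave gaps
        mode = Counter(observed).most_common(1)[0][0]
        for r in kept:
            if r.get(column) is None:
                r[column] = mode
    return kept


# Made-up rows to show both branches of the policy.
rows = [
    {"depression": 1, "anxiety": 1, "panic_attack": 0, "family_history": None},
    {"depression": None, "anxiety": 0, "panic_attack": 0, "family_history": 0},
    {"depression": 0, "anxiety": None, "panic_attack": 0, "family_history": 1},
    {"depression": 1, "anxiety": 1, "panic_attack": None, "family_history": 1},
]
cleaned = apply_missing_policy(rows)
print(len(rows), "->", len(cleaned))
```

Keeping the outcome column and the list of imputable columns as named constants puts the per-variable policy in one visible place, which is the opposite of a blanket fill.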

05 Results

What shipped · Status
Layered architecture, 4 tiers · Implemented, each layer tested independently
Test suite · 191 tests across 12 modules, all passing
Streamlit dashboard · 4 pages: Overview, Data View + CRUD, Detailed, Comparison
Choropleth maps · All 8 Bangladesh divisions
CRUD with validation · Create, read, update, delete via the data access layer
PDF export · Researchers can download the filtered view as a report
CSV ingestion · Pipeline can ingest a new file without code changes
[Figure: Student Mental Health Dashboard, Overview page. Sidebar filters: Division, Gender, Year of study, Condition (Depression, Anxiety). Headline figures: depression rate 54.3%, anxiety rate 64.8% (n = 101 students), average CGPA of affected students 2.5. Depression prevalence by division: Dhaka 61%, Chittagong 58%, Rajshahi 49%, Sylhet 63%, Khulna 51%, Barisal 55%, Rangpur 46%, Mymensingh 53%. By year of study: Year 1 60%, Year 2 73%, Year 3 49%, Year 4 39%.]
Fig. 1. Overview page. Headline prevalence at the top, demographic filters on the left sidebar, choropleth across the eight Bangladesh divisions in the centre, and year-of-study breakdown on the right. PDF export available from this view.
[Figure: Choropleth of depression prevalence by division, Bangladesh. Colour scale from 46% (Rangpur) to 63% (Sylhet). n = 101 students. Source: Kaggle BMHDS dataset, Bangladesh, 2019.]
Fig. 2. Depression prevalence choropleth across Bangladesh's eight divisions. Sylhet (63%) and Dhaka (61%) show the highest rates; Rangpur (46%) the lowest. Bengali–English division name standardisation in the utility layer was the prerequisite that made this render correctly.

06 What I'd do differently

Two things would make this version better in obvious ways:

  • The geographic matching is fragile. It works for this dataset's specific division names but would break on a different dataset without manual mapping adjustments. A proper lookup table keyed to a canonical geographic identifier like ISO 3166-2 would make the choropleth robust to any source CSV using any name variant.
  • Deploy to Streamlit Cloud from day one. A local-only tool, however well built, is far less accessible to reviewers and collaborators than a live link. Treating "deployed" as a first-day requirement rather than a last-week task would have surfaced configuration issues earlier and made the project visible to people who needed to see it.
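The lookup-table idea in the first point could look roughly like the sketch below. It is an assumption about a future design, not existing project code: `ALIASES`, `LOOKUP`, and `division_code` are hypothetical names, the alias lists are deliberately incomplete, and the ISO 3166-2:BD codes are given as I understand them and should be checked against the standard before use.

```python
import unicodedata

# Canonical ISO 3166-2 code -> known spellings of the division name.
# Alias lists are illustrative; a real table would carry every variant
# seen in source CSVs, including the full set of Bengali names.
ALIASES = {
    "BD-A": ("Barishal", "Barisal"),
    "BD-B": ("Chattogram", "Chittagong"),
    "BD-C": ("Dhaka", "Dacca", "ঢাকা"),
    "BD-D": ("Khulna",),
    "BD-E": ("Rajshahi",),
    "BD-F": ("Rangpur",),
    "BD-G": ("Sylhet", "সিলেট"),
    "BD-H": ("Mymensingh",),
}


def _normalise(name: str) -> str:
    """Unify Unicode form, surrounding whitespace, and letter case."""
    return unicodedata.normalize("NFKC", name).strip().casefold()


# Flattened alias -> code table, built once at import time.
LOOKUP = {
    _normalise(alias): code
    for code, aliases in ALIASES.items()
    for alias in aliases
}


def division_code(name: str) -> str:
    """Map any known spelling to its canonical code, or fail loudly."""
    try:
        return LOOKUP[_normalise(name)]
    except KeyError:
        raise ValueError(f"Unknown division name: {name!r}") from None


print(division_code("  chittagong "), division_code("Chattogram"))
```

Raising on an unknown name is a design choice: a new CSV with an unmapped spelling fails at ingestion, where it is cheap to fix, rather than rendering as a silently blank region on the map.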

The broader thing I am taking forward is that the discipline mattered more than the specific architecture. Layered design and TDD on a coursework project look like ceremony, but they paid back the time many times over once I started changing things. Most of the bugs I caught were caught by a test I wrote ten minutes earlier, not a stack trace I would have read at midnight.