Student mental health dashboard

01 The problem

Depression at 54.3%, anxiety at 64.8%, among Bangladeshi university students. The data exists. The tool to actually look at it does not.

Public health researchers and university policymakers need answers to specific questions. Are students in certain divisions more at risk? Does financial stress compound anxiety on top of depression? The data sits in survey CSVs that require an Excel degree to interrogate, and the existing analyses are PDFs that answer last year's question, not this week's.

The goal of the project was to ship something that lets a non-engineer ask those questions interactively, with filters, maps, and downloadable reports, without me being in the room. That sounds modest. The cost of building it that way is that almost every interesting design problem ends up below the UI.

02 Approach

I started from the user, not from the data. The user is a mental health researcher who needs to ask questions like "are students in certain divisions more at risk?" or "does financial stress compound anxiety on top of depression?" That user story drove the requirements, which I wrote up properly using FURPS+ and MoSCoW prioritisation rather than as a wish list of features I thought would be cool.

The four Streamlit pages came out of those requirements, not the other way around:

Overview. Headline prevalence, demographic filters, choropleth maps across all eight Bangladesh divisions.
Data view + CRUD. Full create / read / update / delete with validation, so researchers can correct records rather than reload a CSV.
Detailed analysis. Per-variable breakdowns, distributions, cross-tabs.
Comparison explorer. Side-by-side groups, the question-driven view.
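
A minimal sketch of what validation ahead of a create or update could look like. The field names (`division`, `depression`, `anxiety`) and the function name are assumptions made for illustration, not the project's actual schema or code.

```python
from __future__ import annotations

# Assumed English spellings of the eight divisions; the real list would
# live alongside the project's geographic standardisation helpers.
VALID_DIVISIONS = {
    "Barishal", "Chattogram", "Dhaka", "Khulna",
    "Mymensingh", "Rajshahi", "Rangpur", "Sylhet",
}
BINARY_FIELDS = ("depression", "anxiety")  # hypothetical column names


def validate_record(record: dict) -> list[str]:
    """Return human-readable problems; an empty list means the record is valid."""
    errors = []
    if record.get("division") not in VALID_DIVISIONS:
        errors.append(f"unknown division: {record.get('division')!r}")
    for field in BINARY_FIELDS:
        if record.get(field) not in (0, 1):
            errors.append(f"{field} must be 0 or 1")
    return errors
```

Returning a list of problems rather than raising on the first one lets the UI show a researcher everything that needs fixing in a single pass.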

03 Architecture

The project was a chance to build a small analytics tool the way a real one should be built. That meant a four-layer architecture, with each layer testable in isolation and the data source replaceable without touching the UI.

Layer 4 · Presentation (UI). Streamlit pages, filters, Plotly charts, choropleth maps. Knows nothing about SQL.

Layer 3 · Service. Business logic and analysis functions. Pure-Python, fully unit-tested, completely decoupled from Streamlit.

Layer 2 · Data access. SQLite via a Repository pattern. The service layer asks for "all records matching X" and does not know whether it came from SQLite, a Postgres replica, or a stubbed in-memory list.

Layer 1 · Utility. Logging, config, geographic name standardisation, validation helpers. Everyone uses these.

The pay-off is concrete: if I wanted to swap Streamlit for a Flask app or a desktop GUI, the analysis code is identical. If the dataset moved from SQLite to Postgres, the repository implementation changes and nothing above it does. For a coursework project that is over-engineering on paper. In practice it is the only reason the tests below were buildable at all.
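
The boundary between the service and data access layers can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the class names, the `responses` table, and its `division` and `depressed` columns are all hypothetical.

```python
from __future__ import annotations

import sqlite3
from typing import Iterable, Protocol


class RecordRepository(Protocol):
    """Everything the service layer is allowed to know about storage."""

    def find_by_division(self, division: str) -> list[dict]: ...


class InMemoryRepository:
    """Stub for unit tests: no database involved."""

    def __init__(self, records: Iterable[dict]) -> None:
        self._records = list(records)

    def find_by_division(self, division: str) -> list[dict]:
        return [r for r in self._records if r["division"] == division]


class SQLiteRepository:
    """Same interface, backed by a SQLite connection."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.row_factory = sqlite3.Row  # rows become name-addressable

    def find_by_division(self, division: str) -> list[dict]:
        rows = self._conn.execute(
            "SELECT division, depressed FROM responses WHERE division = ?",
            (division,),
        ).fetchall()
        return [dict(row) for row in rows]


def depression_rate(repo: RecordRepository, division: str) -> float | None:
    """Service-layer function: share of matching records flagged as depressed."""
    records = repo.find_by_division(division)
    if not records:
        return None  # "no data" is not the same as a 0% rate
    return sum(r["depressed"] for r in records) / len(records)
```

`depression_rate` never imports `sqlite3` or Streamlit, so a test can hand it an `InMemoryRepository` and a Postgres-backed class could later satisfy the same protocol without the function changing.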

04 Key decisions

Five choices made the project what it is. The first two are unfashionable, which is part of the point.

SQLite, not Postgres

Right tool for the size of the problem. The dataset is small, the tool is for a single researcher's exploratory use, and Postgres would be over-engineering. SQLite ships with Python, requires no infrastructure, and is easier to hand to someone else. The Repository pattern in Layer 2 means the choice is reversible if the scale changes.

4-layer for a solo project

Layered architecture is what makes proper unit tests possible. The service layer is decoupled from Streamlit, so I can test business logic without spinning up the UI. For a coursework project this looks like overkill until you try to write tests for a single-file Streamlit app and discover you cannot.

TDD on a dashboard

Tests before implementation. Unusual for analytics tools. The discipline of writing a failing test first forced me to define what each function should return before writing it. That caught several edge cases (what does the filter return when zero rows match?) that would otherwise have shown up only when a researcher hit them in the UI.
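
As an illustration of the zero-rows case, here is a test-first sketch. The function names and record fields are hypothetical stand-ins, and the choice to return `None` for an empty selection is one reasonable contract, not necessarily the one the project settled on.

```python
# Tests are written first and pin down the contract for an empty selection.
def test_filter_with_no_matches_returns_empty_list():
    records = [{"division": "Dhaka", "anxiety": 1}]
    assert filter_records(records, division="Rangpur") == []


def test_prevalence_of_empty_selection_is_none_not_an_error():
    assert prevalence([], "anxiety") is None


# Implementation follows, doing only what the tests above require.
def filter_records(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in criteria.items())
    ]


def prevalence(records, field):
    """Share of records where `field` is truthy; None when there is nothing to count."""
    if not records:
        return None  # avoids a ZeroDivisionError surfacing in the UI
    return sum(1 for r in records if r[field]) / len(records)
```

Under a runner such as pytest the two `test_` functions would be collected automatically; the point is that the empty case is decided on paper before any chart tries to render it.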

Drop missing outcomes

Imputing the primary outcome introduces bias into every downstream analysis. Missing values on the depression variable were dropped, not imputed. Binary indicators (anxiety, panic attacks, family history) were imputed with mode, where the bias cost is far lower. Different policies for different variables, with a written rationale, not a blanket "fill all NaNs."
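
The per-variable policy can be sketched in plain Python as below. Column names are assumptions, and the real pipeline may well use pandas instead, where the equivalent would typically be `dropna(subset=[...])` followed by `fillna` with each column's mode.

```python
from collections import Counter

OUTCOME = "depression"  # hypothetical column names throughout
BINARY_INDICATORS = ("anxiety", "panic_attacks", "family_history")


def apply_missing_policy(rows):
    """Drop rows with a missing outcome, then mode-impute the binary indicators."""
    # Copy each kept row so the caller's data is left untouched.
    kept = [dict(r) for r in rows if r.get(OUTCOME) is not None]
    for col in BINARY_INDICATORS:
        observed = [r[col] for r in kept if r.get(col) is not None]
        if not observed:
            continue  # no observed values to take a mode from; leave the gaps
        mode = Counter(observed).most_common(1)[0][0]
        for r in kept:
            if r.get(col) is None:
                r[col] = mode
    return kept
```

In this sketch the mode is computed after the outcome rows are dropped, so imputed values reflect only the records that actually enter the analysis.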

Geographic standardisation

Bengali / English division names had to be reconciled before any choropleth would render. Boring data-pipeline work that the case study calls out, because forty per cent of "the data scientist's job" is reconciliation like this.

05 Results

What shipped	Status
Layered architecture, 4 tiers	Implemented, each layer tested independently
Test suite	191 tests across 12 modules, all passing
Streamlit dashboard	4 pages: Overview, Data view + CRUD, Detailed analysis, Comparison explorer
Choropleth maps	All 8 Bangladesh divisions
CRUD with validation	Create, read, update, delete via the data access layer
PDF export	Researchers can download the filtered view as a report
CSV ingestion	Pipeline can ingest a new file without code changes

Fig. 1. Overview page. Headline prevalence at the top, demographic filters on the left sidebar, choropleth across the eight Bangladesh divisions in the centre, and year-of-study breakdown on the right. PDF export available from this view.

Fig. 2. Depression prevalence choropleth across Bangladesh's eight divisions. Sylhet (63%) and Dhaka (61%) show the highest rates; Rangpur (46%) the lowest. Bengali–English division name standardisation in the utility layer was the prerequisite that made this render correctly.

06 What I'd do differently

Two things would make this version better in obvious ways:

The geographic matching is fragile. It works for this dataset's specific division names but would break on a different dataset without manual mapping adjustments. A proper lookup table keyed to a canonical geographic identifier like ISO 3166-2 would make the choropleth robust to any source CSV using any name variant.
Deploy to Streamlit Cloud from day one. A local-only tool, however well built, is far less accessible to reviewers and collaborators than a live link. Treating "deployed" as a first-day requirement rather than a last-week task would have surfaced configuration issues earlier and made the project visible to people who needed to see it.
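
The lookup table suggested in the first point could look something like this. It is a sketch, not existing project code: the codes follow ISO 3166-2:BD as I understand it and should be checked against the current standard, and the alias list is deliberately incomplete and illustrative.

```python
import unicodedata

# ISO 3166-2:BD division codes mapped to one display name each.
CANONICAL = {
    "BD-A": "Barishal",
    "BD-B": "Chattogram",
    "BD-C": "Dhaka",
    "BD-D": "Khulna",
    "BD-E": "Rajshahi",
    "BD-F": "Rangpur",
    "BD-G": "Sylhet",
    "BD-H": "Mymensingh",
}

# Illustrative aliases only; a real table would cover every variant
# seen in source CSVs, including all eight Bengali spellings.
ALIASES = {
    "barisal": "BD-A",
    "chittagong": "BD-B",
    "ঢাকা": "BD-C",
    "সিলেট": "BD-G",
}


def _key(name: str) -> str:
    """Normalise Unicode form, surrounding whitespace, and case before lookup."""
    return unicodedata.normalize("NFC", name).strip().casefold()


_LOOKUP = {_key(display): code for code, display in CANONICAL.items()}
_LOOKUP.update({_key(alias): code for alias, code in ALIASES.items()})


def division_code(name: str) -> str:
    """Map any known spelling of a division to its ISO 3166-2 code."""
    try:
        return _LOOKUP[_key(name)]
    except KeyError:
        raise ValueError(f"Unrecognised division name: {name!r}") from None
```

Failing loudly on an unknown name is a deliberate choice here: a new dataset with an unseen spelling would stop at ingestion rather than silently produce a blank region on the map.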

The broader thing I am taking forward is that the discipline mattered more than the specific architecture. Layered design and TDD on a coursework project look like ceremony, but they paid back the time many times over once I started changing things. Most of the bugs I caught were caught by a test I wrote ten minutes earlier, not a stack trace I would have read at midnight.
