01 The problem
Depression at 54.3%, anxiety at 64.8%, among Bangladeshi university students. The data exists. The tool to actually look at it does not.
Public health researchers and university policymakers need answers to specific questions. Are students in certain divisions more at risk? Does financial stress compound anxiety on top of depression? The data sits in survey CSVs that require an Excel degree to interrogate, and the existing analyses are PDFs that answer last year's question, not this week's.
The goal of the project was to ship something that lets a non-engineer ask those questions interactively, with filters, maps, and downloadable reports, without me being in the room. That sounds modest. The cost of building it that way is that almost every interesting design problem ends up below the UI.
02 Approach
I started from the user, not from the data. The user is a mental health researcher who needs to ask questions like "are students in certain divisions more at risk?" or "does financial stress compound anxiety on top of depression?" That user story drove the requirements, which I wrote up properly using FURPS+ and MoSCoW prioritisation rather than as a wish list of features I thought would be cool.
The four Streamlit pages came out of those requirements, not the other way around:
- Overview. Headline prevalence, demographic filters, choropleth maps across all eight Bangladesh divisions.
- Data view + CRUD. Full create / read / update / delete with validation, so researchers can correct records rather than reload a CSV.
- Detailed analysis. Per-variable breakdowns, distributions, cross-tabs.
- Comparison explorer. Side-by-side groups, the question-driven view.
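The "CRUD with validation" requirement can be illustrated with a minimal sketch. Everything named here is hypothetical: the `StudentRecord` fields, the division spellings, and the age and score bounds are placeholders, since the write-up does not give the survey schema or the screening instrument. The idea is only that a record is checked before it reaches storage, and that the researcher gets every problem back at once rather than one at a time.

```python
from dataclasses import dataclass

# Assumed spellings of the eight divisions; real sources vary.
DIVISIONS = frozenset({
    "Barishal", "Chattogram", "Dhaka", "Khulna",
    "Mymensingh", "Rajshahi", "Rangpur", "Sylhet",
})

# Placeholder bounds: the write-up does not state the real ranges.
MIN_AGE, MAX_AGE = 16, 60
MAX_SCORE = 27


@dataclass(frozen=True)
class StudentRecord:
    """Hypothetical shape of one survey response."""
    division: str
    age: int
    depression_score: int


def validate(record: StudentRecord) -> list:
    """Return every problem found; an empty list means the record is valid."""
    errors = []
    if record.division not in DIVISIONS:
        errors.append(f"unknown division: {record.division!r}")
    if not MIN_AGE <= record.age <= MAX_AGE:
        errors.append(f"age out of range: {record.age}")
    if not 0 <= record.depression_score <= MAX_SCORE:
        errors.append(f"score out of range: {record.depression_score}")
    return errors


def save(record: StudentRecord, store: list) -> None:
    """Append the record to the store only if it passes validation."""
    errors = validate(record)
    if errors:
        raise ValueError("; ".join(errors))
    store.append(record)
```

In a Streamlit form, the list returned by `validate` could be shown next to the inputs instead of being raised as an exception.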
03 Architecture
I treated the project as a chance to build a small analytics tool the way a real one should be built. That meant a four-tier layered architecture, with each layer testable in isolation and the data source replaceable without touching the UI.
The pay-off is concrete: if I wanted to swap Streamlit for a Flask app or a desktop GUI, the analysis code is identical. If the dataset moved from SQLite to Postgres, the repository implementation changes and nothing above it does. For a coursework project that is over-engineering on paper. In practice it is the only reason the tests below were buildable at all.
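A minimal sketch of that separation, showing only two of the layers and using names that are assumptions rather than the project's actual code: a hypothetical `RecordRepository` interface, an in-memory and a SQLite implementation, and an analysis function that depends only on the interface. The `responses` table and its columns are likewise invented for illustration.

```python
import sqlite3
from typing import Protocol


class RecordRepository(Protocol):
    """Anything that can return survey rows as dicts."""
    def all(self) -> list: ...


class InMemoryRepository:
    """Test double: serves rows from a plain list."""
    def __init__(self, rows):
        self._rows = list(rows)

    def all(self) -> list:
        return list(self._rows)


class SqliteRepository:
    """Reads the same row shape from a SQLite table (hypothetical schema)."""
    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn

    def all(self) -> list:
        cursor = self._conn.execute("SELECT division, depressed FROM responses")
        return [{"division": d, "depressed": bool(flag)} for d, flag in cursor]


def prevalence_by_division(repo: RecordRepository) -> dict:
    """Analysis layer: share of positive responses per division."""
    totals, positives = {}, {}
    for row in repo.all():
        division = row["division"]
        totals[division] = totals.get(division, 0) + 1
        positives[division] = positives.get(division, 0) + int(row["depressed"])
    return {d: positives[d] / totals[d] for d in totals}
```

Because `prevalence_by_division` never imports `sqlite3` or Streamlit, a unit test can pass it an `InMemoryRepository`, and a move to Postgres would mean one more class with the same `all` method.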
04 Key decisions
Five choices made the project what it is. The first two are unfashionable, which is part of the point.
05 Results
| What shipped | Status |
|---|---|
| Layered architecture, 4 tiers | Implemented, each layer tested independently |
| Test suite | 191 tests across 12 modules, all passing |
| Streamlit dashboard | 4 pages: Overview, Data view + CRUD, Detailed analysis, Comparison explorer |
| Choropleth maps | All 8 Bangladesh divisions |
| CRUD with validation | Create, read, update, delete via the data access layer |
| PDF export | Researchers can download the filtered view as a report |
| CSV ingestion | Pipeline can ingest a new file without code changes |
06 What I'd do differently
Two things would make this version better in obvious ways:
- The geographic matching is fragile. It works for this dataset's specific division names but would break on a different dataset without manual mapping adjustments. A proper lookup table keyed to a canonical geographic identifier like ISO 3166-2 would make the choropleth robust to any source CSV using any name variant.
- Deploy to Streamlit Cloud from day one. A local-only tool, however well built, is far less accessible to reviewers and collaborators than a live link. Treating "deployed" as a first-day requirement rather than a last-week task would have surfaced configuration issues earlier and made the project visible to people who needed to see it.
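For the first point, a lookup keyed to ISO 3166-2 might look like the sketch below. The function and table names are hypothetical, the alias list is illustrative and not exhaustive, and the `BD-A` to `BD-H` codes are given as I understand the ISO 3166-2:BD division listing, so they are worth checking against the standard before use. A lookup like this only covers the variants it has been told about; anything else returns `None` so the caller can flag it rather than silently dropping a division from the map.

```python
import unicodedata
from typing import Optional

# Known spellings mapped to ISO 3166-2:BD division codes (verify before use).
ISO_BY_ALIAS = {
    "barisal": "BD-A", "barishal": "BD-A",
    "chittagong": "BD-B", "chattogram": "BD-B",
    "dhaka": "BD-C",
    "khulna": "BD-D",
    "rajshahi": "BD-E",
    "rangpur": "BD-F",
    "sylhet": "BD-G",
    "mymensingh": "BD-H",
}


def _normalise(name: str) -> str:
    """Case-fold, collapse whitespace, and drop a trailing 'division'."""
    text = unicodedata.normalize("NFKC", name).casefold()
    text = " ".join(text.split())
    return text.removesuffix(" division")  # Python 3.9+


def division_code(name: str) -> Optional[str]:
    """Return the ISO code for a division name, or None if unrecognised."""
    return ISO_BY_ALIAS.get(_normalise(name))
```

The choropleth's GeoJSON features would then be joined on the code rather than on the display name, so a new source CSV needs at most a few new alias entries.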
The broader thing I am taking forward is that the discipline mattered more than the specific architecture. Layered design and TDD on a coursework project look like ceremony, but they paid back the time many times over once I started changing things. Most of the bugs I caught were caught by a test I wrote ten minutes earlier, not a stack trace I would have read at midnight.