I spent a Sunday last month opening every vertical's schema.sql and reading them back to back. 220 schemas. Mostly SQLite, a few Postgres for the hosted tiers. Every one of them written by me or Jess, starting from the same template, diverging into the shape of the specific business domain.
I was expecting to find a lot of repetition. I did. I was not expecting to find the specific anti-patterns I found, which showed up in predictable places — usually where the domain had a feature that pushed against the template's defaults.
Here's the tour.
The Universal Columns
Every table in every vertical has four columns:
id INTEGER PRIMARY KEY AUTOINCREMENT,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP,
deleted_at TIMESTAMP -- soft delete
The deleted_at soft-delete column is the most-debated default. A bunch of posts on the internet tell you soft-deletes are an anti-pattern (they complicate every query). A bunch more tell you they're essential (they prevent accidental data loss). After 220 schemas, my position is: soft-delete by default, hard-delete for explicitly-ephemeral data.
The reason: we've had to recover "deleted" data from customer panic-clicks at least a dozen times across the portfolio. A soft-deleted row is a 30-second recovery. A hard-deleted row is a backup restore, which is longer and sometimes impossible. The query-complexity cost of "WHERE deleted_at IS NULL" is worth it.
The Audit Column Pattern
On tables where changes matter (customer records, billable items, anything touching money), we add:
created_by TEXT, -- user or system that made it
updated_by TEXT,
change_reason TEXT
The change_reason column is opt-in per-app but high value when present. It turns "who broke this record?" into "who broke this record and what did they say at the time?" In a 2-person operation where Jess and I do everything, it's saved me at least a week of cumulative confusion.
When JSON Is a Feature
Traditional relational modeling says: normalize everything into proper tables. In practice, we have a bunch of places where JSON columns are the right answer:
- Per-tenant configuration. Every tenant has a slightly different settings dict. Schemafying every option is torture.
settings_jsonit is. - Attributes that vary by record type. In inkbook, tattoo appointments have style-specific fields (color profile for realism, line weight for fine-line) that don't exist for other styles.
style_attributes_json. - Audit trails. Storing the "before" and "after" of a changed record in a generic way.
before_json/after_json.
SQLite's JSON functions are good enough to query these (json_extract(settings_json, '$.feature_x')). Not every field needs to be a column.
When JSON Is a Cop-Out
The anti-pattern: modeling core business entities as JSON blobs because "the schema is still evolving." I did this early in the template's life. In coachboard, I had a session_data_json field that contained the notes, action items, goals, timer data, all of it. Three months in, every interesting query required json_extract and nothing was indexable sensibly.
Rewrote it into proper tables: session, session_note, session_action_item, session_goal_reference. Queries became readable. Indexes became useful. The "schema is still evolving" defense was wrong — the schema was stable, I was just avoiding the typing work.
Rule I now follow: if a field gets queried, it gets a column. If it gets displayed, it can probably be JSON. When in doubt, structured.
The Foreign Key Question
SQLite has foreign keys, but they're off by default. They require PRAGMA foreign_keys=ON per-connection. I enable them everywhere.
About a third of our apps have disabled foreign keys for specific relationships because cascading deletes were doing the wrong thing. A family in churchadmin can have members who've moved to other families; deleting the original family shouldn't cascade-delete those members. So the member.original_family_id column has no FK constraint; it's a soft reference.
This is fine. FKs are guardrails, not commandments. Turn them off where they produce worse behavior. Document why.
Naming Conventions
Every vertical uses the same naming:
- Table names are singular:
customer, notcustomers. - Join tables use both names alphabetically:
customer_tag, nottag_customer. - Foreign key columns end in
_id:customer_id, notcust. - Boolean columns start with
is_orhas_:is_active,has_consent. - Timestamp columns end in
_at:created_at,closed_at.
The rule: if a human reads the column name, they know what it is without looking at the schema. This seems obvious. It isn't — I've seen plenty of schemas in the wild where c is a column name.
Indexes That Actually Matter
SQLite will generate decent query plans without many indexes for small tables. But there are three indexes we always add:
- Foreign key columns. SQLite doesn't auto-index FKs. Not indexing them makes JOIN performance degrade quadratically.
- Any column used in
WHEREfor the app's main list views. - Any column used in
ORDER BYfor the app's main list views.
Beyond those three, premature indexing is real. An index costs disk + write overhead. Don't add one until the query is actually slow.
The "Enum" Problem
SQLite doesn't have native enums. You can use a CHECK constraint:
status TEXT NOT NULL CHECK(status IN ('open', 'closed', 'cancelled'))
Or a lookup table. Or just an unchecked TEXT column with application-level validation. Across the portfolio: we use CHECK constraints for status fields and other small enums. Lookup tables for anything with more than ~8 values or that might grow. No unchecked TEXT ever — bugs get into enums quickly without enforcement.
Migration Story
We use a simple home-grown migration system. Each vertical has a migrations/ directory with numbered SQL files. A schema_migrations table tracks which have run. On app startup, unrun migrations are applied in order.
No Alembic, no Django migrations, no Flyway. Just SQL files and a tiny runner. 220 apps, zero migration tooling complaints in 18 months.
# migrations/003_add_recall_tracker.sql
ALTER TABLE patient ADD COLUMN recall_due_at DATE;
CREATE INDEX idx_patient_recall_due ON patient(recall_due_at) WHERE recall_due_at IS NOT NULL;
The best schema is the one you'd write if you were explaining the business to a new hire with SQL. Every table is a concept. Every column earns its keep.
Related
Flask + SQLite at scale. The template walkthrough. For the spotlights: meetingmind has the cleanest schema; breweryops has the most complex.