Summary
A volunteer team with code on GitHub needed a low-friction, collaborative database solution requiring minimal training. The instinct toward a JSON file in the repository or a raw command dashboard is common but violates operational safety and data integrity principles. The core conflict is balancing ease of access with risk management and scalability. The optimal path is managed cloud services with web-based interfaces to avoid client installation overhead.
Root Cause
The root cause of the dilemma is the intersection of technical immaturity (volunteers with limited training) and operational necessity (shared data state). The specific technical risks driving the postmortem are:
- Git as a Database: Treating Git as a transactional store (JSON files) leads to merge conflicts, race conditions, and hard-to-track history. Git tracks code, not data state.
- Direct Access Exposure: Providing raw “command sending” capabilities exposes the infrastructure to SQL injection, accidental DROP commands, or schema corruption without guardrails.
- Friction of Installation: Requiring volunteers to install database clients (e.g., DBeaver, TablePlus, psql) creates a barrier to entry and a support nightmare (OS-specific issues, credential management).
- Lack of Guardrails: There is no separation between the data layer (the database) and the interface layer (the UI), meaning there is no validation logic before data is written.
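The missing guardrail can be as thin as a validation layer that sits between the interface and the store. A minimal Python sketch of the idea, where the record fields and the `save_record` helper are hypothetical examples, not part of any tool named above:

```python
# Guardrail sketch: validate a record before it ever reaches the data layer.
# The "id"/"name" fields are hypothetical examples of a record shape.
def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is safe to write."""
    errors = []
    if not isinstance(record.get("id"), int):
        errors.append("id must be an integer")
    if not isinstance(record.get("name"), str) or not record.get("name", "").strip():
        errors.append("name must be a non-empty string")
    return errors

def save_record(record: dict, store: list) -> None:
    """Write only after validation passes; otherwise fail loudly instead of silently."""
    errors = validate_record(record)
    if errors:
        raise ValueError("; ".join(errors))
    store.append(record)
```

The point is not the checks themselves but their placement: a UI-backed tool runs logic like this before every write, whereas a raw file or raw SQL console runs nothing at all.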
Why This Happens in Real Systems
This scenario is a classic example of the “Script Kiddie to Production” gap, often seen in open-source or non-profit orgs.
- The “Good Enough” Fallacy: Leaders often believe a text file or simple spreadsheet is sufficient because the current dataset is small. This is a tactical solution that becomes a strategic liability as data grows.
- Volunteer Turnover: In volunteer-based organizations, documentation is often ignored. If a system requires learning a query language (SQL) or a specific client tool, knowledge transfer fails the moment a contributor leaves.
- Tooling Confusion: Developers often conflate Source Control (GitHub) with Data Management. Using GitHub for data usually happens because there is no designated “Single Source of Truth” for the data layer.
Real-World Impact
Implementing the wrong solution (files in Git) creates specific downstream failures:
- Data Loss via Overwrite: If two volunteers edit the same JSON file, the second one to push overwrites the first one’s work silently.
- Security Leaks: Storing sensitive data (PII, keys) in a JSON file within a repository, even a private one, creates a permanent history of that secret. If the repo is ever made public, the secret is leaked forever.
- Velocity Bottlenecks: Volunteers cannot work concurrently. They must wait for the “master” file to be unlocked, effectively serializing database writes.
- No Audit Trail: It is impossible to know who changed a specific value in a JSON file and when, whereas proper databases log every transaction.
Example or Code
This is a pseudo-code comparison of why JSON files fail where databases succeed in a collaborative environment.
Scenario: Two volunteers try to add a user record at the exact same time.
// VOLUNTEER A pushes this file
{
  "users": [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"}
  ]
}
// VOLUNTEER B pushes this file (unaware of A's change)
{
  "users": [
    {"id": 1, "name": "Alice"},
    {"id": 3, "name": "Charlie"}
  ]
}
Result: When merged, Git creates a conflict. If resolved incorrectly, Bob’s record is silently lost, or the file becomes corrupted. A database serializes the two inserts inside ACID transactions: both records are committed with their own ids, and a genuinely conflicting write (e.g., a duplicate primary key) errors out safely without corrupting the rest of the dataset.
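The same two inserts against even a file-backed database show the difference. A minimal sketch using SQLite (chosen only because it ships with Python; the table mirrors the JSON example above):

```python
import sqlite3

# Two volunteers' inserts against a real database: no merge step, no silent
# overwrite. The PRIMARY KEY constraint is the guardrail.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Alice"))  # pre-existing row
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (2, "Bob"))      # Volunteer A
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (3, "Charlie"))  # Volunteer B
conn.commit()

# A genuinely conflicting write (reusing id 2) fails atomically instead of
# corrupting the dataset.
try:
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (2, "Mallory"))
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)

print(conn.execute("SELECT id, name FROM users ORDER BY id").fetchall())
```

Both volunteers' rows survive, and the bad write is rejected as a unit; nobody has to hand-resolve a merge.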
How Senior Engineers Fix It
Senior engineers solve this by prioritizing User Experience (UX) and Managed Services over raw power. The goal is to abstract the complexity away from the volunteers.
- Choose the Right Tool (Low Training):
- Airtable (Freemium): This is the industry standard for “non-database users.” It looks like a spreadsheet but behaves like a relational database. It requires zero installation (runs in the browser) and has built-in permissioning.
- Supabase (Postgres) + Supabase Studio: Offers a free tier with a web-based dashboard. It is strictly a database but provides a GUI for data entry.
- Google Firebase (Firestore): Good if the website needs to read this data dynamically. Has a web console.
- Retool: If you must stick to a SQL backend, Retool builds internal dashboards quickly, masking the raw SQL interface.
- Implement “Least Privilege” Access:
- Give volunteers Editor access to the data UI, not Admin access to the infrastructure.
- Use Row Level Security (RLS) if using PostgreSQL/Firebase to ensure volunteers can only touch the rows they are authorized to touch.
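Real RLS is configured inside the database (e.g., as PostgreSQL policies), but the concept can be sketched in application code: every query is forcibly scoped to the rows the requesting volunteer owns. A hypothetical Python analogy using SQLite (the `owner` column and volunteer names are illustrative, not any product’s API):

```python
import sqlite3

# App-layer analogy of row-level security: reads are always filtered by the
# requesting volunteer's identity. Real RLS enforces this inside PostgreSQL
# so it cannot be bypassed by the client at all.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, owner TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO events (owner, title) VALUES (?, ?)",
    [("alice", "Bake sale"), ("bob", "Fun run"), ("alice", "Raffle")],
)

def rows_for(volunteer: str) -> list[tuple]:
    """Volunteers see only rows they own -- the filter is not optional."""
    return conn.execute(
        "SELECT id, title FROM events WHERE owner = ? ORDER BY id", (volunteer,)
    ).fetchall()
```

The design point: authorization lives in one enforced place, not in each volunteer’s discipline about which rows to touch.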
- Automate the “Pipeline”:
- If using GitHub, use GitHub Actions to validate data before it is merged (e.g., check for valid JSON schema), rather than treating the JSON file as the database itself.
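That validation step can be a short script the Action runs on every pull request, failing the merge on any problem. A minimal sketch using only the standard library (the `users` shape and required fields are hypothetical; a real workflow would run this against the changed file and fail on a non-zero exit code):

```python
import json

REQUIRED_FIELDS = {"id", "name"}  # hypothetical schema for the users file

def check_users_file(text: str) -> list[str]:
    """Return a list of problems; an empty list means the file may be merged."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    users = data.get("users")
    if not isinstance(users, list):
        return ["top-level 'users' must be a list"]
    errors, seen_ids = [], set()
    for i, user in enumerate(users):
        missing = REQUIRED_FIELDS - set(user)
        if missing:
            errors.append(f"user #{i} missing fields: {sorted(missing)}")
            continue
        if user["id"] in seen_ids:
            errors.append(f"duplicate id {user['id']}")
        seen_ids.add(user["id"])
    return errors
```

Even this thin check catches the two failure modes from the example above: malformed JSON from a bad merge resolution, and duplicate ids from two volunteers editing blind.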
Why Juniors Miss It
Junior engineers often make the following mistakes in this scenario because they optimize for developer convenience rather than user experience.
- Over-engineering with Docker/SQL: They try to spin up a local Postgres instance in Docker. This immediately fails because volunteers don’t know how to install Docker or connect to the instance through a tunnel.
- Scripting the Problem: They write a custom JavaScript/Python script to read/write JSON files. They underestimate the time required to debug edge cases (e.g., “What happens if the internet cuts out while saving?”).
- Ignoring the “Human Factor”: They assume volunteers will read the documentation regarding how to edit the file. Juniors code for humans who read; Seniors code for humans who click.
- Confusing “Code” with “Content”: They fail to realize that for a volunteer team, the database is content management, not code management, and should be treated as a product with a UI, not a file system.