Fleshed out documentation

2026-02-28 17:41:00 +03:00
parent 828cbcc822
commit aa607b9b6e
13 changed files with 332 additions and 9 deletions
@@ -64,3 +64,4 @@ The dev server proxies `/api` to `http://localhost:8000`.
- Known gaps and risks: `docs/risks.md`
- Architecture and async/observability decisions: `docs/architecture.md`
- Documentation index and standards: `docs/README.md` and `docs/documentation.md`
@@ -1,11 +1,36 @@
# Docs Notes (MVP Alignment)
# Documentation Index
## High-Level Takeaways
- The MVP roadmap aligns with Phase 1 goals but needs tighter documentation around provider readiness and async strategy.
- ExecPlan references drift between `AGENTS.md` and `PLANS.md` should be resolved to avoid conflicting guidance.
- Observability and operational visibility are thin; errors are stored but not surfaced through clear runbooks/dashboards.
This directory is the source of truth for product, engineering, and ops documentation. Keep it current as features change.
## Near-Term Focus
- Make ExecPlan references consistent and keep active plans clearly labeled.
- Document whether MVP uses async jobs (and which system) or remains synchronous with strict timeouts.
- Keep `docs/risks.md` current as gaps are closed.
## Start Here
- Project overview and setup: `README.md` (repo root)
- Architecture overview: `docs/architecture.md`
- Active ExecPlan: `docs/execplans/booking-notifications.md`
- Known risks and gaps: `docs/risks.md`
## Documentation Standards
See `docs/documentation.md` for documentation goals, update triggers, and templates.
## Docs Map
- `docs/architecture.md`: System architecture, boundaries, and MVP async/observability decision.
- `docs/adr/`: Architecture Decision Records (ADRs). New cross-cutting decisions must land here.
- `docs/execplans/`: Execution plans for significant features or refactors.
- `docs/runbooks/`: Operational runbooks and production checklists.
- `docs/risks.md`: Tracked risks and gaps.
- `docs/templates/`: Reusable templates (ADR, runbook).
## Update Triggers (Quick Reference)
- New external dependency, provider, or major flow: add an ADR in `docs/adr/`.
- Change to booking/payment/auth logic: update `docs/architecture.md` and relevant runbook(s).
- New operational procedure: add a runbook in `docs/runbooks/`.
- Close or add a significant risk: update `docs/risks.md`.
## Ownership And Review
- Authors own freshness: if you touch an area, update the docs in the same PR.
- New production flows require at least one runbook.
- Avoid duplicating instructions; link to the single source of truth.
@@ -0,0 +1,28 @@
# ADR 0001: Synchronous External Calls For MVP
## Status
Accepted
## Context
The MVP relies on OTP delivery, booking notifications, and payment gateway calls. Introducing a task queue (Celery/RQ) would add infrastructure (Redis, workers, retries) and operational complexity that is not required for the early launch.
## Decision
For the MVP, OTP sends, booking notifications, and payment gateway calls run synchronously in the request/response path with strict timeouts. A task queue will be revisited when traffic grows or operational needs change.
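A minimal sketch of what "synchronous with strict timeouts" can look like in code. The names `call_provider`, `ProviderResult`, and the `send` callable are hypothetical and do not come from the codebase; the 5-second default is an assumed value, not a documented setting.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("external_calls")


@dataclass
class ProviderResult:
    ok: bool
    detail: str


def call_provider(send: Callable[[float], str], timeout_seconds: float = 5.0) -> ProviderResult:
    """Run one external call inline and turn a timeout into an explicit failure.

    `send` is a hypothetical callable that receives the timeout (in seconds)
    and either returns a provider reference or raises TimeoutError.
    """
    try:
        reference = send(timeout_seconds)
    except TimeoutError:
        # No retry and no queue: the failure is logged and returned to the caller.
        logger.warning("provider call timed out after %ss", timeout_seconds)
        return ProviderResult(ok=False, detail=f"timed out after {timeout_seconds}s")
    return ProviderResult(ok=True, detail=reference)


def fast_sender(timeout: float) -> str:
    # Stand-in for a provider that answers in time.
    return "msg-123"


def slow_sender(timeout: float) -> str:
    # Stand-in for a provider that exceeds the timeout.
    raise TimeoutError("provider did not respond in time")


print(call_provider(fast_sender))
print(call_provider(slow_sender, timeout_seconds=2.0))
```

Because the call runs in the request path, the timeout bounds the worst-case latency the client sees, which is the trade-off listed under Consequences.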
## Consequences
- Faster initial delivery with fewer moving parts.
- Increased latency risk on endpoints that call external providers.
- Failures are immediately visible to clients and logged for support.
## Alternatives Considered
- Celery + Redis for all external calls: rejected for MVP due to infra overhead.
- Hybrid async for notifications only: rejected to keep the execution model consistent.
## Related
- `docs/architecture.md`
@@ -0,0 +1,30 @@
# ADR 0002: Moyasar As The Payment Gateway
## Status
Accepted
## Context
The platform needs a payment gateway that supports Saudi Arabia, SAR currency defaults, and local payment methods (e.g. STC Pay, Apple Pay, Samsung Pay). The backend already implements a `MoyasarGateway` integration and models `payments.Payment` with a `moyasar` provider option.
## Decision
Use Moyasar as the payment gateway for the MVP. Payment creation, capture, refund, and webhook reconciliation are implemented through `apps.payments.services.gateway.MoyasarGateway`.
## Consequences
- Supports KSA-focused payment methods and SAR by default.
- Operational dependency on Moyasar uptime and API stability.
- Payment flows and webhooks are tied to the Moyasar API surface until a gateway abstraction is expanded.
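As an illustration of the gateway abstraction mentioned above, here is a hedged sketch of a narrow interface that a second gateway could implement later. `PaymentGateway`, `FakeGateway`, `charge_booking`, and every method signature are assumptions for illustration; the actual `MoyasarGateway` API may differ.

```python
from typing import Dict, Protocol, Set


class PaymentGateway(Protocol):
    """Hypothetical minimal surface; the real MoyasarGateway may differ."""

    def create(self, amount_minor_units: int, currency: str, idempotency_key: str) -> str: ...

    def refund(self, payment_id: str) -> bool: ...


class FakeGateway:
    """In-memory stand-in used only to show the seam; it calls no external API."""

    def __init__(self) -> None:
        self._by_key: Dict[str, str] = {}
        self._refunded: Set[str] = set()

    def create(self, amount_minor_units: int, currency: str, idempotency_key: str) -> str:
        # Reusing a key returns the original payment instead of creating a second one.
        if idempotency_key not in self._by_key:
            self._by_key[idempotency_key] = f"pay_{len(self._by_key) + 1}"
        return self._by_key[idempotency_key]

    def refund(self, payment_id: str) -> bool:
        # A payment can be refunded once; a repeat request reports False.
        if payment_id in self._refunded:
            return False
        self._refunded.add(payment_id)
        return True


def charge_booking(gateway: PaymentGateway, booking_id: int, amount_minor_units: int) -> str:
    # SAR is the default currency per this ADR; the key format is an assumption.
    return gateway.create(amount_minor_units, "SAR", idempotency_key=f"booking-{booking_id}")


gateway = FakeGateway()
first = charge_booking(gateway, booking_id=7, amount_minor_units=15000)
retry = charge_booking(gateway, booking_id=7, amount_minor_units=15000)
print(first == retry)
```

Code that depends only on the protocol would not need to change if another gateway were added after the MVP.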
## Alternatives Considered
- Other regional gateways: deferred until the MVP is validated.
- Stripe or similar global providers: not selected for MVP due to KSA-specific coverage priorities.
## Related
- `backend/apps/payments/services/gateway.py`
- `docs/runbooks/payments_sanity_check.md`
- `docs/architecture.md`
@@ -0,0 +1,30 @@
# ADR 0003: Authentica As Primary OTP Provider
## Status
Accepted
## Context
The platform requires phone-first authentication with OTP delivery for KSA. The codebase includes multiple provider adapters (`console`, `twilio`, `unifonic`, `authentica`) but only Authentica is implemented for provider-managed OTP delivery (send/verify) and direct SMS messaging. Twilio and Unifonic adapters are partial or unimplemented; a console provider exists for local development.
## Decision
Use Authentica as the primary OTP provider for the MVP, with `OTP_PROVIDER=authentica` in production environments. Keep `console` for local development and tests, and retain Twilio/Unifonic adapters as scaffolds for future expansion.
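The selection described here can be sketched as a small registry keyed by `OTP_PROVIDER`. The class names, the `send_otp` method, and the fallback to `console` when the variable is unset are assumptions for illustration; the real adapters live in `backend/apps/accounts/services/otp.py` and may be structured differently.

```python
import os
from typing import Optional


class ConsoleProvider:
    name = "console"

    def send_otp(self, phone: str) -> str:
        # Local development: print instead of sending an SMS.
        print(f"[console] OTP requested for {phone}")
        return "console"


class AuthenticaProvider:
    name = "authentica"

    def send_otp(self, phone: str) -> str:
        # Placeholder only: the real adapter would call the Authentica API here.
        raise NotImplementedError("illustrative stub")


# Only fully implemented providers are registered in this sketch.
PROVIDERS = {"console": ConsoleProvider, "authentica": AuthenticaProvider}


def get_otp_provider(name: Optional[str] = None):
    """Return a provider instance for `name`, or for OTP_PROVIDER if omitted."""
    selected = (name or os.environ.get("OTP_PROVIDER", "console")).lower()
    try:
        return PROVIDERS[selected]()
    except KeyError:
        raise ValueError(f"Unsupported OTP_PROVIDER: {selected!r}") from None


print(get_otp_provider("console").name)
```

Failing loudly on an unknown value keeps a misconfigured environment from silently falling back to a provider that does not deliver codes.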
## Consequences
- OTP verification relies on Authentica APIs and credentials in production.
- Local development remains simple with the console provider.
- Adding a second production provider will require completing adapters and updating operational runbooks.
## Alternatives Considered
- Twilio as primary provider: not selected due to KSA-focused delivery needs and current adapter gaps.
- Unifonic as primary provider: deferred until the adapter is fully implemented and validated.
## Related
- `backend/apps/accounts/services/otp.py`
- `backend/salon_api/settings.py`
- `docs/architecture.md`
@@ -0,0 +1,5 @@
# Architecture Decision Records
ADRs capture cross-cutting or hard-to-reverse decisions. Add a new ADR when changing providers, async strategy, data model boundaries, or other architectural choices.
Use the template in `docs/templates/adr.md` and give each new ADR the next four-digit numeric prefix (for example, `0004` follows `0003`).
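If it helps, the next prefix can be derived from the existing file names. This is a small sketch, assuming ADR files are named `<NNNN>-<slug>.md` as in `0001-synchronous-external-calls-mvp.md`; `next_adr_prefix` is a hypothetical helper and the other file names in the example are made up.

```python
import re

# Matches names such as "0001-synchronous-external-calls-mvp.md".
ADR_NAME = re.compile(r"^(\d{4})-.+\.md$")


def next_adr_prefix(filenames):
    """Return the next zero-padded ADR number for a list of file names."""
    numbers = [int(m.group(1)) for name in filenames if (m := ADR_NAME.match(name))]
    return f"{max(numbers, default=0) + 1:04d}"


existing = [
    "0001-synchronous-external-calls-mvp.md",
    "0002-example-payment-gateway.md",  # hypothetical file name
    "0003-example-otp-provider.md",     # hypothetical file name
    "README.md",                        # ignored: no numeric prefix
]
print(next_adr_prefix(existing))
```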
@@ -14,6 +14,16 @@ The Salon platform is a Django REST API backend with a React/Vite frontend, opti
| **payments** | Payment model, Moyasar integration (create, capture, refund), webhook reconciliation, idempotency. |
| **notifications** | Booking lifecycle notifications (SMS/WhatsApp). Reuses OTP providers; sends on booking created/confirmed/cancelled. |
## Data Model Overview
The core data model centers on users, salons, and time-bound bookings. A booking ties a customer to a service, a staff member, and a scheduled time. Payments are recorded per booking and reconciled against the external gateway. Every booking lifecycle notification is stored for auditability.
- `accounts.User` owns phone, locale, and auth preferences.
- `salons.Salon`, `salons.Service`, and `salons.Staff` define the catalog and scheduling surface.
- `bookings.Booking` links customer, staff, service, and scheduled time, with status transitions.
- `payments.Payment` tracks gateway state and idempotency per booking.
- `notifications.Notification` records each SMS/WhatsApp send attempt tied to a booking event.
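The relationships above can be sketched with plain dataclasses (used here instead of Django models so the example runs on its own). The field names, the status values, and the allowed transitions are assumptions for illustration; the real rules live in `backend/apps/bookings/`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumed transition table; not taken from the codebase.
ALLOWED_TRANSITIONS = {
    "pending": {"confirmed", "cancelled"},
    "confirmed": {"cancelled"},
    "cancelled": set(),
}


@dataclass
class Booking:
    customer_id: int
    staff_id: int
    service_id: int
    scheduled_at: datetime
    status: str = "pending"

    def transition(self, new_status: str) -> None:
        # Reject any move that the table does not list for the current status.
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(f"cannot move booking from {self.status} to {new_status}")
        self.status = new_status


@dataclass
class Payment:
    booking: Booking
    idempotency_key: str
    gateway_status: str = "initiated"


@dataclass
class Notification:
    booking: Booking
    event: str    # for example "created", "confirmed", "cancelled"
    channel: str  # "sms" or "whatsapp"
    delivered: bool = False


booking = Booking(1, 2, 3, datetime(2026, 3, 1, 10, 0, tzinfo=timezone.utc))
booking.transition("confirmed")
payment = Payment(booking, idempotency_key="booking-1")
notice = Notification(booking, event="confirmed", channel="sms")
print(booking.status, payment.gateway_status, notice.delivered)
```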
## Data Flow
```
@@ -28,6 +38,7 @@ User → React Frontend → Django API
## Async and Observability (MVP Decision)
**Decision (MVP):** All OTP sends, booking notifications, and payment gateway calls run **synchronously** in the request/response path. No Celery, RQ, or other task queue for the initial launch.
This is captured in ADR 0001 (`docs/adr/0001-synchronous-external-calls-mvp.md`).
**Rationale:**
- Reduces deployment complexity (no Redis, no worker processes).
@@ -0,0 +1,51 @@
# Documentation Practices
These standards aim to keep documentation reliable as the codebase grows.
## Principles
- Single source of truth: one canonical doc per topic; link instead of duplicating.
- Proximity: keep docs close to the code they describe when possible.
- Freshness: update docs in the same PR as the code change.
- Observable behavior: describe what someone can see or run to validate the behavior.
## Required Docs By Area
- Architecture and major decisions: `docs/architecture.md` and `docs/adr/`.
- Feature delivery plans: `docs/execplans/` (required by `PLANS.md`).
- Operational procedures: `docs/runbooks/`.
- Risks and gaps: `docs/risks.md`.
## When To Write An ADR
Use an ADR for any decision that is cross-cutting or hard to reverse, including:
- External providers or payment/auth strategy changes.
- Async vs synchronous execution decisions.
- Data model changes that affect multiple apps or services.
ADRs live in `docs/adr/` and use the template in `docs/templates/adr.md`.
## Runbook Expectations
Every production-impacting flow should have a runbook that covers:
- Symptoms and impact.
- Detection and quick checks.
- Safe remediation steps.
- Rollback or escalation path.
Use the template in `docs/templates/runbook.md`.
## Writing Style
- Be explicit: include exact commands, paths, and expected output where useful.
- Keep sections short and focused.
- Avoid unstated assumptions; if a step needs a specific directory, say so.
## Review Checklist
- Docs updated or explicitly confirmed unnecessary.
- New runbook added when operational behavior changes.
- ADR added for new cross-cutting decisions.
- `docs/risks.md` updated for meaningful gaps added or closed.
@@ -0,0 +1,9 @@
# Runbooks
Operational procedures live here. Each new production-impacting workflow should add or update a runbook.
Existing runbooks:
- `docs/runbooks/auth_otp_failures.md`
- `docs/runbooks/booking_failures.md`
- `docs/runbooks/payments_sanity_check.md`
@@ -0,0 +1,39 @@
# Runbook: Auth OTP Failures
## Summary
Guide for diagnosing and mitigating OTP send or verify failures in phone-first authentication.
## Symptoms
- Users report not receiving OTP codes.
- `/api/auth/otp/request/` or `/api/auth/phone/request/` returns HTTP 500 or rate-limit errors.
- `/api/auth/otp/verify/` or `/api/auth/phone/verify/` returns invalid or expired OTP errors unexpectedly.
## Impact
- Users cannot sign in or complete phone verification.
- Booking and payment flows are blocked when auth is required.
## Quick Checks
- Confirm which provider is active by checking the `OTP_PROVIDER` value read in `backend/salon_api/settings.py`.
- Check recent application logs for OTP send errors.
- Verify provider credentials are present in `backend/.env` for the active provider.
## Mitigation Steps
- If provider credentials are missing or invalid, fix the environment variables and restart the API process.
- If the provider is down, notify support. In non-production environments only, temporarily switch to `OTP_PROVIDER=console` to unblock testing; the console provider is intended for local development, not production.
- If rate limits are triggered, validate `OTP_MAX_PER_WINDOW`, `OTP_WINDOW_MINUTES`, and `OTP_RESEND_COOLDOWN_SECONDS` values and confirm client behavior is not retrying aggressively.
- If verification is failing, confirm the server clock is accurate (clock drift can make valid codes look expired) and that `OTP_EXPIRY_MINUTES` leaves enough time for typical SMS delivery delays.
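When validating the rate-limit settings, it can help to reason about how they interact. The sketch below is one plausible interpretation, not the implementation in `otp.py`: `otp_send_allowed` is a hypothetical function, and the default values are assumptions rather than the project's configured values.

```python
from datetime import datetime, timedelta, timezone


def otp_send_allowed(previous_sends, now, max_per_window=5,
                     window_minutes=15, resend_cooldown_seconds=60):
    """Return (allowed, reason) for a new OTP send to one phone number.

    The parameters mirror OTP_MAX_PER_WINDOW, OTP_WINDOW_MINUTES, and
    OTP_RESEND_COOLDOWN_SECONDS; the defaults here are illustrative only.
    """
    window_start = now - timedelta(minutes=window_minutes)
    recent = [sent for sent in previous_sends if sent > window_start]
    # Cooldown: the most recent send must be old enough.
    if recent and (now - max(recent)).total_seconds() < resend_cooldown_seconds:
        return False, "cooldown"
    # Window limit: cap the number of sends inside the rolling window.
    if len(recent) >= max_per_window:
        return False, "window_limit"
    return True, "ok"


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
just_sent = [now - timedelta(seconds=30)]
many_sends = [now - timedelta(minutes=m) for m in (2, 3, 4, 5, 6)]
old_send = [now - timedelta(minutes=20)]

print(otp_send_allowed(just_sent, now))
print(otp_send_allowed(many_sends, now))
print(otp_send_allowed(old_send, now))
```

A client that retries every few seconds would hit the cooldown first, while a slower loop would exhaust the window limit, so the rejection reason hints at which client behavior to investigate.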
## Rollback / Escalation
- Roll back recent auth/OTP changes if the failure coincides with a deployment.
- Escalate to the provider (Authentica) with request IDs and timestamps if external API errors persist.
## Notes
- Authentica is the primary OTP provider for MVP; console provider is for local development.
- OTP send/verify logic lives in `backend/apps/accounts/services/otp.py`.
@@ -0,0 +1,40 @@
# Runbook: Booking Failures
## Summary
Guide for diagnosing booking creation or status update failures (availability, overlap prevention, or validation errors).
## Symptoms
- `POST /api/bookings/` returns HTTP 400 or 500.
- `PATCH /api/bookings/<id>/` fails when confirming or cancelling.
- Users report bookings not appearing or incorrect status.
## Impact
- Customers cannot place bookings.
- Staff schedules become inconsistent.
- Notification and payment flows may not trigger.
## Quick Checks
- Confirm the request payload includes a valid `service`, `staff`, and scheduled time.
- Check server logs for booking validation errors or integrity exceptions.
- Verify that staff availability and overlap prevention rules are behaving as expected.
## Mitigation Steps
- Reproduce with a known test user and staff member to isolate data issues.
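- If overlap rules reject bookings that should be valid, review the booking validation logic and confirm time zone assumptions (for example, that start and end times are compared as timezone-aware values).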
- If status updates are blocked, verify role checks and serializer permissions in `backend/apps/bookings/`.
- If notifications are expected but missing, confirm `NOTIFICATION_PROVIDER` configuration and notification records.
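For the overlap and time zone checks above, a reference implementation of the comparison can be useful when reproducing an issue. This is a minimal sketch that treats bookings as half-open intervals `[start, end)`; `overlaps` is a hypothetical helper and may not match the validation logic in `backend/apps/bookings/`.

```python
from datetime import datetime, timedelta, timezone


def overlaps(start_a, end_a, start_b, end_b):
    """Return True if two half-open intervals [start, end) share any time."""
    for value in (start_a, end_a, start_b, end_b):
        # Naive datetimes hide time zone bugs, so reject them up front.
        if value.tzinfo is None:
            raise ValueError("use timezone-aware datetimes")
    return start_a < end_b and start_b < end_a


riyadh = timezone(timedelta(hours=3))
ten = datetime(2026, 3, 1, 10, 0, tzinfo=riyadh)
eleven = datetime(2026, 3, 1, 11, 0, tzinfo=riyadh)
noon = datetime(2026, 3, 1, 12, 0, tzinfo=riyadh)

# Back-to-back bookings do not overlap under half-open intervals.
print(overlaps(ten, eleven, eleven, noon))

# 07:30 to 08:00 UTC is 10:30 to 11:00 in Riyadh, so it overlaps the first slot.
utc_start = datetime(2026, 3, 1, 7, 30, tzinfo=timezone.utc)
utc_end = datetime(2026, 3, 1, 8, 0, tzinfo=timezone.utc)
print(overlaps(ten, eleven, utc_start, utc_end))
```

If production rejects back-to-back bookings, an inclusive comparison (`<=` instead of `<`) or a naive-versus-aware datetime mix-up are two things worth ruling out.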
## Rollback / Escalation
- Roll back recent booking-related changes if failures started after a deployment.
- Escalate to engineering with the booking ID, user ID, and timestamps.
## Notes
- Booking validation and status transitions live in `backend/apps/bookings/`.
- Notifications for booking lifecycle are handled in `backend/apps/notifications/`.
@@ -0,0 +1,25 @@
# ADR <NNNN>: <Title>
## Status
Proposed | Accepted | Deprecated | Superseded
## Context
Explain the problem and the forces at play. Include constraints, risks, or user needs.
## Decision
State the decision clearly and explicitly.
## Consequences
List the expected positive and negative outcomes, including operational impact.
## Alternatives Considered
Briefly document viable alternatives and why they were rejected.
## Related
Link to relevant PRs, runbooks, or architecture sections.
@@ -0,0 +1,29 @@
# Runbook: <Short Title>
## Summary
One or two sentences describing the situation this runbook covers.
## Symptoms
Describe what an operator or user will observe.
## Impact
Who or what is affected.
## Quick Checks
Exact commands or checks that confirm the issue.
## Mitigation Steps
Step-by-step actions to resolve or reduce impact.
## Rollback / Escalation
How to revert or who to contact if the issue persists.
## Notes
Any caveats, dependencies, or follow-up actions.