chore: condense all docs and markdown files

2026-03-14 15:11:40 +03:00
parent f3811b7520
commit 8b626a940e
24 changed files with 483 additions and 1346 deletions
+4 -5
@@ -1,9 +1,8 @@
# Runbooks
Operational procedures live here. Each new production-impacting workflow should add or update a runbook.
Existing runbooks:
# Runbooks Index
Runbooks for production-impacting flows:
- `docs/runbooks/auth_otp_failures.md`
- `docs/runbooks/booking_failures.md`
- `docs/runbooks/payments_sanity_check.md`
Rule: if a new production flow is added, add or update a runbook in the same change.
+24 -30
@@ -1,40 +1,34 @@
# Runbook: Auth OTP Failures
## Summary
Guide for diagnosing and mitigating OTP send or verify failures in phone-first authentication.
## Symptoms
- Users report not receiving OTP codes.
- `/api/auth/otp/request/` or `/api/auth/phone/request/` returns HTTP 500 or rate-limit errors.
- `/api/auth/otp/verify/` or `/api/auth/phone/verify/` returns invalid or expired OTP errors unexpectedly.
- Users do not receive OTP.
- `/api/auth/otp/request` or `/api/auth/phone/request` fails.
- `/api/auth/otp/verify` or `/api/auth/phone/verify` shows invalid/expired unexpectedly.
## Impact
- Users cannot sign in or complete phone verification.
- Booking and payment flows are blocked when auth is required.
Users cannot sign in/verify phone; booking/payment flows may block.
## Quick Checks
- Confirm `OTP_PROVIDER` in `backend/salon_api/settings.py`.
- Check OTP provider credentials in `backend/.env`.
- Check app logs for provider/timeouts/rate-limit errors.
- Validate OTP rate-limit settings:
- `OTP_MAX_PER_WINDOW`
- `OTP_WINDOW_MINUTES`
- `OTP_RESEND_COOLDOWN_SECONDS`
- `PHONE_AUTH_IP_MAX_PER_WINDOW`
- `PHONE_AUTH_DEVICE_MAX_PER_WINDOW`
- Confirm the provider configured in `backend/salon_api/settings.py` via `OTP_PROVIDER`.
- Check recent application logs for OTP send errors.
- Verify provider credentials are present in `backend/.env` for the active provider.
## Mitigation
1. Fix env/config mismatch; restart API.
2. If provider outage, use `console` only in non-prod.
3. If abuse spike/false positives, tune IP/device thresholds.
4. Verify server clock and `OTP_EXPIRY_MINUTES`.
## Mitigation Steps
## Escalation
- Roll back recent auth changes if correlated with deployment.
- Escalate to Authentica with request IDs + timestamps.
- If provider credentials are missing or invalid, fix the environment variables and restart the API process.
- If the provider is down, temporarily switch to `OTP_PROVIDER=console` for non-production environments and notify support.
- If rate limits are triggered, validate `OTP_MAX_PER_WINDOW`, `OTP_WINDOW_MINUTES`, and `OTP_RESEND_COOLDOWN_SECONDS` values and confirm client behavior is not retrying aggressively.
- For phone-login abuse spikes, also validate `PHONE_AUTH_IP_MAX_PER_WINDOW`, `PHONE_AUTH_DEVICE_MAX_PER_WINDOW`, and `PHONE_AUTH_RISK_WINDOW_MINUTES`.
- If verification is failing, confirm server time is correct and `OTP_EXPIRY_MINUTES` is appropriate.
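When tuning these values, it can help to reason about how a window limit, a resend cooldown, and an expiry interact. The following is a minimal sketch, not the logic in `backend/apps/accounts/services/otp.py`. The function names and the numeric values are assumptions chosen for illustration; only the setting names come from this runbook.

```python
from datetime import datetime, timedelta, timezone

# Illustrative values only; the real ones live in Django settings.
OTP_MAX_PER_WINDOW = 5
OTP_WINDOW_MINUTES = 15
OTP_RESEND_COOLDOWN_SECONDS = 60
OTP_EXPIRY_MINUTES = 5


def otp_request_allowed(previous_requests, now):
    """Return (allowed, reason) for a new OTP request from one phone number.

    previous_requests: timezone-aware datetimes of earlier requests.
    """
    window_start = now - timedelta(minutes=OTP_WINDOW_MINUTES)
    recent = [t for t in previous_requests if t > window_start]
    # Cooldown: the most recent request is too close to this one.
    if recent and (now - max(recent)).total_seconds() < OTP_RESEND_COOLDOWN_SECONDS:
        return False, "cooldown"
    # Window limit: too many requests inside the sliding window.
    if len(recent) >= OTP_MAX_PER_WINDOW:
        return False, "window_limit"
    return True, "ok"


def otp_expired(issued_at, now):
    """True when the code is older than the expiry interval."""
    return now - issued_at > timedelta(minutes=OTP_EXPIRY_MINUTES)
```

Under these assumptions, a client that retries every 30 seconds hits the cooldown on every attempt, and a server clock running six minutes fast makes every freshly issued code look expired. Both match the "unexpected" symptoms listed above.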
## Rollback / Escalation
- Roll back recent auth/OTP changes if the failure coincides with a deployment.
- Escalate to the provider (Authentica) with request IDs and timestamps if external API errors persist.
## Notes
- Authentica is the primary OTP provider for MVP; console provider is for local development.
- OTP send/verify logic lives in `backend/apps/accounts/services/otp.py`.
## References
- OTP logic: `backend/apps/accounts/services/otp.py`
- Risks: `docs/risks.md`
+18 -30
@@ -1,40 +1,28 @@
# Runbook: Booking Failures
## Summary
Guide for diagnosing booking creation or status update failures (availability, overlap prevention, or validation errors).
## Symptoms
- `POST /api/bookings/` returns HTTP 400 or 500.
- `PATCH /api/bookings/<id>/` fails when confirming or cancelling.
- Users report bookings not appearing or incorrect status.
- `POST /api/bookings/` fails (400/500).
- Booking status update fails.
- Booking missing/incorrect in listing.
## Impact
- Customers cannot place bookings.
- Staff schedules become inconsistent.
- Notification and payment flows may not trigger.
Customers cannot book; staff schedule integrity degrades; dependent flows break.
## Quick Checks
- Validate payload: `service`, `staff`, `start_time`, `end_time`.
- Check logs for validation/integrity errors.
- Confirm staff availability + overlap expectations.
- If notifications expected, confirm provider config + notification rows.
- Confirm the request payload includes a valid `service`, `staff`, and scheduled time.
- Check server logs for booking validation errors or integrity exceptions.
- Verify that staff availability and overlap prevention rules are behaving as expected.
## Mitigation
1. Reproduce with known test data.
2. Inspect booking validation service and serializer permissions.
3. Confirm timezone assumptions for failing case.
4. If regression after deploy, roll back booking-related change.
## Mitigation Steps
## Escalation
Share booking id, user id, timestamps, and failing payload/response with engineering.
- Reproduce with a known test user and staff member to isolate data issues.
- If overlap rules are too strict, review booking validation logic and confirm time zone assumptions.
- If status updates are blocked, verify role checks and serializer permissions in `backend/apps/bookings/`.
- If notifications are expected but missing, confirm `NOTIFICATION_PROVIDER` configuration and notification records.
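For the overlap and time zone checks, a small reference predicate can help when comparing a failing payload against existing bookings by hand. This is a sketch under assumptions: the half-open interval convention and the `+03:00` offset are illustrative, and the actual rules live in `backend/apps/bookings/`.

```python
from datetime import datetime, timedelta, timezone


def intervals_overlap(start_a, end_a, start_b, end_b):
    """Half-open intervals [start, end) overlap when each starts before the other ends."""
    return start_a < end_b and start_b < end_a


# Assumed local offset for the examples; not taken from the project settings.
LOCAL = timezone(timedelta(hours=3))

existing = (
    datetime(2026, 3, 14, 10, 0, tzinfo=LOCAL),
    datetime(2026, 3, 14, 11, 0, tzinfo=LOCAL),
)

# Back-to-back booking: starts exactly when the existing one ends.
back_to_back = (
    datetime(2026, 3, 14, 11, 0, tzinfo=LOCAL),
    datetime(2026, 3, 14, 12, 0, tzinfo=LOCAL),
)

# Same wall-clock slot as 10:30-11:30 local, but sent in UTC.
sent_in_utc = (
    datetime(2026, 3, 14, 7, 30, tzinfo=timezone.utc),
    datetime(2026, 3, 14, 8, 30, tzinfo=timezone.utc),
)

print(intervals_overlap(*existing, *back_to_back))  # False
print(intervals_overlap(*existing, *sent_in_utc))   # True
```

Python compares timezone-aware datetimes by absolute instant, so the UTC payload is correctly detected as overlapping. Mixing naive and aware datetimes raises `TypeError`, which is one way a time zone assumption can surface as an HTTP 500 rather than a 400.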
## Rollback / Escalation
- Roll back recent booking-related changes if failures started after a deployment.
- Escalate to engineering with the booking ID, user ID, and timestamps.
## Notes
- Booking validation and status transitions live in `backend/apps/bookings/`.
- Notifications for booking lifecycle are handled in `backend/apps/notifications/`.
## References
- Booking logic: `backend/apps/bookings/`
- Notification logic: `backend/apps/notifications/`
+28 -127
@@ -1,136 +1,37 @@
# Payments Sanity Check (Moyasar Mock + Demo Data)
# Runbook: Payments Sanity Check (Local Mock)
This runbook documents the end-to-end sanity check for the Moyasar payments flow using demo data and a local mock provider. It is intended for developers and agents validating payment creation + webhook reconciliation before merging to `main`.
## Purpose
Verify that the payment creation endpoint and webhook processing work end-to-end in a local environment without hitting Moyasar.
Validate payment create + webhook reconciliation without hitting Moyasar.
## Preconditions
- Backend dependencies installed in the Python venv.
- Frontend is not required for this check.
- `backend/` database is migrated and uses SQLite for local dev.
## High-level Flow
1. Start a local mock Moyasar server (HTTP) that emulates `/v1/payments` responses.
2. Run migrations and seed demo data.
3. Start Django with a local payment configuration pointing to the mock server.
4. Obtain a JWT access token for the demo customer.
5. Create a payment for an existing booking.
6. Send a webhook payload to mark it as paid.
7. Verify the payment status updates.
- Venv + backend deps installed.
- DB migrated.
- Run from repo root unless noted.
## Steps
1. Start local mock server on `127.0.0.1:8001` exposing `POST /v1/payments`.
2. Seed data:
- `source venv/bin/activate`
- `cd backend`
- `python3 manage.py migrate`
- `python3 manage.py seed_demo`
3. Run API with mock settings:
- `DJANGO_DEBUG=1 MOYASAR_SECRET_KEY=sk_test MOYASAR_PUBLISHABLE_KEY=pk_test MOYASAR_BASE_URL=http://127.0.0.1:8001 MOYASAR_WEBHOOK_SECRET=whsec python3 manage.py runserver 8000`
4. Generate JWT in shell (demo user) and store as `<ACCESS>`.
5. Create payment:
- `POST /api/payments/` with `booking_id`, `provider=moyasar`, `idempotency_key`, valid source.
6. Send paid webhook:
- `POST /api/payments/webhook/` with `{"type":"payment_paid","secret_token":"whsec","data":{"id":"<external_id>"}}`
7. Verify `GET /api/payments/` shows status `paid` and `paid_at` set.
### 1) Start the mock Moyasar server
## Expected Results
- Create payment returns `status=initiated` + provider `external_id` + `redirect_url`.
- Webhook returns `{"detail":"Webhook processed"}`.
- Payment transitions to `paid` idempotently.
The mock server responds to `POST /v1/payments` with a static `id` and `transaction_url`.
Create the mock server at `/tmp/moyasar_mock.py` and run it:
python3 /tmp/moyasar_mock.py
Expected: the process stays running, listening on `http://127.0.0.1:8001`.
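The runbook does not include the contents of `/tmp/moyasar_mock.py`. A minimal sketch using only the standard library might look like the following. The response shape (in particular reading the redirect URL from `source.transaction_url`) and the `201` status are assumptions about what the backend expects. The static values match the expected results in step 5.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Static response; values match the expected results later in this runbook.
MOCK_PAYMENT = {
    "id": "pay_mock_123",
    "status": "initiated",
    # Assumption: the backend reads the redirect URL from source.transaction_url.
    "source": {"transaction_url": "https://moyasar.example/tx/mock"},
}


class MockMoyasarHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and discard the request body before answering.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)

        if self.path.rstrip("/") != "/v1/payments":
            self.send_error(404, "Not found")
            return

        body = json.dumps(MOCK_PAYMENT).encode("utf-8")
        self.send_response(201)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="127.0.0.1", port=8001):
    """Build the mock server without starting it."""
    return HTTPServer((host, port), MockMoyasarHandler)


def run():
    """Serve until interrupted with Ctrl+C."""
    make_server().serve_forever()
```

When saving this as `/tmp/moyasar_mock.py`, add a final `run()` call so that `python3 /tmp/moyasar_mock.py` starts listening. It is left out here so the handler can be imported and exercised without blocking.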
### 2) Run migrations and seed demo data
source venv/bin/activate
cd backend
python3 manage.py migrate
python3 manage.py seed_demo
Expected: `Demo data seeded.`
### 3) Start Django with the mock provider
Run the backend with environment variables pointing to the mock server:
DJANGO_DEBUG=1 \
MOYASAR_SECRET_KEY=sk_test \
MOYASAR_PUBLISHABLE_KEY=pk_test \
MOYASAR_BASE_URL=http://127.0.0.1:8001 \
MOYASAR_WEBHOOK_SECRET=whsec \
python3 manage.py runserver 8000
Expected: server starts at `http://127.0.0.1:8000/`.
### 4) Obtain a JWT access token
Password token login at `/api/auth/token/` is deprecated for phone-first auth. For this runbook, mint a local JWT in Django shell.
The demo customer is:
- `customer@example.com`
- `Customer123!`
Generate an access token:
python3 manage.py shell -c "from django.contrib.auth import get_user_model; from rest_framework_simplejwt.tokens import RefreshToken; u=get_user_model().objects.get(email='customer@example.com'); print(str(RefreshToken.for_user(u).access_token))"
Expected: a JWT string printed to stdout. Use it as `<ACCESS>`.
### 5) Create a payment
Pick a booking (demo data creates bookings; you can list them):
curl -s -H "Authorization: Bearer <ACCESS>" http://127.0.0.1:8000/api/bookings/
Then create a payment (example uses booking id `3`):
curl -s -X POST http://127.0.0.1:8000/api/payments/ \
-H "Authorization: Bearer <ACCESS>" \
-H "Content-Type: application/json" \
-d '{
"booking_id": 3,
"provider": "moyasar",
"idempotency_key": "<UUID>",
"source": {"type": "stcpay", "mobile": "0500000000"}
}'
Expected: response includes:
- `status: initiated`
- `external_id: pay_mock_123`
- `redirect_url: https://moyasar.example/tx/mock`
### 6) Send webhook for paid state
curl -s -X POST http://127.0.0.1:8000/api/payments/webhook/ \
-H "Content-Type: application/json" \
-d '{"type":"payment_paid","secret_token":"whsec","data":{"id":"pay_mock_123"}}'
Expected: `{ "detail": "Webhook processed" }`
### 7) Verify payment state
curl -s -H "Authorization: Bearer <ACCESS>" http://127.0.0.1:8000/api/payments/
Expected: payment record shows:
- `status: paid`
- `paid_at` set
- `metadata.last_webhook` populated
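Steps 5 to 7 can also be scripted instead of run as separate `curl` calls. The sketch below uses only the standard library and the endpoints and payloads shown above. The helper names are hypothetical, and it assumes the create-payment response exposes `external_id` as a top-level field, as the expected output in step 5 suggests.

```python
import json
import urllib.request
import uuid

API = "http://127.0.0.1:8000"


def payment_payload(booking_id, idempotency_key=None):
    """Body for POST /api/payments/; a fresh UUID is used when no key is given."""
    return {
        "booking_id": booking_id,
        "provider": "moyasar",
        "idempotency_key": idempotency_key or str(uuid.uuid4()),
        "source": {"type": "stcpay", "mobile": "0500000000"},
    }


def webhook_payload(external_id, secret="whsec"):
    """Body for POST /api/payments/webhook/ marking a payment as paid."""
    return {"type": "payment_paid", "secret_token": secret, "data": {"id": external_id}}


def call(method, path, body=None, access=None):
    """Send a JSON request and return the decoded JSON response."""
    headers = {"Content-Type": "application/json"}
    if access:
        headers["Authorization"] = "Bearer " + access
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(API + path, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def run_sanity_check(access, booking_id):
    """Create a payment, send the paid webhook, then return the payment listing."""
    payment = call("POST", "/api/payments/", payment_payload(booking_id), access)
    call("POST", "/api/payments/webhook/", webhook_payload(payment["external_id"]))
    return call("GET", "/api/payments/", access=access)
```

`urllib.request.urlopen` raises `urllib.error.HTTPError` on 4xx and 5xx responses, so a wrong webhook secret or a rejected source type stops the script at the failing step. Inspect the value returned by `run_sanity_check("<ACCESS>", 3)` for `status`, `paid_at`, and `metadata.last_webhook`.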
## Considerations and Edge Cases
- **Webhook secret**: `MOYASAR_WEBHOOK_SECRET` must be set. Requests missing or mismatching `secret_token` return `401`.
- **Idempotency**: reuse the same `idempotency_key` to verify the API returns the existing payment without creating another provider charge.
- **Unsupported sources**: `creditcard` is rejected by the backend. Use `stcpay`, `token`, or `applepay`.
- **Callback URL**: required for `token` payments; otherwise validation fails.
- **Demo data**: `seed_demo` creates a payment with `external_id=None` (not empty string) to avoid violating unique constraints.
- **Debug mode**: `DJANGO_DEBUG=1` is required for local `runserver` if `ALLOWED_HOSTS` is not set.
- **JWT warnings**: short JWT secret keys can trigger warnings in logs; this is acceptable for local sanity checks but should be hardened in production.
## What to Look For
- Payment creation returns `external_id` from the mock server.
- Webhook transitions the payment to `paid` and populates `paid_at`.
- `metadata.last_webhook` persists the payload for audit.
## Edge Checks
- Wrong/missing webhook secret -> `401`.
- Reused idempotency key -> same payment reused, no duplicate charge.
- Unsupported sources rejected by validation.
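The first two edge checks describe behaviour that is easy to misread, so here is a minimal in-memory sketch of what "idempotent" and "secret-checked" mean in this context. It is not the backend implementation: the class, function, and counter names are hypothetical, and the real service stores payments in the database.

```python
import hmac


def webhook_authorized(payload, expected_secret):
    """A missing or mismatched secret_token should lead to a 401."""
    token = payload.get("secret_token")
    if not isinstance(token, str):
        return False
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(token.encode("utf-8"), expected_secret.encode("utf-8"))


class InMemoryPayments:
    """Toy store keyed by idempotency key."""

    def __init__(self):
        self._by_key = {}
        self.provider_calls = 0  # counts simulated charges sent to the provider

    def create(self, idempotency_key, booking_id):
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            # Same key: return the stored payment, no second provider charge.
            return existing
        self.provider_calls += 1
        payment = {"booking_id": booking_id, "status": "initiated"}
        self._by_key[idempotency_key] = payment
        return payment


store = InMemoryPayments()
first = store.create("key-1", booking_id=3)
second = store.create("key-1", booking_id=3)
print(first is second, store.provider_calls)  # True 1
```

The check to perform against the real API is the same: repeat the create request with an identical `idempotency_key` and confirm that the response describes the same payment and that the mock server logged only one `POST /v1/payments`.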
## Cleanup
- Stop the Django server (`Ctrl+C`).
- Stop the mock server (`Ctrl+C`).
- Optionally delete `/tmp/moyasar_mock.py`.
Stop Django + mock processes.