# Runbook: Auth OTP Failures

## Summary

Guide for diagnosing and mitigating OTP send or verify failures in phone-first authentication.

## Symptoms

- Users report not receiving OTP codes.
- `/api/auth/otp/request/` or `/api/auth/phone/request/` returns HTTP 500 or rate-limit errors.
- `/api/auth/otp/verify/` or `/api/auth/phone/verify/` returns invalid or expired OTP errors unexpectedly.

## Impact

- Users cannot sign in or complete phone verification.
- Booking and payment flows are blocked when auth is required.

## Quick Checks

- Confirm which provider is active: check `OTP_PROVIDER` in `backend/salon_api/settings.py`.
- Check recent application logs for OTP send errors.
- Verify provider credentials are present in `backend/.env` for the active provider.
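The credential check above can be scripted. The following is a minimal sketch, not the project's actual tooling: it assumes `OTP_PROVIDER` is readable from the environment, and the variable name `AUTHENTICA_API_KEY` is a hypothetical placeholder for whatever the active provider really requires.

```python
import os

# Hypothetical mapping of provider -> required environment variables.
# The real variable names depend on how each provider is wired up in
# settings.py; replace these placeholders before relying on the check.
REQUIRED_ENV_VARS = {
    "authentica": ["AUTHENTICA_API_KEY"],
    "console": [],
}


def missing_provider_vars(env=None):
    """Return (provider, missing), where missing lists unset or empty variables."""
    env = os.environ if env is None else env
    provider = env.get("OTP_PROVIDER", "").strip().lower()
    if provider not in REQUIRED_ENV_VARS:
        raise ValueError(f"Unknown or unset OTP_PROVIDER: {provider!r}")
    missing = [
        name
        for name in REQUIRED_ENV_VARS[provider]
        if not env.get(name, "").strip()
    ]
    return provider, missing
```

Calling `missing_provider_vars()` with no arguments inspects the current process environment; passing a dict makes it easy to test a candidate configuration before a restart.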

## Mitigation Steps

- If provider credentials are missing or invalid, fix the environment variables and restart the API process.
- If the provider is down, temporarily switch to `OTP_PROVIDER=console` in non-production environments only, and notify support.
- If rate limits are triggered, validate the `OTP_MAX_PER_WINDOW`, `OTP_WINDOW_MINUTES`, and `OTP_RESEND_COOLDOWN_SECONDS` values and confirm that clients are not retrying aggressively.
- For phone-login abuse spikes, also validate `PHONE_AUTH_IP_MAX_PER_WINDOW`, `PHONE_AUTH_DEVICE_MAX_PER_WINDOW`, and `PHONE_AUTH_RISK_WINDOW_MINUTES`.
- If verification is failing, confirm server time is correct and `OTP_EXPIRY_MINUTES` is appropriate.
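When reasoning about the rate-limit and expiry settings named above, it can help to see how they might interact. The sketch below is illustrative only: the values are made up, the function names are hypothetical, and the semantics (a per-window send cap, a cooldown since the last send, and a fixed expiry from issue time) are assumptions rather than a description of the code in `otp.py`.

```python
from datetime import datetime, timedelta, timezone

# Illustrative values only; the real ones come from the project's settings.
OTP_MAX_PER_WINDOW = 5
OTP_WINDOW_MINUTES = 15
OTP_RESEND_COOLDOWN_SECONDS = 60
OTP_EXPIRY_MINUTES = 5


def can_send_otp(previous_sends, now):
    """Return (allowed, reason) for a new send, given earlier send timestamps."""
    window_start = now - timedelta(minutes=OTP_WINDOW_MINUTES)
    recent = [t for t in previous_sends if t > window_start]
    # Cooldown: block if the latest send in the window was too recent.
    if recent and (now - max(recent)).total_seconds() < OTP_RESEND_COOLDOWN_SECONDS:
        return False, "cooldown"
    # Window cap: block once the number of sends in the window hits the limit.
    if len(recent) >= OTP_MAX_PER_WINDOW:
        return False, "window_limit"
    return True, "ok"


def is_otp_expired(issued_at, now):
    """Return True if the code is older than OTP_EXPIRY_MINUTES at time `now`."""
    return now - issued_at > timedelta(minutes=OTP_EXPIRY_MINUTES)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(can_send_otp([now - timedelta(seconds=30)], now))
print(is_otp_expired(now - timedelta(minutes=6), now))
```

Under these assumptions, a client that retries every few seconds hits the cooldown long before the window cap, and a server clock that runs fast shortens the effective lifetime of every code, which is why both client behavior and server time are worth checking.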

## Rollback / Escalation

- Roll back recent auth/OTP changes if the failure coincides with a deployment.
- Escalate to the provider (Authentica) with request IDs and timestamps if external API errors persist.

## Notes

- Authentica is the primary OTP provider for the MVP; the console provider is for local development.
- OTP send/verify logic lives in `backend/apps/accounts/services/otp.py`.