# Security

Security architecture, authentication, and data handling practices.
## Authentication

### Firebase Auth + Guest Tokens

Authenticated endpoints require a Firebase ID token:

```bash
curl -H "Authorization: Bearer <id_token>" https://api.example.com/api/v1/sessions
```

Guest diagnostic flows use a short-lived `X-Guest-Token` for session access:

```bash
curl -H "X-Guest-Token: <guest_token>" https://api.example.com/api/v1/sessions/<id>
```
### Candidate Identification

Candidates are identified by `candidate_id`, typically a Firebase UID. Guest sessions use a temporary token tied to the session.

Security model:

```
Frontend (Firebase Auth) → candidate_id   → Backend → Database
Guest (X-Guest-Token)    → session access → Backend → Database
```
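The two auth paths above can be dispatched from the request headers. A minimal sketch, assuming injectable verifier callables (in practice the Firebase one would wrap `firebase_admin.auth.verify_id_token`; the names here are illustrative, not project code):

```python
from typing import Callable, Mapping, Optional

def resolve_candidate(
    headers: Mapping[str, str],
    verify_firebase: Callable[[str], str],  # ID token -> candidate_id
    verify_guest: Callable[[str], str],     # guest token -> session-scoped id
) -> Optional[str]:
    """Pick the auth scheme from request headers; return candidate_id or None."""
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return verify_firebase(auth[len("Bearer "):])
    guest = headers.get("X-Guest-Token")
    if guest:
        return verify_guest(guest)
    return None  # unauthenticated; the caller responds with 401
```

In a FastAPI app this would typically live in a dependency so every authenticated route receives a resolved `candidate_id`.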
## Data Protection

### Data at Rest

| Data Type | Storage | Encryption |
|---|---|---|
| User data | Cloud SQL | Google-managed encryption |
| Session data | Cloud SQL | Google-managed encryption |
| Workflow state | Temporal DB (PostgreSQL) | Google-managed encryption |
| Analytics metrics | TimescaleDB | Google-managed encryption |
| Secrets | Kubernetes Secrets | Base64-encoded only, not encrypted (use Secret Manager for production) |
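Base64 is an encoding, not encryption: anyone who can read the Secret object can recover the plaintext without any key. A quick illustration:

```python
import base64

# Kubernetes stores Secret values Base64-encoded; decoding needs no key.
encoded = base64.b64encode(b"production-secret").decode()
plaintext = base64.b64decode(encoded).decode()
# plaintext == "production-secret" — the encoding adds no secrecy
```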
### Data in Transit

- All external traffic over HTTPS (TLS 1.2+)
- Internal cluster traffic via Kubernetes service mesh
- WebSocket connections secured via WSS
### Sensitive Data Handling

Never logged:

- API keys
- Database credentials
- User PII beyond `candidate_id`

Logged with care:

- Session IDs (for debugging)
- Error messages (sanitized)
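The "never logged" rule can be enforced mechanically with a logging filter. A minimal sketch, where the redaction patterns are assumptions to adapt to your actual key and credential formats:

```python
import logging
import re

# Patterns for values that must never reach logs (adjust to real formats).
REDACT_PATTERNS = [
    re.compile(r"AIza[0-9A-Za-z_\-]{10,}"),  # Google API keys
    re.compile(r"postgresql://\S+"),          # database URLs with credentials
]

class RedactingFilter(logging.Filter):
    """Scrub known secret patterns from every record before it is emitted."""
    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()
        for pattern in REDACT_PATTERNS:
            msg = pattern.sub("[REDACTED]", msg)
        record.msg, record.args = msg, None
        return True  # keep the record, with secrets scrubbed
```

Attach it once at startup with `logging.getLogger().addFilter(RedactingFilter())`.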
## Infrastructure Security

### Kubernetes

```yaml
# Pod security context
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
```
### Network Policies
- Backend pods: Accept traffic from frontend, ingress controller
- Database: Accept traffic only from backend pods
- Temporal: Accept traffic only from worker pods
### Cloud SQL
- Private IP only (no public access)
- IAM database authentication available
- Automated backups enabled
## API Security

### Rate Limiting

Currently not implemented. Recommended for production:

```python
from fastapi import Request
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

@app.get("/api/v1/exams")
@limiter.limit("100/minute")
async def get_exams(request: Request):  # slowapi requires the Request parameter
    ...
```
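The idea behind the limiter is a token bucket per client. A minimal in-process sketch for illustration only (production should rate-limit at the gateway or against a shared store such as Redis, since per-process state does not survive scaling):

```python
import time

class TokenBucket:
    """Allow roughly `rate` requests per `per` seconds, with bursts up to `rate`."""
    def __init__(self, rate: float, per: float):
        self.capacity = rate
        self.tokens = rate
        self.refill = rate / per  # tokens added per second
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller responds with HTTP 429
```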
### Input Validation

All inputs are validated via Pydantic schemas:

```python
from uuid import UUID

from pydantic import BaseModel, validator

class SessionCreate(BaseModel):
    paper_id: str
    candidate_id: str

    @validator('paper_id')
    def validate_uuid(cls, v):
        UUID(v)  # raises ValueError if not a valid UUID
        return v
```
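The check the validator performs reduces to plain stdlib UUID parsing; an illustrative standalone version:

```python
from uuid import UUID

def is_valid_uuid(value: str) -> bool:
    """True iff `value` parses as a UUID — the same check the schema enforces."""
    try:
        UUID(value)
        return True
    except (ValueError, TypeError, AttributeError):
        return False
```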
### CORS

Configured to allow only the specified origins:

```python
from fastapi.middleware.cors import CORSMiddleware

origins = ["https://app.yourdomain.com"]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["*"],
)
```
## AI Security

### Prompt Injection Prevention

User input is treated as data, not instructions: it is sanitized and delimited before inclusion in prompts:

```python
# User input is data, not instructions
prompt = f"""
Analyze the following exam response:
---
{user_response}
---
Provide feedback based ONLY on the content above.
"""
```
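Delimiters only help if the user cannot forge them. A minimal sanitizer that strips the `---` fence from user text before interpolation; an illustrative sketch, not the project's implementation:

```python
DELIMITER = "---"

def sanitize_for_prompt(user_text: str) -> str:
    """Drop lines consisting of the prompt delimiter so users can't close the fence."""
    lines = [ln for ln in user_text.splitlines() if ln.strip() != DELIMITER]
    return "\n".join(lines)
```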
### Grounding

Content generation uses Google Search grounding to reduce hallucination:

```python
tools = [types.Tool(google_search=types.GoogleSearch())]
```
### Output Validation

AI outputs are validated against schemas:

```python
config = types.GenerateContentConfig(
    response_mime_type="application/json",
    response_schema=ExpectedSchema,
)
```
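Even with a response schema set, it is worth re-checking the output on receipt. A minimal stdlib check that a response parses as JSON with expected fields; the field names here are assumptions for illustration, not the project's actual schema:

```python
import json

REQUIRED_FIELDS = {"score": (int, float), "feedback": str}  # assumed schema

def parse_ai_output(raw: str) -> dict:
    """Parse model output; verify required fields and types; raise on mismatch."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data or not isinstance(data[field], expected):
            raise ValueError(f"AI output missing or mistyped field: {field}")
    return data
```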
## Privacy

### Data Collection
| Data | Collected | Purpose | Retention |
|---|---|---|---|
| Exam responses | Yes | Scoring, feedback | Indefinite |
| Timing data | Yes | Behavioral analysis | Session duration |
| Tab focus events | Yes | Engagement metrics | Session duration |
| Camera data | Opt-in | Biometric analysis | Not stored (processed in browser) |
| Audio data | Opt-in | Oral examination | Transcribed, audio discarded |
### Biometric Data

Camera-based metrics (gaze tracking, expressions) are:

- Processed entirely in-browser via MediaPipe
- Only derived metrics sent to backend (e.g., `focus_score: 85`)
- Raw video never transmitted or stored

### User Rights

Users can request:

- Export of their session data
- Deletion of their account and associated data
- Opt-out of optional biometric collection
## Threat Model (concise)
| Asset | Threats | Mitigations |
|---|---|---|
| API keys, DB creds | Leakage, reuse | Store in Secret Manager/K8s secrets; least-privileged IAM; rotate quarterly |
| Session responses/PII | Unauthorized access | API key + Firebase-authenticated user context; RBAC on DB; TLS everywhere |
| WebSocket telemetry | Interception, spoofing | WSS only; include session_id + Firebase auth; validate origin; rate limit |
| AI prompt/channel | Prompt injection | Escape user inputs; tools restricted; response schema validation |
| Workflow control | Malicious signals | Authenticate workflow operations; limit signal payload size; audit Temporal history |
## Key & Secret Rotation

- Firebase web keys and `API_KEY`: rotate quarterly; update Kubernetes secrets and redeploy.
- `GOOGLE_API_KEY`/Gemini: rotate per org policy; verify via health check after deploy.
- Database users: prefer IAM DB auth; rotate passwords every 90 days if static.
## Compliance & Data Handling

- No SOC 2/FERPA attestation is claimed; planned controls: log retention 30 days (app) and 90 days (audit); PII access requires VPN + IAM.
- Biometric opt-in: store only derived metrics; keep a per-candidate opt-in flag; include it in data exports on request.
- Backups: Cloud SQL automated; verify restores monthly in staging.
## Secrets Management

### Development

```bash
# .env file (gitignored)
DATABASE_URL=postgresql://localhost/learnpanta
GOOGLE_API_KEY=AIzaSy...
API_KEY=dev-secret
```
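In development these values are typically loaded with python-dotenv; for reference, a minimal stdlib equivalent (an illustrative sketch that skips quoting and multiline values):

```python
import os

def load_env(path: str = ".env") -> None:
    """Load KEY=VALUE lines from a .env file into os.environ (existing vars win)."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```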
### Production

Recommended: Google Secret Manager:

```bash
# Create secret
gcloud secrets create api-key --data-file=/docs/api-key.txt

# Grant access to the service account
gcloud secrets add-iam-policy-binding api-key \
  --member="serviceAccount:[email protected]" \
  --role="roles/secretmanager.secretAccessor"
```

Current: Kubernetes Secrets (acceptable for initial deployment):

```bash
kubectl create secret generic backend-secrets \
  --from-literal=api-key="production-secret"
```
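At runtime the backend reads the secret from an environment variable or a mounted file; a minimal helper sketch, where the variable name and mount path are assumptions rather than the project's actual configuration:

```python
import os

def read_secret(env_var: str = "API_KEY", file_path: str = "/etc/secrets/api-key") -> str:
    """Prefer the environment variable; fall back to a mounted secret file."""
    value = os.environ.get(env_var)
    if value:
        return value
    with open(file_path) as fh:
        return fh.read().strip()
```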
## Compliance Considerations

### GDPR

- User consent for data processing: obtained at signup
- Right to erasure: implement `/api/v1/users/{id}/delete`
- Data portability: implement `/api/v1/users/{id}/export`
### SOC 2

For SOC 2 compliance, implement:

- Audit logging for all data access
- Automated vulnerability scanning
- Incident response procedures
- Access control reviews

### FERPA (Educational)

If serving US educational institutions:

- Limit PII collection to the minimum necessary
- Implement data retention policies
- Provide institutional admin access controls
## Security Checklist

### Before Production

- Change the default API key
- Enforce HTTPS only
- Configure CORS for production domains
- Set up Cloud SQL with a private IP
- Enable Cloud SQL backups
- Review Kubernetes RBAC
- Implement rate limiting
- Set up monitoring/alerting

### Ongoing

- Rotate API keys quarterly
- Review access logs monthly
- Update dependencies for security patches
- Conduct an annual security review
## Reporting Vulnerabilities

Report security issues to: [email protected]

We follow responsible disclosure practices and will acknowledge reports within 48 hours.
## Next Steps
- Configuration - Environment setup
- Deployment - Secure deployment
- API Reference - Endpoint documentation