# Security

Security architecture, authentication, and data handling practices.
## Authentication

### Firebase Auth + Guest Tokens

Authenticated endpoints require a Firebase ID token:

```bash
curl -H "Authorization: Bearer <id_token>" https://api.example.com/api/v1/sessions
```

Guest diagnostic flows use a short-lived `X-Guest-Token` for session access:

```bash
curl -H "X-Guest-Token: <guest_token>" https://api.example.com/api/v1/sessions/<id>
```
### Candidate Identification

Candidates are identified by `candidate_id`, typically a Firebase UID. Guest sessions use a temporary token tied to the session.

Security model:

```
Frontend (Firebase Auth) → candidate_id   → Backend → Database
Guest (X-Guest-Token)    → session access → Backend → Database
```
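The two auth paths above can be dispatched from the request headers. A minimal sketch, assuming injectable verifier callables (in practice the Firebase one would wrap `firebase_admin.auth.verify_id_token`; the names here are illustrative, not project code):

```python
from typing import Callable, Mapping, Optional

def resolve_candidate(
    headers: Mapping[str, str],
    verify_firebase: Callable[[str], str],  # ID token -> candidate_id
    verify_guest: Callable[[str], str],     # guest token -> session-scoped id
) -> Optional[str]:
    """Pick the auth scheme from request headers; return candidate_id or None."""
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return verify_firebase(auth[len("Bearer "):])
    guest = headers.get("X-Guest-Token")
    if guest:
        return verify_guest(guest)
    return None  # unauthenticated; the caller responds with 401
```

In a FastAPI app this would typically live in a dependency so every authenticated route receives a resolved `candidate_id`.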
## Data Protection

### Data at Rest

| Data Type | Storage | Encryption |
|---|---|---|
| User data | Cloud SQL | Google-managed encryption |
| Session data | Cloud SQL | Google-managed encryption |
| Workflow state | Temporal DB (PostgreSQL) | Google-managed encryption |
| Analytics metrics | TimescaleDB | Google-managed encryption |
| Secrets | Kubernetes Secrets | Base64-encoded only, not encrypted (use Secret Manager for production) |
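Base64 is an encoding, not encryption: anyone who can read the Secret object can recover the plaintext without any key. A quick illustration:

```python
import base64

# Kubernetes stores Secret values Base64-encoded; decoding needs no key.
encoded = base64.b64encode(b"production-secret").decode()
plaintext = base64.b64decode(encoded).decode()
# plaintext == "production-secret" — the encoding adds no secrecy
```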
### Data in Transit

- All external traffic over HTTPS (TLS 1.2+)
- Internal cluster traffic via Kubernetes service mesh
- WebSocket connections secured via WSS
### Sensitive Data Handling

Never logged:

- API keys
- Database credentials
- User PII beyond `candidate_id`

Logged with care:

- Session IDs (for debugging)
- Error messages (sanitized)
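The "never logged" rule can be enforced mechanically with a logging filter. A minimal sketch, where the redaction patterns are assumptions to adapt to your actual key and credential formats:

```python
import logging
import re

# Patterns for values that must never reach logs (adjust to real formats).
REDACT_PATTERNS = [
    re.compile(r"AIza[0-9A-Za-z_\-]{10,}"),  # Google API keys
    re.compile(r"postgresql://\S+"),          # database URLs with credentials
]

class RedactingFilter(logging.Filter):
    """Scrub known secret patterns from every record before it is emitted."""
    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()
        for pattern in REDACT_PATTERNS:
            msg = pattern.sub("[REDACTED]", msg)
        record.msg, record.args = msg, None
        return True  # keep the record, with secrets scrubbed
```

Attach it once at startup with `logging.getLogger().addFilter(RedactingFilter())`.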
## Infrastructure Security

### Kubernetes

```yaml
# Pod security context
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
```
### Network Policies
- Backend pods: Accept traffic from frontend, ingress controller
- Database: Accept traffic only from backend pods
- Temporal: Accept traffic only from worker pods
### Cloud SQL
- Private IP only (no public access)
- IAM database authentication available
- Automated backups enabled
## API Security

### Rate Limiting

Currently not implemented. Recommended for production:

```python
from fastapi import Request
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

@app.get("/api/v1/exams")
@limiter.limit("100/minute")
async def get_exams(request: Request):  # slowapi requires the Request parameter
    ...
```
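The idea behind the limiter is a token bucket per client. A minimal in-process sketch for illustration only (production should rate-limit at the gateway or against a shared store such as Redis, since per-process state does not survive scaling):

```python
import time

class TokenBucket:
    """Allow roughly `rate` requests per `per` seconds, with bursts up to `rate`."""
    def __init__(self, rate: float, per: float):
        self.capacity = rate
        self.tokens = rate
        self.refill = rate / per  # tokens added per second
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller responds with HTTP 429
```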
### Input Validation

All inputs are validated via Pydantic schemas:

```python
from uuid import UUID

from pydantic import BaseModel, validator

class SessionCreate(BaseModel):
    paper_id: str
    candidate_id: str

    @validator('paper_id')
    def validate_uuid(cls, v):
        UUID(v)  # raises ValueError if not a valid UUID
        return v
```
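The check the validator performs reduces to plain stdlib UUID parsing; an illustrative standalone version:

```python
from uuid import UUID

def is_valid_uuid(value: str) -> bool:
    """True iff `value` parses as a UUID — the same check the schema enforces."""
    try:
        UUID(value)
        return True
    except (ValueError, TypeError, AttributeError):
        return False
```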
### CORS

Configured to allow only the specified origins:

```python
from fastapi.middleware.cors import CORSMiddleware

origins = ["https://app.yourdomain.com"]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["*"],
)
```
## AI Security

### Prompt Injection Prevention

User input is treated as data, not instructions: it is sanitized and delimited before inclusion in prompts:

```python
# User input is data, not instructions
prompt = f"""
Analyze the following exam response:
---
{user_response}
---
Provide feedback based ONLY on the content above.
"""
```
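Delimiters only help if the user cannot forge them. A minimal sanitizer that strips the `---` fence from user text before interpolation; an illustrative sketch, not the project's implementation:

```python
DELIMITER = "---"

def sanitize_for_prompt(user_text: str) -> str:
    """Drop lines consisting of the prompt delimiter so users can't close the fence."""
    lines = [ln for ln in user_text.splitlines() if ln.strip() != DELIMITER]
    return "\n".join(lines)
```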
### Grounding

Content generation uses Google Search grounding to reduce hallucination:

```python
tools = [types.Tool(google_search=types.GoogleSearch())]
```
### Output Validation

AI outputs are validated against schemas:

```python
config = types.GenerateContentConfig(
    response_mime_type="application/json",
    response_schema=ExpectedSchema,
)
```
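Even with a response schema set, it is worth re-checking the output on receipt. A minimal stdlib check that a response parses as JSON with expected fields; the field names here are assumptions for illustration, not the project's actual schema:

```python
import json

REQUIRED_FIELDS = {"score": (int, float), "feedback": str}  # assumed schema

def parse_ai_output(raw: str) -> dict:
    """Parse model output; verify required fields and types; raise on mismatch."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data or not isinstance(data[field], expected):
            raise ValueError(f"AI output missing or mistyped field: {field}")
    return data
```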
## Privacy

### Data Collection
| Data | Collected | Purpose | Retention |
|---|---|---|---|
| Exam responses | Yes | Scoring, feedback | Indefinite |
| Timing data | Yes | Behavioral analysis | Session duration |
| Tab focus events | Yes | Engagement metrics | Session duration |
| Camera data | Opt-in | Biometric analysis | Not stored (processed in browser) |
| Audio data | Opt-in | Oral examination | Transcribed, audio discarded |
### Biometric Data

Camera-based metrics (gaze tracking, expressions) are:

- Processed entirely in-browser via MediaPipe
- Only derived metrics sent to backend (e.g., `focus_score: 85`)
- Raw video never transmitted or stored

### User Rights

Users can request:

- Export of their session data
- Deletion of their account and associated data
- Opt-out of optional biometric collection
## Threat Model (concise)
| Asset | Threats | Mitigations |
|---|---|---|
| API keys, DB creds | Leakage, reuse | Store in Secret Manager/K8s secrets; least-privileged IAM; rotate quarterly |
| Session responses/PII | Unauthorized access | API key + Firebase-authenticated user context; RBAC on DB; TLS everywhere |
| WebSocket telemetry | Interception, spoofing | WSS only; include session_id + Firebase auth; validate origin; rate limit |
| AI prompt/channel | Prompt injection | Escape user inputs; tools restricted; response schema validation |
| Workflow control | Malicious signals | Authenticate workflow operations; limit signal payload size; audit Temporal history |
## Key & Secret Rotation

- Firebase web keys and `API_KEY`: rotate quarterly; update Kubernetes secrets and redeploy.
- `GOOGLE_API_KEY`/Gemini: rotate per org policy; verify via health check after deploy.
- Database users: prefer IAM DB auth; rotate passwords every 90 days if static.
## Compliance & Data Handling

- No SOC 2/FERPA attestation is claimed; planned controls: log retention 30 days (app) and 90 days (audit); PII access requires VPN + IAM.
- Biometric opt-in: store only derived metrics; keep a per-candidate opt-in flag; include it in data exports on request.
- Backups: Cloud SQL automated; verify restores monthly in staging.
## Secrets Management

### Development

```bash
# .env file (gitignored)
DATABASE_URL=postgresql://localhost/learnpanta
GOOGLE_API_KEY=AIzaSy...
API_KEY=dev-secret
```
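In development these values are typically loaded with python-dotenv; for reference, a minimal stdlib equivalent (an illustrative sketch that skips quoting and multiline values):

```python
import os

def load_env(path: str = ".env") -> None:
    """Load KEY=VALUE lines from a .env file into os.environ (existing vars win)."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```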
### Production

Recommended: Google Secret Manager:

```bash
# Create secret
gcloud secrets create api-key --data-file=/docs/api-key.txt

# Grant access to the service account
gcloud secrets add-iam-policy-binding api-key \
  --member="serviceAccount:[email protected]" \
  --role="roles/secretmanager.secretAccessor"
```

Current: Kubernetes Secrets (acceptable for initial deployment):

```bash
kubectl create secret generic backend-secrets \
  --from-literal=api-key="production-secret"
```
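At runtime the backend reads the secret from an environment variable or a mounted file; a minimal helper sketch, where the variable name and mount path are assumptions rather than the project's actual configuration:

```python
import os

def read_secret(env_var: str = "API_KEY", file_path: str = "/etc/secrets/api-key") -> str:
    """Prefer the environment variable; fall back to a mounted secret file."""
    value = os.environ.get(env_var)
    if value:
        return value
    with open(file_path) as fh:
        return fh.read().strip()
```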
## Compliance Considerations

### GDPR

- User consent for data processing: obtained at signup
- Right to erasure: implement `/api/v1/users/{id}/delete`
- Data portability: implement `/api/v1/users/{id}/export`
### SOC 2

For SOC 2 compliance, implement:

- Audit logging for all data access
- Automated vulnerability scanning
- Incident response procedures
- Access control reviews

### FERPA (Educational)

If serving US educational institutions:

- Limit PII collection to the minimum necessary
- Implement data retention policies
- Provide institutional admin access controls
## Security Checklist

### Before Production

- Change the default API key
- Enforce HTTPS only
- Configure CORS for production domains
- Set up Cloud SQL with a private IP
- Enable Cloud SQL backups
- Review Kubernetes RBAC
- Implement rate limiting
- Set up monitoring/alerting

### Ongoing

- Rotate API keys quarterly
- Review access logs monthly
- Update dependencies for security patches
- Conduct an annual security review
## Reporting Vulnerabilities

Report security issues to: [email protected]

We follow responsible disclosure practices and will acknowledge reports within 48 hours.
## Next Steps
- Configuration - Environment setup
- Deployment - Secure deployment
- API Reference - Endpoint documentation