Security Component Documentation

Comprehensive documentation for the Security component of the Gateway application, detailing authentication, authorization, token management, CSRF protection, and integration with other system components.

Table of Contents

  1. Overview
  2. High-Level Architecture
  3. Request Flow
  4. Component Overview
  5. Detailed Component Explanations
  6. How Components Work Together
  7. Security Patterns & Best Practices
  8. Configuration
  9. Security Considerations
  10. Troubleshooting

Note: For API endpoint documentation and usage examples, see Security API Documentation.

Overview

What is the Security Component?

The Security component is a comprehensive authentication and authorization system that protects the Gateway application. It ensures that only authenticated and authorized users can access protected resources, while providing a seamless user experience through automatic token management and refresh mechanisms.

Why Does It Exist?

Modern web applications need robust security to protect user data and system resources. The Security component provides:

  • Authentication: Verifying who a user is (login process)
  • Authorization: Determining what a user can do (permissions)
  • Session Management: Maintaining user sessions securely
  • Token Management: Handling JWT tokens for stateless authentication
  • OAuth Integration: Supporting third-party authentication (Microsoft, Google)
  • Attack Prevention: Protecting against common web vulnerabilities (CSRF, XSS)

What Problems Does It Solve?

  1. User Authentication: Users need to prove their identity to access the system
  2. Session Security: Sessions must be maintained securely without exposing sensitive data
  3. Token Expiration: OAuth tokens expire and need automatic refresh to avoid user disruption
  4. Multi-Tenancy: Users belong to different mandates (organizations) and must be scoped correctly
  5. Cross-Site Attacks: Protection against CSRF and XSS attacks
  6. Rate Limiting: Preventing brute force attacks and system abuse

Key Features

  • Multi-Authority Authentication: Supports three authentication methods:

    • Local Authentication: Username/password stored in the database
    • Microsoft OAuth: Single Sign-On (SSO) via Microsoft Azure AD
    • Google OAuth: Single Sign-On (SSO) via Google accounts
  • JWT Token Management:

    • Creates secure JSON Web Tokens (JWT) for authentication
    • Manages both access tokens (short-lived) and refresh tokens (long-lived)
    • Validates tokens on every request
    • Supports token revocation for LOCAL authentication
  • Cookie-Based Authentication:

    • Uses secure httpOnly cookies to store tokens (prevents JavaScript access)
    • Falls back to Authorization headers for API clients
    • Automatically configures security settings based on environment (HTTPS vs HTTP)
  • Automatic Token Refresh:

    • Background refresh of expired OAuth tokens
    • Proactive refresh before expiration
    • Non-blocking operation (doesn't slow down user requests)
  • CSRF Protection:

    • Validates CSRF tokens for state-changing operations (POST, PUT, DELETE, PATCH)
    • Exempts login and OAuth callback endpoints
    • Prevents cross-site request forgery attacks
  • Rate Limiting:

    • Built-in rate limiting for authentication endpoints
    • Prevents brute force attacks
    • Configurable limits per endpoint
  • Database-Backed Token Validation:

    • LOCAL tokens are tracked in the database
    • Supports token revocation
    • Validates token status on every request

High-Level Architecture

The Big Picture

The Security component acts as a protective layer around the entire Gateway application. Every HTTP request passes through security middleware before reaching your application code. Think of it as a security checkpoint at the entrance of a building - everyone must pass through it, and only authorized people are allowed in.

graph TB
    subgraph "Client"
        Browser[Web Browser<br/>or API Client]
    end
    
    subgraph "Security Component - Request Processing"
        CSRF[CSRF Protection<br/>Validates CSRF tokens]
        TokenRefresh[Token Refresh<br/>Refreshes expired tokens]
        Auth[Authentication<br/>Validates JWT tokens]
    end
    
    subgraph "Application"
        Routes[API Routes<br/>Your application code]
    end
    
    Browser -->|HTTP Request| CSRF
    CSRF -->|Valid CSRF| TokenRefresh
    TokenRefresh -->|Valid Token| Auth
    Auth -->|Authenticated User| Routes
    Routes -->|Response| Browser
    
    style CSRF fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    style TokenRefresh fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style Auth fill:#e1f5ff,stroke:#01579b,stroke-width:2px

Component Structure

The Security component consists of six main modules, each with a specific responsibility:

graph TB
    subgraph "Security Component"
        Auth[auth.py<br/>Authentication & User Context<br/><br/>Validates tokens and extracts user info]
        JWT[jwtService.py<br/>JWT Creation & Cookie Management<br/><br/>Creates tokens and manages cookies]
        TokenMgr[tokenManager.py<br/>OAuth Token Refresh<br/><br/>Refreshes Microsoft/Google tokens]
        TokenRefreshSvc[tokenRefreshService.py<br/>Token Refresh Orchestration<br/><br/>Coordinates token refresh operations]
        TokenRefreshMw[tokenRefreshMiddleware.py<br/>Automatic Token Refresh<br/><br/>Middleware that triggers refresh]
        CSRF[csrf.py<br/>CSRF Protection<br/><br/>Validates CSRF tokens]
    end
    
    subgraph "External Dependencies"
        Config[Configuration<br/>APP_CONFIG]
        DB[(Database<br/>PostgreSQL)]
        Interfaces[Interfaces<br/>Data Access Layer]
        Routes[Routes<br/>API Endpoints]
    end
    
    subgraph "External OAuth Providers"
        MSFT[Microsoft OAuth]
        Google[Google OAuth]
    end
    
    Auth --> JWT
    Auth --> Interfaces
    Auth --> Config
    Auth --> Routes
    
    JWT --> Config
    
    TokenMgr --> Config
    TokenMgr --> MSFT
    TokenMgr --> Google
    TokenMgr --> Interfaces
    
    TokenRefreshSvc --> TokenMgr
    TokenRefreshSvc --> Interfaces
    TokenRefreshSvc --> DB
    
    TokenRefreshMw --> TokenRefreshSvc
    
    CSRF --> Routes
    
    Routes --> Auth
    
    style Auth fill:#e1f5ff,stroke:#01579b,stroke-width:3px
    style JWT fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style TokenMgr fill:#fff3e0,stroke:#e65100,stroke-width:2px
    style TokenRefreshSvc fill:#fff3e0,stroke:#e65100,stroke-width:2px
    style TokenRefreshMw fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style CSRF fill:#fce4ec,stroke:#880e4f,stroke-width:2px

How It All Fits Together

  1. Middleware Layer (runs first): CSRF protection and token refresh middleware intercept requests
  2. Authentication Layer (runs when needed): Validates JWT tokens and extracts user information
  3. Route Layer (runs last): Your application code receives authenticated requests with user context

Request Flow

What Happens When a User Makes a Request?

Let's trace through what happens when a user makes an API request:

Step 1: Request Arrives

A user's browser or API client sends an HTTP request to the Gateway API. This request might be:

  • A GET request to fetch data
  • A POST request to create something
  • A PUT request to update something
  • A DELETE request to remove something

Step 2: CSRF Protection (if state-changing)

If the request is a state-changing operation (POST, PUT, DELETE, PATCH), the CSRF middleware checks for a CSRF token in the X-CSRF-Token header. This prevents malicious websites from making requests on behalf of the user.

Why? Imagine you're logged into the Gateway application. A malicious website could try to trick your browser into making a request to the Gateway API (like deleting your data). CSRF protection prevents this by requiring a token that only the legitimate Gateway application knows.
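
The decision made in this step can be sketched as a small pure function. This is a minimal illustration, not the code in csrf.py: the function name `csrfCheckPasses`, the exempt paths, and the way the expected token is supplied are all assumptions, since this document does not say where the server keeps the expected value.

```python
import hmac

# Methods treated as state-changing, as listed in this document.
STATE_CHANGING_METHODS = {"POST", "PUT", "DELETE", "PATCH"}

# Hypothetical exempt paths; the real list lives in csrf.py.
EXEMPT_PATHS = {"/api/local/login", "/api/msft/callback"}


def csrfCheckPasses(method: str, path: str, headers: dict, expectedToken: str) -> bool:
    """Return True if the request may continue past the CSRF check."""
    if method.upper() not in STATE_CHANGING_METHODS:
        return True  # read-only requests are not checked
    if path in EXEMPT_PATHS:
        return True  # login and OAuth callback endpoints are exempt
    supplied = headers.get("X-CSRF-Token", "")
    if not supplied:
        return False  # header missing or empty
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(supplied.encode(), expectedToken.encode())
```

A request that fails this check would be answered with 403 Forbidden before it reaches the token refresh middleware.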

Step 3: Token Refresh (if needed)

The token refresh middleware checks if the user has any expired OAuth tokens (Microsoft or Google). If so, it automatically refreshes them in the background without blocking the request.

Why? OAuth tokens expire after a certain time (usually 1 hour). Instead of waiting for the token to expire and then failing, the system proactively refreshes tokens before they expire. This happens silently in the background so users never notice.

Step 4: Authentication

The authentication layer extracts the JWT token from either:

  • An httpOnly cookie (for web browsers)
  • An Authorization header (for API clients)

It then validates the token:

  • Checks the token format (is it a valid JWT?)
  • Verifies the signature (was it signed with our secret key?)
  • Checks expiration (has it expired?)
  • Validates the user exists and is enabled
  • For LOCAL tokens, checks the database to ensure the token hasn't been revoked

Step 5: User Context Extraction

If authentication succeeds, the system extracts user information from the token:

  • Username
  • User ID
  • Mandate ID (which organization the user belongs to)
  • Authentication authority (LOCAL, MSFT, or GOOGLE)

Step 6: Route Handler Execution

Finally, the request reaches your route handler with a fully authenticated user object. Your code can trust that:

  • The user is who they claim to be
  • The user has permission to make this request (based on your route's authentication requirements)
  • The user's context (mandate, etc.) is correct

Visual Request Flow

sequenceDiagram
    participant Client as Client/Browser
    participant CSRF as CSRF Middleware
    participant TokenMw as Token Refresh Middleware
    participant Auth as Authentication Layer
    participant Route as Route Handler
    
    Client->>CSRF: HTTP Request<br/>(POST /api/data)
    
    alt State-Changing Request
        CSRF->>CSRF: Check X-CSRF-Token header
        alt Invalid CSRF Token
            CSRF-->>Client: 403 Forbidden
        else Valid CSRF Token
            CSRF->>TokenMw: Continue
        end
    else Read-Only Request
        CSRF->>TokenMw: Continue
    end
    
    TokenMw->>TokenMw: Check for expired OAuth tokens
    alt Tokens Need Refresh
        TokenMw->>TokenMw: Refresh tokens (background)
    end
    TokenMw->>Auth: Continue
    
    Auth->>Auth: Extract JWT from cookie/header
    Auth->>Auth: Validate token signature
    Auth->>Auth: Check token expiration
    Auth->>Auth: Lookup user in database
    Auth->>Auth: Validate user status
    
    alt Authentication Failed
        Auth-->>Client: 401 Unauthorized
    else Authentication Succeeded
        Auth->>Route: Request + User Object
        Route->>Route: Process request with user context
        Route-->>Client: Response
    end

Component Overview

Before diving into the details of each component, let's understand what each one does at a high level:

1. auth.py - The Authentication Core

Role: The heart of authentication. Validates tokens and extracts user information.

Think of it as: A security guard who checks IDs at the door. They verify your token (ID) is valid, check if you're allowed in, and tell the system who you are.

2. jwtService.py - Token Factory

Role: Creates JWT tokens and manages HTTP cookies.

Think of it as: A ticket office that issues tickets (tokens) and manages how they're stored (cookies).

3. tokenManager.py - OAuth Token Handler

Role: Refreshes OAuth tokens from Microsoft and Google.

Think of it as: A renewal office that extends your Microsoft/Google access passes before they expire.

4. tokenRefreshService.py - Token Refresh Coordinator

Role: Orchestrates token refresh operations, handles rate limiting, and tracks refresh attempts.

Think of it as: A manager who coordinates when and how tokens should be refreshed, ensuring we don't refresh too frequently.

5. tokenRefreshMiddleware.py - Automatic Refresh Trigger

Role: FastAPI middleware that automatically triggers token refresh when users make requests.

Think of it as: An automatic system that checks your tokens in the background and refreshes them when needed, without you having to think about it.

6. csrf.py - CSRF Protection

Role: Validates CSRF tokens to prevent cross-site request forgery attacks.

Think of it as: A bouncer who checks that requests are coming from legitimate sources, not malicious websites.

Detailed Component Explanations

1. auth.py - Authentication & User Context

What Does It Do?

The auth.py module is responsible for authenticating users on every request. Once a request has passed the middleware layer, it is the gatekeeper that determines whether the request should be allowed to proceed.

Key Components Explained

CookieAuth Class

This is a custom implementation of FastAPI's HTTPBearer security scheme. It's smart enough to check two places for authentication tokens:

  1. httpOnly Cookies (preferred for web browsers): Cookies are automatically sent by the browser, making them convenient for web applications. The httpOnly flag prevents JavaScript from accessing them, which protects against XSS attacks.

  2. Authorization Header (for API clients): Programmatic API clients (like mobile apps or scripts) send tokens in the Authorization: Bearer <token> header.

Why both? Web browsers work best with cookies (they're automatically included), while API clients prefer headers (they have more control). This dual approach supports both use cases.

Example Flow:

# When a request comes in:
1. Check cookie: request.cookies.get('auth_token')
2. If not found, check header: request.headers.get("Authorization")
3. Extract token from whichever source has it
4. Return token for validation
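
The lookup order above could be written roughly as follows. This is a sketch under assumptions: `extractToken` is a hypothetical helper that takes plain mappings instead of a FastAPI `Request`, so it can run on its own. Only the cookie name `auth_token` and the `Authorization: Bearer <token>` format come from this document.

```python
from typing import Mapping, Optional


def extractToken(cookies: Mapping[str, str], headers: Mapping[str, str]) -> Optional[str]:
    """Return the JWT from the auth_token cookie, else from a Bearer header."""
    token = cookies.get("auth_token")
    if token:
        return token  # the cookie takes precedence

    authHeader = headers.get("Authorization", "")
    scheme, _, value = authHeader.partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()

    return None  # no token found; the caller would respond with 401
```

For example, `extractToken({}, {"Authorization": "Bearer abc"})` yields `"abc"`, while a request carrying both a cookie and a header uses the cookie.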

_getUserBase Function

This is the core authentication function that performs all the security checks. Let's break down what it does step by step:

Step 1: Token Format Validation

# Checks if token has the correct JWT structure: header.payload.signature
if token.count(".") != 2:
    raise credentialsException  # Invalid format

Why? JWTs have a specific format. If the token doesn't have exactly two dots, it's not a valid JWT and we reject it immediately.

Step 2: Token Signature Verification

# Decodes and verifies the token was signed with our secret key
payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])

Why? This ensures the token was actually issued by our server. If someone tries to forge a token, they won't have our secret key, so the signature won't match and the token will be rejected.

Step 3: Extract User Information

username = payload.get("sub")  # Subject (username)
mandateId = payload.get("mandateId")  # Which organization
userId = payload.get("userId")  # User ID
authority = payload.get("authenticationAuthority")  # LOCAL, MSFT, or GOOGLE
tokenId = payload.get("jti")  # Token ID for tracking

Why? The token contains all the information we need to identify the user. We extract it so we can verify it matches what's in the database.

Step 4: User Lookup

user = appInterface.getUserByUsername(username)
if user is None:
    raise credentialsException  # User doesn't exist

Why? The token might be valid, but the user might have been deleted or the username might have changed. We always verify against the database.

Step 5: User Status Check

if not user.enabled:
    raise HTTPException(status_code=403, detail="User is disabled")

Why? Even if authentication succeeds, disabled users shouldn't be able to access the system. This provides a way to temporarily disable accounts without deleting them.

Step 6: Context Validation

if str(user.mandateId) != str(mandateId) or str(user.id) != str(userId):
    raise HTTPException(status_code=401, detail="User context has changed")

Why? If a user's mandate or ID changes (maybe they were moved to a different organization), their old tokens become invalid. This forces them to log in again with the new context.

Step 7: Database Token Validation (for LOCAL tokens)

# For LOCAL tokens, check if token exists in database and is active
if authority == AuthAuthority.LOCAL:
    active_token = appInterface.findActiveTokenById(tokenId, userId, ...)
    if not active_token:
        raise credentialsException  # Token was revoked

Why? LOCAL tokens are stored in the database so we can revoke them. If a user logs out or their token is revoked, we check the database to ensure the token is still valid.

getCurrentUser Function

This is a simple wrapper around _getUserBase that provides a clean interface for route handlers. Routes use it like this:

@router.get("/protected")
async def protected_endpoint(currentUser: User = Depends(getCurrentUser)):
    # currentUser is guaranteed to be authenticated and enabled
    return {"message": f"Hello, {currentUser.username}!"}

Why a wrapper? It provides a clear, simple interface for routes. Routes don't need to know about the internal _getUserBase function - they just use getCurrentUser and trust that it works.

Security Checks Performed

The authentication process performs multiple layers of security checks:

  1. JWT Format Validation: Ensures the token has the correct structure
  2. Signature Verification: Verifies the token was signed with our secret key
  3. Expiration Check: Ensures the token hasn't expired
  4. User Existence: Verifies the user still exists in the database
  5. User Status: Checks if the user is enabled
  6. Context Validation: Ensures token context matches user record
  7. Token Revocation: For LOCAL tokens, checks database for revocation status
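
To see how checks 1 through 6 chain together, here is a self-contained sketch that uses only the standard library. It is a stand-in, not the real module: auth.py relies on the jose library and on `appInterface`, whereas `signToken`, `verifyToken`, the in-memory `users` dict, and the demo secret below are hypothetical. The revocation check (7) is omitted because it needs the database.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"demo-secret"  # illustration only; never hard-code a real key


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64urlDecode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signingInput: str) -> str:
    digest = hmac.new(SECRET_KEY, signingInput.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def signToken(payload: dict) -> str:
    """Build an HS256-signed token in header.payload.signature form."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    return f"{header}.{body}.{_sign(header + '.' + body)}"


def verifyToken(token: str, users: dict, now: float) -> dict:
    """Run checks 1-6 in order; raise ValueError naming the first failure."""
    if token.count(".") != 2:                                   # 1. format
        raise ValueError("format")
    header, body, signature = token.split(".")
    expected = _sign(header + "." + body)
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("signature")                           # 2. signature
    payload = json.loads(_b64urlDecode(body))
    if payload.get("exp", 0) <= now:                            # 3. expiration
        raise ValueError("expired")
    user = users.get(payload.get("sub"))
    if user is None:                                            # 4. existence
        raise ValueError("unknown user")
    if not user["enabled"]:                                     # 5. status
        raise ValueError("disabled")
    if (str(user["mandateId"]) != str(payload.get("mandateId"))
            or str(user["id"]) != str(payload.get("userId"))):  # 6. context
        raise ValueError("context changed")
    return user
```

The order matters: the cheap structural and cryptographic checks run before any user lookup, so a forged token is rejected without touching user data.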

Dependencies

  • jwtService.py: Uses JWT decoding functions (via the jose library)
  • interfaceDbAppObjects: Accesses the database to look up users and validate tokens
  • datamodelUam.User: User data model
  • datamodelSecurity.Token: Token data model
  • slowapi.Limiter: Rate limiting utility (exported for use in routes)

2. jwtService.py - JWT Creation & Cookie Management

What Does It Do?

The jwtService.py module is responsible for creating JWT tokens and managing how they're stored in HTTP cookies. It's the "token factory" that issues authentication tokens.

Key Functions Explained

createAccessToken

Creates a short-lived JWT access token that users include with every request.

def createAccessToken(data: dict, expiresDelta: Optional[timedelta] = None) -> Tuple[str, datetime]:
    toEncode = data.copy()  # Work on a copy so the caller's dict is not mutated
    # Adds a unique token ID (jti) if not present
    if "jti" not in toEncode:
        toEncode["jti"] = str(uuid.uuid4())
    
    # Sets expiration time
    expire = getUtcNow() + (expiresDelta or timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES))
    toEncode.update({"exp": expire})
    
    # Signs and encodes the token
    encodedJwt = jwt.encode(toEncode, SECRET_KEY, algorithm=ALGORITHM)
    return encodedJwt, expire

What it does:

  1. Ensures every token has a unique ID (JTI) for tracking
  2. Sets an expiration time (default: 60 minutes)
  3. Signs the token with our secret key
  4. Returns both the token string and expiration time

Why short-lived? If a token is stolen, it will expire quickly, limiting the damage. Access tokens are meant to be used frequently and replaced often.

createRefreshToken

Creates a long-lived refresh token used to obtain new access tokens.

def createRefreshToken(data: dict) -> Tuple[str, datetime]:
    toEncode = data.copy()
    if "jti" not in toEncode:
        toEncode["jti"] = str(uuid.uuid4())
    toEncode["type"] = "refresh"  # Marks it as a refresh token
    
    expire = getUtcNow() + timedelta(days=REFRESH_TOKEN_EXPIRE_DAYS)  # Default: 7 days
    toEncode.update({"exp": expire})
    encodedJwt = jwt.encode(toEncode, SECRET_KEY, algorithm=ALGORITHM)
    return encodedJwt, expire

What it does:

  1. Similar to access token creation
  2. Marks the token as type "refresh"
  3. Sets a longer expiration (default: 7 days)
  4. Returns the token and expiration

Why long-lived? Refresh tokens are used less frequently (only when access tokens expire). They allow users to stay logged in for extended periods without re-entering credentials.

setAccessTokenCookie

Stores the access token in an httpOnly cookie.

def setAccessTokenCookie(response: Response, token: str, expiresDelta: Optional[timedelta] = None):
    maxAge = int(expiresDelta.total_seconds()) if expiresDelta else ACCESS_TOKEN_EXPIRE_MINUTES * 60  # Max-Age must be an integer
    response.set_cookie(
        key="auth_token",
        value=token,
        httponly=True,  # JavaScript can't access it
        secure=USE_SECURE_COOKIES,  # Only sent over HTTPS in production
        samesite="strict",  # Prevents CSRF attacks
        path="/",  # Available to entire application
        max_age=maxAge
    )

Security Settings Explained:

  • httponly=True: Prevents JavaScript from accessing the cookie. This protects against XSS attacks where malicious scripts try to steal tokens.
  • secure=USE_SECURE_COOKIES: In production (HTTPS), cookies are only sent over encrypted connections. In development (HTTP), this is disabled.
  • samesite="strict": Cookies are only sent with requests from the same site. This prevents CSRF attacks where malicious sites try to make requests with your cookies.
  • path="/": The cookie is available to all paths in the application.
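
To make the effect of these settings visible, the sketch below builds the resulting Set-Cookie value with the standard library instead of FastAPI. The helper `buildAuthCookieHeader` is hypothetical, and the rule that secure cookies are enabled when the API URL starts with `https://` is an assumption about how `USE_SECURE_COOKIES` is derived from `APP_API_URL`.

```python
from http.cookies import SimpleCookie

ACCESS_TOKEN_EXPIRE_MINUTES = 60  # default stated in this document


def buildAuthCookieHeader(token: str, apiUrl: str) -> str:
    """Build the Set-Cookie header value for the access token."""
    # Assumption: secure cookies are enabled only for HTTPS deployments.
    useSecureCookies = apiUrl.lower().startswith("https://")

    cookie = SimpleCookie()
    cookie["auth_token"] = token
    morsel = cookie["auth_token"]
    morsel["httponly"] = True             # hidden from JavaScript
    morsel["secure"] = useSecureCookies   # emitted only when truthy
    morsel["samesite"] = "Strict"         # not sent on cross-site requests
    morsel["path"] = "/"                  # valid for the whole application
    morsel["max-age"] = ACCESS_TOKEN_EXPIRE_MINUTES * 60
    return morsel.OutputString()
```

Calling it with an `http://` development URL produces the same header without the `Secure` attribute, which is what lets the cookie work over plain HTTP locally.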

setRefreshTokenCookie

Similar to setAccessTokenCookie, but for refresh tokens with longer expiration.

clearAccessTokenCookie / clearRefreshTokenCookie

These functions remove cookies when users log out. They use a dual-method approach for maximum browser compatibility:

def clearAccessTokenCookie(response: Response):
    # Method 1: Raw Set-Cookie header with expiration in the past
    response.headers.append(
        "Set-Cookie",
        "auth_token=deleted; Path=/; Max-Age=0; Expires=Thu, 01 Jan 1970 00:00:00 GMT; HttpOnly; SameSite=Strict"
    )
    
    # Method 2: FastAPI's built-in method
    response.delete_cookie(key="auth_token", path="/")

Why two methods? Different browsers handle cookie deletion differently. Using both methods ensures the cookie is deleted regardless of browser quirks.

Configuration

The module uses these configuration values:

  • APP_JWT_KEY_SECRET: The secret key used to sign tokens. Must be kept secret!
  • Auth_ALGORITHM: JWT signing algorithm (default: HS256)
  • APP_TOKEN_EXPIRY: How long access tokens are valid (default: 60 minutes)
  • APP_REFRESH_TOKEN_EXPIRE_DAYS: How long refresh tokens are valid (default: 7 days)
  • APP_API_URL: Used to determine if cookies should be secure (HTTPS) or not (HTTP)

3. tokenManager.py - OAuth Token Refresh

What Does It Do?

The tokenManager.py module handles refreshing OAuth tokens from Microsoft and Google. When users authenticate via OAuth, they receive tokens that expire after a certain time. This module refreshes those tokens before they expire, so users don't get interrupted.

Understanding OAuth Token Refresh

When a user logs in with Microsoft or Google:

  1. They're redirected to Microsoft/Google's login page
  2. After successful login, Microsoft/Google gives us:
    • An access token (short-lived, ~1 hour)
    • A refresh token (long-lived, can be used to get new access tokens)

When the access token expires, we use the refresh token to get a new access token without requiring the user to log in again.

Key Methods Explained

refreshMicrosoftToken

Refreshes a Microsoft OAuth token.

def refreshMicrosoftToken(self, refreshToken: str, userId: str, oldToken: Token) -> Optional[Token]:
    # Microsoft token refresh endpoint
    tokenUrl = f"https://login.microsoftonline.com/{self.msft_tenant_id}/oauth2/v2.0/token"
    
    # Prepare refresh request
    data = {
        "client_id": self.msft_client_id,
        "client_secret": self.msft_client_secret,
        "grant_type": "refresh_token",
        "refresh_token": refreshToken,
        "scope": "Mail.ReadWrite Mail.Send Mail.ReadWrite.Shared User.Read"
    }
    
    # Make request to Microsoft
    response = client.post(tokenUrl, data=data)
    
    if response.status_code == 200:
        tokenData = response.json()
        # Create new token object with refreshed data
        newToken = Token(
            userId=userId,
            authority=AuthAuthority.MSFT,
            connectionId=oldToken.connectionId,  # Preserve connection ID
            tokenAccess=tokenData["access_token"],
            tokenRefresh=tokenData.get("refresh_token", refreshToken),  # Keep old if new not provided
            expiresAt=createExpirationTimestamp(tokenData.get("expires_in", 3600)),
        )
        return newToken

What it does:

  1. Sends a request to Microsoft's token endpoint with the refresh token
  2. Microsoft validates the refresh token and issues a new access token
  3. Creates a new Token object with the refreshed data
  4. Preserves the connection ID (so we know which Microsoft connection this is for)

Why preserve connection ID? Users can have multiple OAuth connections (e.g., multiple Microsoft accounts). The connection ID identifies which specific connection this token belongs to.

refreshGoogleToken

Similar to refreshMicrosoftToken, but for Google's OAuth endpoint:

def refreshGoogleToken(self, refreshToken: str, userId: str, oldToken: Token) -> Optional[Token]:
    tokenUrl = "https://oauth2.googleapis.com/token"
    # ... similar process to Microsoft

refreshToken

A generic method that routes to the appropriate provider-specific refresh method:

def refreshToken(self, oldToken: Token) -> Optional[Token]:
    # Cooldown check: don't refresh if refreshed recently
    if secondsSinceLastRefresh < 10 * 60:  # 10 minutes
        return oldToken  # Return existing token
    
    # Route to appropriate provider
    if oldToken.authority == AuthAuthority.MSFT:
        return self.refreshMicrosoftToken(...)
    elif oldToken.authority == AuthAuthority.GOOGLE:
        return self.refreshGoogleToken(...)

Cooldown Mechanism: Prevents excessive refresh attempts. If a token was refreshed less than 10 minutes ago, we skip refreshing it again. This prevents hitting OAuth provider rate limits.
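
The excerpt above does not show how `secondsSinceLastRefresh` is obtained. One plausible way to express the cooldown test is sketched below. The helper `isInCooldown` and the idea of passing the last refresh timestamp explicitly are assumptions made for illustration.

```python
from typing import Optional

COOLDOWN_SECONDS = 10 * 60  # 10-minute cooldown described in this document


def isInCooldown(lastRefreshTs: Optional[float], nowTs: float,
                 cooldownSeconds: int = COOLDOWN_SECONDS) -> bool:
    """Return True if the token was refreshed too recently to refresh again."""
    if lastRefreshTs is None:
        return False  # never refreshed, so no cooldown applies
    return (nowTs - lastRefreshTs) < cooldownSeconds
```

With the default, a token refreshed at t = 1000 s is still in cooldown at t = 1599 s and becomes eligible again at t = 1600 s.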

ensureFreshToken

Proactively refreshes a token if it's about to expire:

def ensureFreshToken(self, token: Token, *, secondsBeforeExpiry: int = 30 * 60, saveCallback: Optional[Callable[[Token], None]] = None) -> Optional[Token]:
    nowTs = getUtcTimestamp()
    expiresAt = token.expiresAt or 0
    
    # If token expires within threshold (default: 30 minutes), refresh it
    if expiresAt < (nowTs + secondsBeforeExpiry):
        refreshed = self.refreshToken(token)
        if refreshed and saveCallback:
            saveCallback(refreshed)  # Persist the refreshed token
        return refreshed
    
    return token  # Token is still fresh

Why proactive? Instead of waiting for tokens to expire and then failing, we refresh them 30 minutes before expiration. This ensures tokens are always fresh when needed.

Features

  • Cooldown Mechanism: 10-minute minimum between refresh attempts prevents rate limit exhaustion
  • Automatic Persistence: Uses callback mechanism to save refreshed tokens to database
  • Proactive Refresh: Refreshes tokens 30 minutes before expiration
  • Error Handling: Gracefully handles OAuth provider failures
  • Connection Preservation: Maintains connection IDs during refresh

Dependencies

  • httpx: HTTP client for making requests to OAuth providers
  • interfaceDbAppObjects: For token persistence (via callback)
  • datamodelSecurity.Token: Token data model
  • datamodelUam.AuthAuthority: Authority enum (LOCAL, MSFT, GOOGLE)

4. tokenRefreshService.py - Token Refresh Orchestration

What Does It Do?

The tokenRefreshService.py module is a high-level service that coordinates token refresh operations. It handles multiple connections, rate limiting, and tracks refresh attempts. Think of it as the "manager" that decides when and how tokens should be refreshed.

Key Methods Explained

refresh_expired_tokens

Refreshes all expired OAuth tokens for a user.

async def refresh_expired_tokens(self, user_id: str) -> Dict[str, Any]:
    # Get all connections for the user
    connections = root_interface.getUserConnections(user_id)
    
    refreshed_count = 0
    failed_count = 0
    rate_limited_count = 0
    
    for connection in connections:
        # Only refresh expired OAuth connections
        if connection.tokenStatus == 'expired' and connection.authority in [AuthAuthority.GOOGLE, AuthAuthority.MSFT]:
            
            # Check rate limiting
            if self._is_rate_limited(connection.id):
                rate_limited_count += 1
                continue
            
            # Record attempt
            self._record_refresh_attempt(connection.id)
            
            # Refresh based on authority
            if connection.authority == AuthAuthority.GOOGLE:
                success = await self._refresh_google_token(root_interface, connection)
            elif connection.authority == AuthAuthority.MSFT:
                success = await self._refresh_microsoft_token(root_interface, connection)
            
            if success:
                refreshed_count += 1
            else:
                failed_count += 1
    
    return {
        "refreshed": refreshed_count,
        "failed": failed_count,
        "rate_limited": rate_limited_count
    }

What it does:

  1. Gets all OAuth connections for the user
  2. Filters to only expired connections
  3. Checks rate limits for each connection
  4. Attempts to refresh each expired token
  5. Tracks success/failure/rate-limited counts
  6. Returns summary of results

Why batch processing? Users might have multiple OAuth connections (e.g., both Microsoft and Google). This method refreshes all of them in one operation.

proactive_refresh

Proactively refreshes tokens that are about to expire (within 5 minutes).

async def proactive_refresh(self, user_id: str) -> Dict[str, Any]:
    connections = root_interface.getUserConnections(user_id)
    current_time = getUtcTimestamp()
    five_minutes = 5 * 60
    
    for connection in connections:
        # Only refresh active tokens that expire soon
        if (connection.tokenStatus == 'active' and 
            connection.tokenExpiresAt and
            connection.authority in [AuthAuthority.GOOGLE, AuthAuthority.MSFT]):
            
            time_until_expiry = connection.tokenExpiresAt - current_time
            if 0 < time_until_expiry <= five_minutes:
                # Refresh this token
                # ... (similar to refresh_expired_tokens)

What it does:

  1. Gets all active OAuth connections
  2. Checks which ones expire within 5 minutes
  3. Refreshes those tokens proactively
  4. Returns summary of results

Why 5 minutes? This is a safety margin. If a token expires in 5 minutes, we refresh it now to ensure it's still valid when the user needs it.
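
The boundary conditions of the `0 < time_until_expiry <= five_minutes` test are easy to misread, so here is a small worked sketch. The helper name `expires_soon` is hypothetical; the comparison itself is the one shown in the excerpt.

```python
FIVE_MINUTES = 5 * 60


def expires_soon(token_expires_at: float, now: float,
                 window_seconds: int = FIVE_MINUTES) -> bool:
    """True only for tokens that are still valid but expire within the window."""
    time_until_expiry = token_expires_at - now
    return 0 < time_until_expiry <= window_seconds
```

A token with exactly 300 seconds left qualifies, one with 301 seconds left does not, and a token that has already expired (zero or negative time left) is excluded. In this sketch such a token would be left to `refresh_expired_tokens` once its status is marked as expired.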

Rate Limiting

The service implements per-connection rate limiting:

def _is_rate_limited(self, connection_id: str) -> bool:
    now = getUtcTimestamp()
    if connection_id not in self.rate_limit_map:
        return False
    
    # Keep only attempts inside the sliding window (default: 1 hour)
    recent_attempts = [
        attempt_time for attempt_time in self.rate_limit_map[connection_id]
        if now - attempt_time < (self.refresh_window_minutes * 60)
    ]
    self.rate_limit_map[connection_id] = recent_attempts
    
    return len(recent_attempts) >= self.max_attempts_per_hour  # Default: 3

What it does:

  1. Tracks refresh attempts per connection
  2. Uses a 1-hour sliding window
  3. Limits to 3 attempts per hour per connection
  4. Prevents OAuth provider rate limit exhaustion

Why rate limiting? OAuth providers (Microsoft, Google) have rate limits. If we refresh too frequently, we'll hit those limits and all refresh attempts will fail. Rate limiting prevents this.
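
The excerpt shows the check but not how attempts are recorded. The sketch below pairs the two in a standalone class so the sliding window can be exercised directly. `RefreshRateLimiter` and its method names are hypothetical, and timestamps are passed in explicitly (instead of calling `getUtcTimestamp()`) to keep the example deterministic.

```python
from collections import defaultdict
from typing import Dict, List


class RefreshRateLimiter:
    """Per-connection sliding-window limiter (default: 3 attempts per hour)."""

    def __init__(self, max_attempts: int = 3, window_seconds: int = 60 * 60):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.attempts: Dict[str, List[float]] = defaultdict(list)

    def is_rate_limited(self, connection_id: str, now: float) -> bool:
        # Drop attempts that have fallen out of the window, then count the rest.
        recent = [t for t in self.attempts[connection_id]
                  if now - t < self.window_seconds]
        self.attempts[connection_id] = recent
        return len(recent) >= self.max_attempts

    def record_attempt(self, connection_id: str, now: float) -> None:
        self.attempts[connection_id].append(now)
```

After three recorded attempts at t = 0, 10, and 20 seconds, a fourth is blocked until t = 3600, when the first attempt leaves the window. Other connections are unaffected because each connection has its own list.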

Features

  • Rate Limiting: Maximum 3 refresh attempts per hour per connection
  • Batch Processing: Handles multiple connections in one operation
  • Status Tracking: Tracks refreshed, failed, and rate-limited counts
  • Audit Logging: Logs security events for compliance
  • Silent Operation: Doesn't block requests (runs asynchronously)

Dependencies

  • tokenManager.TokenManager: Token refresh logic
  • interfaceDbAppObjects: Database access for connections and tokens
  • auditLogger: Security event logging

5. tokenRefreshMiddleware.py - Automatic Token Refresh

What Does It Do?

The tokenRefreshMiddleware.py module provides FastAPI middleware that automatically triggers token refresh when users make requests. It runs silently in the background, so users never notice when their tokens are being refreshed.

Understanding Middleware

Middleware in FastAPI is code that wraps your route handlers: it runs before a request reaches the handler and again as the response travels back. It can:

  • Inspect requests
  • Modify requests
  • Perform background tasks
  • Block requests (return errors)

In our case, the middleware intercepts requests, checks if tokens need refreshing, and triggers refresh operations in the background.

Key Classes Explained

TokenRefreshMiddleware

Refreshes expired tokens when specific endpoints are accessed.

class TokenRefreshMiddleware(BaseHTTPMiddleware):
    def __init__(self, app, enabled: bool = True):
        super().__init__(app)
        self.enabled = enabled
        self.refresh_endpoints = {
            '/api/connections',
            '/api/files',
            '/api/chat',
            '/api/msft',
            '/api/google'
        }
    
    async def dispatch(self, request: Request, call_next: Callable) -> Response:
        if not self.enabled:
            return await call_next(request)
        
        # Check if this endpoint might need token refresh
        if not self._should_check_tokens(request):
            return await call_next(request)
        
        # Extract user ID
        user_id = self._extract_user_id(request)
        if not user_id:
            return await call_next(request)
        
        # Trigger background refresh (non-blocking)
        asyncio.create_task(self._silent_refresh_tokens(user_id))
        
        # Continue with request
        return await call_next(request)

What it does:

  1. Checks if the request is to a monitored endpoint (one that might use OAuth tokens)
  2. Extracts the user ID from the request
  3. Triggers token refresh in the background (doesn't wait for it)
  4. Continues processing the request immediately

Why only specific endpoints? Not all endpoints use OAuth tokens. We only check endpoints that are likely to need OAuth tokens (like /api/msft or /api/google).

Why non-blocking? Token refresh can take a few seconds. If we waited for it, every request would be slow. By running it in the background, requests complete immediately while tokens refresh silently.
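The fire-and-forget pattern can be sketched with the standard library alone. This is a minimal illustration, not the middleware itself: `silent_refresh`, `handle_request`, and the `background_tasks` set are hypothetical names. One detail worth noting is that the asyncio documentation recommends keeping a reference to tasks created with `asyncio.create_task`, because the event loop holds only weak references to them.

```python
import asyncio

background_tasks = set()  # strong references keep tasks alive until done
events = []


async def silent_refresh(user_id: str) -> None:
    """Stand-in for the real refresh; errors are swallowed, never raised."""
    try:
        await asyncio.sleep(0.05)  # simulates a slow OAuth round trip
        events.append(f"refreshed:{user_id}")
    except Exception:
        pass  # a failed refresh must not affect the request


async def handle_request(user_id: str) -> str:
    task = asyncio.create_task(silent_refresh(user_id))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    events.append("response-sent")  # the request does not wait for the task
    return "ok"


async def main() -> str:
    result = await handle_request("user-1")
    await asyncio.gather(*background_tasks)  # only so the demo can finish
    return result


result = asyncio.run(main())
```

The response is recorded before the refresh completes, which is the behavior the middleware relies on.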

ProactiveTokenRefreshMiddleware

Proactively refreshes tokens before they expire.

class ProactiveTokenRefreshMiddleware(BaseHTTPMiddleware):
    def __init__(self, app, enabled: bool = True, check_interval_minutes: int = 5):
        super().__init__(app)
        self.enabled = enabled
        self.check_interval_minutes = check_interval_minutes
        self.last_check = {}  # Track last check time per user
    
    async def dispatch(self, request: Request, call_next: Callable) -> Response:
        user_id = self._extract_user_id(request)
        if not user_id:
            return await call_next(request)
        
        # Check if we need to do proactive refresh (every 5 minutes)
        if self._should_check_proactive_refresh(user_id):
            asyncio.create_task(self._proactive_refresh_tokens(user_id))
            self.last_check[user_id] = getUtcTimestamp()
        
        return await call_next(request)

What it does:

  1. Extracts user ID from request
  2. Checks whether check_interval_minutes (default 5) have passed since the last proactive check for that user
  3. If so, triggers proactive refresh in background
  4. Updates last check time
  5. Continues with request

Why 5-minute interval? We don't want to check on every request (that would be wasteful). Checking every 5 minutes is frequent enough to catch tokens before they expire, but not so frequent that it impacts performance.
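The body of `_should_check_proactive_refresh` is not shown above. A plausible sketch, assuming timestamps in seconds and a per-user dictionary like `last_check`, might look as follows; the function below is an illustration, not the actual implementation.

```python
from typing import Dict

CHECK_INTERVAL_MINUTES = 5
last_check: Dict[str, float] = {}  # user ID -> time of last proactive check


def should_check_proactive_refresh(user_id: str, now: float) -> bool:
    """True on the first request from a user, then once per interval."""
    last = last_check.get(user_id)
    if last is None:
        return True
    return now - last >= CHECK_INTERVAL_MINUTES * 60


decisions = []
for now in (0.0, 60.0, 300.0):
    due = should_check_proactive_refresh("user-1", now)
    decisions.append(due)
    if due:
        last_check["user-1"] = now  # mirrors the update in dispatch()
```

A dictionary keyed by user ID grows with the number of distinct users seen, so a long-running process may want to evict stale entries.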

Monitored Endpoints

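TokenRefreshMiddleware only checks these endpoints (its refresh_endpoints set); ProactiveTokenRefreshMiddleware, as shown above, does not filter by path: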

  • /api/connections - Managing OAuth connections
  • /api/files - File operations that might use OAuth
  • /api/chat - Chat features that might use OAuth
  • /api/msft - Microsoft-specific operations
  • /api/google - Google-specific operations

Why these? These endpoints are most likely to use OAuth tokens. Checking all endpoints would be wasteful.
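How `_should_check_tokens` matches a request path against this set is not shown. One reasonable approach is prefix matching on path segments, sketched below; the matching rule is an assumption, and `should_check_tokens` is a hypothetical stand-alone function.

```python
REFRESH_ENDPOINTS = {
    "/api/connections",
    "/api/files",
    "/api/chat",
    "/api/msft",
    "/api/google",
}


def should_check_tokens(path: str) -> bool:
    """Match the endpoint itself or any sub-path below it."""
    return any(
        path == prefix or path.startswith(prefix + "/")
        for prefix in REFRESH_ENDPOINTS
    )


checks = {
    "/api/msft/mail": should_check_tokens("/api/msft/mail"),
    "/api/files": should_check_tokens("/api/files"),
    "/api/users": should_check_tokens("/api/users"),
    "/api/filesystem": should_check_tokens("/api/filesystem"),
}
```

Requiring a `/` after the prefix keeps unrelated paths such as `/api/filesystem` from matching `/api/files`.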

Features

  • Asynchronous Background Refresh: Doesn't block requests
  • Endpoint-Specific Triggering: Only checks relevant endpoints
  • User ID Extraction: Automatically extracts user ID from request context
  • Configurable Intervals: Default 5 minutes for proactive refresh
  • Silent Operation: Errors don't affect request processing

Dependencies

  • tokenRefreshService: Refresh orchestration service

6. csrf.py - CSRF Protection

What Does It Do?

The csrf.py module provides CSRF (Cross-Site Request Forgery) protection middleware. It validates CSRF tokens for state-changing operations to prevent malicious websites from making requests on behalf of authenticated users.

Understanding CSRF Attacks

Imagine this scenario:

  1. You're logged into the Gateway application
  2. You visit a malicious website
  3. The malicious website contains code that makes a request to the Gateway API (e.g., delete your account)
  4. Your browser automatically includes your authentication cookies with the request
  5. The Gateway API sees your valid authentication and executes the malicious request

CSRF protection prevents this by requiring a special token (CSRF token) in a custom request header. A malicious website cannot read the token, and a plain cross-site form submission cannot set custom headers, so its requests are rejected.

Key Features Explained

Protected Methods

Only state-changing HTTP methods are protected:

  • POST: Creating new resources
  • PUT: Updating existing resources
  • DELETE: Deleting resources
  • PATCH: Partial updates

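Why only these? GET requests are expected to be read-only (HTTP defines GET, HEAD, and OPTIONS as safe methods), so only requests that modify data need CSRF protection. This holds only as long as no GET route changes state.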

Exempt Paths

These paths are exempt from CSRF protection:

  • /api/local/login - Login endpoint (users aren't authenticated yet)
  • /api/local/register - Registration endpoint (users aren't authenticated yet)
  • /api/msft/login - Microsoft OAuth login
  • /api/google/login - Google OAuth login
  • /api/msft/callback - Microsoft OAuth callback
  • /api/google/callback - Google OAuth callback

Why exempt? These endpoints either don't require authentication (login/register) or are called by OAuth providers (callbacks). CSRF protection isn't needed or would interfere with OAuth flows.

Token Validation

The middleware checks for a CSRF token in the X-CSRF-Token header:

async def dispatch(self, request: Request, call_next):
    # Skip CSRF check for exempt paths
    if request.url.path in self.exempt_paths:
        return await call_next(request)
    
    # Skip CSRF check for non-state-changing methods
    if request.method not in self.protected_methods:
        return await call_next(request)
    
    # Skip OPTIONS requests (CORS preflight); OPTIONS is not a protected method, so the check above already lets it through
    if request.method == "OPTIONS":
        return await call_next(request)
    
    # Get CSRF token from header
    csrf_token = request.headers.get("X-CSRF-Token")
    if not csrf_token:
        return JSONResponse(status_code=403, content={"detail": "CSRF token missing"})
    
    # Validate token format
    if not self._is_valid_csrf_token(csrf_token):
        return JSONResponse(status_code=403, content={"detail": "Invalid CSRF token format"})
    
    return await call_next(request)

Token Format Validation

The middleware validates that the CSRF token has a valid format:

def _is_valid_csrf_token(self, token: str) -> bool:
    if not token or not isinstance(token, str):
        return False
    
    # Length validation (16-64 characters)
    if len(token) < 16 or len(token) > 64:
        return False
    
    # Must parse as hexadecimal (note: int() also tolerates a 0x prefix, a sign, underscores, and surrounding whitespace)
    try:
        int(token, 16)
        return True
    except ValueError:
        return False

Why format validation? This is a basic check that rejects missing or malformed tokens. It does not verify that the token was issued by the Gateway: any hex string of 16 to 64 characters passes. Stronger validation, such as comparing the token against a value stored in the session, could be added.
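Because `int(token, 16)` is lenient about prefixes and separators, a stricter format check can be written with a regular expression. The sketch below compares the two; `is_strict_hex_token` and `loose_check` are hypothetical names, and `secrets.token_hex` is shown as one way a client or server could generate a token in the accepted format.

```python
import re
import secrets

_HEX_TOKEN = re.compile(r"[0-9a-fA-F]{16,64}")


def is_strict_hex_token(token: object) -> bool:
    """Accept only 16 to 64 hexadecimal digits, nothing else."""
    return isinstance(token, str) and _HEX_TOKEN.fullmatch(token) is not None


def loose_check(token: str) -> bool:
    """The int()-based check from above, reproduced for comparison."""
    if len(token) < 16 or len(token) > 64:
        return False
    try:
        int(token, 16)
        return True
    except ValueError:
        return False


sample = secrets.token_hex(16)  # 32 hex characters
tricky = "0x" + "ab" * 8        # 18 characters, parses with int(..., 16)

strict_sample = is_strict_hex_token(sample)
loose_tricky = loose_check(tricky)
strict_tricky = is_strict_hex_token(tricky)
```

The prefixed string passes the `int()` check but fails the regular expression, which accepts hexadecimal digits only.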

How It Works

  1. Request Arrives: User makes a POST/PUT/DELETE/PATCH request
  2. Path Check: Is this an exempt path? If yes, skip CSRF check
  3. Method Check: Is this a protected method? If no, skip CSRF check
  4. Token Extraction: Get CSRF token from X-CSRF-Token header
  5. Token Validation: Check if token exists and has valid format
  6. Request Processing: If valid, continue. If invalid, return 403 Forbidden

Features

  • State-Changing Methods Only: Only protects POST, PUT, DELETE, PATCH
  • Exempt Paths: Login, registration, and OAuth callbacks are exempt
  • Token Format Validation: Basic validation prevents malformed tokens
  • Header-Based: Uses standard X-CSRF-Token header
  • CORS-Aware: Skips OPTIONS requests (CORS preflight)

How Components Work Together

The Complete Picture

Now that we understand each component individually, let's see how they work together to provide comprehensive security.

Request Processing Flow

Here's what happens when a user makes a request:

sequenceDiagram
    participant Client
    participant CSRF as CSRF Middleware
    participant TokenMw as Token Refresh Middleware
    participant Auth as Authentication (auth.py)
    participant JWT as JWT Service
    participant Route as Route Handler
    
    Client->>CSRF: HTTP Request<br/>(POST /api/data)
    
    alt State-Changing Request
        CSRF->>CSRF: Validate X-CSRF-Token
        alt Invalid Token
            CSRF-->>Client: 403 Forbidden
        else Valid Token
            CSRF->>TokenMw: Continue
        end
    else Read-Only Request
        CSRF->>TokenMw: Continue
    end
    
    TokenMw->>TokenMw: Check if endpoint needs refresh
    alt Needs Refresh Check
        TokenMw->>TokenMw: Extract User ID
        TokenMw->>TokenMw: Trigger background refresh<br/>(async, non-blocking)
    end
    TokenMw->>Auth: Continue
    
    Auth->>Auth: Extract token from cookie/header
    Auth->>JWT: Decode token (via jose library)
    JWT-->>Auth: Token payload
    Auth->>Auth: Validate signature, expiration
    Auth->>Auth: Lookup user in database
    Auth->>Auth: Validate user status & context
    
    alt Authentication Failed
        Auth-->>Client: 401 Unauthorized
    else Authentication Succeeded
        Auth->>Route: Request + User Object
        Route->>Route: Process request
        Route-->>Client: Response
    end

Token Refresh Flow

When tokens need to be refreshed:

sequenceDiagram
    participant Request as API Request
    participant Middleware as TokenRefreshMiddleware
    participant Service as TokenRefreshService
    participant TokenMgr as TokenManager
    participant Provider as OAuth Provider
    participant Interface as Interface Layer
    participant DB as Database
    
    Request->>Middleware: HTTP Request<br/>(to /api/connections)
    Middleware->>Middleware: Extract User ID
    Middleware->>Middleware: Check Endpoint<br/>(should refresh?)
    
    Middleware->>Service: refresh_expired_tokens(userId)<br/>(async, background)
    Service->>Interface: getUserConnections(userId)
    Interface->>DB: Query Connections
    DB-->>Interface: Connection Records
    Interface-->>Service: Connections List
    
    loop For Each Expired Connection
        Service->>Service: Check Rate Limit
        alt Rate Limited
            Service->>Service: Skip Connection
        else Not Rate Limited
            Service->>Interface: getConnectionToken(connectionId)
            Interface->>DB: Query Token
            DB-->>Interface: Token Record
            Interface-->>Service: Token Object
            
            Service->>TokenMgr: refreshToken(token)
            
            alt Microsoft Token
                TokenMgr->>Provider: POST /token<br/>(refresh_token grant)
                Provider-->>TokenMgr: New Access Token
            else Google Token
                TokenMgr->>Provider: POST /token<br/>(refresh_token grant)
                Provider-->>TokenMgr: New Access Token
            end
            
            TokenMgr->>TokenMgr: Create New Token Object
            TokenMgr-->>Service: Refreshed Token
            
            Service->>Interface: saveConnectionToken(token)
            Interface->>DB: Update Token Record
            
            Service->>Interface: Update Connection Status
            Interface->>DB: Update Connection
        end
    end
    
    Service-->>Middleware: Refresh Results<br/>(refreshed, failed, rate_limited)
    
    Note over Request,DB: Request continues normally<br/>(non-blocking)
    Request->>Request: Process Request
    Request-->>Request: Return Response

Middleware Stack Order

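The middleware is registered in app.py in this order. Starlette, which FastAPI builds on, wraps each newly added middleware around the existing stack, so the middleware registered last runs first (execution order is reverse):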

  1. CORS Middleware (FastAPI built-in) - Handles cross-origin requests
  2. CSRF Middleware - Validates CSRF tokens
  3. TokenRefreshMiddleware - Refreshes expired tokens
  4. ProactiveTokenRefreshMiddleware - Proactively refreshes tokens

Execution Flow:

Request → ProactiveTokenRefreshMiddleware → TokenRefreshMiddleware → CSRFMiddleware → CORS → Route Handler

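Each middleware processes the request in sequence. If any middleware rejects the request (e.g., CSRF validation fails), the request stops there, an error is returned, and the remaining middleware and the route handler are not called. Because the refresh middlewares run before CSRFMiddleware in this order, a background refresh may already have been scheduled when a request is rejected.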
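The reversal between registration order and execution order can be demonstrated without any framework. The sketch below is a simplification using plain function wrappers, not Starlette code; `make_middleware` and `route_handler` are hypothetical names.

```python
from typing import Callable, List

Handler = Callable[[str], str]
trace: List[str] = []


def make_middleware(name: str) -> Callable[[Handler], Handler]:
    def wrap(inner: Handler) -> Handler:
        def handler(request: str) -> str:
            trace.append(name)  # runs on the way in
            return inner(request)
        return handler
    return wrap


def route_handler(request: str) -> str:
    trace.append("Route Handler")
    return "response"


# Register in the documented order; each registration wraps the stack so far
app: Handler = route_handler
for name in ["CORS", "CSRFMiddleware", "TokenRefreshMiddleware",
             "ProactiveTokenRefreshMiddleware"]:
    app = make_middleware(name)(app)

response = app("GET /api/files")
```

After the call, `trace` lists the layers from outermost to innermost, matching the execution flow given above.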

Component Dependencies

auth.py depends on:

  • jwtService.py - For JWT decoding (via jose library)
  • interfaceDbAppObjects - For user and token database access
  • datamodelUam.User - User model
  • datamodelSecurity.Token - Token model
  • slowapi.Limiter - Rate limiting utility (exported for use in routes)

tokenManager.py depends on:

  • httpx - HTTP client for OAuth API calls
  • interfaceDbAppObjects - For token persistence (via callback)
  • datamodelSecurity.Token - Token model

tokenRefreshService.py depends on:

  • tokenManager.TokenManager - Token refresh logic
  • interfaceDbAppObjects - For connection and token access
  • auditLogger - Security event logging

tokenRefreshMiddleware.py depends on:

  • tokenRefreshService - Refresh orchestration service

Interface Layer Integration

Security components interact with the interface layer for:

  1. User Lookup: interface.getUserByUsername(username) - Retrieves user by username
  2. Token Management:
    • interface.findActiveTokenById() - Validates LOCAL tokens against database
    • interface.saveConnectionToken() - Persists refreshed OAuth tokens
  3. Connection Management:
    • interface.getUserConnections() - Gets all connections for a user
    • interface.getConnectionToken() - Retrieves token for a connection

Abstraction Benefits:

  • Security components don't need database connection details
  • User context (mandateId) is automatically handled by interfaces
  • Consistent data access patterns across the application
  • Easier testing with mock interfaces

Security Patterns & Best Practices

1. Defense in Depth

Multiple layers of security validation ensure that if one layer fails, others still protect the system:

  1. Middleware Layer: CSRF protection, token refresh
  2. Authentication Layer: JWT validation, user verification
  3. Database Layer: Token status tracking, user status checks
  4. Route Layer: Explicit authentication dependencies

Why? No single security measure is perfect. Multiple layers provide redundancy and make attacks much harder.

2. Token Security

  • httpOnly Cookies: Prevents XSS attacks from accessing tokens
  • Secure Flag: Automatically enabled for HTTPS environments
  • SameSite=Strict: Stops the browser from sending the cookie on cross-site requests, which mitigates CSRF attacks
  • JWT Expiration: Short-lived access tokens (configurable, default 60 minutes)
  • Refresh Tokens: Longer-lived tokens stored securely (default 7 days)
  • Token Revocation: Database-backed revocation for LOCAL tokens

Why? Tokens are sensitive. These measures ensure they're stored and transmitted securely.
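The cookie attributes listed above can be illustrated with the standard library. This is a sketch only: the cookie name `access_token` and the `build_auth_cookie` helper are placeholders, not the names used by the Gateway.

```python
from http.cookies import SimpleCookie


def build_auth_cookie(token: str, max_age_seconds: int, secure: bool) -> str:
    """Build a Set-Cookie header value with hardened attributes."""
    cookie = SimpleCookie()
    cookie["access_token"] = token      # cookie name is a placeholder
    morsel = cookie["access_token"]
    morsel["httponly"] = True           # not readable from JavaScript
    morsel["secure"] = secure           # only sent over HTTPS when True
    morsel["samesite"] = "Strict"       # not sent on cross-site requests
    morsel["max-age"] = max_age_seconds
    morsel["path"] = "/"
    return morsel.OutputString()


header_value = build_auth_cookie("example-jwt", 60 * 60, secure=True)
insecure_value = build_auth_cookie("example-jwt", 60 * 60, secure=False)
```

In a FastAPI application the same attributes are normally passed to `Response.set_cookie` through its `httponly`, `secure`, and `samesite` parameters.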

3. OAuth Token Management

  • Automatic Refresh: Background refresh prevents user disruption
  • Rate Limiting: Prevents OAuth provider exhaustion (max 3 attempts per hour)
  • Cooldown Period: Prevents excessive refresh attempts (10-minute minimum)
  • Proactive Refresh: Refreshes before expiration (30-minute threshold)
  • Error Handling: Graceful degradation on refresh failures

Why? OAuth tokens expire frequently. Automatic refresh ensures users don't get interrupted, while rate limiting prevents hitting provider limits.

4. User Context Validation

  • Mandate Scoping: Ensures user operates within correct mandate
  • User ID Validation: Verifies token user ID matches database
  • Status Checks: Validates user is enabled and active
  • Context Mismatch Detection: Forces re-authentication on context changes

Why? Users can belong to multiple organizations (mandates). Context validation ensures they can only access data for their current mandate.
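The checks above can be sketched as a single function. The payload keys `userId` and `mandateId` and the user fields `id`, `mandateId`, and `enabled` are assumptions made for illustration, as is the `validate_user_context` helper itself.

```python
from typing import Mapping, Optional


def validate_user_context(payload: Mapping[str, object],
                          user: Optional[Mapping[str, object]]) -> Optional[str]:
    """Return None when the context is valid, else a reason for logging."""
    if user is None:
        return "user not found"
    if not user.get("enabled", False):
        return "user disabled"
    if payload.get("userId") != user.get("id"):
        return "user id mismatch"
    if payload.get("mandateId") != user.get("mandateId"):
        return "mandate mismatch"
    return None


user = {"id": "u1", "mandateId": "m1", "enabled": True}
ok = validate_user_context({"userId": "u1", "mandateId": "m1"}, user)
switched = validate_user_context({"userId": "u1", "mandateId": "m2"}, user)
disabled = validate_user_context({"userId": "u1", "mandateId": "m1"},
                                 {**user, "enabled": False})
```

The returned reason is meant for server-side logs. The client would still receive a generic 401 so that the response does not reveal which check failed.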

5. CSRF Protection

  • State-Changing Methods: Only protects POST, PUT, DELETE, PATCH
  • Exempt Paths: Login and OAuth endpoints exempted
  • Token Format Validation: Basic validation prevents malformed tokens
  • Header-Based: Uses X-CSRF-Token header (standard pattern)

Why? CSRF attacks are common. This protection prevents malicious websites from making requests on behalf of authenticated users.

6. Error Handling

  • Consistent Error Responses: Standard HTTP status codes
  • Security-Aware Logging: Logs security events without exposing sensitive data
  • Graceful Degradation: Token refresh failures don't block requests
  • Audit Logging: Security events logged for compliance

Why? Proper error handling prevents information leakage and ensures the system degrades gracefully when things go wrong.

Configuration

Required Environment Variables

# JWT Configuration
APP_JWT_KEY_SECRET=<secret-key-for-jwt-signing>
Auth_ALGORITHM=HS256
APP_TOKEN_EXPIRY=60  # minutes
APP_REFRESH_TOKEN_EXPIRY=7  # days

# API Configuration
APP_API_URL=https://api.example.com  # Determines secure cookie flag

# OAuth Configuration (for token refresh)
Service_MSFT_CLIENT_ID=<microsoft-client-id>
Service_MSFT_CLIENT_SECRET=<microsoft-client-secret>
Service_MSFT_TENANT_ID=<tenant-id-or-common>

Service_GOOGLE_CLIENT_ID=<google-client-id>
Service_GOOGLE_CLIENT_SECRET=<google-client-secret>
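How the application reads these variables is not shown here. The sketch below is one way to load and validate the JWT-related settings at startup. The `SecuritySettings` class, the `load_settings` function, the `HS256` fallback, and the rule that an `https://` API URL enables secure cookies are all assumptions. The 60-minute and 7-day defaults follow the values stated in this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SecuritySettings:
    jwt_secret: str
    algorithm: str
    token_expiry_minutes: int
    refresh_token_expiry_days: int
    secure_cookies: bool


def load_settings(env: Mapping[str, str] = os.environ) -> SecuritySettings:
    """Read settings from a mapping; fail fast if the secret is missing."""
    secret = env.get("APP_JWT_KEY_SECRET")
    if not secret:
        raise RuntimeError("APP_JWT_KEY_SECRET must be set")
    return SecuritySettings(
        jwt_secret=secret,
        algorithm=env.get("Auth_ALGORITHM", "HS256"),
        token_expiry_minutes=int(env.get("APP_TOKEN_EXPIRY", "60")),
        refresh_token_expiry_days=int(env.get("APP_REFRESH_TOKEN_EXPIRY", "7")),
        # Assumed rule: an https:// API URL turns on the Secure cookie flag
        secure_cookies=env.get("APP_API_URL", "").startswith("https://"),
    )


settings = load_settings({
    "APP_JWT_KEY_SECRET": "change-me",
    "APP_API_URL": "https://api.example.com",
})
```

Taking the environment as a parameter makes the loader easy to test without touching the real process environment.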

Middleware Configuration

In app.py:

# CSRF protection
app.add_middleware(CSRFMiddleware)

# Token refresh middleware
app.add_middleware(TokenRefreshMiddleware, enabled=True)

# Proactive token refresh
app.add_middleware(
    ProactiveTokenRefreshMiddleware, 
    enabled=True, 
    check_interval_minutes=5
)

Security Considerations

Best Practices Implemented

  1. Never Log Tokens: Tokens are never logged in plaintext
  2. Secure Cookie Configuration: httpOnly, secure (HTTPS), SameSite=Strict
  3. Token Expiration: Short-lived access tokens reduce attack window
  4. Database Validation: LOCAL tokens validated against database for revocation
  5. Rate Limiting: Prevents brute force and DoS attacks
  6. Context Validation: Prevents token reuse across mandates/users
  7. Error Messages: Generic error messages don't leak information

Potential Improvements

  1. CSRF Token Storage: Currently validates format only; could add session-based validation
  2. Token Rotation: Could implement refresh token rotation for enhanced security
  3. IP Validation: Could add IP address validation for token usage
  4. Device Fingerprinting: Could track devices for additional security
  5. MFA Support: Could add multi-factor authentication support
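As an illustration of the first improvement, a CSRF token can be bound to a session with an HMAC so that the server can verify it without storing it. This is a sketch under stated assumptions, not a planned design: `issue_csrf_token`, `verify_csrf_token`, the session identifier, and the secret key are all hypothetical. The token is kept at 64 hex characters so that it still passes the existing format check.

```python
import hashlib
import hmac
import secrets


def _signature(session_id: str, nonce: str, secret_key: bytes) -> str:
    message = f"{session_id}:{nonce}".encode()
    # Truncate to 48 hex characters so nonce + signature is 64 characters
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()[:48]


def issue_csrf_token(session_id: str, secret_key: bytes) -> str:
    nonce = secrets.token_hex(8)  # 16 hex characters
    return nonce + _signature(session_id, nonce, secret_key)


def verify_csrf_token(token: str, session_id: str, secret_key: bytes) -> bool:
    if len(token) != 64 or not token.isascii():
        return False
    nonce, signature = token[:16], token[16:]
    expected = _signature(session_id, nonce, secret_key)
    return hmac.compare_digest(signature, expected)  # constant-time compare


key = secrets.token_bytes(32)
token = issue_csrf_token("session-a", key)
valid_for_owner = verify_csrf_token(token, "session-a", key)
valid_for_other = verify_csrf_token(token, "session-b", key)
```

A token issued for one session fails verification for another, which format validation alone cannot detect.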

Troubleshooting

Common Issues

  1. 401 Unauthorized:

    • Check token expiration
    • Verify user status (enabled?)
    • Check for context mismatch (mandateId/userId changes)
    • For LOCAL tokens, verify token exists in database and is active
  2. 403 Forbidden:

    • Check CSRF token header (X-CSRF-Token)
    • Verify user enabled status
    • Check if path is exempt from CSRF protection
  3. Token Refresh Failures:

    • Check OAuth provider credentials
    • Verify rate limits haven't been exceeded
    • Check OAuth provider status
    • Review logs for specific error messages
  4. Cookie Not Set:

    • Verify HTTPS in production (cookies with the Secure flag are only sent over HTTPS)
    • Check SameSite settings
    • Verify cookie path settings

Debugging

Enable debug logging:

import logging
logging.getLogger("modules.security").setLevel(logging.DEBUG)

Check audit logs for security events:

# Security events are logged via audit_logger.logSecurityEvent()