External Integrations Architecture & Operations
1. Introduction
In line with our wider pipeline architecture, we have developed a set of pipelines to handle integrations with external third-party API providers. These pipelines provide a consistent, secure, and operationally robust mechanism for invoking external services such as credit reference agencies, while ensuring that consuming systems do not need to handle provider-specific authentication, API behaviour, retries, or error handling.
The integrations are implemented as Ruby-based pipelines that can be executed in two primary ways:
- Command-line execution, supporting operational use cases, testing, and controlled re-runs
- HTTP-triggered execution, enabling invocation from upstream applications and user interfaces (for example, via Power Automate)
At present, the following external integration pipelines are in scope:
- Experian Pipeline – supporting company and individual credit-related API calls
- Creditsafe Pipeline – supporting company, director, and consumer credit-related API calls
Each pipeline follows a shared architectural pattern covering:
- explicit user intent
- authentication and token management
- execution control and retry handling
- structured logging and auditability
- secure persistence of responses where required
This document describes the architecture, processing model, security controls (such as authentication, authorisation, and data protection), and operational considerations for these external integrations. It is intentionally focused on the integration layer and does not attempt to fully document downstream processes unless they are directly impacted by the integration process.
2. Scope & Objectives
2.1 In Scope
- Architecture of the external integration pipelines
- Interaction patterns with Experian and Creditsafe
- Error handling, retries, and reprocessing controls
- Security, audit, and compliance considerations
- Operational run-book and support model
2.2 Out of Scope
- Business intelligence and reporting models
- Detailed UI implementation (covered in separate documentation)
- Integration with Alph4 APIs (currently in the planning stage)
3. Architecture
3.1 Detailed Architecture Overview
The external integration pipelines provide a dedicated integration layer between Leasepath and third-party API providers. This layer is responsible for executing external API calls in a controlled, secure, and observable manner, while shielding upstream systems from provider-specific concerns.
The integration layer is responsible for:
- authentication and token management
- execution control (timeouts, retries, backoff)
- provider-specific request and response handling
- structured logging and auditability
- persistence of external responses for downstream consumption
Upstream systems interact with the integration layer via an HTTP interface or controlled command-line execution. They do not interact directly with third-party providers.
```mermaid
flowchart LR
    A[Upstream System / UI] --> B[HTTP Layer]
    B --> C[Integration Pipeline]
    C --> D[External API Provider]
    C --> E[Logging & Monitoring]
    C --> F[Persistent Storage Dynamics / Data Lake]
```
3.2 Execution Environment & Deployment Model
The integration pipelines are implemented as Ruby services and are deployed on Windows Server. They are designed to support asynchronous invocation via an HTTP interface, where incoming requests initiate a pipeline execution and return immediately, with results persisted for later consumption. Pipelines may also be executed directly via the command line for operational and support purposes.
Key characteristics include:
- environment-specific configuration supplied via environment variables
- no hard-coded credentials or endpoints
- consistent execution behaviour regardless of invocation method
3.3 Authentication & Provider Isolation
Authentication with third-party providers is handled within the integration layer using provider-specific token mechanisms. Token acquisition, caching, and refresh logic is encapsulated within the pipelines and is not exposed to upstream callers.
This ensures that:
- consuming systems are not coupled to provider authentication models
- credentials are centrally managed and rotated
- provider-specific behaviour is isolated to the relevant pipeline
3.4 Observability, Audit & Persistence
All pipeline executions are fully observable and auditable. Each execution is associated with correlation identifiers that link:
- the originating request
- external API calls
- logs and metrics
- persisted response artefacts
External API responses are persisted to Dynamics tables (and, where appropriate, additional storage such as SharePoint).
3.5 Architectural Constraints & Assumptions
The integration architecture operates under the following explicit constraints:
- **User-initiated execution only.** Credit-impacting API calls are only executed as a result of explicit user actions. Automated or scheduled credit checks are not permitted.
- **Provider dependency.** The integration layer depends on the availability and contractual stability of third-party provider APIs.
- **Fail-fast behaviour.** Invalid input, authentication failures, or unrecoverable errors result in immediate failure rather than silent degradation.
- **Incremental evolution.** The architecture is designed to evolve incrementally, with improvements to idempotency, observability, and audit integration introduced without changing upstream contracts.
4. External Integration Execution Model
4.1 Pipeline Execution
Each pipeline follows a consistent execution lifecycle:
- Input validation and intent verification
- Authentication token acquisition
- External API request execution
- Response validation and normalisation
- Persistence (where applicable)
- Structured logging
```mermaid
sequenceDiagram
    participant Caller
    participant Pipeline
    participant Provider
    participant Log
    Caller->>Pipeline: Execute request
    Pipeline->>Provider: API call
    Provider-->>Pipeline: Response
    Pipeline->>Log: Write audit & metrics
    Pipeline-->>Caller: Result / Error
```
The caller does not wait for the external API call to complete; execution continues asynchronously after the initial request is accepted.
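The sketch below illustrates this lifecycle in Ruby. The class and method names (`IntegrationPipeline`, `validate!`, `normalise`, and the injected collaborators) are illustrative assumptions; the actual pipelines follow the same sequence but are structured per provider.

```ruby
require "json"
require "logger"
require "securerandom"

# Illustrative lifecycle only; names and structure are assumptions,
# not the actual pipeline classes.
class IntegrationPipeline
  def initialize(token_provider:, client:, store:, logger: Logger.new($stdout))
    @token_provider = token_provider
    @client         = client
    @store          = store
    @logger         = logger
  end

  def run(request)
    correlation_id = request[:correlation_id] || SecureRandom.uuid

    validate!(request)                                # 1. input validation & intent verification
    token    = @token_provider.get                    # 2. authentication token acquisition
    response = @client.call(request, token: token)    # 3. external API request execution
    result   = normalise(response)                    # 4. response validation & normalisation
    @store.save(correlation_id, result)               # 5. persistence (where applicable)
    @logger.info({ correlation_id: correlation_id, status: "ok" }.to_json) # 6. structured logging
    result
  rescue StandardError => e
    @logger.error({ correlation_id: correlation_id, error: e.class.name, message: e.message }.to_json)
    raise # fail fast rather than degrade silently
  end

  private

  def validate!(request)
    raise ArgumentError, "explicit action/intent is required" unless request[:action]
  end

  def normalise(response)
    JSON.parse(response.body)
  end
end
```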
4.2 Request, Response & Execution Model
The external integration pipelines expose a well-defined execution model for invoking third-party APIs. Each pipeline defines the inputs it accepts, the external services it interacts with, and the controls applied during execution.
Each integration is defined in terms of:
- **Request inputs.** Business identifiers and parameters required to determine the external API call to be made.
- **Execution options.** Runtime controls that govern how the integration behaves, including timeouts, retry limits, backoff strategy, and diagnostic modes.
- **External API interactions.** The third-party endpoints invoked and any provider-specific constraints or behaviours.
- **Response handling.** How responses are returned, logged, and optionally persisted for audit or reprocessing purposes.
- **Correlation and traceability.** Identifiers used to link requests, external API calls, logs, and persisted artefacts across the execution lifecycle.
This model ensures that each external API call is executed in a controlled, observable, and auditable manner, with clear separation between business intent, execution behaviour, and operational concerns.
4.2.1 Request Inputs & Identifiers
Each pipeline requires a minimal set of business identifiers to determine the external API call to be executed. Examples include:
- Company registration number
- Provider-specific identifiers (e.g. `connectId`, `peopleId`)
- Explicit action or intent (e.g. search vs credit report)
- Correlation ID
These identifiers are validated prior to execution and are included in structured logs to support traceability.
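By way of illustration, a validated set of inputs for a company-level request might look like the hash below; the key names and values are examples only, and each pipeline defines the identifiers it actually accepts.

```ruby
require "securerandom"

# Example inputs only; keys and values are illustrative, not a fixed contract.
request = {
  action:              "company_search",  # explicit intent (search vs credit report)
  registration_number: "01234567",        # company registration number
  connect_id:          nil,               # provider-specific identifier, if already known
  correlation_id:      SecureRandom.uuid, # supplied by the caller or generated per execution
}
```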
4.2.2 Execution Options (Runtime Controls)
In addition to business identifiers, each pipeline exposes a set of execution options that control how API calls are made. These options are consistent across providers to ensure predictable operational behaviour.
Common execution options include:
| Option | Description | Purpose |
|---|---|---|
| `http_timeout_seconds` | HTTP open/read timeout for external API calls | Prevents indefinite blocking |
| `http_max_attempts` | Maximum number of retry attempts | Bounds retry behaviour |
| `http_backoff_base_seconds` | Base delay for exponential backoff | Controls retry pacing |
| `http_backoff_max_seconds` | Maximum backoff delay | Prevents excessive wait times |
| `debug` | Enables logging of raw API responses | Diagnostic use only |
| `dry_run` | Logs intended API calls without executing them | Safe testing and validation |
These options are available whether the pipeline is invoked via the command line or through the HTTP layer, ensuring consistent behaviour across execution contexts.
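A minimal sketch of how these options might be merged with defaults is shown below; the timeout and retry defaults mirror section 6.2, while the backoff defaults and the merging code itself are illustrative assumptions.

```ruby
# Conservative defaults, overridable per execution; the backoff values shown
# here are assumptions for illustration.
DEFAULT_OPTIONS = {
  http_timeout_seconds:      30,
  http_max_attempts:         4,
  http_backoff_base_seconds: 1,
  http_backoff_max_seconds:  30,
  debug:                     false,
  dry_run:                   false,
}.freeze

def execution_options(overrides = {})
  DEFAULT_OPTIONS.merge(overrides)
end

options = execution_options(http_timeout_seconds: 10, dry_run: true)
```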
4.2.3 Provider-Specific Requests
For each provider, the pipeline maps validated inputs and execution options to one or more external API calls.
Examples include:
- Experian
- Company search and credit-related endpoints
- Authentication via token-based access
- Strict timeout guidance as per provider recommendations
- Creditsafe
- Company, director, and consumer endpoints
- Distinct identifiers for companies and individuals
- Explicit handling of ambiguous search results
Provider-specific request logic is encapsulated within the relevant pipeline to prevent leakage of provider concerns into upstream systems.
4.2.4 Response Artefacts & Persistence
The integration pipelines are not designed to stream full third-party responses synchronously back to callers.
For each successful external API invocation, the response is written to a Dynamics table associated with the originating request. This enables downstream systems and user interfaces to retrieve results asynchronously and ensures a durable audit trail.
The HTTP interface returns an acknowledgement of execution rather than response data.
4.2.5 Correlation, Audit & Traceability
Every pipeline execution is assigned a correlation identifier that is propagated across:
- Incoming request
- External API calls
- Logs and metrics
- Persisted artefacts (where applicable)
This enables:
- End-to-end traceability
- Safe operational investigation
- Controlled reprocessing without unintended duplicate actions
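As a sketch, correlation handling amounts to generating (or accepting) a single identifier and attaching it everywhere; the header name and log shape below are assumptions for illustration.

```ruby
require "json"
require "logger"
require "securerandom"

correlation_id = SecureRandom.uuid
logger = Logger.new($stdout)

# The same identifier appears in structured log entries...
logger.info({ correlation_id: correlation_id, event: "external_call_started" }.to_json)

# ...on the outbound request (header name is an assumption)...
headers = { "X-Correlation-Id" => correlation_id }

# ...and on any persisted artefact, so the execution can be traced end to end.
artefact = { correlation_id: correlation_id, payload: "<provider response>" }
```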
4.2.6 Validation & Guardrails
Prior to execution, the pipelines enforce a set of guardrails, including:
- Mandatory input validation (e.g. registration number required)
- Validation of execution option ranges (e.g. timeouts > 0)
- Explicit failure for invalid or incomplete requests
These controls ensure that invalid requests fail early and predictably, before any external API calls are made.
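A minimal sketch of these guardrails, assuming the option names from section 4.2.2 (the method itself is illustrative rather than the actual pipeline code):

```ruby
# Fail early, before any external API call is made.
def validate_request!(request, options)
  raise ArgumentError, "registration number is required" if request[:registration_number].to_s.strip.empty?
  raise ArgumentError, "http_timeout_seconds must be > 0" unless options[:http_timeout_seconds].to_i.positive?
  raise ArgumentError, "http_max_attempts must be >= 1"   unless options[:http_max_attempts].to_i >= 1
end

validate_request!({ registration_number: "01234567" },
                  { http_timeout_seconds: 30, http_max_attempts: 4 })
```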
4.3 Error Handling & Reprocessing
Failure scenarios handled include:
- Network timeouts
- Authentication failures
- Provider-side errors
- Ambiguous or multiple search results
Key controls:
- Bounded retry logic
- No automatic retries for credit report actions without safeguards
- Operator-driven reprocessing using correlation identifiers
- Clear distinction between transient and terminal failures
4.4 Authentication & Token Management
External API providers typically require short-lived access tokens. The integration pipelines centralise token handling so that individual API calls do not need to implement provider-specific token lifecycle logic.
Token Provider Pattern
Each provider pipeline composes a TokenProvider responsible for obtaining access tokens. Token providers follow a common interface:
- `get()` returns a valid access token
- tokens are cached in-memory for the lifetime of the running process
- refresh occurs automatically when the token is expired (or close to expiry)
This pattern ensures:
- consistent authentication behaviour across providers
- fewer token endpoint calls (reduced load and reduced failure surface)
- separation of concerns (auth logic not duplicated across API calls)
Cached Token Provider
Token caching is implemented using a shared component:
- The token is stored in-memory (`cached_token`)
- The expiry time is tracked as an epoch timestamp (`cached_expiry_epoch`)
- If a token is still valid, `get()` returns it immediately
- Otherwise, the provider-specific `fetch` function is invoked to obtain a new token
A small safety buffer is applied to avoid using tokens that are close to expiry:
`cached_expiry_epoch = now + expires_in - 60`
This “refresh 60 seconds early” behaviour helps reduce race conditions where a token expires mid-request.
If token acquisition fails, the error is:
- logged
- re-raised to ensure the pipeline fails fast and predictably
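A minimal sketch of the cached token provider follows. The field names (`cached_token`, `cached_expiry_epoch`) and the 60-second refresh buffer match the description above; the class and method structure is illustrative rather than the exact implementation.

```ruby
# Minimal sketch of the shared cached token provider pattern.
class CachedTokenProvider
  REFRESH_BUFFER_SECONDS = 60 # refresh slightly early to avoid expiry mid-request

  # fetch: a provider-specific callable returning { token:, expires_in: }
  def initialize(fetch:, logger:)
    @fetch  = fetch
    @logger = logger
    @cached_token        = nil
    @cached_expiry_epoch = 0
  end

  def get
    now = Time.now.to_i
    return @cached_token if @cached_token && now < @cached_expiry_epoch

    result = @fetch.call
    @cached_token        = result[:token]
    @cached_expiry_epoch = now + result[:expires_in].to_i - REFRESH_BUFFER_SECONDS
    @cached_token
  rescue StandardError => e
    @logger.error("token acquisition failed: #{e.class}: #{e.message}") # never log the token itself
    raise # fail fast and predictably
  end
end
```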
Provider-Specific Token Acquisition (Example: Creditsafe)
Creditsafe authentication is implemented via a provider-specific TokenProvider that:
- reads the token endpoint and credentials from environment configuration
- performs a POST to the provider token endpoint
- extracts the token from the JSON response
- delegates caching/refresh behaviour to the shared cached token provider
This keeps Creditsafe-specific details isolated while still using the common caching behaviour.
Configuration inputs (Creditsafe)
- `CREDITSAFE_TOKEN_URL`
- `CREDITSAFE_USERNAME`
- `CREDITSAFE_PASSWORD`
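The sketch below shows how a Creditsafe-specific fetch function might be composed with the cached token provider above. The JSON request and response shape (username/password in the body, a `token` field in the response) is an assumption for illustration; the provider's own documentation defines the actual contract.

```ruby
require "json"
require "net/http"
require "uri"

creditsafe_fetch = lambda do
  uri = URI(ENV.fetch("CREDITSAFE_TOKEN_URL"))

  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
    request.body = {
      username: ENV.fetch("CREDITSAFE_USERNAME"),
      password: ENV.fetch("CREDITSAFE_PASSWORD"),
    }.to_json
    http.request(request)
  end

  raise "token request failed: HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  body = JSON.parse(response.body)
  # Conservative fallback where expires_in is not supplied (see notes below).
  { token: body["token"], expires_in: body.fetch("expires_in", 3600) }
end

# Composed with the CachedTokenProvider sketched above:
# provider = CachedTokenProvider.new(fetch: creditsafe_fetch, logger: Logger.new($stdout))
# provider.get
```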
Notes and Considerations
In-memory scope
- Token caching is per-process. Tokens are not shared across multiple running pipeline processes or servers.
Expiry handling
- The cached token provider expects token metadata including `expires_in` (seconds) to calculate expiry.
- Where a provider does not supply `expires_in`, a default or conservative expiry strategy should be defined to prevent tokens being treated as valid indefinitely.
Logging
- Token acquisition failures are logged with error class and message.
- Tokens and credentials must never be logged.
5. Security, Controls & Compliance
5.1 Access & Role Model
- Separation between:
- users initiating requests
- services executing integrations
5.2 Data Protection
- All data in transit encrypted using TLS
- Data at rest encrypted in:
- Azure Data Lake
- SQL databases (where applicable)
- Sensitive fields are handled as PII; no sensitive data is persisted beyond what is required for operational use (e.g. PDF credit reports stored in SharePoint, API response data stored in Dynamics)
5.3 Audit & Logging
- Every request assigned a correlation identifier
- Logs capture:
- request metadata
- provider response status
- execution timing
6. Technology Stack & Environments
6.1 Technology Inventory
- Ruby (integration pipelines)
- Windows Server hosting
- External APIs (Experian, Creditsafe)
- Azure Data Lake
- SQL Database
- Centralised logging and alerting
6.2 Integration Configuration
External integrations are configured entirely via environment variables. This ensures that sensitive information is not embedded in code and that configuration can vary cleanly between environments (e.g. sandbox vs production).
The following configuration categories are relevant to the Experian and Creditsafe pipelines.
Provider Endpoints
Each provider exposes one or more base URLs that define the external API surfaces used by the integration pipelines.
Experian
- `EXPERIAN_API_BASE_URL`: Base URL for Experian API requests.
- `EXPERIAN_TOKEN_URL`: OAuth token endpoint used for authentication.
Creditsafe
- `CREDITSAFE_API_BASE_URL`: Base URL for Creditsafe Connect API requests.
- `CREDITSAFE_TOKEN_URL`: Authentication endpoint used to obtain access tokens.
These values are environment-specific and differ between sandbox and production deployments.
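A small sketch of fail-fast configuration loading, assuming these variables are read at start-up; `ENV.fetch` raises `KeyError` when a required variable is missing, so a misconfigured environment fails immediately rather than mid-execution.

```ruby
# Required endpoints; any missing variable raises KeyError at load time.
EXPERIAN_API_BASE_URL   = ENV.fetch("EXPERIAN_API_BASE_URL")
EXPERIAN_TOKEN_URL      = ENV.fetch("EXPERIAN_TOKEN_URL")
CREDITSAFE_API_BASE_URL = ENV.fetch("CREDITSAFE_API_BASE_URL")
CREDITSAFE_TOKEN_URL    = ENV.fetch("CREDITSAFE_TOKEN_URL")
```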
Authentication Credentials
Provider credentials are supplied via environment variables and are read at runtime by the relevant pipeline.
Experian
- `EXPERIAN_USERNAME`
- `EXPERIAN_PASSWORD`
- `EXPERIAN_CLIENT_ID`
- `EXPERIAN_CLIENT_SECRET`
Creditsafe
- `CREDITSAFE_USERNAME`
- `CREDITSAFE_PASSWORD`
- `CREDITSAFE_CLIENT_ID`
- `CREDITSAFE_CLIENT_SECRET`
Credentials are never logged and are only used within the integration layer for token acquisition and API authentication.
Timeout Configuration
All outbound HTTP calls to third-party providers are subject to a configurable open/read timeout.
- Default timeout: 30 seconds
- Controlled via runtime execution options
- Applies to both authentication and API request calls
Timeouts ensure that external provider latency does not cause indefinite blocking within the integration service.
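A sketch of applying the timeout to an outbound call is shown below; the helper and endpoint are illustrative, and the 30-second default would normally come from the `http_timeout_seconds` execution option.

```ruby
require "net/http"
require "uri"

def get_with_timeout(url, timeout_seconds: 30)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  open_timeout: timeout_seconds,  # time allowed to open the connection
                  read_timeout: timeout_seconds) do |http|
    http.get(uri.request_uri)
  end
end
```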
Retry & Backoff Configuration
Integration pipelines apply bounded retry logic for transient failures (for example, network issues or temporary provider unavailability).
- Default maximum retry attempts: 4
- Backoff strategy: exponential backoff with a bounded maximum delay
- Retries are applied only where safe and appropriate
Credit-impacting actions (such as credit report retrieval) are explicitly guarded to prevent unintended repeated calls.
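The following sketch shows bounded retries with capped exponential backoff, using the option names from section 4.2.2; the retried exception classes and the call site are illustrative, and credit-impacting calls would not be wrapped in this way.

```ruby
require "net/http"
require "socket"
require "timeout"

def with_retries(max_attempts: 4, backoff_base: 1.0, backoff_max: 30.0)
  attempt = 0
  begin
    attempt += 1
    yield
  rescue Net::OpenTimeout, Net::ReadTimeout, SocketError, Timeout::Error
    raise if attempt >= max_attempts # bounded: give up after the configured attempts

    delay = [backoff_base * (2**(attempt - 1)), backoff_max].min # exponential, capped
    sleep(delay)
    retry
  end
end

# Illustrative call site, for an idempotent, non-credit-impacting request:
# with_retries(max_attempts: 4) { perform_company_search }
```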
Feature Flags & Execution Modes
The integration pipelines support a small number of execution modes controlled via runtime options rather than environment configuration.
Examples include:
- Dry-run mode, which logs intended external API calls without executing them
- Debug mode, which enables additional diagnostic logging under controlled conditions
These modes are intended for testing, troubleshooting, and operational support and are not enabled by default.
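A sketch of how these modes might gate an external call is shown below; the method and key names are illustrative rather than the actual pipeline interface.

```ruby
require "logger"

def execute_api_call(request, options, logger)
  if options[:dry_run]
    # Dry run: log the intended call and stop before anything leaves the service.
    logger.info("DRY RUN: would call #{request[:endpoint]} with #{request[:params].inspect}")
    return nil
  end

  response = yield                                    # perform the real external call
  logger.debug(response.inspect) if options[:debug]   # raw response logged only under debug
  response
end

logger = Logger.new($stdout)
execute_api_call({ endpoint: "/companies", params: { regNo: "01234567" } },
                 { dry_run: true, debug: false },
                 logger)
```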
Configuration Management Principles
- All configuration is externalised via environment variables
- No provider endpoints or credentials are hard-coded
- Defaults are conservative and aligned with provider guidance
- Configuration changes do not require code changes