
Article 12 in Practice: How to Build a Logging System for High-Risk AI

  • Writer: cici BEL
  • 18 hours ago
  • 11 min read

In our previous article, we explained what Article 12 requires — the three purposes of logging, the role responsibilities, the biometric exception, and the unresolved GDPR tension.

Now the question is: how do you actually build it?

Article 12 is a technical construction requirement. That means the answer is not a document — it is an architecture. In this article, we walk through the complete logging system for TalentMatch AI, layer by layer: from infrastructure through event schemas and GDPR retention to incident reconstruction.

This is Part 2 of our Article 12 series.
If you haven't read the theory yet, start with: "Article 12 EU AI Act: Record-Keeping & Logging Explained" 
→ TalentMatch AI is the same example from our Article 9, 10, and 11 guides. The compliance journey builds on itself.

The Scenario: Adding Logging to TalentMatch AI

You know TalentMatch AI by now. The AI-powered Applicant Tracking System that screens CVs, ranks candidates, flags best matches, and suggests interview questions. Annex III, Point 4(a) — high-risk. 35 employees — SME. OpenAI API for NLU, custom gradient-boosted ranking model, 50,000 historical applications as training data.

Your Article 9 risk management, Article 10 data governance, and Article 11 technical documentation are done. But there is a gap: logging. TalentMatch AI has basic application logs — server errors, API response times, authentication events. Standard DevOps monitoring. What it does not have is compliance-grade event logging — the kind that enables incident reconstruction, supports post-market monitoring, and satisfies a regulator asking "show me what happened on March 15 when candidate X reported discriminatory treatment."

That is what we are building now.

The 4-Layer Logging Architecture

Every high-risk AI system needs four distinct logging layers. Each layer has a different owner, a different purpose, and different technical requirements. Build them in order — each layer depends on the one below it.

Layer 1: Infrastructure — Where logs are stored, how they are secured, and who can access them. Owner: DevOps.

Layer 2: Event Schemas — What gets logged and in what format. The system-specific definitions of which events matter. Owner: ML Engineers + Product.

Layer 3: Retention & GDPR — How long logs are kept, what gets pseudonymised, and how data protection is ensured. Owner: Compliance / DPO.

Layer 4: Monitoring & Alerting — What triggers an alert, how incidents are reconstructed, and how logs feed post-market monitoring. Owner: Product + Operations.

Infographic showing a 4-layer logging architecture for high-risk AI systems — Infrastructure, Event Schemas, Retention & GDPR, and Monitoring & Alerting, with role assignments per layer

Layer 1: Infrastructure

DevOps owns this layer. The decisions made here determine whether the logging system is reliable, secure, and performant enough to satisfy Article 12.

Storage & Encryption

TalentMatch AI runs as a cloud SaaS in an EU region. The logging infrastructure follows the same deployment model:

Primary storage: PostgreSQL for structured event data (decision events, risk events). Indexed by timestamp, event_type, and model_version for fast querying during incident reconstruction.

Archive storage: S3-compatible object storage (EU region) for long-term retention of risk and change events.

Encryption: AES-256 at rest for all log storage. TLS 1.3 in transit for all log transport.

Backup: Daily snapshots with 30-day rollback capability. Geo-redundant within the EU region.

The storage architecture must handle differentiated retention — risk events kept for 10 years, monitoring metrics kept for 6 months — without mixing retention policies in the same storage bucket.

Access Control

Log data is sensitive. For TalentMatch AI, logs contain pseudonymised candidate data, model decision logic, and internal performance metrics. Access must be strictly role-based:

DevOps: Infrastructure access (storage configuration, backup management, performance monitoring). No access to log content.

ML Team: Event schema configuration and model-level log analysis. Read access to decision and risk events for debugging and model improvement.

Compliance / DPO: Read-only audit access to all event types. The only role authorised to access pseudonymisation mapping tables.

Everyone else: No access. Explicitly denied.

Critical: log who accesses the logs. Access logging is itself a compliance requirement — it creates the audit trail that proves only authorised persons interacted with log data.

Performance Budget

Logging must not degrade the AI system's core functionality. For TalentMatch AI, the performance budget is clear:

→ Maximum latency increase from logging: ≤ 5%

→ Implementation pattern: asynchronous logging — events are queued and written in background, never blocking the ranking pipeline

→ Monitoring metrics (daily aggregates) are batch-written, not real-time

→ Log completeness monitoring: if the logging system itself fails, an alert fires immediately
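The asynchronous pattern above can be sketched with Python's standard library: events are enqueued by the ranking pipeline and written by a background worker, so the pipeline never blocks on storage. This is a minimal illustration; the function names and the in-memory `written` list stand in for the real PostgreSQL/S3 writes.

```python
import queue
import threading

log_queue = queue.Queue()
written = []  # stand-in for the real store (PostgreSQL / S3)

def write_to_store(event):
    # Illustrative: the real version would INSERT into PostgreSQL
    written.append(event)

def log_worker():
    # Background thread: drains the queue so the ranking pipeline never blocks
    while True:
        event = log_queue.get()
        if event is None:        # sentinel: clean shutdown
            break
        write_to_store(event)

def log_event(event):
    # Called from the ranking pipeline: enqueue and return immediately
    log_queue.put(event)

worker = threading.Thread(target=log_worker, daemon=True)
worker.start()
```

A failure of the worker or the queue filling up is exactly the "log completeness" condition that must itself trigger an alert.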

Layer 2: Event Schemas

ML Engineers and Product Managers define this layer together. This is where Article 12 becomes system-specific: what exactly does TalentMatch AI log?

Decision Events

Every candidate ranking decision generates a log entry. This is the core of traceability — if a regulator asks "why was candidate X ranked #7?", this event must answer the question.

Key fields:

timestamp — ISO 8601 format, UTC

event_type — "decision_event"

event_id — UUID, unique per event

model_version — which model produced this ranking

job_posting_id — which position

candidate_pseudo_id — pseudonymised candidate identifier

input_features_hash — SHA-256 hash of input features (not raw CV data)

ranking_position — output rank

confidence_score — model confidence

top_3_features — the three features that most influenced the ranking (explainability)

fairness_check_result — pass/fail for demographic parity check

Notice: the log does not store the full CV or personal details. It stores a hash and a pseudonymised ID. The actual candidate data lives in the application database — the log references it, does not duplicate it. This is the pseudonymisation principle in practice.
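A hedged sketch of how such an event could be assembled, using only the fields listed above. The builder function name is illustrative; the hashing of the feature vector (instead of storing it) is the pseudonymisation principle in code.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone

def build_decision_event(model_version, job_posting_id, candidate_pseudo_id,
                         input_features, ranking_position, confidence_score,
                         top_3_features, fairness_ok):
    # Hash the raw feature vector instead of storing it; the log only
    # references candidate data, it never duplicates it.
    features_json = json.dumps(input_features, sort_keys=True)
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": "decision_event",
        "event_id": str(uuid.uuid4()),
        "model_version": model_version,
        "job_posting_id": job_posting_id,
        "candidate_pseudo_id": candidate_pseudo_id,
        "input_features_hash": hashlib.sha256(features_json.encode()).hexdigest(),
        "ranking_position": ranking_position,
        "confidence_score": confidence_score,
        "top_3_features": top_3_features,
        "fairness_check_result": "pass" if fairness_ok else "fail",
    }
```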

Risk Events

Risk events are derived directly from the Article 9 risk management system. Each risk identified in the RMS maps to a loggable event:

Risk Event | Trigger Condition | Severity
Low Confidence Score | confidence_score < 0.70 | MEDIUM
Bias Detection Triggered | disparate_impact_ratio < 0.80 (80%-rule) | HIGH
Anomalous Input Data | anomaly_score > 0.85 (out-of-distribution) | LOW
Model Drift Detected | accuracy_drop > 0.10 over 7-day window | HIGH

Each risk event includes the trigger condition, the actual measured value, the severity classification, and whether an automated response was triggered (e.g. fallback to human review for low confidence scores).
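The trigger conditions above can be expressed as a small rule table that is evaluated on each batch of metrics. The rule names and the `evaluate_risk_events` function are illustrative, not a prescribed API; the thresholds are taken from the risk event table.

```python
# Thresholds from the risk event table; defaults chosen so a missing
# metric never triggers a false positive.
RISK_RULES = [
    ("low_confidence",  lambda m: m.get("confidence_score", 1.0) < 0.70,       "MEDIUM"),
    ("bias_detected",   lambda m: m.get("disparate_impact_ratio", 1.0) < 0.80, "HIGH"),
    ("anomalous_input", lambda m: m.get("anomaly_score", 0.0) > 0.85,          "LOW"),
    ("model_drift",     lambda m: m.get("accuracy_drop_7d", 0.0) > 0.10,       "HIGH"),
]

def evaluate_risk_events(metrics):
    # Returns one risk event per violated rule, with the measured values attached
    events = []
    for name, condition, severity in RISK_RULES:
        if condition(metrics):
            events.append({"event_type": "risk_event",
                           "risk": name,
                           "severity": severity,
                           "measured": metrics})
    return events
```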

Change Events

Every modification that could affect the system's conformity status must be logged. For TalentMatch AI:

Model retrain / update: Old version, new version, training data changes, validation metrics before and after

Configuration changes: Hyperparameter modifications, threshold adjustments

Data pipeline modifications: Feature engineering changes, preprocessing updates

Deployment events: New version deployed to production

Each change event includes who initiated the change, the timestamp, the reason, and validation metrics. This connects directly to the Annex IV No. 6 change history in your Article 11 documentation.
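A change event for a model retrain might look as follows. This is a sketch with illustrative field names, mirroring the list above: who, when, why, and the validation metrics before and after.

```python
import uuid
from datetime import datetime, timezone

def build_change_event(old_version, new_version, initiated_by, reason,
                       metrics_before, metrics_after):
    # Change event for a model retrain; feeds the Annex IV No. 6 change history
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": "change_event",
        "event_id": str(uuid.uuid4()),
        "change_type": "model_retrain",
        "old_version": old_version,
        "new_version": new_version,
        "initiated_by": initiated_by,
        "reason": reason,
        "validation_metrics_before": metrics_before,
        "validation_metrics_after": metrics_after,
    }
```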

Monitoring Events

Continuous performance tracking feeds the Article 72 post-market monitoring system:

Metric | Frequency | Threshold
Model Accuracy (NDCG@10) | Daily | accuracy < 0.85
Fairness — Gender Parity | Weekly | ratio < 0.80
CV Processing Latency | Real-time | latency > 5 seconds
Skill Extraction Completeness | Daily | completeness < 0.90

When a threshold is violated, the monitoring event automatically escalates to a risk event and triggers an alert. The boundary between "monitoring" and "risk" is defined by the threshold — below it is routine, above it is an incident in formation.
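That boundary can be made explicit in code: a metric check returns a routine monitoring event below the threshold and escalates to a risk event above it. The threshold values come from the monitoring table; the function shape is an assumption, not a prescribed interface.

```python
# Monitoring thresholds from the table above; "below"/"above" states
# which direction counts as a violation.
MONITORING_THRESHOLDS = {
    "ndcg_at_10":          ("below", 0.85),
    "gender_parity_ratio": ("below", 0.80),
    "cv_latency_seconds":  ("above", 5.0),
    "skill_completeness":  ("below", 0.90),
}

def check_metric(name, value):
    direction, limit = MONITORING_THRESHOLDS[name]
    violated = value < limit if direction == "below" else value > limit
    event = {"event_type": "monitoring_event", "metric": name, "value": value}
    if violated:
        # Threshold breach: escalate to a risk event and flag for alerting
        event["event_type"] = "risk_event"
        event["alert"] = True
    return event
```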

Four-column grid showing the event types logged by TalentMatch AI — Decision Events, Risk Events, Change Events, and Monitoring Events with key fields and triggers for each type

Layer 3: Retention & GDPR

The Compliance Officer and DPO own this layer. Every decision here balances AI Act obligations against GDPR constraints.

Differentiated Retention

Not all log data deserves the same retention period. TalentMatch AI implements three tiers:

Event Category | Retention Period | Justification | Storage
Risk Events + Incident Logs | 10 years | Art. 18 AI Act — document retention | Archive (S3, encrypted)
Change Events | 10 years | Audit trail — conformity history | Archive (S3, encrypted)
Decision Events | 2 years | Incident reconstruction window — pseudonymised | Primary DB (PostgreSQL)
Monitoring Metrics | 6 months | Operational — aggregated, no personal data | Primary DB (auto-purge)

Auto-delete jobs run weekly to enforce retention. A manual hold function exists for incident investigations — when an incident is opened, relevant logs are frozen regardless of retention policy until the investigation is closed.
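The purge logic can be sketched in a few lines: an event is kept if it is within its tier's retention window or frozen by a legal hold. The tier durations come from the table above; the `legal_hold` flag and function shape are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

# Retention tiers from the table above (approximate day counts)
RETENTION = {
    "risk_event":       timedelta(days=3650),  # ~10 years
    "change_event":     timedelta(days=3650),  # ~10 years
    "decision_event":   timedelta(days=730),   # 2 years
    "monitoring_event": timedelta(days=182),   # ~6 months
}

def purge(events, now):
    # Weekly auto-delete: drop expired events unless frozen by a legal hold
    kept = []
    for ev in events:
        age = now - ev["timestamp"]
        if ev.get("legal_hold") or age <= RETENTION[ev["event_type"]]:
            kept.append(ev)
    return kept
```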

Pseudonymisation in Practice

TalentMatch AI processes personal data — candidate names, CVs, application histories. The logging system must not duplicate this exposure. Three pseudonymisation measures:

CV filenames → UUID: No real filenames appear in logs. Each CV receives a UUID at intake; logs reference only the UUID.

Email addresses → SHA-256 hash: If email appears in log context, it is hashed before storage. Reversible only through the mapping table.

Protected characteristics → aggregated only: Gender, age, ethnicity are never logged at the individual level. Fairness metrics log aggregate statistics (demographic parity ratio), not individual classifications.

The mapping table (UUID ↔ real identity) is stored separately with restricted access — only the DPO and authorised Compliance staff can access it, and only for legitimate purposes (incident investigation, data subject requests).
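The three measures can be sketched as follows. This is illustrative only: the in-memory dict stands in for the separate, DPO-restricted mapping store, and the class name is an assumption.

```python
import hashlib
import uuid

class Pseudonymiser:
    # Sketch of the UUID mapping + hashing described above. In production
    # the mapping table lives in a separate, access-restricted store.

    def __init__(self):
        self._mapping = {}  # pseudo_id -> real identity (DPO-restricted)

    def pseudo_id_for(self, real_identity):
        # CV filenames / candidate identities never appear in logs, only UUIDs
        pid = str(uuid.uuid4())
        self._mapping[pid] = real_identity
        return pid

    @staticmethod
    def hash_email(email):
        # One-way hash before storage; normalised to lower case first
        return hashlib.sha256(email.lower().encode()).hexdigest()

    def resolve(self, pseudo_id):
        # Restricted operation: incident investigation / data subject requests only
        return self._mapping[pseudo_id]
```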

DPIA and Legal Basis

TalentMatch AI's logging system requires a Data Protection Impact Assessment under Article 35 GDPR. The triggers are clear: automated processing of personal data in an employment context, systematic monitoring, and large-scale data processing.

The DPIA documents the legal basis (Article 6(1)(c) GDPR — legal obligation under the AI Act), the necessity and proportionality of the logging, and the safeguards implemented (pseudonymisation, access restriction, differentiated retention).

The processing register (Article 30 GDPR) must be updated to include the logging system as a distinct processing activity — with purpose, legal basis, data categories, retention periods, and technical safeguards.

Layer 4: Monitoring & Alerting

Product Managers and Operations own this layer. It turns raw logs into actionable signals.

What Triggers an Alert

Four alert categories for TalentMatch AI:

Fairness threshold breach: Demographic parity drops below 0.80 → immediate alert to Compliance Officer + ML Lead. System continues operating but flags all subsequent rankings for human review.

Accuracy drift: NDCG@10 drops more than 5% from baseline over a 7-day window → alert to ML Team. Investigation required within 48 hours.

Log completeness failure: The logging system itself fails to record events → highest priority alert to DevOps. A system that cannot log is a system that cannot prove compliance.

Unauthorised access attempt: Someone outside the authorised role set attempts to access log data → immediate alert to Security + DPO.

Incident Reconstruction: A Worked Example

A recruiter reports that TalentMatch AI consistently ranks female candidates lower for a senior engineering position. Here is how the logging system enables reconstruction:

Step 1: Query. Filter decision events by job_posting_id for the reported position, time range covering the past 3 months.

Step 2: Analyse. Extract ranking_position and fairness_check_result for all candidates. Group by demographic parity dimensions. Calculate whether the 80%-rule was violated across the sample.

Step 3: Check. Cross-reference with risk events — were any bias detection events triggered during this period? If yes, what was the automated response?

Step 4: Identify. Check model_version — was there a model retrain or data update during the affected period? Cross-reference with change events. If a retrain occurred, compare validation metrics (specifically fairness metrics) before and after.

Step 5: Trace. If the issue traces back to a training data change, pull the data governance log from the Article 10 documentation. Identify which training data batch introduced the bias pattern. Document the root cause, corrective action, and preventive measures.

This five-step process is only possible because each layer of the logging architecture is in place: the decision events capture the what, the risk events capture the when, the change events capture the why, and the infrastructure ensures everything is stored and retrievable.
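Step 2 of the reconstruction reduces to an 80%-rule calculation over the retrieved decision events. A minimal sketch, assuming group labels are joined in from aggregate fairness data during the investigation (individual-level protected characteristics are never stored in the logs themselves):

```python
def disparate_impact_ratio(decision_events, protected_group, reference_group, k=10):
    # Selection rate of the protected group divided by that of the
    # reference group; a ratio below 0.80 violates the 80%-rule.
    # "Selected" here means ranked in the top k positions.
    def selection_rate(group):
        members = [e for e in decision_events if e["group"] == group]
        selected = [e for e in members if e["ranking_position"] <= k]
        return len(selected) / len(members) if members else 0.0
    ref = selection_rate(reference_group)
    return selection_rate(protected_group) / ref if ref else 0.0
```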

Five-step process diagram showing how to reconstruct an AI incident from logs — from initial query through filtering, checking, identifying the model version, and tracing back to the root cause

The 8-Phase Implementation Roadmap

Building a compliant logging system is an engineering project. Here is the phased approach for TalentMatch AI:

Phase 1 — Infrastructure: Select logging library (Fluentd for TalentMatch AI), configure storage (PostgreSQL + S3), activate encryption (AES-256 + TLS), implement access controls, test backup strategy.

Phase 2 — Event Implementation: Define JSON event schemas for all four event types, integrate log functions into the ranking pipeline (log_decision_event(), log_risk_event()), implement JSON schema validation, standardise timestamps to ISO 8601 UTC.

Phase 3 — Retention & Cleanup: Configure three retention tiers, create auto-delete jobs, implement archiving pipeline for long-term storage, build manual hold function for incident investigations.

Phase 4 — Testing: Unit tests for all log functions, integration tests for end-to-end event flow, incident reconstruction test (can we reconstruct a 6-month-old decision?), performance test (is log overhead ≤ 5%?), security audit (encryption, access controls).

Phase 5 — Monitoring: Set up threshold alerts, implement log completeness monitoring, configure storage capacity alerts, build compliance dashboard for the Compliance Officer.

Phase 6 — Documentation: Update Article 11 technical documentation with logging specification. Reference event schemas, retention policies, and access controls in Annex IV.

Phase 7 — GDPR Compliance: Conduct DPIA, update processing register, document pseudonymisation measures, confirm legal basis with DPO.

Phase 8 — Go-Live: Deploy logging system to production, verify all events are being captured, confirm alert pipeline is functional, conduct first compliance review after 30 days.
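Phase 2 calls for JSON schema validation of every event before it is written. A minimal, stdlib-only sketch of that step (a production system might use a full JSON Schema library instead; the field set shown is a subset of the decision event schema):

```python
# Required fields and types for a decision event (subset, illustrative)
DECISION_EVENT_REQUIRED = {
    "timestamp": str,
    "event_type": str,
    "event_id": str,
    "model_version": str,
    "ranking_position": int,
    "confidence_score": float,
}

def validate_event(event):
    # Returns a list of problems; an empty list means the event is valid.
    # Invalid events should be quarantined and alerted on, never dropped silently.
    problems = []
    for field, expected_type in DECISION_EVENT_REQUIRED.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems
```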

Common Mistakes

Four mistakes appear repeatedly when teams build AI logging systems:

Mistake 1: Logging as an Afterthought

Article 12 is a technical construction requirement. Recital 71 requires the system to technically enable automatic event recording. Bolting logging onto a finished system is expensive, architecturally fragile, and may not meet the "by design" intent. Plan logging during system design — not during the compliance review before launch.

Mistake 2: Logging Everything

The instinct to "log everything, figure it out later" creates three problems: storage costs explode, performance degrades, and GDPR exposure increases with every piece of personal data captured. Proportionality is a legal requirement under Recital 71 — logging must be proportionate to risks and use context. Define your event schemas deliberately.

Mistake 3: No Differentiated Retention

Keeping all logs for 10 years is both expensive and a GDPR risk. Keeping all logs for 6 months may violate the AI Act's retention requirements. Differentiated retention — risk events longer, routine metrics shorter — is the only defensible approach. Define tiers, justify each one, and implement auto-purge.

Mistake 4: Open Log Access

Logs contain decision logic, performance data, and pseudonymised personal data. Giving the entire engineering team read access to all logs undermines GDPR safeguards and creates unnecessary data protection exposure. Implement strict role-based access. Log who accesses the logs.


How TrustTroiAI Helps

TrustTroiAI's Article 12 template is built in two parts — matching the two audiences that need the logging specification.

Part A: Compliance Documentation covers everything auditors and notified bodies need to see. Diana guides you through 8 sections: system identification, deployment architecture, mandatory event definitions, biometric obligations (if applicable), retention policy, GDPR compliance, access controls, and incident reconstruction capability.

The generated DOCX includes your system metadata, the JTC 21 Best Practice Extension badge, and the complete logging specification formatted for regulatory review.

Screenshot of the generated DOCX output from TrustTroiAI showing the Logging Specification cover page for TalentMatch AI with system metadata, Part A Compliance Documentation designation, and JTC 21 Best Practice Extension badge
SCREENSHOT: TrustTroiAI Logging Specification — DOCX Output

The mandatory events section is where Article 12(2) comes to life. You define your risk events with trigger conditions and severity levels, select which change event types apply, configure monitoring metrics with measurement frequencies and thresholds, and map human-in-the-loop events to Article 14 requirements. Each entry maps directly to the event schemas your engineering team will implement.

Screenshot of TrustTroiAI Article 12 template showing the Mandatory Events configuration — Risk Events table with trigger conditions and severity, change event types, monitoring metrics with thresholds, and human-in-loop events with Art. 14 cross-reference
SCREENSHOT: TrustTroiAI Mandatory Events Configuration

The GDPR compliance section addresses the tension head-on. Diana surfaces an active guidance banner explaining the GDPR conflict and recommending pseudonymisation before storage. You select which types of personal data appear in your logs, choose the legal basis, document your pseudonymisation measures (CV filenames → UUID, email → SHA-256, protected characteristics → aggregated only), and confirm your DPIA status.

Screenshot of TrustTroiAI Article 12 template showing the GDPR Compliance section with active guidance on the GDPR conflict, personal data type selection, legal basis dropdown, pseudonymisation measures, and DPIA status indicator
SCREENSHOT: TrustTroiAI GDPR Compliance Section

The incident reconstruction section implements JTC 21 WG 3 best practices. You document your incident response plan status, reconstruction timeline capability, and — for neural network systems — a specific reconstructability checklist: model checkpoints versioned, architecture definition versioned, training hyperparameters logged, input data retrievable, inference outputs timestamped. The responsible person for incident logging is assigned by name.

Screenshot of TrustTroiAI Article 12 template showing the Incident Reconstruction section based on JTC 21 WG 3 — incident response plan status, reconstruction timeline, neural network reconstructability checklist, and responsible person assignment
SCREENSHOT: TrustTroiAI Incident Reconstruction Section

Part B: Technical Implementation Specification provides the engineering detail — JSON event schemas, retention configuration, and the 8-phase implementation checklist. This is the document your DevOps and ML teams work from to build the actual logging infrastructure.

Build your Article 12 logging specification with TrustTroiAI
From compliance documentation to implementation checklist — both layers in one template.
→ Start your logging specification: trusttroiai.com
→ Theory first: Article 12 EU AI Act Explained 
→ Foundation: Article 9 — Risk Management System 
→ Technical Documentation: Article 11 in Practice 

Key Takeaways: Article 12 Logging in Practice

→ A compliant logging system has four layers: Infrastructure (DevOps), Event Schemas (ML Engineers + Product), Retention & GDPR (Compliance/DPO), and Monitoring & Alerting (Product + Operations). Build them in order.

→ Event schemas must be system-specific. TalentMatch AI logs four event types: decision events, risk events, change events, and monitoring events — each with defined JSON schemas, triggers, and severity levels.

→ Differentiated retention is the only defensible approach: risk events for 10 years, decision events for 2 years (pseudonymised), monitoring metrics for 6 months. Justify each tier.

→ Pseudonymise before logging — not after. UUIDs instead of filenames, hashes instead of emails, aggregate statistics instead of individual protected characteristics. The mapping table is separate and access-restricted.

→ Incident reconstruction is the ultimate test: if you cannot trace a reported discrimination incident back through decision events, risk events, change events, and training data — your logging system is not compliant.

Source

PRIMARY SOURCES:

1. EU AI Act 2024/1689, Article 12 — Record-Keeping


2. EU AI Act 2024/1689, Recital 71 (EWG 71) — Logging Intent


SECONDARY SOURCES:

3. JRC132833 — Analysis of the preliminary AI standardisation work plan


4. GDPR (Regulation 2016/679), Articles 5, 6, 17, 25, 30, 35


RELATED ARTICLES:

5. EU AI Act 2024/1689, Articles 9, 11, 14, 17, 19, 21, 43, 72, 73, 79


STANDARDS:

6. CEN-CENELEC JTC 21 WG 3 — Incident Reconstruction Standards (in development)
