Data Integrity By Design

TL;DR

Data integrity by design means building MES and connected systems so that compliant, reliable records happen by default, not by heroics. It aligns with 21 CFR Part 11, EU GMP Annex 11, MHRA and PIC/S guidance, and GAMP 5, and applies ISA‑95 layering, robust identity controls and audit trails, and lifecycle validation. V5 Ultimate supports this by unifying MES, QMS, eBMR/eDHR, LIMS, WMS, and Maintenance on a single, reviewable record.

Reviewed · By V5 Ultimate compliance team · 3,500 words · ~16 min read

01 What it is

Data Integrity by Design is a systems-engineering approach that embeds ALCOA+ into MES and connected GxP systems so compliant, trustworthy records are the default outcome of normal work. It treats data integrity as a preventive control objective across people, process, and technology—spanning data capture at the shop floor, contextualization, transmission, storage, visualization, review, release, and retention. It relies on layered technical safeguards (identity, authorization, time synchronization, audit trails, segregation of duties, error-proofed workflows) and procedural controls (SOPs, training, governance) that are specified, verified, and maintained through the system lifecycle.

  • Attributable: enforced user identity and equipment attribution at point of capture (Part 11, Annex 11).
  • Legible: durable, human-readable records and metadata; preserved original data format.
  • Contemporaneous: trusted time sources, automatic timestamps, sequence-of-events (MHRA, PIC/S).
  • Original: raw data retained and protected; derived results traceable back to source.
  • Accurate: validated algorithms, calibrated instruments, verified interfaces; error detection and reconciliation.
  • Plus: complete, consistent, enduring, and available across retention with controlled retrieval.

02 Regulatory foundations and expectations

The legal and guidance baseline is clear: 21 CFR Part 11 and 21 CFR 211.68 require controls for electronic records and computerized systems; EU GMP Annex 11 sets lifecycle expectations; MHRA and PIC/S define data integrity expectations across the data lifecycle; and GAMP 5 provides practical, risk-based engineering and validation practice. Designing for integrity translates these into concrete, testable requirements that prevent data loss, manipulation, or ambiguity and that demonstrate fitness for intended use via validation and ongoing assurance.

Source | Focus | Design-by-default implication
21 CFR Part 11 | Electronic records/signatures, identity binding, audit trails | Unique credentials, secure e-signatures, tamper-evident audit trails, system checks
21 CFR 211.68 | Computerized equipment controls and accuracy checks | Qualification, accuracy verification, change control, backup/restore, restricted access
EU GMP Annex 11 | Computerised systems lifecycle, periodic review, data integrity | Requirements- and risk-based specification, validation, periodic review, incident management
MHRA DI Guidance | Lifecycle DI, definitions, review practices | End-to-end ALCOA+ controls, proportionate review, governance and training
GAMP 5 (2nd ed.) | Good practice, critical thinking, supplier assessment | Risk-based controls, supplier leverage, assurance evidence aligned to impact

03 Governance and Quality Risk Management

Data integrity is a governance topic before it is a tooling topic. A cross-functional body (QA, Manufacturing IT/OT, Production, QC, RA) should own DI policies, risk appetite, and escalation. ICH Q9(R1) quality risk management applies: identify data and decision risks, analyze likelihood/impact, and specify proportional preventive and detective controls. GxP training must cover ALCOA+, role-based responsibilities, and practical scenarios (e.g., handling temporary automation outages) to prevent workarounds.

Applying ICH Q9(R1) to data flows

  • Risk identification: map critical data elements (CDEs), origins, transformations, and decisions influenced.
  • Risk analysis: failure modes (omission, transcription error, clock drift, identity spoofing, silent interface failure).
  • Risk control: preventive (RBAC, time sync, interlocks), detective (reconciliations, alerts), and review frequency.
  • Risk review: trend exceptions (e.g., frequent audit trail reversals) and adjust controls or training.
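
The identification and analysis steps above can be sketched as a simple ranking of failure modes per critical data element. ICH Q9(R1) does not prescribe a scoring scheme, so the 1 to 5 scales, the multiplication rule, the threshold of 40, and every name below are illustrative assumptions rather than a recommended method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRisk:
    """One failure mode for a critical data element (CDE). Scales are hypothetical."""
    cde: str
    failure_mode: str
    likelihood: int     # 1 = rare, 5 = frequent
    impact: int         # 1 = negligible, 5 = affects a release decision
    detectability: int  # 1 = always detected, 5 = silent

    def score(self) -> int:
        # Assumed scoring rule: plain product of the three ratings.
        return self.likelihood * self.impact * self.detectability


def rank_risks(risks, threshold=40):
    """Return (risk, score, needs_preventive_control) tuples, highest score first."""
    ranked = sorted(risks, key=lambda r: r.score(), reverse=True)
    return [(r, r.score(), r.score() >= threshold) for r in ranked]


risks = [
    DataRisk("dispensed weight", "manual transcription error", 4, 5, 3),
    DataRisk("step timestamp", "clock drift", 2, 4, 4),
    DataRisk("sample result", "silent interface failure", 2, 5, 5),
]

for risk, score, flagged in rank_risks(risks):
    print(f"{score:>3}  {risk.cde}: {risk.failure_mode}  preventive control: {flagged}")
```

A ranking like this is only an input to the governance body's judgment; the scores make trade-offs visible but do not replace the documented rationale for each control.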

"Data integrity refers to the degree to which data are complete, consistent, and accurate throughout the data lifecycle."

MHRA GxP Data Integrity Guidance

04 Architecture and ISA‑95 layering

Data integrity by design leverages the ISA‑95 model to place controls closest to risk. Level 0–1 (sensors, PLCs) provide deterministic signals; Level 2 (SCADA/HMI) contextualizes; Level 3 (MES/LIMS) executes and records; Level 4 (ERP/QMS/WMS) plans and releases. The design objective is to prevent uncontrolled manual transcription across layers, assure time-aligned event ordering, and maintain traceability between physical product and records. Robust interface design protects original data while enabling derived calculations and review.

ISA‑95 Level | Typical assets | Data integrity controls by design
0–1 | Instruments, PLCs | Calibration status, signed firmware, secure protocols, sequence-of-events buffers
2 | SCADA/HMI | Unique operator login, authority checks, disabled write-arounds, time sync to trusted NTP
3 | MES/eBMR/eDHR, LIMS | RBAC, enforced workflows, Part 11 e‑signatures, audit trails, raw-data preservation
4 | ERP, QMS, WMS | Change control, master data governance, traceability links, controlled release and recall

05 Record capture, context, and metadata

Integrity starts with capture. Each record should bind identity, equipment, material, time, and location without relying on memory or later transcription. Design form fields and automated connectors so critical data elements are system-generated or scanned, not typed. Preserve original raw data (e.g., instrument files) and associate derived values with algorithms and versions used. Define minimum required metadata for each data class and enforce presence with real-time checks.

  • Required metadata: who, what, when (timestamp stored in UTC, with the local offset recorded where needed), where (asset/location), why (step/recipe), and how (method/config version).
  • Sequence integrity: monotonic, tamper-evident event IDs and server-side timestamps.
  • Data classification: raw/primary; processed/derived; reportable; decision/approval record.
  • Linkage: bidirectional genealogy between product units/lots and records.
  • Attachment policy: store original files in controlled repository; renderings are not substitutes.
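
The "enforce presence with real-time checks" idea above can be sketched as a capture function that rejects incomplete records and stamps the time on the server. The field names mirror the who/what/where/why/how list; the `CaptureRecord` type and `capture` function are hypothetical names for illustration, not part of any product API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# "when" is deliberately absent: it is assigned server-side, never by the client.
REQUIRED_FIELDS = ("who", "what", "where", "why", "how")


@dataclass(frozen=True)  # frozen: a captured record cannot be mutated in place
class CaptureRecord:
    who: str
    what: str
    where: str
    why: str
    how: str
    value: str
    when: datetime


def capture(value, **metadata) -> CaptureRecord:
    """Create a record only if every required metadata field is a non-empty string."""
    unknown = set(metadata) - set(REQUIRED_FIELDS)
    if unknown:
        raise ValueError(f"client-supplied fields not allowed: {', '.join(sorted(unknown))}")
    missing = [
        name for name in REQUIRED_FIELDS
        if not isinstance(metadata.get(name), str) or not metadata[name].strip()
    ]
    if missing:
        raise ValueError(f"missing required metadata: {', '.join(missing)}")
    return CaptureRecord(value=str(value), when=datetime.now(timezone.utc), **metadata)


record = capture(
    "12.5 kg",
    who="op-123",
    what="dispensed weight",
    where="scale-07",
    why="recipe step 4",
    how="method v3",
)
print(record.when.isoformat())
```

Rejecting a client-supplied `when` is a design choice in this sketch: it keeps the timestamp tied to one trusted clock rather than to whatever the terminal reports.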

06 Identity, authorization, and e‑signatures

Part 11 and Annex 11 expect unique, verified identities and secure binding of actions to individuals. Implement enterprise identity (e.g., SSO/SAML) with strong authentication commensurate to risk, role-based access control (RBAC) with least privilege, and segregation of duties for execution vs. approval. E‑signatures should be linked to meaning (what, why, and intent), bound to record content, and require reauthentication at appropriate control points. Two-person approvals for high-impact steps reduce error and deter manipulation.

  • Unique accounts; no shared logins; disable generic terminals for GxP actions.
  • RBAC and privilege reviews; automatic session lockouts; device trust management.
  • E‑signature ceremony: signer identity, time, meaning of signature, data hash to bind intent.
  • Biometric capture (where permitted) paired with credentials; secure storage and privacy controls.
  • Two-person e‑signatures and witness steps for critical operations (e.g., yield reconciliation).
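
The signature ceremony and two-person rule above can be sketched as follows. This is a minimal illustration under stated assumptions: a SHA-256 content hash stands in for the binding between signature and record (a real system would add cryptographic signing and a real authentication step), and `sign`, `approve`, and the `reauthenticated` flag are hypothetical names.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


def content_hash(record: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order does not change the hash."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Signature:
    signer: str
    meaning: str          # e.g. "performed", "approved"
    signed_at: datetime
    record_hash: str      # binds the signature to the exact content signed


def sign(record: dict, signer: str, meaning: str, reauthenticated: bool) -> Signature:
    if not reauthenticated:
        raise PermissionError("re-authentication required before signing")
    return Signature(signer, meaning, datetime.now(timezone.utc), content_hash(record))


def approve(record: dict, execution_sig: Signature, approver: str,
            reauthenticated: bool) -> Signature:
    """Second-person approval: record must be unchanged and approver must differ."""
    if content_hash(record) != execution_sig.record_hash:
        raise ValueError("record changed after the execution signature")
    if approver == execution_sig.signer:
        raise PermissionError("segregation of duties: executor cannot approve own work")
    return sign(record, approver, "approved", reauthenticated)


def still_valid(record: dict, sig: Signature) -> bool:
    """A signature no longer matches once the signed content is altered."""
    return content_hash(record) == sig.record_hash
```

Because the hash covers the whole record, any later edit invalidates earlier signatures and forces a visible re-sign instead of a silent change.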

07 Audit trails, review, and exception handling

Audit trails should be computer-generated, time-stamped, and tamper-evident, capturing creation, modification, deletion (if permitted), and e‑signing events, including before/after values and reasons for change. Design review-by-exception workflows with risk-based frequency and scope, ensuring reviewers have the context (diffs, related records, electronic attachments) to determine impact. Define procedural responses: categorize anomalies, assess product impact, and escalate to QMS (deviation/CAPA) when warranted.
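
One common way to make an audit trail tamper-evident is a hash chain, where each entry includes the hash of the previous one. The sketch below illustrates that general technique; it is not a description of how any particular MES stores its trail, and the `AuditTrail` class and its field names are assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AuditTrail:
    """Append-only list of entries, each chained to its predecessor by hash."""

    def __init__(self):
        self._entries = []

    def append(self, user, action, field, before, after, reason=""):
        if action in ("modify", "delete") and not reason.strip():
            raise ValueError("a reason for change is required")
        body = {
            "seq": len(self._entries) + 1,
            "at": datetime.now(timezone.utc).isoformat(),  # system-generated time
            "user": user,
            "action": action,
            "field": field,
            "before": before,
            "after": after,
            "reason": reason,
            "prev_hash": self._entries[-1]["hash"] if self._entries else GENESIS,
        }
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; an edited, removed, or reordered entry breaks the chain."""
        prev = GENESIS
        for position, entry in enumerate(self._entries, start=1):
            body = {k: v for k, v in entry.items() if k != "hash"}
            if (entry["seq"] != position or entry["prev_hash"] != prev
                    or _digest(body) != entry["hash"]):
                return False
            prev = entry["hash"]
        return True
```

A chain like this only detects tampering if the latest hash is also anchored somewhere the editor cannot reach, for example a separate write-once store.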

  1. Filter and prioritize audit trail events by critical data elements and steps.
  2. Perform impact assessment with linked records and real-time diffs.
  3. Document rationale; apply e‑signatures for review/approval.
  4. Trigger deviations/CAPA when thresholds or patterns indicate systemic risk.
  5. Trend exceptions and feed governance dashboards for management review.
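
Steps 1 and 4 of the review workflow above can be sketched as a filter plus a simple trend check. The set of critical fields, the event shape, and the threshold of two changes per user are hypothetical values chosen for illustration; real thresholds would come from the risk assessment.

```python
from collections import Counter

CRITICAL_FIELDS = {"weight_kg", "yield_pct", "hold_time_min"}  # hypothetical CDEs
REVIEWABLE_ACTIONS = {"modify", "delete"}


def select_for_review(events):
    """Step 1: keep only changes or deletions that touch critical data elements."""
    return [
        e for e in events
        if e["field"] in CRITICAL_FIELDS and e["action"] in REVIEWABLE_ACTIONS
    ]


def users_over_threshold(events, max_changes=2):
    """Step 4: flag users whose reviewable events exceed the assumed threshold."""
    counts = Counter(e["user"] for e in select_for_review(events))
    return sorted(user for user, n in counts.items() if n > max_changes)


events = [
    {"user": "op.a", "action": "create", "field": "weight_kg"},
    {"user": "op.a", "action": "modify", "field": "weight_kg"},
    {"user": "op.a", "action": "modify", "field": "yield_pct"},
    {"user": "op.a", "action": "delete", "field": "hold_time_min"},
    {"user": "op.b", "action": "modify", "field": "comment"},
    {"user": "op.b", "action": "modify", "field": "weight_kg"},
]

print(len(select_for_review(events)), "events need review")
print("escalation candidates:", users_over_threshold(events))
```

Filtering first keeps reviewer attention on the entries that can affect a decision; the trend check is a prompt for investigation, not a finding in itself.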

08 Validation, assurance, and change control

GAMP 5 (2nd ed.) and Annex 11 call for requirements- and risk-based validation that demonstrates fitness for intended use. Use critical thinking to target testing where integrity risk is highest (e.g., role constraints, e‑sign binding, audit trail integrity, failure modes of interfaces, backup/restore). Leverage supplier quality artifacts without duplicating work, and adopt FDA Computer Software Assurance (CSA) concepts to scale evidence to risk: rigorous, challenge-based testing for novel, high-risk functionality, and lighter unscripted or exploratory testing where risk is lower. Change control must assess data integrity impact, preserve traceability (configuration and recipe versions), and revalidate proportionately.

  • Define DI-critical requirements (identity, time, audit, raw data, calculations, interfaces).
  • Challenge tests: negative/abuse cases (e.g., spoofed timestamps, mid-transaction disconnects).
  • Backup/restore drills: prove recovery preserves integrity and completeness.
  • Periodic review: account/access recertification, log retention, vendor patches, clock health.
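
A backup/restore drill can produce objective evidence by comparing file hashes before backup and after restore. The sketch below assumes records are stored as files under a directory; `build_manifest` and `compare_restore` are hypothetical helper names, and a database-backed system would need an equivalent check at the record level.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large raw-data files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 hash."""
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_restore(manifest: dict, restored_root: Path) -> dict:
    """Report files that are missing, unexpected, or altered after a restore."""
    restored = build_manifest(restored_root)
    return {
        "missing": sorted(set(manifest) - set(restored)),
        "unexpected": sorted(set(restored) - set(manifest)),
        "altered": sorted(
            name for name in manifest.keys() & restored.keys()
            if manifest[name] != restored[name]
        ),
    }
```

In a drill, the manifest would be built from the source before backup and stored separately; an empty result in all three lists after restore is the evidence of completeness and integrity.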

09 Integration, interoperability, and raw data preservation

Interfaces are frequent sources of silent data loss or miscontextualization. Prefer secure, deterministic integrations (e.g., transactional APIs with acknowledgements) over file drops where feasible, implement message integrity checks, and maintain sequence numbers. For instrument data, capture and retain original raw files in a controlled repository; perform parsing/derivation in a validated service that logs algorithm version and provenance. Reconcile totals (e.g., counts, weights) across systems, and ensure error handling is visible and actionable at the point of use.

Interface type | Primary DI risk | Design safeguard
Equipment-to-MES (OPC/driver) | Dropped/duplicated signals; timestamp drift | Buffered events with sequence IDs; server-side timestamping; heartbeats; retries
MES-to-LIMS | Sample ID mismatch; result transcription | Barcode binding; bidirectional acknowledgments; schema validation; raw attachment retention
MES-to-ERP/WMS | Quantity/yield mismatch; partial commit | Transactional commits; reconciliation reports; rollbacks on failure; audit trail linkage
User uploads | Uncontrolled formats; altered files | Whitelisted formats; automated checksum; viewer-only renderings; immutable storage
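
The "buffered events with sequence IDs" safeguard in the first row can be sketched as a receiver that detects dropped and duplicated messages per source. The sketch assumes each source numbers its events from 1 and delivers them in order; `SequenceChecker` and the status strings are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceChecker:
    """Tracks the next expected sequence ID for each source (e.g. each PLC)."""
    expected: dict = field(default_factory=dict)

    def accept(self, source: str, seq: int) -> str:
        """Classify an incoming event as ok, a duplicate or late arrival, or a gap."""
        next_seq = self.expected.get(source, 1)
        if seq < next_seq:
            # Already seen or arrived out of order: never record it twice silently.
            return "duplicate_or_late"
        status = "ok" if seq == next_seq else f"gap:{next_seq}-{seq - 1}"
        self.expected[source] = seq + 1
        return status


checker = SequenceChecker()
for seq in (1, 2, 2, 5, 6):
    print(seq, checker.accept("plc-1", seq))
```

A reported gap should surface to the operator and trigger a retry or a resend request from the source buffer, so the loss is visible at the point of use.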

10 How V5 handles it

V5 Ultimate applies data integrity by design across ISA‑95 layers with enforced identities (SSO/SAML), granular RBAC, secure e‑signatures, trusted time sources, immutable audit trails, raw data repositories, and review-by-exception workflows. Unified modules (MES, QMS, eBMR/eDHR, LIMS, WMS, Maintenance) operate on a single record so master data, deviations/CAPA, calibrations, sampling, inventory moves, and batch execution share one provenance chain.

11 Edge cases and operational realities

Real plants face connectivity gaps, shift pressures, and hybrid workflows. Offline or degraded modes must be deliberate: restrict available functions, cache with signed, time-bound journals, and auto-reconcile on reconnect with human-in-the-loop conflict resolution. For time-critical operations (e.g., radiopharma), pre-validated templates, interlocks, and automated data capture reduce manual transcription under time stress. Transitional paper-to-electronic phases require explicit controls to avoid double data entry or orphan records, with clear master data ownership and cutovers under change control.

  • Degraded-mode playbooks: what can proceed, what must pause, and how to reconcile.
  • Barcode-first policy: scan over type for identities and materials.
  • Environmental and equipment data: automatic capture over manual readings; preserve raw sensor streams.
  • Batch pauses/aborts: capture reason, approval, and controlled restart with state provenance.
  • Supplier systems: qualify and periodically review cloud services impacting DI; document responsibilities.

12 Common pitfalls and anti‑patterns

Most DI failures are designed-in, not accidental. Avoid patterns that invite ambiguity or post-hoc reconstruction. Do not rely on unsecured spreadsheets for critical calculations; do not allow shared accounts or generic terminals to perform GxP actions; and do not treat audit trail review as a clerical step detached from process risk. Address clock drift proactively, and design interfaces to fail loudly with clear operator guidance.

  • Manual transcription between systems without dual verification or scan enforcement.
  • Hidden configuration drift (e.g., recipe or method edited without version control).
  • Free-text critical fields where enumerations or barcodes are feasible.
  • Unbounded delete/overwrite privileges; lack of before/after and reason capture.
  • Backups untested for restorability and integrity (hash mismatches, missing metadata).
  • Audit trail volumes that overwhelm reviewers; no risk-based filtering or analytics.

Frequently asked questions

Q. How does Data Integrity by Design differ from traditional compliance approaches?

Traditional approaches often add audit trails and signatures late in projects or rely on procedural compensations. Data Integrity by Design engineers integrity into capture, context, and control paths from the outset, aligning requirements, architecture, and validation so compliant records are a default outcome—not an afterthought.

Q. What are the most critical technical controls to prioritize in MES for data integrity?

Prioritize unique identities with RBAC, secure e‑signatures, trusted time synchronization, immutable audit trails with before/after values, raw data preservation, validated interfaces with acknowledgements, and rigorous backup/restore. These guard against the most common failure modes: attribution ambiguity, non-contemporaneous entries, silent data loss, and post-hoc edits.

Q. How often should audit trails be reviewed, and by whom?

Use risk-based frequency: critical steps warrant real-time or per-batch review, while lower-risk areas can be sampled periodically. Reviewers should be trained process owners or QA personnel independent from the originators, with authority to escalate to deviation/CAPA when needed.

Q. Can cloud-based MES comply with Part 11 and Annex 11 for data integrity?

Yes, provided supplier qualification, identity controls, audit trails, data residency/retention, backup/restore, and validated configurations meet requirements. Responsibilities must be clearly allocated in quality agreements, and the regulated company remains accountable for validation, oversight, and periodic review.

Q. How does FDA’s CSA impact data integrity assurance?

CSA encourages focusing evidence on high-risk functions, emphasizing critical thinking and challenge-based testing. For data integrity, that means more time validating e‑sign binding, audit trail behavior, and failure modes, and less duplicative scripted testing where supplier evidence suffices.
