// Exercise 03 — Database Design
Clinical Trials Data Model
Relational database schema designed for a hospital clinical trials support service.
Supports multiple studies, pseudonymized participants, longitudinal data collection,
role-based access, and file management.
Entity-Relationship Diagram
MySQL Diagram
Figure 1. MySQL relational diagram generated for the clinical trials support database.
Overview
Model Overview
The schema is centered on clinical studies, which act as the main organizational entity of the system.
Each study is associated with a responsible laboratory, includes pseudonymized participants, defines its own variable set,
and allows controlled access for users through study-specific roles.
The design also supports longitudinal data collection by storing repeated hospital visits for each participant.
Data recorded during those visits may correspond either to structured scalar values or to file-based clinical resources.
Direct patient identifiers are intentionally excluded from this operational database and would be managed in a separate
restricted environment in a real hospital setting.
Core Entities
Main Structural Blocks
1. Study Administration
The laboratories and clinical_studies tables define the institutional and
organizational context of the system. They store the identity of each study together with its code, description,
status, contact information, and responsible laboratory.
2. User Management
The users, roles, and user_study_access tables control system access.
This structure allows assigning different roles to the same user in different studies.
3. Pseudonymized Participants
The participants table stores only study-specific pseudonymized identifiers.
The participant_seq table supports automatic identifier generation for each study.
4. Longitudinal Visits
The visits table represents repeated participant follow-up over time.
Each visit stores date, label, optional notes, and the responsible user.
5. Dynamic Study Variables
The variable_set table defines the variables collected in each study.
This makes the schema flexible and avoids using fixed clinical columns for every possible trial.
6. Clinical Records
The schema stores clinical observations in clinical_data_items and
file metadata in clinical_data_files, both linked to visits and study variables.
Technical Documentation
Relational Structure
Technical relational structure including entities, keys, cardinalities, and referential integrity rules.
Tables in the Schema
- laboratories — stores laboratories responsible for clinical studies.
- clinical_studies — central table describing each clinical trial.
- roles — defines application roles (admin, supervisor, data entry).
- users — stores system users with authentication data.
- user_study_access — junction table linking users, studies, and roles.
- participant_seq — sequence counter for automatic participant ID generation.
- participants — pseudonymized study participants.
- visits — participant visits over time (longitudinal data).
- variable_set — study-specific variable definitions.
- clinical_data_items — scalar clinical values per visit/variable.
- clinical_data_files — file metadata (images, sequencing) per visit/variable.
Primary Keys
- laboratories.laboratory_id
- clinical_studies.clinical_study_id
- roles.role_id
- users.user_id
- user_study_access.access_id
- participant_seq.clinical_study_id
- participants.participant_id
- visits.visit_id
- variable_set.variable_id
- clinical_data_items.item_id
- clinical_data_files.file_id
Foreign Keys
- clinical_studies.laboratory_id → laboratories.laboratory_id
- user_study_access.user_id → users.user_id
- user_study_access.clinical_study_id → clinical_studies.clinical_study_id
- user_study_access.role_id → roles.role_id
- participant_seq.clinical_study_id → clinical_studies.clinical_study_id
- participants.clinical_study_id → clinical_studies.clinical_study_id
- visits.participant_id → participants.participant_id
- visits.user_in_charge → users.user_id
- variable_set.clinical_study_id → clinical_studies.clinical_study_id
- clinical_data_items.visit_id → visits.visit_id
- clinical_data_items.variable_id → variable_set.variable_id
- clinical_data_files.visit_id → visits.visit_id
- clinical_data_files.variable_id → variable_set.variable_id
- clinical_data_files.uploaded_by → users.user_id
Relationship Types
-
One-to-Many (1 : N)
This is the dominant pattern in the schema. Examples include:
laboratories → clinical_studies, clinical_studies → participants,
participants → visits, clinical_studies → variable_set,
visits → clinical_data_items, and visits → clinical_data_files.
-
Many-to-Many (N : N)
The relationship between users and studies is many-to-many and is implemented through
the user_study_access junction table.
-
One-to-One (1 : 1)
A logical one-to-one relationship exists between clinical_studies and
participant_seq, since each study uses a single sequence counter for participant ID generation.
Referential Integrity
- CASCADE — dependent records deleted with parent.
- SET NULL — reference removed, dependent record kept.
- RESTRICT — deletion blocked while dependents exist.
Pseudonymization
Participants are stored through pseudonymized study-specific identifiers.
The participant_seq table supports automatic ID generation.
Patient-identifying information is intentionally excluded from this operational schema
and would be handled externally in a restricted environment.
Summary
Technical Summary Table
| Table |
Primary Key |
Main Foreign Keys |
Relationship Type |
| laboratories | laboratory_id | — | 1:N with clinical_studies |
| clinical_studies | clinical_study_id | laboratory_id | 1:N with participants, variable_set, access |
| roles | role_id | — | 1:N with user_study_access |
| users | user_id | — | N:N with studies via access table |
| user_study_access | access_id | user_id, clinical_study_id, role_id | Junction table |
| participant_seq | clinical_study_id | clinical_study_id | 1:1 with clinical_studies |
| participants | participant_id | clinical_study_id | 1:N with visits |
| visits | visit_id | participant_id, user_in_charge | 1:N with data items/files |
| variable_set | variable_id | clinical_study_id | 1:N with data items/files |
| clinical_data_items | item_id | visit_id, variable_id | N-side of visit/variable |
| clinical_data_files | file_id | visit_id, variable_id, uploaded_by | N-side of visit/variable |
Rationale
Design Decisions
- Support for multiple studies — the schema manages several independent clinical studies within the same database.
- Flexible variable definition — different studies define different variables through the variable_set table.
- Longitudinal follow-up — the visit table supports repeated data collection over time.
- Controlled access by study — user_study_access assigns users different roles per study.
- Privacy-aware design — direct personal identifiers are excluded; handled externally in a restricted environment.