CDISC data standards
CDISC stands for Clinical Data Interchange Standards Consortium. CDISC provides several interrelated standards, each addressing a specific stage of the data lifecycle (Figure 1). Each standard has an Implementation Guide (IG) that shows how to apply it in practice, e.g., SDTMIG for SDTM. Every implementation guide is versioned, and it is important to keep track of which version a study uses.
CDISC also provides TAUGs (Therapeutic Area User Guides). A TAUG is a document that provides guidance on applying CDISC standards to a specific therapeutic area (e.g., oncology, diabetes, or CNS).
CDASH
CDASH (Clinical Data Acquisition Standards Harmonization) standardizes data entry into case report forms (CRFs) at clinical sites.
SDTM
SDTM (Study Data Tabulation Model) organizes the data collected via CDASH into submission-ready datasets. It defines domains/datasets (e.g., exposure and lab values) and relies on controlled terminology (standardized lists of allowed values, e.g., for test codes and units).
General Observation Classes
General Observation Classes are high-level classifications within SDTM that group domains based on the data type they contain.
There are three general observation classes: Interventions, Events, and Findings. Domains that do not fit one of these classes fall into other categories, such as Special purpose, Trial design, Relationship, and Study reference.
Each observation class contains domains, and each domain is organized around a particular topic. A domain is usually a single dataset; for example, EX is a dataset containing exposure (dosing) data. However, a domain can also be split into multiple datasets. A typical example is splitting the Laboratory Test Results (LB) domain because of its size.
Observation Class | Domain Name | Domain Abbreviation |
---|---|---|
Interventions | Procedure agents | AG |
Interventions | Concomitant/prior medications | CM |
Interventions | Exposure | EX |
Interventions | Exposure as collected | EC |
Interventions | Meal data | ML |
Interventions | Procedures | PR |
Interventions | Substance use | SU |
Events | Adverse events | AE |
Events | Biospecimen events | BE |
Events | Clinical events | CE |
Events | Disposition | DS |
Events | Healthcare encounters | HO |
Events | Medical history | MH |
Events | Protocol deviations | DV |
Findings | Product/drug accountability | DA |
Findings | Death details | DD |
Findings | ECG test results | EG |
Findings | Inclusion/Exclusion criteria not met | IE |
Findings | Biospecimen findings | BS |
Findings | Cell phenotype findings | CP |
Findings | Genomics findings | GF |
Findings | Immunogenicity specimen assessments | IS |
Findings | Laboratory test results | LB |
Findings | Microbiology specimen | MB |
Findings | Microbiology susceptibility | MS |
Findings | Microscopic findings | MI |
Findings | PK concentrations | PC |
Findings | PK parameters | PP |
Findings | Morphology | MO |
Findings | Cardiovascular system findings | CV |
Findings | Musculoskeletal system findings | MK |
Findings | Nervous system findings | NV |
Findings | Ophthalmic examinations | OE |
Findings | Reproductive system findings | RP |
Findings | Respiratory system findings | RE |
Findings | Urinary system findings | UR |
Findings | Physical examination | PE |
Findings | Functional tests | FT |
Findings | Questionnaires | QS |
Findings | Disease response and clinical classification | RS |
Findings | Subject characteristics | SC |
Findings | Subject status | SS |
Findings | Tumor/lesion identification | TU |
Findings | Tumor/lesion results | TR |
Findings | Vital signs | VS |
Findings about | Findings about events or interventions | FA |
Findings about | Skin response | SR |
Special Purpose | Comments | CO |
Special Purpose | Demographics | DM |
Special Purpose | Subject elements | SE |
Special Purpose | Subject disease milestones | SM |
Special Purpose | Subject visits | SV |
Trial design | Trial arms | TA |
Trial design | Trial disease assessments | TD |
Trial design | Trial elements | TE |
Trial design | Trial inclusion/exclusion criteria | TI |
Trial design | Trial disease milestones | TM |
Trial design | Trial summary | TS |
Trial design | Trial visits | TV |
Relationship | Related records | RELREC |
Relationship | Related specimens | RELSPEC |
Relationship | Related subjects | RELSUB |
Relationship | Supplemental qualifiers for [domain name] | SUPP-- |
Study reference | Non-host organism identifiers | OI |
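The class-to-domain relationship can be treated as a simple lookup. The sketch below is only an illustration: the dictionary holds a small subset of the table, the function name is hypothetical, and it assumes that split datasets keep the two-letter domain code as a prefix of the dataset name (for example, a dataset called LBCH would belong to LB).

```python
# Illustrative subset of the domain table; not the full SDTMIG domain list.
OBSERVATION_CLASS = {
    "CM": "Interventions",
    "EX": "Interventions",
    "AE": "Events",
    "DS": "Events",
    "LB": "Findings",
    "PC": "Findings",
    "FA": "Findings about",
    "DM": "Special purpose",
    "TS": "Trial design",
}


def observation_class(dataset_name: str) -> str:
    """Return the observation class for a dataset name such as 'EX' or 'LBCH'.

    Assumption: the first two characters of the dataset name are the
    domain code, which also covers split datasets.
    """
    domain = dataset_name[:2].upper()
    return OBSERVATION_CLASS.get(domain, "Unknown")


print(observation_class("EX"))
print(observation_class("LBCH"))
```

The prefix rule does not hold for the Relationship datasets (RELREC, SUPP--), so a fuller version would need to handle those names separately.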
PC Domain (Pharmacokinetics Concentrations)
ref: https://www.fda.gov/media/153632/download
The PC domain should support the creation of time series graphs and automatic calculation of pharmacokinetic parameters from sets of related plasma concentrations. Three elements are necessary:
- Nominal timings relative to the dose in ISO 8601 duration format
- Grouping of each different set of time series measurements used to calculate a related pharmacokinetic parameter
- Identification of the start of each time series relative to the start of exposure
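As a rough illustration of the first element, the sketch below formats a nominal elapsed time in hours as an ISO 8601 duration of the kind stored in PCELTM. The function name is hypothetical, times are rounded to whole minutes, and the leading minus sign for pre-dose time points is an assumption about how negative elapsed times are represented.

```python
def elapsed_to_iso8601(hours: float) -> str:
    """Format a nominal elapsed time (in hours) as an ISO 8601 duration.

    Assumptions: minute resolution is sufficient, and pre-dose time
    points are written with a leading '-' (e.g. '-PT15M').
    """
    sign = "-" if hours < 0 else ""
    total_minutes = round(abs(hours) * 60)
    h, m = divmod(total_minutes, 60)
    if m == 0:
        return f"{sign}PT{h}H"
    if h == 0:
        return f"{sign}PT{m}M"
    return f"{sign}PT{h}H{m}M"


for nominal in (0.5, 1.5, 24):
    print(nominal, elapsed_to_iso8601(nominal))
```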
If the nominal times are provided in PCELTM, nulls should be avoided for plasma concentrations used to calculate a profile. PCDTC and PCDY should be populated with actual/collected information when it is available; however, for single dose, repeat dose, or carcinogenicity studies where actual/collected data are not readily available to be incorporated into the dataset, these variables may be left null or populated with calculated or nominal dates/times. The use of calculated or nominal dates and times should be mentioned in the nSDRG. When actual dose dates or date/time values are available for PCRFTDTC/PPRFTDTC, they can be included.
When a test result is below the lower limit of quantitation (LLOQ), it should be submitted using the following instructions:
- PCORRES should not contain a specific value. For example, the value in PCORRES may be <LLOQ, where LLOQ is the numerical value.
- BLQ should be in PCSTRESC to signify that the result is below the LLOQ. (According to the FDA’s Bioanalytical Method Validation Guidance for Industry (May 2018), study samples with concentrations below the LLOQ should be reported as BQL; however, BLQ, as specified in FDA-supported SENDIG versions, is appropriate to use in SEND datasets to report this data.)
- PCSTRESN should be blank.
- Standardized units for LOQ should be in PCSTRESU.
- PCLLOQ should be populated with the lower limit of quantitation for the analyte.
- When a numeric value has been assigned to a result that is below the LLOQ for the purpose of group summary statistics, that value should be submitted in SUPPPC as QNAM = PCCALCN to allow the group statistics presented in the study report to be reproduced. When a value that is below the LLOQ is excluded from group statistics, no PCCALCN entry is needed.
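The LLOQ rules can be sketched as a small function that fills the result variables for one PC record. This is a minimal illustration, not a validated implementation: the function name is hypothetical, a missing concentration is assumed to mean the assay reported the sample as below the LLOQ, and no unit conversion between original and standardized results is performed.

```python
from typing import Optional


def pc_result_fields(conc: Optional[float], lloq: float, unit: str) -> dict:
    """Populate PC result variables for one concentration.

    Assumptions: conc is None when the lab reported the sample as below
    the LLOQ, and original and standardized units are identical.
    """
    if conc is None or conc < lloq:
        return {
            "PCORRES": f"<{lloq:g}",  # no specific value, only "<LLOQ"
            "PCSTRESC": "BLQ",
            "PCSTRESN": None,         # blank for results below the LLOQ
            "PCSTRESU": unit,
            "PCLLOQ": lloq,
        }
    text = f"{conc:g}"  # ':g' keeps at most 6 significant digits
    return {
        "PCORRES": text,
        "PCSTRESC": text,
        "PCSTRESN": conc,
        "PCSTRESU": unit,
        "PCLLOQ": lloq,
    }


print(pc_result_fields(None, 0.5, "ng/mL"))
print(pc_result_fields(12.3, 0.5, "ng/mL"))
```

A value imputed for summary statistics would not go into these variables; per the instructions above it belongs in SUPPPC with QNAM = PCCALCN.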
SDTM variables
Variables are classified as required, expected, or permissible for each domain.
Identifier
- STUDYID
- DOMAIN
- USUBJID
- --SEQ
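In variable names, the two-character prefix placeholder is replaced by the domain code, so the sequence variable is EXSEQ in EX and LBSEQ in LB. A minimal sketch of assigning a sequence number that is unique within each subject is shown below; the function name and the sample records are hypothetical, and records are assumed to be already sorted in the desired order.

```python
from collections import defaultdict


def assign_seq(records: list, domain: str) -> list:
    """Add DOMAIN and a per-subject sequence number to each record.

    Assumption: records arrive in the order in which they should be
    numbered; the sequence restarts at 1 for every USUBJID.
    """
    counters = defaultdict(int)
    seq_var = f"{domain}SEQ"
    numbered = []
    for rec in records:
        counters[rec["USUBJID"]] += 1
        numbered.append({**rec, "DOMAIN": domain, seq_var: counters[rec["USUBJID"]]})
    return numbered


doses = [
    {"STUDYID": "STUDY1", "USUBJID": "STUDY1-001", "EXTRT": "Drug A"},
    {"STUDYID": "STUDY1", "USUBJID": "STUDY1-001", "EXTRT": "Drug A"},
    {"STUDYID": "STUDY1", "USUBJID": "STUDY1-002", "EXTRT": "Placebo"},
]
ex = assign_seq(doses, "EX")
print([(r["USUBJID"], r["EXSEQ"]) for r in ex])
```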
Topic
Topic variables contain the focus of the observation.
- --TRT
- --TERM
- --TESTCD
Qualifier
Qualifiers provide additional information about the topic variables.
- Grouping
- Result
- Synonym
- Record
- Variable
Timing
Timing variables describe the timing of the observation.
- Date/times
- Study days
- Durations
- Intervals
- Visits
- Time points
- Relative times
ADaM
ADaM (Analysis Data Model) defines datasets that are derived from SDTM data and are suitable for statistical analysis. These derived datasets are used to produce tables, listings, and figures. Like SDTM, ADaM is organized into several datasets, each with its own structure.
ref: https://www.cdisc.org/system/files/members/standard/foundational/ADaMIG_v1.3.pdf
ADaM Standard data structures
- ADSL: Subject-level analysis dataset
  - One record per subject
- BDS: Basic data structure
  - One (or more) record per subject/parameter/timepoint
- OCCDS: Occurrence data structure
  - Designed for counting occurrences
- ADaM Other: any other ADaM data structure
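The record-level structure of ADSL and BDS can be illustrated by checking which key combinations repeat. The sketch below uses hypothetical sample rows and a hypothetical helper name; USUBJID, PARAMCD, AVISIT, and AVAL are standard ADaM variable names, while the choice of AVISIT as the timepoint key is an assumption made for the example.

```python
from collections import Counter


def duplicated_keys(rows: list, key_vars: list) -> list:
    """Return the key combinations that occur more than once in rows."""
    counts = Counter(tuple(row[v] for v in key_vars) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# ADSL: one record per subject.
adsl = [
    {"USUBJID": "STUDY1-001", "TRT01P": "Drug A"},
    {"USUBJID": "STUDY1-002", "TRT01P": "Placebo"},
]

# BDS: records keyed by subject, parameter, and timepoint.
advs = [
    {"USUBJID": "STUDY1-001", "PARAMCD": "SYSBP", "AVISIT": "Baseline", "AVAL": 120.0},
    {"USUBJID": "STUDY1-001", "PARAMCD": "SYSBP", "AVISIT": "Week 2", "AVAL": 118.0},
    {"USUBJID": "STUDY1-002", "PARAMCD": "SYSBP", "AVISIT": "Baseline", "AVAL": 131.0},
]

adsl_dupes = duplicated_keys(adsl, ["USUBJID"])
bds_dupes = duplicated_keys(advs, ["USUBJID", "PARAMCD", "AVISIT"])
print(adsl_dupes, bds_dupes)
```

A repeated key in ADSL indicates a structural problem, whereas in BDS it only means there is more than one record for that subject/parameter/timepoint, which the structure permits.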
ADaM variable conventions
General variable conventions
All ADaM variable names must be no more than 8 characters in length, start with a letter (not underscore), and be composed only of letters (A–Z), underscore (_), and numerals (0–9).
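The naming rule can be expressed as a regular expression. This is a minimal sketch with a hypothetical function name, and it assumes that names are written in uppercase.

```python
import re

# One leading letter, then up to seven letters, digits, or underscores.
ADAM_NAME = re.compile(r"[A-Z][A-Z0-9_]{0,7}")


def is_valid_adam_name(name: str) -> bool:
    """Check the length, first-character, and character-set rules."""
    return ADAM_NAME.fullmatch(name) is not None


for candidate in ("TRTPN", "AVAL", "_FLAG", "1STDOSE", "BASELINEVAL"):
    print(candidate, is_valid_adam_name(candidate))
```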
ADaM adheres to a principle of harmonization known as “same name, same meaning, same values.”
In general, if an SDTM character variable is converted to a numeric variable in an ADaM dataset, it should keep its SDTM name with an “N” suffix added.
In a pair of corresponding variables (e.g., TRTP and TRTPN), the primary or most commonly used variable does not have the suffix or extension (i.e., N for numeric or C for character). The relevant suffix is used only on the name of the secondary member of the variable pair.
Variables whose names end in FL are character flag (or indicator) variables.
Variables whose names end in GRy, Gy, or CATy are grouping variables, where “y” refers to the grouping scheme or algorithm (not the category within the grouping).
It is recommended that producer-defined grouping or categorization variables begin with the name of the variable being grouped and end in GRy (e.g., variable ABCGRy is a character description of a grouping or categorization of the values from the ABC variable for analysis purposes).
Timing variable conventions
Variables whose names end in DT are numeric dates.
Variables whose names end in DTM are numeric datetimes.
Variables whose names end in TM are numeric times.
Names of timing start variables end with an S followed by the characters indicating the type of timing (i.e., SDT, STM, SDTM).
Names of timing end variables end with an E followed by the characters indicating the type of timing (i.e., EDT, ETM, EDTM).
Variables whose names end in DY are relative day variables.
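As an illustration of the DT and DY conventions, the sketch below converts ISO 8601 date strings to dates and derives a relative day with no day 0: the reference date is day 1 and the day before it is day -1. The function name and the sample dates are hypothetical; TRTSDT, ASTDT, and ASTDY are used here as example variable names.

```python
from datetime import date


def relative_day(event: date, reference: date) -> int:
    """Relative day with no day 0: reference date is 1, day before is -1."""
    diff = (event - reference).days
    return diff + 1 if diff >= 0 else diff


trtsdt = date.fromisoformat("2024-03-01")  # treatment start date (TRTSDT)
astdt = date.fromisoformat("2024-03-05")   # analysis start date (ASTDT)
astdy = relative_day(astdt, trtsdt)        # analysis start relative day (ASTDY)
print(astdy)
```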
Variable naming fragments
Fragment | Notes |
---|---|
GRy | Grouping variables |
Gy | Abbreviation of GRy |
FL | Flag/indicator |
DT | Numeric date |
TM | Numeric time |
DTM | Numeric datetime |
DTF | Date imputation flag |
TMF | Time imputation flag |
DY | Relative days (does not include day 0) |
Fragment | Notes |
---|---|
BL | Baseline |
CHG | Change |
FU | Follow-up |
OT | On treatment |
RU | Run-in |
SC | Screening |
TA | Taper |
TI | Titration |
U | Units |
WA | Washout |
ADSL variables
Variable | Notes |
---|---|
BDS variables
Analysis-enabling variables
SEND
SEND (Standard for Exchange of Nonclinical Data) is an implementation of SDTM for nonclinical (animal) studies. It specifies how nonclinical data are organized and formatted.