Merged English Cancer Registry Data

Each of the English cancer registries provides an extract from its database; these extracts are merged to form the core of the National Cancer Data Repository. The 2008 version of the repository contains data from 1990 to 2008, with the specification shown below.

Patient data (demographics and death data)

| Field name | Description of contents | Format | Allowed values | Validation criteria |
| --- | --- | --- | --- | --- |
| ONS_number | ONS number. A unique identifier made up from the Registry Centre Code, Registration Year and Registration Serial Number. | an11 | 11-digit number | No validation |
| PatientNumberMerge | This is the patient code used by your registry. | an20 | Free text | No validation |
| nhsno | NHS number. | n10 | 10-digit number | Check digit is valid |
| dob | Full date of birth | an10 | ccyy-mm-dd format | Must be blank, or a valid date |
| birth_date_flag | Imputed date of birth flag. Imputation of dates to follow rules agreed by UKACR DQAR sub-group (scheduled August 2010). Blank field indicates that date imputation did not occur. | n1 | (1,2,3,4,5,6,7) 1: Day imputed; 2: Month imputed; 3: Day and month imputed; 4: Year imputed; 5: Day and Year imputed; 6: Month and Year imputed; 7: Day, Month and Year imputed | Must be blank, or a valid code |
| Forename1 | First name | an35 | Free text | No validation |
| Forename2 | Subsequent names | an35 | Free text | No validation |
| Surname | Surname at time of diagnosis | an35 | Free text | No validation |
| otherSurname | Surname at birth - if blank this is assumed to be same as surname at time of diagnosis | an35 | Free text | No validation |
| postcode7 | Full postcode (at time of diagnosis) | an8 | Postcode-7 format | Is valid PC-7 format and can be found in current UKACR postcode directory |
| sex | Gender | n1 | 1: Male; 2: Female; 9: Not specified; 0: Not known | Must be blank, or valid code |
| ethnicity | Ethnicity. Data recorded in 1991 census ethnicity codes should be mapped to 2001 codes using the mapping approved by the UKACR DQAR group. | an1 | A: British (White); B: Irish (White); C: Any other White background; D: White and Black Caribbean (Mixed); E: White and Black African (Mixed); F: White and Asian (Mixed); G: Any other Mixed background; H: Indian (Asian or Asian British); J: Pakistani (Asian or Asian British); K: Bangladeshi (Asian or Asian British); L: Any other Asian background; M: Caribbean (Black or Black British); N: African (Black or Black British); P: Any other Black background; R: Chinese (other ethnic group); S: Any other ethnic group; Z: Not stated; X: Not known | Must be blank, or valid code |
| DCO | Death certificate only flag | an1 | (Y,N) Y: This is a death certificate only registration; N: This is not a death certificate only registration | Must be blank, or valid code |
| extra_regional | Extra-regional flag | an1 | (Y,N) Y: This is an extra regional registration; N: This is a registration from within the submitting registry area | Must be blank, or valid code |
| dod | Date of death | an10 | ccyy-mm-dd | Must be blank, or a valid date |
| death_date_flag | Imputed date of death flag. Imputation of dates to follow rules agreed by UKACR DQAR sub-group (scheduled August 2010). Blank field indicates that date imputation did not occur. | n1 | (1,2,3,4,5,6,7) 1: Day imputed; 2: Month imputed; 3: Day and month imputed; 4: Year imputed; 5: Day and Year imputed; 6: Month and Year imputed; 7: Day, Month and Year imputed | Must be blank, or a valid code |
| cod_1a | Cause of death 1a | an6 | ICD-10 codes | Must be a valid ICD-10 code |
| cod_1b | Cause of death 1b | an6 | ICD-10 codes | Must be a valid ICD-10 code |
| cod_1c | Cause of death 1c | an6 | ICD-10 codes | Must be a valid ICD-10 code |
| cod_2 | Cause of death 2 | an6 | ICD-10 codes | Must be a valid ICD-10 code |
| place_of_death | Place of death | n1 | (1,2,3,4,5,6) 1: Hospital; 2: NHS hospice / specialist palliative care unit; 3: Voluntary hospice / specialist palliative care unit; 4: Patient's own home; 5: Care Home; 6: Other | Must be blank, or valid code |
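
Several of the validation criteria for the patient data can be checked mechanically. The following is a minimal Python sketch, not part of the specification: the function names are hypothetical, and it assumes that "Check digit is valid" refers to the standard Modulus 11 check used for NHS numbers. The flag decoding relies on the observation that the date-flag codes 1 to 7 behave like a bitmask (1 = day, 2 = month, 4 = year), which is consistent with the code list for the three date-flag fields.

```python
import re
from datetime import date


def nhs_number_is_valid(nhsno: str) -> bool:
    """Check the n10 format and the Modulus 11 check digit (assumed rule)."""
    if not re.fullmatch(r"\d{10}", nhsno):
        return False
    digits = [int(c) for c in nhsno]
    # Weight the first nine digits by 10, 9, ..., 2 and sum the products.
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check digit of 10 means the number can never be valid.
    return check != 10 and check == digits[9]


def date_is_valid(value: str) -> bool:
    """'Must be blank, or a valid date' in ccyy-mm-dd format."""
    if value == "":
        return True
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", value)
    if match is None:
        return False
    try:
        date(int(match[1]), int(match[2]), int(match[3]))
    except ValueError:  # e.g. month 13 or 30 February
        return False
    return True


def imputed_parts(flag: str) -> set:
    """Decode a date flag (1-7) into the imputed date components."""
    if flag == "":
        return set()  # blank: no imputation occurred
    code = int(flag)
    if not 1 <= code <= 7:
        raise ValueError(f"invalid date flag: {flag!r}")
    return {name for bit, name in ((1, "day"), (2, "month"), (4, "year")) if code & bit}
```

For example, `imputed_parts("5")` yields day and year, matching code 5 ("Day and Year imputed"), and `date_is_valid("2007-02-29")` is rejected because 2007 was not a leap year.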

Tumour data (Diagnostic & tumour flags)

| Field name | Description of contents | Format | Allowed values | Validation criteria |
| --- | --- | --- | --- | --- |
| PatientNumberMerge | This is the patient code used by your registry (this is repeated from the Demographics and Death section as the second section is envisaged to be a separate table and the ID must be repeated to enable linking) | an20 | Free text | No validation |
| TumourNumberMerge | This is the tumour number used by your registry | an20 | Free text | No validation |
| site4 | Full ICD-10 code (4-digit code) | an6 | ICD-10 codes | Must be a valid ICD-10 code |
| morphology_system | Morphology coding system - indicates whether the ICD-O-2 or ICD-O-3 system is used | n1 | (2,3) 2: ICD-O-2 is used to record morphology; 3: ICD-O-3 is used to record morphology | Must be blank, or valid code |
| type5 | Morphology code (ICD-O-2 or ICD-O-3). The behaviour code should be included as the fifth digit of the morphology code, using the appropriate ICD-O-2 or ICD-O-3 definition | n5 | Valid ICD-O-2 or ICD-O-3 codes | Must be number >80000 |
| diag_date | Date of diagnosis | an10 | ccyy-mm-dd | Must be blank, or a valid date |
| diag_date_flag | Imputed date of diagnosis flag. Imputation of dates to follow rules agreed by UKACR DQAR sub-group (scheduled August 2010). Blank field indicates that date imputation did not occur. | n1 | (1,2,3,4,5,6,7) 1: Day imputed; 2: Month imputed; 3: Day and month imputed; 4: Year imputed; 5: Day and Year imputed; 6: Month and Year imputed; 7: Day, Month and Year imputed | Must be blank, or a valid code |
| basis_of_diagnosis | Basis of diagnosis (e.g. histology, cytology, clinical opinion etc) | n1 | (0, 1, 2, 3, 4, 5, 6, 7, 9) Non-microscopic: 0 Death Certificate: The only information available is from a death certificate; 1 Clinical: Diagnosis made before death but without the benefit of any of the following (2-7); 2 Clinical Investigation: Includes all diagnostic techniques (e.g. X-rays, endoscopy, imaging, ultrasound, exploratory surgery and autopsy) without a tissue diagnosis; 4 Specific tumour markers: Includes biochemical and/or immunological markers which are specific for a tumour site. Microscopic: 5 Cytology: Examination of cells whether from a primary or secondary site, including fluids aspirated using endoscopes or needles. Also including microscopic examination of peripheral blood films and trephine bone marrow aspirates; 6 Histology of a metastasis: Histological examination of tissues from a metastasis, including autopsy specimens; 7 Histology of a primary tumour: Histological examination of tissue from the primary tumour, however obtained, including all cutting and bone marrow biopsies. Also includes autopsy specimens of a primary tumour. 9 Unknown: No information on how the diagnosis has been made (e.g. PAS or HISS record only) | Must be blank, or a valid code |
| screening_status | Screening Status | n1 | (1,2,3,9) 1: Cancers detected by national screening program; 2: Cancers in the screening population which are detected between screens in the normal screening round; 3: Other cancers; 9: Not known (default) | Must be blank or a valid code |
| tumour_size | The size in millimetres of the diameter of a lesion, largest if more than one, if the histology of a SAMPLE proves to be invasive. | n2 | The size in millimetres of the diameter of a lesion, largest if more than one, if the histology of a SAMPLE proves to be invasive. | Must be number or blank |
| grade | This field records the grade of the tumour, for tumours that are graded on a simple numeric 1-3 or 1-4 scale. | an2 | (GX,G1,G2,G3,G4) where GX: Grade of differentiation is not appropriate or cannot be assessed; G1: Well differentiated; G2: Moderately differentiated; G3: Poorly differentiated; G4: Undifferentiated / anaplastic | Must be blank, or a valid code |
| grade_Description | This field is used to modify the meaning of the "grade" field depending on the grading system that is recorded in the "grade" field. For tumours where the standard well-moderately-poorly-undifferentiated scale is used this should be 1. | n1 | (1,2,3,4,5,9) 1: Well / moderately / poorly / undifferentiated; 2: Low / Intermediate / High; 3: Fuhrman Grade (Kidney only); 4: WHO grade (bladder only); 5: Bloom and Richardson Grade (Breast only); 9: Unknown | Must be blank, or a valid code |
| gleason_grade | This is written as two scores, minimum value 1, maximum value 5, e.g. 2 + 3. Gleason Grade format follows that of NCDS: Appendix for Urological Cancer | an3 | Primary grade (1-5)+secondary grade (1-5), e.g. "3+4" | Must be blank, or a valid format |
| Laterality | Tumour laterality | an1 | (L,R,M,B,8,9) L: Left; R: Right; M: Midline; B: Bilateral; 8: Not applicable; 9: Not known | Must be blank, or a valid code |
| nodes_examined | Number of nodes examined | n2 | Integers | Must be blank or integer |
| nodes_positive | Number of nodes found positive | n2 | Integers | Must be blank or integer |
| mets | Distant metastases | an1 | (Y,N,X) Y: There are distant metastases present at diagnosis; N: No metastases can be detected; X: It is not known or not recorded whether metastases are present | Must be blank, or valid code |
| UICC_version | UICC staging version, i.e. which version of the TNM classification of malignant cancers was used to stage the tumour | an1 | (5,6,7,X) 5: TNM Classification of Malignant Tumours Version 5 was used; 6: TNM Classification of Malignant Tumours Version 6 was used; 7: TNM Classification of Malignant Tumours Version 7 was used; X: TNM Classification of Malignant Tumours Version Unknown | Must be blank, or valid code |
| DUKE_stage | Dukes' stage | an1 | (A,B,C,C1,C2,D) | Must be blank, or a valid code |
| FIGO_stage | FIGO stage | an4 | (0, I, IA, IA1, IA2, IB, IB1, IB2, IC, II, IIA, IIA1, IIA2, IIB, IIC, III, IIIA, IIIB, IIIC, IIIC1, IIIC2, IV, IVA, IVB) | Must be blank, or a valid code |
| CLARK_level | Clark level | an1 | (1,2,3,4,5) | Must be blank, or a valid code |
| NPI_score | The Nottingham Prognostic Index score (not the derived stage) | n5 | Format: nn.nn | Must be blank, or a valid format |
| Breslow | Breslow thickness in millimetres. | n5 | Number | Number or blank |
| TNM_clin | TNM stage grouping as defined by the TNM handbook for the combination of clinical T, N and M in the t_clin, n_clin, and m_clin fields, i.e. "IIA" not "100". Includes Ann Arbor staging. | an7 | (0, 0a, 0is, I, IA, IA1, IA2, IB, IB1, IB2, IC, IEA, IEB, IS, II, IIA, IIB, IIC, IIEA, IIEB, III, IIIA, IIIB, IIIC, IIIEA, IIIEB, IIIESA, IIIESB, IIISA, IIISB, IV, IVA, IVB, IVC) | Must be blank, or a valid code |
| t_clin | T stage (clinical) | an7 | (T0, Tis, Tis pu, Tis pd, Ta, T1, T1a, T1a1, T1a2, T1b, T2b1, T2b2, T1c, T2, T2a, T2b, T2c, T3, T3a, T3b, T3c, T4, T4a, T4b, T4c, T4d, TX) | Must be blank, or a valid code |
| n_clin | N stage (clinical) | an7 | (N0, N1, N1a, N1b, N2, N2a, N2b, N2c, N3, N3a, N3b, N3c, NX) | Must be blank, or a valid code |
| m_clin | M stage (clinical) | an7 | (M0, M1, M1a, M1b, M1c, MX) | Must be blank, or a valid code |
| neoadjuvant_flag_path | Neo-adjuvant treatment flag. As discussed in the May 27th meeting. Indicates that staging occurred following prior tumour-shrinking treatment. | an1 | (Y,N,X) Y: Neo-adjuvant treatment occurred prior to path staging; N: Neo-adjuvant treatment did not occur prior to path staging; X: Unknown whether neo-adjuvant treatment occurred prior to path staging | Must be blank, or a valid code |
| TNM_path | TNM stage grouping as defined by the TNM handbook for the combination of pathological T, N and M in the t_path, n_path, and m_path fields, i.e. "IIA" not "100". Includes Ann Arbor staging. | an7 | (0, 0a, 0is, I, IA, IA1, IA2, IB, IB1, IB2, IC, IEA, IEB, IS, II, IIA, IIB, IIC, IIEA, IIEB, III, IIIA, IIIB, IIIC, IIIEA, IIIEB, IIIESA, IIIESB, IIISA, IIISB, IV, IVA, IVB, IVC) | Must be blank, or a valid code |
| t_path | T stage (pathological) | an7 | (pT0, pTis, pTis pu, pTis pd, pTa, pT1, pT1a, pT1a1, pT1a2, pT1b, pT1mic, pT2b1, pT2b2, pT1c, pT2, pT2a, pT2b, pT2c, pT3, pT3a, pT3b, pT3c, pT4, pT4a, pT4b, pT4c, pT4d, pTX) | Must be blank, or a valid code |
| n_path | N stage (pathological) | an7 | (pN0, pN1, pN1a, pN1b, pN1bi, pN1bii, pN1biii, pN1biv, pN1mi, pN2, pN2a, pN2b, pN2c, pN3, pN3a, pN3b, pN3c, pNX) | Must be blank, or a valid code |
| m_path | M stage (pathological) | an7 | (pM0, pM1, pM1a, pM1b, pM1c, pMX) | Must be blank, or a valid code |
| TNM_int | TNM stage grouping as defined by the TNM handbook for the combination of integrated T, N and M in the t_int, n_int, and m_int fields, i.e. "IIA" not "100". Includes Ann Arbor staging. | an7 | (0, 0a, 0is, I, IA, IA1, IA2, IB, IB1, IB2, IC, IEA, IEB, IS, II, IIA, IIB, IIC, IIEA, IIEB, III, IIIA, IIIB, IIIC, IIIEA, IIIEB, IIIESA, IIIESB, IIISA, IIISB, IV, IVA, IVB, IVC) | Must be blank, or a valid code |
| t_int | T stage (integrated) | an7 | (T0, Tis, Tis pu, Tis pd, Ta, T1, T1a, T1a1, T1a2, T1b, T2b1, T2b2, T1c, T2, T2a, T2b, T2c, T3, T3a, T3b, T3c, T4, T4a, T4b, T4c, T4d, TX) | Must be blank, or a valid code |
| n_int | N stage (integrated) | an7 | (N0, N1, N1a, N1b, N2, N2a, N2b, N2c, N3, N3a, N3b, N3c, NX) | Must be blank, or a valid code |
| m_int | M stage (integrated) | an7 | (M0, M1, M1a, M1b, M1c, MX) | Must be blank, or a valid code |
| surgeryTherapy | Treatment indicator (Y/N) for surgery. All recorded treatments within 6 months of diagnosis should be included. Only curative surgeries should be included. | an1 | (Y,N) | Must be blank, or valid code |
| RT | Treatment indicator (Y/N) for radiotherapy (RT). All recorded treatments within 6 months of diagnosis should be included. | an1 | (Y,N) | Must be blank, or valid code |
| CT | Treatment indicator (Y/N) for chemotherapy (CT). All recorded treatments within 6 months of diagnosis should be included. | an1 | (Y,N) | Must be blank, or valid code |
| HormoneTherapy | Treatment indicator (Y/N) for hormone therapy. All recorded treatments within 6 months of diagnosis should be included. Includes immunotherapy. | an1 | (Y,N) | Must be blank, or valid code |
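
The format-based criteria for gleason_grade, NPI_score and type5 lend themselves to simple pattern checks. The sketch below is illustrative only and the helper names are hypothetical. It assumes that the compact an3 form ("3+4") is the only accepted Gleason format, so the spaced form "2 + 3" used in the description would be rejected, and it reads the "nn.nn" NPI format literally as two digits, a point, and two digits.

```python
import re

GLEASON_PATTERN = re.compile(r"([1-5])\+([1-5])")  # primary+secondary, an3
NPI_PATTERN = re.compile(r"\d{2}\.\d{2}")          # nn.nn


def gleason_is_valid(value: str) -> bool:
    """'Must be blank, or a valid format' for gleason_grade."""
    return value == "" or GLEASON_PATTERN.fullmatch(value) is not None


def parse_gleason(value: str) -> tuple:
    """Return (primary grade, secondary grade) as integers."""
    match = GLEASON_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a Gleason grade: {value!r}")
    return int(match[1]), int(match[2])


def npi_is_valid(value: str) -> bool:
    """'Must be blank, or a valid format' for NPI_score."""
    return value == "" or NPI_PATTERN.fullmatch(value) is not None


def split_type5(value: str) -> tuple:
    """Split a five-digit morphology code into (histology, behaviour digit).

    The specification requires a number greater than 80000, with the
    behaviour code as the fifth digit.
    """
    if re.fullmatch(r"\d{5}", value) is None or int(value) <= 80000:
        raise ValueError(f"not a valid type5 value: {value!r}")
    return value[:4], value[4]
```

For instance, `split_type5("81403")` separates the four-digit histology part "8140" from the behaviour digit "3".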

Treatment data

| Field name | Description of contents | Format | Allowed values | Validation criteria |
| --- | --- | --- | --- | --- |
| TumourNumberMerge | This is the tumour number used by your registry (this is repeated from the diagnostics and treatment flags section as this section is envisaged to be a separate table and the ID must be repeated to enable linking) | an20 | Free text | No validation |
| TreatmentNumber | This is the unique treatment number used by your registry | an20 | Free text | No validation |
| TreatmentSite | The site at which the treatment was performed. NHS Data Dictionary Supporting Information, Administrative Codes, NHS Trust Site (5 character code, starts with ‘R’, allocated by OCS) | an5 | Valid site code | No validation |
| Date | The date at which the clinical intervention started | an10 | ccyy-mm-dd | No validation |
| Treatment_code | The OPCS-4 code of the treatment | an5 | Valid OPCS code | No validation |
| Consultant_code | The consultant code of the responsible consultant. NHS Data Dictionary Supporting Information, Administrative Codes (Practitioner Code for a Consultant) | an8 | Valid consultant code | No validation |
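
Because PatientNumberMerge and TumourNumberMerge are repeated across the sections, the three sections can be held as separate tables and joined on those identifiers. The following sketch uses an in-memory SQLite database to show one way this could work. The table names (patient, tumour, treatment), the choice of keys, and all row values are assumptions invented for illustration; only a few columns from each section are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (
    PatientNumberMerge TEXT PRIMARY KEY,
    sex INTEGER
);
CREATE TABLE tumour (
    TumourNumberMerge TEXT PRIMARY KEY,  -- assumed unique per tumour
    PatientNumberMerge TEXT REFERENCES patient (PatientNumberMerge),
    site4 TEXT
);
CREATE TABLE treatment (
    TreatmentNumber TEXT PRIMARY KEY,
    TumourNumberMerge TEXT REFERENCES tumour (TumourNumberMerge),
    "Date" TEXT,
    Treatment_code TEXT
);
""")

# Invented sample rows: one patient, one tumour, two treatments.
conn.execute("INSERT INTO patient VALUES (?, ?)", ("P1", 2))
conn.execute("INSERT INTO tumour VALUES (?, ?, ?)", ("T1", "P1", "C50.9"))
conn.executemany(
    "INSERT INTO treatment VALUES (?, ?, ?, ?)",
    [
        ("R1", "T1", "2008-03-01", "B27.4"),
        ("R2", "T1", "2008-04-15", "B28.2"),
    ],
)

# Link each treatment back to its tumour and patient via the repeated IDs.
rows = conn.execute("""
SELECT p.PatientNumberMerge, t.TumourNumberMerge, t.site4,
       r.TreatmentNumber, r."Date"
FROM patient AS p
JOIN tumour AS t ON t.PatientNumberMerge = p.PatientNumberMerge
JOIN treatment AS r ON r.TumourNumberMerge = t.TumourNumberMerge
ORDER BY r."Date"
""").fetchall()

for row in rows:
    print(row)
```

Since dates are stored in ccyy-mm-dd format, ordering them as text also orders them chronologically, which is why the query can sort on the Date column directly.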