2021 USZ Oropharynx
This folder contains the detailed patterns of lymphatic progression of 287 patients with squamous cell carcinomas (SCCs) in the oropharynx, treated at the University Hospital Zurich (USZ) between 2013 and 2019.
You can find here
- the data itself as data.csv
- a citation file CITATION.cff that can be used to cite this dataset. It is also possible to cite the Data in Brief paper or the Zenodo identifier.
- a Jupyter notebook figures.ipynb for rendering figures that visualize different aspects of the data
- the folder figures containing the already rendered figures, which we also used in our publication in Radiotherapy & Oncology [1].
Cohort Characteristics
Below we show some figures that aim to coarsely characterize the patient cohort in this directory.
Figure 1: Distribution over age, stratified by sex and smoking status.
Figure 2: Distribution over age, stratified by sex and smoking status.
Figure 3: Distribution over primary tumor subsite.
Curation
The inclusion criteria and our definition of lymphatic involvement are detailed in our paper published in the journal Radiotherapy & Oncology (a preprint is also available on medRxiv). The data in this repository is also described, in somewhat more detail, in its own publication in Data in Brief, which enables anyone to reuse and cite our dataset.
Online Interface
We provide a graphical user interface for viewing the dataset, which is available at https://2021-oropharynx.lyprox.org/. The GUI has two main functions: the patient list and the dashboard. The patient list shows the characteristics of each patient, corresponding to one row of the CSV file, in a visual form. The dashboard allows filtering the dataset. For example, the user may select all patients whose primary tumor extends over the mid-sagittal plane and who have involvement of ipsilateral level III. The dashboard then displays the number or percentage of patients with metastases in each of the other lymph node levels (LNLs).
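The dashboard query from the example above can also be mirrored offline. The following is only a minimal sketch, not the dashboard's actual implementation. It assumes that the rows have already been loaded into dictionaries keyed by the three header levels explained in the Description section, and that a single modality (here CT, chosen arbitrarily) decides involvement. The records are made up for illustration, and involvement_counts is a hypothetical helper name.

```python
# Made-up records; keys follow the (top level, second level, third level) header layout.
# Involvement is True, False, or None when the modality gives no diagnosis.
records = [
    {("tumor", "1", "extension"): True,
     ("CT", "ipsi", "II"): True, ("CT", "ipsi", "III"): True, ("CT", "ipsi", "IV"): False},
    {("tumor", "1", "extension"): True,
     ("CT", "ipsi", "II"): False, ("CT", "ipsi", "III"): True, ("CT", "ipsi", "IV"): None},
    {("tumor", "1", "extension"): False,
     ("CT", "ipsi", "II"): True, ("CT", "ipsi", "III"): False, ("CT", "ipsi", "IV"): False},
]


def involvement_counts(rows, modality, side, lnls):
    """Return {LNL: (number involved, number with a diagnosis)} for the given rows."""
    counts = {}
    for lnl in lnls:
        values = [row.get((modality, side, lnl)) for row in rows]
        # Rows without a diagnosis (None) are excluded from the denominator.
        known = [value for value in values if value is not None]
        counts[lnl] = (sum(known), len(known))
    return counts


# Select patients with midline extension and involvement of ipsilateral level III.
selected = [
    row for row in records
    if row[("tumor", "1", "extension")] and row[("CT", "ipsi", "III")]
]

for lnl, (involved, diagnosed) in involvement_counts(selected, "CT", "ipsi", ["II", "IV"]).items():
    print(f"ipsi {lnl}: {involved} of {diagnosed} diagnosed patients involved")
```

Excluding undiagnosed levels from the denominator is a design choice of this sketch; the dashboard may handle missing diagnoses and the combination of several modalities differently.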
Description
The data is provided as a CSV table containing one row for each of the 287 patients. The table has a header with three levels that describe the columns. Below we explain each column in the form of a list with three levels. For example, list entry 1.i.h refers to the column with the three-level header patient | # | hpv_status, under which the patients' HPV status is listed.
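A three-level header means that the first three rows of the file together name each column. The sketch below shows one way to read such a file with the Python standard library. It assumes that every header cell is repeated for each column it covers, and it uses a made-up two-patient excerpt rather than the real data.csv; read_three_level_csv is a hypothetical helper name.

```python
import csv
import io

# Made-up excerpt that mimics the three header rows described above.
# The values are illustrative only and are not taken from data.csv.
SAMPLE = """\
patient,patient,patient,tumor,tumor
#,#,#,1,1
id,sex,hpv_status,subsite,t_stage
1,male,true,C09.9,2
2,female,false,C01.9,3
"""


def read_three_level_csv(handle):
    """Return one dict per patient, keyed by (level 1, level 2, level 3) header tuples."""
    reader = csv.reader(handle)
    # The first three rows form the header; zip them column-wise into tuples.
    levels = [next(reader) for _ in range(3)]
    columns = list(zip(*levels))
    return [dict(zip(columns, row)) for row in reader]


patients = read_three_level_csv(io.StringIO(SAMPLE))
print(patients[0][("patient", "#", "hpv_status")])  # prints "true"
```

For the real file, the same function could be called on `open("data.csv", newline="")`. With pandas, `pandas.read_csv("data.csv", header=[0, 1, 2])` builds an equivalent three-level column index.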
Columns
1. patient: General information about the patient’s condition can be found under this top-level header.
    i. #: The second level under patient has no meaning and exists solely as a filler.
        a. id: Enumeration of the patients
        b. institution: The clinic at which the patient was treated and recorded. This holds the value "University Hospital Zurich" for all patients in this dataset.
        c. sex: Sex of the patient
        d. age: Patient’s age at diagnosis
        e. diagnose_date: Date of diagnosis (format YYYY-mm-dd), defined as the date of first histological confirmation of HNSCC.
        f. alcohol_abuse: true for patients who stated that they consume alcohol regularly, false otherwise
        g. nicotine_abuse: true for patients who have been regular smokers (> 10 pack years)
        h. hpv_status: true for patients with human papilloma virus associated tumors (as defined by p16 immunohistochemistry)
        i. neck_dissection: Indicates whether the patient has received a neck dissection as part of the treatment.
        j. tnm_edition: The edition of the TNM classification used to classify the patient [2]
        k. n_stage: Degree of spread to regional lymph nodes
        l. m_stage: Presence of distant metastases
2. tumor: Information about tumors is stored under this top-level header.
    i. <number>: The second level enumerates the synchronous tumors. In our database, no patient has had a second tumor, but this structure of the file allows us to include such patients in the future. The third-level headers are the same for each tumor.
        a. location: Anatomic location of the tumor
        b. subsite: ICD-O-3 code associated with a tumor at the particular location according to the World Health Organization [3], [4]
        c. side: Lateralization of the tumor. Can be “left” or “right” for tumors that have their center of mass clearly on the respective side of the mid-sagittal line and “central” for patients with a tumor on the mid-sagittal line.
        d. central: Whether the tumor is centralized or not.
        e. extension: true if part of the tumor extends over the mid-sagittal line
        f. volume: Volume of the tumor in cm³
        g. stage_prefix: Prefix modifier of the T-category. Can be “c” or “p”
        h. t_stage: T-category of the tumor, according to TNM staging
3. <diagnostic modality>: Each recorded diagnostic modality is indicated by its own top-level header. In this file FNA, CT, MRI, PET, pathology and pCT (planning CT) are provided.
    i. info:
        a. date: Day on which a diagnosis with the respective modality was performed
    ii. ipsi: All findings of involved lymph nodes on the ipsilateral side of the patient’s neck
        a. <LNL>: One column is provided for each recorded lymph node level. For each level, true indicates at least one finding diagnosed as malignant lymph node in the respective LNL, false means no malignant lymph node has been found, and an empty field indicates that no diagnosis is available for this LNL according to the respective diagnostic modality. <LNL> can be: I, Ia, Ib, II, IIa, IIb, III, IV, V, VI, VII, VIII, IX, X.
    iii. contra: Same as 3.ii but for the contralateral side of the patient’s neck
        a. <LNL>: same as under 3.ii.a
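Each LNL cell can take three states (involved, not involved, no diagnosis), which a plain boolean cannot represent. The sketch below shows one way to parse such cells. It assumes the cells hold the literal text true or false, possibly with different capitalization, and are empty otherwise; parse_involvement is a hypothetical helper name and the example cells are made up.

```python
from typing import Optional


def parse_involvement(cell: str) -> Optional[bool]:
    """Map a raw CSV cell to True (involved), False (not involved) or None (no diagnosis)."""
    text = cell.strip().lower()
    if text == "":
        return None
    if text == "true":
        return True
    if text == "false":
        return False
    # Anything else is unexpected and should not be silently coerced.
    raise ValueError(f"unexpected involvement value: {cell!r}")


# Made-up cells for ipsilateral levels I to IV of one patient and one modality.
raw = {"I": "false", "II": "True", "III": "", "IV": "false"}
parsed = {lnl: parse_involvement(cell) for lnl, cell in raw.items()}
print(parsed)
```

Keeping None separate from False matters when computing prevalences, because a level that was never assessed should not count as healthy.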
References
[1] R. Ludwig, B. Pouymayou, J.-M. Hoffmann et al., "Detailed patient-individual reporting of lymph node involvement in oropharyngeal squamous cell carcinoma with an online interface." Radiotherapy & Oncology, 2022, DOI: 10.1016/j.radonc.2022.01.035
[2] J. D. Brierley, M. K. Gospodarowicz, and C. Wittekind, "TNM Classification of Malignant Tumours." John Wiley & Sons, 2017.
[3] World Health Organization, Ed., "International statistical classification of diseases and related health problems, 10th revision, 2nd edition." Geneva: World Health Organization, 2004.
[4] A. G. Fritz, Ed., "International classification of diseases for oncology: ICD-O, 3rd ed." Geneva: World Health Organization, 2000.