Harmonisation Template for Cohort B

Author

My Name

Published

March 10, 2025

Preface

This book documents the data harmonisation step and is generated using Quarto. To learn more about Quarto books, visit https://quarto.org/docs/books.

File Structure

Here is the file structure of the project used to generate the document.

harmonisation/                            # Root of the project template.
|
├── .quarto/ (not in repository)          # Folder to keep intermediate files/folders 
|                                         # generated when Quarto renders the files.
|
├── archive/                              # Folder to keep previous books and harmonised data.
|   |
│   ├── reports/                          # Folder to keep previous versions of
|   |   |                                 # data harmonisation documentation.
|   |   |
|   |   ├── {some_date}_batch/            # Folder to keep {some_date} version of
|   |   |                                 # data harmonisation documentation.
|   |   |
|   |   └── Flowchart.xlsx                # Flowchart sheet to record version control.
|   |
|   └── harmonised/                       # Folder to keep previous version of harmonised data.
|       |
|       ├── {some_date}_batch/            # Folder to keep {some_date} version of
|       |                                 # harmonised data.
|       |
|       └── Flowchart.xlsx                # Flowchart sheet to record version control.
|
├── codes/                                # Folder to keep R/Quarto scripts 
|   |                                     # to run data harmonisation.
|   |
│   ├── {cohort name}/                    # Folder to keep Quarto scripts that run
|   |   |                                 # data cleaning and harmonisation,
|   |   |                                 # and output the results for each cohort.
|   |   |
|   |   └── preprocessed_data/            # Folder to keep preprocessed data.
|   |
│   ├── harmonisation_summary/            # Folder to keep Quarto scripts to create
|   |                                     # data harmonisation summary report.
|   |
│   ├── output/                           # Folder to keep harmonised data.
|   |                                     
|   ├── cohort_harmonisation_script.R     # R script to render each {cohort name}/
|   |                                     # folder into HTML, PDF and Word documents.
|   |
|   └── harmonisation_summary_script.R    # R script to render the harmonisation_summary/
|                                         # folder into a Word document.
│  
├── data-raw/                             # Folder to keep cohort raw data (.csv, .xlsx, etc.)
|   |
│   ├── {cohort name}/                    # Folder to keep cohort raw data.
|   |   |
|   |   ├── {data_dictionary}             # Data dictionary file that corresponds to the
|   |   |                                 # cohort raw data. It can be provided by the
|   |   |                                 # collaborator or by us.
|   |   |
|   |   └── Flowchart.xlsx                # Flowchart sheet to record version control.
|   |
|   ├── data-dictionary/                  # Folder to keep data dictionary 
|   |   |                                 # used for harmonising data.
|   |   |
|   |   └── Flowchart.xlsx                # Flowchart sheet to record version control.
|   |
|   └── data-input/                       # Folder to keep data input file 
|       |                                 # for collaborators to fill in.
|       |
|       └── Flowchart.xlsx                # Flowchart sheet to record version control.
|  
├── docs/                                 # Folder to keep R functions documentation 
|                                         # generated using pkgdown:::build_site_external().
|  
├── inst/                                 # Folder to keep arbitrary additional files 
|   |                                     # to include in the project.
|   |  
|   └── WORDLIST                          # File generated by spelling::update_wordlist()
|  
├── man/                                  # Folder to keep R functions documentation
|   |                                     # generated using devtools::document().
|   |
│   ├── {fun-demo}.Rd                     # Documentation of the demo R function.
|   |
│   └── harmonisation-template.Rd         # High-level documentation.
|  
├── R/                                    # Folder to keep R functions.
|   |
│   ├── {fun-demo}.R                      # Script with R functions.
|   |
│   └── harmonisation-package.R           # Dummy R file for high-level documentation.
│  
├── renv/ (not in repository)             # Folder to keep all packages 
|                                         # installed in the renv environment.
| 
├── reports/                              # Folder to keep the most recent data harmonisation
|                                         # documentation.
|
├── templates/                            # Folder to keep template files needed to generate
|   |                                     # data harmonisation documentation efficiently.
|   |
|   ├── quarto-yaml/                      # Folder to keep template files to generate 
|   |   |                                 # data harmonisation documentation structure 
|   |   |                                 # in Quarto. 
|   |   |
│   |   ├── _quarto_{cohort name}.yml     # Quarto book template for the data harmonisation
|   |   |                                 # documentation of {cohort name}.
|   |   |
|   |   └── _quarto_summary.yml           # Quarto book template for the data harmonisation summary.
|   |
|   └── index-qmd/                        # Folder to keep template files to generate
|       |                                 # the preface page of the data harmonisation 
|       |                                 # documentation.
|       |
|       ├── _index_report.qmd             # Preface template for each cohort data harmonisation
|       |                                 # report. 
|       |
|       └── _index_summary.qmd            # Preface template for data harmonisation 
|                                         # summary report. 
|        
├── tests/                                # Folder to keep unit test files.
|                                         # The files are used by the R package testthat.
|
├── .Rbuildignore                         # List of files/folders to be ignored while 
│                                         # checking/installing the package.
|
├── .Renviron (not in repository)         # File to set environment variables.
|
├── .Rprofile (not in repository)         # R code to be run when R starts up.
|                                         # It is run after the .Renviron file is sourced.
|
├── .Rhistory (not in repository)         # File containing R command history.
|
├── .gitignore                            # List of files/folders to be ignored while 
│                                         # using the git workflow.
|
├── .lintr                                # Configuration for linting
|                                         # R projects and packages using lintr.
|        
├── .renvignore                           # List of files/folders to be ignored when 
│                                         # renv is doing its snapshot.
|
├── DESCRIPTION[*]                        # Overall metadata of the project.
|
├── LICENSE                               # Year and copyright holder for the MIT license,
|                                         # generated via usethis::use_mit_license().
|
├── LICENSE.md                            # Full text of the MIT license, generated via
|                                         # usethis::use_mit_license().
|
├── NAMESPACE                             # List of functions exported to users or imported
|                                         # from other R packages. It is generated
|                                         # by devtools::document().
│        
├── README.md                             # GitHub README markdown file generated by Quarto.
|
├── README.qmd                            # GitHub README Quarto file used to generate README.md.
|        
├── _pkgdown.yml                          # Configuration for R package documentation
|                                         # using pkgdown:::build_site_external().
|        
├── _quarto.yml                           # Configuration for Quarto book generation.
|                                         # It is also the project configuration file.
|
├── csl_file.csl                          # Citation Style Language (CSL) file to ensure
|                                         # citations follow the Lancet journal style.
|        
├── custom-reference.docx                 # Microsoft Word template used when rendering
|                                         # data harmonisation documentation to Word.
|
├── harmonisation_template.Rproj          # RStudio project file.
|        
├── index.qmd                             # Preface page of Quarto book content.
|        
├── references.bib                        # BibTeX file for Quarto book.
|      
└── renv.lock                             # Metadata of the installed R packages, generated
                                          # using renv::snapshot().

[*] Files marked this way are created automatically, but the user needs to add some information manually.
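
As an illustration of the information a user might add manually, here is a minimal sketch of a DESCRIPTION file. Every value is a hypothetical placeholder and not the project's actual metadata: the package name, version, email address and description text are assumptions. The field names are the standard ones for an R package DESCRIPTION file, and the License line matches what usethis::use_mit_license() normally writes.

```text
Package: harmonisation
Title: Harmonisation Template for Cohort B
Version: 0.0.0.9000
Authors@R:
    person("My", "Name", email = "my.name@example.com",
           role = c("aut", "cre"))
Description: Scripts and functions to clean, harmonise and document
    cohort data. Placeholder text, to be replaced by the user.
License: MIT + file LICENSE
Encoding: UTF-8
Language: en-GB
```

The Language field is optional. It is included here on the assumption that the spelling package, which generates inst/WORDLIST, should check against British English.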