Health Data for Linguistic Minority Group Research in Canada: Proof-of-Concept Centralized Health Care Metadata Repository Development and Usability Study
Resource type
Journal Article
Authors/contributors
- Martin-Schreiber, Vincent (Author)
- Peixoto, Cayden (Author)
- Batista, Ricardo (Author)
- Belanger, Christopher (Author)
- Tanuseputro, Peter (Author)
- Hsu, Amy T (Author)
- Bjerre, Lise M (Author)
Title
Health Data for Linguistic Minority Group Research in Canada: Proof-of-Concept Centralized Health Care Metadata Repository Development and Usability Study
Abstract
Background
Language barriers between Canadian patients and health care providers are associated with poorer health outcomes, including decreased patient safety and quality of care, misdiagnosis and longer treatment initiation times, and increased mortality. However, research exploring language as a social determinant of health is limited, as Canadian health data are scattered across many jurisdictions, each with its own policies and procedures. This fragmentation makes it difficult for researchers to identify, locate, and use existing data. This paper presents the results of a pilot study that attempts to address this gap by creating a metadata repository (MDR) to act as a central source of information about what data are available at which data holdings across Canada.
Objective
This project aimed to (1) create a proof-of-concept MDR for Canadian health data at the variable level; (2) identify and label language-related variables existing within the MDR data; and (3) develop an interactive, public-facing web application to let users browse and search the MDR.
Methods
Metadata were collected from 5 Canadian health data sources, comprising 4 provincial data holdings and 1 national survey, and pooled to create the repository. We then labeled language-related variables within the pooled metadata from the bottom up: a search string algorithm was first run across all variable labels, names, and definitions, and the resulting variables were then screened by consensus against a derived, standardized definition of language or linguistic variables. Finally, using the Shiny web framework in R, we developed an openly accessible web application that lets users search the proof-of-concept MDR.
Results
A total of 850,343 variables were collected and included in the repository, with most coming from Ontario (n=712,037, 83.7%) and Manitoba (n=97,051, 11.4%) provincial data holdings. Among all variables in the repository, 213,696 (25.1%) were confirmed to be language related.
Conclusions
Developing a national MDR would be a transformative opportunity for Canadian researchers to leverage the full scope of Canadian health administrative data. Although a top-down approach with consistent engagement of and collaboration between provincial data holdings and federal data agencies is ideal to develop a national MDR, this study demonstrates the feasibility of a bottom-up approach in contributing to this overarching goal.
Publication
JMIR Infodemiology
Date
2026-2-9
Volume
6
Issue
1
Pages
e77242
Journal Abbr
JMIR Infodemiology
DOI
10.2196/77242
Accessed
3/9/26, 2:01 PM
ISSN
2564-1891
Short Title
Health Data for Linguistic Minority Group Research in Canada
Language
en
Library Catalogue
DOI.org (Crossref)
Citation
Martin-Schreiber, V., Peixoto, C., Batista, R., Belanger, C., Tanuseputro, P., Hsu, A. T., & Bjerre, L. M. (2026). Health Data for Linguistic Minority Group Research in Canada: Proof-of-Concept Centralized Health Care Metadata Repository Development and Usability Study. JMIR Infodemiology, 6(1), e77242. https://doi.org/10.2196/77242
Minority language group(s)
Country
Research type
- Quantitative