Data availability and access

Data reuse

The International Human Epigenome Consortium (IHEC) aims to coordinate rapid distribution of the data generated by IHEC to the entire research community, with minimal restrictions, to accelerate the translation of new knowledge about health and disease. IHEC members are committed to the principles of rapid data release to the scientific community and aim to deliver large datasets on human epigenomes in health and disease, thereby establishing a unique data resource that will be freely accessible.

Where can I get the data?

Our data is available from the European Genome-phenome Archive (EGA), from our own web portal, and through data mining and browsing tools. Two categories of McGill EMC data are available:

  • Controlled access data: Raw sequence data (FASTQ files) is archived with the EGA and is available upon successful application to the Data Access Committee (DAC). Our EGA Study page lists all publicly released datasets that can be requested in this way: https://www.ebi.ac.uk/ega/studies/EGAS00001000995.
  • Open access data: All other data is available through regular public releases. This includes gene expression, methylation and chromatin state data. Visualization and download tools for this type of data are available on the McGill EMC portal at: http://epigenomesportal.ca/edcc.

Acknowledgment and citation of McGill EMC Data

Manuscripts

We ask that all use of data, whether obtained through controlled or open access, be acknowledged as follows in the methods section of manuscripts, if possible, or elsewhere in the main text of the manuscript:

This research used data shared by the McGill Epigenomics Mapping Centre and it is available from the European Genome-phenome Archive of the European Bioinformatics Institute (accession numbers: study EGAS00001000995 and dataset(s) EGAD00...).
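For groups that assemble methods text programmatically, the acknowledgment above can be filled in with a small helper. The following is only a minimal sketch in Python: the function name `build_acknowledgment` is hypothetical, and the dataset identifiers in the usage example are placeholders, not real accession numbers. Replace them with the EGAD accessions of the datasets you actually used.

```python
# Study accession as given in the acknowledgment text above.
STUDY_ACCESSION = "EGAS00001000995"


def build_acknowledgment(dataset_accessions):
    """Return the acknowledgment sentence for the given EGA dataset accessions.

    `dataset_accessions` is a list of strings supplied by the caller; this
    helper (a hypothetical name) does not check them against the EGA.
    """
    if not dataset_accessions:
        raise ValueError("at least one dataset accession is required")
    label = "dataset" if len(dataset_accessions) == 1 else "datasets"
    joined = ", ".join(dataset_accessions)
    return (
        "This research used data shared by the McGill Epigenomics Mapping "
        "Centre and it is available from the European Genome-phenome Archive "
        "of the European Bioinformatics Institute "
        f"(accession numbers: study {STUDY_ACCESSION} and {label} {joined})."
    )


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    print(build_acknowledgment(["EGAD_PLACEHOLDER_1", "EGAD_PLACEHOLDER_2"]))
```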

If you used the IHEC Data Portal to identify, download, visualize or analyze data, please cite:

Bujold et al. "The International Human Epigenome Consortium Data Portal." Cell Systems 3.5 (2016): 496-499.

Please also cite the source of the data in any manuscript based on its analysis, as follows:

McGill Epigenomics Mapping Centre (2015). Dataset from EGA Study EGAS00001000995 [Data file]. Available from http://epigenomesportal.ca/edcc.

Posters and Presentations

Please acknowledge the use of McGill EMC data in any poster or oral presentation.

Accessing the McGill EMC raw sequence data (controlled access data)

To protect the interests of research participants, some IHEC datasets are only available to researchers after an application for access has been approved by a ‘Data Access Committee’. This data is called ‘controlled access’ data, and approved researchers must agree to specific conditions for using it, such as keeping it secure and using it only for approved purposes. Controlled access IHEC datasets are archived at the EGA (the European Genome-phenome Archive) or, for data generated by US producers, at dbGaP (the Database of Genotypes and Phenotypes).

The McGill Epigenomics Mapping Centre Data Access Committee will review requests for access to controlled access IHEC data stored at the EGA by the McGill Epigenome Data Coordination Centre. To apply for access to this controlled access data, please fill out the Application for Access to McGill EMC Controlled Data Form (accessible from the top right of this page) and send it by e-mail to dac.edcc (at) mail.mcgill.ca.

You will be asked to provide the following details:

  • The principal applicant’s name, title, position, affiliation, email address, institutional website and mailing address.
  • The name, title, position, affiliation, email address and mailing address of the applicant’s institutional representative.
  • The name, title, position, affiliation, and email address of any personnel or students who need access to the data.
  • Any other information needed for unique authentication of the applicant, personnel and students.
  • A scientific abstract providing a brief (approx. 500-word) overview of the research to be carried out, including the proposed uses of the data requested.
  • A list of three publications authored or co-authored by the principal applicant which are relevant to the current project.
  • A lay summary describing the project to a general audience.
  • A letter of ethics approval if required by the applicant’s country or home institution.
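
Before sending the form, applicants may find it useful to check their draft against the list above. The following is a minimal sketch in Python, not an official validation tool: the key names and the function `missing_items` are hypothetical. It treats the personnel and additional authentication details as optional, since they apply only when relevant, and treats the ethics letter as required only when the caller says so.

```python
# Items the list above asks every applicant to provide (key names are
# hypothetical and chosen for this sketch only).
ALWAYS_REQUIRED = [
    "principal_applicant_details",
    "institutional_representative_details",
    "scientific_abstract",
    "publications",
    "lay_summary",
]


def missing_items(application, ethics_letter_required=False):
    """Return the names of required items that are absent or empty.

    `application` is a dict mapping item names to their content.
    """
    required = list(ALWAYS_REQUIRED)
    if ethics_letter_required:
        # Needed only if the applicant's country or institution requires it.
        required.append("ethics_approval_letter")
    missing = [key for key in required if not application.get(key)]
    # The form asks for a list of three relevant publications.
    publications = application.get("publications") or []
    if publications and len(publications) != 3:
        missing.append("publications (three expected)")
    return missing
```

Calling `missing_items(draft, ethics_letter_required=True)` on a draft dictionary returns an empty list when every required item is present.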

Protocols

A description of the protocols used to produce datasets is available here: