The tracks available in this set have been generated by the Centre for Epigenome Mapping Technology (CEMT) at Canada's Michael Smith Genome Sciences Centre (BCGSC) as part of the contribution of the Canadian Epigenetics, Environment and Health Research Consortium (CEEHRC) to the International Human Epigenome Consortium (IHEC).
The data tracks represent raw signal generated from aligned reads. Access to the raw data underlying the tracks is controlled via the European Bioinformatics Institute. The data will be submitted to CEMT Reference Epigenomes (Study: EGAS00001000552) as it becomes available. More information about the project is available at www.epigenomes.ca.
Sample metadata is available at: CEMT Samples.
The wet lab protocols used are described in protocols.
Please note that the following applies only to CEMT samples; for REMC samples, see the section "Methylation data cross-assay standardization and uniform processing for consolidated epigenomes" of Roadmap Epigenomics Consortium - Integrative analysis of 111 reference human epigenomes.
The WGBS assays used paired-end sequencing. Three lanes of sequence data were used to create each library. The adaptors were trimmed off the raw reads, then fastq files corresponding to the two mates of each read pair were generated for each lane. The data from all three lanes were merged, and the reads were aligned to the GRCh37-lite reference using Novoalign (version 3.01.00) and converted to BAM format with SAMtools (version 0.1.13). The BAM files were annotated using in-house tools (including flagging of chastity-failed reads) and the duplicates were marked using Picard Tools' MarkDuplicates.jar (version 1.71). Novoalign and other in-house tools were used to generate fractional calls and coverage analysis. Compressed wig tracks were generated from the BAM files through in-house tools using the SAMtools flags "-F 1028 -q 5", and GRCh37-lite chromosome names were changed to UCSC chromosome names. The wig files were converted to bigWig files using UCSC tools.
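To make the read filter concrete, the following is a minimal Python sketch of what the SAMtools options "-F 1028 -q 5" select, together with an illustrative chromosome renaming step. It is not the in-house tooling, which is not described here. The flag values come from the SAM specification (1028 = 4 for unmapped + 1024 for duplicate); the function names and the renaming rule, which covers only the primary chromosomes and the mitochondrial genome, are assumptions made for illustration.

```python
# SAM flag bits, as defined in the SAM specification.
FLAG_UNMAPPED = 0x4      # 4: the read is unmapped
FLAG_DUPLICATE = 0x400   # 1024: PCR or optical duplicate

EXCLUDE_FLAGS = FLAG_UNMAPPED | FLAG_DUPLICATE  # 1028, as in "-F 1028"
MIN_MAPQ = 5                                     # as in "-q 5"

# Primary chromosome names without the "chr" prefix.
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y"}


def passes_filter(flag: int, mapq: int) -> bool:
    """Return True if a read would be kept by "-F 1028 -q 5".

    -F drops reads with any of the given flag bits set;
    -q drops reads whose mapping quality is below the threshold.
    """
    return (flag & EXCLUDE_FLAGS) == 0 and mapq >= MIN_MAPQ


def to_ucsc_name(name: str) -> str:
    """Hypothetical renaming rule (an assumption, not the pipeline's code).

    Maps "1".."22", "X", "Y" to "chr1".."chrY" and "MT" to "chrM".
    Any other sequence name is returned unchanged.
    """
    if name == "MT":
        return "chrM"
    if name in PRIMARY:
        return "chr" + name
    return name
```

For example, a properly paired read with flag 99 and mapping quality 60 is kept, while the same read marked as a duplicate (flag 1123) is dropped.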
The display comprises coverage on the negative y-axis and the fractional calls (scaled between 0 and 10) on the positive y-axis, with the following color scheme:
Where possible, information about the tracks and data is included in the title. However, due to constraints on the length of these fields, this is sometimes not possible. The library information for each track is always included as the first field in the colon-delimited title string. Please refer to the metadata table included on this page to look up further details by library.
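The lookup by library can be sketched in Python as follows. The title string, the library identifier, and the metadata dictionary are all hypothetical placeholders; only the rule that the library is the first colon-delimited field comes from the description above.

```python
def library_from_title(title: str) -> str:
    """Return the first colon-delimited field of a track title."""
    return title.split(":", 1)[0].strip()


# Hypothetical stand-in for the metadata table on this page.
metadata_by_library = {
    "LIB0001": {"assay": "WGBS", "sample": "example sample"},
}

# Hypothetical track title; real titles may carry different fields.
title = "LIB0001:WGBS:example"
library = library_from_title(title)
details = metadata_by_library.get(library)  # None if the library is not listed
```

Splitting only on the first colon keeps the sketch robust to titles whose later fields themselves contain colons.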
When analyzing data from different sources, please note that underlying data processing and handling procedures may be different.
Please direct any questions to: firstname.lastname@example.org