# Dense mark activity matrices (3.5M+ DHSs x 833 biosamples) in hdf5 format.
Loadable with h5py in Python or rhdf5 in R.

# Manifest:
## Rows and columns:
masterlist_DHSs_733samples_WM20180608_all_coords_hg19_epimap_annotated.tsv
masterlist_DHSs_733samples_WM20180608_all_chunkIDs2indexIDs.txt
masterlist_DHSs_733samples_WM20180608_all_coords_hg19_r25_e100_names.core.srt.tsv
masterlist_DHSs_733samples_WM20180608_all_coords_hg19.core.srt.txt
mark_matrix_names.txt
mnemonic_mapping.tsv

## HDF5 Files:
H3K4me2_all_bin_dense_on_mixed_impobs_r25_e100_allchr_merged.hdf5
H3K4me3_all_bin_dense_on_mixed_impobs_r25_e100_allchr_merged.hdf5
H3K9ac_all_bin_dense_on_mixed_impobs_r25_e100_allchr_merged.hdf5
H3K4me1_all_bin_dense_on_mixed_impobs_r25_e100_allchr_merged.hdf5
H3K27ac_all_bin_dense_on_mixed_impobs_r25_e100_allchr_merged.hdf5
DNase-seq_all_bin_dense_on_mixed_impobs_r25_e100_allchr_merged.hdf5
Enhancer_H3K27ac_intersect_matrix.hdf5

# Enhancer/Promoter matrices and indices:
Enhancer_H3K27ac_intersect_matrix.mtx.gz
Promoter_H3K27ac_intersect_matrix.mtx.gz
Promoter_H3K27ac_intersect_matrix.names.tsv
Enhancer_H3K27ac_intersect_matrix.names.tsv
DYADIC_masterlist_indices_0indexed.tsv
PROM_masterlist_indices_0indexed.tsv
ENH_masterlist_indices_0indexed.tsv
PROM_masterlist_locations.bed
ENH_masterlist_locations.bed
DYADIC_masterlist_locations.bed

# File descriptions:
*_all_bin_dense_on_mixed_impobs_r25_e100_allchr_merged.hdf5
- Dense HDF5 matrix with dimensions 3.5M+ (DHSs) by 833 (biosamples)
- Matrix of the average bedgraph signal in the 200bp region centered on the DHS midpoint

mark_matrix_names.txt
- Column names (biosamples) for the HDF5 matrices

mnemonic_mapping.tsv
- Mapping from biosample ids to mnemonics for each of the 833 samples

masterlist_DHSs_733samples_WM20180608_all_coords_hg19_epimap_annotated.tsv
- Overall table of DNase Hypersensitive Sites, locations, row (for slicing hdf5), and names
- DHS masterlist at: http://www.meuleman.org/project/dhsindex/
- Columns: chr start end row name id is.enh is.prom is.dyadic strand
- NOTE: This is a merged version of the three files below, and is preferable to use over those:

masterlist_DHSs_733samples_WM20180608_all_coords_hg19.core.srt.txt
- DNase Hypersensitive Sites (rows) names and locations for HDF5 matrices
- DHS masterlist at: http://www.meuleman.org/project/dhsindex/
- Columns: chr start end old_identifier
- NOTE: Not sorted according to the HDF5 matrix order

masterlist_DHSs_733samples_WM20180608_all_coords_hg19_r25_e100_names.core.srt.tsv
- DNase Hypersensitive Sites (rows) names and locations for HDF5 matrices
- DHS masterlist at: http://www.meuleman.org/project/dhsindex/
- Columns: chr row old_identifier
- NOTE: Sorted according to the HDF5 matrix order

masterlist_DHSs_733samples_WM20180608_all_chunkIDs2indexIDs.txt
- Mapping between chunk names and DHS IDs as in DHS masterlist (at http://www.meuleman.org/project/dhsindex/)
- Columns: DHSid old_identifier

Promoter_H3K27ac_intersect_matrix.hdf5
- HDF5 matrix with dimensions 3.5M+ (DHSs) by 833 (biosamples)
- Sparse matrix of intersection of promoters (ChromHMM) in DHSs with high H3K27ac signal in the 200bp region centered on the DHS midpoint
- Promoter states used in the 18-state model: 1,2,3,4,14

Promoter_H3K27ac_intersect_matrix.mtx.gz
- Above hdf5 matrix in MatrixMarket format (https://math.nist.gov/MatrixMarket/formats.html)

Promoter_H3K27ac_intersect_matrix.names.tsv
- Names of epigenomes (columns) in above matrix

Enhancer_H3K27ac_intersect_matrix.hdf5
- HDF5 matrix with dimensions 3.5M+ (DHSs) by 833 (biosamples)
- Sparse matrix of intersection of enhancers (ChromHMM) in DHSs with high H3K27ac signal in the 200bp region centered on the DHS midpoint
- Enhancer states used in the 18-state model: 7,8,9,10,11,15

Enhancer_H3K27ac_intersect_matrix.mtx.gz
- Above hdf5 matrix in MatrixMarket format (https://math.nist.gov/MatrixMarket/formats.html)

Enhancer_H3K27ac_intersect_matrix.names.tsv
- Names of epigenomes (columns) in above matrix

PROM_masterlist_indices_0indexed.tsv
ENH_masterlist_indices_0indexed.tsv
DYADIC_masterlist_indices_0indexed.tsv
- DHS indexes corresponding to promoter, enhancer, and dyadic elements
- 0-indexed, add 1 to indices for use in R and Matlab, keep as is for Python and similar

PROM_masterlist_locations.bed
ENH_masterlist_locations.bed
DYADIC_masterlist_locations.bed
- All DHS locations for promoter, enhancer, and dyadic elements
- Slices of masterlist_DHSs_733samples_WM20180608_all_coords_hg19_r25_e100_names.core.srt.tsv with locations added
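
The index conventions above are easy to mix up: the *_0indexed.tsv files count rows from 0, while MatrixMarket coordinate files count from 1. The following is a minimal sketch, not part of the original release, of how the two could be lined up using only the Python standard library. It assumes that each index file holds one integer in the first column per line (the actual layout is not documented here), and it runs on tiny synthetic stand-in files (toy_matrix.mtx.gz and toy_indices_0indexed.tsv are hypothetical names) rather than the real data.

```python
import gzip
import os
import tempfile


def read_index_file(path):
    """Read 0-indexed DHS row indices.

    Assumption: one integer in the first column per line.
    """
    indices = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields and fields[0].isdigit():  # skip blank or header lines
                indices.append(int(fields[0]))
    return indices


def read_matrix_market(path):
    """Parse a gzipped MatrixMarket coordinate file.

    Returns (shape, entries) where entries maps (row, col) to a value.
    MatrixMarket coordinates are 1-based; they are shifted to 0-based here
    so that they line up with the *_0indexed.tsv files.
    """
    entries = {}
    shape = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue  # banner and comment lines
            fields = line.split()
            if shape is None:
                # First data line: number of rows, columns, stored entries
                n_rows, n_cols, _n_entries = (int(x) for x in fields[:3])
                shape = (n_rows, n_cols)
                continue
            row, col = int(fields[0]) - 1, int(fields[1]) - 1
            # "pattern" matrices carry no value column; treat presence as 1
            value = float(fields[2]) if len(fields) > 2 else 1.0
            entries[(row, col)] = value
    return shape, entries


# Tiny synthetic stand-ins so the sketch runs without the real data.
with tempfile.TemporaryDirectory() as tmp:
    mtx_path = os.path.join(tmp, "toy_matrix.mtx.gz")
    idx_path = os.path.join(tmp, "toy_indices_0indexed.tsv")
    with gzip.open(mtx_path, "wt") as handle:
        handle.write("%%MatrixMarket matrix coordinate integer general\n")
        handle.write("4 3 3\n1 1 1\n3 2 1\n4 3 1\n")
    with open(idx_path, "w") as handle:
        handle.write("0\n2\n")

    shape, entries = read_matrix_market(mtx_path)
    keep = set(read_index_file(idx_path))
    # Keep only entries whose (0-based) row is in the index list
    subset = {key: val for key, val in entries.items() if key[0] in keep}
    # Shifted copy of the indices for use in R or Matlab
    one_based = [i + 1 for i in sorted(keep)]
```

For the full-size matrices a dedicated sparse-matrix reader would likely be more practical than a dictionary; the point here is only the 0-based versus 1-based bookkeeping.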