We welcome submissions of genetic association summary statistics for traits that are relevant to the Knowledge Portals. We have analyzed the privacy risks inherent in sharing summary statistics in the Knowledge Portals and found that they are extremely low (read our white paper).
Upon receipt of your summary statistics we will integrate them into the Knowledge Portal database, making them available for browsing and querying via the interfaces, tools, and APIs of the relevant Portal(s). We will share with you a preview portal for your review before releasing the results on the open-access Portal(s).
At your request, we are also able to provide the summary statistic files for public download from the Portal. If you would like us to provide these files, please let us know when submitting your dataset.
We can also accept raw, individual-level data for analysis or extended processing. For more information on submitting such data, see these detailed instructions written for the Type 2 Diabetes Knowledge Portal.
We are also interested in receiving other 'omics datasets: epigenomic modifications, transcript levels and tissue-specific expression, chromatin conformation, proteomics, and more. Please contact us about these types of data.
Summary statistic file formats
Our preferred file format includes:
- chromosome
- position (hg19; if your results are in a different genome build, we can perform LiftOver in either direction)
- (rsID may be substituted for chromosome and position)
- reference allele
- alternate allele
- effect size (or, if not available, + or - direction of effect)
- sample size for each variant
- effect allele frequency (or, if not available, minor allele frequency)
- allele count for cases, if relevant
- allele count for controls, if relevant
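As a rough illustration, the preferred fields above can be checked against the header line of a tab-delimited file. This is a minimal sketch, not a Portal tool: the column names (`chromosome`, `effect_size`, and so on), the `PREFERRED` table, and the function `unmet_requirements` are all hypothetical, since exact header names are not prescribed here. Case and control allele counts are left out because they apply only where relevant.

```python
# Hypothetical column names; the Portal does not prescribe exact headers.
# Each requirement lists alternative column sets, any one of which satisfies it
# (for example, rsID may stand in for chromosome and position).
PREFERRED = {
    "variant location": [{"chromosome", "position"}, {"rsid"}],
    "reference allele": [{"reference_allele"}],
    "alternate allele": [{"alternate_allele"}],
    "effect": [{"effect_size"}, {"direction"}],
    "sample size": [{"n"}],
    "allele frequency": [{"effect_allele_frequency"}, {"minor_allele_frequency"}],
}


def unmet_requirements(header_line, requirements=PREFERRED):
    """Return the names of requirements not satisfied by a tab-delimited header."""
    columns = {name.strip().lower() for name in header_line.split("\t")}
    return [
        label
        for label, alternatives in requirements.items()
        if not any(option <= columns for option in alternatives)
    ]


header = "rsID\treference_allele\talternate_allele\tdirection\tn"
print(unmet_requirements(header))  # ['allele frequency']
```

Listing the alternatives as sets keeps the substitutions (rsID for chromosome and position, direction for effect size, minor allele frequency for effect allele frequency) in one place, so the check stays a single subset test per requirement.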
The minimum information that we need to be able to incorporate results into the Portal is:
- effect allele
- alternate allele
- direction of effect
Please submit files in .txt format, compressed if necessary.
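One way to produce a compressed, tab-delimited text file is sketched below using Python's standard `csv` and `gzip` modules. The function names `write_compressed_txt` and `read_header`, the column names, the file name, and the placeholder rows are all assumptions made for illustration; the rsID column is included only so that each example row identifies a variant.

```python
import csv
import gzip
import os
import tempfile


def write_compressed_txt(path, columns, rows):
    """Write rows (a list of dicts) as tab-delimited text, gzip-compressed."""
    with gzip.open(path, "wt", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=columns, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)


def read_header(path):
    """Return the column names from the first line of a gzip-compressed file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.readline().rstrip("\n").split("\t")


# Hypothetical column names and made-up placeholder values, for illustration only.
columns = ["rsid", "effect_allele", "alternate_allele", "direction"]
rows = [
    {"rsid": "rs0000001", "effect_allele": "A", "alternate_allele": "G", "direction": "+"},
    {"rsid": "rs0000002", "effect_allele": "T", "alternate_allele": "C", "direction": "-"},
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "summary_stats.txt.gz")
    write_compressed_txt(path, columns, rows)
    print(read_header(path))  # ['rsid', 'effect_allele', 'alternate_allele', 'direction']
```

Reading the header back after writing is a quick sanity check that the file decompresses and that the columns are in the intended order before transfer.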
So that we can document the dataset appropriately, we would also like to have:
- a brief description of the dataset
- the total sample size, and sample size for each phenotype
- definitions of the phenotypes assayed
- ancestry of the participants
- reference to a publication describing the study, if available
- image file(s) for logo(s) of consortia involved, if you would like us to display them in the dataset documentation
- if you would like us to provide the summary statistic files for download, please also supply a README file to accompany them
Submission process
- When you are ready to submit data, please contact us. Data files may be transferred by email, Dropbox, an Aspera site that we can set up, or any other file transfer method that you use.
- Let us know whether you would like us to provide the files for download in addition to integrating the results into the Portal.
- We will review the files and get back to you with any questions.
- We will load the dataset and its documentation in a preview version of the Portal and will give you the opportunity to review them before making them available in the open-access Portal.