APIs

In keeping with our mission of democratizing access to genetic data and facilitating worldwide research, we are developing application programming interfaces (APIs) so that researchers can query summary statistics and other results programmatically.

The available APIs are described on this page, and may be accessed here. Currently, the APIs query data that were last updated in summer 2020.

For programmatic access to current data, try our Lunaris tool. You can access it here or via a link on any Region page of the Portal. Instructions are here.

Visit API User Interface


General considerations for using APIs

Data queried by the APIs

Some of the APIs query genetic association datasets stored at the AMP CMD Data Coordinating Center (DCC) at the Broad Institute, relevant to types 1 and 2 diabetes; cerebrovascular, cardiovascular, and lung disease; sleep disorders; and musculoskeletal disorders.

Other APIs query either the results of computational analyses performed at the AMP CMD DCC, or data downloaded from external sources.

API paths

Where two API paths are identical except for a final "array" or "object" segment, both return the same results; only the format of the response differs.
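As a small illustration of this naming convention, the following Python sketch builds both forms of a path from a shared base. The helper name path_pair is hypothetical and not part of the APIs; the base path in the example is taken from the Meta-analysis entry on this page.

```python
def path_pair(base_path):
    """Return the "array" and "object" forms of one API path."""
    base = base_path.rstrip("/")
    return {fmt: base + "/" + fmt for fmt in ("array", "object")}


paths = path_pair("graph/meta/variant")
print(paths["array"])
print(paths["object"])
```

Either path should return the same results, so the choice depends only on which response layout is easier to consume in your own code.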

Entering phenotypes

Many of the APIs take a phenotype code as input. See a list of all phenotypes and their codes. Codes for some common phenotypes:

  • type 2 diabetes, T2D
  • body mass index, BMI
  • fasting glucose, FG
  • fasting insulin, FI
  • triglycerides, TG
  • LDL cholesterol, LDL
  • HDL cholesterol, HDL
  • waist/hip ratio, WHR
  • two-hour insulin, 2hrI
  • hip circumference, HIPC
  • chronic kidney disease, CKD
  • coronary artery disease, CAD
  • height, HEIGHT
  • waist circumference, WAIST
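For scripts that accept phenotype names from users, it may be convenient to keep the codes above in a lookup table. The sketch below is one possible approach in Python; the function name phenotype_code is hypothetical, and the table contains only the codes listed on this page, not the full phenotype list.

```python
# Codes copied from the list of common phenotypes above.
PHENOTYPE_CODES = {
    "type 2 diabetes": "T2D",
    "body mass index": "BMI",
    "fasting glucose": "FG",
    "fasting insulin": "FI",
    "triglycerides": "TG",
    "ldl cholesterol": "LDL",
    "hdl cholesterol": "HDL",
    "waist/hip ratio": "WHR",
    "two-hour insulin": "2hrI",
    "hip circumference": "HIPC",
    "chronic kidney disease": "CKD",
    "coronary artery disease": "CAD",
    "height": "HEIGHT",
    "waist circumference": "WAIST",
}


def phenotype_code(name):
    """Look up a phenotype code by name, ignoring case and outer spaces."""
    key = name.strip().lower()
    if key not in PHENOTYPE_CODES:
        raise ValueError("no code known for phenotype: " + repr(name))
    return PHENOTYPE_CODES[key]


print(phenotype_code("Type 2 diabetes"))
```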

Citing the APIs

If you cite results generated using these data and APIs in a scientific publication, please use the following format:

Portal name. Year/Month/Date of access; Portal URL.

If your analysis uses specific datasets, please also cite the original publications for those datasets.


Currently available APIs


APIs that access genetic association results

 

Variants by chromosome region and phenotype

API: /getAggregatedDataSimple
Queries: genetic association results stored at the AMP T2D DCC
Use: specify a chromosomal region (up to 3 Mb) and a phenotype code; retrieve the most significantly associated variants across the region, along with the name of the dataset in which the association was seen, standard error, effect size, predictions of variant impact, and more
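Because the region is limited to 3 Mb, it can be useful to check a request locally before sending it. The Python sketch below does this under stated assumptions: it treats 3 Mb as 3,000,000 base pairs measured as end minus start, and the parameter names chrom, start, end, and phenotype are hypothetical placeholders rather than the API's documented parameter names.

```python
MAX_REGION_BP = 3_000_000  # assumption: "up to 3 Mb" means end - start <= 3,000,000


def region_query(chrom, start, end, phenotype):
    """Validate a region and return it as a dictionary of query values.

    The dictionary keys are illustrative only; check the API user
    interface for the real parameter names.
    """
    if start < 0 or end <= start:
        raise ValueError("start must be >= 0 and smaller than end")
    if end - start > MAX_REGION_BP:
        raise ValueError("region is larger than 3 Mb")
    return {
        "chrom": str(chrom),
        "start": start,
        "end": end,
        "phenotype": phenotype,
    }


print(region_query(8, 118_000_000, 119_000_000, "T2D"))
```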

 

Meta-analysis

API: graph/meta/variant/array or graph/meta/variant/object
Queries: bottom-line meta-analysis results generated by application of the METAL software to the datasets in the DCC upon each data release (bimonthly), as documented here. METAL software, developed at the University of Michigan, performs meta-analysis while accounting for sample overlap between datasets.
Use: specify a variant, in the format chromosome_coordinate_referenceAllele_alternateAllele (e.g., 8_118184783_C_T); retrieve "bottom-line" p-values and effect sizes for its association with each phenotype (see a list of all phenotypes and their codes)
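The variant format described above can be checked and split into its four parts before a query is made. The following Python sketch is one way to do that; the names Variant and parse_variant are hypothetical, and the pattern assumes that alleles are written with the letters A, C, G, and T only, which this page does not state.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    chromosome: str
    position: int
    ref: str
    alt: str


# Assumption: alleles use only the letters A, C, G, T.
_VARIANT_RE = re.compile(r"([^_]+)_(\d+)_([ACGT]+)_([ACGT]+)")


def parse_variant(variant_id):
    """Split chromosome_coordinate_referenceAllele_alternateAllele."""
    match = _VARIANT_RE.fullmatch(variant_id.strip())
    if match is None:
        raise ValueError("not in chromosome_coordinate_ref_alt format: "
                         + repr(variant_id))
    chrom, pos, ref, alt = match.groups()
    return Variant(chrom, int(pos), ref, alt)


print(parse_variant("8_118184783_C_T"))
```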

 

Variant prioritization

API: graph/prioritization/variant/array or graph/prioritization/variant/object
Queries: bottom-line meta-analysis results generated by application of the METAL software, accounting for sample overlap, to the datasets in the DCC, as documented here
Use: specify a phenotype code and a genomic region (chromosome number, start coordinate, end coordinate); retrieve bottom-line p-values, standard error, effect size, and several predictions of variant impact for the variants associated with the phenotype in that region

 

Custom association analysis

API: burden/v1
Queries: one of 6 burden datasets
Use: Conduct an aggregate association analysis, using a dataset of choice, between a specified set of variants and a specified phenotype. The phenotype is regressed on the combined variant score.

Description of parameters:

  • covariates: Covariates to include in the regression
  • dataset_id: The dataset in which to conduct the analysis (see below)
  • variants: The set of variants to include in the analysis
  • phenotype: The phenotype to test
  • operation: The procedure to apply across genotypes (for each sample) to aggregate variants into a variant score. The default setting "operation":"sum" applies the additive burden test method; "operation":"max" applies the collapsing burden test method.
  • ci_level / calc_ci: The confidence interval to calculate (and whether to calculate it)
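To make the difference between the two "operation" settings concrete, the Python sketch below aggregates a small, made-up genotype matrix both ways. It only illustrates the idea of summing versus taking the maximum across a sample's genotypes; it is not the server's implementation, and the genotype coding (alternate allele counts of 0, 1, or 2) is an assumption.

```python
def variant_scores(genotypes, operation="sum"):
    """Aggregate each sample's genotypes into one variant score.

    genotypes: one row per sample, one alternate-allele count per variant
    (illustrative coding; each row must contain at least one value).
    """
    operations = {"sum": sum, "max": max}
    if operation not in operations:
        raise ValueError('operation must be "sum" or "max"')
    aggregate = operations[operation]
    return [aggregate(row) for row in genotypes]


# Made-up data: three samples, three variants.
genotypes = [
    [0, 1, 1],
    [2, 0, 0],
    [0, 0, 0],
]
print(variant_scores(genotypes, "sum"))  # [2, 2, 0]
print(variant_scores(genotypes, "max"))  # [1, 2, 0]
```

With "sum", every additional alternate allele raises a sample's score; with "max", a sample's score reflects only its single highest genotype value among the variants.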

Datasets:

All datasets are described in detail on the Datasets page.

  • enter 'samples_camp_mdv39' to specify the CAMP GWAS dataset
  • specifying 'camp_grs_dv1' uses the CAMP GWAS dataset but limits the variants analyzed to those in a set of 243 variants associated with T2D risk in subjects of European ancestry, as defined by Mahajan et al. (2018)
  • enter 'samples_metsim_mdv39' to specify the METSIM GWAS dataset
  • specifying 'metsim_grs_dv1' uses the METSIM GWAS dataset but limits the variants analyzed to those in a set of 243 variants associated with T2D risk in subjects of European ancestry, as defined by Mahajan et al. (2018)
  • enter 'samples_biome_mdv39' to specify the BioMe AMP T2D GWAS dataset
  • specifying 'biome_grs_dv1' uses the BioMe AMP T2D GWAS dataset but limits the variants analyzed to those in a set of 243 variants associated with T2D risk in subjects of European ancestry, as defined by Mahajan et al. (2018)
  • enter 'samples_fusion_mdv39' to specify the FUSION GWAS dataset
  • enter 'samples_singapore_mdv39' to specify the Diabetic Cohort - Singapore Prospective Study GWAS dataset
  • enter 'samples_55k_multi' to specify the AMP T2D-GENES exome sequence analysis dataset
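The parameters and dataset identifiers above could be assembled into a request body along the following lines. This Python sketch is an assumption-laden illustration: the field names and dataset identifiers come from this page, but the value types (lists of strings, a boolean for calc_ci, a fraction for ci_level) and the use of JSON are guesses, and the helper burden_request is hypothetical. Consult the API user interface for the exact request shape.

```python
import json

# Dataset identifiers as listed on this page.
DATASET_IDS = {
    "samples_camp_mdv39", "camp_grs_dv1",
    "samples_metsim_mdv39", "metsim_grs_dv1",
    "samples_biome_mdv39", "biome_grs_dv1",
    "samples_fusion_mdv39", "samples_singapore_mdv39",
    "samples_55k_multi",
}


def burden_request(dataset_id, phenotype, variants, covariates=(),
                   operation="sum", calc_ci=True, ci_level=0.95):
    """Build a request body for burden/v1 (value types are assumed)."""
    if dataset_id not in DATASET_IDS:
        raise ValueError("unknown dataset_id: " + repr(dataset_id))
    if operation not in ("sum", "max"):
        raise ValueError('operation must be "sum" or "max"')
    if not 0 < ci_level < 1:
        raise ValueError("ci_level is assumed to be a fraction between 0 and 1")
    return {
        "dataset_id": dataset_id,
        "phenotype": phenotype,
        "variants": list(variants),
        "covariates": list(covariates),
        "operation": operation,
        "calc_ci": calc_ci,
        "ci_level": ci_level,
    }


# Illustrative values only; the variant is the format example from this page.
body = burden_request("samples_camp_mdv39", "T2D", ["8_118184783_C_T"])
print(json.dumps(body, indent=2))
```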

APIs that access results of computational methods

 

DEPICT

All DEPICT APIs query results that were generated by application of the DEPICT software (Pers, TH, et al., 2015) to a subset of the datasets in the DCC in Fall 2018.

API: /testcalls/depict/genepathway/array or /testcalls/depict/genepathway/object
Use: specify a gene, a phenotype code, and a threshold p-value ("lt_value"); retrieve gene sets of which the gene is a member that are enriched for associations with that phenotype at a significance below the specified p-value

API: /testcalls/depict/region/array or /testcalls/depict/region/object
Use: specify a gene and a phenotype code; retrieve a p-value for association of the gene with the phenotype

API: /testcalls/depict/tissue/array or /testcalls/depict/tissue/object
Use: specify a phenotype code; retrieve tissues enriched for associations with it, along with the significance of the predicted associations

API: /testcalls/depict/pathway/array or /testcalls/depict/pathway/object
Use: specify a phenotype code; retrieve gene sets enriched for associations with that phenotype, along with p-values for the associations

 

eCAVIAR

API: testcalls/ecaviar/{colocalization}/array or testcalls/ecaviar/{colocalization}/object
Queries: results generated by application of the eCAVIAR software (Hormozdiari, F, et al., 2016) to the datasets in the DCC in Spring 2019
Use: specify a gene and a phenotype code, and choose whether to retrieve all colocalizations or only the most significant ("colocalization_max"); retrieve variants in the gene that represent colocalized GWAS and eQTL signals in a given tissue, and the colocalization posterior probability of the prediction
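The {colocalization} placeholder in the path can be filled in programmatically. The Python sketch below assumes that the segment is "colocalization" when all colocalizations are wanted; only "colocalization_max" is named on this page, so treat the other value as a guess. The helper name ecaviar_path is hypothetical.

```python
def ecaviar_path(max_only=False, fmt="array"):
    """Build an eCAVIAR API path for the chosen mode and response format."""
    if fmt not in ("array", "object"):
        raise ValueError('fmt must be "array" or "object"')
    # "colocalization" for all results is an assumption; see the note above.
    mode = "colocalization_max" if max_only else "colocalization"
    return "testcalls/ecaviar/" + mode + "/" + fmt


print(ecaviar_path(max_only=True))
print(ecaviar_path(fmt="object"))
```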

 

LD score

API: testcalls/ldscore/tissue/array or testcalls/ldscore/tissue/object
Queries: results generated by application of LD score regression (Finucane, H, et al., 2015) to the datasets in the DCC in Spring 2019.
Use: specify a phenotype code; retrieve a list of tissues enriched for associations with that phenotype along with effect sizes and p-values for the associations

 

MAGMA

API: /testcalls/magma/gene/array or /testcalls/magma/gene/object
Queries: results generated by application of the MAGMA software (de Leeuw, CA, et al., 2015) to the datasets in the DCC in Spring 2019.
Use: specify one or more phenotype codes; retrieve a list of gene-level association results, with a p-value for each gene. Optionally, specify gene name(s) to restrict the results to those genes.

 

GREGOR

API: graph/gregor/phenotype/array or graph/gregor/phenotype/object
Queries: results generated by application of the GREGOR software (Schmidt, EM, et al., 2015) to the datasets in the DCC upon each data release (bimonthly)
Use: specify a phenotype code; retrieve tissues enriched for trait associations, with their enrichment p-values and effect sizes per chromatin state and ancestry

 

Region 

API: graph/region/variant/array or graph/region/variant/object
Queries: chromatin state annotations reflecting the regulatory potential of specific chromosomal regions, generated by Varshney et al. (2017)
Use: specify one or more variants in the format chromosome_coordinate_referenceAllele_alternateAllele (e.g., 8_118184783_C_T); optionally, specify a tissue name (use "tissue" to retrieve all; view a list of all tissues); optionally, specify a chromatin state (use "annotation" to see all); retrieve tissue-specific chromatin state annotations that overlap the specified variant(s)


Other APIs

 

GTEx

API: testcalls/gtex/gene_weight/array or testcalls/gtex/gene_weight/object
Queries: tissue-specific gene expression data downloaded from GTEx in Spring 2019
Use: specify one or more genes; retrieve expression levels in multiple tissues

 

Knockout

API: testcalls/knockout/array or testcalls/knockout/object
Queries: mouse phenotype annotations from the Knockout Mouse Project, downloaded from the Mouse Genome Informatics (MGI) database in Spring 2019.
Use: specify a human gene; retrieve the gene ID and name of the mouse homolog, as determined using HomoloGene, along with null mutant phenotype annotations (Mammalian Phenotype Ontology ID and term) and other information about the mouse gene

 

Phenotype list

API: /graph/phenotype/list/array or /graph/phenotype/list/object
Queries: phenotypes for which genetic associations exist in the datasets stored at the DCC
Use: retrieve a list of all phenotypes that includes, for each phenotype, its phenotype group (e.g., "GLYCEMIC"), internal code, user-facing name, and whether or not it is a dichotomous phenotype

 

Tissue list

API: graph/tissue/list/array or graph/tissue/list/object
Queries: tissues or cell lines for which there are annotations in the datasets or results displayed in the Knowledge Portal Network
Use: retrieve the name of each tissue and its ID from the Cell Ontology (CL), the Experimental Factor Ontology (EFO), or the Uber-anatomy ontology (UBERON)