This function downloads gene sets from the specified Enrichr databases and, by default, saves them to the output directory as a single .tsv file containing all requested databases. Unless a filename is supplied, the file name includes a timestamp to ensure uniqueness. The file has three columns: DB, containing the database name; Geneset, containing the gene set names; and Gene, containing the gene names.

Usage

download_enrichr_databases(
  gene_set_lib,
  output_dir = here::here(),
  filename = NULL
)

Arguments

gene_set_lib

character(): A character vector of database names to download from Enrichr, for example: c("WikiPathways_2019_Human", "NCI-Nature_2016").

output_dir

character(1): A string specifying the output directory where the .tsv file will be saved. Defaults to the current project directory as defined by here::here().

filename

character(1): Name of the output file, including the file extension. Because some terms contain commas, .tsv is recommended. If NULL (the default), the file is named all_databases_timestamp.tsv, with the current timestamp in place of timestamp.
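
To illustrate why a tab-separated file suits terms that contain commas, here is a minimal base R sketch. It does not call download_enrichr_databases() or contact Enrichr; the table, its values, and the file name toy_genesets.tsv are made up for illustration and only mimic the documented three-column layout.

```r
# Toy table in the documented three-column layout (illustrative values only).
toy <- data.frame(
  DB = "Demo_DB",
  Geneset = c("Signaling, receptor-mediated", "Signaling, receptor-mediated"),
  Gene = c("GENE1", "GENE2"),
  stringsAsFactors = FALSE
)

# Write tab-separated without quoting; commas inside terms stay unambiguous
# because the field separator is a tab.
path <- file.path(tempdir(), "toy_genesets.tsv")
write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the file back; the term with a comma remains in a single column.
back <- read.delim(path, stringsAsFactors = FALSE)
```

Here back should again have the three columns DB, Geneset and Gene, with the comma-containing term intact.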

Value

A data.frame of gene set annotations with three columns:

DB

Database name (e.g. "WikiPathways_2019_Human", "NCI-Nature_2016").

Geneset

The gene set or pathway term from that database.

Gene

A gene contained in the gene set.

In addition to returning the data.frame, the function also writes the same table to disk as a .tsv file in the specified output_dir.
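
The returned table is in long format, with one row per gene and gene set. As a rough sketch of how such a table might be reshaped for downstream use, the base R code below builds a toy data.frame with the same three columns (the database, gene set, and gene names are placeholders, not real Enrichr content) and converts it into a named list of gene sets.

```r
# Toy data.frame mimicking the documented return value (placeholder names).
toy <- data.frame(
  DB = c("DB_A", "DB_A", "DB_A", "DB_B"),
  Geneset = c("Set1", "Set1", "Set2", "Set3"),
  Gene = c("GENE1", "GENE2", "GENE3", "GENE1"),
  stringsAsFactors = FALSE
)

# One character vector of genes per gene set, restricted to a single database.
db_a <- toy[toy$DB == "DB_A", ]
sets_a <- split(db_a$Gene, db_a$Geneset)

# Number of genes per gene set across all databases.
set_sizes <- aggregate(Gene ~ DB + Geneset, data = toy, FUN = length)
```

The same pattern should apply to the data.frame returned by download_enrichr_databases(), assuming it has exactly the DB, Geneset and Gene columns described above.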

Examples

if (interactive()) {
  libs <- c("WikiPathways_2019_Human")
  out <- download_enrichr_databases(
    gene_set_lib = libs,
    output_dir = tempdir(),
    filename = "enrichr_demo.tsv"
  )
  head(out)
}