BART

Binding Analysis for Regulation of Transcription


About BART

BART (Binding Analysis for Regulation of Transcription) is a bioinformatics tool for predicting functional transcriptional regulators (TRs) that bind at cis-regulatory regions to regulate gene expression in human or mouse, taking a query gene set, a ChIP-seq dataset or a scored genomic region set as input.

BART leverages 7,968 human TR binding profiles and 5,851 mouse TR binding profiles from the public domain (collected in Cistrome Data Browser) to make the prediction.

BART is implemented in Python and distributed as an open-source package along with the necessary data libraries. The BART package is also available on GitHub.

BART is developed and maintained by the Chongzhi Zang Lab at the University of Virginia.


Web Interface (Beta)

The BART web interface can be accessed here.


Download

BART Package

Current version: 2.0, updated May 15, 2020

BART2.0 for Python3 – Source code only (need to download data library separately)


BART Data Libraries

The human genome hg38 and the mouse genome mm10 are supported.

hg38 library
mm10 library


Supplementary Data

The union DNaseI hypersensitive sites (UDHS) used in the BART model are available below. They are NOT required for BART installation.

hg38 UDHS
mm10 UDHS


Installation

Prerequisites

BART uses Python's distutils tools for source installation. Before installing BART, please make sure Python 3 and the required Python packages are installed. We highly recommend the Anaconda environment, which includes all the required Python packages.


Download the source package and set up the configuration file

You have to download the human or mouse data library to a directory of your own before installing BART. The unpacked libraries occupy 14 GB of disk space in the download directory. Please download them from the links provided above or use the following commands.

$ wget https://faculty.virginia.edu/zanglab/bart/hg38_library.tar.gz
$ wget https://faculty.virginia.edu/zanglab/bart/mm10_library.tar.gz

To install a source distribution of BART, unpack the distribution tarball and go to the directory where you unpacked BART.

$ tar zxf bart_v2.0.tar.gz
$ cd bart_v2.0

Modify the configuration file (bart2/bart.conf). For example, if you download hg38_library.tar.gz (and/or mm10_library.tar.gz) and unpack it under /path/to/data, set the library directories in bart.conf as follows:

[path]
hg38_library_dir = /path/to/data/
mm10_library_dir = /path/to/data/
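Before running the installer, it can help to confirm that the directories named in bart.conf actually exist. The following is a minimal sketch, not part of BART: the function name check_bart_conf is hypothetical, and it only assumes the [path] section and the two option names shown above.

```python
import configparser
from pathlib import Path


def check_bart_conf(conf_path):
    """Return {option: (directory, exists)} for the [path] section of bart.conf.

    Hypothetical helper for illustration; BART does not ship this function.
    """
    # interpolation=None keeps any '%' characters in paths literal.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(conf_path):
        raise FileNotFoundError(conf_path)

    result = {}
    for option in ("hg38_library_dir", "mm10_library_dir"):
        value = parser.get("path", option, fallback="").strip()
        # An empty or missing option is reported as not existing.
        result[option] = (value, bool(value) and Path(value).is_dir())
    return result
```

Calling check_bart_conf("bart2/bart.conf") from the unpacked source directory reports, for each library, the configured directory and whether it is present on disk.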


Global installation

Install with root/administrator permission, or from within a prepared Anaconda environment. By default, the script installs the Python library and the executables globally.

$ python setup.py install


Local installation

If you want to install everything under a specific directory, for example /path/to/bart2/, use the following commands.

$ mkdir -p /path/to/bart2/lib/pythonX.Y/site-packages
$ export PYTHONPATH=/path/to/bart2/lib/pythonX.Y/site-packages/:$PYTHONPATH
$ python setup.py install --prefix /path/to/bart2
$ export PATH=/path/to/bart2/bin/:$PATH

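In these paths, X.Y stands for the major.minor version of Python you are using (such as 3.5). You can find it with "%d.%d" % sys.version_info[:2] from a Python command line; note that sys.version[:3] truncates two-digit minor versions, giving 3.1 for Python 3.10.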
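As a small sketch of how the X.Y placeholder can be filled in, the helpers below build the site-packages path from sys.version_info, which also handles two-digit minor versions such as 3.10. The function names are hypothetical and the prefix is the example directory used above.

```python
import sys


def python_xy(version_info=sys.version_info):
    """Return the major.minor version string, e.g. '3.10'."""
    return "{}.{}".format(version_info[0], version_info[1])


def site_packages_dir(prefix, version_info=sys.version_info):
    """Return <prefix>/lib/pythonX.Y/site-packages for the given prefix."""
    return "{}/lib/python{}/site-packages".format(
        prefix.rstrip("/"), python_xy(version_info)
    )


# Path to use in the mkdir and PYTHONPATH commands for the running interpreter.
print(site_packages_dir("/path/to/bart2"))
```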

To make these environment variables persistent, add the following lines to your shell startup file (this varies by platform; it is usually ~/.bashrc or ~/.bash_profile).

export PYTHONPATH=/path/to/bart2/lib/pythonX.Y/site-packages/:$PYTHONPATH
export PATH=/path/to/bart2/bin/:$PATH


Tutorial

Positional arguments

{geneset,profile,region}


bart2 geneset

Given a query gene set of official gene symbols (HGNC for human or MGI for mouse) in a text file (one gene per line, at least 100 genes recommended), predict functional TRs that regulate these genes.

Usage:

bart2 geneset -i genelist.txt -s hg38 --outdir bart2_output
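The input format described above (one symbol per line, at least 100 genes recommended) can be prepared with a few lines of Python. This is an illustrative sketch only; the function names are hypothetical, and symbols are left in their original case because mouse symbols are mixed-case.

```python
def clean_gene_list(symbols, min_recommended=100):
    """Strip whitespace, drop blanks and duplicates, and keep input order.

    Returns (cleaned_symbols, meets_recommendation).
    """
    seen = set()
    cleaned = []
    for raw in symbols:
        symbol = raw.strip()
        if symbol and symbol not in seen:
            seen.add(symbol)
            cleaned.append(symbol)
    return cleaned, len(cleaned) >= min_recommended


def write_gene_list(symbols, path):
    """Write one gene symbol per line, as expected for the -i input file."""
    with open(path, "w") as handle:
        handle.write("\n".join(symbols) + "\n")
```

For example, clean_gene_list(open("raw_genes.txt")) returns the cleaned list together with a flag indicating whether it reaches the recommended size; raw_genes.txt is a placeholder name.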


bart2 profile

Given a ChIP-seq data file (mapped reads in BAM or BED format in either hg38 or mm10), predict TRs whose binding pattern associates with the input ChIP-seq profile.

Usage:

bart2 profile -i ChIP.bam -f bam -s hg38 --outdir bart2_output
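When many ChIP-seq files need to be processed, the command above can be assembled programmatically. The sketch below uses only the flags shown in the usage line; the function name is hypothetical, and passing bed to -f for BED input is an assumption based on the formats listed above rather than something confirmed here.

```python
from pathlib import Path


def build_profile_command(infile, species, outdir):
    """Build the argument list for a 'bart2 profile' run.

    The format passed to -f is inferred from the file extension.
    The 'bed' value is assumed by analogy with 'bam'.
    """
    formats = {".bam": "bam", ".bed": "bed"}
    suffix = Path(infile).suffix.lower()
    if suffix not in formats:
        raise ValueError("expected a .bam or .bed file, got: %s" % infile)
    if species not in ("hg38", "mm10"):
        raise ValueError("species must be hg38 or mm10, got: %s" % species)
    return [
        "bart2", "profile",
        "-i", str(infile),
        "-f", formats[suffix],
        "-s", species,
        "--outdir", str(outdir),
    ]
```

The returned list can be passed to subprocess.run(command, check=True) once bart2 is on your PATH.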


bart2 region

Given a scored genomic region set (BED format in either hg38 or mm10), predict TRs enriched in this genomic region set.

Usage:

bart2 region -i ChIPpeak.bed -c 4 -s hg38 --outdir bart2_output
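Because region mode works on scored regions, it may be worth checking that the score column is numeric before running BART. The sketch below assumes that -c gives the 1-based index of the score column in a tab-delimited BED file (4 in the example above); that reading of the option and the function name are assumptions, not taken from the BART documentation.

```python
def check_score_column(lines, score_col=4):
    """Return a list of (line_number, problem) for rows with an unusable score.

    score_col is 1-based, assumed to match the value passed to -c.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < score_col:
            problems.append((number, "too few columns"))
            continue
        try:
            float(fields[score_col - 1])
        except ValueError:
            problems.append((number, "non-numeric score"))
    return problems
```

An empty result from check_score_column(open("ChIPpeak.bed"), score_col=4) means every data row has a numeric value in the chosen column.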


Output files

    *_adaptive_lasso_Info.txt provides regression information: it lists which representative H3K27ac samples were selected by adaptive lasso regression, along with their coefficients and sample annotations (cell line, cell type or tissue type). This file is generated only in geneset mode.

    *_CRE_prediction_lasso.txt is the predicted cis-regulatory profile of the input gene set, given as a ranked list of all CREs (UDHS) in the genome. The higher the score, the more likely the regulatory element regulates the input gene set. This file is generated only in geneset mode.

    *_auc.txt provides the association score of each TR ChIP-seq dataset with the genomic cis-regulatory profile.

    *_bart_results.txt is a ranked list of all TRs with multiple quantification scores.

    An example of BART2 output can be found here.
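To see at a glance which of the files listed above a run produced, the output directory can be scanned by file-name suffix. This is a minimal sketch with hypothetical names; it matches the regression information file on its _Info.txt ending and makes no assumption about the columns inside any file.

```python
from pathlib import Path

# Suffixes taken from the output file list above.
KNOWN_SUFFIXES = {
    "_Info.txt": "adaptive lasso regression info (geneset mode only)",
    "_CRE_prediction_lasso.txt": "ranked CRE prediction (geneset mode only)",
    "_auc.txt": "per-dataset association scores",
    "_bart_results.txt": "ranked TR list",
}


def classify_outputs(filenames):
    """Map each recognised output file name to a short description."""
    found = {}
    for name in filenames:
        base = Path(name).name
        for suffix, description in KNOWN_SUFFIXES.items():
            if base.endswith(suffix):
                found[base] = description
                break
    return found
```

For the runs shown in the tutorial, classify_outputs(Path("bart2_output").iterdir()) lists the recognised files in the output directory.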


Frequently Asked Questions

Please sign up for the BART users Google Group for update announcements and discussions.


Citation

If you use BART in your data analysis, please cite:

BART: a transcription factor prediction tool with query gene sets or epigenomic profiles
Zhenjia Wang, Mete Civelek, Clint Miller, Nathan Sheffield, Michael J. Guertin, Chongzhi Zang
Bioinformatics 34, 2867–2869 (2018)


Contact

BART is developed and maintained by the Chongzhi Zang Lab at the University of Virginia. Please email us with any questions.


Last modified: May 16, 2020