dREG Gateway

dHIT2 Documentation

1 Login
To log in, click the 'Log in' link at the top-right corner of the page. Creating an account is free and easy, and provides a number of benefits.

Figure 1: Login page

2 Create a new experiment
Select the dHIT2 application on the Dashboard panel to create a new analysis for your data, as shown in the following screenshot (Figure 2).

Figure 2: dHIT2 dashboard

3 Set experiment name
Enter an "Experiment Name" and, optionally, click "Add a description" to add a comment about the experimental setup. Choose the project that the experiment belongs to. By default, a "Default Project" is created and used.

Figure 3: Start new dHIT2 experiment

4 Upload bigWig files
There are two ways to upload bigWig files:
(1) Click "Select files from storage" to choose existing files submitted for previous tasks, or
(2) click "Drop files here or browse" to upload new files from your local storage.
Note that bigWig files from run-on sequencing are strand-specific, so the plus-strand and minus-strand bigWig files must be listed in matching order within each condition. Additionally, as tfTarget uses DESeq2 to model differential transcription and perform hypothesis testing, at least two replicates are required for each condition.
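Before uploading, it can help to check the strand pairing and replicate count locally. The following is a minimal Python sketch, not part of the gateway. It assumes a hypothetical naming convention in which each replicate has one file ending in "_plus.bw" and one ending in "_minus.bw" with the same stem. The function name and the suffixes are illustrative assumptions, so adjust them to your own file names.

```python
# Hypothetical file-name suffixes; the gateway does not require this convention.
PLUS_SUFFIX = "_plus.bw"
MINUS_SUFFIX = "_minus.bw"


def check_bigwig_sets(files_by_condition):
    """Check plus/minus pairing and replicate count for each condition.

    files_by_condition maps a condition label to a list of
    (plus_bigwig, minus_bigwig) file-name tuples, one tuple per replicate.
    Returns a list of problem descriptions (empty if nothing was found).
    """
    problems = []
    for condition, replicates in files_by_condition.items():
        # At least two replicates are required for each condition.
        if len(replicates) < 2:
            problems.append(
                f"{condition}: at least 2 replicates required, found {len(replicates)}"
            )
        for plus, minus in replicates:
            # Both files must carry the expected strand suffix.
            if not plus.endswith(PLUS_SUFFIX) or not minus.endswith(MINUS_SUFFIX):
                problems.append(
                    f"{condition}: unexpected strand suffix in ({plus}, {minus})"
                )
                continue
            # The stems must match so that the strands belong to the same replicate.
            if plus[: -len(PLUS_SUFFIX)] != minus[: -len(MINUS_SUFFIX)]:
                problems.append(
                    f"{condition}: {plus} and {minus} are not a matching pair"
                )
    return problems
```

An empty list means that every condition has at least two replicates and that each plus-strand file is paired with the minus-strand file of the same replicate.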

Figure 4: Upload bigWig files

5 Set computing parameters
(1) Specify the genome assembly of the bigWig files. Genome assembly information is required for defining gene bodies, quantifying their transcriptional activity, and computing motif enrichment scores. Currently, only hg19, hg38, mm9 and mm10 are supported. If additional genome assemblies are needed, please submit a request to the admin.
(2) Specify the prefix of the output files. This can help distinguish results from multiple experiments.
(3) Specify the resolution of the output files. Currently, only 16 bp and 128 bp are supported.
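The constraints above can be summarized as a small check. This is a minimal Python sketch for illustration only and is not code used by the gateway. The function and constant names are assumptions. It accepts None for the genome because the parameter list gives 'None' as the default reference genome, and uses 128 as the default resolution for the same reason.

```python
SUPPORTED_GENOMES = {"hg19", "hg38", "mm9", "mm10"}
SUPPORTED_RESOLUTIONS_BP = {16, 128}


def validate_parameters(genome, prefix, resolution_bp=128):
    """Return a list of problems with the chosen parameters (empty if none)."""
    errors = []
    # None stands for "no reference genome", the documented default.
    if genome is not None and genome not in SUPPORTED_GENOMES:
        errors.append(f"unsupported genome assembly: {genome}")
    # A non-empty prefix keeps results from different experiments apart.
    if not prefix:
        errors.append("output prefix must not be empty")
    if resolution_bp not in SUPPORTED_RESOLUTIONS_BP:
        errors.append(f"unsupported resolution: {resolution_bp} bp")
    return errors
```

For example, validate_parameters("hg38", "run1") returns an empty list, while an unsupported assembly such as "hg18" produces an error message.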

Figure 5: Set computing parameters

6 Submit the job
Once steps 1-5 are finished, proceed to "Save and launch". Input data and parameters will be submitted to the computing node of the XSEDE cluster via the dREG gateway server. Click the checkbox next to "Receive email notification of experiment status" if needed. dREG peak calling will be automatically performed on bigWigs merged for each condition. Upon launching, users will be directed to the "Experiments" page, shown in Figure 6. A typical experiment usually finishes within 4 hours.

7 Check the status
Users may view the progress by logging in and clicking the "Experiments" button on the left control panel of the dashboard. All submitted experiments are listed on this page.

Figure 6: Check the experiment status

8 Check the results
Once a job is completed, click the dHIT2 experiment of interest to open its Experiment Summary page. All parameters used to set up the experiment are listed on this page. The output files of dHIT2 are stored in the ARCHIVE directory, and the predicted results in bigWig format are stored in predicted_bw. Click ARCHIVE to view any individual result file. A compressed file containing the input bigWig files, two task log files and all result files is also provided. Click the "Download Zip" button to download it. On Linux, a downloaded file with the 'tar.gz' extension can be decompressed with the 'tar' command, and a file with the 'gz' extension can be decompressed with the 'gunzip' command.
Safari may cause problems because it tries to unzip the compressed results automatically using an incompatible decompression method. Please check this link to disable this feature.
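For users who prefer a script to the 'tar' and 'gunzip' commands, the same decompression can be done with the Python standard library. The following is a minimal sketch. The function name and the example file names are hypothetical and do not reflect the actual names of the gateway's output files.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def unpack_result(path, out_dir="."):
    """Decompress a downloaded '.tar.gz' archive or a single '.gz' file.

    Returns the output directory for a '.tar.gz' archive, or the path of
    the decompressed file for a '.gz' file.
    """
    path = Path(path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    if path.name.endswith(".tar.gz"):
        # Counterpart of 'tar -xzf'. Only unpack archives from a trusted source.
        with tarfile.open(path, "r:gz") as archive:
            archive.extractall(out_dir)
        return out_dir
    if path.suffix == ".gz":
        # Counterpart of 'gunzip', except that the compressed file is kept.
        target = out_dir / path.name[: -len(".gz")]
        with gzip.open(path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return target
    raise ValueError(f"unrecognized extension: {path.name}")
```

For example, unpack_result("results.tar.gz", "results") would extract the archive into a "results" directory, where "results.tar.gz" is a placeholder for the downloaded file.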

Figure 7: dHIT2 Archive

The input to dHIT2 consists of bigWig files from two conditions, which represent the position of RNA polymerase on the positive and negative strands. For the sequence alignment and processing steps needed to make the input bigWig files, refer to the dREG service.

1 dHIT2 parameter list

Note: Users can use the default parameters. This list provides details that may be useful for advanced users.

Parameter name	Description
Prefix for output file	This can help distinguish results from multiple experiments.
Reference genome	When dHIT2 is used to predict histone modification signals, information from the reference genome can optionally be used to assist dHIT2 in prediction. The user needs to select a reference genome that matches the input bigWig file. Default='None'.
Resolution	Resolution of the output files in bp (16 or 128). Default=128

1 dHIT2 output list

Note: All files below are stored in the "ARCHIVE" directory.

File name	Description
job_XXX.slurm	The command used by the system to invoke dHIT2.
slurm-XXX.out	The execution log of this experiment.

dREG Gateway is an online service that supports Web-based science through the execution of online computational experiments and the management of data. The items below answer common questions from users.

Q: How should I prepare bigWig files for use with the dREG gateway?

A: Information about how to prepare files can be found here.

Q: What should I do when I encounter a computational failure in the dREG gateway?

A: There are two types of errors you may encounter. We explain how to identify your error and how to handle it here.

Q: Which browser works well with the dREG gateway?

A: So far we have tested Firefox, Google Chrome and Safari. With IE (version 10 or 11) and some versions of Safari, you may have trouble displaying sequence data in the WashU genome browser. Safari users, please read the next Q&A.

Q: What should the Safari users be aware of?

A: First, by default Safari automatically unzips a zip file when you download it. However, dREG results are compressed with the 'bgzip' command, which is not compatible with Safari's method, so downloading dREG results can cause problems. Please refer to this link to disable this feature in Safari, and then download the compressed results from the dREG gateway.
Second, when you click the genome browser link, please use a left-click. Do not use the right-click menu or its "open a new tab" option.

Q: What types of enhancers and promoters can be identified using the dREG gateway?

A: As a general rule of thumb, high-quality datasets provide very similar groups of enhancers and promoters as ChIP-seq for H3K27ac. This suggests that dREG identifies the location of all of the so-called 'active' class of enhancers and promoters.

Q: Will the dREG gateway work with my data type?

A: The dREG gateway will work well with data collected by any run-on and sequencing method, including GRO-seq, PRO-seq, or ChRO-seq. Other methods that map the location of RNA polymerase genome wide using alternative tools (for example, NET-seq) will most likely work well, but are not officially supported.

Q: Will the pre-trained models work using data from my species?

A: Models are currently available only in mammalian organisms. The length and density of genes, which vary considerably between highly divergent species, affect the way that a transcribed promoter or enhancer looks. For this reason, models can only be used in species. We are working to create models in widely-used model organisms, including Drosophila and C. elegans.

Q: How deeply do I need to sequence PRO-seq libraries?

A: Sensitivity is reasonable at ~40 million mapped reads and saturates at ~100 million mapped reads. See our analysis here: supplementary figure 3 in dREG paper.

Q: How long are my data and results kept in the dREG gateway?

A: One month.

Q: How do I cite the dREG gateway?

A: Please cite one of our papers if you use dREG results in your publication:
(1) Wang, Z., Chu, T., Choate, L. A., & Danko, C. G. (2019). Identification of regulatory elements from nascent transcription using dREG. Genome research, 29(2), 293-303.

(2) Danko, C. G., Hyland, S. L., Core, L. J., Martins, A. L., Waters, C. T., Lee, H. W., ... & Siepel, A. (2015). Identification of active transcriptional regulatory elements from GRO-seq data. Nature methods, 12(5), 433-438.

Q: Do I have to create an account before using this service?

A: Yes. This system is supported by an NSF-funded supercomputing resource known as XSEDE, which regularly needs to report bulk usage statistics to NSF. Nevertheless, data that you provide are completely safe.

Q: How do I know the status of the computational nodes?

A: Since we can't update this web site very often, the gateway status is updated here on the dREG page based on notifications from the XSEDE community.

Q: Who do I thank for the computing power?

A: This web-based tool is powered by SciGaP and Apache Airavata, and the GPU servers are supported by XSEDE.

Q: I have another question that is not on this FAQ. How can I contact you?

A: Please contact us with any questions! Zhong (zw355 at cornell.edu). Charles (cgd24 at cornell.edu).