Changes in deepTools2.0¶
Major changes¶
Note
The major changes encompass features for increased efficiency, new sequencing data types, and additional plots, particularly for QC.
Moreover, deepTools modules can now be used by other Python programs. The deepTools API example is part of the new documentation.
Accommodating additional data types¶
Correlations and comparisons can now be calculated for bigWig files (in addition to BAM files) using multiBigwigSummary and bigwigCompare.
RNA-seq: split reads are now natively supported.
MNase-seq: using the new option --MNase in bamCoverage, one can now compute read coverage taking only the 2 central base pairs of each mapped fragment into account.
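As an illustration of the MNase-seq behaviour described above, the following minimal sketch shows one way the 2 central base pairs of a fragment could be selected. The function name central_two_bp and the 0-based, half-open coordinate convention are assumptions made for this example; it is not taken from the deepTools source code.

```python
def central_two_bp(frag_start, frag_end):
    """Return (start, end) of the 2 central bp of a fragment.

    Coordinates are assumed to be 0-based and half-open (as in BED).
    This is a hypothetical helper, not part of deepTools.
    """
    length = frag_end - frag_start
    if length < 2:
        raise ValueError("fragment must span at least 2 bp")
    # Integer midpoint of the fragment.
    center = frag_start + length // 2
    # One base to the left of the midpoint plus the midpoint base itself.
    return center - 1, center + 1


# Example: a 150 bp fragment starting at position 100.
region = central_two_bp(100, 250)
```

Only the returned 2 bp region would then contribute to the coverage, rather than the full fragment.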
Structural updates¶
All modules have comprehensive and automatic tests that evaluate proper functioning after any modification of the code.
Virtualization for stability: we now provide a docker image and enable the easy deployment of deepTools via the Galaxy toolshed.
Our documentation is now version-aware thanks to readthedocs and sphinx.
The API is public and documented.
Renamed tools¶
heatmapper to plotHeatmap
profiler to plotProfile
bamCorrelate to multiBamSummary
bigwigCorrelate to multiBigwigSummary
bamFingerprint to plotFingerprint
Increased efficiency¶
We dramatically improved the speed of bigwig related tools (multiBigwigSummary and computeMatrix) by using the new pyBigWig module.
It is now possible to generate one composite heatmap and/or meta-gene image based on multiple bigwig files in one go (see computeMatrix, plotHeatmap, and plotProfile for examples).
computeMatrix now also accepts multiple input BED files. Each is treated as a group within a sample and is plotted independently.
We added additional filtering options for handling BAM files, decreasing the need for prior filtering using tools other than deepTools: the --samFlagInclude and --samFlagExclude parameters can, for example, be used to only include (or exclude) forward reads in an analysis.
We separated the generation of read count tables from the calculation of pairwise correlations that was previously handled by bamCorrelate. Now, read counts are calculated first using multiBamSummary or multiBigwigSummary, and the resulting output file can be used for calculating and plotting pairwise correlations using plotCorrelation or for doing a principal component analysis using plotPCA.
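The flag-based filtering mentioned above relies on the bitwise FLAG field defined in the SAM specification. The sketch below shows how an include/exclude pair of bitmasks could be applied to a read's flag; the function keep_read and the example flag values are assumptions for illustration and do not reproduce the deepTools implementation.

```python
# Bit values as defined in the SAM specification.
FLAG_REVERSE = 0x10        # 16: read is mapped to the reverse strand
FLAG_FIRST_IN_PAIR = 0x40  # 64: read is the first mate of a pair


def keep_read(flag, include=0, exclude=0):
    """Decide whether a read passes the include/exclude masks.

    Hypothetical helper: a read is kept if all bits of `include`
    are set and none of the bits of `exclude` are set.
    """
    if include and (flag & include) != include:
        return False
    if exclude and (flag & exclude) != 0:
        return False
    return True


# Made-up example flags: unpaired forward, unpaired reverse,
# paired first-mate reverse (83), paired first-mate forward (99).
flags = [0, 16, 83, 99]
forward_only = [f for f in flags if keep_read(f, exclude=FLAG_REVERSE)]
reverse_only = [f for f in flags if keep_read(f, include=FLAG_REVERSE)]
```

Under these assumptions, excluding bit 16 keeps forward-strand reads, while including bit 16 keeps only reverse-strand reads.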
New features and tools¶
Correlation analyses are no longer limited to BAM files – bigwig files are possible, too! (see multiBigwigSummary)
Correlation coefficients can now be computed even if the data contains NaNs.
- Added new quality control tools:
use plotCoverage to plot the coverage over base pairs
use plotPCA for principal component analysis
bamPEFragmentSize can be used to calculate the average fragment size for paired-end read data
Added hierarchical clustering, in addition to k-means, to plotProfile and plotHeatmap.
plotProfile has many more options to make compelling summary plots.
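Regarding the note above that correlation coefficients can be computed even if the data contains NaNs: one common way to achieve this is to drop every position where either sample is NaN before computing the coefficient. The following is a minimal, pure-Python sketch of that idea for the Pearson coefficient; the function name and the pairwise-deletion strategy are assumptions for illustration and may differ from what deepTools does internally.

```python
import math


def pearson_ignoring_nans(x, y):
    """Pearson correlation using only positions where both values are numbers.

    Hypothetical helper illustrating pairwise deletion of NaNs.
    Returns NaN if fewer than 2 valid pairs remain or if one
    variable is constant.
    """
    pairs = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-bin scores for two samples; NaN marks bins without data.
nan = float("nan")
sample_a = [1.0, 2.0, nan, 4.0, 5.0]
sample_b = [2.0, 4.0, 6.0, nan, 10.0]
r = pearson_ignoring_nans(sample_a, sample_b)
```

Here only the first, second, and fifth bins contribute, because the third and fourth bins each contain a NaN in one of the samples.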
Minor changes¶
Changed parameter names and settings¶
computeMatrix can now read files with DOS newline characters.
--missingDataAsZero was renamed to --skipNonCoveredRegions for clarity in bamCoverage and bamCompare.
Read extension was made optional and we removed the need to specify a default fragment length for most of the tools: --fragmentLength was thus replaced by the new optional parameter --extendReads.
Added option --skipChromosomes to multiBigwigSummary, which can be used to, for example, skip all ‘random’ chromosomes.
Added the option for adding titles to QC plots.
Bug fixes¶
Resolved an error introduced by numpy version 1.10 in computeMatrix.
Improved plotting features for plotProfile when using the plot types ‘overlapped_lines’ and ‘heatmap’.
Fixed a problem with BED intervals in multiBigwigSummary and multiBamSummary that returned wrongly labeled raw counts.
multiBigwigSummary now also considers chromosomes as identical when the names between samples differ only by the ‘chr’ prefix, e.g. chr1 vs. 1.
Fixed a problem with wrongly labeled proper read pairs in a BAM file. We now have additional checks to determine if a read pair is a proper pair: the reads must face each other and are not allowed to be farther apart than 4x the mean fragment length.
For bamCoverage and bamCompare, the behavior of scaleFactor was updated such that now, if given in combination with the normalization options (--normalizeTo1x or --normalizeUsingRPKM), the given scaling factor will be multiplied with the factor computed by the respective normalization method.
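The updated scaleFactor behaviour described above amounts to a simple multiplication. The sketch below illustrates it for an RPKM-style normalization, using the standard per-bin definition (reads per bin divided by mapped reads in millions times bin length in kb). The function names and the numbers are assumptions chosen for illustration, not values or code from deepTools.

```python
def rpkm_factor(mapped_reads, bin_length_bp):
    """Factor that converts a raw per-bin read count into RPKM.

    Hypothetical helper based on the standard RPKM definition.
    """
    millions = mapped_reads / 1e6
    kilobases = bin_length_bp / 1e3
    return 1.0 / (millions * kilobases)


def effective_scale(user_scale_factor, normalization_factor):
    """Combine the user-supplied factor with the normalization factor."""
    return user_scale_factor * normalization_factor


# Made-up example: 20 million mapped reads, 50 bp bins, user factor 0.5.
norm = rpkm_factor(mapped_reads=20_000_000, bin_length_bp=50)
combined = effective_scale(0.5, norm)
```

With these example numbers the normalization factor is 1.0, so the combined factor equals the user-supplied 0.5; each raw bin count would be multiplied by the combined factor.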