
Version history

2.1.5 (August 2019)

No future updates are planned beyond this final release.

Fixed bug in reported median mutation rates in histogram figures (previous ShapeMapper versions actually displayed the 5th percentile)
Added guidance for running ShapeMapper components piecemeal (see Modular workflow)
Added output histogram plot of un-normalized ln(mut_rate_modified/mut_rate_untreated)
Reorganized list of ShapeMapper dependencies (see Building)
Adjusted --min-mutation-separation 0 behavior to match expectations
Added missing --output-aligned-reads option
Allow --input_is_unpaired argument to shapemapper_mutation_parser to override flags present in SAM alignment
Fixed issue with unit test paths on machines outside the build environment
Fixed issue with undocumented --separate-ambig-counts param
Exposed bowtie2 effort params -R and -D as --max-reseed and --max-search-depth
Added a more helpful error message in certain cases of missing input files
Added error messages when attempting to run on a Mac or when executables are not present
Added a top-level CMakeLists to simplify most common build situation
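
As a rough illustration of the quantity shown in the new un-normalized ln(mut_rate_modified/mut_rate_untreated) histogram, the following is a minimal Python sketch and not ShapeMapper's own plotting code. The function name, the example rates, and the choice to skip positions where either rate is zero are assumptions made for illustration; this changelog does not state how such positions are handled.

```python
import math

def ln_rate_ratios(modified, untreated):
    """Return ln(modified/untreated) for each position with two positive rates.

    Assumption: positions where either rate is zero are skipped, because the
    log ratio is undefined there. ShapeMapper's actual handling may differ.
    """
    return [
        math.log(m / u)
        for m, u in zip(modified, untreated)
        if m > 0 and u > 0
    ]

# Hypothetical per-nucleotide mutation rates, for illustration only.
mut_rate_modified = [0.020, 0.004, 0.010, 0.0, 0.008]
mut_rate_untreated = [0.002, 0.004, 0.001, 0.001, 0.0]

ratios = ln_rate_ratios(mut_rate_modified, mut_rate_untreated)
for value in ratios:
    print(f"{value:.3f}")
```

Values near zero indicate positions where the modified and untreated samples mutate at similar rates, while larger positive values indicate positions with more signal above background.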

2.1.4 (Oct. 2018)

Documentation reorganized and expanded
Fix for crash with folder names shorter than 3 characters
Added amplicon primer pair mapping filters
Now using paired-end alignment mode for paired reads that fail to merge
Added test for overall pipeline accuracy on a small example dataset
Added simplified reactivity profile outputs suitable for direct import into VARNA or Ribosketch (see Coloring by SHAPE reactivity).
Refactored much of MutationCounter and MutationParser
- Refactored Read class and used throughout
- Moved all mutation processing and filtering functions from MutationCounter to MutationParser
- Incorporated outputs that were previously debug-only into the primary output of MutationParser
Debug mutation rendering (--render-mutations) reworked
- Provides more detailed information about each mutation processing step and quality filters
- Outputs a multi-page pdf file scaled to fit the width specified by --max-paired-fragment-length
Added end-to-end tests for unpaired inputs
Bugfix for log file path when --name provided
Exposed STAR --genomeSAindexNbase parameter
- Added option to automatically rerun with defined --genomeSAindexNbase in the case of STAR segfault (see STAR parameters)
Excluded lowercase sequence from mutation rate histogram plots
Added --per-read-histograms option (disabled by default)
Print stdout and stderr from all subprocesses to the main log file if a run fails and --verbose is set

2.1.3

Fix for crash with large --random-primer-len

2.1.2

Fix for execution in Slurm cluster environments
Added error message for all-lowercase input sequence
Third-party conda package fixes
Bamtools CMake fixes to accommodate recent repo changes

2.1.1

Added simple read length and mutations per read histogram outputs
Support passing target sequences directly on the command line
Various fixes to ease remote builds
Exit with error if tests fail
Error message if FASTQ files in a provided --folder lack R1 or R2 in their filenames
Bugfix for sequence names with spaces
Softened data quality warning message
Added more detailed output for intermediate single read classified mutation files.

2.0-rc3

Added grip-rendered README.

2.0-rc2

Added license
Updated README
Allow pipeline to run to completion even if some RNAs have no mapped reads
Updated STAR aligner suggestion message for long RNAs
Updated thirdparty package management scripts
Moved some utility scripts to separate repo
Bugfix for partial argument parsing
Fix for bowtie2 component failure detection
Clarified error message for filename collisions
Do not render pipeline flowchart by default (controlled with --render-flowchart)
Removed some large unused test files

0.1.5

Quality control checks more readable
Detailed quality control descriptions in docs/quality_control.html
File format descriptions in docs/file_formats.html
Softened quality control warning text
Better --version and --help handling
Files under active development conditionally excluded from tarball
Changed default --min-mutation-separation to 6
Added post-alignment basecall quality filter for mutation counting. Controlled by the --min-qual-to-count parameter (default=30). Effective read depths are now calculated using only positions with high-quality basecalls.
Fixed speed problem with STAR end-to-end tests
Fixed issue with --correct-seq crashing with regions of zero coverage
Fixed issue with --correct-seq causing downstream crash if sequence length changed.
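
The effect of the post-alignment basecall quality filter on effective read depth can be sketched as follows. This is a simplified Python illustration, not the ShapeMapper implementation: the read representation (a 0-based start position plus one Phred score per aligned position, with no insertions or deletions), the function name, and the inclusive comparison against the threshold are all assumptions. Only the default value of 30 for --min-qual-to-count comes from this changelog.

```python
MIN_QUAL_TO_COUNT = 30  # default of --min-qual-to-count, per the changelog

def effective_depth(reads, ref_length, min_qual=MIN_QUAL_TO_COUNT):
    """Count, per reference position, the reads with a high-quality basecall.

    reads: list of (start, quals) pairs, where start is the 0-based leftmost
    reference position and quals holds one Phred score per aligned position.
    Assumption: gapless alignments and an inclusive threshold (q >= min_qual).
    """
    depth = [0] * ref_length
    for start, quals in reads:
        for offset, qual in enumerate(quals):
            pos = start + offset
            if 0 <= pos < ref_length and qual >= min_qual:
                depth[pos] += 1
    return depth

# Two hypothetical reads over a 6-nucleotide reference.
reads = [
    (0, [35, 36, 12, 38]),
    (1, [31, 30, 29, 40]),
]
depth = effective_depth(reads, ref_length=6)
print(depth)  # [1, 2, 1, 1, 1, 0]
```

Positions 2 and 3 are each covered by two reads but have an effective depth of 1, because one of the two basecalls at each position falls below the threshold.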

0.1.4

Bugfix release. Updated run_example.sh, log file location, STAR aligner warning, and quality control warning for RNAs with long names.

0.1.3

Multiple RNA support
- Provide one or more FASTA files with one or multiple target sequences in each file.
- By default, normalize reactivity profiles as a group (disable with --indiv-norm)
Unpaired read support
Masked region support
- Lowercase nucleotides in sequence will be excluded from reactivity profile calculation.
- Useful for primer binding regions in targeted primer experiments
Quality control checks and warnings
- Good read depths
- Mutation rates higher in modified sample than in untreated
- Expected number of highly reactive nucleotides
- Not too many high background positions
Exclude mutations and depths over 3-prime random primer binding portion of reads with the --random-primer-len parameter
STAR aligner support (--star-aligner). Recommended for large RNAs, as it can be much faster than Bowtie2.
Intermediate/debug file output options
- Aligned reads (--output-aligned)
- Parsed mutations (--output-parsed)
- Classified mutations (--output-classified)
- Counted mutations (--output-counted)
- Rendered mutations (--render-mutations). This generates a PostScript image showing a subset of reads, with the parsed mutations shown above each read and the adjusted mutations that ultimately contribute to profile calculation shown below it.
Realign ambiguously-located deletions and insertions to their leftmost valid positions and include in reactivity profile calculation. This empirically produces more accurate profiles than excluding ambiguous mutations.
Combine mutations separated by up to 11 unchanged reference nucleotides. This empirically produces more accurate profiles than only combining immediately adjacent mutations. This threshold can be changed with --min-mutation-separation.
Occluded depth correction support. Under this scheme, multinuc mutations do not contribute to the read depth calculation, since they effectively prevent the detection of any modification within that region. Disable with --no-occluded-depth-correction
Options to exclude inserts, deletions, or ambiguously-aligned inserts or deletions from reactivity profile calculation
Sped up some tests
Added --overwrite option; without it, an error is given if existing files conflict with output files
Verbose option to show each subprocess command
Default --min-depth raised to 5000
Default --min-mapq lowered to 10
Flowchart legend
Simplified pipeline-building framework
Clearer output filenames
Minor fixes
- Counted mutations and read depth files are now guaranteed to have the same lengths, even if the 3-prime end of the RNA is covered by no reads
- Error message when denatured and modified samples provided but no untreated sample
- Profile normalization creates a new file rather than overwriting input file
- Error for FASTA file with whitespace in sequence
- Explicit filename argument handling
- Sequence variants reported to user in 1-based coordinates
- Various test runner fixes
- Fix for build using local libs
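
The merging of nearby mutations described in this release (combining mutations separated by up to 11 unchanged reference nucleotides) can be sketched in Python as below. This is an illustrative sketch only, not the ShapeMapper code. The interval representation, the function name, and the max_gap parameter are hypothetical, and how max_gap corresponds to the value passed to --min-mutation-separation is not specified in this changelog.

```python
def merge_nearby_mutations(mutations, max_gap=11):
    """Merge mutations separated by at most max_gap unchanged nucleotides.

    mutations: list of (left, right) pairs giving the inclusive reference
    positions covered by each mutation within a single read.
    Returns a sorted list of merged (left, right) pairs.
    """
    merged = []
    for left, right in sorted(mutations):
        # Number of unchanged reference nucleotides between the previous
        # merged mutation and this one.
        if merged and left - merged[-1][1] - 1 <= max_gap:
            prev_left, prev_right = merged[-1]
            merged[-1] = (prev_left, max(prev_right, right))
        else:
            merged.append((left, right))
    return merged

# Hypothetical mutations in one read, for illustration only.
mutations = [(10, 10), (15, 16), (40, 40), (52, 52)]
merged = merge_nearby_mutations(mutations)
print(merged)  # [(10, 16), (40, 52)]
```

Setting max_gap=0 in this sketch merges only immediately adjacent mutations, which corresponds to the stricter behavior that the entry above compares against.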

0.1.2

Histogram rendering (mutation rates, sequencing depths, reactivities)
Sequence variant correction integrated into command-line interface
- Generate updated FASTA file
- Report sequence changes to user
- Warn user about high-frequency but sub-threshold mutations
Alignment stats reported to user
- Spurious reads making it through BBmerge are now filtered out, so alignment stats better reflect actual mapping percentages
Option to generate output without excluding ambiguously aligned mutations
- Histograms generated this way should better reflect actual mutation rates
- Nucleotide-resolution profiles generated this way may be misleading in regions of homopolymeric or repeated sequence
Log file generation
Testing
- Unit tests built again
- End-to-end pipeline success tests
- End-to-end module failure detection tests
- Sequence variant correction tests
- Reduced size of test dataset to speed up execution
- Single script to run all tests and summarize results
Serial execution working again
Mapping quality filter
FASTA format checks
Segfaults reported to user
Bugfix for 2-sample run
Correct handling of ambiguous mutations for poor alignments (corner case)
Flowchart rendering working on UNC cluster
Quote filenames in shell wrappers
CMake rebuild much faster (don't rebuild BamTools every time)
Misc. fixes to build scripts
