Reference Sequence Formats
=============================

This appendix explains the details of how |breseq| handles different reference sequence formats. Most importantly, this includes how different types of feature annotations are used.

Illegal Characters
--------------------

For all sequence formats:

#. In nucleotide sequences, all characters are converted to uppercase and all non [ATCG] characters are converted to [N].
#. In gene names and locus tags, the characters [,;/|] are replaced with [_].
#. In gene descriptions, the character [|] is replaced with [;].

Feature Annotations
----------------------------

|breseq| is able to more accurately predict the locations of transposon insertions if these elements are annotated in the reference genome. They must have a feature type of ``repeat_region`` or ``mobile_element``.
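
The character rules under Illegal Characters and the feature-type requirement above can be illustrated with a short Python sketch. This is not |breseq|'s own code: the function names and the ``MOBILE_ELEMENT_TYPES`` constant are hypothetical, and the sketch assumes each sequence is passed in as a single string with whitespace and line breaks already removed.

```python
import re

# Hypothetical names for illustration only; not taken from breseq itself.
# Feature types the text says are required for annotated transposons.
MOBILE_ELEMENT_TYPES = {"repeat_region", "mobile_element"}


def clean_sequence(seq: str) -> str:
    """Uppercase the sequence, then turn every character other than A, T, C, G into N.

    Assumes whitespace has already been stripped; otherwise it would become N too.
    """
    return re.sub(r"[^ATCG]", "N", seq.upper())


def clean_name(name: str) -> str:
    """Replace each of , ; / | in a gene name or locus tag with an underscore."""
    return re.sub(r"[,;/|]", "_", name)


def clean_description(description: str) -> str:
    """Replace each | in a gene description with a semicolon."""
    return description.replace("|", ";")


def is_mobile_element_type(feature_type: str) -> bool:
    """Return True if the feature type is one of the two named in the text."""
    return feature_type in MOBILE_ELEMENT_TYPES
```

For example, under these assumptions ``clean_sequence("acgtRy-n")`` gives ``ACGTNNNN`` and ``clean_name("insA/insB|1")`` gives ``insA_insB_1``. A feature typed ``CDS`` would not pass ``is_mobile_element_type``, so a transposase gene annotated only as ``CDS`` would not satisfy the requirement described above.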