Online bioinformatics tools for sequence analysis

Bioinformatics is a field of study concerned with the computational analysis and storage of biological data. The field is broad, ranging from the study of DNA and proteins to structural biology, drug design and comparative genomics. Dr Trevor Bell and Professor Anna Kramvis, from the Hepatitis Virus Diversity Research Unit (HVDRU) in the Department of Internal Medicine, have developed a number of free, online bioinformatics tools, described in several Open Access papers (1-4).

The standard workflow in the HVDRU includes DNA extraction, PCR amplification, direct DNA sequencing, viewing and checking of chromatograms, preparation of curated sequences, multiple sequence alignment, sequence analysis, serotyping, genotyping, phylogenetic analysis and preparation of sequences for submission to public databases such as GenBank. The tools developed in the HVDRU are used at several of the steps in this process, with a particular focus on processing of chromatograms and DNA sequence data. Although developed and tested with sequence data from hepatitis B virus (HBV), sequences from other organisms can be submitted to most of the tools.

The suite includes tools to: plot and visualize chromatogram quality scores; generate contigs directly from forward and reverse chromatograms; conservatively clean or curate sequence data; extract HBV protein sequences; calculate 2-by-2 contingency tables; determine HBV serotype; merge long overlapping sequence fragments; summarize and graph nucleotide or mutation distribution; automate phylogenetic analysis; and prepare fragments for GenBank submission. Two further tools have been developed to assist with processing and analysis of ultra-deep re-sequencing (pyrosequencing) data.
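To give a sense of what a single-purpose tool such as a fragment merger does, the following is a minimal sketch in Python. It is not the HVDRU implementation: it assumes the two fragments overlap exactly (no mismatches or gaps), and the function name merge_fragments and the min_overlap parameter are hypothetical, chosen only for illustration.

```python
def merge_fragments(left: str, right: str, min_overlap: int = 10) -> str:
    """Merge two fragments where the end of `left` exactly matches the start of `right`.

    Assumption: the overlap is an exact match. Real sequence data usually
    needs an alignment step that tolerates mismatches and ambiguous bases.
    """
    left, right = left.upper(), right.upper()
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first, then shrink towards min_overlap.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError(f"no exact overlap of at least {min_overlap} bases found")


# Two short, made-up fragments that share the six bases TTGACC.
merged = merge_fragments("ACGTACGTTTGACC", "TTGACCGGA", min_overlap=4)
print(merged)
```

Searching from the longest overlap downwards means the most specific match is preferred, and raising an error when no overlap is found avoids silently joining unrelated fragments.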

These stand-alone, web-based tools allow users on any operating system to access the tools they require from any location with an internet connection, without needing to learn a new bioinformatics software suite or programme and without having to install any software on their computer. The appropriate tool is simply used as and when required. The tools are available online at no cost and do not require extensive computer skills or training to use. Because standard file formats are used, data can easily be processed by a mixture of online tools and other software packages. Using specific tools, each designed to perform a single task, means that workflows can be partitioned into logical units and that processes or analyses can be easily repeated.
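The idea of chaining small, single-task steps through a standard file format can be sketched as follows. This is an illustrative example only, not code from the HVDRU tools: it assumes FASTA-formatted input containing sequences that are already aligned, and the function names parse_fasta and nucleotide_distribution, as well as the example sequences, are hypothetical.

```python
from collections import Counter


def parse_fasta(text: str) -> dict[str, str]:
    """Step 1: read FASTA text into a mapping of sequence name to sequence."""
    records: dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header line as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def nucleotide_distribution(records: dict[str, str], position: int) -> Counter:
    """Step 2: count the bases found at a 1-based column of aligned sequences."""
    return Counter(
        seq[position - 1] for seq in records.values() if len(seq) >= position
    )


# Made-up aligned sequences, for illustration only.
example = """>seq1 hypothetical example
ACGTA
>seq2
ACGTC
>seq3
ACTTA
"""

records = parse_fasta(example)
print(nucleotide_distribution(records, 3))
```

Because each step reads or produces a simple, standard representation, either one could be swapped for an online tool or another software package without changing the rest of the workflow.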

The tools are available online on the HVDRU server at the following addresses:
http://hvdr.bioinf.wits.ac.za/tools
http://hvdr.bioinf.wits.ac.za/SmallGenomeTools
The source code for some of the tools is released under the GPL version 2 and is available online via GitHub, at the following address:
https://github.com/DrTrevorBell/SmallGenomeTools

The tools are described in the following papers:
1. Bell TG, Kramvis A (2015). Bioinformatics tools for small genomes, such as hepatitis B virus. Viruses, 7(2):781-97.
2. Bell TG, Kramvis A (2013). Fragment merger: an online tool to merge overlapping long sequence fragments. Viruses, 5(3):824-33.
3. Bell TG, Kramvis A (2013). Mutation Reporter Tool: an online tool to interrogate loci of interest, with its utility demonstrated using hepatitis B virus. Virology Journal, 10:62.
4. Yousif M, Bell TG, Mudawi H, Glebe D, Kramvis A (2014). Analysis of ultra-deep pyrosequencing and cloning based sequencing of the basic core promoter/precore/core region of hepatitis B virus using newly developed bioinformatics tools. PLOS ONE, 9(4):e95377.

Story by: Dr Trevor Bell, Wits Health Science Research News July 2015