Sight application examples

Sight examples are available as a separate "Sight application examples" package. If you cannot open a downloaded application, your Sight version may be too old. The required version is given as the release number on the application examples page.

Single file reader and access to polymorphic resources. Reads the query (a local HTML file) and logs the sequences found in FASTA format and in #-delimited format. The latter can easily be imported into most databases. The workflow automatically recognises links to the UniProt, Tigr, Tair, FlyBase, Mgi and DictyBase databases and chooses the appropriate agent to retrieve each sequence. This workflow works with version 3.2.0 and higher only! It demonstrates how to use the RecordReader to work with a pre-saved Go query and how to use filters to choose the suitable branch. The workflow .tjar file is packed together with a sample query in a single .zip archive.
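The two log formats mentioned above can be illustrated with a minimal Python sketch. This is not the code of the Sight workflow: the function names, the line width and the field order of the #-delimited line (identifier, then sequence) are assumptions made for illustration only.

```python
def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA text.

    The 60-character line width is a common convention, assumed here.
    """
    lines = []
    for identifier, sequence in records:
        lines.append(">" + identifier)
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "".join(line + "\n" for line in lines)


def to_hash_delimited(records):
    """Write one record per line with fields separated by '#'.

    The field order (identifier#sequence) is a hypothetical layout.
    """
    return "".join(f"{identifier}#{sequence}\n" for identifier, sequence in records)


if __name__ == "__main__":
    sample = [("P12345", "MKTAYIAKQR"), ("Q67890", "MSDNELLK")]
    print(to_fasta(sample), end="")
    print(to_hash_delimited(sample), end="")
```

A one-record-per-line layout with a single delimiter is what makes the second format convenient for bulk import into a database table.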

Genome_walker.tjar. Analyses the given DNA fragment or fragments, representing a genome fragment or a small (bacterial or viral) genome. The input file can contain one or several sequences in FASTA format (>header (new line) DNA). For the first test, use the provided human genome DNA fragment.
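As an illustration of the expected input layout, the following Python sketch reads one or several FASTA records from a text string. It is independent of Sight; the function name parse_fasta is hypothetical.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA text.

    A record starts at a line beginning with '>'; all following lines up to
    the next '>' are joined into one sequence. Blank lines are ignored, as is
    any text that appears before the first header.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    example = ">fragment 1\nACGTAC\nGGTT\n>fragment 2\nTTGACA\n"
    for name, dna in parse_fasta(example):
        print(name, len(dna))
```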

Protein_explorer.tjar. Performs various analyses on the protein sequence corresponding to the given NCBI accession number. It can also take some RNA sequences if the corresponding XML entry contains a protein translation. For a test, use the provided list of identifiers in NCBI_list.txt.

Blast_blast_blast.tjar. Re-submits the received BLAST hits to the BLAST similarity search again. The iteration is repeated 3 times. This example investigates the problems and possibilities of this way of finding similarity between proteins; it can also be useful for understanding BLAST limitations.
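The iteration scheme can be sketched in Python as follows. The search argument is a hypothetical stand-in for a BLAST request (any callable that maps an identifier to a list of hit identifiers); no real BLAST API is used. Skipping hits that were already seen is an assumption of this sketch, not a documented property of the workflow.

```python
def iterate_search(query_ids, search, rounds=3):
    """Re-submit the hits of each round as the queries of the next round.

    Returns one list of newly found identifiers per round.
    """
    seen = set(query_ids)
    frontier = list(query_ids)
    hits_per_round = []
    for _ in range(rounds):
        next_frontier = []
        for identifier in frontier:
            for hit in search(identifier):
                if hit not in seen:  # assumption: do not re-submit known hits
                    seen.add(hit)
                    next_frontier.append(hit)
        hits_per_round.append(next_frontier)
        frontier = next_frontier
    return hits_per_round


if __name__ == "__main__":
    # Toy similarity graph standing in for real search results.
    toy_hits = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": ["e"]}
    print(iterate_search(["a"], lambda identifier: toy_hits.get(identifier, [])))
```

Recording the hits per round makes it visible how far the result set drifts from the original query with each repetition, which is the limitation this example is meant to explore.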

Specific_range.tjar. Explains how to use the subsequence selector. Scans the 200 amino acids upstream of the third transmembrane helix for PROSITE hits and NCBI conserved domains. The input is a list of NCBI accession numbers of the protein sequences. As this is a public example, we used simple and fast-responding servers.
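The range selection itself amounts to simple index arithmetic, sketched below in Python. The helix positions are assumed to come from some transmembrane predictor as 0-based (start, end) pairs; the function name and this coordinate convention are hypothetical and do not describe the Sight subsequence selector.

```python
def upstream_of_helix(sequence, helices, helix_number=3, length=200):
    """Return up to `length` residues immediately before the chosen helix.

    `helices` is a list of (start, end) pairs with 0-based start positions,
    ordered along the sequence. If fewer than `length` residues precede the
    helix, the shorter available region is returned. Returns None when the
    protein has fewer helices than `helix_number`.
    """
    if len(helices) < helix_number:
        return None
    helix_start = helices[helix_number - 1][0]
    return sequence[max(0, helix_start - length):helix_start]


if __name__ == "__main__":
    protein = "M" * 50 + "L" * 20 + "K" * 30 + "L" * 20 + "S" * 150 + "L" * 20
    predicted = [(50, 70), (100, 120), (270, 290)]  # made-up helix positions
    region = upstream_of_helix(protein, predicted)
    print(len(region))
```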

SwissToFastaConverter.tjar. Explains how to use and generate flat files. This very simple application converts between two flat files containing multiple sequences, one in SwissProt format and one in FASTA format.
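For readers unfamiliar with the two layouts, here is a minimal Python sketch of a SwissProt-to-FASTA conversion. It relies only on the general structure of the SwissProt flat file (an ID line, an SQ line followed by the sequence lines, and // as the entry terminator) and is not the code used inside the .tjar. Using the entry name from the ID line as the FASTA header is an assumption.

```python
def swiss_to_fasta(text):
    """Convert SwissProt flat-file text with one or more entries to FASTA."""
    output = []
    entry_name, in_sequence, residues = None, False, []
    for line in text.splitlines():
        if line.startswith("ID "):
            entry_name = line.split()[1]
        elif line.startswith("SQ "):
            in_sequence = True          # sequence lines follow the SQ line
        elif line.startswith("//"):     # end of the current entry
            if entry_name is not None:
                sequence = "".join(residues)
                output.append(">" + entry_name + "\n" + sequence + "\n")
            entry_name, in_sequence, residues = None, False, []
        elif in_sequence:
            # Sequence lines are indented and split into space-separated blocks.
            residues.append("".join(line.split()))
    return "".join(output)


if __name__ == "__main__":
    sample = (
        "ID   TEST1_HUMAN   Reviewed;   20 AA.\n"
        "AC   P00001;\n"
        "SQ   SEQUENCE   20 AA;\n"
        "     MKTAYIAKQR QISFVKSHFS\n"
        "//\n"
    )
    print(swiss_to_fasta(sample), end="")
```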

Redundancy_reductor.tjar. Redundancy reductor. The input is expected to be a set of sequences, some of them rather similar. Using the built-in similarity search tool, a subset of more dissimilar sequences is extracted. The file Log/subset.fasta is used as the search database; each sequence found to be non-similar to its current content is then added to this file.
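The growing-subset procedure described above can be sketched in Python. The similarity measure here is difflib.SequenceMatcher from the standard library, used purely as a stand-in for the built-in similarity search tool of Sight; the 0.8 threshold and the in-memory list (instead of Log/subset.fasta) are likewise assumptions.

```python
from difflib import SequenceMatcher


def similarity(first, second):
    """Similarity ratio between 0 and 1 (stand-in for a real search tool)."""
    return SequenceMatcher(None, first, second, autojunk=False).ratio()


def reduce_redundancy(records, threshold=0.8):
    """Greedily keep sequences that are not similar to any kept sequence.

    `records` is a list of (identifier, sequence) pairs. Each sequence is
    compared against the subset collected so far and added only if every
    comparison stays below the threshold, so the result depends on input order.
    """
    subset = []
    for identifier, sequence in records:
        if all(similarity(sequence, kept) < threshold for _, kept in subset):
            subset.append((identifier, sequence))
    return subset


if __name__ == "__main__":
    sample = [
        ("a", "MKTAYIAKQRQISFVKSHFS"),
        ("b", "MKTAYIAKQRQISFVKSHFA"),  # differs from "a" in one residue
        ("c", "GGGGGPPPPPWWWWWCCCCC"),
    ]
    print([identifier for identifier, _ in reduce_redundancy(sample)])
```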

PrepareTrainingSet.tjar. Prepares a training set for a classifier used to detect chloroplast proteins. The input is expected to be a set of sequences, some of them rather similar. The initial sequences are checked for being valid protein sequences not shorter than 200 amino acids. Using the built-in similarity search tool, a subset of more dissimilar sequences is then extracted. The file Log/subset.fasta is used as the search database; each sequence found to be non-similar to its current content is then added to this file. Only the first 200 amino acids participate in this search. The resulting set is then split into chloroplast and non-chloroplast sequences by searching for the corresponding tags in the SwissProt record. The final sets are written in complete SwissProt form and in FASTA form (not truncated).
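The validity check and the final split can be illustrated with a short Python sketch. It is not the workflow's implementation: the accepted alphabet (the 20 standard amino acid letters), the tag text "chloroplast" matched case-insensitively, and the (identifier, annotation, sequence) tuple layout are all assumptions.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # assumed alphabet: 20 standard residues


def is_valid_protein(sequence, min_length=200):
    """True if the sequence is long enough and uses only standard residues."""
    return len(sequence) >= min_length and set(sequence.upper()) <= AMINO_ACIDS


def split_by_tag(entries, tag="chloroplast"):
    """Split entries into (tagged, untagged) by a case-insensitive tag search.

    `entries` is a list of (identifier, annotation, sequence) tuples, where
    `annotation` holds the record text in which the tag is searched.
    """
    tagged, untagged = [], []
    for entry in entries:
        annotation = entry[1]
        target = tagged if tag.lower() in annotation.lower() else untagged
        target.append(entry)
    return tagged, untagged


if __name__ == "__main__":
    entries = [
        ("X1", "Subcellular location: Plastid, chloroplast.", "A" * 250),
        ("X2", "Subcellular location: Cytoplasm.", "G" * 250),
        ("X3", "Subcellular location: Chloroplast.", "A" * 50),  # too short
    ]
    valid = [entry for entry in entries if is_valid_protein(entry[2])]
    chloroplast, other = split_by_tag(valid)
    print(len(chloroplast), len(other))
```

Truncation to the first 200 amino acids would apply only to the similarity search step; as in the description above, the sequences kept in the final sets stay untruncated.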