[Biodevelopers] Blast - exact same sequence only gives 98%

Noel noel.faux at med.monash.edu.au
Sat Mar 24 04:12:15 EDT 2007


Hi Phil,

The X's show that the query sequence has been put through the
low-complexity filter 'SEG', which masks low-complexity regions by
replacing them with X's.  Such regions can introduce false positives
into the results, so the filter is on by default.  Masked residues
cannot be counted as identities, which is why the score drops below
100%.  Assuming you're using the command-line version of BLAST, you
can turn the filter off with the -F F option.
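
As a rough sketch of the -F F suggestion, here is one way the call
could be put together from Python.  It assumes the legacy blastall
program with blastp (the alignment in the question looks like
protein); the database and file names are made up for illustration
and are not from the original post.

```python
import shutil
import subprocess


def build_blastall_cmd(db, query, out, low_complexity_filter=False):
    """Build a legacy blastall command line as a list of arguments."""
    return [
        "blastall",
        "-p", "blastp",  # assumption: protein query against a protein database
        "-d", db,        # database name, as produced by formatdb
        "-i", query,     # FASTA file holding the query sequence(s)
        "-o", out,       # file the report is written to
        "-F", "T" if low_complexity_filter else "F",  # "F" switches SEG masking off
    ]


def run_blast(cmd):
    """Run the command if blastall is on PATH; otherwise only print it."""
    if shutil.which(cmd[0]) is None:
        print("blastall not found; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)


# Hypothetical names, for illustration only.
cmd = build_blastall_cmd("genes_db", "query.fasta", "result.txt")
print(" ".join(cmd))
```

Calling run_blast(cmd) would start the search.  With the filter off,
a sequence taken from the database should align to itself without
the masked stretch.  If you are on the newer BLAST+ programs instead,
the blastp option for this is, as far as I know, -seg no rather
than -F F.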

Hope this helps
Noel

Phil Princely wrote:
> Hi all,
>
> I'm new here, so sorry if this is a bit of an obvious question. I've
> been using blast for a while now, but am still learning. Here's my
> problem:
>
> I used formatdb to make a blast database from a text file with about
> 2000 genes. Everything went well, and I could query the database. But
> when I input a sequence from the original text file, the result isn't
> always 100%. Sometimes it comes out 98% or 95%, when it should always
> be 100%. When I look at the results, I find one or more series of xs,
> signifying a missing part of data. For example:
>
> Score = 1815 bits (4702), Expect = 0.0
> Identities = 913/953 (95%), Positives = 913/953 (95%)
>
> LTLDRLSNTLSGGESQRISLATQXXXXXXXXXXXXDEPSIGLHQ (Query)
> LTLDRLSNTLSGGESQRISLATQ            DEPSIGLHQ
> LTLDRLSNTLSGGESQRISLATQLGSSLVGSLYVLDEPSIGLHQ (Subject)
>
> Is there a way to make this 100%? I want to run the 2000 genes against
> another genome to find 100% similar regions, 95% similar regions and
> so on.
>
> Thanks
>
> Phil P
> _______________________________________________
> Biodevelopers mailing list
> Biodevelopers at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biodevelopers


