[Bioclusters] Java Vs C++(Qt) for Bioinformatics
Tim Cutts
tjrc at sanger.ac.uk
Thu May 24 08:19:57 EDT 2007
On 23 May 2007, at 11:50 pm, Mr. Syed Aijaz wrote:
> Hello All,
>
> Just wondering what the bioinformatics community thinks is best to use:
> 1. Java Swing (1.6+)
> 2. C++ Qt (4.0)
>
> My visualization tool requires accessing data which is in the order of
> a few hundred MB; we are expecting this to hit GBs soon. I am planning
> not to hold all the data in memory. However, I will have to hold some
> data (a few hundred thousand (O(100,000)) data entities, each costing
> around 60 bytes). As the tool is supposed to be interactive, what will
> be a good alternative between Java and C++? I am leaning towards Java,
> reason being:
> 1. Comprehensive GUI
I've never done a comparison of Java and Qt in this regard - you
might well be right.
> 2. Java not that Slow, as they say!
Java aficionados keep saying that. Doesn't make it true though.
It's still a lot slower than natively compiled code. Allegedly getting
better, but still slow.
> 3. Huge API, DBMS, XML, DRMAA, . . . . .
Can't argue with that, although there are of course vast numbers of C
libraries as well which you can call from C++ programs which have the
same functionality, so it isn't a cut-and-dried case of Java having
something C(++) does not. The C API to MySQL is quite easy to use,
for example, and there's the Expat XML library.
> 4. No deployment pain, although a little application
> specific deployment may be required example: preference files etc
This is true in theory, and was always Java's great promise, but
there are of course all those niggling little differences between JVM
implementations.
> 5. Automated Garbage collection, less trouble in maintaining memory.
> Although it has a little overhead, it can be reduced by efficient
> handling of data???
And would have to be. The downside of Java's convenient memory model
is partly that it uses a lot more memory, and also that because you
have no control over memory layout, you're likely to have many more
cache misses than with well-written compiled code. This will hurt
performance. This problem will, in my view, get worse as we move to
more massively multi-core processors, where the cache misses on all
those cores will be competing with each other for access to main
memory.
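
To make the memory-layout point concrete, here is a minimal Java sketch
(not from the original thread). The three fields (start, end, score) and
all class names are assumptions chosen purely for illustration. An array
of objects stores references to separately allocated objects, each with
its own header, whereas parallel primitive arrays keep each field in one
contiguous block, which tends to be friendlier to the cache during a
sequential scan. Whether the difference matters for a given tool would
need measuring.

```java
import java.util.Random;

// Hypothetical entity; the fields are illustrative assumptions.
final class EntityObject {
    final int start;
    final int end;
    final float score;

    EntityObject(int start, int end, float score) {
        this.start = start;
        this.end = end;
        this.score = score;
    }
}

// The same data held as parallel primitive arrays: one contiguous
// block per field and no per-entity object header.
final class EntityColumns {
    final int[] start;
    final int[] end;
    final float[] score;

    EntityColumns(int n) {
        start = new int[n];
        end = new int[n];
        score = new float[n];
    }

    void set(int i, int s, int e, float sc) {
        start[i] = s;
        end[i] = e;
        score[i] = sc;
    }

    // Sequential scan over two contiguous int arrays.
    long totalLength() {
        long total = 0;
        for (int i = 0; i < start.length; i++) {
            total += end[i] - start[i];
        }
        return total;
    }
}

final class LayoutDemo {
    // Same computation over objects: each element is a reference that
    // must be followed to wherever the object lives on the heap.
    static long totalLength(EntityObject[] entities) {
        long total = 0;
        for (EntityObject e : entities) {
            total += e.end - e.start;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 300_000; // "a few hundred thousand" entities
        Random rng = new Random(42);
        EntityObject[] objects = new EntityObject[n];
        EntityColumns columns = new EntityColumns(n);
        for (int i = 0; i < n; i++) {
            int s = rng.nextInt(1_000_000);
            int e = s + rng.nextInt(1_000);
            float sc = rng.nextFloat();
            objects[i] = new EntityObject(s, e, sc);
            columns.set(i, s, e, sc);
        }
        // Both layouts hold the same data, so the totals must agree.
        System.out.println(totalLength(objects) == columns.totalLength());
    }
}
```

The column layout gives up some convenience (no per-entity object to
pass around) in exchange for predictable, compact storage.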
> 6. efficient multi threading, not system level fork, etc??????
I'm not sure how efficient it actually is. When I have watched
multithreaded Java applications run, the program seemed to waste an
inordinate amount of time just deciding which thread to run next, and
not actually doing anything useful. How much of this was down to
programmer error in the implementation, though, I don't know.
Probably quite a lot. A brief glance at the code in question seemed
to show a surfeit of synchronisation requests, most of which weren't
necessary.
Most operating systems have lightweight threads, which are much
cheaper than fork(), and those will usually be what the JVM is using
anyway, so a C++ application using pthreads should be at least as
efficient as the JVM running on the same architecture.
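
As an illustration of avoiding unnecessary synchronisation, here is a
minimal Java sketch (an assumption of mine, not the code I was looking
at). Each task sums its own slice of an array into a local variable, so
no locking is needed while the work is done; the partial results are
combined once at the end. The class and method names are hypothetical,
and the lambda syntax needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PartitionedSum {
    // Sums data using the given number of worker threads (must be >= 1).
    static long sum(int[] data, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Ceiling division so every element falls in some slice.
            int chunk = (data.length + threads - 1) / threads;
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(data.length, t * chunk);
                final int to = Math.min(data.length, from + chunk);
                Callable<Long> task = () -> {
                    // Thread-local accumulator: no shared state, no lock.
                    long local = 0;
                    for (int i = from; i < to; i++) {
                        local += data[i];
                    }
                    return local;
                };
                parts.add(pool.submit(task));
            }
            // The only coordination point: collect each partial result.
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while summing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("worker task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 10;
        }
        System.out.println(sum(data, 4));
    }
}
```

The alternative of having every thread update one shared counter inside
a synchronized block would give the same answer, but the threads would
spend much of their time queueing for the lock.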
> 7. Java has growing number of Bioinformatics applications
Just because everyone else is doing it doesn't make it right! It's
just trendy. Java has its place. GUIs are one thing it is quite
good at. But I don't think it's a good choice for handling large
data sets, because of its memory inefficiency, or for very
CPU-intensive code, because of its speed.
Of course, one thing you might consider is writing the actual data
processing parts of your classes as Java native methods, written in
C. They should then be fast, but at the expense of making the
program harder to deploy. Keep Java for what it is good at (GUIs,
database connectivity, etc) and use a lower level language for the
actual data processing, I say.
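
A minimal sketch of what the Java side of such a split might look
like, assuming a hypothetical native library called "entitystats" that
you would write and build yourself in C. For a class in the default
package, JNI expects the C function to be named
Java_EntityStats_sumNative. The pure-Java fallback is there only so the
sketch still runs when the library has not been built.

```java
final class EntityStats {
    // True only if the (hypothetical) native library could be loaded.
    private static final boolean NATIVE_AVAILABLE = tryLoad();

    private static boolean tryLoad() {
        try {
            // "entitystats" is an assumed name, resolved by the JVM to
            // libentitystats.so, entitystats.dll, etc. on the library path.
            System.loadLibrary("entitystats");
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    // Declared here, implemented in C. The matching C signature would be:
    //   JNIEXPORT jlong JNICALL Java_EntityStats_sumNative(
    //       JNIEnv *env, jclass cls, jintArray values);
    private static native long sumNative(int[] values);

    // Pure-Java equivalent, used when the native library is missing.
    static long sumJava(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Callers use this and never need to know which path was taken.
    static long sum(int[] values) {
        return NATIVE_AVAILABLE ? sumNative(values) : sumJava(values);
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));
    }
}
```

The deployment cost mentioned above shows up here directly: the native
library has to be compiled and shipped for every platform you support.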
Personally, my favourite GUI development framework currently is Cocoa,
but of course it isn't cross-platform, so it is a bit of a non-starter
for this discussion.
Anyway, that's my 2¢, from an admittedly somewhat anti-Java biased
standpoint. I don't actually hate Java as much as it sounds, I just
hate seeing it used as a Swiss Army knife for any job (in much the
same way that Perl was abused 5-10 years ago).
Tim