[ViewVC] Log of: owl/trunk/tools/DataDistributer.java

Current path doesn't exist after revision 950.

Revision 93
Modified Wed May 24 15:44:25 2006 UTC (18 years, 5 months ago) by duarte
File length: 23810 byte(s)
Diff to previous 89

Splitting of data now also works with text-based keys as well as numerical ones.
MySQLConnection:
- method getAllIds4KeyAndTable is now split into two methods, one for numerical ids and another for text ids
- new methods getColumnType and isKeyNumerical
DataDistribution:
- method getIdSetsFromNodes is split into two, one for numerical ids and one for text ids
DataDistributer:
- new methods: splitIdsIntoSets is now split into two methods, one numerical and one text
- changed methods: splitTableToCluster, splitTable, insertIdsToKeyMaster, removePK, addPK, createNewKeyMasterTbl, removeZeros, loadSplitData, dumpSplitData, to make them work for both text and numeric keys. Introduced generic type T in some of them
- some bugs corrected:
-- an important one in createNewKeyMasterTbl, which was inserting the record in dbs_keys with srcDb instead of destDb as it should have been
-- some bugs in loadSplitData and dumpSplitData in cases where there are fewer ids than nodes and thus some nodes don't get any data. This case was not accounted for before.
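
The idea of splitting ids with a generic type T, including the case where there are fewer ids than nodes, can be sketched as follows. This is only an illustration: the class name IdSplitter, the method signature and the chunking rule (contiguous, near-equal chunks) are assumptions and are not taken from the DataDistributer source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not the actual DataDistributer code. */
final class IdSplitter {

    /**
     * Splits an ordered list of ids into exactly numNodes contiguous sets.
     * T is generic, so the same code serves numerical and text keys.
     * When there are fewer ids than nodes, the trailing sets are empty,
     * which callers must be prepared to handle.
     */
    static <T> List<List<T>> splitIdsIntoSets(List<T> ids, int numNodes) {
        if (numNodes <= 0) {
            throw new IllegalArgumentException("numNodes must be positive");
        }
        List<List<T>> sets = new ArrayList<>();
        int base = ids.size() / numNodes;      // minimum ids per node
        int remainder = ids.size() % numNodes; // first 'remainder' nodes get one extra id
        int start = 0;
        for (int i = 0; i < numNodes; i++) {
            int size = base + (i < remainder ? 1 : 0);
            sets.add(new ArrayList<>(ids.subList(start, start + size)));
            start += size;
        }
        return sets;
    }

    public static void main(String[] args) {
        // Numerical keys spread over 2 nodes
        System.out.println(splitIdsIntoSets(Arrays.asList(1, 2, 3, 4, 5), 2));
        // Text keys with fewer ids than nodes: the last node gets an empty set
        System.out.println(splitIdsIntoSets(Arrays.asList("a", "b"), 3));
    }
}
```

A caller that loads or dumps per node would skip any node whose set is empty, which is the situation the bug fixes above describe.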

Revision 89
Modified Fri May 5 10:46:35 2006 UTC (18 years, 5 months ago) by duarte
File length: 18523 byte(s)
Diff to previous 87

Added setDumpDir method

Revision 87
Modified Wed May 3 08:50:26 2006 UTC (18 years, 5 months ago) by duarte
File length: 18453 byte(s)
Diff to previous 86

Considerably improved the splitTableToCluster method:
- got rid of the unnecessary step of creating partial tables before dumping.
- now directly dumping with new method dumpSplitData, a modified dumpData that dumps using a WHERE condition
- added variable NUM_CONCURRENT_SAMEHOST_WRITE_QUERIES used in dumpSplitData method. It sets the concurrency when dumping locally only from the master
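
Dumping one slice of a table directly with a WHERE condition, instead of first creating a partial table, might look like the sketch below. It assumes numerical keys and MySQL's SELECT ... INTO OUTFILE statement. The class name SplitDumpQuery, the parameters and the example table, column and path are hypothetical, and the real dumpSplitData may build its query differently.

```java
/** Hypothetical query builder, not the actual dumpSplitData code. */
final class SplitDumpQuery {

    /**
     * Builds a MySQL statement that dumps only the rows whose key lies in
     * [minId, maxId] straight to a file, so no intermediate table is needed.
     * Identifiers and the path are concatenated as-is here for brevity;
     * they are assumed to come from trusted configuration.
     */
    static String build(String table, String keyColumn, long minId, long maxId, String outFile) {
        return "SELECT * FROM " + table
                + " WHERE " + keyColumn + " >= " + minId
                + " AND " + keyColumn + " <= " + maxId
                + " INTO OUTFILE '" + outFile + "'";
    }

    public static void main(String[] args) {
        // Example values are made up for illustration
        System.out.println(build("my_table", "id", 1, 500, "/tmp/dump/my_table_1.txt"));
    }
}
```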

Revision 86
Modified Fri Apr 28 10:31:35 2006 UTC (18 years, 5 months ago) by duarte
File length: 17940 byte(s)
Diff to previous 85

Added PARALLELISM in load/dump of tables using the new class QueryThread (extends Thread)
Modified methods loadData, dumpData and loadSplitData to dump/load in parallel, where that is useful, by using the QueryThread class.
New method initializeDirs(String[]) to do some of the dir initialization that was in dumpData
Got rid of one of the getConnectionToNode methods, not needed anymore
Two important new final static variables: NUM_CONCURRENT_READ_QUERIES and NUM_CONCURRENT_WRITE_QUERIES. They define how much concurrency we want in reads/writes to nfs for loads/dumps
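
One way to cap the number of load/dump queries running at once with a Thread subclass is sketched below. It is a minimal illustration under assumptions: the use of a Semaphore, the class name BoundedQueryRunner, the runAll helper and the value given to NUM_CONCURRENT_READ_QUERIES are all hypothetical, and the real QueryThread may limit concurrency in another way (for example by starting threads in batches).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

/** Hypothetical runner, not the actual DataDistributer code. */
final class BoundedQueryRunner {

    // Illustrative value only; the real constant's value is not given in the log
    static final int NUM_CONCURRENT_READ_QUERIES = 4;

    /** Stand-in for the QueryThread class named in the log. */
    static final class QueryThread extends Thread {
        private final Semaphore permits;
        private final Runnable query;

        QueryThread(Semaphore permits, Runnable query) {
            this.permits = permits;
            this.query = query;
        }

        @Override
        public void run() {
            try {
                permits.acquire(); // wait until a concurrency slot is free
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            try {
                query.run(); // the actual load or dump would happen here
            } finally {
                permits.release();
            }
        }
    }

    /** Starts one thread per query but lets at most maxConcurrent run at a time. */
    static void runAll(List<Runnable> queries, int maxConcurrent) {
        Semaphore permits = new Semaphore(maxConcurrent);
        List<QueryThread> threads = new ArrayList<>();
        for (Runnable q : queries) {
            QueryThread t = new QueryThread(permits, q);
            threads.add(t);
            t.start();
        }
        for (QueryThread t : threads) {
            try {
                t.join(); // wait for every query to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        List<Runnable> queries = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            final int n = i;
            queries.add(() -> System.out.println("query " + n + " done"));
        }
        runAll(queries, NUM_CONCURRENT_READ_QUERIES);
    }
}
```

Keeping the limit in a single constant makes it easy to tune how hard the shared nfs directory is hit during loads and dumps.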

Revision 85
Added Mon Apr 24 12:41:27 2006 UTC (18 years, 6 months ago) by duarte
File length: 16265 byte(s)

MAJOR change.
Split DataDistribution into 2 classes: DataDistributer and DataDistribution.
I haven't actually changed or added functionality
DataDistributer deals with the distribution of the data, while DataDistribution deals with things to do once the data is already distributed; right now that is only a few data checks
Note that DataDistributer now has two db fields: srcDb and destDb. This differs from before, when destDb was a parameter passed as an argument to the methods
Methods in DataDistributer have been tidied up a little (especially the load and dump ones)
