The LECB 2-D PAGE Gel Images Data Sets
The LECB 2-D PAGE gel images database is available for public
use. It contains data sets from four types of experiments with over 300
GIF images with annotation and landmark data in HTML, tab-delimited and
XML formats. It can serve as a source of sample images of several types of
biological material, as test data for 2D gel analysis software development, and
for comparison with other similar samples. PAGE is polyacrylamide gel
electrophoresis. The LECB was the U.S. National Cancer Institute's Laboratory
of Experimental and Computational Biology. Since this work was done, LECB
has been reorganized as the
CCR Nanobiology Program. The database is available at two Web
sites (bioinformatics.org is the mirror site):
The database consists of four 2D gel image data sets previously
analyzed with the GELLAB-II system (see also
GELLAB-II history). These data consist of over 300 gel images
(including some replicate samples and some replicate scans) and are
summarized below. The data set experiment conditions are documented
in the associated literature references. The four data sets are:
- Human leukemias (Eric Lester, Peter Lemkin)
- HL-60 cell lines (Eric Lester, Peter Lemkin)
- MOLT-4 cells (Eric Lester, Peter Lemkin)
- Fetal alcohol syndrome (FAS) - serum (James Myrick, Mary Robinson, Peter Lemkin)
The data sets are described in the papers associated with each data
set. The case or control samples could be used for comparison with
your 2D PAGE gel samples or as test data for 2D gel analysis
software.
These gel data could be used with the Open2Dprot project
(open2dprot.sourceforge.net) software or Flicker gel
comparison program (open2dprot.sourceforge.net/Flicker) as well
as other 2D gel analysis software such as ImageJ, Photoshop, GIMP,
etc.
These data are released for public use under the
conditions listed below (Section 3).
You may view the data directly from the Web site by clicking on
hyperlinks in the following (accession and landmark) HTML files to pop
up static images in your Web browser (see below). Alternatively, you could
compare the images dynamically using
the Flicker program on downloaded data sets.
1. Distribution documentation
The annotation documentation on the diagnosis and experimental
conditions as well as the running of the gels is described in the
referenced papers, when available. Lists of gels for each project
summarize these conditions and diagnoses in
tab-delimited spreadsheets for reading into various software
packages. The accession data also includes a computing window for a
valid region of interest in the gel [cwx1:cwx2, cwy1:cwy2]. The gel
scan data grayscale was calibrated with a neutral density calibration
step wedge. The optical density (OD) values and corresponding gray values are
given in the accession data table for each gel. This lets you compute
integrated density rather than integrated grayscale (the former being
more accurate). All coordinates are given in a raster coordinate
system with the upper left hand corner being (0,0). For browsing of
this database we also include tab-delimited data, XML data, and HTML
web pages with links to the images for looking at specific gel
images. These are the accession.tbl, accession.xml and
accession.html files.
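The calibration described above can be sketched as follows. This is a minimal illustration, not the GELLAB-II implementation: it assumes the wedge is available as (gray value, OD) pairs, that the image is a list of pixel rows indexed as image[y][x] with (0,0) at the upper left, and that the window bounds are inclusive. The function names and the sample values are hypothetical, and no background subtraction is attempted.

```python
from bisect import bisect_right


def gray_to_od(gray, wedge):
    """Piecewise-linear interpolation of OD for one gray value.

    wedge: list of (gray_value, od_value) calibration pairs (assumed layout).
    Gray values outside the calibrated range are clamped to the end points.
    """
    pts = sorted(wedge)
    grays = [g for g, _ in pts]
    if gray <= grays[0]:
        return pts[0][1]
    if gray >= grays[-1]:
        return pts[-1][1]
    i = bisect_right(grays, gray)  # grays[i-1] <= gray < grays[i]
    (g0, d0), (g1, d1) = pts[i - 1], pts[i]
    return d0 + (d1 - d0) * (gray - g0) / (g1 - g0)


def integrated_density(image, wedge, window):
    """Sum OD over the rectangle [cwx1:cwx2, cwy1:cwy2].

    image: list of rows of 8-bit gray values, image[y][x], origin upper left.
    window: (cwx1, cwx2, cwy1, cwy2), treated as inclusive (assumption).
    """
    cwx1, cwx2, cwy1, cwy2 = window
    lut = [gray_to_od(g, wedge) for g in range(256)]  # one lookup per gray level
    return sum(
        lut[image[y][x]]
        for y in range(cwy1, cwy2 + 1)
        for x in range(cwx1, cwx2 + 1)
    )


# Made-up calibration wedge and a tiny 3x3 "image" for illustration only.
wedge = [(0, 0.0), (100, 1.0), (200, 2.0)]
image = [[0, 100, 200], [50, 50, 50], [0, 0, 0]]
total_od = integrated_density(image, wedge, (0, 1, 0, 1))
```

The same function can be applied to a smaller rectangle around a single spot; the computing window simply keeps the sum inside the valid region of the gel.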
Manually landmarked gel data is also made available. Landmarks are
useful in aligning gels in spot pairing software. These are the
landmark.tbl, landmark.xml, and landmark.html
files. These are vertically stacked spreadsheets. A landmark is a
spot position in a reference gel that corresponds to the putatively
same spot in another gel - typically matched by flickering the gels to
find corresponding local regions. By pairing N-1 gels to a reference
gel in a N gel database, it is possible to pair spots back to the
reference gel and to build corresponding spot expression lists for
analysis. In this data, there are instances where a particular
landmark spot is missing from one of the non-landmark gels. This is to
be expected in real-world data. In such cases, the position was estimated by visually
aligning neighboring spots in a local region around the landmark in
the reference gel with the local region in the other gel. The landmark
coordinates should be on the spot since it is assumed that spot
matching software will latch onto the actual spot centroid from the
manually specified coordinates.
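As an illustration of how software might latch onto a spot from a manually specified landmark, the sketch below computes a density-weighted centroid in a small neighborhood of the landmark. It is only one possible approach and is not taken from GELLAB-II or Flicker. It assumes the image has already been converted so that larger values mean more protein (for example OD values), and the function name and the radius parameter are hypothetical.

```python
def refine_landmark(od, x, y, radius=3):
    """Return the density-weighted centroid near the landmark (x, y).

    od: list of rows of non-negative density values, od[y][x],
        with (0, 0) at the upper left corner.
    radius: half-width of the square search neighborhood (assumed parameter).
    """
    height, width = len(od), len(od[0])
    total = sum_x = sum_y = 0.0
    # Clip the neighborhood to the image bounds.
    for yy in range(max(0, y - radius), min(height, y + radius + 1)):
        for xx in range(max(0, x - radius), min(width, x + radius + 1)):
            value = od[yy][xx]
            total += value
            sum_x += value * xx
            sum_y += value * yy
    if total == 0:
        # No density nearby (e.g. a missing spot): keep the manual estimate.
        return float(x), float(y)
    return sum_x / total, sum_y / total


# Tiny made-up example: a single dense pixel at x=3, y=2 in a 5x5 field.
field = [[0.0] * 5 for _ in range(5)]
field[2][3] = 1.0
refined = refine_landmark(field, 2, 2, radius=2)
```

When the landmark spot is missing from a gel, the neighborhood may contain little or no density, which is why the sketch falls back to the manually estimated position.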
To simplify reading all of the data contained in accession.tbl and
landmark.tbl, we also generated publish.tbl and
publish.xml files, which are the relational join of the other two
files. They are intended primarily for programs (such as the R program)
that read all data about the data set for further analysis.
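To make the idea of a relational join concrete, here is a minimal sketch that joins two tab-delimited tables on a shared gel identifier. The column names (gel, diagnosis, landmark, x, y) and the sample rows are invented for illustration; the actual column headers in accession.tbl and landmark.tbl may differ and should be checked in the files themselves.

```python
import csv
import io


def read_tbl(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def join_on(accession_rows, landmark_rows, key):
    """Inner join: one output row per landmark row with a matching gel."""
    by_key = {row[key]: row for row in accession_rows}
    joined = []
    for lm in landmark_rows:
        acc = by_key.get(lm[key])
        if acc is not None:
            joined.append({**acc, **lm})
    return joined


# Hypothetical table contents, not taken from the real data sets.
ACCESSION_TBL = "gel\tdiagnosis\ng001\tAML\ng002\tCLL\n"
LANDMARK_TBL = (
    "gel\tlandmark\tx\ty\n"
    "g001\tA\t120\t88\n"
    "g001\tB\t301\t240\n"
    "g002\tA\t118\t91\n"
)

publish = join_on(read_tbl(ACCESSION_TBL), read_tbl(LANDMARK_TBL), key="gel")
```

Each joined row carries both the per-gel annotation and one landmark, which is the shape a statistics package would typically expect when reading a single combined table.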
Finally, the complete set of files, including the GIF images, is
packaged in a project.tar.gz file for each project and
may be unpacked with WinZip on Windows PCs, gunzip and tar on Unix, etc.
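If you prefer to unpack an archive from a script, the following sketch uses Python's standard tarfile module. The archive path and destination directory are placeholders that you supply, and the helper names are hypothetical.

```python
import tarfile


def list_gifs(archive_path):
    """Return the sorted names of the GIF members in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(n for n in tar.getnames() if n.lower().endswith(".gif"))


def unpack(archive_path, dest_dir):
    """Extract every member of a .tar.gz archive into dest_dir."""
    with tarfile.open(archive_path, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters; "data"
            # rejects members that would be written outside dest_dir.
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
```

For example, unpack("project.tar.gz", "datasets") would extract the archive under a datasets directory, assuming the file has been downloaded to the current directory.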
2. List of 2D gel data sets
There are additional references (see list of
GELLAB References), but only a few are listed here.
Human leukemias (AML, ALL, CLL, HCL and other) (Lester,
Lipkin, Lemkin). 170 gels [512x512 pixels, 8-bit, 250 microns/pixel,
GIF]
References:
- Lester EP, Lemkin PF, Lipkin LE. Protein indexing in leukemias and
lymphomas. Ann N Y Acad Sci. 1984;428:158-72.
- Lester EP, Lemkin P, Lipkin L. A two-dimensional gel analysis of
autologous T and B lymphoblastoid cell lines. Clinical
Chemistry 1982 Apr;28(4 Pt 2):828-39.
HL-60 cell line (Lester, Lipkin, Lemkin). 111 gels
[512x512 pixels, 8-bit, 250 microns/pixel, GIF].
References:
- Lester EP, Lemkin P, Lipkin L, Cooper HL. A two-dimensional
electrophoretic analysis of protein synthesis in resting and growing
lymphocytes in vitro. J Immunol. 1981 Apr;126(4):1428-34.
- Lester EP, Lemkin P, Lipkin L, Cooper HL. Computer-assisted
analysis of two-dimensional electrophoreses of human lymphoid cells.
Clinical Chemistry 1980 Sep;26(10):1392-402.
MOLT-4 cell line (Lester, Lipkin, Lemkin). Four gels
[512x512 pixels, 8-bit, 250 microns/pixel, GIF].
Fetal Alcohol Syndrome serum biomarkers case-control study
(Robinson, Myrick and Lemkin). 53 gels [512x512 pixels, 8-bit, 340
microns/pixel, GIF]
References:
- Robinson MK, Myrick JE, Henderson LO, Coles CD, Powell MK, Orr GA,
Lemkin PF. Two-dimensional protein electrophoresis and multiple
hypothesis testing to detect potential serum protein biomarkers in
children with fetal alcohol syndrome. Electrophoresis 1995
Jul;16(7):1176-83.
3. Notice and Disclaimer for 2D Gel Data Sets
The data sets included herein are provided as a service to the
research community under the following conditions by the National
Cancer Institute (NCI), a member institute of the National Institutes
of Health (NIH) and part of the United States Department of Health and
Human Services.
- THE DATA SETS ARE BEING PROVIDED TO THE RESEARCH COMMUNITY 'AS
IS' WITH NO WARRANTIES, EXPRESS OR IMPLIED, INCLUDING ANY WARRANTY OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
- THE DATA SETS SHALL NOT BE USED IN THE TREATMENT OR DIAGNOSIS
OF HUMAN SUBJECTS.
- No indemnification for any loss, claim, damage or liability is
intended or provided by the NCI. The NCI, as an agency of the United
States Government, assumes liability only to the extent provided under
the federal Tort Claims Act, 28 U.S.C. 2671 et seq.
- Users of the data sets agree not to claim, infer, or imply
endorsement by the Government of the United States of America, the
NIH, the NCI or any of its employees.
- Users shall not request or attempt to obtain in any manner or
form any private patient information that may be associated with the
data sets.
It is possible to flicker-compare any two images using the Flicker
program on downloaded data sets. You should:
- Download the Flicker program (and read the documentation).
- Download one of the above 2D gel datasets you are interested
in. Then unzip the dataset file. There will be a directory ppx
which contains the images.
- Copy that ppx directory into the Images directory
where you have installed Flicker. You may want to rename the ppx
directory if you plan on having several image subdirectories in the
Images directory.
- If you have your own images to compare, you can add them to the ppx
directory or you can copy your directory to the Images directory.
- When you start Flicker, your data will appear in the (File |
Open user images | Pairs of images | ppx | ...) menu. This will
list all combinations of images in your ppx directory. Then
select a pair of images and it will load them into Flicker, at which
point you can do the comparison.
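To get a feel for how many pairs the menu will list, the sketch below enumerates all unordered pairs of GIF files in a directory. It only mirrors the "all combinations" behavior described above and is not Flicker's own code; the function name and the directory argument are hypothetical.

```python
from itertools import combinations
from pathlib import Path


def image_pairs(directory, suffix=".gif"):
    """Return every unordered pair of image file names in a directory."""
    files = sorted(
        p.name for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )
    return list(combinations(files, 2))
```

The number of pairs grows quickly: N images give N(N-1)/2 pairs, so a directory holding all 170 leukemia gels would yield 14365 pairs. Renaming ppx and splitting images into several smaller subdirectories keeps the menu manageable.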
The following are web sites that use or reference the data.
If you have used the data, send us your links to add to this list.
- Potra FA, Liu X. Aligning Families of 2D-Gels by a Combined
Multiresolution Forward-Inverse Transformation Approach.
$Date: 2006/06/30$ / P. Lemkin (lemkin@ncifcrf.gov or lemkin@bioinformatics.org)