2VB1

From XDSwiki

This page reports the processing of triclinic hen egg-white lysozyme data at 0.65 Å resolution (PDB ID 2VB1). The data (sweeps a to h, each comprising 60 to 360 frames of 72 MB) were collected by Zbigniew Dauter at APS 19-ID and are publicly available. Details of data collection, processing and refinement have been published.

XDS processing

ORGX=3130 ORGY=3040  ! for ADSC, header values are subject to interpretation; better inspect the table in IDXREF.LP!
TRUSTED_REGION=0 1.5 ! we want the whole detector area
ROTATION_AXIS=-1 0 0 ! at this beamline the spindle goes backwards!
  • for faster processing on a machine with many cores, use (e.g. for 16 cores):
MAXIMUM_NUMBER_OF_PROCESSORS=2
MAXIMUM_NUMBER_OF_JOBS=8

For all sweeps, processing stopped with an error message after the IDXREF step. Inspect IDXREF.LP to make sure that everything worked as it should, i.e. that a large percentage of the spots was actually indexed:

...
  63879 OUT OF   72321 SPOTS INDEXED.
...

***** DIFFRACTION PARAMETERS USED AT START OF INTEGRATION *****

REFINED VALUES OF DIFFRACTION PARAMETERS DERIVED FROM  63879 INDEXED SPOTS
REFINED PARAMETERS:   DISTANCE BEAM AXIS CELL ORIENTATION    
STANDARD DEVIATION OF SPOT    POSITION (PIXELS)     0.53
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES)    0.12
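
The check on the indexed fraction can be scripted. The following is a minimal sketch, assuming only that IDXREF.LP contains a line of the form shown in the excerpt above; the function name `indexed_fraction` is hypothetical and not part of XDS.

```python
import re

def indexed_fraction(idxref_lp_text):
    """Return (indexed, total, fraction) from the 'SPOTS INDEXED' line, or None."""
    match = re.search(r"(\d+)\s+OUT OF\s+(\d+)\s+SPOTS INDEXED", idxref_lp_text)
    if match is None:
        return None
    indexed, total = int(match.group(1)), int(match.group(2))
    return indexed, total, indexed / total

# Line copied from the IDXREF.LP excerpt above.
excerpt = "  63879 OUT OF   72321 SPOTS INDEXED."
indexed, total, fraction = indexed_fraction(excerpt)
print(f"{indexed} of {total} spots indexed ({100 * fraction:.1f}%)")
# prints: 63879 of 72321 spots indexed (88.3%)
```

With about 88% of the spots indexed and small standard deviations of spot and spindle position, it is reasonable to continue despite the error message.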

It may be possible to adjust some parameters (for COLSPOT) so that the error message does not occur, but it is not worth the effort. So we just change

JOBS=XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT

to

JOBS=DEFPIX INTEGRATE CORRECT

and run "xds_par" again. It completes after about 5 minutes on a fast machine, and we may then inspect CORRECT.LP.
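
When many sweeps have to be processed, the change of the JOBS line can also be scripted. This is a minimal sketch, assuming that XDS.INP holds a single uncommented JOBS= line; the helper `set_jobs` is hypothetical, and anything that follows JOBS= on that line (including a trailing comment) is replaced.

```python
import re

def set_jobs(xds_inp_text, jobs):
    """Replace the first uncommented JOBS= line with the given list of steps."""
    new_line = "JOBS=" + " ".join(jobs)
    # (?m) lets ^ and $ match at each line; lines starting with "!" are skipped
    # because only blanks are allowed before JOBS=.
    result, count = re.subn(r"(?m)^[ \t]*JOBS=.*$", lambda _m: new_line,
                            xds_inp_text, count=1)
    if count == 0:
        raise ValueError("no uncommented JOBS= line found")
    return result

original = (
    "JOBS=XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT\n"
    "ORGX=3130 ORGY=3040\n"
)
updated = set_jobs(original, ["DEFPIX", "INTEGRATE", "CORRECT"])
print(updated)
```

The function works on the file contents as a string, so reading and writing XDS.INP is left to the caller.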


Timings for processing sweep "e" as a function of MAXIMUM_NUMBER_OF_PROCESSORS and MAXIMUM_NUMBER_OF_JOBS

The following is going to be rather technical! If you are only interested in crystallography, skip this.

Using

MAXIMUM_NUMBER_OF_PROCESSORS=2
MAXIMUM_NUMBER_OF_JOBS=8

we observe for the INTEGRATE step:

total cpu time used               2063.6 sec
total elapsed wall-clock time      296.1 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=1
MAXIMUM_NUMBER_OF_JOBS=16

the times are

total cpu time used               2077.1 sec
total elapsed wall-clock time      408.2 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=4

the times are

total cpu time used               2102.8 sec
total elapsed wall-clock time      315.6 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=16 ! the default for xds_par on a 16-core machine
MAXIMUM_NUMBER_OF_JOBS=1 ! the default

the times are

total cpu time used               2833.4 sec
total elapsed wall-clock time      566.5 sec

Note, however, that this run actually uses only 10 processors: with the default DELPHI=5 and an OSCILLATION_RANGE of 0.5°, a batch comprises 5/0.5 = 10 frames.
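
The figure of 10 processors can be reproduced with a one-line calculation. This sketch assumes, as the remark above implies, that the number of processors used within one job is capped by the number of frames in a DELPHI batch; the function name is hypothetical.

```python
def usable_processors(delphi_deg, oscillation_range_deg, requested):
    """Processors actually used, assuming the cap is the number of frames per batch."""
    frames_per_batch = round(delphi_deg / oscillation_range_deg)
    return min(requested, frames_per_batch)

# Default DELPHI=5 with 0.5 degree frames, 16 processors requested.
print(usable_processors(5.0, 0.5, 16))  # prints: 10
```

Under the same assumption, the settings with 4 or fewer processors per job are not affected by this cap.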

Using

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=8

(thus overcommitting the available cores by a factor of 2) the times are

total cpu time used               2263.5 sec
total elapsed wall-clock time      320.8 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6

(thus overcommitting the available cores, but less severely) the times are

total cpu time used               2367.6 sec
total elapsed wall-clock time      267.2 sec

Thus,

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6

performs best on a machine with two Xeon X5570 processors, 24 GB of memory and a RAID1 consisting of two 1 TB SATA disks. Note that the dataset comprises 27 GB; reading it in 296 seconds (the wall-clock time of the first run above) corresponds to about 92 MB/s of continuous reading. The processing time is thus limited by disk access, not by the CPU. And no, the data are not simply read from RAM (tested by running "echo 3 > /proc/sys/vm/drop_caches" before the XDS run).
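
The timings reported above can be collected and ranked with a few lines of Python. This is only a sketch for comparing the runs: the numbers are copied from the INTEGRATE timings above, the ratio of CPU time to wall-clock time is used as a rough measure of how many cores were kept busy, and the dataset size is taken as 27,000 MB, which is an assumption about how the 27 GB were counted.

```python
# (MAXIMUM_NUMBER_OF_PROCESSORS, MAXIMUM_NUMBER_OF_JOBS, cpu seconds, wall-clock seconds)
runs = [
    (2, 8, 2063.6, 296.1),
    (1, 16, 2077.1, 408.2),
    (4, 4, 2102.8, 315.6),
    (16, 1, 2833.4, 566.5),
    (4, 8, 2263.5, 320.8),
    (4, 6, 2367.6, 267.2),
]

# Rank the settings by wall-clock time, fastest first.
for procs, jobs, cpu, wall in sorted(runs, key=lambda run: run[3]):
    print(f"{procs:2d} processors x {jobs:2d} jobs: "
          f"wall {wall:6.1f} s, cpu/wall {cpu / wall:4.1f}")

best = min(runs, key=lambda run: run[3])

# Sustained read rate needed to get through the sweep in a given time.
dataset_mb = 27_000  # assumption: 27 GB counted as 27,000 MB
for wall in (296.1, best[3]):
    print(f"{wall:6.1f} s -> {dataset_mb / wall:5.1f} MB/s")
```

For 296 seconds the estimate comes out slightly above 90 MB/s, close to the figure quoted above; the exact value depends on how the dataset size is rounded.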

SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
  LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    2.91       15799    2114      2147       98.5%       2.3%      2.5%    15787   73.42     2.6%     1.1%   -15%   0.705    1969
    2.06       39607    3830      3856       99.3%       2.5%      2.8%    39602   81.49     2.6%     0.9%   -11%   0.750    3794
    1.68       64423    5068      5087       99.6%       3.1%      3.3%    64415   82.27     3.3%     1.0%    -3%   0.843    5018
    1.45       72869    6147      6163       99.7%       3.2%      3.5%    72867   77.43     3.4%     1.0%     0%   0.833    6055
    1.30       71079    6652      6657       99.9%       3.3%      3.5%    71079   70.69     3.4%     1.1%     8%   0.865    6506
    1.19       74584    7287      7298       99.8%       3.2%      3.4%    74575   66.78     3.4%     1.2%     5%   0.870    7060
    1.10       84893    8268      8278       99.9%       3.5%      3.7%    84865   62.98     3.6%     1.3%     5%   0.858    7983
    1.03       87893    8585      8603       99.8%       4.2%      4.4%    87859   56.04     4.4%     1.5%     4%   0.828    8238
    0.97       92833    9457      9465       99.9%       5.2%      5.6%    92810   48.70     5.5%     1.7%     6%   0.802    9010
    0.92       83981    9911      9927       99.8%       5.7%      6.3%    83954   41.48     6.0%     2.1%     5%   0.785    9362
    0.88       74101    9620      9621      100.0%       6.3%      7.2%    74083   35.53     6.7%     2.6%     5%   0.785    9041
    0.84       81383   11511     11518       99.9%       6.8%      7.7%    81361   30.26     7.3%     3.3%     1%   0.760   10616
    0.81       67616   10240     10247       99.9%       7.1%      7.8%    67596   25.84     7.7%     4.2%     1%   0.782    9368
    0.78       74077   11807     11817       99.9%       7.2%      7.3%    74049   22.26     7.8%     5.2%     1%   0.797   10697
    0.75       86236   13831     13839       99.9%       8.5%      8.7%    86206   18.77     9.3%     6.7%     2%   0.809   12497
    0.73       64601   10481     10488       99.9%      10.4%     10.5%    64573   15.77    11.3%     8.2%     2%   0.810    9375
    0.71       71886   11727     11741       99.9%      12.8%     13.0%    71835   13.05    14.0%    10.6%     2%   0.800   10420
    0.69       80233   13156     13163       99.9%      16.5%     16.9%    80130   10.32    18.1%    13.7%     1%   0.796   11661
    0.67       84259   14746     14766       99.9%      22.0%     22.5%    84056    7.61    24.1%    19.6%     3%   0.789   12468
    0.65       60775   15579     16551       94.1%      27.5%     30.3%    59893    4.49    31.7%    32.3%     1%   0.723    8936
   total     1433128  190017    191232       99.4%       3.3%      3.5%  1431595   33.18     3.5%     3.5%     2%   0.801  170074



Comparison of data processing: published vs XDS results

           resolution (highest resolution range)  observations     unique reflections  Multiplicity  Completeness (%)  R merge (%)  mean I/sigma
published  30-0.65Å (0.67-0.65Å)                  1331953 (12764)  187165 (6353)       7.1 (2.7)     97.6 (67.3)       4.5 (18.4)   36.2 (4.2)
XDS        30-0.65Å (0.67-0.65Å)                  1433128 (60775)  190017 (15579)      7.5 (3.9)     99.4 (94.1)       3.3 (27.5)   33.2 (4.5)
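
The overall multiplicity and completeness in this comparison follow directly from the reflection counts. A short sketch, using the totals from the comparison and the number of possible reflections from the "total" line of the CORRECT.LP table; the variable names are illustrative only.

```python
# Overall counts taken from the comparison above.
counts = {
    "published": {"observations": 1331953, "unique": 187165},
    "XDS": {"observations": 1433128, "unique": 190017},
}

# Multiplicity = observations per unique reflection.
multiplicity = {
    name: c["observations"] / c["unique"] for name, c in counts.items()
}
for name, value in multiplicity.items():
    print(f"{name}: multiplicity {value:.1f}")

# Completeness = unique reflections measured / unique reflections possible.
possible = 191232  # POSSIBLE column, "total" line of the CORRECT.LP table
xds_completeness = 100 * counts["XDS"]["unique"] / possible
print(f"XDS completeness {xds_completeness:.1f}%")
```

This reproduces the overall multiplicities of 7.1 and 7.5 and the XDS completeness of 99.4%.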