2VB1


This page reports the processing of triclinic hen egg-white lysozyme data at 0.65 Å resolution (PDB entry 2VB1). The data (sweeps a to h, each comprising 60 to 360 frames of 72 MB) were collected by Zbigniew Dauter at APS beamline 19-ID and are available from here. Details of data collection, processing and refinement are published.

== XDS processing ==

* use [[generate_XDS.INP]] to obtain a good starting point
* edit [[XDS.INP]] and change the following (a scripted version of these edits is sketched below this list):
  ORGX=3130 ORGY=3040  ! for ADSC, header values are subject to interpretation; better inspect the table in IDXREF.LP!
  TRUSTED_REGION=0 1.5 ! we want the whole detector area
  ROTATION_AXIS=-1 0 0 ! at this beamline the spindle goes backwards!
* for faster processing on a machine with many cores, use (e.g. for 16 cores):
  MAXIMUM_NUMBER_OF_PROCESSORS=2
  MAXIMUM_NUMBER_OF_JOBS=8
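
If you prefer to script these changes, a minimal sketch with standard sed/echo could look like this (it assumes the three geometry keywords each sit on their own line in the generated XDS.INP and that the two parallelization keywords are not yet present; otherwise edit those lines by hand):

 # adjust the keywords discussed above in the generated XDS.INP
 sed -i -e 's/^ *ORGX=.*/ ORGX=3130 ORGY=3040     ! from the table in IDXREF.LP/' \
        -e 's/^ *TRUSTED_REGION=.*/ TRUSTED_REGION=0 1.5 ! use the whole detector area/' \
        -e 's/^ *ROTATION_AXIS=.*/ ROTATION_AXIS=-1 0 0  ! the spindle goes backwards here/' XDS.INP
 # parallelization settings (if these keywords already exist in XDS.INP, change those lines instead)
 echo "MAXIMUM_NUMBER_OF_PROCESSORS=2" >> XDS.INP
 echo "MAXIMUM_NUMBER_OF_JOBS=8"       >> XDS.INP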

For all the sweeps, processing stopped with an error message after the IDXREF step. By inspecting IDXREF.LP one should make sure that everything worked as it should, i.e. that a large percentage of the spots was actually indexed:

...
  63879 OUT OF   72321 SPOTS INDEXED.
...

***** DIFFRACTION PARAMETERS USED AT START OF INTEGRATION *****

REFINED VALUES OF DIFFRACTION PARAMETERS DERIVED FROM  63879 INDEXED SPOTS
REFINED PARAMETERS:   DISTANCE BEAM AXIS CELL ORIENTATION    
STANDARD DEVIATION OF SPOT    POSITION (PIXELS)     0.53
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES)    0.12
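
These numbers can be pulled out of IDXREF.LP with two greps (just a sketch; the patterns match the lines quoted above):

 grep "OUT OF" IDXREF.LP                 # how many of the spots were indexed
 grep "STANDARD DEVIATION OF" IDXREF.LP  # positional and angular accuracy of the solution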

It may be possible to adjust some parameters (for COLSPOT) so that the error message does not occur, but it is not worth the effort. So we just change

 JOB=XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT

to

 JOB=DEFPIX INTEGRATE CORRECT

and run "xds_par" again. It completes after about 5 minutes on a fast machine, and we may inspect CORRECT.LP .

=== Optimization ===

The main target of optimization is the asymptotic (i.e. best attainable) I/sigma, called ISa (Diederichs (2010) [http://dx.doi.org/10.1107/S0907444910014836 Acta Cryst. D 66, 733-40]), as printed out by CORRECT. A higher ISa means better data. However, ISa also rises if more reflections are thrown out as outliers ("misfits"), so merely reducing WFAC1 does not count as optimization. The following quantities may be tested for their influence on ISa:

* copying GXPARM.XDS to XPARM.XDS
* including the information from the first integration pass into XDS.INP - just do "grep _E INTEGRATE.LP|tail -2", which yields e.g.
  BEAM_DIVERGENCE=   0.386  BEAM_DIVERGENCE_E.S.D.=   0.039
  REFLECTING_RANGE=  0.669  REFLECTING_RANGE_E.S.D.=  0.096
Copy these two lines into XDS.INP (a scripted version of both steps is sketched below).
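
Putting both ideas together, a sketch in shell (assuming the first INTEGRATE/CORRECT pass has finished in the current directory; the grep pattern for ISa is an assumption about the CORRECT.LP layout and may need adjusting for other XDS versions):

 # reuse the refined geometry from the first pass
 cp GXPARM.XDS XPARM.XDS
 # append the refined beam divergence and mosaicity to XDS.INP
 grep _E INTEGRATE.LP | tail -2 >> XDS.INP
 # repeat integration and scaling, then look for ISa in CORRECT.LP
 xds_par
 grep -A1 "ISa" CORRECT.LP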

== Example: sweep e ==

=== [[XDS.INP]]; as generated by [[generate_XDS.INP]] ===

=== [[CORRECT.LP]] main table; 1st pass ===

=== [[XDS.INP]]; optimized ===

=== [[CORRECT.LP]] main table; optimization pass ===

== XSCALE results ==

A few sweeps were optimized by copying the two lines containing the mosaicity and beam divergence values from INTEGRATE.LP to XDS.INP, as described above.
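
To merge the individually processed sweeps, XSCALE needs an XSCALE.INP that lists one reflection file per sweep. The following is only a sketch; the directory layout (../a to ../h, each containing an XDS_ASCII.HKL) and the output file name are assumptions:

 # write a minimal XSCALE.INP and run the parallel version of XSCALE
 echo "OUTPUT_FILE=2vb1.ahkl" > XSCALE.INP
 for s in a b c d e f g h ; do
   echo "INPUT_FILE=../$s/XDS_ASCII.HKL" >> XSCALE.INP
 done
 xscale_par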

=== main table ===

 SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
 RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
   LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    2.91       15799    2114      2147       98.5%       2.3%      2.5%    15787   73.42     2.6%     1.1%   -15%   0.705    1969
    2.06       39607    3830      3856       99.3%       2.5%      2.8%    39602   81.49     2.6%     0.9%   -11%   0.750    3794
    1.68       64423    5068      5087       99.6%       3.1%      3.3%    64415   82.27     3.3%     1.0%    -3%   0.843    5018
    1.45       72869    6147      6163       99.7%       3.2%      3.5%    72867   77.43     3.4%     1.0%     0%   0.833    6055
    1.30       71079    6652      6657       99.9%       3.3%      3.5%    71079   70.69     3.4%     1.1%     8%   0.865    6506
    1.19       74584    7287      7298       99.8%       3.2%      3.4%    74575   66.78     3.4%     1.2%     5%   0.870    7060
    1.10       84893    8268      8278       99.9%       3.5%      3.7%    84865   62.98     3.6%     1.3%     5%   0.858    7983
    1.03       87893    8585      8603       99.8%       4.2%      4.4%    87859   56.04     4.4%     1.5%     4%   0.828    8238
    0.97       92833    9457      9465       99.9%       5.2%      5.6%    92810   48.70     5.5%     1.7%     6%   0.802    9010
    0.92       83981    9911      9927       99.8%       5.7%      6.3%    83954   41.48     6.0%     2.1%     5%   0.785    9362
    0.88       74101    9620      9621      100.0%       6.3%      7.2%    74083   35.53     6.7%     2.6%     5%   0.785    9041
    0.84       81383   11511     11518       99.9%       6.8%      7.7%    81361   30.26     7.3%     3.3%     1%   0.760   10616
    0.81       67616   10240     10247       99.9%       7.1%      7.8%    67596   25.84     7.7%     4.2%     1%   0.782    9368
    0.78       74077   11807     11817       99.9%       7.2%      7.3%    74049   22.26     7.8%     5.2%     1%   0.797   10697
    0.75       86236   13831     13839       99.9%       8.5%      8.7%    86206   18.77     9.3%     6.7%     2%   0.809   12497
    0.73       64601   10481     10488       99.9%      10.4%     10.5%    64573   15.77    11.3%     8.2%     2%   0.810    9375
    0.71       71886   11727     11741       99.9%      12.8%     13.0%    71835   13.05    14.0%    10.6%     2%   0.800   10420
    0.69       80233   13156     13163       99.9%      16.5%     16.9%    80130   10.32    18.1%    13.7%     1%   0.796   11661
    0.67       84259   14746     14766       99.9%      22.0%     22.5%    84056    7.61    24.1%    19.6%     3%   0.789   12468
    0.65       60775   15579     16551       94.1%      27.5%     30.3%    59893    4.49    31.7%    32.3%     1%   0.723    8936
   total     1433128  190017    191232       99.4%       3.3%      3.5%  1431595   33.18     3.5%     3.5%     2%   0.801  170074


== Comparison of data processing: published (2006) ''vs'' XDS results ==

<table border="1">
<tr>
<th> </th>
<th> resolution (highest resolution range) </th>
<th> observations </th>
<th> unique reflections </th>
<th> multiplicity </th>
<th> completeness (%) </th>
<th> R<sub>merge</sub> (%) </th>
<th> mean I/sigma </th>
</tr>
<tr>
<td> published (2006) </td>
<td> 30-0.65Å (0.67-0.65Å) </td>
<td> 1331953 (12764) </td>
<td> 187165 (6353) </td>
<td> 7.1 (2.7) </td>
<td> 97.6 (67.3) </td>
<td> 4.5 (18.4) </td>
<td> 36.2 (4.2) </td>
</tr>
<tr>
<td> XDS </td>
<td> 30-0.65Å (0.67-0.65Å) </td>
<td> 1433128 (60775) </td>
<td> 190017 (15579) </td>
<td> 7.5 (3.9) </td>
<td> 99.4 (94.1) </td>
<td> 3.3 (27.5) </td>
<td> 33.2 (4.5) </td>
</tr>
</table>

== timings for processing sweep "e" as a function of MAXIMUM_NUMBER_OF_PROCESSORS and MAXIMUM_NUMBER_OF_JOBS ==

The following is going to be rather technical! If you are only interested in crystallography, skip this.

Using

MAXIMUM_NUMBER_OF_PROCESSORS=2
MAXIMUM_NUMBER_OF_JOBS=8

we observe for the INTEGRATE step:

total cpu time used               2063.6 sec
total elapsed wall-clock time      296.1 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=1
MAXIMUM_NUMBER_OF_JOBS=16

the times are

total cpu time used               2077.1 sec
total elapsed wall-clock time      408.2 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=4

the times are

total cpu time used               2102.8 sec
total elapsed wall-clock time      315.6 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=16 ! the default for xds_par on a 16-core machine
MAXIMUM_NUMBER_OF_JOBS=1 ! the default

the times are

total cpu time used               2833.4 sec
total elapsed wall-clock time      566.5 sec

but please note that this actually only uses 10 processors: with the default DELPHI=5 and an OSCILLATION_RANGE of 0.5°, each INTEGRATE batch comprises 5/0.5 = 10 images, and a batch cannot be distributed over more processors than it has images.

Using

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=8

(thus overcommitting the available cores by a factor of 2) the times are

total cpu time used               2263.5 sec
total elapsed wall-clock time      320.8 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6

(thus overcommitting the available cores, but less severely) the times are

total cpu time used               2367.6 sec
total elapsed wall-clock time      267.2 sec

Thus,

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6

performs best for a machine with two Xeon X5570 CPUs (HT enabled, thus 16 logical cores), 24 GB of memory and a RAID1 array of two 1 TB SATA disks. It should be noted that the dataset amounts to 27 GB, so reading it within 296 seconds corresponds to about 92 MB/s of continuous reading. The processing time is thus limited by disk access, not by the CPU. And no, the data are not simply read from RAM (verified by running "echo 3 > /proc/sys/vm/drop_caches" before the XDS run).
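
A benchmark like the one above can also be scripted. This is only a sketch: it assumes an XDS.INP that already contains the JOB=, MAXIMUM_NUMBER_OF_PROCESSORS= and MAXIMUM_NUMBER_OF_JOBS= lines, and it needs root rights for dropping the page cache:

 # time the INTEGRATE step for several PROCESSORS/JOBS combinations
 for combo in "2 8" "1 16" "4 4" "16 1" "4 8" "4 6" ; do
   set -- $combo
   sed -i -e "s/^ *MAXIMUM_NUMBER_OF_PROCESSORS=.*/ MAXIMUM_NUMBER_OF_PROCESSORS=$1/" \
          -e "s/^ *MAXIMUM_NUMBER_OF_JOBS=.*/ MAXIMUM_NUMBER_OF_JOBS=$2/" XDS.INP
   sync; echo 3 > /proc/sys/vm/drop_caches   # make sure the images are read from disk, not from RAM
   xds_par > /dev/null
   echo "PROCESSORS=$1 JOBS=$2"
   grep -e "total cpu time used" -e "total elapsed wall-clock time" INTEGRATE.LP
 done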