2VB1: Difference between revisions

From XDSwiki
Revision as of 17:38, 11 March 2011

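This reports processing of triclinic hen egg-white lysozyme data at 0.65 Å resolution (PDB id 2VB1, http://www.rcsb.org/pdb/explore/explore.do?structureId=2VB1). Data (sweeps a to h, each comprising 60 to 360 frames of 72 MB each) were collected by Zbigniew Dauter at APS 19-ID and are available from http://bl831.als.lbl.gov/example_data_sets/APS/19-ID/2vb1/ . Details of data collection, processing and refinement are published at http://journals.iucr.org/d/issues/2007/12/00/be5097/index.html .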

XDS processing

  1. use generate_XDS.INP to obtain a good starting point
  2. edit XDS.INP and change/add the following:
ORGX=3130 ORGY=3040  ! for ADSC, header values are subject to interpretation; these values from visual inspection
UNTRUSTED_RECTANGLE=1 3160 3000 3070  ! <xmin xmax ymin ymax> to mask shadow of beamstop; XDS-viewer to find out
TRUSTED_REGION=0 1.5 ! we want the whole detector area
ROTATION_AXIS=-1 0 0 ! at this beamline the spindle goes backwards!
SILICON=34.812736 ! account for theta-dependent absorption in the CCD's phosphor. The correction is only
! significant for hi-res data; 34.812736=32*(value for silicon as printed to CORRECT.LP if SILICON= not given)
MAXIMUM_NUMBER_OF_PROCESSORS=4 ! for fast processing on a machine with many cores, use (e.g. for 16 cores)
MAXIMUM_NUMBER_OF_JOBS=6 ! This "overcommits" the available cores but on the whole this produces results faster (see below).
SPACE_GROUP_NUMBER=1                   ! this is known
UNIT_CELL_CONSTANTS=  27.07 31.25 33.76 87.98 108.00 112.11  ! from 2vb1
FRIEDEL'S_LAW=TRUE  ! we're not concerned with the anomalous signal

Then, run "xds_par". It completes after about 5 minutes on a fast machine, and we may inspect (at least) IDXREF.LP and CORRECT.LP (see below), and use "XDS-viewer FRAME.cbf" to get a visual impression of the integration as it applies to the last frame. By inspecting IDXREF.LP, one should make sure that everything works as it should, i.e. that a large percentage of reflections was actually indexed nicely, e.g.:

...
  63879 OUT OF   72321 SPOTS INDEXED.
...

***** DIFFRACTION PARAMETERS USED AT START OF INTEGRATION *****

REFINED VALUES OF DIFFRACTION PARAMETERS DERIVED FROM  63879 INDEXED SPOTS
REFINED PARAMETERS:   DISTANCE BEAM AXIS CELL ORIENTATION    
STANDARD DEVIATION OF SPOT    POSITION (PIXELS)     0.53
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES)    0.12
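
As a quick check of the indexing rate, the "SPOTS INDEXED" line can be turned into a fraction. The following is a minimal sketch in Python; the function name indexed_fraction is illustrative and not part of XDS, and the regular expression assumes the line format shown in the excerpt above.

```python
import re

def indexed_fraction(idxref_lp_text):
    """Return (indexed, total, fraction) from the 'SPOTS INDEXED' line of IDXREF.LP text."""
    match = re.search(r"(\d+)\s+OUT OF\s+(\d+)\s+SPOTS INDEXED", idxref_lp_text)
    if match is None:
        raise ValueError("no 'SPOTS INDEXED' line found")
    indexed, total = int(match.group(1)), int(match.group(2))
    return indexed, total, indexed / total

# Line copied from the IDXREF.LP excerpt above
sample = "  63879 OUT OF   72321 SPOTS INDEXED.\n"
indexed, total, fraction = indexed_fraction(sample)
print(f"{indexed} of {total} spots indexed ({100 * fraction:.1f}%)")
```

For the excerpt above this gives an indexing rate of about 88%.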

Optimization

The main target of optimization is the asymptotic (i.e. best) I/sigma (ISa) (Diederichs (2010) Acta Cryst. D 66, 733-40, http://dx.doi.org/10.1107/S0907444910014836) as printed out by CORRECT (and XSCALE). A higher ISa should mean better data.

However, ISa also rises if more reflections are thrown out as outliers ("misfits"), so merely reducing WFAC1 does not count as optimization. Note that the default WFAC1 is 1; this should result in the rejection of about 1% of observations. If you feel that 1% is too much, increase WFAC1 to, say, 1.5; that should result in rejection of less than (say) 0.1%. This will slightly increase completeness, but will reduce I/sigma and ISa, and increase R-factors.
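
For orientation, ISa can be recomputed from the two error-model parameters a and b that CORRECT prints next to it. The sketch below assumes the error model in which the variance is a*(sigma0^2 + b*I^2), so that I/sigma approaches 1/sqrt(a*b) for very strong reflections; the function name is illustrative, and the a and b values are those quoted for sweep e in the CORRECT.LP excerpts on this page.

```python
import math

def asymptotic_i_over_sigma(a, b):
    """ISa = 1/sqrt(a*b): the limit of I/sigma for very strong reflections,
    assuming the error model sigma^2 = a*(sigma0^2 + b*I^2)."""
    return 1.0 / math.sqrt(a * b)

# a and b as printed in CORRECT.LP for sweep e (first pass, optimization pass)
for label, a, b in [("1st pass", 6.630, 1.091e-4), ("optimized", 6.439, 1.076e-4)]:
    print(f"{label}: ISa = {asymptotic_i_over_sigma(a, b):.2f}")
```

Because a and b are printed with limited precision, the recomputed values can differ from the printed ISa in the last digit.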

The following quantities may be tested for their influence on ISa:

  • copying GXPARM.XDS to XPARM.XDS
  • including the information from the first integration pass into XDS.INP - just do "grep _E INTEGRATE.LP|tail -2" and get e.g.
BEAM_DIVERGENCE=   0.386  BEAM_DIVERGENCE_E.S.D.=   0.039
REFLECTING_RANGE=  0.669  REFLECTING_RANGE_E.S.D.=  0.096

then copy these two lines into XDS.INP

  • preventing refinement in INTEGRATE: REFINE(INTEGRATE)= !
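
The grep/tail step in the list above can also be done in Python, for instance as part of a processing script. This is a minimal sketch; the function name is illustrative, and the first two lines of the sample text are made-up values that only serve to show that the last pair of matching lines is picked.

```python
def last_divergence_lines(integrate_lp_text):
    """Mimic `grep _E INTEGRATE.LP | tail -2`: return the last two lines containing '_E'."""
    matching = [line for line in integrate_lp_text.splitlines() if "_E" in line]
    return matching[-2:]

# Hypothetical log excerpt: the first pair of values is invented for illustration,
# the second pair is the example quoted in the list above.
sample = """\
 BEAM_DIVERGENCE=   0.350  BEAM_DIVERGENCE_E.S.D.=   0.035
 REFLECTING_RANGE=  0.700  REFLECTING_RANGE_E.S.D.=  0.100
 some other output line
 BEAM_DIVERGENCE=   0.386  BEAM_DIVERGENCE_E.S.D.=   0.039
 REFLECTING_RANGE=  0.669  REFLECTING_RANGE_E.S.D.=  0.096
"""
for line in last_divergence_lines(sample):
    print(line.strip())
```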

Example: sweep e

XDS.INP; as generated by generate_XDS.INP

generate_XDS.INP "../../APS/19-ID/2vb1/p1lyso_e.0???.img"

Then include the changes detailed above, resulting in:

JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6
ORGX= 3130 ORGY= 3040  ! check these values with adxv !
UNTRUSTED_RECTANGLE=1 3160 3000 3070  ! <xmin xmax ymin ymax> to mask shadow of beamstop; XDS-viewer to find out
DETECTOR_DISTANCE= 99.9954
OSCILLATION_RANGE= 0.500
X-RAY_WAVELENGTH=   0.6525486
NAME_TEMPLATE_OF_DATA_FRAMES=../../APS/19-ID/2vb1/p1lyso_e.0???.img
! REFERENCE_DATA_SET=xxx/XDS_ASCII.HKL ! e.g. to ensure consistent indexing  
DATA_RANGE=1 360
SPOT_RANGE=1 180
! BACKGROUND_RANGE=1 10 ! rather use defaults (first 5 degree of rotation)

SPACE_GROUP_NUMBER=1                   ! 0 if unknown
UNIT_CELL_CONSTANTS= 27.07    31.25    33.76  87.98 108.00 112.11  ! PDB 2vb1
INCLUDE_RESOLUTION_RANGE=50 0  ! after CORRECT, insert high resol limit; re-run CORRECT


!FRIEDEL'S_LAW=FALSE     ! This acts only on the CORRECT step
! If the anom signal turns out to be, or is known to be, very low or absent,
! use FRIEDEL'S_LAW=TRUE instead (or comment out the line); re-run CORRECT

! remove the "!" in the following line:
! STRICT_ABSORPTION_CORRECTION=TRUE
! if the anomalous signal is strong: in that case, in CORRECT.LP the three
! "CHI^2-VALUE OF FIT OF CORRECTION FACTORS" values are significantly> 1, e.g. 1.5
!
! exclude (mask) untrusted areas of detector, e.g. beamstop shadow :
! UNTRUSTED_RECTANGLE= 1800 1950 2100 2150 ! x-min x-max y-min y-max ! repeat
! UNTRUSTED_ELLIPSE= 2034 2070 1850 2240 ! x-min x-max y-min y-max ! if needed
!
! parameters with changes wrt default values:
TRUSTED_REGION=0.00 1.5  ! partially use corners of detectors; 1.41421=full use
VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS=7000. 30000. ! often 8000 is ok
MINIMUM_ZETA=0.05        ! integrate close to the Lorentz zone; 0.15 is default
STRONG_PIXEL=6           ! COLSPOT: only use strong reflections (default is 3)
MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT=3 ! default of 6 is sometimes too high
REFINE(INTEGRATE)=CELL BEAM ORIENTATION ! AXIS DISTANCE 

! parameters specifically for this detector and beamline:
DETECTOR= ADSC MINIMUM_VALID_PIXEL_VALUE= 1 OVERLOAD= 65000
SENSOR_THICKNESS=0.01 SILICON=34.812736
NX= 6144 NY= 6144  QX= 0.051294  QY= 0.051294 ! to make CORRECT happy if frames are unavailable
DIRECTION_OF_DETECTOR_X-AXIS=1 0 0
DIRECTION_OF_DETECTOR_Y-AXIS=0 1 0
INCIDENT_BEAM_DIRECTION=0 0 1
ROTATION_AXIS=-1 0 0    ! at e.g. SERCAT ID-22 this needs to be -1 0 0
FRACTION_OF_POLARIZATION=0.98   ! better value is provided by beamline staff!
POLARIZATION_PLANE_NORMAL=0 1 0
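
The geometry keywords in the listing above determine how far out in resolution the detector reaches, which is why TRUSTED_REGION=0 1.5 (use of the corners) matters for this data set. The sketch below is a rough estimate only: it assumes a flat detector exactly perpendicular to the beam, takes ORGX/ORGY as the beam position in pixels, and ignores any detector tilt; the function name is illustrative.

```python
import math

def resolution_at_radius(radius_mm, distance_mm, wavelength_A):
    """Bragg spacing d = lambda / (2 sin(theta)) for a spot at the given radius,
    assuming a flat detector perpendicular to the beam (2*theta = atan(r/D))."""
    two_theta = math.atan2(radius_mm, distance_mm)
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))

# Values taken from the XDS.INP listing above
NX = NY = 6144
QX = QY = 0.051294          # pixel size in mm
ORGX, ORGY = 3130, 3040     # beam position in pixels
DISTANCE = 99.9954          # mm
WAVELENGTH = 0.6525486      # Angstrom

# Largest distance from the beam position to a detector edge along X and along Y
dx = max(ORGX, NX - ORGX) * QX
dy = max(ORGY, NY - ORGY) * QY

edge = resolution_at_radius(dx, DISTANCE, WAVELENGTH)
corner = resolution_at_radius(math.hypot(dx, dy), DISTANCE, WAVELENGTH)
print(f"edge along X: {edge:.2f} A, farthest corner: {corner:.2f} A")
```

Under these assumptions the edge lies at roughly 0.67 Å and the farthest corner at roughly 0.60 Å, so reflections beyond about 0.67 Å can only be recorded in the detector corners.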

CORRECT.LP 1st pass

STANDARD DEVIATION OF SPOT    POSITION (PIXELS)     0.87
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES)    0.10
CRYSTAL MOSAICITY (DEGREES)     0.126

...

    a        b          ISa
6.630E+00  1.091E-04   37.18

...

SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
  LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    1.77        9195    4841      9501       51.0%       1.5%      1.5%     8708   48.74     2.1%     1.6%     0%   0.000       0
    1.26       29991   15327     16721       91.7%       1.5%      1.6%    29328   45.26     2.1%     1.7%     0%   0.000       0
    1.03       38643   19731     21636       91.2%       1.7%      1.7%    37824   38.67     2.5%     2.1%     0%   0.000       0
    0.89       46156   23404     25561       91.6%       2.3%      2.4%    45504   27.56     3.3%     3.4%     0%   0.000       0
    0.80       51509   26034     28868       90.2%       4.0%      4.0%    50950   17.55     5.6%     7.0%     0%   0.000       0
    0.73       55989   28253     32034       88.2%       7.0%      6.8%    55472   10.98     9.8%    13.2%     0%   0.000       0
    0.68       59733   30115     34776       86.6%      13.1%     13.0%    59236    6.08    18.6%    26.0%     0%   0.000       0
    0.63       35385   18436     37367       49.3%      25.6%     26.9%    33898    2.99    36.3%    52.1%     0%   0.000       0
    0.60        8991    4972     39725       12.5%      51.2%     56.9%     8038    1.34    72.4%   105.0%     0%   0.000       0
   total      335592  171113    246189       69.5%       2.3%      2.4%   328958   19.58     3.3%     7.4%     0%   0.000       0


NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES  343716
NUMBER OF REJECTED MISFITS                            8112
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS                  0
NUMBER OF ACCEPTED OBSERVATIONS                     335604
NUMBER OF UNIQUE ACCEPTED REFLECTIONS               171119

The number of "misfits" (rejections) is higher than the expected 1%. Either one considers the anomalous signal (of the 6 sulfurs) to be significant, or one simply increases WFAC1 from its default of 1 to (say) 1.2.
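
The rejection rate can be worked out directly from the counts at the end of the CORRECT.LP excerpt. A minimal sketch, with an illustrative function name:

```python
def rejected_fraction(n_selected, n_misfits):
    """Fraction of the selected observations that CORRECT rejected as misfits."""
    return n_misfits / n_selected

# Counts from the first-pass CORRECT.LP excerpt above
n_selected, n_misfits, n_absent, n_accepted = 343716, 8112, 0, 335604

# Consistency check: selected = accepted + misfits + systematic absences
assert n_selected == n_accepted + n_misfits + n_absent

fraction = rejected_fraction(n_selected, n_misfits)
print(f"rejected misfits: {100 * fraction:.2f}% (about 1% expected for WFAC1=1)")
```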

XDS.INP; optimized

Using the output of "grep _E INTEGRATE.LP|tail -2" edit XDS.INP to have

JOB= INTEGRATE CORRECT
BEAM_DIVERGENCE=   0.428  BEAM_DIVERGENCE_E.S.D.=   0.043
REFLECTING_RANGE=  0.880  REFLECTING_RANGE_E.S.D.=  0.126
... 
REFINE(INTEGRATE)= !

Then "cp GXPARM.XDS XPARM.XDS", and then another round of "xds_par". Five minutes later, we get:
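
If several sweeps are to be treated the same way, the edits to XDS.INP can be scripted. The following is only a sketch of one way to do this on the text of XDS.INP; the function name prepare_second_pass is hypothetical, and it relies on the assumption that the order of keyword lines in XDS.INP does not matter. Copying GXPARM.XDS to XPARM.XDS and running xds_par remain as described above.

```python
def prepare_second_pass(xds_inp_text, divergence_lines):
    """Return XDS.INP text for the optimization pass:
    - run only INTEGRATE and CORRECT,
    - switch off refinement in INTEGRATE,
    - use the supplied BEAM_DIVERGENCE / REFLECTING_RANGE lines."""
    superseded = ("JOB=", "REFINE(INTEGRATE)=", "BEAM_DIVERGENCE=", "REFLECTING_RANGE=")
    out = []
    for line in xds_inp_text.splitlines():
        if line.lstrip().startswith(superseded):
            continue  # replaced by the lines appended below
        out.append(line)
    out.append("JOB= INTEGRATE CORRECT")
    out.append("REFINE(INTEGRATE)= !")
    out.extend(divergence_lines)
    return "\n".join(out) + "\n"

# Shortened first-pass input, for illustration only
first_pass = """\
JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
DATA_RANGE=1 360
REFINE(INTEGRATE)=CELL BEAM ORIENTATION ! AXIS DISTANCE
"""
second_pass = prepare_second_pass(first_pass, [
    "BEAM_DIVERGENCE=   0.428  BEAM_DIVERGENCE_E.S.D.=   0.043",
    "REFLECTING_RANGE=  0.880  REFLECTING_RANGE_E.S.D.=  0.126",
])
print(second_pass)
```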

CORRECT.LP optimization pass

This looks a little better, with lower standard deviations, higher ISa, better R-factors and fewer misfits:

STANDARD DEVIATION OF SPOT    POSITION (PIXELS)     0.83
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES)    0.08
CRYSTAL MOSAICITY (DEGREES)     0.096

    a        b          ISa
6.439E+00  1.076E-04   37.98

...

SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
  LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    1.77        9149    4817      9501       50.7%       1.5%      1.5%     8664   49.75     2.1%     1.5%     0%   0.000       0
    1.26       30049   15348     16723       91.8%       1.5%      1.6%    29402   46.26     2.1%     1.6%     0%   0.000       0
    1.03       38920   19863     21637       91.8%       1.7%      1.7%    38114   39.61     2.4%     2.0%     0%   0.000       0
    0.89       46381   23508     25562       92.0%       2.2%      2.3%    45746   28.39     3.1%     3.2%     0%   0.000       0
    0.80       51605   26071     28868       90.3%       3.8%      3.8%    51068   18.21     5.3%     6.5%     0%   0.000       0
    0.73       56126   28314     32041       88.4%       6.6%      6.4%    55624   11.45     9.3%    12.3%     0%   0.000       0
    0.68       59735   30093     34771       86.5%      12.6%     12.3%    59284    6.34    17.8%    24.8%     0%   0.000       0
    0.63       35754   18620     37370       49.8%      24.1%     25.5%    34268    3.11    34.1%    48.9%     0%   0.000       0
    0.60        9180    5075     39730       12.8%      48.6%     54.3%     8210    1.40    68.7%   100.5%     0%   0.000       0
   total      336899  171709    246203       69.7%       2.2%      2.3%   330380   20.14     3.2%     6.9%     0%   0.000       0


NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES  344751
NUMBER OF REJECTED MISFITS                            7842
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS                  0
NUMBER OF ACCEPTED OBSERVATIONS                     336909
NUMBER OF UNIQUE ACCEPTED REFLECTIONS               171714
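
To put the size of the improvement into numbers, the summary values of the two passes can be compared side by side. A minimal sketch; all values are copied from the two CORRECT.LP excerpts for sweep e, and the helper name is illustrative.

```python
# Summary values copied from the two CORRECT.LP excerpts for sweep e
first = {"ISa": 37.18, "I/sigma": 19.58, "R-meas %": 3.3, "misfits": 8112, "selected": 343716}
optimized = {"ISa": 37.98, "I/sigma": 20.14, "R-meas %": 3.2, "misfits": 7842, "selected": 344751}

def relative_change(before, after):
    """Relative change in percent, positive if `after` is larger than `before`."""
    return 100.0 * (after - before) / before

for key in ("ISa", "I/sigma", "R-meas %"):
    print(f"{key:9s} {first[key]:7.2f} -> {optimized[key]:7.2f} "
          f"({relative_change(first[key], optimized[key]):+.1f}%)")

for label, run in (("1st pass", first), ("optimized", optimized)):
    print(f"{label}: {100 * run['misfits'] / run['selected']:.2f}% misfits")
```

The changes are at the level of a few percent, in line with the "a little better" verdict above.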


XSCALE results

A few sweeps were optimized by copying the two lines containing the beam divergence and reflecting range (mosaicity) values from INTEGRATE.LP to XDS.INP.

main table

SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
  LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    2.91       15799    2114      2147       98.5%       2.3%      2.5%    15787   73.42     2.6%     1.1%   -15%   0.705    1969
    2.06       39607    3830      3856       99.3%       2.5%      2.8%    39602   81.49     2.6%     0.9%   -11%   0.750    3794
    1.68       64423    5068      5087       99.6%       3.1%      3.3%    64415   82.27     3.3%     1.0%    -3%   0.843    5018
    1.45       72869    6147      6163       99.7%       3.2%      3.5%    72867   77.43     3.4%     1.0%     0%   0.833    6055
    1.30       71079    6652      6657       99.9%       3.3%      3.5%    71079   70.69     3.4%     1.1%     8%   0.865    6506
    1.19       74584    7287      7298       99.8%       3.2%      3.4%    74575   66.78     3.4%     1.2%     5%   0.870    7060
    1.10       84893    8268      8278       99.9%       3.5%      3.7%    84865   62.98     3.6%     1.3%     5%   0.858    7983
    1.03       87893    8585      8603       99.8%       4.2%      4.4%    87859   56.04     4.4%     1.5%     4%   0.828    8238
    0.97       92833    9457      9465       99.9%       5.2%      5.6%    92810   48.70     5.5%     1.7%     6%   0.802    9010
    0.92       83981    9911      9927       99.8%       5.7%      6.3%    83954   41.48     6.0%     2.1%     5%   0.785    9362
    0.88       74101    9620      9621      100.0%       6.3%      7.2%    74083   35.53     6.7%     2.6%     5%   0.785    9041
    0.84       81383   11511     11518       99.9%       6.8%      7.7%    81361   30.26     7.3%     3.3%     1%   0.760   10616
    0.81       67616   10240     10247       99.9%       7.1%      7.8%    67596   25.84     7.7%     4.2%     1%   0.782    9368
    0.78       74077   11807     11817       99.9%       7.2%      7.3%    74049   22.26     7.8%     5.2%     1%   0.797   10697
    0.75       86236   13831     13839       99.9%       8.5%      8.7%    86206   18.77     9.3%     6.7%     2%   0.809   12497
    0.73       64601   10481     10488       99.9%      10.4%     10.5%    64573   15.77    11.3%     8.2%     2%   0.810    9375
    0.71       71886   11727     11741       99.9%      12.8%     13.0%    71835   13.05    14.0%    10.6%     2%   0.800   10420
    0.69       80233   13156     13163       99.9%      16.5%     16.9%    80130   10.32    18.1%    13.7%     1%   0.796   11661
    0.67       84259   14746     14766       99.9%      22.0%     22.5%    84056    7.61    24.1%    19.6%     3%   0.789   12468
    0.65       60775   15579     16551       94.1%      27.5%     30.3%    59893    4.49    31.7%    32.3%     1%   0.723    8936
   total     1433128  190017    191232       99.4%       3.3%      3.5%  1431595   33.18     3.5%     3.5%     2%   0.801  170074


Comparison of data processing: published (2006) vs XDS results

                   resolution (highest resolution range)   observations      unique reflections   multiplicity   completeness (%)   Rmerge (%)   mean I/sigma
published (2006)   30-0.65Å (0.67-0.65Å)                   1331953 (12764)   187165 (6353)        7.1 (2.7)      97.6 (67.3)        4.5 (18.4)   36.2 (4.2)
XDS                30-0.65Å (0.67-0.65Å)                   1433128 (60775)   190017 (15579)       7.5 (3.9)      99.4 (94.1)        3.3 (27.5)   33.2 (4.5)
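
The multiplicity and completeness in the XDS row follow from the counts in the XSCALE main table. A minimal sketch with illustrative function names; it assumes that the 0.65 row of the XSCALE table corresponds to the 0.67-0.65 Å range quoted in the comparison.

```python
def multiplicity(observations, unique):
    """Average number of observations per unique reflection."""
    return observations / unique

def completeness(unique, possible):
    """Completeness in percent."""
    return 100.0 * unique / possible

# XDS numbers from the XSCALE main table: (observed, unique, possible)
overall = (1433128, 190017, 191232)
highest_shell = (60775, 15579, 16551)   # row labelled 0.65

for label, (obs, uniq, poss) in (("overall", overall), ("highest shell", highest_shell)):
    print(f"{label}: multiplicity {multiplicity(obs, uniq):.1f}, "
          f"completeness {completeness(uniq, poss):.1f}%")
```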


timings for processing sweep "e" as a function of MAXIMUM_NUMBER_OF_PROCESSORS and MAXIMUM_NUMBER_OF_JOBS

The following is going to be rather technical! If you are only interested in crystallography, skip this.

Using

MAXIMUM_NUMBER_OF_PROCESSORS=2
MAXIMUM_NUMBER_OF_JOBS=8

we observe for the INTEGRATE step:

total cpu time used               2063.6 sec
total elapsed wall-clock time      296.1 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=1
MAXIMUM_NUMBER_OF_JOBS=16

the times are

total cpu time used               2077.1 sec
total elapsed wall-clock time      408.2 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=4

the times are

total cpu time used               2102.8 sec
total elapsed wall-clock time      315.6 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=16 ! the default for xds_par on a 16-core machine
MAXIMUM_NUMBER_OF_JOBS=1 ! the default

the times are

total cpu time used               2833.4 sec
total elapsed wall-clock time      566.5 sec

Note, however, that this actually uses only 10 processors, since with the default DELPHI=5 and an OSCILLATION_RANGE of 0.5° a batch comprises 5/0.5 = 10 frames.
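
The arithmetic behind the figure of 10 processors can be sketched as follows. This is only an illustration of the statement above: it assumes that at most one processor per frame of a DELPHI-sized batch can be kept busy, and the function names are not XDS parameters.

```python
def frames_per_batch(delphi_deg, oscillation_range_deg):
    """Number of frames covered by one DELPHI-sized batch."""
    return round(delphi_deg / oscillation_range_deg)

def processors_used(max_processors, delphi_deg=5.0, oscillation_range_deg=0.5):
    """Upper bound on the processors kept busy per job, assuming that at most
    one processor per frame of a batch can be used."""
    return min(max_processors, frames_per_batch(delphi_deg, oscillation_range_deg))

print(processors_used(16))  # limited by the 10 frames per batch
print(processors_used(4))   # limited by MAXIMUM_NUMBER_OF_PROCESSORS
```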

Using

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=8

(thus overcommitting the available cores by a factor of 2) the times are

total cpu time used               2263.5 sec
total elapsed wall-clock time      320.8 sec

Using

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6

(thus overcommitting the available cores, but less severely) the times are

total cpu time used               2367.6 sec
total elapsed wall-clock time      267.2 sec

Thus,

MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6

performs best for a dual Xeon X5570 machine (HT enabled, thus 16 logical cores) with 24 GB of memory and a RAID1 consisting of two 1 TB SATA disks. It should be noted that the dataset has 27 GB; reading it in 296 seconds means about 92 MB/s of continuous reading. The processing time is thus limited by disk access, not by the CPU. And no, the data are not simply read from RAM (tested by "echo 3 > /proc/sys/vm/drop_caches" before the XDS run).
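
The timings above can be collected in one table to see the ranking at a glance. A minimal sketch; the numbers are the INTEGRATE timings quoted in this section, the helper name is illustrative, and 1 GB is taken as 1000 MB.

```python
# (MAXIMUM_NUMBER_OF_PROCESSORS, MAXIMUM_NUMBER_OF_JOBS): (cpu seconds, wall-clock seconds)
timings = {
    (2, 8): (2063.6, 296.1),
    (1, 16): (2077.1, 408.2),
    (4, 4): (2102.8, 315.6),
    (16, 1): (2833.4, 566.5),
    (4, 8): (2263.5, 320.8),
    (4, 6): (2367.6, 267.2),
}

# Configuration with the shortest wall-clock time
best = min(timings, key=lambda cfg: timings[cfg][1])

for (procs, jobs), (cpu, wall) in sorted(timings.items(), key=lambda item: item[1][1]):
    print(f"PROCESSORS={procs:2d} JOBS={jobs:2d}  wall {wall:6.1f} s  cpu/wall {cpu / wall:4.1f}")

def read_rate_mb_per_s(dataset_gb, wall_seconds):
    """Average read rate, taking 1 GB = 1000 MB."""
    return 1000.0 * dataset_gb / wall_seconds

rate = read_rate_mb_per_s(27, 296)
print(f"best: PROCESSORS={best[0]} JOBS={best[1]}; read rate at 296 s: {rate:.0f} MB/s")
```

With the rounded figure of 27 GB the read rate comes out slightly above 91 MB/s, in line with the roughly 92 MB/s quoted above.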