CORRECT.LP: Difference between revisions

13,628 bytes added ,  12 October 2018
== [[Space group determination]] ==
 
The approach to [[space group determination]] is well explained in CORRECT.LP :
XDS adopts the following approach.
(1) it looks for possible symmetries of the crystal lattice
(2) it computes a redundancy independent R-factor for all enantiomorphous
    point groups compatible with the observed lattice symmetry.
(3) it selects the group which explains the intensity data at an acceptable,
    redundancy-independent R-factor (Rmeas, Rrim) using a minimum number of
    unique reflections.
This approach does not test for the presence of screw axes. Consequently,
orthorhombic cell axes will be specified in increasing length (following
conventions), despite the possibility that different assignments for the
cell axes could become necessary for space groups P222(1) and P2(1)2(1)2
containing one or two screw axes, respectively.
The user can always override the automatic decisions by specifying the
correct space group number and unit cell constants in XDS.INP and repeating
the CORRECT step of XDS. This provides a simple way to rename orthorhombic
cell constants if screw axes are present.
In addition, the user has the option to specify in XDS.INP
(a) a reference data set or
(b) a reindexing transformation or
(c) the three basis vectors (if known from processing a previous data set
taken at the same crystal orientation in a multi-wavelength experiment).
These features of XDS are useful for resolving the issue of alternative
settings of polar or rhombohedral cells (like P4, P6, R3).
 
Please note the sentence: '''This approach does not test for the presence of screw axes.''' The information about those reflections that may indicate screw axes is actually given in the table '''"REFLECTIONS OF TYPE H,0,0  0,K,0  0,0,L OR EXPECTED TO BE ABSENT (*)"''' (near the end of CORRECT.LP) but there is no automatic evaluation of that table that would result in screw axis assignment.
 
Therefore, the '''space group determination of XDS only evaluates the possible point groups that are compatible with the lattice symmetry'''. The [[space group determination]] of XDS suggests only one representative space group for each point group; the other space groups belonging to the same point group are equally possible, but are not listed! For example, if XDS suggests space group 89 (P422), then any other space group of point group 422 (space groups 90, 91, 92, 93, 94, 95 and 96) is equally possible.
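
The following Python sketch makes this concrete for the primitive tetragonal (tP) case only. The space group numbers and symbols are copied from the table printed in CORRECT.LP; the names TP_POINT_GROUPS and same_point_group are illustrative and not part of XDS.

```python
# Space groups of the primitive tetragonal lattice (tP), grouped by point group.
# Numbers and symbols are copied from the table printed in CORRECT.LP.
TP_POINT_GROUPS = {
    "4": {75: "P4", 76: "P4(1)", 77: "P4(2)", 78: "P4(3)"},
    "422": {
        89: "P422", 90: "P42(1)2", 91: "P4(1)22", 92: "P4(1)2(1)2",
        93: "P4(2)22", 94: "P4(2)2(1)2", 95: "P4(3)22", 96: "P4(3)2(1)2",
    },
}


def same_point_group(space_group_number):
    """Return {number: symbol} for all tP space groups sharing the point group."""
    for members in TP_POINT_GROUPS.values():
        if space_group_number in members:
            return dict(members)
    raise KeyError(f"space group {space_group_number} is not in this tP-only table")


# XDS reports 89 (P422); every entry returned here remains a candidate.
candidates = same_point_group(89)
for number in sorted(candidates):
    print(number, candidates[number])
```

Which of these candidates is correct has to be decided from the systematic absences or at a later stage of structure solution.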
 
For a more automated determination, it is suggested to run
 echo SETTING SYMMETRY-BASED | pointless XDS_ASCII.HKL
 
Please note that SETTING SYMMETRY-BASED overrides a [[pointless]] default that would lead to ambiguity between space group numbers and space group symbols for space group numbers 5, 17 and 18. The mapping of numbers and names is:
 
****** LATTICE SYMMETRY IMPLICATED BY SPACE GROUP SYMMETRY ******
BRAVAIS-          POSSIBLE SPACE-GROUPS FOR PROTEIN CRYSTALS
  TYPE                    [SPACE GROUP NUMBER,SYMBOL]
  aP      [1,P1]
  mP      [3,P2] [4,P2(1)]
mC,mI    [5,C2]
  oP      [16,P222] [17,P222(1)] [18,P2(1)2(1)2] [19,P2(1)2(1)2(1)]
  oC      [21,C222] [20,C222(1)]
  oF      [22,F222]
  oI      [23,I222] [24,I2(1)2(1)2(1)]
  tP      [75,P4] [76,P4(1)] [77,P4(2)] [78,P4(3)] [89,P422] [90,P42(1)2]
          [91,P4(1)22] [92,P4(1)2(1)2] [93,P4(2)22] [94,P4(2)2(1)2]
          [95,P4(3)22] [96,P4(3)2(1)2]
  tI      [79,I4] [80,I4(1)] [97,I422] [98,I4(1)22]
  hP      [143,P3] [144,P3(1)] [145,P3(2)] [149,P312] [150,P321] [151,P3(1)12]
          [152,P3(1)21] [153,P3(2)12] [154,P3(2)21] [168,P6] [169,P6(1)]
          [170,P6(5)] [171,P6(2)] [172,P6(4)] [173,P6(3)] [177,P622]
          [178,P6(1)22] [179,P6(5)22] [180,P6(2)22] [181,P6(4)22] [182,P6(3)22]
  hR      [146,R3] [155,R32]
  cP      [195,P23] [198,P2(1)3] [207,P432] [208,P4(2)32] [212,P4(3)32]
          [213,P4(1)32]
  cF      [196,F23] [209,F432] [210,F4(1)32]
  cI      [197,I23] [199,I2(1)3] [211,I432] [214,I4(1)32]
 
== Cell parameter refinement ==
 
The output of this step may look like this:
<pre>
******************************************************************************
  REFINEMENT OF DIFFRACTION PARAMETERS USING ALL IMAGES
******************************************************************************
 
 
REFINED VALUES OF DIFFRACTION PARAMETERS DERIVED FROM    197690 INDEXED SPOTS
REFINED PARAMETERS:  POSITION BEAM AXIS ORIENTATION CELL
STANDARD DEVIATION OF SPOT    POSITION (PIXELS)    0.50
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES)    0.03
SPACE GROUP NUMBER    199
UNIT CELL PARAMETERS    77.443    77.443    77.443  90.000  90.000  90.000
E.S.D. OF CELL PARAMETERS  1.8E-02 1.8E-02 1.8E-02 0.0E+00 0.0E+00 0.0E+00
REC. CELL PARAMETERS  0.012913  0.012913  0.012913  90.000  90.000  90.000
COORDINATES OF UNIT CELL A-AXIS    45.490    47.263  -41.162
COORDINATES OF UNIT CELL B-AXIS  -45.887    59.759    17.904
COORDINATES OF UNIT CELL C-AXIS    42.690    13.873    63.107
CRYSTAL MOSAICITY (DEGREES)    0.048
LAB COORDINATES OF ROTATION AXIS  0.999999  0.001525 -0.000039
DIRECT BEAM COORDINATES (REC. ANGSTROEM)  -0.003439  0.001125  0.999994
DETECTOR COORDINATES (PIXELS) OF DIRECT BEAM    2299.53  2261.76
DETECTOR ORIGIN (PIXELS) AT                    2308.69  2258.76
CRYSTAL TO DETECTOR DISTANCE (mm)      199.65
LAB COORDINATES OF DETECTOR X-AXIS  1.000000  0.000000  0.000000
LAB COORDINATES OF DETECTOR Y-AXIS  0.000000  1.000000  0.000000
 
 
THE DATA COLLECTION STATISTICS REPORTED BELOW ASSUMES:
SPACE_GROUP_NUMBER=  199
UNIT_CELL_CONSTANTS=    77.44    77.44    77.44  90.000  90.000  90.000
</pre>
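
As a plausibility check, the numbers in this block can be compared with each other. The Python sketch below uses only values copied from the listing above and verifies that the lengths of the three axis vectors reproduce the refined cell edge, that the axes are mutually perpendicular (as expected for the cubic space group 199), and that the reciprocal cell parameter is 1/a. The helper names norm and angle_deg are illustrative.

```python
import math

# Values copied from the "REFINED VALUES OF DIFFRACTION PARAMETERS" listing.
cell_a = 77.443
a_axis = (45.490, 47.263, -41.162)
b_axis = (-45.887, 59.759, 17.904)
c_axis = (42.690, 13.873, 63.107)


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(x * x for x in v))


def angle_deg(u, v):
    """Angle between two 3-vectors in degrees."""
    dot = sum(x * y for x, y in zip(u, v))
    return math.degrees(math.acos(dot / (norm(u) * norm(v))))


for name, axis in (("a", a_axis), ("b", b_axis), ("c", c_axis)):
    print(f"|{name}| = {norm(axis):.3f}")

print(f"alpha = {angle_deg(b_axis, c_axis):.3f}")
print(f"beta  = {angle_deg(a_axis, c_axis):.3f}")
print(f"gamma = {angle_deg(a_axis, b_axis):.3f}")

# For a cubic cell the reciprocal cell edge is simply 1/a.
print(f"a*    = {1.0 / cell_a:.6f}")
```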
 
== Scaling information ==
 
<pre>
******************************************************************************
          CORRECTION FACTORS AS FUNCTION OF IMAGE NUMBER & RESOLUTION
******************************************************************************
 
RECIPROCAL CORRECTION FACTORS FOR INPUT DATA SETS MERGED TO
OUTPUT FILE: XDS_ASCII.HKL                                   
 
THE CALCULATIONS ASSUME        FRIEDEL'S_LAW= TRUE
TOTAL NUMBER OF CORRECTION FACTORS DEFINED      40
DEGREES OF FREEDOM OF CHI^2 FIT              2649.6
CHI^2-VALUE OF FIT OF CORRECTION FACTORS      1.557
NUMBER OF CYCLES CARRIED OUT                      3
 
CORRECTION FACTORS for visual inspection by XDS-Viewer DECAY.cbf         
XMIN=    0.2 XMAX=    99.7 NXBIN=    2
YMIN= 0.00100 YMAX= 0.50246 NYBIN=  20
NUMBER OF REFLECTIONS USED FOR DETERMINING CORRECTION FACTORS      4832
 
 
******************************************************************************
  CORRECTION FACTORS AS FUNCTION OF X (fast) & Y(slow) IN THE DETECTOR PLANE
******************************************************************************
 
RECIPROCAL CORRECTION FACTORS FOR INPUT DATA SETS MERGED TO
OUTPUT FILE: XDS_ASCII.HKL                                   
 
THE CALCULATIONS ASSUME        FRIEDEL'S_LAW= TRUE
TOTAL NUMBER OF CORRECTION FACTORS DEFINED      90
DEGREES OF FREEDOM OF CHI^2 FIT              2640.3
CHI^2-VALUE OF FIT OF CORRECTION FACTORS      1.456
NUMBER OF CYCLES CARRIED OUT                      4
 
CORRECTION FACTORS for visual inspection by XDS-Viewer MODPIX.cbf         
XMIN=  232.6 XMAX=  3623.4 NXBIN=    9
YMIN=  313.3 YMAX=  4064.2 NYBIN=  10
NUMBER OF REFLECTIONS USED FOR DETERMINING CORRECTION FACTORS      4832
 
 
******************************************************************************
  CORRECTION FACTORS AS FUNCTION OF IMAGE NUMBER & DETECTOR SURFACE POSITION
******************************************************************************
 
RECIPROCAL CORRECTION FACTORS FOR INPUT DATA SETS MERGED TO
OUTPUT FILE: XDS_ASCII.HKL                                   
 
THE CALCULATIONS ASSUME        FRIEDEL'S_LAW= TRUE
TOTAL NUMBER OF CORRECTION FACTORS DEFINED      26
DEGREES OF FREEDOM OF CHI^2 FIT              2652.0
CHI^2-VALUE OF FIT OF CORRECTION FACTORS      1.418
NUMBER OF CYCLES CARRIED OUT                      3
 
CORRECTION FACTORS for visual inspection by XDS-Viewer ABSORP.cbf         
XMIN=    0.2 XMAX=    99.7 NXBIN=    2
DETECTOR_SURFACE_POSITION=    1928    2189
DETECTOR_SURFACE_POSITION=    2504    2826
DETECTOR_SURFACE_POSITION=    1352    2826
DETECTOR_SURFACE_POSITION=    1352    1552
DETECTOR_SURFACE_POSITION=    2504    1552
DETECTOR_SURFACE_POSITION=    3231    2786
DETECTOR_SURFACE_POSITION=    2468    3630
DETECTOR_SURFACE_POSITION=    1388    3630
DETECTOR_SURFACE_POSITION=    625    2786
DETECTOR_SURFACE_POSITION=    625    1592
DETECTOR_SURFACE_POSITION=    1388    747
DETECTOR_SURFACE_POSITION=    2468    747
DETECTOR_SURFACE_POSITION=    3231    1592
NUMBER OF REFLECTIONS USED FOR DETERMINING CORRECTION FACTORS      4832
</pre>
=== Details about the error model ===
 
<pre>
******************************************************************************
    CORRECTION PARAMETERS FOR THE STANDARD ERROR OF REFLECTION INTENSITIES
******************************************************************************
 
The variance v0(I) of the intensity I obtained from counting statistics is
replaced by v(I)=a*(v0(I)+b*I^2). The model parameters a, b are chosen to
minimize the discrepancies between v(I) and the variance estimated from
sample statistics of symmetry related reflections. This model implicates
an asymptotic limit ISa=1/SQRT(a*b) for the highest I/Sigma(I) that the
experimental setup can produce (Diederichs (2010) Acta Cryst D66, 733-740).
 
    a        b          ISa
1.042E+00  2.177E-04  66.41
</pre>
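
The error model quoted above can be evaluated directly. The Python sketch below recomputes ISa from the printed a and b, and shows how I/Sigma(I) approaches that limit for strong reflections. The assumption that the counting-statistics variance v0(I) equals I is made for illustration only and is not how XDS obtains v0.

```python
import math

# Model parameters as printed in CORRECT.LP (rounded to four digits there).
a = 1.042e+00
b = 2.177e-04


def corrected_variance(v0, intensity):
    """v(I) = a * (v0(I) + b * I^2), the model stated in CORRECT.LP."""
    return a * (v0 + b * intensity ** 2)


# Asymptotic limit of I/Sigma(I) for very strong reflections.
isa = 1.0 / math.sqrt(a * b)
print(f"ISa = {isa:.2f}")

# Illustration only: assume v0(I) = I (pure Poisson counting, hypothetical).
for intensity in (1e2, 1e4, 1e6, 1e8):
    sigma = math.sqrt(corrected_variance(intensity, intensity))
    print(f"I = {intensity:.0e}   I/Sigma = {intensity / sigma:.2f}")
```

The value computed from the rounded a and b is close to, but not exactly, the printed 66.41; the small difference presumably comes from the rounding of a and b in the log file.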
 
== Statistics of reflections ==
 
Near the top of CORRECT.LP we find:
  531781 REFLECTIONS ON FILE "INTEGRATE.HKL"
      0 CORRUPTED REFLECTION RECORDS (IGNORED)
      0 REFLECTIONS INCOMPLETE OR OUTSIDE IMAGE RANGE      1 ...    1799
      0 OVERLOADED REFLECTIONS (IGNORED)
      81 REFLECTIONS OUTSIDE ACCEPTED RESOLUTION RANGES
                OR TOO CLOSE TO ROTATION AXIS (IGNORED)
  531700 REFLECTIONS ACCEPTED
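
The bookkeeping in these lines can be verified with a few lines of Python. The sketch below extracts the leading count of each line of the excerpt above and checks that the number of accepted reflections equals the number on file minus all ignored categories. The regular expression is an assumption that fits this excerpt, not a general CORRECT.LP parser.

```python
import re

# Excerpt copied from the top of CORRECT.LP.
HEADER = """\
  531781 REFLECTIONS ON FILE "INTEGRATE.HKL"
       0 CORRUPTED REFLECTION RECORDS (IGNORED)
       0 REFLECTIONS INCOMPLETE OR OUTSIDE IMAGE RANGE       1 ...    1799
       0 OVERLOADED REFLECTIONS (IGNORED)
      81 REFLECTIONS OUTSIDE ACCEPTED RESOLUTION RANGES
                 OR TOO CLOSE TO ROTATION AXIS (IGNORED)
  531700 REFLECTIONS ACCEPTED
"""

# Take the integer that starts each line; the continuation line has none.
counts = [int(m.group(1))
          for m in re.finditer(r"^[ \t]*(\d+)[ \t]+[A-Z]", HEADER, re.MULTILINE)]

total, *ignored, accepted = counts
print(total, ignored, accepted)
print("consistent:", total - sum(ignored) == accepted)
```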
 
== Statistics of observations ==
 
XDS, like SCALA/aimless and d*TREK, gives statistics about unaveraged (individual observations) and averaged ("merged") quantities, but in different tables.
The unaveraged values are in a table that is fine-grained in terms of resolution, at the beginning of CORRECT.LP. The Sigma values in that table are corrected to match the RMS scatter.


at first the definitions of the quantities in the table are given, and then the table itself is printed.


Specifically, the heading of the table which talks about the unaveraged data ("observations") looks like this:


   I/Sigma  = mean intensity/Sigma of a reflection in shell
                                     observed  expected
 
   48.268  17.853      9.63    0.97      5.06      6.10     865     868      44
   17.853  13.079     10.02    0.97      5.22      6.14    1301    1305      81
   13.079  10.812      9.83    1.10      5.56      5.94    1374    1388      99
   10.812   9.423      9.88    1.09      5.32      6.03    1820    1825     108
    9.423   8.460      9.56    1.07      6.03      6.21    2087    2101     167
  .... (many resolution shells deleted for brevity)


== Statistics of unique reflections ==


Later tables talk about the averaged ("merged") intensities. Please note that I/SIGMA means "average of I/SIGMA", not "average of I" / "average of SIGMA".
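
The distinction matters because the two kinds of average can differ considerably when intensities and their Sigma values vary a lot within a resolution shell. A minimal Python sketch with invented numbers (three hypothetical merged reflections, not taken from any data set):

```python
# Hypothetical merged intensities and their standard deviations in one shell.
intensities = [100.0, 10.0, 1.0]
sigmas = [5.0, 2.0, 1.0]

n = len(intensities)

# What the I/SIGMA column means: the average of the individual ratios.
mean_of_ratios = sum(i / s for i, s in zip(intensities, sigmas)) / n

# What it does NOT mean: the average intensity divided by the average Sigma.
ratio_of_means = (sum(intensities) / n) / (sum(sigmas) / n)

print(f"average of I/SIGMA            = {mean_of_ratios:.3f}")
print(f"average I / average SIGMA     = {ratio_of_means:.3f}")
```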


The heading of the table for the averaged intensities reads:


  R-FACTOR
 
  R-meas   = redundancy independent R-factor (intensities)
             Diederichs & Karplus (1997), Nature Struct. Biol. 4, 269-275.
 
  CC(1/2)  = percentage of correlation between intensities from
             random half-datasets. Correlation significant at
             the 0.1% level is marked by an asterisk.
             Karplus & Diederichs (2012), Science 336, 1030-33
  Anomal   = percentage of correlation between random half-sets
   Corr      of anomalous intensity differences. Correlation
             significant at the 0.1% level is marked.
  SigAno   = mean anomalous difference in units of its estimated
             standard deviation (|F(+)-F(-)|/Sigma). F(+), F(-)
             are structure factor estimates obtained from the
             merged intensity observations in each parity class.
  Nano     = Number of unique reflections used to calculate
             Anomal_Corr & SigAno. At least two observations
             for each (+ and -) parity are required.
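
To illustrate what "redundancy independent" means, the Python sketch below computes the conventional merging R-factor and R-meas for a few invented groups of symmetry-equivalent observations, using the definition from Diederichs & Karplus (1997): each group's sum of absolute deviations is multiplied by sqrt(n/(n-1)), where n is the number of observations in the group. The data and the function name r_factors are hypothetical, and the sketch does not claim to reproduce which observations XDS actually compares.

```python
import math


def r_factors(groups):
    """Return (R-merge, R-meas) for lists of symmetry-equivalent intensities."""
    num_merge = num_meas = denom = 0.0
    for obs in groups:
        n = len(obs)
        if n < 2:
            continue  # a single observation says nothing about scatter
        mean = sum(obs) / n
        deviation = sum(abs(i - mean) for i in obs)
        num_merge += deviation
        # The factor sqrt(n/(n-1)) removes the dependence on multiplicity.
        num_meas += math.sqrt(n / (n - 1)) * deviation
        denom += sum(obs)
    return num_merge / denom, num_meas / denom


# Invented observations: three unique reflections measured 2, 4 and 2 times.
groups = [[100.0, 110.0], [50.0, 45.0, 55.0, 50.0], [10.0, 14.0]]
r_merge, r_meas = r_factors(groups)
print(f"R-merge = {100 * r_merge:.1f}%   R-meas = {100 * r_meas:.1f}%")
```

Because sqrt(n/(n-1)) is always larger than 1, R-meas is never smaller than the conventional R-factor computed from the same observations.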


The table itself is:
       NOTE:      Friedel pairs are treated as different reflections.
 
  SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
  RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  CC(1/2)  Anomal  SigAno   Nano
    LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr
 
     5.72       23750    7284      7488       97.3%       6.6%      6.6%    23666   14.59      7.9%    99.3*   33*   1.043    3033
     4.06       41574   12997     13384       97.1%      10.0%      8.3%    41476   11.40     12.1%    98.3*   45*   1.341    5775
     3.32       56679   16961     17336       97.8%      16.8%     15.4%    56494    6.49     20.1%    97.9*   31*   1.079    7697
     2.88       67173   20272     20497       98.9%      38.4%     39.0%    66875    2.91     45.9%    93.1*   19*   0.840    9333
     2.57       79365   23100     23197       99.6%      77.6%     85.3%    79063    1.46     92.1%    75.3*    5    0.701   10761
     2.35       86431   25554     25631       99.7%     128.9%    146.7%    86014    0.86    153.2%    54.7*    3    0.633   11894
     2.18       83863   27529     27946       98.5%     197.0%    230.0%    81669    0.49    237.7%    31.6*   -1    0.575   11422
     2.04       51338   23815     29966       79.5%     286.2%    343.0%    43478    0.26    361.1%    15.1*    0    0.526    5523
     1.92       25803   15877     31898       49.8%     483.3%    577.5%    17026    0.12    635.3%     3.8     2    0.519    1856
    total      515976  173389    197343       87.9%      27.8%     29.3%   495761    2.89     33.5%    98.2*   19*   0.781   67294
NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES  531700
NUMBER OF REJECTED MISFITS                          15698
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS                  0
NUMBER OF ACCEPTED OBSERVATIONS                    516002
  NUMBER OF UNIQUE ACCEPTED REFLECTIONS              173398
 
Why is there a discrepancy between "total 515976 173389" ''versus'' "NUMBER OF ACCEPTED OBSERVATIONS 516002" and "NUMBER OF UNIQUE ACCEPTED REFLECTIONS 173398"? The reason is that the (higher) numbers ''below'' the table ''include'' observations (and unique reflections) with I < -3*sigma(I), whereas the numbers ''in'' the table refer only to those reflections which should be used downstream (for phasing and refinement). Indeed, XDSCONV filters out those unique reflections which have I < -3*sigma(I).
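
The size of the discrepancy follows directly from the numbers quoted above. The Python sketch below computes the differences and shows the SIGNAL/NOISE >= -3.0 criterion from the table heading applied to a few invented reflections; the function name passes_cutoff and the example values are hypothetical.

```python
# Numbers copied from CORRECT.LP: "total" row of the table vs. lines below it.
observations_in_table, unique_in_table = 515976, 173389
observations_below, unique_below = 516002, 173398

print("observations with I < -3*sigma(I):", observations_below - observations_in_table)
print("unique reflections affected:      ", unique_below - unique_in_table)


def passes_cutoff(intensity, sigma, cutoff=-3.0):
    """True if the reflection satisfies SIGNAL/NOISE >= cutoff."""
    return intensity / sigma >= cutoff


# Invented (I, sigma) pairs to show which ones would appear in the table.
examples = [(120.0, 10.0), (-5.0, 4.0), (-40.0, 10.0)]
for intensity, sigma in examples:
    print(intensity, sigma, passes_cutoff(intensity, sigma))
```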
 
It should also be noted that the alien rejection controlled by [http://xds.mpimf-heidelberg.mpg.de/html_doc/xds_parameters.html#REJECT_ALIEN= REJECT_ALIEN=] (default 20) is performed ''after'' this table is made. So the number of reflections which you will get from [[XDSCONV]] is not the same as reported here. If you want to see the statistics of the reflections which will be converted by XDSCONV (and thus used for further processing), you should prepare REMOVE.HKL to explicitly specify the reflections which will be thrown away, and run the CORRECT step again.


At the bottom of CORRECT.LP we find:
NUMBER OF UNIQUE ALIEN REFLECTIONS WITH A Z-SCORE ABOVE LIMIT      162
(ALIENS ABOVE LIMIT (REJECT_ALIEN=      20.0) ARE MARKED INVALID)
NUMBER OF REFLECTION RECORDS ON OUTPUT FILE "XDS_ASCII.HKL"      531700
NUMBER OF ACCEPTED OBSERVATIONS (INCLUDING SYSTEMATIC ABSENCES)  515712
NUMBER OF REJECTED MISFITS & ALIENS (marked by -1*SIGMA(IOBS))    15988


So, the program indicates quite clearly what the statistics refer to.
The file XDS_ASCII.HKL actually has all 531700 reflections.
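
The numbers at the bottom of CORRECT.LP are consistent with those given after the table, as the following Python sketch shows. It also illustrates how rejected records could be counted in a reflection file by looking for a negative SIGMA(IOBS). The toy records are invented, and the assumption that SIGMA(IOBS) is the fifth column (after H, K, L and IOBS) and that header and footer lines start with "!" should be checked against the header of the actual XDS_ASCII.HKL file.

```python
# Numbers copied from CORRECT.LP.
records_on_file = 531700
accepted_final = 515712
rejected_final = 15988          # misfits and aliens
misfits_before_aliens = 15698   # reported after the statistics table

print("accepted + rejected == records:", accepted_final + rejected_final == records_on_file)
print("observations rejected in addition to misfits:", rejected_final - misfits_before_aliens)


def count_records(lines):
    """Return (total, rejected); rejected records carry a negative sigma."""
    total = rejected = 0
    for line in lines:
        if not line.strip() or line.lstrip().startswith("!"):
            continue  # skip blank lines and header/footer lines
        sigma = float(line.split()[4])  # assumed column order: H K L IOBS SIGMA
        total += 1
        if sigma < 0.0:
            rejected += 1
    return total, rejected


# Invented records in the assumed layout, for illustration only.
toy_records = [
    "!END_OF_HEADER",
    "     1     0     2  1.234E+03  5.600E+01",
    "     1     0     3  8.800E+02 -4.100E+01",
    "     2     1     0  4.500E+01  9.000E+00",
    "!END_OF_DATA",
]
print(count_records(toy_records))
```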