INTEGRATE: Difference between revisions

Latest revision as of 13:14, 7 February 2011

This article is under construction!


Overview

INTEGRATE is the most important step ("JOB") of XDS. This step writes the logfile INTEGRATE.LP. Its task is (for each reflection)

  1. to calculate the frame(s) where it contributes, and the pixel positions
  2. to integrate, using profile-fitting, the observed pixel contents
  3. to write the observed intensities, their standard deviations, their positions and a number of less important data to INTEGRATE.HKL

While it does this, it also refines all geometrical parameters of the diffraction experiment.

Some explanations of definitions given by Wolfgang Kabsch (slightly edited)

  • Observed spot coordinates
  1. A pixel is defined as 'strong' if its content is above the mean plus a certain number (say, 3) of estimated standard deviations of the surrounding background pixel values.
  2. Two 'strong' pixels belong to the same spot if they are found adjacent in 3 dimensions; like x,y,z :: x+1,y,z :: x,y,z+1 :: etc.
  3. A spot is defined as the set of all 'strong' pixels being adjacent, directly or indirectly.
  4. Observed spot coordinates are defined as the centroid of the 'strong' pixels (after background subtraction); spatial corrections available from the X,Y-lookup tables are added to the centroids.

This definition covers both spots that extend over many images and spots that appear on a single image only. Note that XDS uses z-centroids instead of phi-angles about the spindle axis. The definition allows for bimodal spot shapes as well.

Note that this means that for weak reflections there are no observed spot coordinates.

  • Calculated spot coordinates

These are the x,y,z coordinates of the centroid of Gaussians (i.e. unimodal) centered at the ideal Bragg peak x,y,phi using initial guesses for the variances (which are later replaced by estimates using the observed images).

The mapping of each pixel to the Ewald sphere uses a local, reflection-specific coordinate system whose origin corresponds to the ideal Bragg peak x,y,phi. This is used for profile fitting and has nothing to do with the definitions of observed and calculated spot coordinates.

XDS does not adjust the integration boxes so as to center them individually on the observed reflections: it only tries to minimize the deviations between observed and calculated spot coordinates by adjusting about a dozen diffraction parameters (those given by REFINE(INTEGRATE)) for the reflections in a certain range of frames (DELPHI).
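
The definition of observed spot coordinates given above (strong pixels, adjacency in three dimensions, background-subtracted centroid) can be illustrated with a minimal Python sketch. It is not the XDS implementation: the function name find_spots and its data layout are hypothetical, "adjacent" is assumed to mean sharing a face in x, y or z, and the spatial corrections from the X,Y-lookup tables are left out.

```python
from collections import deque

# Assumption: two pixels are "adjacent" if they share a face in x, y or z.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def find_spots(strong, counts, background):
    """Group strong pixels into spots and return one centroid per spot.

    strong     : set of (x, y, z) tuples already flagged as 'strong'
    counts     : dict mapping (x, y, z) to the raw pixel value
    background : dict mapping (x, y, z) to the estimated local background
    (hypothetical data layout, chosen for illustration only)
    """
    unvisited = set(strong)
    centroids = []
    while unvisited:
        # Start a new spot and collect everything connected to the seed,
        # directly or indirectly (breadth-first search).
        seed = unvisited.pop()
        queue = deque([seed])
        spot = [seed]
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                p = (x + dx, y + dy, z + dz)
                if p in unvisited:
                    unvisited.remove(p)
                    queue.append(p)
                    spot.append(p)
        # Centroid weighted by background-subtracted counts.
        weights = [counts[p] - background[p] for p in spot]
        total = sum(weights)
        centroids.append(
            tuple(sum(w * p[i] for w, p in zip(weights, spot)) / total for i in range(3))
        )
    return sorted(centroids)


if __name__ == "__main__":
    strong = {(0, 0, 0), (1, 0, 0), (5, 5, 2), (5, 5, 3)}
    counts = {(0, 0, 0): 30, (1, 0, 0): 10, (5, 5, 2): 20, (5, 5, 3): 20}
    background = {(0, 0, 0): 0, (1, 0, 0): 0, (5, 5, 2): 10, (5, 5, 3): 10}
    for centroid in find_spots(strong, counts, background):
        print(centroid)
```

The second spot in the example spans two frames, so its z-centroid falls between them; this is the sense in which z-centroids replace phi-angles.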

How does INTEGRATE work?

The integration algorithm (see [1]) proceeds along the following lines:

  1. pixel-labelling: the x,y,z center of each pixel of the detector (z corresponds to phi, and the z pixelsize is delta-phi) is assigned to its closest (predicted) reflection in reciprocal space. As a consequence, each pixel of the detector is used for at most one reflection.
  2. transformation to local coordinate system: some of these pixels will mostly allow the background estimation, others will mostly contribute to the integration area (but there is not a 1:1 relationship).
  3. average profile: the average profile is formed on a grid (using the 3D local coordinate system) from strong reflections. The signal part of the profile is defined by those gridpoints of the average profile that are above a threshold (called "CUT" in XDS.INP).
  4. estimating the intensity: for each reflection, the background is estimated, and the 3D profile is assembled from the pixels contributing to it. Pixels which are mostly background but whose counts are higher than expected (e.g. due to overlap) are rejected.
  5. handling overlap: not all pixels of a reflection, which would be required to assemble its full profile (whose shape is given by the average profile), may have been observed due to step 1. Therefore, in another pass, for each reflection, the observed fraction of its theoretical profile is calculated. If this fraction (column "PEAK" in XDS_ASCII.HKL) is less than a threshold (called "MINPK" in XDS.INP), this reflection will be discarded ("too much overlap"). If it is above MINPK, the observed intensity (from the incomplete profile) is scaled up with the inverse of the fraction.
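
Step 1 (pixel-labelling) can be sketched as a nearest-neighbour assignment. The sketch below is a simplification under stated assumptions: the name label_pixels is hypothetical, and plain Euclidean distance between x,y,z pixel centers and predicted spot positions stands in for the distance in reciprocal space that the text describes.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two (x, y, z) points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def label_pixels(pixel_centers, predicted):
    """Assign every pixel center to its closest predicted reflection.

    pixel_centers : iterable of (x, y, z) pixel centers (z counts frames)
    predicted     : dict mapping an hkl tuple to its predicted (x, y, z)
    Returns a dict mapping each pixel center to exactly one hkl.
    (Assumption: distance is measured in pixel/frame units, not in
    reciprocal space as XDS does.)
    """
    labels = {}
    for p in pixel_centers:
        labels[p] = min(predicted, key=lambda hkl: squared_distance(p, predicted[hkl]))
    return labels


if __name__ == "__main__":
    # Two reflections at the same detector position, separated only in z (phi).
    predicted = {(1, 2, 3): (10.0, 10.0, 3.0), (1, 2, 4): (10.0, 10.0, 5.2)}
    pixels = [(10, 10, 3), (10, 10, 4), (10, 10, 5)]
    for pixel, hkl in label_pixels(pixels, predicted).items():
        print(pixel, "->", hkl)
```

Because each pixel receives exactly one label, a reflection whose neighbour lies close by simply collects fewer pixels; nothing in this step looks at intensities.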

Concerning overlap of reflections, this means that:

  • there is no conceptual difference in XDS between overlap in x,y (due to a detector placed too close, or smeared spots), and overlap by phi rotation (due to too large delta-phi, or high mosaicity).
  • the program does not look around each reflection to detect an overlap situation, it just gathers the pixels for each reflection.
  • if two reflections differ in phi, but have the same position on the detector, then, as a consequence of step 1 the pixels are assigned to that reflection whose phi-calc is closest to the phi of the frame considered. The relative intensities of these reflections are not taken into account because at this stage they are unknown! Thus, no deconvolution is attempted.
  • as a user, when your crystal-detector distance was chosen too short, or the reflections are very broad, or if the crystal has a high mosaicity (all of which result in many overlaps), you may try reducing MINPK to some percentage between 75 (the default) and (say) 50. This will result in higher completeness, but you should monitor the quality of the resulting data. Conversely, if you raise MINPK above 75 you will discard more reflections, but the resulting dataset may be cleaner - again: check the statistics. In particular, the latest versions of XDSSTAT print out R_meas as a function of PEAK and intensity.
  • this method degrades if the average profiles cannot be completely formed, as the scaling-up relies on their accuracy. This may happen if the reflections are too close in x,y and, at the same time, the mosaicity is high (such that no lunes exist, with edges that help to construct the average profiles). It is therefore useful to check the printed profiles in INTEGRATE.LP. Again, the latest versions of XDSSTAT help to find the best compromise between data quality and completeness.
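
The MINPK test and the scaling-up of step 5 can be written out as a short Python sketch. The function name apply_minpk and the dictionary layout are hypothetical; PEAK is taken as a percentage as described above, and keeping a reflection whose PEAK exactly equals MINPK is an assumption of this sketch.

```python
def apply_minpk(reflections, minpk=75.0):
    """Discard reflections with too small an observed profile fraction
    and scale up the intensity of the remaining ones.

    reflections : list of dicts with keys "hkl", "intensity" and "peak",
                  where "peak" is the observed percentage of the profile
                  (hypothetical layout, for illustration only)
    minpk       : threshold in percent; 75 is the default named in the text
    """
    kept = []
    for r in reflections:
        if r["peak"] < minpk:
            continue  # "too much overlap": reflection is discarded
        scale = 100.0 / r["peak"]  # inverse of the observed fraction
        kept.append({"hkl": r["hkl"], "intensity": r["intensity"] * scale, "peak": r["peak"]})
    return kept


if __name__ == "__main__":
    data = [
        {"hkl": (1, 0, 0), "intensity": 800.0, "peak": 80.0},
        {"hkl": (2, 0, 0), "intensity": 600.0, "peak": 60.0},
    ]
    print(len(apply_minpk(data)), "reflection(s) kept at MINPK=75")
    print(len(apply_minpk(data, minpk=50.0)), "reflection(s) kept at MINPK=50")
```

Lowering the threshold keeps the second reflection, but its intensity is then extrapolated from only 60% of the profile, which is why the scaled value depends so strongly on the accuracy of the average profile.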