# INTEGRATE

## Overview

INTEGRATE is the most important step ("JOB") of XDS. This step writes the log file INTEGRATE.LP. For each reflection, its task is

1. to calculate the frame(s) where it contributes, and the pixel positions
2. to integrate, using profile-fitting, the observed pixel contents
3. to write the observed intensities, their standard deviations, their positions and a number of less important data to INTEGRATE.HKL

While it does this, it also refines all geometrical parameters of the diffraction experiment.

## Some explanations of definitions given by Wolfgang Kabsch (slightly edited)

• Observed spot coordinates
1. A pixel is defined as 'strong' if its content is above the mean plus a certain number (say, 3) of estimated standard deviations of the surrounding background pixel values.
2. Two 'strong' pixels belong to the same spot if they are found adjacent in 3 dimensions; like x,y,z :: x+1,y,z :: x,y,z+1 :: etc.
3. A spot is defined as the set of all 'strong' pixels being adjacent, directly or indirectly.
4. Observed spot coordinates are defined as the centroid of the 'strong' pixels (after background subtraction); spatial corrections available from the X,Y-lookup tables are then added to the centroids.

This definition covers the case that a spot may extend over many images or just appear on a single image. Note that XDS uses z-centroids instead of phi-angles about the spindle axis. This definition allows for bimodal spot shapes as well.

Note that this means that for weak reflections there are no observed spot coordinates.
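The four rules above can be sketched in a few lines of Python. This is only an illustration, not the XDS implementation: the function name `find_spots` is hypothetical, a single uniform background mean and standard deviation stand in for the local background estimate, adjacency is assumed to mean face-sharing neighbours in x, y and z, and the spatial corrections from the lookup tables are left out.

```python
from collections import deque

# The six face-sharing neighbours in x, y, z (an assumed reading of "adjacent").
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def find_spots(counts, background, sigma, n_sigma=3.0):
    """Return a list of (centroid, number_of_strong_pixels), one entry per spot.

    counts maps (x, y, z) -> pixel value, with z being the frame number.
    background and sigma are a uniform background mean and its standard
    deviation (a simplification of a local background estimate).
    """
    threshold = background + n_sigma * sigma
    strong = {p for p, value in counts.items() if value > threshold}

    spots = []
    seen = set()
    for start in sorted(strong):
        if start in seen:
            continue
        # Breadth-first search collects all strong pixels that are
        # directly or indirectly adjacent to the starting pixel.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            p = queue.popleft()
            members.append(p)
            for dx, dy, dz in NEIGHBOURS:
                q = (p[0] + dx, p[1] + dy, p[2] + dz)
                if q in strong and q not in seen:
                    seen.add(q)
                    queue.append(q)
        # Centroid weighted by the background-subtracted counts.
        weights = [counts[p] - background for p in members]
        total = sum(weights)
        centroid = tuple(
            sum(w * p[i] for w, p in zip(weights, members)) / total
            for i in range(3)
        )
        spots.append((centroid, len(members)))
    return spots


# Made-up pixel values: one spot spanning two frames, one single-pixel spot,
# and one weak pixel that stays below the threshold of 10 + 3 * 2 = 16.
example_counts = {
    (5, 5, 1): 30, (6, 5, 1): 50, (6, 5, 2): 30,
    (20, 20, 7): 40,
    (0, 0, 0): 12,
}
spots = find_spots(example_counts, background=10, sigma=2)
for centroid, size in spots:
    print(centroid, size)
```

The weak pixel at (0, 0, 0) never enters the set of strong pixels, so it yields no spot and therefore no observed coordinates, in line with the note above.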

• Calculated spot coordinates

These are the x,y,z coordinates of the centroid of Gaussians (i.e. unimodal) centered at the ideal Bragg peak x,y,phi using initial guesses for the variances (which are later replaced by estimates using the observed images).

The mapping of each pixel to the Ewald sphere uses a local, reflection-specific coordinate system whose origin corresponds to the ideal Bragg peak x,y,phi. This is used for profile fitting and has nothing to do with the definitions of observed and calculated spot coordinates.

XDS does not adjust the integration boxes so as to center them individually on the observed reflections: it only tries to minimize the deviations between observed and calculated spot coordinates by adjusting about a dozen diffraction parameters (those given by REFINE(INTEGRATE)) for the reflections in a certain range of frames (DELPHI).

## How does INTEGRATE treat overlaps?

The integration algorithm proceeds along the following lines:

1. the x,y,z center of each pixel of the detector is assigned to its nearest (predicted) reflection in reciprocal space ("pixel-labelling", see [1]). (The z coordinate corresponds to phi, and the z pixel size is delta-phi.)
2. some of these pixels will mostly allow the background estimation, others will mostly contribute to the integration area (but as they are transformed into a local coordinate system [2] there is not a 1:1 relationship). At this step, pixels which should be background but are higher than expected (due to overlap) are rejected.
3. for each reflection, the background is estimated, and the 3D profile is assembled from the pixels contributing to it.
4. the average profile is formed on a grid by superimposition of strong reflections found in step 3. The signal part of the profile is defined by those gridpoints of the average profile that are above a threshold (called "CUT" in XDS.INP).
5. not all pixels of a reflection, which would be required to assemble its full profile (whose shape is given by the average profile formed in step 4), may have been observed due to step 1. Therefore, in another pass, for each reflection, the observed fraction of its theoretical profile is calculated. If this fraction is less than a threshold (called "MINPK" in XDS.INP), this reflection will be discarded ("too much overlap"). If it is above MINPK, the observed intensity (from the incomplete profile) is scaled up with the inverse of the fraction. Of course this scaling-up relies on the accuracy of the average profile.
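Steps 4 and 5 can be illustrated with a small Python sketch. It rests on several assumptions that are not taken from the text above: the function names are hypothetical, the profile grid is flattened to a one-dimensional list, CUT and MINPK are both treated as percentages (CUT relative to the maximum of the average profile), and plain summation stands in for the profile fitting that XDS actually performs.

```python
def signal_mask(average_profile, cut_percent=2.0):
    """Mark grid points whose average-profile value exceeds CUT percent of the peak."""
    peak = max(average_profile)
    return [value > peak * cut_percent / 100.0 for value in average_profile]


def scaled_intensity(counts, average_profile, observed, minpk=75.0, cut_percent=2.0):
    """Return the scaled-up intensity, or None if too little of the profile was observed.

    counts           background-subtracted counts at each grid point
    average_profile  average profile on the same grid
    observed         True where the grid point received pixels in the labelling step
    """
    mask = signal_mask(average_profile, cut_percent)
    full = sum(p for p, m in zip(average_profile, mask) if m)
    seen = sum(p for p, m, o in zip(average_profile, mask, observed) if m and o)
    fraction = seen / full
    if 100.0 * fraction < minpk:
        return None  # discarded: too much overlap
    partial = sum(c for c, m, o in zip(counts, mask, observed) if m and o)
    # Scale the incomplete sum up with the inverse of the observed fraction.
    return partial / fraction


# Made-up numbers: the two outermost grid points fall below CUT.
profile = [1, 8, 24, 64, 24, 8, 1]
counts = [10, 80, 240, 640, 240, 80, 10]

# One signal grid point is missing: 120/128 of the profile was observed.
mostly_observed = [True, True, True, True, True, False, False]
kept = scaled_intensity(counts, profile, mostly_observed)

# Only 32/128 of the profile was observed, which is below MINPK = 75.
barely_observed = [True, True, True, False, False, False, False]
rejected = scaled_intensity(counts, profile, barely_observed)

print(kept, rejected)
```

In the first case the partial sum of 1200 is divided by the observed fraction 0.9375; in the second case the reflection is dropped. The scaled value is only as good as the average profile used to compute the fraction.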

Among other things, this means that:

• there is no conceptual difference in XDS between overlap on a frame (due to too close detector, or smeared spots), and overlap by phi rotation (due to too large delta-phi, or high mosaicity).
• the program does not look around each reflection to detect an overlap situation, it just gathers the pixels for each reflection. However, each pixel is used for at most one reflection.
• if two reflections differ in phi, but have the same position on the detector, then, as a consequence of step 1 the pixels are assigned to that reflection whose phi-calc is closest to the phi of the frame considered. The relative intensities of these reflections are not taken into account because at this stage they are unknown! Thus, no deconvolution is attempted.
• as a user, if your crystal-detector distance was chosen too short, or the reflections are very broad, or the crystal has a high mosaicity (all of which result in many overlaps), you may try reducing MINPK to some percentage between 75 and (say) 50. This will result in higher completeness, but you should monitor the quality of the resulting data. Conversely, if you raise MINPK above its default of 75 you will discard more reflections, but the resulting dataset will be a bit cleaner.
• this method degrades if the average profiles cannot be completely formed. This may happen if the reflections are too close in x,y and, at the same time, the mosaicity is high (such that no lunes exist, with edges that help construct the average profiles).
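The pixel-labelling idea behind the points above (each pixel goes to exactly one reflection, and phi decides when two reflections share a detector position) can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `label_pixels` is hypothetical, the Miller indices and predicted positions are made up, and the distance is measured directly in x,y,z pixel and frame units rather than in reciprocal space as XDS does.

```python
def label_pixels(pixels, predicted):
    """Assign every pixel to the nearest predicted reflection.

    pixels     list of (x, y, z) pixel centers, z being the frame number
    predicted  dict mapping hkl -> predicted (x, y, z) of that reflection
    Returns a dict mapping each pixel to exactly one hkl.
    """
    def squared_distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return {
        p: min(predicted, key=lambda hkl: squared_distance(p, predicted[hkl]))
        for p in pixels
    }


# Two hypothetical reflections at the same detector position,
# predicted at different z (phi) values.
predicted = {
    (1, 2, 3): (100.0, 200.0, 10.2),
    (1, 2, 4): (100.0, 200.0, 11.6),
}
pixels = [(100, 200, 10), (100, 200, 11), (100, 200, 12)]

labels = label_pixels(pixels, predicted)
for pixel, hkl in labels.items():
    print(pixel, "->", hkl)
```

The pixel on frame 11 goes to the reflection predicted at z = 11.6 simply because it is closer in z. The intensities of the two reflections play no role in the assignment, which is why no deconvolution takes place.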