How can we judge the quality of a data set? There are several possibilities:
- numerical indicators, like R-values, I/sigma and the like
- graphical representations
This article demonstrates pathological cases. It collects examples of
- problems with the hardware (e.g. detector, beamline, goniostat, beam, cryo)
- problems with the crystal
- problems with data processing
Scale factor plot in case of problems (beam or spindle)
The scale factor is printed in INTEGRATE.LP for every frame (column 3). This plot shows spikes roughly every 13 frames, indicating that on those frames either the beam was weak or the spindle rotated too fast. A change of flux within a frame is detrimental to data quality; however, if the change of flux occurs during the readout (i.e. between frames), the scale factor accurately compensates for it. A change in rotation speed is also compensated to some extent by the change in scale factor, but there is an additional effect: the next frame starts at a phi offset, which has to be compensated by ORIENTATION refinement in INTEGRATE.
Possible sources of beam flux changes are vibrating attenuators, top-up injections, and other types of vibrations. The shorter the exposure time, the more visible these problems usually become.
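The scale factor curve can be pulled out of INTEGRATE.LP with a few lines of Python. The following is a minimal sketch, not part of XDS: the function names `parse_frame_table` and `scale_spikes` are made up for illustration, and it assumes that the per-frame rows are the lines with exactly 10 whitespace-separated fields, integers in columns 1 and 2, the scale factor in column 3 and the mosaicity estimate in column 10. The 20% tolerance is an arbitrary choice.

```python
from statistics import median


def parse_frame_table(text):
    """Return (frame, scale, mosaicity) tuples from INTEGRATE.LP-style text.

    Assumption: per-frame rows have exactly 10 whitespace-separated fields,
    with integers in columns 1-2 and numbers in the rest; the scale factor
    is column 3 and the mosaicity estimate is column 10.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue
        try:
            frame = int(fields[0])
            int(fields[1])  # second column must also be an integer
            values = [float(f) for f in fields[2:]]
        except ValueError:
            continue  # table header or an unrelated line
        rows.append((frame, values[0], values[-1]))
    return rows


def scale_spikes(rows, tolerance=0.2):
    """Frames whose scale factor differs from the median by more than
    `tolerance` (relative). The default of 0.2 is an arbitrary choice."""
    reference = median(scale for _, scale, _ in rows)
    return [frame for frame, scale, _ in rows
            if abs(scale - reference) > tolerance * reference]
```

Typical use would be `rows = parse_frame_table(open("INTEGRATE.LP").read())` followed by `scale_spikes(rows)`; the spacing between the reported frame numbers then gives a first idea of how regularly the problem recurs.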
Mosaicity plots in case of problems
The same data set: the mosaicity estimates of individual frames (column 10 in INTEGRATE.LP) are high because the orientation is off. The "jumps" in the curve arise because INTEGRATE was run with MAXIMUM_NUMBER_OF_JOBS=8: each of the 8 jobs uses the orientation matrix from IDXREF for its initial batch, and since that matrix does not seem to match the actual orientation, the mosaicity appears high. Only after geometry refinement (green line) is the result reasonable (and thus the intensity estimates will not be affected). The estimate for the second batch of each job is much better, because it uses the orientation obtained from the geometry refinement as a starting point.
Exactly why the IDXREF estimate is off, and whether it has something to do with the flux or spindle problem, is unknown.
With MAXIMUM_NUMBER_OF_JOBS=1 the plot would definitely not look like this - it would be much smoother because the next batch of data "knows" about the orientation of the previous one.
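To see where such jumps occur without reading them off a plot, one can look for frames whose mosaicity estimate rises abruptly relative to the previous frame. This is only an illustrative sketch: `upward_jumps` is a hypothetical helper, the input is assumed to be the list of per-frame mosaicity values (column 10 of INTEGRATE.LP) in frame order, and the factor of 1.5 is an arbitrary threshold.

```python
def upward_jumps(values, factor=1.5):
    """Return the indices i at which values[i] exceeds factor * values[i - 1].

    `values` is assumed to hold per-frame mosaicity estimates in frame order;
    the default factor of 1.5 is an arbitrary threshold for "abrupt".
    """
    return [i for i in range(1, len(values))
            if values[i] > factor * values[i - 1]]
```

If the reported positions are roughly evenly spaced and their number is one less than MAXIMUM_NUMBER_OF_JOBS (the first job has no preceding frame to compare with), that is consistent with each job starting from the IDXREF orientation.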
The same hardware problem, but a different data set: here, the mosaicity estimates of individual frames are less affected, because the initial orientation from IDXREF is good. Oscillations are not seen very well here since the period of the scale factor changes is on the order of 13 frames.
Zoomed version of the above. The oscillations are more clearly visible.
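When a periodic disturbance is hard to see by eye, its period can be estimated numerically from the per-frame values. The sketch below uses a plain autocorrelation; it is an illustration rather than anything XDS provides, and `dominant_period` is a hypothetical name. The input is assumed to be a list of per-frame scale factors (or mosaicity estimates) in frame order.

```python
def dominant_period(values, max_lag=None):
    """Estimate the period (in frames) of a repeating pattern in `values`.

    Returns the lag (>= 2) with the largest positive autocorrelation,
    or None if no lag has a positive autocorrelation.
    """
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]
    if max_lag is None:
        max_lag = n // 2
    best_lag, best_score = None, 0.0
    for lag in range(2, max_lag + 1):
        # Unnormalised sum over all pairs separated by `lag`. Fewer pairs
        # contribute at longer lags, which mildly favours the shortest
        # period over its multiples.
        score = sum(centred[i] * centred[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

For noisy data the result should be treated as a hint only, and checked against the plot.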