R-factors

From CCP4 wiki

Historically, R-factors were introduced by ... ???

Definitions

Data quality indicators

In the following, all sums over hkl extend only over unique reflections with more than one observation!

  • Rsym and Rmerge : the formula for both is

[math]\displaystyle{ R = \frac{\sum_{hkl} \sum_{j} \vert I_{hkl,j}-\langle I_{hkl}\rangle\vert}{\sum_{hkl} \sum_{j}I_{hkl,j}} }[/math]
where [math]\displaystyle{ \langle I_{hkl}\rangle }[/math] is the average of symmetry- (or Friedel-) related observations of a unique reflection.

It can be shown that this formula results in higher R-factors when the redundancy is higher. In other words, low-redundancy datasets appear better than high-redundancy ones, which obviously violates the intention of having an indicator of data quality!
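The redundancy dependence can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function names `r_merge` and `simulate` are hypothetical, every reflection is given the same true intensity, and the errors are Gaussian with a fixed standard deviation. Under these assumptions the expected absolute deviation of an observation from the sample mean scales as sqrt((n-1)/n), so the computed value rises with the number of observations n even though the per-observation error is unchanged.

```python
import random


def r_merge(groups):
    """Rsym/Rmerge for a list of observation groups.

    Each inner list holds the intensities I_hkl,j measured for one unique
    reflection. Reflections with fewer than two observations are skipped,
    matching the convention stated in the text.
    """
    numerator = 0.0
    denominator = 0.0
    for obs in groups:
        if len(obs) < 2:
            continue
        mean = sum(obs) / len(obs)
        numerator += sum(abs(i - mean) for i in obs)
        denominator += sum(obs)
    return numerator / denominator


def simulate(n_obs, n_refl=5000, true_i=100.0, sigma=10.0, seed=0):
    """Hypothetical test data: n_refl reflections, n_obs observations each."""
    rng = random.Random(seed)
    return [[rng.gauss(true_i, sigma) for _ in range(n_obs)]
            for _ in range(n_refl)]


# Same per-observation error in every run; only the redundancy changes.
for n in (2, 4, 16):
    print(n, round(r_merge(simulate(n)), 4))
```

The printed values should increase with n, although the simulated data quality is identical in all three runs.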

  • Redundancy-independent version of the above:

[math]\displaystyle{ R_{meas} = \frac{\sum_{hkl} \sqrt \frac{n}{n-1} \sum_{j=1}^{n} \vert I_{hkl,j}-\langle I_{hkl}\rangle\vert}{\sum_{hkl} \sum_{j}I_{hkl,j}} }[/math]
which unfortunately results in higher (but more realistic) numerical values than Rsym / Rmerge
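As a rough check of the redundancy independence, the following sketch evaluates the Rmeas formula on simulated data. The names `r_meas` and `simulate` are hypothetical, and the data model (one true intensity for all reflections, Gaussian errors of fixed width) is an assumption made only for illustration.

```python
import math
import random


def r_meas(groups):
    """Rmeas: each reflection's deviations are weighted by sqrt(n/(n-1))."""
    numerator = 0.0
    denominator = 0.0
    for obs in groups:
        n = len(obs)
        if n < 2:
            continue  # only reflections with more than one observation
        mean = sum(obs) / n
        numerator += math.sqrt(n / (n - 1)) * sum(abs(i - mean) for i in obs)
        denominator += sum(obs)
    return numerator / denominator


def simulate(n_obs, n_refl=5000, true_i=100.0, sigma=10.0, seed=1):
    """Hypothetical test data: n_refl reflections, n_obs observations each."""
    rng = random.Random(seed)
    return [[rng.gauss(true_i, sigma) for _ in range(n_obs)]
            for _ in range(n_refl)]


# A single reflection observed twice: the factor sqrt(2) applies.
print(r_meas([[90.0, 110.0]]))

# Low and high redundancy with the same per-observation error.
for n in (2, 16):
    print(n, round(r_meas(simulate(n)), 4))
```

With this data model the two simulated values should agree to within the sampling noise, which is the behaviour one wants from an indicator of the quality of individual observations.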

  • measuring quality of averaged intensities/amplitudes:

for intensities use [math]\displaystyle{ R_{p.i.m.} \ (\text{or } R_{mrgd-I}) = \frac{\sum_{hkl} \sqrt \frac{1}{n-1} \sum_{j=1}^{n} \vert I_{hkl,j}-\langle I_{hkl}\rangle\vert}{\sum_{hkl} \sum_{j}I_{hkl,j}} }[/math]

and similarly for amplitudes: [math]\displaystyle{ R_{mrgd-F} = \frac{\sum_{hkl} \sqrt \frac{1}{n-1} \sum_{j=1}^{n} \vert F_{hkl,j}-\langle F_{hkl}\rangle\vert}{\sum_{hkl} \sum_{j}F_{hkl,j}} }[/math]
with [math]\displaystyle{ \langle F_{hkl}\rangle }[/math] defined analogously to [math]\displaystyle{ \langle I_{hkl}\rangle }[/math].
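The indicators for averaged data differ from Rsym / Rmerge and Rmeas only in the per-reflection prefactor, so one function can cover all of them. In the sketch below the names `r_weighted` and `r_pim` are hypothetical, and the prefactor is passed in as a function of n so that the convention in use can be swapped in. The example assumes the prefactor sqrt(1/(n-1)), which makes the result equal to Rmeas divided by sqrt(n) when all reflections have the same n. The same code applies to amplitudes if the groups hold F values instead of intensities.

```python
import math


def r_weighted(groups, weight):
    """Merging R-factor with a per-reflection prefactor weight(n)."""
    numerator = 0.0
    denominator = 0.0
    for obs in groups:
        n = len(obs)
        if n < 2:
            continue  # only reflections with more than one observation
        mean = sum(obs) / n
        numerator += weight(n) * sum(abs(x - mean) for x in obs)
        denominator += sum(obs)
    return numerator / denominator


def r_pim(groups):
    """Precision-indicating R-factor, assuming the prefactor sqrt(1/(n-1))."""
    return r_weighted(groups, lambda n: math.sqrt(1.0 / (n - 1)))


# Same scatter of individual observations, measured twice and four times.
twice = [[90.0, 110.0]]
four_times = [[90.0, 110.0, 90.0, 110.0]]
print(r_pim(twice), r_pim(four_times))
```

The second value is smaller than the first: the individual observations are no better, but their average is more precise, which is what this indicator is meant to reflect.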

Model quality indicators

  • R and Rfree : the formula for both is

[math]\displaystyle{ R=\frac{\sum_{hkl}\vert F_{hkl}^{obs}-F_{hkl}^{calc}\vert}{\sum_{hkl} F_{hkl}^{obs}} }[/math]

where [math]\displaystyle{ F_{hkl}^{obs} }[/math] and [math]\displaystyle{ F_{hkl}^{calc} }[/math] are the observed and calculated structure factor amplitudes, which have to be scaled with respect to each other. R and Rfree differ only in the set of reflections they are calculated from: R is calculated for the working set, whereas Rfree is calculated for the test set.
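A minimal sketch of the calculation follows. The function names, the toy amplitudes and the scaling scheme are assumptions made for illustration: a single overall scale factor k = Σ F_obs / Σ F_calc is determined from the working set and applied to both sets, whereas refinement programs generally use more elaborate scaling.

```python
def r_factor(f_obs, f_calc, scale):
    """R = sum |F_obs - scale * F_calc| / sum F_obs for one set of reflections."""
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def r_work_and_free(f_obs, f_calc, is_free):
    """Return (R, Rfree) given a per-reflection test-set flag."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if not free]
    test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if free]
    work_obs, work_calc = zip(*work)
    test_obs, test_calc = zip(*test)
    # Assumption: one overall scale factor, taken from the working set only.
    scale = sum(work_obs) / sum(work_calc)
    return (r_factor(work_obs, work_calc, scale),
            r_factor(test_obs, test_calc, scale))


# Toy amplitudes (hypothetical values); the last two reflections are the test set.
f_obs = [100.0, 50.0, 80.0, 20.0, 60.0, 40.0]
f_calc = [52.0, 23.0, 40.0, 10.0, 33.0, 18.0]
is_free = [False, False, False, False, True, True]

r_work, r_free = r_work_and_free(f_obs, f_calc, is_free)
print(round(r_work, 3), round(r_free, 3))
```

Both values come from the same formula; only the selection of reflections differs.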

What do R-factors try to measure, and how should their values be interpreted?

  • relative deviation of

Data quality

  • typical values: ...

Model quality

What kinds of problems exist with these indicators?

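- Rsym / Rmerge should not be used; Rmeas should be used instead. As noted under Definitions, Rsym / Rmerge increases with redundancy, so low-redundancy datasets appear better than high-redundancy ones.

- R/Rfree and NCS: when non-crystallographic symmetry (NCS) is present, reflections in the working set and the test set are not independent.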