Changes

From XDSwiki
7,788 bytes added, 13:31, 22 October 2019
m
fix links
Line 1: Line 1: −
[http://www.mpimf-heidelberg.mpg.de/~kabsch/xds/html_doc/xdsconv_parameters.html XDSCONV] is the conversion program of the [[XDS]] suite.
+
[http://xds.mpimf-heidelberg.mpg.de/html_doc/xdsconv_parameters.html XDSCONV] is the conversion program of the [[XDS]] suite.
 
  −
Possible output formats are SHELX, CNS, MTZ (FIXME: which else).
      +
Possible output formats are SHELX, CNS, CCP4 (for F,SigF,DF,SigDF,isym), CCP4_F (for F,SigF,F(+),SigF(+),F(-),SigF(-)), CCP4_I (for IMEAN,SIGIMEAN,I(+),SIGI(+),I(-),SIGI(-)) and CCP4_I+F (for IMEAN,SIGIMEAN,I(+),SIGI(+),I(-),SIGI(-),FP,SIGFP,F(+),SIGF(+),F(-),SIGF(-)) - the "+" and "-" varieties are only output if FRIEDEL'S_LAW=FALSE.
 +
----
 +
XDSCONV does outlier rejection in some modes.
 +
== Typical use ==
 
A typical input file XDSCONV.INP might look like
 
  INPUT_FILE=XDS_ASCII.HKL
 
  INCLUDE_RESOLUTION_RANGE=50 1  ! optional  
 
  OUTPUT_FILE=temp.hkl  CCP4   FRIEDEL'S_LAW=FALSE
+
  OUTPUT_FILE=temp.hkl  CCP4     ! Warning: do _not_ name this file "temp.mtz" !
This produces the file temp.hkl which is then converted to a MTZ file XDS_ASCII.mtz with
+
FRIEDEL'S_LAW=FALSE           ! default is FRIEDEL'S_LAW=TRUE
 +
This produces the file temp.hkl, which is then converted to an MTZ file XDS_ASCII.mtz with the following commands (XDSCONV also prints these lines):
 
  f2mtz HKLOUT temp.mtz<F2MTZ.INP
 
  cad HKLIN1 temp.mtz HKLOUT XDS_ASCII.mtz<<EOF
 
  LABIN FILE 1 E1=FP E2=SIGFP E3=DANO E4=SIGDANO
+
  LABIN FILE 1 ALL
LABOUT FILE 1 E1=FP E2=SIGFP E3=DANO E4=SIGDANO
   
  END
 
  EOF
 
This latter step is not necessary for CNS and SHELX output formats, which are written directly by XDSCONV.
+
This latter step is not necessary for the CNS and SHELX output formats, which are written directly by XDSCONV. For the CNS output format, one could use MERGE=FALSE to keep observations separate. For the SHELX output format, MERGE=FALSE is the default, presumably because George Sheldrick suggests that his programs, in particular [http://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php/XPREP XPREP], should be fed unmerged data. However, I have sometimes obtained better SHELXD results when merging inside XDSCONV with MERGE=TRUE.
 +
 
 +
N.B. It is good practice to always use FRIEDEL'S_LAW=FALSE - see [[Tips and Tricks]].
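A minimal sketch of how the input file described above could be written from a script. The function name write_xdsconv_inp is hypothetical (it is not part of XDS); it only writes the three keywords shown in this section and always sets FRIEDEL'S_LAW=FALSE, following the recommendation above.

```shell
#!/bin/bash
# Hypothetical helper (not part of XDS): write a minimal XDSCONV.INP
# into the current directory.
# Arguments: input reflection file, output file, output format
# (e.g. CCP4, CCP4_F, CCP4_I).
write_xdsconv_inp() {
    local input=$1 output=$2 format=$3
    {
        echo "INPUT_FILE=$input"
        echo "OUTPUT_FILE=$output $format"
        # Keep Friedel mates separate, as recommended in the text.
        echo "FRIEDEL'S_LAW=FALSE"
    } > XDSCONV.INP
}

# Example (requires XDS to be installed):
#   write_xdsconv_inp XDS_ASCII.HKL temp.hkl CCP4 && xdsconv
```

Writing the file in one redirected group avoids the repeated ">" and ">>" bookkeeping, so the file is always started afresh on each call.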
 +
 
 +
=== how to change column labels ===
 +
To have control over the column labels, one might want to modify the simple example above as:
 +
 
 +
f2mtz HKLOUT temp.mtz<F2MTZ.INP
 +
cad HKLIN1 temp.mtz HKLOUT junk_xdsconv.mtz<<EOF
 +
LABIN FILE 1 E1=FP E2=SIGFP E3=DANO E4=SIGDANO E5=ISYM
 +
LABOUT FILE 1 E1=FP E2=SIGFP E3=DANO_sulf E4=SIGDANO_sulf E5=ISYM_sulf
 +
END
 +
EOF
 +
 
 +
The ISYM column is important if you want to run SHARP afterwards.
 +
 
 +
In the case of an MTZ file intended for molecular replacement and refinement, the CAD step can also be used to transfer the R_free flag from a different dataset to the new one. Alternatively, labels can be changed and columns transferred in the ccp4i GUI.
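The renaming pattern used above (appending a tag to DANO, SIGDANO and ISYM) could be generated for any tag with a small function. This is only a sketch: cad_label_cards is a hypothetical name, and the column names are simply those of the example in this section.

```shell
#!/bin/bash
# Hypothetical helper: print the CAD cards from the example above,
# appending _TAG to the anomalous and ISYM column labels.
cad_label_cards() {
    local tag=$1
    echo "LABIN FILE 1 E1=FP E2=SIGFP E3=DANO E4=SIGDANO E5=ISYM"
    echo "LABOUT FILE 1 E1=FP E2=SIGFP E3=DANO_${tag} E4=SIGDANO_${tag} E5=ISYM_${tag}"
    echo "END"
}

# Example (requires CCP4 to be installed):
#   cad_label_cards sulf | cad HKLIN1 temp.mtz HKLOUT junk_xdsconv.mtz
```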
 +
 
 +
== explanation of typical output ==
 +
<pre>
 +
========== CONTROL CARDS ==========
 +
 
 +
INPUT_FILE=XDS_ASCII.HKL
 +
OUTPUT_FILE=temp.hkl CCP4
 +
 
 +
 
 +
SPACE_GROUP_NUMBER=  199
 +
UNIT_CELL_CONSTANTS=    78.09    78.09    78.09  90.000  90.000  90.000
 +
FRIEDEL'S_LAW=FALSE
 +
MERGE=TRUE
 +
NUMBER OF REFLECTION RECORDS ON INPUT FILE      217611      ! observations ("spots")
 +
NUMBER OF IGNORED REFLECTIONS (I< -3.0*SIGMA)        0      ! merged (unique) reflections, Friedels counted separately
 +
NUMBER OF REFLECTIONS ACCEPTED FROM INPUT FILE  23155      ! merged (unique) reflections, Friedels counted separately
 +
 
 +
NUMBER OF UNIQUE REFLECTIONS ASSIGNED TO TEST SET        0
 +
NUMBER OF UNIQUE TEST REFLECTIONS INHERITED              0
 +
NUMBER OF UNIQUE TEST REFLECTIONS NEWLY GENERATED        0
 +
 
 +
NUMBER OF REFLECTION RECORDS ON OUTPUT FILE      12264      ! merged (unique) reflections; a Friedel pair is counted as one reflection for the MTZ file
 +
NUMBER OF RECORDS ASSIGNED TO WORKING SET        12264      ! but since each unique reflection is stored with its anomalous signal no information is lost
 +
NUMBER OF RECORDS ASSIGNED TO TEST SET              0
 +
</pre>
 +
 
 +
'''Obviously, the meaning of the word "reflection" differs between the output lines; some explanation is given after the exclamation mark.'''
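To compare these counts between runs, they can be pulled out of the log with a few lines of shell. The sketch below assumes the log is the file XDSCONV.LP and that the labels appear exactly as in the listing above; lp_count is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the number that follows a given label
# in an XDSCONV log file.
# Usage: lp_count XDSCONV.LP "NUMBER OF REFLECTION RECORDS ON OUTPUT FILE"
lp_count() {
    awk -v label="$2" '
        { pos = index($0, label) }          # literal (non-regex) search
        pos > 0 {
            rest = substr($0, pos + length(label))
            split(rest, fields, " ")        # split on runs of whitespace
            print fields[1]                 # first token after the label
            exit
        }' "$1"
}
```

A literal search with index() is used instead of a regular expression because some labels contain characters such as "(" and "*".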
 +
 
 +
== how to obtain an MTZ file with DANO SIGDANO F(+) SIGF(+) F(-) SIGF(-) ==
 +
You have to run XDSCONV twice, and combine the output with cad. At the latter step you can also change the column labels:
 +
#!/bin/csh -f
 +
# produce xds_allFinfo.mtz with FP SIGFP DANO SIGDANO F(+) SIGF(+) F(-) SIGF(-)
 +
# in the same way, the labels produced with CCP4_I could be included!
 +
#
 +
# first xdsconv run producing FP SIGFP DANO SIGDANO
 +
echo "INPUT_FILE= XDS_ASCII.HKL" > XDSCONV.INP
 +
echo "OUTPUT_FILE= temp.hkl CCP4" >> XDSCONV.INP
 +
echo "FRIEDEL'S_LAW= FALSE" >> XDSCONV.INP
 +
xdsconv
 +
f2mtz HKLOUT temp1.mtz<F2MTZ.INP
 +
 +
# second xdsconv run producing F(+) SIGF(+) F(-) SIGF(-)
 +
echo "INPUT_FILE= XDS_ASCII.HKL" > XDSCONV.INP
 +
echo "OUTPUT_FILE= temp.hkl CCP4_F" >> XDSCONV.INP
 +
echo "FRIEDEL'S_LAW= FALSE" >> XDSCONV.INP
 +
xdsconv
 +
f2mtz HKLOUT temp2.mtz<F2MTZ.INP
 +
 +
# for CAD, the 2 LABOUT cards are only required if the labels should be changed
 +
cad HKLIN1 temp1.mtz HKLIN2 temp2.mtz HKLOUT xds_allFinfo.mtz<<EOF
 +
  LABIN  FILE 1 E1=FP      E2=SIGFP      E3=DANO    E4=SIGDANO
 +
  LABIN  FILE 2 E1=F(+)    E2=SIGF(+)    E3=F(-)    E4=SIGF(-)
 +
  LABOUT FILE 1 E1=FP_Hg    E2=SIGFP_Hg    E3=DANO_Hg  E4=SIGDANO_Hg
 +
  LABOUT FILE 2 E1=F(+)_Hg  E2=SIGF(+)_Hg  E3=F(-)_Hg  E4=SIGF(-)_Hg
 +
  END
 +
EOF
 +
 
 +
The following script does the same for the input file (first parameter to the script), but also adds a SUFFIX (second parameter) to the column labels to better identify the data, and optionally copies the Rfree flag from a reference MTZ file (third parameter). If the Rfree flag is NOT named "FreeR_flag" (the default from ccp4i), you can provide its name as the fourth parameter. All steps are logged to log files, and temporary files are deleted. The input file name should end with .HKL (rather than e.g. .hkl).
 +
The script also sets the resolution to that of the observed data using sftools. Otherwise the resolution of the reference data set might be shown if that is higher. You can call this script 'xds2mtz.sh'. If it is executed without arguments, you get a short usage instruction.
 +
<nowiki>#!/bin/bash
 +
 
 +
function usage {
 +
echo "Usage: xds2mtz file.HKL SUFFIX [Rfree.mtz [RfreeFlag]]"
 +
echo ""
 +
echo "      file.HKL:  Output from XDS or XSCALE"
 +
echo "      SUFFIX:    Columns suffix, e.g. FP_SUFFIX"
 +
echo "      Rfree.mtz: Reference mtz-file for Rfree transfer"
 +
echo "      RfreeFlag: Label for Rfree set, defaults to \"FreeR_flag\""
 +
echo ""
 +
}
 +
 
 +
if [ -z "$1" ]; then
 +
echo "*** Error: Missing input XDS file name"
 +
usage
 +
exit -1;
 +
fi
 +
if [ ! -f "$1" ]; then
 +
echo "*** Error: File $1 does not exist"
 +
usage
 +
exit -1;
 +
fi
 +
 
 +
 
 +
BASE=$(basename $1)
 +
SUFFIX=$2
 +
RFREE=$3
 +
FLAG=$4
 +
 
 +
echo "Base = $BASE, Suffix = $SUFFIX"
 +
 
 +
 
 +
echo "INPUT_FILE= $1" > XDSCONV.INP
 +
echo "OUTPUT_FILE= temp1.hkl CCP4" >> XDSCONV.INP
 +
xdsconv && f2mtz HKLOUT temp1.mtz <F2MTZ.INP | tee ${BASE%.HKL}_dano.log
   −
----
+
echo "INPUT_FILE= $1" > XDSCONV.INP
 +
echo "OUTPUT_FILE= temp2.hkl CCP4_F" >> XDSCONV.INP
 +
xdsconv && f2mtz HKLOUT temp2.mtz <F2MTZ.INP |tee ${BASE%.HKL}_pm.log
 +
 
 +
if [ -z $3 ]; then
 +
echo "Proceeding without Rfree reference file"
 +
cad HKLIN1 temp1.mtz HKLIN2 temp2.mtz HKLOUT ${BASE%.HKL}.mtz << eof | tee ${BASE%.HKL}_cad.log
 +
LABIN  FILE 1 E1=FP      E2=SIGFP      E3=DANO    E4=SIGDANO
 +
LABIN  FILE 2 E1=F(+)    E2=SIGF(+)    E3=F(-)    E4=SIGF(-)
 +
LABOUT FILE 1 E1=FP_$SUFFIX    E2=SIGFP_$SUFFIX    E3=DANO_$SUFFIX  E4=SIGDANO_$SUFFIX
 +
LABOUT FILE 2 E1=F(+)_$SUFFIX  E2=SIGF(+)_$SUFFIX  E3=F(-)_$SUFFIX  E4=SIGF(-)_$SUFFIX
 +
eof
 +
else
 +
echo "Copying Rfree from file $3"
 +
if [ -z $4 ]; then
 +
FREERFLAG="FreeR_flag" # ccp4 standard name
 +
else
 +
FREERFLAG=$4
 +
fi
 +
echo "Extracting flagged indices from ${FREERFLAG}"
 +
cad HKLIN1 temp1.mtz \
 +
HKLIN2 temp2.mtz \
 +
HKLIN3 $3 \
 +
HKLOUT ${BASE%.HKL}.mtz << eof | tee ${BASE%.HKL}_cad.log
 +
LABIN  FILE 1 E1=FP      E2=SIGFP      E3=DANO    E4=SIGDANO
 +
LABIN  FILE 2 E1=F(+)    E2=SIGF(+)    E3=F(-)    E4=SIGF(-)
 +
LABIN  FILE 3 E1=${FREERFLAG}
 +
LABOUT FILE 1 E1=FP_$SUFFIX    E2=SIGFP_$SUFFIX    E3=DANO_$SUFFIX  E4=SIGDANO_$SUFFIX
 +
LABOUT FILE 2 E1=F(+)_$SUFFIX  E2=SIGF(+)_$SUFFIX  E3=F(-)_$SUFFIX  E4=SIGF(-)_$SUFFIX
 +
LABOUT FILE 3 E1=${FREERFLAG}
 +
eof
 +
 
 +
rm temp1.mtz
 +
 
 +
# correct for FreeRflag (if new file has more reflections than reference file)
 +
freerflag hklin ${BASE%.HKL}.mtz hklout temp1.mtz << eof | tee ${BASE%.HKL}_freerflag.log
 +
COMPLETE FREE=${FREERFLAG}
 +
end
 +
eof
 +
 
 +
# correct for real data in case Rfree data set contains too many hkls
 +
# thanks to Andrey Lebedev
 +
sftools << eof | tee ${BASE%.HKL}_sftools.log
 +
READ ${BASE%.HKL}.mtz
 +
SELECT ONLY COLUMN FP_$SUFFIX PRESENT
 +
WRITE temp1.mtz
 +
END
 +
eof
 +
 
 +
mv temp1.mtz ${BASE%.HKL}.mtz
 +
 
 +
fi
 +
 
 +
rm -f XDSCONV.INP temp1.hkl temp1.mtz temp2.hkl temp2.mtz F2MTZ.INP XDSCONV.LP</nowiki>
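The requirement that the input file name end with .HKL follows from the ${BASE%.HKL} expansion used in the script, which strips the suffix only on an exact, case-sensitive match. A small sketch of the effect (mtz_name is a hypothetical helper, not part of the script):

```shell
#!/bin/bash
# Hypothetical helper: derive the MTZ name the same way the script does,
# by stripping a trailing ".HKL" from the base name of the input file.
mtz_name() {
    local base
    base=$(basename "$1")
    # ${base%.HKL} removes ".HKL" only if the name ends with exactly that.
    echo "${base%.HKL}.mtz"
}

mtz_name /data/XDS_ASCII.HKL   # prints XDS_ASCII.mtz
mtz_name xscale.hkl            # prints xscale.hkl.mtz (suffix not stripped)
```

With a lower-case .hkl name the script still runs, but the output and log files keep the .hkl part in their names.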
   −
Hint for long-time XDSCONV users:
+
== Hint for long-time XDSCONV users ==
    
The latest versions of the program do not require  
 
  SPACE_GROUP_NUMBER=
 
  UNIT_CELL_CONSTANTS=
because these are picked up from the header of the input reflection file.
+
because these are picked up from the header of the input reflection file. However, if you want to ''change'' either of them, you have to specify '''both''': for example, to change the space group you must also give the unit cell parameters.
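The both-or-neither rule can be checked before running XDSCONV. The following is only a sketch: check_cell_and_spacegroup is a hypothetical helper, and it assumes the unit cell keyword is spelled UNIT_CELL_CONSTANTS=, as echoed in the control-card listing earlier on this page.

```shell
#!/bin/bash
# Hypothetical check (not part of XDS): succeed if the input file gives
# both SPACE_GROUP_NUMBER= and UNIT_CELL_CONSTANTS=, or neither of them.
# Argument: path to the input file (defaults to XDSCONV.INP).
check_cell_and_spacegroup() {
    local inp=${1:-XDSCONV.INP} have_sg=0 have_cell=0
    # Drop everything after "!" (comment) before looking for the keywords.
    sed 's/!.*//' "$inp" | grep -q 'SPACE_GROUP_NUMBER=' && have_sg=1
    sed 's/!.*//' "$inp" | grep -q 'UNIT_CELL_CONSTANTS=' && have_cell=1
    if [ "$have_sg" -ne "$have_cell" ]; then
        echo "*** Error: specify both SPACE_GROUP_NUMBER= and UNIT_CELL_CONSTANTS=, or neither" >&2
        return 1
    fi
}

# Example: check_cell_and_spacegroup XDSCONV.INP && xdsconv
```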
''More to come''
 