Split NMR-style multiple model pdb files into individual models

This assumes that you have a correctly formatted pdb file that contains both MODEL and ENDMDL records.   
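
Before splitting, it can help to confirm that assumption. The following is a minimal Python 3 sketch, not one of the original scripts; the function name count_models is hypothetical. It counts the models and raises an error if the MODEL and ENDMDL records do not strictly alternate.

```python
def count_models(lines):
    """Return the number of MODEL/ENDMDL pairs; raise ValueError if they do not alternate."""
    inside = False   # True between a MODEL record and its ENDMDL
    models = 0
    for number, line in enumerate(lines, start=1):
        record = line[:6].strip()   # PDB record names occupy columns 1-6
        if record == "MODEL":
            if inside:
                raise ValueError("line %d: MODEL without a preceding ENDMDL" % number)
            inside = True
        elif record == "ENDMDL":
            if not inside:
                raise ValueError("line %d: ENDMDL without a preceding MODEL" % number)
            inside = False
            models += 1
    if inside:
        raise ValueError("the last MODEL record is never closed by ENDMDL")
    return models
```

It accepts any iterable of lines, so an open file handle can be passed directly, for example count_models(open("models.pdb")).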


== Bash/awk one-liner ==
 
 
This one-liner splits the file models.pdb into individual pdb files named model_###.pdb.
 
  grep -n '^MODEL\|^ENDMDL' models.pdb | cut -d: -f 1 | \
  awk '{if(NR%2) printf "sed -n %d,",$1+1; else printf "%dp models.pdb > model_%03d.pdb\n", $1-1,NR/2;}' | bash -sf
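
The one-liner works by pairing line numbers: grep -n reports the line number of each MODEL and ENDMDL record, and awk turns each pair into a sed range that excludes the two marker lines. The same pairing can be sketched in Python 3. This is only an illustration: model_ranges is a hypothetical helper name, and the sketch assumes the records alternate correctly.

```python
def model_ranges(lines):
    """Return 1-based (first, last) line ranges lying strictly between each MODEL/ENDMDL pair."""
    # Line numbers of the marker records, like `grep -n ... | cut -d: -f 1`.
    markers = [n for n, line in enumerate(lines, start=1)
               if line.startswith(("MODEL", "ENDMDL"))]
    # The 1st, 3rd, ... markers are MODEL lines and the 2nd, 4th, ... are ENDMDL
    # lines; this is the pairing that the awk NR%2 test performs.
    return [(start + 1, end - 1) for start, end in zip(markers[0::2], markers[1::2])]


example = ["HEADER", "MODEL        1", "ATOM a", "ATOM b", "ENDMDL",
           "MODEL        2", "ATOM c", "ENDMDL", "END"]
for index, (first, last) in enumerate(model_ranges(example), start=1):
    # Mirrors the command that awk generates for each model.
    print("sed -n %d,%dp models.pdb > model_%03d.pdb" % (first, last, index))
```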
 
== Bash script ==
 
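  i=1
  # IFS= and -r keep each line verbatim; PDB records are fixed-column, so spacing must survive.
  while IFS= read -r line; do
    printf '%s\n' "$line" >> "model_${i}.pdb"
    # After an ENDMDL record, send subsequent lines to the next file.
    [[ $line == ENDMDL* ]] && ((i++))
  done < /path/to/file.pdb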
 
 
== Awk script ==
 
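The script should be called as
 
  awk -f script.awk < models.pdb
 
where script.awk contains the following. Output files are numbered from 0 (model_0.pdb, model_1.pdb, ...), and ENDMDL records are not copied.
 
  BEGIN {file = 0; filename = "model_"  file ".pdb"}
  # On ENDMDL: close the current output file, switch to the next one, and skip the ENDMDL line.
  /^ENDMDL/ {close(filename); file ++; filename = "model_" file ".pdb"; next}
  {print $0 > filename}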
 
 
== Perl script ==
 
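  use strict; use warnings;
  my $base = '1g9e';   # reads "$base.pdb", writes "${base}_N.pdb"
  open(my $in, '<', "$base.pdb") or die "Cannot open $base.pdb: $!";
  my ($i, $out) = (0);
  while (my $line = <$in>) {
    if ($line =~ /^MODEL/) {   # start a new output file for each model
      close($out) if $out;
      ++$i; open($out, '>', "${base}_$i.pdb") or die "Cannot write ${base}_$i.pdb: $!";
      next;
    }
    print $out $line if $out && $line =~ /^(ATOM|HETATM)/;   # keep coordinate records only
  }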
 
== Python script ==
 
 
For this kludgy version, written for Python 2.x, you need to paste the entire PDB file into the script where it says "PASTE YOUR PDB FILE TEXT HERE".
You can fork [https://github.com/fomightez/structurework/blob/master/python_scripts/super_basic_multiple_model_PDB_file_splitter.py the code on GitHub].
 
(A more full-featured version, which you can point at a file or a folder of files using a command-line argument, is available [https://github.com/fomightez/structurework/blob/master/python_scripts/multiple_model_PDB_file_splitter.py on GitHub].)
 
  PDB_text = """
  PASTE YOUR PDB FILE TEXT HERE
  """
 
  model_number = 1
  new_file_text = ""
  for line in filter(None, PDB_text.splitlines()):
      line = line.rstrip() # drop trailing whitespace only; PDB columns are position-sensitive
      if line == "ENDMDL":
          # save file with file number in name
          output_file = open("model_" + str(model_number) + ".pdb", "w")
          output_file.write(new_file_text.rstrip('\r\n')) #rstrip to remove trailing newline
          output_file.close()
          # reset everything for next model
          model_number += 1
          new_file_text = ""
      elif not line.startswith("MODEL"):
          new_file_text += line + '\n'
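
As an alternative to pasting the text into the script, the same loop can read the file from disk. The following Python 3 sketch is an illustration only and is not the GitHub script linked above; the names split_models and write_models, and the prefix parameter, are hypothetical. Like the script above, it drops the MODEL and ENDMDL records and writes each model to model_N.pdb without a trailing newline.

```python
def split_models(lines):
    """Return a list of strings, one per model, without MODEL/ENDMDL records."""
    models = []
    current = []
    for line in lines:
        line = line.rstrip()          # drop line endings and trailing blanks
        if not line:
            continue                  # skip empty lines
        if line.startswith("ENDMDL"):
            models.append("\n".join(current))
            current = []              # reset for the next model
        elif not line.startswith("MODEL"):
            current.append(line)
    return models


def write_models(path, prefix="model_"):
    """Split the PDB file at `path`, write <prefix>1.pdb, <prefix>2.pdb, ...; return the count."""
    with open(path) as handle:
        models = split_models(handle)
    for number, text in enumerate(models, start=1):
        with open("%s%d.pdb" % (prefix, number), "w") as output:
            output.write(text)
    return len(models)
```

Calling write_models("models.pdb") would then create model_1.pdb, model_2.pdb and so on in the current directory. Records that follow the last ENDMDL (such as END) are discarded.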
 
 
Back to [[Useful scripts (aka smart piece of code)]]