Single Particle Analysis Preparation

The advent of direct detectors and automated data acquisition means that large amounts of data are acquired in a relatively short time. Most of the preprocessing can be done automatically, either streaming or as a batch of files. The following operations are typically done:

All except the last can be combined in a single command line.

1. Directory layout and parameter files

The user must make a decision as to the organization of files. The aim is to group files together in directories that have some relationship to the stage of processing. This also allows the user to try different processing strategies with the results deposited in separate directories.

It is recommended that a layout similar to that shown in Table 1 be adopted for processing in Bsoft. For independent data sets ("gold standard"), it is best to split the full data set into two or more major directories and process each completely separately.

The parameter files (typically STAR files) store the image paths relative to the parameter file's own location, so it is best not to move a parameter file once it has been created. To place a parameter file in a different location, read it with a program such as bmg and write the new file in the desired place. The intent is that the program adjusts the image paths so that they remain correct in the new parameter file.
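As a minimal sketch of such a relocation, the snippet below builds a bmg command that reads a parameter file from one directory and writes it to another. The file and directory names are hypothetical, and the command is only printed; the "-output" option is the one bmg uses to write the parameter file.

```shell
#!/bin/sh
# Hypothetical locations: the parameter file is rewritten in the new
# place by bmg rather than moved with mv or cp.
SRC="run1/mg_all.star"
DST="run2/mg_all.star"

# Build the command: options first, then the input parameter file.
CMD="bmg -verb 1 -output $DST $SRC"
echo "$CMD"

# Uncomment to execute once Bsoft is on the PATH:
# $CMD
```

After running it for real, it is worth checking that the image paths in the new file still resolve from the new location.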

Table 1: Suggested directories for SPA in Bsoft

Directory   Purpose
mg          Raw micrographs/frames
mg_b2       Micrographs binned two-fold
part        Particle images extracted from the micrographs
ctf         CTF-corrected particle images
ref         Initial reference map(s) for orientation-finding
run1        First run of determining particle orientations with the resultant reconstruction(s)
run2        Second run of determining particle orientations with the resultant reconstruction(s)
...         ...
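
The layout in Table 1 can be created in one step. The following is a small sketch; the project root name (spa_project) is an assumption, not something Bsoft requires.

```shell
#!/bin/sh
# Create the suggested directory layout under a project root.
# PROJECT is a hypothetical name; override it in the environment if desired.
PROJECT="${PROJECT:-spa_project}"

for d in mg mg_b2 part ctf ref run1 run2; do
    mkdir -p "$PROJECT/$d"
done

# List the result for a quick visual check.
ls "$PROJECT"
```

Further run directories (run3, and so on) can be added as new processing strategies are tried.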

All programs handling parameter files (such as bmg, bpartsel, borient, etc.) can read multiple files and concatenate them into one large internal parameter database. The whole internal database is then written out into one parameter file by specifying the "-output" option. If the user requires individual parameter files for each micrograph, the program bmg has a "-split" option to generate one parameter file per micrograph. Some of the programs also allow the user to set the path for files, which is very important to ensure a smooth and easy workflow.

The concept of a micrograph used in Bsoft is the equivalent of taking a single 2D image on a photographic film and scanning it in a digitization device. Single images taken on CCD or direct cameras qualify as simple micrographs. Dose-fractionation (also called movie mode) results in a series of 2D images (frames) where the aligned average is considered equivalent to a micrograph. The initial processing of the micrographs therefore depends on how they were acquired. The best advice is to keep the individual micrographs and their derivatives separate and use a script to automate all the initial processing up to fitting the CTF parameters. Each micrograph is typically preprocessed separately and the parameter files combined afterwards.
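
A script along the following lines illustrates the per-micrograph pattern. It is a sketch only: preprocess_one is a hypothetical placeholder for whatever preprocessing commands apply to the data, the file names are assumptions, and the final bmg call is printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: print the command, execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Hypothetical per-micrograph step; replace the body with the actual
# preprocessing command(s). $1 = image file, $2 = parameter file to produce.
preprocess_one() {
    echo "# preprocess $1 -> $2"
}

# Process every .tif file in a directory separately, then combine the
# resulting parameter files into one with bmg -output.
preprocess_all() {
    dir=$1
    stars=""
    for f in "$dir"/*.tif; do
        [ -e "$f" ] || continue          # glob matched nothing
        base=$(basename "$f" .tif)
        preprocess_one "$f" "$dir/$base.star"
        stars="$stars $dir/$base.star"
    done
    if [ -n "$stars" ]; then
        # Unquoted on purpose: one argument per parameter file
        # (assumes file names without spaces).
        run bmg -verb 1 -output mg_all.star $stars
    fi
}

preprocess_all mg
```

Keeping the per-micrograph step in its own function makes it easy to rerun a single micrograph without touching the others.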

2. Gain-correction

In a CCD or direct detector, the gain is typically corrected automatically and this step can be skipped. However, gain correction is sometimes omitted during acquisition to speed it up. In that case gain correction becomes part of the preprocessing.

Unfortunately, the gain reference image is not guaranteed to be oriented in the same way as the micrograph images. Every camera has blemishes that show up in both the gain reference and micrographs. The orientation is therefore determined by comparing the gain reference with a micrograph and finding the corresponding blemishes. When each micrograph is a set of movie frames, these can be summed to increase the signal to be able to see the blemishes:

bimg -verb 7 -sum mg2304.tif mg2304_sum.mrc

The gain reference should then be oriented to agree with the micrograph:

bimg -verb 7 -reslice x-yz gain.tif gain_resliced.mrc

Multiplication of the micrograph with the gain reference should now remove the features of the camera chip:

bop -verb 7 -sam 0.537 -mult 1,0 mg2304_sum.mrc gain_resliced.mrc mg2304_gc.mrc

The oriented gain reference is now suitable for use during movie frame alignment.
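
The three steps of the gain check can be gathered into one small script so that the file names stay consistent. This is a sketch under assumptions: the base name mg2304 and the reslice order x-yz are simply the values from the example and must be determined for each camera, and bop is assumed to take the two input images followed by the output image. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: print the command, execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

MG="mg2304"        # movie base name used in the example
GAIN="gain.tif"    # gain reference as delivered by the camera
SAM="0.537"        # sampling value used in the example
ORDER="x-yz"       # reslice order found by comparing blemishes

# 1. Sum the movie frames so the blemishes become visible.
run bimg -verb 7 -sum "$MG.tif" "${MG}_sum.mrc"

# 2. Reorient the gain reference to agree with the micrograph.
run bimg -verb 7 -reslice "$ORDER" "$GAIN" gain_resliced.mrc

# 3. Multiply the summed micrograph by the oriented gain reference.
#    Assumption: two input images, then the output image.
run bop -verb 7 -sam "$SAM" -mult 1,0 "${MG}_sum.mrc" gain_resliced.mrc "${MG}_gc.mrc"
```

If the camera features are still visible in the gain-corrected image, the reslice order is probably wrong and another orientation should be tried.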

3. Movie frame alignment and summation

This step is only required where micrographs are acquired as movie frames (dose-fractionated). It combines several subprocesses in one to speed it up and conserve disk space.

The frames are first aligned progressively, starting from a chosen frame, usually not the first frame because of severe initial beam-induced movement. The reference is progressively averaged from already aligned frames to improve subsequent frame alignment. This is then followed by several iterations of alignment to the average until the shifts decrease below a threshold.

A typical command line is:

bseries -verb 1 -frames -counts -Gainref gain.mrc -dose 0.35 -rate 4 -align 5 -resol 20,1000 -shift 100 -bin 2 -write sum -out mg5523.tif