Single Particle Analysis Strategies

Single particle analysis (SPA) aims to determine the structure of a particle from electron micrographs containing many particle images. Modern processing strategies raise three issues that need to be considered before starting: validation approaches, contrast transfer function (CTF) correction, and 3D classification.

The sections below describe these issues in general with links to more specific workflows.


1. Validation approaches

The particle images should be processed in such a way that a valid reconstruction is produced. The principle in all of these approaches is that the images should contain mutually consistent information. Pure noise images do not contain such coherent information, but they can still be aligned to any given reference to give an apparently reasonable reconstruction. The validation methods below are designed to avoid mistaking noise images for real particle images, and to support the resolution estimated for a reconstruction.

1.1. Resolution-limited alignment

Resolution limits are imposed on the alignment of particles so that only the spatial frequencies that carry orientational information are used. The low frequencies often have high amplitudes but not much rotational information, so they should be omitted. The high frequencies often have a very low signal-to-noise ratio, and should be omitted to avoid aligning to noise. In addition, recovery of information in the reconstruction beyond the high resolution limit used in alignment provides a measure of validation. Resolution limits can also be misused: if the user selects too generous a high resolution limit, the data can be overfitted (that is, high frequency noise is included in the alignment). The high resolution limit should therefore be chosen with care.
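
As an illustration of how such limits translate into a band of Fourier shells, the following is a minimal Python sketch. The function names, and the convention that limits are given in angstrom, are assumptions made for this example and are not taken from Bsoft.

```python
import math


def resolution_to_shell(resolution, box_size, pixel_size):
    """Fourier shell radius (in pixels from the origin) for a resolution in angstrom."""
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    # A box of box_size pixels spans box_size * pixel_size angstrom,
    # so shell k corresponds to a resolution of box_size * pixel_size / k.
    return box_size * pixel_size / resolution


def alignment_band(box_size, pixel_size, res_low, res_high):
    """Return (first, last) integer shell indices to use for alignment.

    res_low is the low resolution limit (a large number, e.g. 200 A),
    res_high is the high resolution limit (a small number, e.g. 20 A).
    """
    if res_high >= res_low:
        raise ValueError("res_high must be a smaller number than res_low")
    nyquist = box_size // 2
    # Skip the origin and everything coarser than res_low.
    first = max(1, math.ceil(resolution_to_shell(res_low, box_size, pixel_size)))
    # Stop at res_high, and never go beyond Nyquist.
    last = min(nyquist, math.floor(resolution_to_shell(res_high, box_size, pixel_size)))
    return first, last


# Hypothetical example: 256 pixel box, 1.5 A/pixel, limits of 200 A and 20 A.
band = alignment_band(256, 1.5, 200.0, 20.0)
```

Only the shells inside the returned band would contribute to the alignment score. Signal appearing in the reconstruction at shells beyond the upper index was not used to drive the alignment, which is what gives it value as validation.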

1.2. Independent data sets ("gold standard")

The notion of independent data sets is simply a variant of repetition in science. If you get the same result by processing two or more independent data sets, then the outcome is very likely valid. The approach fails if common information is used in the processing of the separate data sets. One example is the use of the same shape mask on the references, and possibly on the images: such data is coerced to converge to the same result. However, this is rare and is easily avoided by not sharing any form of information between data sets.
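
A minimal sketch of the first step, dividing the particles into two disjoint sets, might look as follows in Python. The function name and the use of integer particle identifiers are assumptions for illustration only.

```python
import random


def split_half_sets(particle_ids, seed=None):
    """Randomly divide particle identifiers into two disjoint sets."""
    rng = random.Random(seed)      # private generator, reproducible with a seed
    shuffled = list(particle_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # Each particle ends up in exactly one of the two sets.
    return sorted(shuffled[:half]), sorted(shuffled[half:])


# Hypothetical example with 11 particles numbered 0 to 10.
set_a, set_b = split_half_sets(range(11), seed=1)
```

The split on its own does not guarantee independence. Each set would also need its own reference and its own mask throughout processing, for the reason given above.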

1.3. Tilt pair/Handedness analysis

Multiple images of the same particle can be used to verify that the processing returns the expected corresponding orientations. This is typically done by taking tilt pairs of micrographs and comparing the orientations obtained for the particles. The angle between the orientations should agree with the angle difference between the micrographs. This is also one way to determine the handedness of the reconstructions.
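
The comparison can be sketched in Python as below, assuming each orientation is available as a 3x3 rotation matrix. All function names and the tolerance value are hypothetical choices for this example.

```python
import math


def rotation_about_axis(axis, angle_deg):
    """Rotation matrix for a rotation about an axis (Rodrigues formula)."""
    x, y, z = axis
    n = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / n, y / n, z / n
    a = math.radians(angle_deg)
    c, s, t = math.cos(a), math.sin(a), 1.0 - math.cos(a)
    return [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def relative_tilt(r_untilted, r_tilted):
    """Rotation angle in degrees of the relative rotation r_tilted * r_untilted^T."""
    rel = matmul(r_tilted, transpose(r_untilted))
    trace = rel[0][0] + rel[1][1] + rel[2][2]
    # Clamp against rounding errors before taking the arc cosine.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def tilt_pair_agreement(pairs, expected_tilt, tolerance=5.0):
    """Fraction of particle pairs whose relative angle matches the stage tilt."""
    good = sum(1 for r0, r1 in pairs
               if abs(relative_tilt(r0, r1) - expected_tilt) <= tolerance)
    return good / len(pairs)
```

This sketch compares only the magnitude of the relative rotation. A handedness test would additionally need to compare the direction of the rotation axis with the known tilt axis of the microscope stage.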


2. Contrast transfer function correction

The use of underfocused images in cryo-EM to enhance contrast requires a correction based on the CTF to align the particle images and recover high frequency information in the reconstructions. There are two approaches: correct for the CTF in the particle images before alignment and reconstruction, or defer the correction until after alignment and apply it during reconstruction.

2.1. CTF correction before alignment

The traditional workflow for SPA involves correcting the extracted particle images, followed by alignment. The images are therefore already corrected by the time the reconstruction is done. One drawback is that if the correction is too aggressive in modifying the amplitudes, it may introduce artifacts that throw off the alignment. In this case it is safest to do only phase flipping rather than use a method that changes the amplitudes.
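
Phase flipping can be illustrated with a short Python sketch. The CTF model and its sign convention here (underfocus positive, amplitude contrast added as a phase shift) are one common choice and are assumptions of this example, as are the function names and default parameter values. Other packages, including Bsoft, may use different conventions.

```python
import math


def electron_wavelength(voltage_kv):
    """Relativistic electron wavelength in angstrom for a voltage in kV."""
    v = voltage_kv * 1e3
    return 12.2643 / math.sqrt(v + 0.97845e-6 * v * v)


def ctf(s, defocus, voltage_kv=300.0, cs_mm=2.7, amp_contrast=0.07):
    """CTF value at spatial frequency s (1/A) for a defocus in angstrom.

    Assumed convention: underfocus is positive and the amplitude
    contrast enters as an additional phase shift.
    """
    lam = electron_wavelength(voltage_kv)
    cs = cs_mm * 1e7  # spherical aberration: mm to angstrom
    chi = math.pi * lam * defocus * s * s - 0.5 * math.pi * cs * lam ** 3 * s ** 4
    return -math.sin(chi + math.asin(amp_contrast))


def phase_flip(coefficients, ctf_values):
    """Negate Fourier coefficients where the CTF is negative.

    The amplitudes are left untouched; only the sign is corrected.
    """
    return [-c if t < 0 else c for c, t in zip(coefficients, ctf_values)]


# Hypothetical example: three frequencies at 2 micrometer underfocus.
freqs = [0.01, 0.03, 0.05]
ctf_values = [ctf(s, defocus=20000.0) for s in freqs]
flipped = phase_flip([1 + 2j, 3 - 1j, -2 + 0.5j], ctf_values)
```

Because the coefficients are only multiplied by plus or minus one, the noise is not amplified near the CTF zeros, which is why phase flipping is the conservative choice before alignment.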

2.2. CTF correction after alignment

In a newer workflow, the CTF is applied to the reference projection so that its power spectrum looks more like that of the original particle image. Because applying the CTF is straightforward, any artifacts due to correction are avoided, and the scaling of the amplitudes relative to noise is similar in both projection and image. The images are then corrected for the CTF during reconstruction. Note that the same artifacts may be introduced at this stage if the correction is too aggressive. This is now the preferred strategy in Bsoft.
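
The two steps can be sketched per Fourier coefficient in Python as follows. The CTF-weighted average with a small regularization constant is a generic Wiener-like scheme used here only for illustration; it is not claimed to be the correction Bsoft applies, and the function names and the `wiener` parameter are hypothetical.

```python
def apply_ctf(reference_coeffs, ctf_values):
    """Multiply reference Fourier coefficients by the CTF of one image."""
    return [c * t for c, t in zip(reference_coeffs, ctf_values)]


def ctf_weighted_average(images, ctfs, wiener=0.1):
    """Combine images per frequency as sum(ctf * image) / (sum(ctf^2) + wiener).

    images and ctfs are lists of equal-length sequences, one per particle.
    A very small wiener constant amplifies noise near the CTF zeros.
    """
    n = len(images[0])
    result = []
    for k in range(n):
        numerator = sum(t[k] * img[k] for img, t in zip(images, ctfs))
        denominator = sum(t[k] * t[k] for t in ctfs) + wiener
        result.append(numerator / denominator)
    return result


# Hypothetical noise-free example: two images of the same object
# recorded with different CTF values at two frequencies.
true_coeffs = [2 + 0j, 1j]
ctfs = [[0.5, -0.8], [-0.9, 0.3]]
images = [apply_ctf(true_coeffs, c) for c in ctfs]
estimate = ctf_weighted_average(images, ctfs, wiener=0.0)
```

In this noise-free case the average recovers the original coefficients. With real data the regularization constant controls how strongly weak frequencies are boosted, which is where an overly aggressive correction would introduce artifacts.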


3. 3D Classification

In Bsoft, a form of K-means classification or multi-reference alignment is done as follows. The particles are aligned against multiple reference maps. The particles are then classified based on their best figure-of-merit with respect to the reference maps. The process is iterated until a stable set of maps is generated. The maps/classes can be modified after each iteration by combining those maps that are very similar, and generating intermediate maps between those that are very dissimilar.
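
The assign-and-update loop can be sketched in Python as below. This toy version is an assumption-laden simplification: particles and references are plain lists of numbers, a normalized correlation coefficient stands in for the figure-of-merit, and the alignment and 3D reconstruction steps are replaced by simple averaging. The function names are hypothetical.

```python
import math


def correlation(a, b):
    """Normalized correlation coefficient, a stand-in figure-of-merit."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def classify(particles, references, max_iter=20):
    """Assign each particle to its best reference and iterate until stable."""
    refs = [list(r) for r in references]
    assignment = None
    for _ in range(max_iter):
        # Each particle goes to the reference with the best figure-of-merit.
        new_assignment = [
            max(range(len(refs)), key=lambda k: correlation(p, refs[k]))
            for p in particles
        ]
        if new_assignment == assignment:
            break  # classes no longer change
        assignment = new_assignment
        # Regenerate each reference from its members (averaging stands in
        # for reconstruction); empty classes keep their old reference.
        for k in range(len(refs)):
            members = [p for p, a in zip(particles, assignment) if a == k]
            if members:
                refs[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, refs


# Hypothetical example: two rising and two falling "particles".
particles = [[1, 2, 3, 4], [1, 2, 3, 5], [4, 3, 2, 1], [5, 3, 2, 1]]
classes, maps = classify(particles, [[0, 1, 2, 3], [3, 2, 1, 0]])
```

The optional step of merging very similar maps and generating intermediate maps between very dissimilar ones is not shown; it would fit between iterations, after the references are regenerated.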