Bsoft: Image

The Image Data Model

Images are ubiquitous objects used on all electronic media, implemented in a large number of ways with different file formats, different access strategies and different notions of information. The most general notion of an image is a 2D raster of values representing gray scale or colour values at particular sampled points within the image. An extension of the basic image model is the time-evolving model (i.e., movie or animation), with time representing a third dimension. In structural biology, we deal with 3D structures, often viewed as static, but in reality subject to time-dependent variation. In electron microscopy, the images taken are approximations of 2D projections of 3D structures. The problem with most available software is that it does not support the concept of an image in its most general form.

The image as a five-dimensional data set

The structures studied in structural biology are inherently 3D, thus the basis of the image model is also 3D. In addition, we would like to pack multiple 2D images into a single file. However, packing 2D images as the sections of a 3D map confuses the distinction between 2D and 3D and should be avoided. Furthermore, we also want to be able to store multiple 3D maps, requiring an additional dimension.

Most of the data sets we work with have single values at each pixel or voxel (gray scale values), but situations may arise where each voxel may have multiple values. The most common usage of multiple pixel values is to represent complex numbers and colour, such as RGB (Red-Green-Blue) and CMYK (Cyan-Magenta-Yellow-blacK). A more extensive use may be to associate a list of values or a spectrum with each pixel. This requires yet another dimension in the image model. The meaning of the channels is captured in the notion of a compound type, including simple (one value), complex (two values), RGB color (three values), etc.

The five dimensions are therefore (in storage order):

Channels - one or more values associated with each voxel
X-dimension
Y-dimension
Z-dimension
Images - a series of images, typically with some relationship (such as 2D projections of the same particle, a tilt series, or a time series)
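The storage order above can be illustrated with a small index calculation. This is only a sketch, assuming that "storage order" means the first-listed dimension (channels) varies fastest and the last (images) varies slowest; the function name voxel_offset is hypothetical and not part of Bsoft.

```python
def voxel_offset(c, x, y, z, n, channels, nx, ny, nz):
    """Linear element offset for channel c at voxel (x, y, z) of image n.

    Assumes channels vary fastest, then x, y, z, and images vary slowest.
    """
    return (((n * nz + z) * ny + y) * nx + x) * channels + c


# Example: two RGB volumes of 4 x 3 x 2 voxels each.
channels, nx, ny, nz, nimg = 3, 4, 3, 2, 2
total = channels * nx * ny * nz * nimg

first = voxel_offset(0, 0, 0, 0, 0, channels, nx, ny, nz)
last = voxel_offset(2, 3, 2, 1, 1, channels, nx, ny, nz)
print(first, last, total)
```

Under this assumption, the channel values of one voxel are adjacent in memory, and a step of one voxel along x moves the offset by the number of channels.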

Data types

Each channel in the image contains a single value; the data types supported in Bsoft are listed in Table 1. Images with multiple channels can have any data type. This makes the image data model more general and, as an example, allows color images to be specified as floating point values.

Table 1. Bsoft image data types
Enumerated data type	C data type	Size (bytes)	Single letter code
UChar	unsigned char	1	b
SChar	signed char	1	c
UShort	unsigned short	2	u
Short	short	2	s
Int	int	4	i
Long	long	8	l
Float	float	4	f
Double	double	8	d
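As an illustration of Table 1, the single letter codes can be mapped to fixed-size binary layouts. The sketch below uses Python's struct module rather than any Bsoft code; the mapping from Bsoft letter codes to struct format characters is an assumption based on the C data types and sizes listed in the table.

```python
import struct

# Bsoft single letter code -> Python struct format character.
# The pairing is an assumption derived from the C types in Table 1.
STRUCT_CODE = {
    "b": "B",  # UChar,  unsigned char
    "c": "b",  # SChar,  signed char
    "u": "H",  # UShort, unsigned short
    "s": "h",  # Short,  short
    "i": "i",  # Int,    int
    "l": "q",  # Long,   8-byte integer
    "f": "f",  # Float,  float
    "d": "d",  # Double, double
}


def datatype_size(letter):
    """Size in bytes of one value, using struct's standard sizes."""
    return struct.calcsize("=" + STRUCT_CODE[letter])


for letter in STRUCT_CODE:
    print(letter, datatype_size(letter))
```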

Compound types

The definition of the channels is captured in the compound type, where the types supported are listed in Table 2.

Table 2. Bsoft image compound types
Enumerated compound type	Elements	Size (values)	Single letter code
TSimple	gray value	1	S
TComplex	real, imaginary	2	C
TVector2	x, y	2	V
TVector3	x, y, z	3	V
TView	x, y, z, a	4	O
TRGB	r, g, b	3	R
TRGBA	r, g, b, a	4	A
TCMYK	c, m, y, k	4	K
TMulti	array of values	n	M
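Combining Tables 1 and 2, the storage needed per voxel is the number of values in the compound type multiplied by the size of the data type. The following sketch is illustrative only: the dictionaries and the function bytes_per_voxel are hypothetical, and TVector2 is assumed to hold two values, as its element list (x, y) suggests.

```python
# Number of values per voxel for each compound type (after Table 2).
# TVector2 is assumed to hold two values (x, y).
COMPOUND_VALUES = {
    "TSimple": 1, "TComplex": 2, "TVector2": 2, "TVector3": 3,
    "TView": 4, "TRGB": 3, "TRGBA": 4, "TCMYK": 4,
}

# Size in bytes of each data type (after Table 1).
DATATYPE_BYTES = {
    "UChar": 1, "SChar": 1, "UShort": 2, "Short": 2,
    "Int": 4, "Long": 8, "Float": 4, "Double": 8,
}


def bytes_per_voxel(compound, datatype, nvalues=None):
    """Bytes per voxel; TMulti needs an explicit number of values."""
    if compound == "TMulti":
        if nvalues is None:
            raise ValueError("TMulti requires the number of values")
        count = nvalues
    else:
        count = COMPOUND_VALUES[compound]
    return count * DATATYPE_BYTES[datatype]


print(bytes_per_voxel("TComplex", "Float"))   # complex floating point
print(bytes_per_voxel("TRGB", "UChar"))       # byte RGB colour
print(bytes_per_voxel("TMulti", "Short", 16)) # 16-value spectrum
```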

Image file formats

It seems that every image processing software package has one or more of its own image file formats. Even in packages where external formats have been adopted, changes in those formats have effectively turned them into different formats. There are many conversion programs dealing with specific pairwise conversions, which is not a particularly efficient solution for the user. Bsoft attempts to deal with images as generalized constructs, encapsulating most of the information embedded in the image files in an internal structure. Conversion then becomes trivial, as reading and writing of multiple file formats are supported. The limiting factor is still the set of limitations within each file format. E.g., you cannot expect file formats designed for single images (such as MRC and EM) to store multiple images (whether 2D or 3D).

Table 3. Image file format features (as implemented in Bsoft)
Image format	Extensions	Data types	Dimensions	Fourier/Complex	Sampling Info	Remarks
ASCII	.asc, .txt	(text)	3D, single	List	No
BioRad	.pic	b, u	3D, single	No	No	Confocal microscopy
Brix	.brx	b	3D, single	No	Indirect	O package, Xtal
Brookhaven STEM	.dat	b	2D, double interleaved	No	One value	STEM corrections applied on reading
CCP4	.map, .ccp, .ccp4	c, s, f, S, F	3D, single	Centered hermitian	Indirect	Xtal
Digital Instruments	.di	s	2D, double	No	No	No write support
Digital Micrograph	.dm, .dm3, .dm4	b, s, i, f, F	2D, single	No	No	Proprietary format
Ditabis image plate reader	.IPL, .IPH, .IPR, .IPC	s, i	2D, single	No	Two values	Micron package
DSN6	.dsn6, .dn6, .omap	b	3D, single	No	Indirect	O package, Xtal
DX	.dx	f	3D, single	No	Three values	OpenDX, visualization
EM	.em	b, s, i, f	3D, single	Hermitian	No	EM package
Goodford	.pot	f	3D, single	No	One value	Electrostatic potential
GRD	.grd	(all)	3D, multiple	No	Three values	Complete Bsoft image data model
HKL	.hkl	(text)	3D, single	List	No	Structure factor format
Imagic	.img (.hed)	b, s, f, F	2D, multiple	Centered	No	Header in a separate file
Image Magick	.miff	b (RGB)	2D, multiple	No	No	X-window display program
JPEG	.jpg, .jpeg	b (RGB)	2D, single	No	No	Web image format
MFF	.mff	b, f	3D, single	No	Three values	Whatif package
MRC	.mrc	b, s, f, S, F	3D, multiple	Centered hermitian	Indirect	MRC package
PIC BP	.bp	b	2D, single	No	No	PIC package
PIF	.pif	b, s, i, f, S, F	3D, multiple	Binary list	Three values	PFT/EM3DR package
PNG	.png	b, s (RGB)	2D, single	No	Two values	Network image format
PNM	.pbm, .pgm, .ppm	b (RGB)	2D, single	No	None	Simple image format
Ser	.ser	f	2D, multiple	No	None	FEI series format
Situs	.situs	f	3D, single	No	One value	Situs package
SPE	.spe	f	2D, single	No	None	SPE CCD format
Spider	.spi	f	3D, multiple	Hermitian	One value	Spider package
Suprim	.spm, .sup, .f	b, s, i, f (RGB)	3D, single	Standard	One value	Suprim package
TIFF	.tif, .tiff	b, s, i, f (RGB)	3D, multiple	No	Two values	Only the byte data type is common

Sampling information: The sampling or voxel/pixel size information is represented as three values (for x, y and z), or two values (for x and y only; TIFF, for example, only provides for sampling information in the x and y directions), or one value (for all three directions). Crystallographic formats (such as CCP4 and MRC) give sampling indirectly, calculated from the ratios of the unit cell dimensions to the size of the unit cell in voxels (this leads to inaccuracies due to round-off).
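The indirect calculation used by crystallographic formats can be sketched as a simple division of each unit cell dimension by the number of voxels spanning the unit cell in that direction. The function name and the example numbers below are hypothetical and chosen only for illustration.

```python
def indirect_sampling(cell, grid):
    """Sampling (angstrom/voxel) from unit cell edge lengths and the
    number of voxels spanning the unit cell along each axis."""
    return tuple(a / m for a, m in zip(cell, grid))


# Hypothetical unit cell of 100 x 100 x 80 angstrom sampled on 64 voxels per axis.
cell = (100.0, 100.0, 80.0)
grid = (64, 64, 64)
sampling = indirect_sampling(cell, grid)
print(sampling)
```

Because the file stores the cell dimensions rather than the sampling itself, a sampling value that is written out and later recomputed this way may differ slightly from the original.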

Raw files - custom interpretation of image files

Bsoft offers a "raw" format to be able to load image files where the format is either not supported, or there is a problem with the header information in the file. Appending a series of tag-value pairs, as described below, to any input file name invokes an attempt to read the file based on the command-line information given by the user, ignoring any information in the file header itself. The image file name must be followed by a string using the sharp character, "#", as delimiter between tag-value pairs. E.g., to interpret the file "input.file" according to particular data type and size parameters:

bimg -verbose 7 input.file#d=f#x=120,120,55#h=1024 output.map

This line will interpret the file as containing a 3D image in floating point format, with the data starting at byte 1024. Typically, the minimum necessary to interpret a file is the data type, the size, and the number of header bytes to skip.
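To make the structure of such a file name explicit, the sketch below splits the example string into the file name and its tag-value pairs, and estimates the file size implied by the tags. It is not the Bsoft parser; the function parse_raw_name is hypothetical, and the 4-byte value size follows from the floating point data type in the example.

```python
def parse_raw_name(spec):
    """Split 'name#tag=value#tag=value...' into (name, {tag: value})."""
    filename, *pairs = spec.split("#")
    tags = {}
    for pair in pairs:
        tag, _, value = pair.partition("=")
        tags[tag] = value
    return filename, tags


filename, tags = parse_raw_name("input.file#d=f#x=120,120,55#h=1024")

# Image size in voxels and header bytes to skip, taken from the tags.
nx, ny, nz = (int(v) for v in tags["x"].split(","))
header = int(tags["h"])

# With 4-byte floating point values, the file should hold at least this many bytes.
expected_bytes = header + nx * ny * nz * 4
print(filename, tags, expected_bytes)
```

Comparing the expected size with the actual file size is a quick way to check that the chosen data type, dimensions and header size are plausible.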

Table 4. Tag-value descriptions for custom interpretation of image files
Tag	Value	Description
h	bytes	Header size = initial number of bytes to skip
d	datatype_letter	Data type character (1,b,c,u,s,j,i,k,l,f,d,S,I,F,D)
x	size_x,size_y,size_z	Image size in voxels
p	page_x,page_y,page_z	Page size in voxels
a	bytes	Number of bytes to pad between pages
s	sampling_x,sampling_y,sampling_z	Sampling/voxel size in angstrom/voxel
c	number_channels	Number of channels (gray scale = 1, RGB = 3)
n	number_images	Number of images in the file
i	selected_image	Select one image to read
f	transform_type	n=NoTransform, s=Standard, c=Centered, h=Hermitian, q=CentHerm
b	0/1	Byte swapping flag
v	0/1	VAX floating point flag
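How a few of these tags could drive the reading of raw data is sketched below. This is an illustration under stated assumptions, not the Bsoft implementation: only the h, d, x, c and b tags are handled, only three data type letters are covered, and the byte swapping flag is assumed to mean that the data are stored in the opposite byte order from the host machine. The function read_raw is hypothetical.

```python
import io
import struct
import sys

# Subset of data type letters -> struct format characters (assumed mapping).
FORMAT_CHAR = {"b": "B", "s": "h", "f": "f"}


def read_raw(stream, tags):
    """Read all values from a binary stream according to raw-format tags."""
    header = int(tags.get("h", 0))
    nx, ny, nz = (int(v) for v in tags["x"].split(","))
    channels = int(tags.get("c", 1))
    native = "<" if sys.byteorder == "little" else ">"
    swapped = ">" if native == "<" else "<"
    # Assumption: b=1 means the file byte order is opposite to the host's.
    order = swapped if tags.get("b", "0") == "1" else native
    count = nx * ny * nz * channels
    fmt = order + str(count) + FORMAT_CHAR[tags["d"]]
    stream.seek(header)  # skip the header bytes
    data = stream.read(struct.calcsize(fmt))
    return struct.unpack(fmt, data)


# Example: a 16-byte header followed by a 2 x 2 x 1 floating point image.
raw = bytes(16) + struct.pack("=4f", 1.0, 2.0, 3.0, 4.0)
values = read_raw(io.BytesIO(raw), {"h": "16", "d": "f", "x": "2,2,1"})
print(values)
```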