Notation

When coding with dclab, you should be aware of the following definitions and design principles.

Events

An event comprises all data recorded for the detection of one object (e.g. cell or bead) in an RT-DC measurement.

Features

A feature is a measurement parameter of an RT-DC measurement. For instance, the feature “index” enumerates all recorded events, and the feature “deform” contains the deformation values of all events. There are scalar features, which assign a single number to each event, and non-scalar features, such as “image” and “contour”. All features in a dataset are exposed to the user as read-only. The following features are supported by dclab:

Scalar features

scalar features, description [units]

area_cvx

Convex area [px]

area_msd

Measured area [px]

area_ratio

Porosity (convex to measured area ratio)

area_um

Area [µm²]

area_um_raw

Area [µm²] of raw contour

aspect

Aspect ratio of bounding box

basinmap0

Basin mapping 0

basinmap1

Basin mapping 1

basinmap2

Basin mapping 2

basinmap3

Basin mapping 3

basinmap4

Basin mapping 4

basinmap5

Basin mapping 5

basinmap6

Basin mapping 6

basinmap7

Basin mapping 7

basinmap8

Basin mapping 8

basinmap9

Basin mapping 9

bg_med

Median frame background brightness [a.u.]

bg_off

Background offset [a.u.]

bright_avg

Brightness average [a.u.]

bright_bc_avg

Brightness average (bgc) [a.u.]

bright_bc_sd

Brightness SD (bgc) [a.u.]

bright_perc_10

10th Percentile of brightness (bgc)

bright_perc_90

90th Percentile of brightness (bgc)

bright_sd

Brightness SD [a.u.]

circ

Circularity

deform

Deformation

deform_raw

Deformation of raw contour

eccentr_prnc

Eccentricity of raw contour

emodulus

Young’s modulus [kPa]

fl1_area

FL-1 area of peak [a.u.]

fl1_dist

FL-1 distance between two first peaks [µs]

fl1_max

FL-1 maximum [a.u.]

fl1_max_ctc

FL-1 maximum, crosstalk-corrected [a.u.]

fl1_npeaks

FL-1 number of peaks

fl1_pos

FL-1 position of peak [µs]

fl1_width

FL-1 width [µs]

fl2_area

FL-2 area of peak [a.u.]

fl2_dist

FL-2 distance between two first peaks [µs]

fl2_max

FL-2 maximum [a.u.]

fl2_max_ctc

FL-2 maximum, crosstalk-corrected [a.u.]

fl2_npeaks

FL-2 number of peaks

fl2_pos

FL-2 position of peak [µs]

fl2_width

FL-2 width [µs]

fl3_area

FL-3 area of peak [a.u.]

fl3_dist

FL-3 distance between two first peaks [µs]

fl3_max

FL-3 maximum [a.u.]

fl3_max_ctc

FL-3 maximum, crosstalk-corrected [a.u.]

fl3_npeaks

FL-3 number of peaks

fl3_pos

FL-3 position of peak [µs]

fl3_width

FL-3 width [µs]

flow_rate

Flow rate [µLs⁻¹]

frame

Video frame number

g_force

Gravitational force in multiples of g

index

Index (Dataset)

index_online

Index (Online)

inert_ratio_cvx

Inertia ratio of convex contour

inert_ratio_prnc

Principal inertia ratio of raw contour

inert_ratio_raw

Inertia ratio of raw contour

ml_class

Most probable ML class

nevents

Number of events in the same image

pc1

Principal component 1

pc2

Principal component 2

per_ratio

Inverse Convexity (raw to convex perimeter ratio)

per_um_raw

Perimeter [µm] of raw contour

pos_x

Position along channel axis [µm]

pos_y

Position lateral in channel [µm]

pressure

Pressure [mPa]

qpi_dm_avg

Dry mass (average) [pg]

qpi_dm_sd

Dry mass (SD) [pg]

qpi_focus

Computed focus distance [µm]

qpi_pha_int

Integrated phase [rad]

qpi_ri_avg

Refractive index (average)

qpi_ri_sd

Refractive index (SD)

size_x

Bounding box size x [µm]

size_y

Bounding box size y [µm]

sym_x

Symmetry ratio left-right

sym_y

Symmetry ratio top-bottom

temp

Chip temperature [°C]

temp_amb

Ambient temperature [°C]

tex_asm_avg

Texture angular second moment (avg)

tex_asm_ptp

Texture angular second moment (ptp)

tex_con_avg

Texture contrast (avg)

tex_con_ptp

Texture contrast (ptp)

tex_cor_avg

Texture correlation (avg)

tex_cor_ptp

Texture correlation (ptp)

tex_den_avg

Texture difference entropy (avg)

tex_den_ptp

Texture difference entropy (ptp)

tex_ent_avg

Texture entropy (avg)

tex_ent_ptp

Texture entropy (ptp)

tex_f12_avg

Texture First measure of correlation (avg)

tex_f12_ptp

Texture First measure of correlation (ptp)

tex_f13_avg

Texture Second measure of correlation (avg)

tex_f13_ptp

Texture Second measure of correlation (ptp)

tex_idm_avg

Texture inverse difference moment (avg)

tex_idm_ptp

Texture inverse difference moment (ptp)

tex_sen_avg

Texture sum entropy (avg)

tex_sen_ptp

Texture sum entropy (ptp)

tex_sva_avg

Texture sum variance (avg)

tex_sva_ptp

Texture sum variance (ptp)

tex_var_avg

Texture variance (avg)

tex_var_ptp

Texture variance (ptp)

tilt

Absolute tilt of raw contour

time

Time [s]

userdef0

User-defined 0

userdef1

User-defined 1

userdef2

User-defined 2

userdef3

User-defined 3

userdef4

User-defined 4

userdef5

User-defined 5

userdef6

User-defined 6

userdef7

User-defined 7

userdef8

User-defined 8

userdef9

User-defined 9

volume

Volume [µm³]

In addition to these scalar features, you can define a large number of features dedicated to machine learning: the “ml_score_???” features, which are probability scores with values between 0 and 1. Each “?” can be a digit or a lower-case letter of the alphabet, e.g. “ml_score_rbc” or “ml_score_3a3”. If “ml_score_???” features are defined, the ancillary “ml_class” feature, which identifies the most probable ML class for each event, becomes available.
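The naming rule and the idea behind “ml_class” can be sketched in plain Python. This is only an illustration and not dclab's implementation: the helper names `is_ml_score_name` and `most_probable_class` are hypothetical, and ordering the classes alphabetically by feature name is an assumption of this sketch.

```python
import re

# Three characters, each a digit or a lower-case letter.
ML_SCORE_PATTERN = re.compile(r"ml_score_[a-z0-9]{3}")


def is_ml_score_name(name):
    """Return True if `name` follows the "ml_score_???" convention."""
    return ML_SCORE_PATTERN.fullmatch(name) is not None


def most_probable_class(scores):
    """Return, per event, the index of the highest-scoring feature.

    `scores` maps "ml_score_???" names to equally long lists of
    probabilities. Classes are numbered in alphabetical order of the
    feature names (an assumption of this sketch).
    """
    names = sorted(scores)
    n_events = len(scores[names[0]])
    classes = []
    for i in range(n_events):
        event_scores = [scores[name][i] for name in names]
        classes.append(event_scores.index(max(event_scores)))
    return classes


scores = {
    "ml_score_rbc": [0.9, 0.2, 0.4],
    "ml_score_3a3": [0.1, 0.8, 0.6],
}
print(is_ml_score_name("ml_score_rbc"))  # True
print(is_ml_score_name("ml_score_RBC"))  # False (upper-case letters)
print(most_probable_class(scores))       # [1, 0, 0]
```

Here, class 0 is “ml_score_3a3” and class 1 is “ml_score_rbc”, because digits sort before letters.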

Non-scalar features

non-scalar features, description [units]

contour

Event contour

image

Gray scale event image

image_bg

Gray scale event background image

mask

Binary mask labeling the event in the image

qpi_amp

Hologram amplitude image

qpi_oah

Off-axis hologram

qpi_oah_bg

Off-axis hologram background

qpi_pha

Hologram phase image [rad]

trace

Dictionary of fluorescence traces

Examples

deformation vs. area plot

import matplotlib.pyplot as plt
import dclab
ds = dclab.new_dataset("data/example.rtdc")
ax = plt.subplot(111)
ax.plot(ds["area_um"], ds["deform"], "o", alpha=.2)
ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
ax.set_ylabel(dclab.dfn.get_feature_label("deform"))
plt.show()

event image plot

import matplotlib.pyplot as plt
import dclab
ds = dclab.new_dataset("data/example_video.rtdc")
ax1 = plt.subplot(211, title="image")
ax2 = plt.subplot(212, title="mask")
ax1.imshow(ds["image"][6], cmap="gray")
ax2.imshow(ds["mask"][6])
plt.show()

Ancillary features

Not all features available in dclab are recorded online during the acquisition of the experimental dataset. Some features, such as “volume”, “emodulus”, or scores from imported machine learning models (“ml_score_???”), are computed offline by dclab. These ancillary features are computed on-the-fly and are made available seamlessly through the same interface.
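The idea of computing a feature on demand behind the same interface can be illustrated with a small, self-contained sketch. It does not reproduce dclab's internals: the class `LazyFeatures` is hypothetical, and it derives “area_ratio” from “area_cvx” and “area_msd” using the definition in the feature table above (convex to measured area ratio).

```python
class LazyFeatures:
    """Minimal read-only feature container with one ancillary feature."""

    def __init__(self, recorded):
        # Store tuples so the recorded data cannot be modified in place.
        self._recorded = {k: tuple(v) for k, v in recorded.items()}
        self._cache = {}

    def _ancillary(self):
        """Return the ancillary features that can currently be computed."""
        available = {}
        if "area_cvx" in self._recorded and "area_msd" in self._recorded:
            available["area_ratio"] = self._area_ratio
        return available

    def _area_ratio(self):
        cvx = self._recorded["area_cvx"]
        msd = self._recorded["area_msd"]
        return tuple(c / m for c, m in zip(cvx, msd))

    def __contains__(self, name):
        return name in self._recorded or name in self._ancillary()

    def __getitem__(self, name):
        if name in self._recorded:
            return self._recorded[name]
        if name not in self._cache:
            # Raises KeyError if the feature is neither recorded
            # nor computable from the recorded data.
            self._cache[name] = self._ancillary()[name]()
        return self._cache[name]


ds = LazyFeatures({"area_cvx": [120.0, 90.0], "area_msd": [100.0, 90.0]})
print("area_ratio" in ds)  # True, although it was never recorded
print(ds["area_ratio"])    # (1.2, 1.0)
```

The caller cannot tell from `ds["area_ratio"]` whether the values were recorded or computed, which is the point of exposing both through one interface.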

Filters

A filter can be used to gate events using features. There are min/max filters and 2D polygon filters. The following table defines the main filtering parameters:

filtering, parsed, description [units]

enable filters

{f}

Enable filtering

hierarchy parent

str

Hierarchy parent of the dataset

limit events

{f}

Upper limit for number of filtered events

polygon filters

{f}

Polygon filter indices

remove invalid events

{f}

Remove events with inf/nan values
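A 2D polygon filter gates events by testing whether the point formed by two feature values lies inside a polygon. The following sketch shows one common way to do this (ray casting) in plain Python. It is not taken from dclab; the function name `point_in_polygon`, the polygon vertices, and the feature values are made up for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Return True if the point (x, y) lies inside `polygon`.

    `polygon` is a list of (x, y) vertices. A horizontal ray is cast
    from the point to the right; an odd number of edge crossings
    means the point is inside.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical polygon in the (area_um, deform) plane
polygon = [(20, 0.0), (80, 0.0), (80, 0.05), (20, 0.05)]
area_um = [30, 90, 50]
deform = [0.01, 0.02, 0.2]

keep = [point_in_polygon(a, d, polygon) for a, d in zip(area_um, deform)]
print(keep)  # [True, False, False]
```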

Min/max filters are also defined in the “filtering” section:

filtering

explanation

area_um min

Exclude events with area [µm²] below this value

area_um max

Exclude events with area [µm²] above this value

aspect max

Exclude events with an aspect ratio above this value
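How keys such as “area_um min” translate into a gate can be sketched without dclab. The function `box_filter` below is a hypothetical helper, not dclab's implementation: it splits each key into a feature name and a bound, and marks events that violate any limit.

```python
def box_filter(features, filtering):
    """Return a list of booleans; True marks events passing all limits.

    `features` maps feature names to equally long lists of values and
    `filtering` maps keys such as "area_um min" to limit values. Keys
    that are not min/max limits of a known feature are ignored.
    """
    n_events = len(next(iter(features.values())))
    keep = [True] * n_events
    for key, limit in filtering.items():
        name, _, bound = key.rpartition(" ")
        if bound not in ("min", "max") or name not in features:
            continue  # e.g. "enable filters" or "limit events"
        for i, value in enumerate(features[name]):
            if bound == "min" and value < limit:
                keep[i] = False
            elif bound == "max" and value > limit:
                keep[i] = False
    return keep


features = {
    "area_um": [30.0, 55.0, 120.0],
    "aspect": [1.1, 1.0, 2.5],
}
filtering = {
    "enable filters": True,
    "area_um min": 40.0,
    "area_um max": 100.0,
    "aspect max": 2.0,
}
print(box_filter(features, filtering))  # [False, True, False]
```

The first event is excluded by “area_um min”, and the third by both “area_um max” and “aspect max”.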

Examples

excluding events with large deformation

import matplotlib.pyplot as plt
import dclab
ds = dclab.new_dataset("data/example.rtdc")

ds.config["filtering"]["deform min"] = 0
ds.config["filtering"]["deform max"] = .1
ds.apply_filter()
dif = ds.filter.all

f, axes = plt.subplots(1, 2, sharex=True, sharey=True)
axes[0].plot(ds["area_um"], ds["bright_avg"], "o", alpha=.2)
axes[0].set_title("unfiltered")
axes[1].plot(ds["area_um"][dif], ds["bright_avg"][dif], "o", alpha=.2)
axes[1].set_title("Deformation <= 0.1")

for ax in axes:
    ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
    ax.set_ylabel(dclab.dfn.get_feature_label("bright_avg"))

plt.tight_layout()
plt.show()

excluding random events

This is useful if you need to have a (sub-)dataset of a specified size. The downsampling is reproducible (the same points are excluded).

import matplotlib.pyplot as plt
import dclab
ds = dclab.new_dataset("data/example.rtdc")
ds.config["filtering"]["limit events"] = 4000
ds.apply_filter()
fid = ds.filter.all

ax = plt.subplot(111)
ax.plot(ds["area_um"][fid], ds["deform"][fid], "o", alpha=.2)
ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
ax.set_ylabel(dclab.dfn.get_feature_label("deform"))
plt.show()
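Reproducible downsampling means that the same events are selected every time. One way to obtain this property is a random generator with a fixed seed, as in the following sketch. The helper `downsample_indices`, the seed value, and the treatment of a non-positive limit as "no limit" are assumptions of this example and do not describe dclab's actual algorithm.

```python
import random


def downsample_indices(n_events, limit, seed=42):
    """Return a sorted list of at most `limit` event indices.

    The fixed `seed` (hypothetical value) makes the selection
    reproducible across calls.
    """
    if limit <= 0 or limit >= n_events:
        return list(range(n_events))
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_events), limit))


first = downsample_indices(10000, 4000)
second = downsample_indices(10000, 4000)
print(len(first))       # 4000
print(first == second)  # True: the same events are kept each time
```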

Experiment metadata

Every RT-DC measurement has metadata consisting of key-value pairs. The following are supported:

experiment, parsed, description [units]

date

str

Date of measurement (‘YYYY-MM-DD’)

event count

{f}

Number of recorded events

run identifier

str

Unique measurement identifier

run index

{f}

Index of measurement run

sample

str

Measured sample or user-defined reference

time

str

Start time of measurement (‘HH:MM:SS[.S]’)

timestamp

float

Start of measurement in unix time [s]

fluorescence, parsed, description [units]

baseline 1 offset

{f}

Baseline offset channel 1

baseline 2 offset

{f}

Baseline offset channel 2

baseline 3 offset

{f}

Baseline offset channel 3

bit depth

{f}

Trace bit depth

channel 1 name

str

FL1 description

channel 2 name

str

FL2 description

channel 3 name

str

FL3 description

channel count

{f}

Number of active channels

channels installed

{f}

Number of available channels

laser 1 lambda

float

Laser 1 wavelength [nm]

laser 1 power

float

Laser 1 output power [%]

laser 2 lambda

float

Laser 2 wavelength [nm]

laser 2 power

float

Laser 2 output power [%]

laser 3 lambda

float

Laser 3 wavelength [nm]

laser 3 power

float

Laser 3 output power [%]

laser count

{f}

Number of active lasers

lasers installed

{f}

Number of available lasers

sample rate

{f}

Trace sample rate [Hz]

samples per event

{f}

Samples per event

signal max

float

Upper voltage detection limit [V]

signal min

float

Lower voltage detection limit [V]

trace median

{f}

Rolling median filter size for traces

fmt_tdms, parsed, description [units]

video frame offset

{f}

Missing events at beginning of video

imaging, parsed, description [units]

flash device

str

Light source device type

flash duration

float

Light source flash duration [µs]

frame rate

float

Imaging frame rate [Hz]

pixel size

float

Pixel size [µm]

roi position x

{f}

Image x coordinate on sensor [px]

roi position y

{f}

Image y coordinate on sensor [px]

roi size x

{f}

Image width [px]

roi size y

{f}

Image height [px]

online_contour, parsed, description [units]

bg empty

{f}

Background correction from empty frames only

bin area min

{f}

Minimum pixel area of binary image event

bin kernel

{f}

Disk size for binary closing of mask image

bin threshold

{f}

Threshold for mask from bg-corrected image

image blur

{f}

Odd sigma for Gaussian blur (21x21 kernel)

no absdiff

{f}

Do not use OpenCV ‘absdiff’ for bg-correction

online_filter, parsed, description [units]

target duration

float

Target measurement duration [min]

target event count

{f}

Target event count for online gating

pipeline, parsed, description [units]

dcnum background

str

Background ID

dcnum data

str

Data ID

dcnum feature

str

Feature extractor ID

dcnum gate

str

Gating ID

dcnum generation

str

Generation ID

dcnum hash

str

Hash

dcnum mapping

str

Event mapping from original dataset

dcnum segmenter

str

Segmenter ID

dcnum yield

{f}

Event yield

qpi, parsed, description [units]

amp border loc

str

Border location specifier for amplitude

amp border px

{f}

Width of border for amplitude [pix]

amp fit offset

str

Amplitude offset correction

amp fit profile

str

Amplitude profile correction

bg method

str

Background computation method

filter name

str

Fourier filter used

filter size

float

Fourier filter size [1/pix]

focus interval

{f}

Focus interval to search [µm]

focus kernel

str

Propagation kernel

focus metric

str

Metric used to calculate focus

focus minimizer

str

Minimizer used to calculate focus

focus padding

{f}

Level of padding for refocus

invert phase

{f}

Invert the phase data

medium index

float

Refractive index of medium

padding

{f}

Level of padding

pha border loc

str

Border location specifier for phase

pha border px

{f}

Width of border for phase [pix]

pha fit offset

str

Phase offset correction

pha fit profile

str

Phase profile correction

pixel size proc

float

QPI pixel size [µm]

pixel size raw

float

Hologram pixel size [µm]

scale to filter

{f}

Scale QPI data to filter size

sideband freq

{f}

Sideband coordinates [1/pix]

software version

str

Software version(s)

subtract mean

{f}

Subtract mean before processing

wavelength

float

Imaging wavelength [nm]

setup, parsed, description [units]

channel width

float

Width of microfluidic channel [µm]

chip identifier

{f}

Unique identifier of the chip used

chip region

{f}

Imaged chip region (channel or reservoir)

flow rate

float

Flow rate in channel [µL/s]

flow rate sample

float

Sample flow rate [µL/s]

flow rate sheath

float

Sheath flow rate [µL/s]

identifier

str

Unique setup identifier

medium

str

Medium used

module composition

str

Comma-separated list of modules used

software version

str

Acquisition software with version

temperature

float

Mean chip temperature [°C]

Example: date and time of a measurement

In [1]: import dclab

In [2]: ds = dclab.new_dataset("data/example.rtdc")

In [3]: ds.config["experiment"]["date"], ds.config["experiment"]["time"]
Out[3]: ('2017-07-16', '19:01:36')
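Because the “date” and “time” values are plain strings in the formats listed above (‘YYYY-MM-DD’ and ‘HH:MM:SS[.S]’), they can be combined into a single `datetime` object with the standard library. The helper `measurement_start` is a hypothetical name used only for this sketch.

```python
from datetime import datetime


def measurement_start(date, time):
    """Combine 'YYYY-MM-DD' and 'HH:MM:SS[.S]' strings into a datetime."""
    # The fractional part of the seconds is optional.
    if "." in time:
        fmt = "%Y-%m-%d %H:%M:%S.%f"
    else:
        fmt = "%Y-%m-%d %H:%M:%S"
    return datetime.strptime(f"{date} {time}", fmt)


start = measurement_start("2017-07-16", "19:01:36")
print(start.isoformat())  # 2017-07-16T19:01:36
```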

Analysis metadata

In addition to inherent metadata (defined during data acquisition), dclab supports metadata that are relevant for certain data analysis pipelines, such as Young’s modulus computation or fluorescence crosstalk correction.

calculation, parsed, description [units]

crosstalk fl12

float

Fluorescence crosstalk, channel 1 to 2

crosstalk fl13

float

Fluorescence crosstalk, channel 1 to 3

crosstalk fl21

float

Fluorescence crosstalk, channel 2 to 1

crosstalk fl23

float

Fluorescence crosstalk, channel 2 to 3

crosstalk fl31

float

Fluorescence crosstalk, channel 3 to 1

crosstalk fl32

float

Fluorescence crosstalk, channel 3 to 2

emodulus lut

str

Look-up table identifier

emodulus medium

str

Medium used (e.g. ‘0.49% MC-PBS’)

emodulus temperature

float

Chip temperature [°C]

emodulus viscosity

float

Viscosity [Pa*s] if ‘medium’ unknown

emodulus viscosity model

str

Viscosity model for known media

User-defined metadata

In addition to the registered metadata keys listed above, you may also define custom metadata in the “user” section. This section will be saved alongside the other metadata when a dataset is exported as an .rtdc (HDF5) file.

Note

It is recommended to use the following data types for the value of each key: str, bool, float and int. Other data types may not render nicely in ShapeOut2 or DCOR.

To edit the “user” section in dclab, simply modify the config property of a loaded dataset. The changes made are not written to the underlying file.

Example: Setting custom “user” metadata in dclab

In [4]: import dclab

In [5]: ds = dclab.new_dataset("data/example.rtdc")

In [6]: my_metadata = {"inlet": True, "n_channels": 4}

In [7]: ds.config["user"] = my_metadata

In [8]: other_metadata = {"outlet": False, "RBC": True}

# we can also add metadata with the `update` method
In [9]: ds.config["user"].update(other_metadata)

# or
In [10]: ds.config.update({"user": other_metadata})

In [11]: print(ds.config["user"])
{'inlet': True, 'n_channels': 4, 'outlet': False, 'RBC': True}

# we can clear the "user" section like so:
In [12]: ds.config["user"].clear()

If you are implementing a custom data acquisition pipeline, you may alternatively add user-defined metadata permanently to an .rtdc file in a post-measurement step, as in the following example.

Example: Setting custom “user” metadata permanently

import h5py
# open the file in append mode ("a"); the h5py default is read-only
with h5py.File("/path/to/your/dataset.rtdc", "a") as h5:
    h5.attrs["user:inlet"] = True
    h5.attrs["user:n_channels"] = 4
    h5.attrs["user:outlet"] = False
    h5.attrs["user:RBC"] = True
    h5.attrs["user:project"] = "strangelove"
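The attribute names in the example above follow a “section:key” pattern. The sketch below shows how a nested configuration dictionary could be flattened into such names and restored again. Both helper functions are hypothetical, and applying the same pattern to sections other than “user” is an assumption of this example.

```python
def config_to_attrs(config):
    """Flatten {section: {key: value}} into {"section:key": value}."""
    attrs = {}
    for section, items in config.items():
        for key, value in items.items():
            attrs[f"{section}:{key}"] = value
    return attrs


def attrs_to_config(attrs):
    """Rebuild the nested dictionary from "section:key" names."""
    config = {}
    for name, value in attrs.items():
        section, sep, key = name.partition(":")
        if not sep:
            continue  # not a "section:key" attribute
        config.setdefault(section, {})[key] = value
    return config


config = {"user": {"inlet": True, "n_channels": 4}}
attrs = config_to_attrs(config)
print(attrs)  # {'user:inlet': True, 'user:n_channels': 4}
print(attrs_to_config(attrs) == config)  # True
```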

User-defined metadata can also be used with user-defined plugin features. This allows you to design plugin features which utilize your pipeline-specific metadata.

Basins

Since dclab 0.51.0, you can define so-called basins in .rtdc files. Basins are files or remote locations that contain additional features that are not part of the file you opened initially.

For instance, you might want to compute additional features for a measurement without editing the original file data/example.rtdc, while still having access to the features of the original file when working with the new file test.rtdc.

In [13]: import dclab

# Create the smaller file with the basin defined.
In [14]: with dclab.new_dataset("data/example.rtdc") as dso, dclab.RTDCWriter("test.rtdc", mode="reset") as hw:
   ....:    # copy metadata
   ....:    meta = dict(dso.config)
   ....:    meta.pop("filtering")
   ....:    hw.store_metadata(meta)
   ....:    # store a feature from the original dataset
   ....:    hw.store_feature("deform", dso["deform"])
   ....:    # store a user-defined feature
   ....:    hw.store_feature("userdef1", 2.5*dso["deform"])
   ....:    # store the basin information
   ....:    hw.store_basin(basin_name="mytest",
   ....:                   basin_type="file",
   ....:                   basin_format="hdf5",
   ....:                   basin_locs=["data/example.rtdc"])
   ....: 

In [15]: ds2 = dclab.new_dataset("test.rtdc")

# the basin in "test.rtdc" gives you access to features stored in "data/example.rtdc"
In [16]: print(ds2.features)
['area_cvx', 'area_msd', 'area_ratio', 'area_um', 'aspect', 'bright_avg', 'bright_sd', 'circ', 'circ_times_area', 'deform', 'frame', 'index', 'inert_ratio_cvx', 'inert_ratio_raw', 'nevents', 'pos_x', 'pos_y', 'size_x', 'size_y', 'time', 'userdef1']
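Conceptually, a dataset with basins first looks for a feature in its own data and then falls back to the basins. The following plain-Python sketch models only this lookup order. The class `BasinDataset` is hypothetical and greatly simplified compared to dclab's Basin classes.

```python
class BasinDataset:
    """Toy dataset that falls back to basins for missing features."""

    def __init__(self, features, basins=()):
        self._features = dict(features)
        self._basins = list(basins)

    @property
    def features(self):
        """Sorted names of all features, including those from basins."""
        names = set(self._features)
        for basin in self._basins:
            names.update(basin.features)
        return sorted(names)

    def __getitem__(self, name):
        # Features stored in the dataset itself take precedence.
        if name in self._features:
            return self._features[name]
        for basin in self._basins:
            if name in basin.features:
                return basin[name]
        raise KeyError(name)


original = BasinDataset({"area_um": [50.0, 60.0], "deform": [0.01, 0.02]})
derived = BasinDataset(
    {"deform": [0.01, 0.02], "userdef1": [0.025, 0.05]},
    basins=[original],
)
print(derived.features)    # ['area_um', 'deform', 'userdef1']
print(derived["area_um"])  # [50.0, 60.0], taken from the basin
```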

For more information, please take a look at the documentation of Basin and its subclasses.