Notation

When coding with dclab, you should be aware of the following definitions and design principles.

Events

An event comprises all data recorded for the detection of one object (e.g. cell or bead) in an RT-DC measurement.

Features

A feature is a measurement parameter of an RT-DC measurement. For instance, the feature “index” enumerates all recorded events, and the feature “deform” contains the deformation values of all events. There are scalar features, i.e. features that assign a single number to each event, and non-scalar features, such as “image” and “contour”. The following features are supported by dclab:

Scalar features

scalar features   description [units]
area_cvx          Convex area [px]
area_msd          Measured area [px]
area_ratio        Porosity (convex to measured area ratio)
area_um           Area [µm²]
aspect            Aspect ratio of bounding box
bright_avg        Brightness average within contour [a.u.]
bright_sd         Brightness SD within contour [a.u.]
circ              Circularity
deform            Deformation
emodulus          Young’s modulus [kPa]
fl1_area          FL-1 area of peak [a.u.]
fl1_dist          FL-1 distance between two first peaks [µs]
fl1_max           FL-1 maximum [a.u.]
fl1_max_ctc       FL-1 maximum, crosstalk-corrected [a.u.]
fl1_npeaks        FL-1 number of peaks
fl1_pos           FL-1 position of peak [µs]
fl1_width         FL-1 width [µs]
fl2_area          FL-2 area of peak [a.u.]
fl2_dist          FL-2 distance between two first peaks [µs]
fl2_max           FL-2 maximum [a.u.]
fl2_max_ctc       FL-2 maximum, crosstalk-corrected [a.u.]
fl2_npeaks        FL-2 number of peaks
fl2_pos           FL-2 position of peak [µs]
fl2_width         FL-2 width [µs]
fl3_area          FL-3 area of peak [a.u.]
fl3_dist          FL-3 distance between two first peaks [µs]
fl3_max           FL-3 maximum [a.u.]
fl3_max_ctc       FL-3 maximum, crosstalk-corrected [a.u.]
fl3_npeaks        FL-3 number of peaks
fl3_pos           FL-3 position of peak [µs]
fl3_width         FL-3 width [µs]
frame             Video frame number
g_force           Gravitational force in multiples of g
index             Event index (Dataset)
index_online      Event index (Online)
inert_ratio_cvx   Inertia ratio of convex contour
inert_ratio_prnc  Principal inertia ratio of raw contour
inert_ratio_raw   Inertia ratio of raw contour
ml_class          Most probable ML class
nevents           Total number of events in the same image
pc1               Principal component 1
pc2               Principal component 2
pos_x             Position along channel axis [µm]
pos_y             Position lateral in channel [µm]
size_x            Bounding box size x [µm]
size_y            Bounding box size y [µm]
temp              Chip temperature [°C]
temp_amb          Ambient temperature [°C]
tilt              Absolute tilt of raw contour
time              Event time [s]
userdef0          User-defined 0
userdef1          User-defined 1
userdef2          User-defined 2
userdef3          User-defined 3
userdef4          User-defined 4
userdef5          User-defined 5
userdef6          User-defined 6
userdef7          User-defined 7
userdef8          User-defined 8
userdef9          User-defined 9
volume            Volume [µm³]

In addition to these scalar features, it is possible to define a large number of features dedicated to machine learning, the “ml_score_???” features. Each “?” can be a digit or a lower-case letter of the alphabet, e.g. “ml_score_rbc” or “ml_score_3a3”. If “ml_score_???” features are defined, then the ancillary “ml_class” feature, which identifies the most probable class for each event, becomes available.
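
The naming scheme and the idea behind “ml_class” can be illustrated with a minimal sketch in plain Python. The helper names `is_ml_score_feature` and `most_probable_class` are hypothetical and not part of dclab. The class index is assumed here to follow the alphabetical order of the feature names; dclab itself may use a different convention.

```python
import re

# Pattern for the "ml_score_???" naming scheme described above:
# exactly three characters, each a digit or a lower-case letter.
ML_SCORE_PATTERN = re.compile(r"ml_score_[a-z0-9]{3}")


def is_ml_score_feature(name):
    """Return True if `name` follows the "ml_score_???" scheme."""
    return ML_SCORE_PATTERN.fullmatch(name) is not None


def most_probable_class(scores):
    """Return, for each event, the index of the highest-scoring feature.

    `scores` maps "ml_score_???" names to per-event score lists.
    Assumption: classes are indexed by alphabetically sorted name.
    """
    names = sorted(scores)
    n_events = len(scores[names[0]])
    classes = []
    for i in range(n_events):
        event_scores = [scores[name][i] for name in names]
        classes.append(event_scores.index(max(event_scores)))
    return classes


# Toy scores for three events (made-up numbers)
scores = {
    "ml_score_rbc": [0.9, 0.2, 0.4],
    "ml_score_3a3": [0.1, 0.8, 0.6],
}
print(is_ml_score_feature("ml_score_rbc"))  # True
print(is_ml_score_feature("ml_score_RBC"))  # False (upper case)
print(most_probable_class(scores))          # [1, 0, 0]
```

In this toy example, “ml_score_3a3” sorts before “ml_score_rbc”, so class 0 corresponds to “3a3” and class 1 to “rbc”.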

Non-scalar features

non-scalar features  description [units]
contour              Event contour
image                Gray scale event image
image_bg             Gray scale event background image
mask                 Binary mask labeling the event in the image
trace                Dictionary of fluorescence traces

Examples

deformation vs. area plot

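import matplotlib.pyplot as plt
import dclab

# load the example dataset
ds = dclab.new_dataset("data/example.rtdc")
ax = plt.subplot(111)
# scatter plot with one semi-transparent marker per event
ax.plot(ds["area_um"], ds["deform"], "o", alpha=.2)
# axis labels (with units) from the dclab feature definitions
ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
ax.set_ylabel(dclab.dfn.get_feature_label("deform"))
plt.show()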

[Figure: scatter plot of deformation versus area]

event image plot

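import matplotlib.pyplot as plt
import dclab

# load an example dataset that contains image data
ds = dclab.new_dataset("data/example_video.rtdc")
ax1 = plt.subplot(211, title="image")
ax2 = plt.subplot(212, title="mask")
# non-scalar features are indexed by event; show event number 6
ax1.imshow(ds["image"][6], cmap="gray")
ax2.imshow(ds["mask"][6])
plt.show()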

[Figure: event image (top) and corresponding mask (bottom)]

Ancillary features

Not all features available in dclab are recorded online during the acquisition of the experimental dataset. Some features, such as “volume”, “emodulus”, or scores from imported machine-learning models (“ml_score_???”), are computed offline by dclab. These ancillary features are computed on the fly and are available through the same interface as the recorded features.
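
The idea of computing a feature on demand behind a common interface can be sketched as follows. This is a toy illustration, not dclab's implementation: the class `LazyFeatures` and its registry are hypothetical, and “area_ratio” is computed as the ratio of “area_cvx” to “area_msd”, following its description in the scalar features table.

```python
class LazyFeatures:
    """Toy container: recorded features plus features computed on demand."""

    def __init__(self, recorded):
        self._recorded = dict(recorded)  # features "recorded online"
        self._cache = {}                 # ancillary results, computed once
        # Hypothetical registry: name -> (required features, function)
        self._ancillary = {
            "area_ratio": (
                ("area_cvx", "area_msd"),
                lambda cvx, msd: [c / m for c, m in zip(cvx, msd)],
            ),
        }

    def __contains__(self, name):
        if name in self._recorded:
            return True
        if name in self._ancillary:
            required, _ = self._ancillary[name]
            return all(r in self._recorded for r in required)
        return False

    def __getitem__(self, name):
        if name in self._recorded:
            return self._recorded[name]
        if name not in self:
            raise KeyError(name)
        if name not in self._cache:
            # compute the ancillary feature on first access only
            required, func = self._ancillary[name]
            self._cache[name] = func(*(self._recorded[r] for r in required))
        return self._cache[name]


# Made-up values for two events
ds = LazyFeatures({"area_cvx": [110.0, 96.0], "area_msd": [100.0, 80.0]})
print("area_ratio" in ds)  # True: all required features are present
print(ds["area_ratio"])    # computed on first access, then cached
```

The caller accesses `ds["area_ratio"]` exactly like a recorded feature, which is the design principle described above.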

Filters

A filter can be used to gate events using features. There are min/max filters and 2D polygon filters. The following table defines the main filtering parameters:

filtering              parsed  description [units]
enable filters         {f}     Enable filtering
hierarchy parent       str     Hierarchy parent of the dataset
limit events           {f}     Upper limit for number of filtered events
polygon filters        {f}     Polygon filter indices
remove invalid events  {f}     Remove events with inf/nan values

Min/max filters are also defined in the “filtering” section. Each key consists of a feature name followed by “min” or “max”, for example:

filtering    explanation
area_um min  Exclude events with area [µm²] below this value
area_um max  Exclude events with area [µm²] above this value
aspect max   Exclude events with an aspect ratio above this value
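
How such min/max keys translate into a per-event selection can be sketched in plain Python. The function `minmax_mask` is a hypothetical helper, not part of dclab, and the feature values are made up. It only illustrates the gating logic described in the table above.

```python
def minmax_mask(features, filtering):
    """Return one boolean per event: True if the event passes all limits.

    `features` maps feature names to per-event value lists;
    `filtering` holds "<feature> min" / "<feature> max" keys.
    Hypothetical helper for illustration only.
    """
    n_events = len(next(iter(features.values())))
    mask = [True] * n_events
    for key, limit in filtering.items():
        name, _, kind = key.rpartition(" ")
        if kind not in ("min", "max") or name not in features:
            continue  # skip other filtering parameters
        for i, value in enumerate(features[name]):
            if kind == "min" and value < limit:
                mask[i] = False  # below the lower limit
            elif kind == "max" and value > limit:
                mask[i] = False  # above the upper limit
    return mask


# Made-up values for three events
features = {"area_um": [20.0, 55.0, 130.0], "aspect": [1.1, 1.0, 2.5]}
filtering = {
    "area_um min": 25.0,
    "area_um max": 120.0,
    "aspect max": 2.0,
    "enable filters": True,  # not a min/max key, ignored here
}
print(minmax_mask(features, filtering))  # [False, True, False]
```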

Examples

excluding events with large deformation

import matplotlib.pyplot as plt
import dclab
ds = dclab.new_dataset("data/example.rtdc")

# keep only events with a deformation between 0 and 0.1
ds.config["filtering"]["deform min"] = 0
ds.config["filtering"]["deform max"] = .1
ds.apply_filter()
# boolean array that is True for all events passing the filters
dif = ds.filter.all

f, axes = plt.subplots(1, 2, sharex=True, sharey=True)
axes[0].plot(ds["area_um"], ds["bright_avg"], "o", alpha=.2)
axes[0].set_title("unfiltered")
axes[1].plot(ds["area_um"][dif], ds["bright_avg"][dif], "o", alpha=.2)
axes[1].set_title("Deformation <= 0.1")

for ax in axes:
    ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
    ax.set_ylabel(dclab.dfn.get_feature_label("bright_avg"))

plt.tight_layout()
plt.show()

[Figure: brightness versus area, unfiltered (left) and with deformation <= 0.1 (right)]

excluding random events

This is useful if you need a (sub-)dataset of a specified size. The downsampling is reproducible, i.e. the same events are excluded every time the filter is applied.

import matplotlib.pyplot as plt
import dclab
ds = dclab.new_dataset("data/example.rtdc")
# keep at most 4000 events
ds.config["filtering"]["limit events"] = 4000
ds.apply_filter()
# boolean array that is True for the events that were kept
fid = ds.filter.all

ax = plt.subplot(111)
ax.plot(ds["area_um"][fid], ds["deform"][fid], "o", alpha=.2)
ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
ax.set_ylabel(dclab.dfn.get_feature_label("deform"))
plt.show()
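
What “reproducible downsampling” means can be sketched with the standard library alone. The function `downsample_mask` and its `seed` parameter are hypothetical. The sketch only shows why a fixed seed yields the same selection on every call; the algorithm used by dclab may differ.

```python
import random


def downsample_mask(n_events, limit, seed=42):
    """Boolean mask that keeps at most `limit` of `n_events` events.

    A fixed seed makes the random selection reproducible.
    Assumption: illustrates the idea only, not dclab's algorithm.
    """
    if limit >= n_events:
        return [True] * n_events  # nothing to exclude
    rng = random.Random(seed)
    keep = set(rng.sample(range(n_events), limit))
    return [i in keep for i in range(n_events)]


mask_a = downsample_mask(10000, 4000)
mask_b = downsample_mask(10000, 4000)
print(sum(mask_a))       # 4000 events are kept
print(mask_a == mask_b)  # True: same events selected both times
```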

[Figure: deformation versus area after limiting the dataset to 4000 events]

Experiment metadata

Every RT-DC measurement has metadata consisting of key-value pairs, grouped into sections. The following sections and keys are supported:

experiment   parsed  description [units]
date         str     Date of measurement (‘YYYY-MM-DD’)
event count  {f}     Number of recorded events
run index    {f}     Index of measurement run
sample       str     Measured sample or user-defined reference
time         str     Start time of measurement (‘HH:MM:SS[.S]’)

fluorescence        parsed  description [units]
baseline 1 offset   {f}     Baseline offset channel 1
baseline 2 offset   {f}     Baseline offset channel 2
baseline 3 offset   {f}     Baseline offset channel 3
bit depth           {f}     Trace bit depth
channel 1 name      str     FL1 description
channel 2 name      str     FL2 description
channel 3 name      str     FL3 description
channel count       {f}     Number of active channels
channels installed  {f}     Number of available channels
laser 1 lambda      float   Laser 1 wavelength [nm]
laser 1 power       float   Laser 1 output power [%]
laser 2 lambda      float   Laser 2 wavelength [nm]
laser 2 power       float   Laser 2 output power [%]
laser 3 lambda      float   Laser 3 wavelength [nm]
laser 3 power       float   Laser 3 output power [%]
laser count         {f}     Number of active lasers
lasers installed    {f}     Number of available lasers
sample rate         {f}     Trace sample rate [Hz]
samples per event   {f}     Samples per event
signal max          float   Upper voltage detection limit [V]
signal min          float   Lower voltage detection limit [V]
trace median        {f}     Rolling median filter size for traces

fmt_tdms            parsed  description [units]
video frame offset  {f}     Missing events at beginning of video

imaging         parsed  description [units]
flash device    str     Light source device type
flash duration  float   Light source flash duration [µs]
frame rate      float   Imaging frame rate [Hz]
pixel size      float   Pixel size [µm]
roi position x  {f}     Image x coordinate on sensor [px]
roi position y  {f}     Image y coordinate on sensor [px]
roi size x      {f}     Image width [px]
roi size y      {f}     Image height [px]

online_contour  parsed  description [units]
bg empty        {f}     Background correction from empty frames only
bin area min    {f}     Minimum pixel area of binary image event
bin kernel      {f}     Disk size for binary closing of mask image
bin threshold   {f}     Threshold for mask from bg-corrected image
image blur      {f}     Odd sigma for Gaussian blur (21x21 kernel)
no absdiff      {f}     Do not use OpenCV ‘absdiff’ for bg-correction

online_filter       parsed  description [units]
target duration     float   Target measurement duration [min]
target event count  {f}     Target event count for online gating

setup               parsed  description [units]
channel width       float   Width of microfluidic channel [µm]
chip identifier     {f}     Unique identifier of the chip used
chip region         {f}     Imaged chip region (channel or reservoir)
flow rate           float   Flow rate in channel [µL/s]
flow rate sample    float   Sample flow rate [µL/s]
flow rate sheath    float   Sheath flow rate [µL/s]
identifier          str     Unique setup identifier
medium              str     Medium used
module composition  str     Comma-separated list of modules used
software version    str     Acquisition software with version
temperature         float   Mean chip temperature [°C]

Example: date and time of a measurement

In [1]: import dclab

In [2]: ds = dclab.new_dataset("data/example.rtdc")

In [3]: ds.config["experiment"]["date"], ds.config["experiment"]["time"]
Out[3]: ('2017-07-16', '19:01:36')
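
Since “date” and “time” are stored as strings, it can be convenient to combine them into a single `datetime` object. The following sketch uses only the standard library. The helper `measurement_start` is hypothetical and not part of dclab. It accepts both time formats listed in the table above (‘HH:MM:SS’ and ‘HH:MM:SS.S’).

```python
from datetime import datetime


def measurement_start(date, time):
    """Combine the "date" and "time" metadata strings into a datetime.

    Handles 'HH:MM:SS' as well as 'HH:MM:SS.S' (fractional seconds).
    Hypothetical helper for illustration only.
    """
    if "." in time:
        fmt = "%Y-%m-%d %H:%M:%S.%f"
    else:
        fmt = "%Y-%m-%d %H:%M:%S"
    return datetime.strptime(f"{date} {time}", fmt)


# Values taken from the example output above
start = measurement_start("2017-07-16", "19:01:36")
print(start.isoformat())  # 2017-07-16T19:01:36
```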

Analysis metadata

In addition to inherent (defined during data acquisition) metadata, dclab also supports additional metadata that are relevant for certain data analysis pipelines, such as Young’s modulus computation or fluorescence crosstalk correction.

calculation           parsed  description [units]
crosstalk fl12        float   Fluorescence crosstalk, channel 1 to 2
crosstalk fl13        float   Fluorescence crosstalk, channel 1 to 3
crosstalk fl21        float   Fluorescence crosstalk, channel 2 to 1
crosstalk fl23        float   Fluorescence crosstalk, channel 2 to 3
crosstalk fl31        float   Fluorescence crosstalk, channel 3 to 1
crosstalk fl32        float   Fluorescence crosstalk, channel 3 to 2
emodulus lut          str     Look-up table identifier
emodulus medium       str     Medium used (e.g. CellCarrierB, water)
emodulus model        {f}     Model [DEPRECATED]
emodulus temperature  float   Chip temperature [°C]
emodulus viscosity    float   Viscosity [Pa*s] if ‘medium’ unknown
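
To give an intuition for the crosstalk values, the following sketch corrects two fluorescence channels under an assumed linear mixing model: the measured signal in channel 1 is the true signal plus a fraction “crosstalk fl21” of the true channel 2 signal, and vice versa. Both the model and the function `correct_two_channels` are assumptions made for illustration; the sketch is restricted to two channels and does not reproduce dclab's actual correction code.

```python
def correct_two_channels(m1, m2, ct12, ct21):
    """Undo crosstalk between two channels for one event.

    Assumed mixing model (illustration only):
        m1 = t1 + ct21 * t2   (channel 2 bleeds into channel 1)
        m2 = t2 + ct12 * t1   (channel 1 bleeds into channel 2)
    Solving this 2x2 linear system for the true signals gives the
    expressions below.
    """
    det = 1.0 - ct12 * ct21
    t1 = (m1 - ct21 * m2) / det
    t2 = (m2 - ct12 * m1) / det
    return t1, t2


# Made-up true signals and crosstalk values
true1, true2 = 100.0, 50.0
ct12, ct21 = 0.1, 0.2
measured1 = true1 + ct21 * true2  # channel 1 with bleed-through
measured2 = true2 + ct12 * true1  # channel 2 with bleed-through
corrected = correct_two_channels(measured1, measured2, ct12, ct21)
print(corrected)  # approximately (100.0, 50.0)
```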

User-defined metadata

In addition to the registered metadata keys listed above, you may also define custom metadata in the “user” section. This section will be saved alongside the other metadata when a dataset is exported as an .rtdc (HDF5) file.

Note

It is recommended to use the following data types for the value of each key: str, bool, float and int. Other data types may not render nicely in ShapeOut2 or DCOR.
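
A quick way to check a metadata dictionary against this recommendation is sketched below. The function `check_user_metadata` is a hypothetical helper, not part of dclab.

```python
# Data types recommended in the note above
RECOMMENDED_TYPES = (str, bool, float, int)


def check_user_metadata(metadata):
    """Return the keys whose values are not of a recommended type."""
    return [key for key, value in metadata.items()
            if not isinstance(value, RECOMMENDED_TYPES)]


my_metadata = {"inlet": True, "n_channels": 4, "tags": ["a", "b"]}
print(check_user_metadata(my_metadata))  # ['tags']
```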

To edit the “user” section in dclab, simply modify the config property of a loaded dataset. The changes made are not written to the underlying file.

Example: Setting custom “user” metadata in dclab

In [4]: import dclab

In [5]: ds = dclab.new_dataset("data/example.rtdc")

In [6]: my_metadata = {"inlet": True, "n_channels": 4}

In [7]: ds.config["user"] = my_metadata

In [8]: other_metadata = {"outlet": False, "RBC": True}

# we can also add metadata with the `update` method
In [9]: ds.config["user"].update(other_metadata)

# or
In [10]: ds.config.update({"user": other_metadata})

In [11]: print(ds.config["user"])
{'inlet': True, 'n_channels': 4, 'outlet': False, 'RBC': True}

# we can clear the "user" section like so:
In [12]: ds.config["user"].clear()

If you are implementing a custom data acquisition pipeline, you may alternatively add user-defined metadata permanently to an .rtdc file in a post-measurement step:

Example: Setting custom “user” metadata permanently

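import h5py
# open the file in append mode ("a") so that attributes can be written
with h5py.File("/path/to/your/dataset.rtdc", "a") as h5:
    # keys of the "user" section are stored with the prefix "user:"
    h5.attrs["user:inlet"] = True
    h5.attrs["user:n_channels"] = 4
    h5.attrs["user:outlet"] = False
    h5.attrs["user:RBC"] = True
    h5.attrs["user:project"] = "strangelove"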
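
The mapping between a “user” dictionary and the prefixed attribute names used in the example above can be sketched without any file access. The helpers `to_attrs` and `from_attrs` are hypothetical and not part of dclab or h5py. They operate on any mapping, here a plain dict.

```python
def to_attrs(user_metadata, section="user"):
    """Prefix each key with "<section>:" as in the attribute names above."""
    return {f"{section}:{key}": value
            for key, value in user_metadata.items()}


def from_attrs(attrs, section="user"):
    """Collect all "<section>:"-prefixed entries into a plain dict."""
    prefix = section + ":"
    return {key[len(prefix):]: value
            for key, value in attrs.items()
            if key.startswith(prefix)}


user = {"inlet": True, "n_channels": 4, "project": "strangelove"}
attrs = to_attrs(user)
print(attrs)
print(from_attrs(attrs) == user)  # True: the round trip preserves the data
```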

User-defined metadata can also be used with user-defined plugin features. This allows you to design plugin features which utilize your pipeline-specific metadata.