Notation

When coding with dclab, you should be aware of the following definitions and design principles.

Events

An event comprises all data recorded for the detection of one object (e.g. cell or bead) in an RT-DC measurement.

Features

A feature is a measurement parameter of an RT-DC measurement. For instance, the feature “index” enumerates all recorded events, and the feature “deform” contains the deformation values of all events. There are scalar features, i.e. features that assign a single number to each event, and non-scalar features, such as “image” and “contour”. The following features are supported by dclab:

Scalar features

scalar features	description [units]
area_cvx	Convex area [px]
area_msd	Measured area [px]
area_ratio	Porosity (convex to measured area ratio)
area_um	Area [µm²]
aspect	Aspect ratio of bounding box
bg_med	Median frame background brightness [a.u.]
bright_avg	Brightness average [a.u.]
bright_bc_avg	Brightness average (bgc) [a.u.]
bright_bc_sd	Brightness SD (bgc) [a.u.]
bright_perc_10	10th Percentile of brightness (bgc)
bright_perc_90	90th Percentile of brightness (bgc)
bright_sd	Brightness SD [a.u.]
circ	Circularity
deform	Deformation
emodulus	Young’s modulus [kPa]
fl1_area	FL-1 area of peak [a.u.]
fl1_dist	FL-1 distance between two first peaks [µs]
fl1_max	FL-1 maximum [a.u.]
fl1_max_ctc	FL-1 maximum, crosstalk-corrected [a.u.]
fl1_npeaks	FL-1 number of peaks
fl1_pos	FL-1 position of peak [µs]
fl1_width	FL-1 width [µs]
fl2_area	FL-2 area of peak [a.u.]
fl2_dist	FL-2 distance between two first peaks [µs]
fl2_max	FL-2 maximum [a.u.]
fl2_max_ctc	FL-2 maximum, crosstalk-corrected [a.u.]
fl2_npeaks	FL-2 number of peaks
fl2_pos	FL-2 position of peak [µs]
fl2_width	FL-2 width [µs]
fl3_area	FL-3 area of peak [a.u.]
fl3_dist	FL-3 distance between two first peaks [µs]
fl3_max	FL-3 maximum [a.u.]
fl3_max_ctc	FL-3 maximum, crosstalk-corrected [a.u.]
fl3_npeaks	FL-3 number of peaks
fl3_pos	FL-3 position of peak [µs]
fl3_width	FL-3 width [µs]
flow_rate	Flow rate [µLs⁻¹]
frame	Video frame number
g_force	Gravitational force in multiples of g
index	Index (Dataset)
index_online	Index (Online)
inert_ratio_cvx	Inertia ratio of convex contour
inert_ratio_prnc	Principal inertia ratio of raw contour
inert_ratio_raw	Inertia ratio of raw contour
ml_class	Most probable ML class
nevents	Number of events in the same image
pc1	Principal component 1
pc2	Principal component 2
pos_x	Position along channel axis [µm]
pos_y	Position lateral in channel [µm]
pressure	Pressure [mbar]
size_x	Bounding box size x [µm]
size_y	Bounding box size y [µm]
temp	Chip temperature [°C]
temp_amb	Ambient temperature [°C]
tex_asm_avg	Texture angular second moment (avg)
tex_asm_ptp	Texture angular second moment (ptp)
tex_con_avg	Texture contrast (avg)
tex_con_ptp	Texture contrast (ptp)
tex_cor_avg	Texture correlation (avg)
tex_cor_ptp	Texture correlation (ptp)
tex_den_avg	Texture difference entropy (avg)
tex_den_ptp	Texture difference entropy (ptp)
tex_ent_avg	Texture entropy (avg)
tex_ent_ptp	Texture entropy (ptp)
tex_f12_avg	Texture First measure of correlation (avg)
tex_f12_ptp	Texture First measure of correlation (ptp)
tex_f13_avg	Texture Second measure of correlation (avg)
tex_f13_ptp	Texture Second measure of correlation (ptp)
tex_idm_avg	Texture inverse difference moment (avg)
tex_idm_ptp	Texture inverse difference moment (ptp)
tex_sen_avg	Texture sum entropy (avg)
tex_sen_ptp	Texture sum entropy (ptp)
tex_sva_avg	Texture sum variance (avg)
tex_sva_ptp	Texture sum variance (ptp)
tex_var_avg	Texture variance (avg)
tex_var_ptp	Texture variance (ptp)
tilt	Absolute tilt of raw contour
time	Time [s]
userdef0	User-defined 0
userdef1	User-defined 1
userdef2	User-defined 2
userdef3	User-defined 3
userdef4	User-defined 4
userdef5	User-defined 5
userdef6	User-defined 6
userdef7	User-defined 7
userdef8	User-defined 8
userdef9	User-defined 9
volume	Volume [µm³]

In addition to these scalar features, it is possible to define a large number of features dedicated to machine learning, the “ml_score_???” features. Each “?” can be a digit or a lower-case letter of the alphabet, e.g. “ml_score_rbc” or “ml_score_3a3”. If “ml_score_???” features are defined, then the ancillary “ml_class” feature, which identifies the most probable class for each event, becomes available.
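The naming rule and the relationship between the scores and “ml_class” can be sketched in plain Python. This is a minimal illustration, not dclab's implementation: the helper names `is_ml_score_feature` and `most_probable_class` are hypothetical, the score values are made up, and the convention that the class index refers to the alphabetically sorted score names is an assumption of this sketch.

```python
import re

# Three characters, each a digit or a lower-case letter (as described above).
ML_SCORE_PATTERN = re.compile(r"ml_score_[a-z0-9]{3}")


def is_ml_score_feature(name):
    """Return True if `name` follows the "ml_score_???" naming rule."""
    return ML_SCORE_PATTERN.fullmatch(name) is not None


def most_probable_class(scores):
    """Return, per event, the index of the highest-scoring class.

    `scores` maps feature names to per-event score lists. Classes are
    indexed by the sorted feature names (an assumption of this sketch).
    """
    names = sorted(scores)
    n_events = len(scores[names[0]])
    classes = []
    for ii in range(n_events):
        event_scores = [scores[name][ii] for name in names]
        classes.append(event_scores.index(max(event_scores)))
    return classes


# Made-up scores for three events and two classes
scores = {
    "ml_score_rbc": [0.9, 0.2, 0.4],
    "ml_score_3a3": [0.1, 0.8, 0.6],
}
print(most_probable_class(scores))
```

Since “ml_score_3a3” sorts before “ml_score_rbc”, class 0 corresponds to “3a3” and class 1 to “rbc” in this toy example.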

Non-scalar features

non-scalar features	description [units]
contour	Event contour
image	Gray scale event image
image_bg	Gray scale event background image
mask	Binary mask labeling the event in the image
trace	Dictionary of fluorescence traces
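The relationship between events and features can be pictured as a table in which every feature is a column and every event is a row. The following toy sketch uses plain Python containers and made-up values; the helper names `get_event` and `is_scalar` are hypothetical and not part of dclab.

```python
# Toy data: each key is a feature, each list position is one event.
features = {
    "index": [1, 2, 3],
    "area_um": [45.2, 51.0, 38.7],
    "deform": [0.012, 0.034, 0.021],
    # non-scalar feature: one small 2x2 "image" per event
    "image": [[[0, 1], [1, 0]], [[1, 1], [0, 0]], [[0, 0], [1, 1]]],
}


def get_event(features, ii):
    """Collect the values of all features for the event at position `ii`."""
    return {name: values[ii] for name, values in features.items()}


def is_scalar(values):
    """A feature is scalar if it stores a single number per event."""
    return all(isinstance(v, (int, float)) for v in values)


print(get_event(features, 1)["deform"])
print([name for name, values in features.items() if is_scalar(values)])
```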

Examples

deformation vs. area plot

import matplotlib.pyplot as plt
import dclab

ds = dclab.new_dataset("data/example.rtdc")
ax = plt.subplot(111)
# one semi-transparent dot per event
ax.plot(ds["area_um"], ds["deform"], "o", alpha=.2)
# axis labels (including units) come from the feature definitions
ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
ax.set_ylabel(dclab.dfn.get_feature_label("deform"))
plt.show()


event image plot

import matplotlib.pyplot as plt
import dclab

ds = dclab.new_dataset("data/example_video.rtdc")
ax1 = plt.subplot(211, title="image")
ax2 = plt.subplot(212, title="mask")
# show image and mask of the event at position 6
ax1.imshow(ds["image"][6], cmap="gray")
ax2.imshow(ds["mask"][6])
plt.show()


Ancillary features

Not all features available in dclab are recorded online during the acquisition of the experimental dataset. Some features, such as “volume”, “emodulus”, or scores from imported machine-learning models (“ml_score_???”), are computed offline by dclab. These ancillary features are computed on the fly and are accessible through the same interface as the recorded features.
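The idea of computing a feature on first access can be illustrated with a small toy class. This is only a sketch of the general pattern (lazy computation with a cache), not dclab's implementation; the class `ToyDataset` and its attributes are hypothetical, and the numbers are made up. It uses “area_ratio”, which the feature table describes as the ratio of convex to measured area.

```python
class ToyDataset:
    """Toy container that computes missing features on demand."""

    def __init__(self, recorded):
        self._recorded = dict(recorded)  # features stored "in the file"
        self._cache = {}                 # ancillary features computed so far

    def __getitem__(self, name):
        if name in self._recorded:
            return self._recorded[name]
        if name not in self._cache:
            # compute once, then reuse the cached result
            self._cache[name] = self._compute(name)
        return self._cache[name]

    def _compute(self, name):
        if name == "area_ratio":
            # convex to measured area ratio
            cvx = self["area_cvx"]
            msd = self["area_msd"]
            return [c / m for c, m in zip(cvx, msd)]
        raise KeyError(f"Feature not available: {name}")


ds = ToyDataset({"area_cvx": [110.0, 90.0], "area_msd": [100.0, 90.0]})
# "area_ratio" was never stored, but can be accessed like any other feature
print(ds["area_ratio"])
```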

Filters

A filter can be used to gate events using features. There are min/max filters and 2D polygon filters. The following table defines the main filtering parameters:

filtering	parsed	description [units]
enable filters	`{f}`	Enable filtering
hierarchy parent	`str`	Hierarchy parent of the dataset
limit events	`{f}`	Upper limit for number of filtered events
polygon filters	`{f}`	Polygon filter indices
remove invalid events	`{f}`	Remove events with inf/nan values

Min/max filters are also defined in the filters section:

filtering	explanation
area_um min	Exclude events with area [µm²] below this value
area_um max	Exclude events with area [µm²] above this value
aspect max	Exclude events with an aspect ratio above this value
…	…
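How min/max entries translate into a gate can be sketched without dclab. The function below is a hypothetical helper that mimics the semantics described in the table (events below “min” or above “max” are excluded); it is not the code dclab uses internally, and the data are made up.

```python
def minmax_mask(features, filtering):
    """Return one boolean per event: True means the event passes."""
    n_events = len(next(iter(features.values())))
    mask = [True] * n_events
    for key, limit in filtering.items():
        # split e.g. "area_um min" into the feature name and "min"/"max"
        name, _, kind = key.rpartition(" ")
        if kind not in ("min", "max") or name not in features:
            continue  # not a min/max entry for a known feature
        for ii, value in enumerate(features[name]):
            if kind == "min" and value < limit:
                mask[ii] = False
            elif kind == "max" and value > limit:
                mask[ii] = False
    return mask


features = {
    "area_um": [30.0, 55.0, 80.0, 120.0],
    "deform": [0.01, 0.05, 0.20, 0.03],
}
filtering = {"area_um min": 40, "area_um max": 100, "deform max": 0.1}
print(minmax_mask(features, filtering))
```

Only the second event passes here: the first and last are outside the area range, and the third exceeds the deformation limit.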

Examples

excluding events with large deformation

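import matplotlib.pyplot as plt
import dclab

ds = dclab.new_dataset("data/example.rtdc")

# keep only events with a deformation between 0 and 0.1
ds.config["filtering"]["deform min"] = 0
ds.config["filtering"]["deform max"] = .1
ds.apply_filter()
# boolean array that is True for all events that pass the filters
dif = ds.filter.all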

f, axes = plt.subplots(1, 2, sharex=True, sharey=True)
axes[0].plot(ds["area_um"], ds["bright_avg"], "o", alpha=.2)
axes[0].set_title("unfiltered")
axes[1].plot(ds["area_um"][dif], ds["bright_avg"][dif], "o", alpha=.2)
axes[1].set_title("Deformation <= 0.1")

for ax in axes:
    ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
    ax.set_ylabel(dclab.dfn.get_feature_label("bright_avg"))

plt.tight_layout()
plt.show()


excluding random events

This is useful if you need to have a (sub-)dataset of a specified size. The downsampling is reproducible (the same points are excluded).

import matplotlib.pyplot as plt
import dclab

ds = dclab.new_dataset("data/example.rtdc")
# keep at most 4000 events
ds.config["filtering"]["limit events"] = 4000
ds.apply_filter()
# boolean array that is True for the events that were kept
fid = ds.filter.all

ax = plt.subplot(111)
ax.plot(ds["area_um"][fid], ds["deform"][fid], "o", alpha=.2)
ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
ax.set_ylabel(dclab.dfn.get_feature_label("deform"))
plt.show()
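Reproducible downsampling in general can be achieved by drawing the random selection from a generator with a fixed seed. The sketch below shows this generic technique with the standard library only; it is not claimed to be the algorithm dclab uses, and the function name `downsample_indices` and the seed value are arbitrary choices for illustration.

```python
import random


def downsample_indices(n_events, limit, seed=47):
    """Pick at most `limit` event indices; same seed, same selection."""
    if limit >= n_events:
        return list(range(n_events))
    rng = random.Random(seed)  # private generator, global state untouched
    return sorted(rng.sample(range(n_events), limit))


first = downsample_indices(10000, 4000)
second = downsample_indices(10000, 4000)
# both calls select exactly the same 4000 events
print(len(first), first == second)
```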


Experiment metadata

Every RT-DC measurement has metadata consisting of key-value pairs. The following are supported:

experiment	parsed	description [units]
date	`str`	Date of measurement (‘YYYY-MM-DD’)
event count	`{f}`	Number of recorded events
run index	`{f}`	Index of measurement run
sample	`str`	Measured sample or user-defined reference
time	`str`	Start time of measurement (‘HH:MM:SS[.S]’)

fluorescence	parsed	description [units]
baseline 1 offset	`{f}`	Baseline offset channel 1
baseline 2 offset	`{f}`	Baseline offset channel 2
baseline 3 offset	`{f}`	Baseline offset channel 3
bit depth	`{f}`	Trace bit depth
channel 1 name	`str`	FL1 description
channel 2 name	`str`	FL2 description
channel 3 name	`str`	FL3 description
channel count	`{f}`	Number of active channels
channels installed	`{f}`	Number of available channels
laser 1 lambda	`float`	Laser 1 wavelength [nm]
laser 1 power	`float`	Laser 1 output power [%]
laser 2 lambda	`float`	Laser 2 wavelength [nm]
laser 2 power	`float`	Laser 2 output power [%]
laser 3 lambda	`float`	Laser 3 wavelength [nm]
laser 3 power	`float`	Laser 3 output power [%]
laser count	`{f}`	Number of active lasers
lasers installed	`{f}`	Number of available lasers
sample rate	`{f}`	Trace sample rate [Hz]
samples per event	`{f}`	Samples per event
signal max	`float`	Upper voltage detection limit [V]
signal min	`float`	Lower voltage detection limit [V]
trace median	`{f}`	Rolling median filter size for traces

fmt_tdms	parsed	description [units]
video frame offset	`{f}`	Missing events at beginning of video

imaging	parsed	description [units]
flash device	`str`	Light source device type
flash duration	`float`	Light source flash duration [µs]
frame rate	`float`	Imaging frame rate [Hz]
pixel size	`float`	Pixel size [µm]
roi position x	`{f}`	Image x coordinate on sensor [px]
roi position y	`{f}`	Image y coordinate on sensor [px]
roi size x	`{f}`	Image width [px]
roi size y	`{f}`	Image height [px]

online_contour	parsed	description [units]
bg empty	`{f}`	Background correction from empty frames only
bin area min	`{f}`	Minimum pixel area of binary image event
bin kernel	`{f}`	Disk size for binary closing of mask image
bin threshold	`{f}`	Threshold for mask from bg-corrected image
image blur	`{f}`	Odd sigma for Gaussian blur (21x21 kernel)
no absdiff	`{f}`	Do not use OpenCV ‘absdiff’ for bg-correction

online_filter	parsed	description [units]
target duration	`float`	Target measurement duration [min]
target event count	`{f}`	Target event count for online gating

setup	parsed	description [units]
channel width	`float`	Width of microfluidic channel [µm]
chip identifier	`{f}`	Unique identifier of the chip used
chip region	`{f}`	Imaged chip region (channel or reservoir)
flow rate	`float`	Flow rate in channel [µL/s]
flow rate sample	`float`	Sample flow rate [µL/s]
flow rate sheath	`float`	Sheath flow rate [µL/s]
identifier	`str`	Unique setup identifier
medium	`str`	Medium used
module composition	`str`	Comma-separated list of modules used
software version	`str`	Acquisition software with version
temperature	`float`	Mean chip temperature [°C]

Example: date and time of a measurement

In [1]: import dclab

In [2]: ds = dclab.new_dataset("data/example.rtdc")

In [3]: ds.config["experiment"]["date"], ds.config["experiment"]["time"]
Out[3]: ('2017-07-16', '19:01:36')
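Since date and time are stored as strings, it can be convenient to combine them into a single `datetime` object. The following sketch relies only on the formats listed in the table above (‘YYYY-MM-DD’ and ‘HH:MM:SS[.S]’); the helper name `measurement_start` is hypothetical.

```python
from datetime import datetime


def measurement_start(date, time):
    """Combine 'YYYY-MM-DD' and 'HH:MM:SS[.S]' into a datetime object."""
    # The fractional part of the seconds is optional.
    time_format = "%H:%M:%S.%f" if "." in time else "%H:%M:%S"
    return datetime.strptime(f"{date} {time}", f"%Y-%m-%d {time_format}")


# values taken from the example session above
print(measurement_start("2017-07-16", "19:01:36"))
```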

Analysis metadata

In addition to inherent (defined during data acquisition) metadata, dclab also supports additional metadata that are relevant for certain data analysis pipelines, such as Young’s modulus computation or fluorescence crosstalk correction.

calculation	parsed	description [units]
crosstalk fl12	`float`	Fluorescence crosstalk, channel 1 to 2
crosstalk fl13	`float`	Fluorescence crosstalk, channel 1 to 3
crosstalk fl21	`float`	Fluorescence crosstalk, channel 2 to 1
crosstalk fl23	`float`	Fluorescence crosstalk, channel 2 to 3
crosstalk fl31	`float`	Fluorescence crosstalk, channel 3 to 1
crosstalk fl32	`float`	Fluorescence crosstalk, channel 3 to 2
emodulus lut	`str`	Look-up table identifier
emodulus medium	`str`	Medium used (e.g. ‘0.49% MC-PBS’)
emodulus temperature	`float`	Chip temperature [°C]
emodulus viscosity	`float`	Viscosity [Pa*s] if ‘medium’ unknown
emodulus viscosity model	`str`	Viscosity model for known media
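To illustrate what a crosstalk value expresses, consider a simplified two-channel case. The sketch below assumes a linear spill-over model in which “crosstalk fl21” is the fraction of the channel 2 signal that appears in channel 1, and vice versa for “crosstalk fl12”. This model, the function name `correct_crosstalk_2ch`, and the numbers are assumptions made for illustration; the text above does not specify how dclab performs the correction.

```python
def correct_crosstalk_2ch(fl1, fl2, ct12, ct21):
    """Undo linear crosstalk between two channels.

    Assumed model (for illustration only):
        fl1 = true1 + ct21 * true2   # channel 2 spills into channel 1
        fl2 = true2 + ct12 * true1   # channel 1 spills into channel 2
    """
    det = 1 - ct12 * ct21
    if det == 0:
        raise ValueError("Crosstalk values make the model non-invertible")
    true1 = (fl1 - ct21 * fl2) / det
    true2 = (fl2 - ct12 * fl1) / det
    return true1, true2


# Made-up signals: true values 100 and 50, ct12 = 0.2, ct21 = 0.1
measured1 = 100 + 0.1 * 50   # 105
measured2 = 50 + 0.2 * 100   # 70
corrected = correct_crosstalk_2ch(measured1, measured2, ct12=0.2, ct21=0.1)
print(corrected)
```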

User-defined metadata

In addition to the registered metadata keys listed above, you may also define custom metadata in the “user” section. This section will be saved alongside the other metadata when a dataset is exported as an .rtdc (HDF5) file.

Note

It is recommended to use the following data types for the value of each key: str, bool, float and int. Other data types may not render nicely in ShapeOut2 or DCOR.

To edit the “user” section in dclab, simply modify the config property of a loaded dataset. The changes made are not written to the underlying file.

Example: Setting custom “user” metadata in dclab

In [4]: import dclab

In [5]: ds = dclab.new_dataset("data/example.rtdc")

In [6]: my_metadata = {"inlet": True, "n_channels": 4}

In [7]: ds.config["user"] = my_metadata

In [8]: other_metadata = {"outlet": False, "RBC": True}

# we can also add metadata with the `update` method
In [9]: ds.config["user"].update(other_metadata)

# or
In [10]: ds.config.update({"user": other_metadata})

In [11]: print(ds.config["user"])
{'inlet': True, 'n_channels': 4, 'outlet': False, 'RBC': True}

# we can clear the "user" section like so:
In [12]: ds.config["user"].clear()

If you are implementing a custom data acquisition pipeline, you may alternatively add user-defined metadata permanently to an .rtdc file in a post-measurement step, as follows.

Example: Setting custom “user” metadata permanently

import h5py

# open in append mode ("a") so that the attributes can be written
with h5py.File("/path/to/your/dataset.rtdc", "a") as h5:
    h5.attrs["user:inlet"] = True
    h5.attrs["user:n_channels"] = 4
    h5.attrs["user:outlet"] = False
    h5.attrs["user:RBC"] = True
    h5.attrs["user:project"] = "strangelove"
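When many keys have to be written, the attribute names can be generated from a dictionary. The helper below is a hypothetical convenience function, not part of dclab or h5py: it prepends the “user:” prefix used in the example above and rejects values whose type is not among the recommended ones (str, bool, float and int).

```python
RECOMMENDED_TYPES = (str, bool, float, int)


def user_attrs(metadata):
    """Map {"key": value} to {"user:key": value} for use as HDF5 attributes."""
    attrs = {}
    for key, value in metadata.items():
        if not isinstance(value, RECOMMENDED_TYPES):
            raise TypeError(
                f"Unsupported type for '{key}': {type(value).__name__}")
        attrs[f"user:{key}"] = value
    return attrs


print(user_attrs({"inlet": True, "n_channels": 4}))
```

Each item of the returned dictionary can then be assigned inside the `with` block, e.g. `h5.attrs[name] = value` in a loop over `user_attrs(...).items()`.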

User-defined metadata can also be used with user-defined plugin features. This allows you to design plugin features which utilize your pipeline-specific metadata.