Notation
When coding with dclab, you should be aware of the following definitions and design principles.
Events
An event comprises all data recorded for the detection of one object (e.g. cell or bead) in an RT-DC measurement.
Features
A feature is a measurement parameter of an RT-DC measurement. For instance, the feature “index” enumerates all recorded events, and the feature “deform” contains the deformation values of all events. There are scalar features, i.e. features that assign a single number to each event, and non-scalar features, such as “image” and “contour”. All features in a dataset are exposed as read-only to the user. The following features are supported by dclab:
Scalar features
scalar features | description [units] |
---|---|
area_cvx | Convex area [px] |
area_msd | Measured area [px] |
area_ratio | Porosity (convex to measured area ratio) |
area_um | Area [µm²] |
area_um_raw | Area [µm²] of raw contour |
aspect | Aspect ratio of bounding box |
basinmap0 | Basin mapping 0 |
basinmap1 | Basin mapping 1 |
basinmap2 | Basin mapping 2 |
basinmap3 | Basin mapping 3 |
basinmap4 | Basin mapping 4 |
basinmap5 | Basin mapping 5 |
basinmap6 | Basin mapping 6 |
basinmap7 | Basin mapping 7 |
basinmap8 | Basin mapping 8 |
basinmap9 | Basin mapping 9 |
bg_med | Median frame background brightness [a.u.] |
bg_off | Background offset [a.u.] |
bright_avg | Brightness average [a.u.] |
bright_bc_avg | Brightness average (bgc) [a.u.] |
bright_bc_sd | Brightness SD (bgc) [a.u.] |
bright_perc_10 | 10th Percentile of brightness (bgc) |
bright_perc_90 | 90th Percentile of brightness (bgc) |
bright_sd | Brightness SD [a.u.] |
circ | Circularity |
deform | Deformation |
deform_raw | Deformation of raw contour |
eccentr_prnc | Eccentricity of raw contour |
emodulus | Young’s modulus [kPa] |
fl1_area | FL-1 area of peak [a.u.] |
fl1_dist | FL-1 distance between two first peaks [µs] |
fl1_max | FL-1 maximum [a.u.] |
fl1_max_ctc | FL-1 maximum, crosstalk-corrected [a.u.] |
fl1_npeaks | FL-1 number of peaks |
fl1_pos | FL-1 position of peak [µs] |
fl1_width | FL-1 width [µs] |
fl2_area | FL-2 area of peak [a.u.] |
fl2_dist | FL-2 distance between two first peaks [µs] |
fl2_max | FL-2 maximum [a.u.] |
fl2_max_ctc | FL-2 maximum, crosstalk-corrected [a.u.] |
fl2_npeaks | FL-2 number of peaks |
fl2_pos | FL-2 position of peak [µs] |
fl2_width | FL-2 width [µs] |
fl3_area | FL-3 area of peak [a.u.] |
fl3_dist | FL-3 distance between two first peaks [µs] |
fl3_max | FL-3 maximum [a.u.] |
fl3_max_ctc | FL-3 maximum, crosstalk-corrected [a.u.] |
fl3_npeaks | FL-3 number of peaks |
fl3_pos | FL-3 position of peak [µs] |
fl3_width | FL-3 width [µs] |
flow_rate | Flow rate [µLs⁻¹] |
frame | Video frame number |
g_force | Gravitational force in multiples of g |
index | Index (Dataset) |
index_online | Index (Online) |
inert_ratio_cvx | Inertia ratio of convex contour |
inert_ratio_prnc | Principal inertia ratio of raw contour |
inert_ratio_raw | Inertia ratio of raw contour |
ml_class | Most probable ML class |
nevents | Number of events in the same image |
pc1 | Principal component 1 |
pc2 | Principal component 2 |
per_ratio | Inverse Convexity (raw to convex perimeter ratio) |
per_um_raw | Perimeter [µm] of raw contour |
pos_x | Position along channel axis [µm] |
pos_y | Position lateral in channel [µm] |
pressure | Pressure [mPa] |
qpi_dm_avg | Dry mass (average) [pg] |
qpi_dm_sd | Dry mass (SD) [pg] |
qpi_focus | Computed focus distance [µm] |
qpi_pha_int | Integrated phase [rad] |
qpi_ri_avg | Refractive index (average) |
qpi_ri_sd | Refractive index (SD) |
size_x | Bounding box size x [µm] |
size_y | Bounding box size y [µm] |
sym_x | Symmetry ratio left-right |
sym_y | Symmetry ratio top-bottom |
temp | Chip temperature [°C] |
temp_amb | Ambient temperature [°C] |
tex_asm_avg | Texture angular second moment (avg) |
tex_asm_ptp | Texture angular second moment (ptp) |
tex_con_avg | Texture contrast (avg) |
tex_con_ptp | Texture contrast (ptp) |
tex_cor_avg | Texture correlation (avg) |
tex_cor_ptp | Texture correlation (ptp) |
tex_den_avg | Texture difference entropy (avg) |
tex_den_ptp | Texture difference entropy (ptp) |
tex_ent_avg | Texture entropy (avg) |
tex_ent_ptp | Texture entropy (ptp) |
tex_f12_avg | Texture First measure of correlation (avg) |
tex_f12_ptp | Texture First measure of correlation (ptp) |
tex_f13_avg | Texture Second measure of correlation (avg) |
tex_f13_ptp | Texture Second measure of correlation (ptp) |
tex_idm_avg | Texture inverse difference moment (avg) |
tex_idm_ptp | Texture inverse difference moment (ptp) |
tex_sen_avg | Texture sum entropy (avg) |
tex_sen_ptp | Texture sum entropy (ptp) |
tex_sva_avg | Texture sum variance (avg) |
tex_sva_ptp | Texture sum variance (ptp) |
tex_var_avg | Texture variance (avg) |
tex_var_ptp | Texture variance (ptp) |
tilt | Absolute tilt of raw contour |
time | Time [s] |
userdef0 | User-defined 0 |
userdef1 | User-defined 1 |
userdef2 | User-defined 2 |
userdef3 | User-defined 3 |
userdef4 | User-defined 4 |
userdef5 | User-defined 5 |
userdef6 | User-defined 6 |
userdef7 | User-defined 7 |
userdef8 | User-defined 8 |
userdef9 | User-defined 9 |
volume | Volume [µm³] |
In addition to these scalar features, it is possible to define a large number of features dedicated to machine learning, the “ml_score_???” features, which are probability scores with values between 0 and 1. Each “?” can be a digit or a lower-case letter of the alphabet, e.g. “ml_score_rbc” or “ml_score_3a3”. If “ml_score_???” features are defined, then the ancillary “ml_class” feature, which identifies the most probable class for each event, becomes available.
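The naming rule and the idea behind “ml_class” can be illustrated with a short sketch in plain Python. The helpers `is_ml_score_name` and `most_probable_class` are hypothetical and not part of dclab; in particular, dclab may encode the resulting class differently (for instance as an integer), so treat this only as an illustration of the concept.

```python
import re

# "ml_score_" followed by exactly three characters, each of which is a
# digit or a lower-case letter (the rule described in the text above).
ML_SCORE_PATTERN = re.compile(r"ml_score_[a-z0-9]{3}")


def is_ml_score_name(name):
    """Return True if `name` follows the "ml_score_???" naming rule."""
    return ML_SCORE_PATTERN.fullmatch(name) is not None


def most_probable_class(scores):
    """Return the ml_score feature with the highest score for one event.

    `scores` maps feature names to the probability scores of a single
    event. Hypothetical helper, not the dclab implementation.
    """
    valid = {k: v for k, v in scores.items() if is_ml_score_name(k)}
    return max(valid, key=valid.get)


event_scores = {"ml_score_rbc": 0.85, "ml_score_3a3": 0.10, "ml_score_wbc": 0.05}
print(is_ml_score_name("ml_score_rbc"))   # True
print(is_ml_score_name("ml_score_RBC"))   # False (upper case is not allowed)
print(most_probable_class(event_scores))  # ml_score_rbc
```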
Non-scalar features
non-scalar features | description [units] |
---|---|
contour | Event contour |
image | Gray scale event image |
image_bg | Gray scale event background image |
mask | Binary mask labeling the event in the image |
qpi_amp | Hologram amplitude image |
qpi_oah | Off-axis hologram |
qpi_oah_bg | Off-axis hologram background |
qpi_pha | Hologram phase image [rad] |
trace | Dictionary of fluorescence traces |
Examples
deformation vs. area plot
    import matplotlib.pylab as plt
    import dclab

    ds = dclab.new_dataset("data/example.rtdc")

    ax = plt.subplot(111)
    ax.plot(ds["area_um"], ds["deform"], "o", alpha=.2)
    ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
    ax.set_ylabel(dclab.dfn.get_feature_label("deform"))
    plt.show()
event image plot
    import matplotlib.pylab as plt
    import dclab

    ds = dclab.new_dataset("data/example_video.rtdc")

    ax1 = plt.subplot(211, title="image")
    ax2 = plt.subplot(212, title="mask")
    ax1.imshow(ds["image"][6], cmap="gray")
    ax2.imshow(ds["mask"][6])
    plt.show()
Ancillary features
Not all features available in dclab are recorded online during the acquisition of the experimental dataset. Some of the features are computed offline by dclab, such as “volume”, “emodulus”, or scores from imported machine learning models (“ml_score_xxx”). These ancillary features are computed on-the-fly and are made available seamlessly through the same interface.
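The “computed on-the-fly, same interface” idea can be sketched with a small dictionary-like class that computes a feature on first access and caches the result. This is a conceptual illustration only, not how dclab implements ancillary features; `LazyFeatures` and `area_ratio` are hypothetical names, and “area_ratio” is used merely because its definition (convex to measured area ratio) is given in the feature table.

```python
class LazyFeatures:
    """Dictionary-like container that computes missing features on demand."""

    def __init__(self, recorded, recipes):
        self._recorded = dict(recorded)  # features stored during acquisition
        self._recipes = dict(recipes)    # name -> function(features) -> values
        self._cache = {}                 # features computed so far

    def __contains__(self, name):
        return name in self._recorded or name in self._recipes

    def __getitem__(self, name):
        if name in self._recorded:
            return self._recorded[name]
        if name not in self._cache:
            # Raises KeyError if no recipe exists for this feature
            self._cache[name] = self._recipes[name](self)
        return self._cache[name]


def area_ratio(feats):
    # Porosity: convex to measured area ratio
    return [c / m for c, m in zip(feats["area_cvx"], feats["area_msd"])]


feats = LazyFeatures(
    recorded={"area_cvx": [110.0, 210.0], "area_msd": [100.0, 200.0]},
    recipes={"area_ratio": area_ratio},
)
print("area_ratio" in feats)  # True
print(feats["area_ratio"])    # [1.1, 1.05]
```

The caller accesses recorded and computed features in exactly the same way; the computation only happens when a feature is first requested.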
Filters
A filter can be used to gate events using features. There are min/max filters and 2D polygon filters. The following table defines the main filtering parameters:
filtering, parsed, description [units] |
||
---|---|---|
enable filters |
Enable filtering |
|
hierarchy parent |
Hierarchy parent of the dataset |
|
limit events |
Upper limit for number of filtered events |
|
polygon filters |
Polygon filter indices |
|
remove invalid events |
Remove events with inf/nan values |
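To illustrate what a 2D polygon filter does conceptually, the following sketch gates events with the classic ray-casting point-in-polygon test. It is a generic textbook algorithm, not necessarily the one dclab uses; the function name, the gate coordinates, and the event values are made up for this example.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is the point (x, y) inside `polygon`?

    `polygon` is a list of (x, y) vertices; the last vertex is
    implicitly connected to the first one.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangular gate in the (area_um, deform) plane
gate = [(20.0, 0.00), (120.0, 0.00), (120.0, 0.05), (20.0, 0.05)]
events = [(50.0, 0.02), (150.0, 0.02), (50.0, 0.08)]
mask = [point_in_polygon(a, d, gate) for a, d in events]
print(mask)  # [True, False, False]
```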
Min/max filters are also defined in the filters section:
filtering | explanation |
---|---|
area_um min | Exclude events with area [µm²] below this value |
area_um max | Exclude events with area [µm²] above this value |
aspect max | Exclude events with an aspect ratio above this value |
… | … |
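The semantics of a min/max filter can be written down as a boolean mask over the events. The sketch below uses plain Python lists; `minmax_mask` is a hypothetical helper (not a dclab function), and its `remove_invalid` argument only mimics the “remove invalid events” option from the table above.

```python
import math


def minmax_mask(values, vmin=None, vmax=None, remove_invalid=True):
    """Return a list of booleans that is True for events passing the filter."""
    mask = []
    for v in values:
        keep = True
        if remove_invalid and not math.isfinite(v):
            keep = False  # drop inf/nan values
        if vmin is not None and v < vmin:
            keep = False  # below the lower limit
        if vmax is not None and v > vmax:
            keep = False  # above the upper limit
        mask.append(keep)
    return mask


deform = [0.01, 0.05, 0.1, 0.25, float("nan")]
mask = minmax_mask(deform, vmin=0, vmax=0.1)
print(mask)  # [True, True, True, False, False]
```

Values exactly at a limit are kept in this sketch, since the table describes excluding only values below the minimum or above the maximum.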
Examples
excluding events with large deformation
    import matplotlib.pylab as plt
    import dclab

    ds = dclab.new_dataset("data/example.rtdc")

    ds.config["filtering"]["deform min"] = 0
    ds.config["filtering"]["deform max"] = .1
    ds.apply_filter()
    dif = ds.filter.all

    f, axes = plt.subplots(1, 2, sharex=True, sharey=True)
    axes[0].plot(ds["area_um"], ds["bright_avg"], "o", alpha=.2)
    axes[0].set_title("unfiltered")
    axes[1].plot(ds["area_um"][dif], ds["bright_avg"][dif], "o", alpha=.2)
    axes[1].set_title("Deformation <= 0.1")

    for ax in axes:
        ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
        ax.set_ylabel(dclab.dfn.get_feature_label("bright_avg"))

    plt.tight_layout()
    plt.show()
excluding random events
This is useful if you need to have a (sub-)dataset of a specified size. The downsampling is reproducible (the same points are excluded).
    import matplotlib.pylab as plt
    import dclab

    ds = dclab.new_dataset("data/example.rtdc")

    ds.config["filtering"]["limit events"] = 4000
    ds.apply_filter()
    fid = ds.filter.all

    ax = plt.subplot(111)
    ax.plot(ds["area_um"][fid], ds["deform"][fid], "o", alpha=.2)
    ax.set_xlabel(dclab.dfn.get_feature_label("area_um"))
    ax.set_ylabel(dclab.dfn.get_feature_label("deform"))
    plt.show()
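Why such a downsampling can be reproducible is easy to see in a standalone sketch: if the random selection is driven by a fixed seed, the same events are chosen every time. The function below is a hypothetical illustration using the standard library; the seed value is arbitrary, and the algorithm dclab actually uses may differ.

```python
import random


def limit_events(n_events, limit, seed=47):
    """Return a boolean mask keeping at most `limit` of `n_events` events.

    A fixed seed makes the selection reproducible.
    """
    if limit >= n_events:
        return [True] * n_events
    rng = random.Random(seed)
    keep = set(rng.sample(range(n_events), limit))
    return [i in keep for i in range(n_events)]


mask_a = limit_events(10000, 4000)
mask_b = limit_events(10000, 4000)
print(sum(mask_a))       # 4000
print(mask_a == mask_b)  # True
```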
Experiment metadata
Every RT-DC measurement has metadata consisting of key-value-pairs. The following are supported:
experiment, parsed, description [units] |
||
---|---|---|
date |
Date of measurement (‘YYYY-MM-DD’) |
|
event count |
Number of recorded events |
|
run identifier |
Unique measurement identifier |
|
run index |
Index of measurement run |
|
sample |
Measured sample or user-defined reference |
|
time |
Start time of measurement (‘HH:MM:SS[.S]’) |
|
timestamp |
Start of measurement in unix time [s] |
fluorescence, parsed, description [units] |
||
---|---|---|
baseline 1 offset |
Baseline offset channel 1 |
|
baseline 2 offset |
Baseline offset channel 2 |
|
baseline 3 offset |
Baseline offset channel 3 |
|
bit depth |
Trace bit depth |
|
channel 1 name |
FL1 description |
|
channel 2 name |
FL2 description |
|
channel 3 name |
FL3 description |
|
channel count |
Number of active channels |
|
channels installed |
Number of available channels |
|
laser 1 lambda |
Laser 1 wavelength [nm] |
|
laser 1 power |
Laser 1 output power [%] |
|
laser 2 lambda |
Laser 2 wavelength [nm] |
|
laser 2 power |
Laser 2 output power [%] |
|
laser 3 lambda |
Laser 3 wavelength [nm] |
|
laser 3 power |
Laser 3 output power [%] |
|
laser count |
Number of active lasers |
|
lasers installed |
Number of available lasers |
|
sample rate |
Trace sample rate [Hz] |
|
samples per event |
Samples per event |
|
signal max |
Upper voltage detection limit [V] |
|
signal min |
Lower voltage detection limit [V] |
|
trace median |
Rolling median filter size for traces |
fmt_tdms, parsed, description [units] |
||
---|---|---|
video frame offset |
Missing events at beginning of video |
imaging, parsed, description [units] |
||
---|---|---|
flash device |
Light source device type |
|
flash duration |
Light source flash duration [µs] |
|
frame rate |
Imaging frame rate [Hz] |
|
pixel size |
Pixel size [µm] |
|
roi position x |
Image x coordinate on sensor [px] |
|
roi position y |
Image y coordinate on sensor [px] |
|
roi size x |
Image width [px] |
|
roi size y |
Image height [px] |
online_contour, parsed, description [units] |
||
---|---|---|
bg empty |
Background correction from empty frames only |
|
bin area min |
Minimum pixel area of binary image event |
|
bin kernel |
Disk size for binary closing of mask image |
|
bin threshold |
Threshold for mask from bg-corrected image |
|
image blur |
Odd sigma for Gaussian blur (21x21 kernel) |
|
no absdiff |
Do not use OpenCV ‘absdiff’ for bg-correction |
online_filter, parsed, description [units] |
||
---|---|---|
target duration |
Target measurement duration [min] |
|
target event count |
Target event count for online gating |
pipeline, parsed, description [units] |
||
---|---|---|
dcnum background |
Background ID |
|
dcnum data |
Data ID |
|
dcnum feature |
Feature extractor ID |
|
dcnum gate |
Gating ID |
|
dcnum generation |
Generation ID |
|
dcnum hash |
Hash |
|
dcnum mapping |
Event mapping from original dataset |
|
dcnum segmenter |
Segmenter ID |
|
dcnum yield |
Event yield |
qpi, parsed, description [units] |
||
---|---|---|
amp border loc |
Border location specifier for amplitude |
|
amp border px |
Width of border for amplitude [pix] |
|
amp fit offset |
Amplitude offset correction |
|
amp fit profile |
Amplitude profile correction |
|
bg method |
Background computation method |
|
filter name |
Fourier filter used |
|
filter size |
Fourier filter size [1/pix] |
|
focus interval |
Focus interval to search [µm] |
|
focus kernel |
Propagation kernel |
|
focus metric |
Metric used to calculate focus |
|
focus minimizer |
Minimizer used to calculate focus |
|
focus padding |
Level of padding for refocus |
|
invert phase |
Invert the phase data |
|
medium index |
Refractive index of medium |
|
padding |
Level of padding |
|
pha border loc |
Border location specifier for phase |
|
pha border px |
Width of border for phase [pix] |
|
pha fit offset |
Phase offset correction |
|
pha fit profile |
Phase profile correction |
|
pixel size proc |
QPI pixel size [µm]. |
|
pixel size raw |
Hologram pixel size [µm]. |
|
scale to filter |
Scale QPI data to filter size |
|
sideband freq |
Sideband coordinates [1/pix] |
|
software version |
Software version(s) |
|
subtract mean |
Subtract mean before processing |
|
wavelength |
Imaging wavelength [nm] |
setup, parsed, description [units] |
||
---|---|---|
channel width |
Width of microfluidic channel [µm] |
|
chip identifier |
Unique identifier of the chip used |
|
chip region |
Imaged chip region (channel or reservoir) |
|
flow rate |
Flow rate in channel [µL/s] |
|
flow rate sample |
Sample flow rate [µL/s] |
|
flow rate sheath |
Sheath flow rate [µL/s] |
|
identifier |
Unique setup identifier |
|
medium |
Medium used |
|
module composition |
Comma-separated list of modules used |
|
software version |
Acquisition software with version |
|
temperature |
Mean chip temperature [°C] |
Example: date and time of a measurement
    In [1]: import dclab

    In [2]: ds = dclab.new_dataset("data/example.rtdc")

    In [3]: ds.config["experiment"]["date"], ds.config["experiment"]["time"]
    Out[3]: ('2017-07-16', '19:01:36')
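The date and time strings follow the formats listed in the “experiment” table (‘YYYY-MM-DD’ and ‘HH:MM:SS[.S]’), so they can be combined into a `datetime` object with the standard library. The helper name `parse_measurement_start` is made up for this sketch; the input values are the ones shown in the example output above.

```python
from datetime import datetime


def parse_measurement_start(date, time):
    """Combine 'YYYY-MM-DD' and 'HH:MM:SS[.S]' strings into a datetime."""
    # The fractional part of the seconds is optional
    fmt = "%Y-%m-%d %H:%M:%S.%f" if "." in time else "%Y-%m-%d %H:%M:%S"
    return datetime.strptime(f"{date} {time}", fmt)


start = parse_measurement_start("2017-07-16", "19:01:36")
print(start.isoformat())  # 2017-07-16T19:01:36
```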
Analysis metadata
In addition to inherent (defined during data acquisition) metadata, dclab also supports additional metadata that are relevant for certain data analysis pipelines, such as Young’s modulus computation or fluorescence crosstalk correction.
calculation, parsed, description [units] |
||
---|---|---|
crosstalk fl12 |
Fluorescence crosstalk, channel 1 to 2 |
|
crosstalk fl13 |
Fluorescence crosstalk, channel 1 to 3 |
|
crosstalk fl21 |
Fluorescence crosstalk, channel 2 to 1 |
|
crosstalk fl23 |
Fluorescence crosstalk, channel 2 to 3 |
|
crosstalk fl31 |
Fluorescence crosstalk, channel 3 to 1 |
|
crosstalk fl32 |
Fluorescence crosstalk, channel 3 to 2 |
|
emodulus lut |
Look-up table identifier |
|
emodulus medium |
Medium used (e.g. ‘0.49% MC-PBS’) |
|
emodulus temperature |
Chip temperature [°C] |
|
emodulus viscosity |
Viscosity [Pa*s] if ‘medium’ unknown |
|
emodulus viscosity model |
Viscosity model for known media |
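As a rough illustration of what a crosstalk correction involves, the sketch below undoes crosstalk between two channels under a simple linear model. The model itself is an assumption made for this example and is not taken from the text above; dclab's actual computation (which covers three channels) may differ. `correct_crosstalk_2ch` is a hypothetical function, and the numbers are invented.

```python
def correct_crosstalk_2ch(fl1, fl2, ct12, ct21):
    """Undo linear crosstalk between two channels for one event.

    Assumed model (an assumption for this sketch):
        fl1 = true1 + ct21 * true2
        fl2 = true2 + ct12 * true1
    where ct12 is the crosstalk from channel 1 to 2 and ct21 the
    crosstalk from channel 2 to 1.
    """
    det = 1.0 - ct12 * ct21
    if det == 0:
        raise ValueError("crosstalk matrix is singular")
    true1 = (fl1 - ct21 * fl2) / det
    true2 = (fl2 - ct12 * fl1) / det
    return true1, true2


# True signals 100 and 50 with ct12=0.1 and ct21=0.2 would be
# measured as fl1 = 100 + 0.2*50 = 110 and fl2 = 50 + 0.1*100 = 60.
t1, t2 = correct_crosstalk_2ch(110.0, 60.0, ct12=0.1, ct21=0.2)
print(round(t1, 6), round(t2, 6))  # 100.0 50.0
```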
User-defined metadata
In addition to the registered metadata keys listed above, you may also define custom metadata in the “user” section. This section will be saved alongside the other metadata when a dataset is exported as an .rtdc (HDF5) file.
Note
It is recommended to use the following data types for the value of each key: str, bool, float and int. Other data types may not render nicely in ShapeOut2 or DCOR.
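A quick way to follow this recommendation is to check a metadata dictionary before assigning it. The helper `find_unrecommended` below is hypothetical (not part of dclab) and only inspects the Python types of the values.

```python
RECOMMENDED_TYPES = (str, bool, float, int)


def find_unrecommended(user_metadata):
    """Return the keys whose values are not str, bool, float, or int."""
    return [key for key, value in user_metadata.items()
            if not isinstance(value, RECOMMENDED_TYPES)]


metadata = {"inlet": True, "n_channels": 4, "project": "strangelove",
            "positions": [1, 2, 3]}
print(find_unrecommended(metadata))  # ['positions']
```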
To edit the “user” section in dclab, simply modify the config property of a loaded dataset. The changes made are not written to the underlying file.
Example: Setting custom “user” metadata in dclab
    In [4]: import dclab

    In [5]: ds = dclab.new_dataset("data/example.rtdc")

    In [6]: my_metadata = {"inlet": True, "n_channels": 4}

    In [7]: ds.config["user"] = my_metadata

    In [8]: other_metadata = {"outlet": False, "RBC": True}

    # we can also add metadata with the `update` method
    In [9]: ds.config["user"].update(other_metadata)

    # or
    In [10]: ds.config.update({"user": other_metadata})

    In [11]: print(ds.config["user"])
    {'inlet': True, 'n_channels': 4, 'outlet': False, 'RBC': True}

    # we can clear the "user" section like so:
    In [12]: ds.config["user"].clear()
If you are implementing a custom data acquisition pipeline, you may alternatively add user-defined metadata permanently to an .rtdc file in a post-measurement step:
Example: Setting custom “user” metadata permanently
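    import h5py

    # Open the file in append mode ("a") so that attributes can be written;
    # recent h5py versions open files read-only by default.
    with h5py.File("/path/to/your/dataset.rtdc", "a") as h5:
        h5.attrs["user:inlet"] = True
        h5.attrs["user:n_channels"] = 4
        h5.attrs["user:outlet"] = False
        h5.attrs["user:RBC"] = True
        h5.attrs["user:project"] = "strangelove"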
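The attribute names in the example above follow a “user:” prefix pattern. Assuming that pattern holds for all keys of the “user” section (an inference from the example, not a documented rule here), a small hypothetical helper can build the attribute names from a regular dictionary:

```python
def user_attrs(user_metadata):
    """Map a "user" metadata dict to attribute names of the form "user:<key>"."""
    return {f"user:{key}": value for key, value in user_metadata.items()}


attrs = user_attrs({"inlet": True, "n_channels": 4})
print(attrs)  # {'user:inlet': True, 'user:n_channels': 4}
```

Each resulting key-value pair could then be assigned to `h5.attrs` of a file opened in a writable mode.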
User-defined metadata can also be used with user-defined plugin features. This allows you to design plugin features which utilize your pipeline-specific metadata.
Basins
Since dclab 0.51.0, you can define so-called basins in .rtdc files. Basins are files or remote locations that contain additional features that are not part of the file you opened initially.
For instance, you might want to compute some additional features for a measurement, but you want to avoid editing the original file data/example.rtdc, and you also need to have access to the features of the original file when working with the new file test.rtdc.
    In [13]: import dclab

    # Create the smaller file with the basin defined.
    In [14]: with dclab.new_dataset("data/example.rtdc") as dso, dclab.RTDCWriter("test.rtdc", mode="reset") as hw:
       ....:     # copy metadata
       ....:     meta = dict(dso.config)
       ....:     meta.pop("filtering")
       ....:     hw.store_metadata(meta)
       ....:     # store a feature from the original dataset
       ....:     hw.store_feature("deform", dso["deform"])
       ....:     # store a user-defined feature
       ....:     hw.store_feature("userdef1", 2.5*dso["deform"])
       ....:     # store the basin information
       ....:     hw.store_basin(basin_name="mytest",
       ....:                    basin_type="file",
       ....:                    basin_format="hdf5",
       ....:                    basin_locs=["data/example.rtdc"])
       ....:

    In [15]: ds2 = dclab.new_dataset("test.rtdc")

    # the basin in "test.rtdc" gives you access to features stored in "data/example.rtdc"
    In [16]: print(ds2.features)
    ['area_cvx', 'area_msd', 'area_ratio', 'area_um', 'aspect', 'bright_avg', 'bright_sd', 'circ', 'circ_times_area', 'deform', 'frame', 'index', 'inert_ratio_cvx', 'inert_ratio_raw', 'nevents', 'pos_x', 'pos_y', 'size_x', 'size_y', 'time', 'userdef1']
For more information, please take a look at the documentation of Basin and its subclasses.
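The lookup behavior that basins provide can be pictured with a toy class: features stored in the file itself are returned directly, and anything else is looked up in the basin. This is a conceptual sketch in plain Python with invented data; `DatasetWithBasin` is a hypothetical class and does not reflect the actual Basin implementation in dclab.

```python
class DatasetWithBasin:
    """Toy dataset that looks up missing features in a basin."""

    def __init__(self, features, basin=None):
        self._features = dict(features)  # features stored in this "file"
        self._basin = basin              # another dataset, or None

    @property
    def features(self):
        names = set(self._features)
        if self._basin is not None:
            names |= set(self._basin.features)
        return sorted(names)

    def __getitem__(self, name):
        if name in self._features:
            return self._features[name]
        if self._basin is not None:
            return self._basin[name]
        raise KeyError(name)


original = DatasetWithBasin({"area_um": [30.0, 45.0], "deform": [0.02, 0.04]})
derived = DatasetWithBasin(
    {"deform": original["deform"],
     "userdef1": [2.5 * d for d in original["deform"]]},
    basin=original,
)
print(derived.features)    # ['area_um', 'deform', 'userdef1']
print(derived["area_um"])  # [30.0, 45.0]
```

The derived dataset stores only two features itself, yet exposes all features of the original through the basin.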