Getting started

Installation

To install dclab, use one of the following methods:

  • from PyPI:

    pip install dclab[all]

  • from sources:

    pip install .[all]

The extra key [all] can be omitted if you are not working with DCOR or the tdms file format and do not need to export data to .avi or .fcs files. In that case, the basic installation of dclab only depends on the Python packages h5py, numpy, and scipy. In addition, dclab contains code from OpenCV (computation of moments) and scikit-image (computation of contours and points in polygons), which keeps the list of dependencies short (these libraries are not required by dclab).

If you are working with the outdated tdms file format, you have to specify the extra key [tdms], i.e. pip install dclab[tdms] or pip install .[tdms]. This will install the additional libraries nptdms and imageio. You may also specify the extra key [export], which will install imageio and fcswrite for .avi and .fcs export. If you are working with DCOR, then you have to specify the extra key [dcor], which will install the requests module (https://requests.readthedocs.io/en/master/). As mentioned above, using [all] will install all extras.

Note that if you are installing from source or if no binary wheel is available for your platform and Python version, Cython will be installed to build the required dclab extensions. If this process fails, please request a binary wheel for your platform (e.g. Windows 64-bit) and Python version (e.g. 3.6) by creating a new issue.

Use cases

If you are a frequent user of RT-DC, you might run into problems that cannot (yet) be addressed with the graphical user interface Shape-Out. Here is a list of use cases that would motivate an installation of dclab.

  • You would like to convert old .tdms-based datasets to the new .rtdc file format, because of enhanced speed in Shape-Out and reduced disk usage. What you are looking for is the command line program dclab-tdms2rtdc that comes with dclab. It allows you to batch-convert multiple measurements at once. Note that you should keep the original .tdms files backed-up somewhere, because there might be future improvements or bug fixes from which you would like to benefit. Please also note that DCKit offers a graphical user interface for batch conversion from .tdms to .rtdc.

  • You would like to apply a simple set of filters (e.g. polygon filters that you exported from within Shape-Out) to every new measurement you take and apply a custom data analysis pipeline to the filtered data. This is a straightforward Python coding problem with dclab. After reading the basic usage section below, please have a look at the polygon filter reference.

  • You would like to do advanced statistics or combine your RT-DC analysis with other fancy approaches such as machine learning. It would be too laborious to do the analysis in Shape-Out, export the data as text files, and then open them in your custom Python script. If your initial analysis step with Shape-Out only involves tasks that can be automated, why not use dclab from the beginning?

  • You simulated RT-DC data and plan to import them into Shape-Out for testing. Once you have loaded your data as a numpy array, you can instantiate an RTDC_Dict class and then use the Export class to create an .rtdc data file (see the sketch after this list).
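
The following sketch illustrates the last use case. The feature values are random placeholder data and the output file name "simulated.rtdc" is only an example; passing a dictionary of numpy arrays to new_dataset yields an RTDC_Dict instance, whose export attribute can then write an .rtdc file.

import numpy as np
import dclab

# random placeholder data standing in for simulation results
data = {"area_um": np.random.uniform(20, 200, size=100),
        "deform": np.random.uniform(0.0, 0.2, size=100)}
# a dictionary of numpy arrays yields an RTDC_Dict instance
ds = dclab.new_dataset(data)
# write an .rtdc file that can be opened in Shape-Out
ds.export.hdf5("simulated.rtdc", features=["area_um", "deform"])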

If you are still unsure about whether to use dclab or not, you might want to look at the example section. If you need advice, do not hesitate to create an issue.

Basic usage

Experimental RT-DC datasets are always loaded with the new_dataset function:

import numpy as np
import dclab

# .tdms file format
ds = dclab.new_dataset("/path/to/measurement/Online/M1.tdms")
# .rtdc file format
ds = dclab.new_dataset("/path/to/measurement/M2.rtdc")
# DCOR data
ds = dclab.new_dataset("fb719fb2-bd9f-817a-7d70-f4002af916f0")

The object returned by new_dataset is always an instance of RTDCBase. To show all available features, use:

print(ds.features)

This will list all scalar features (e.g. “area_um” and “deform”) and all non-scalar features (e.g. “contour” and “image”). Scalar features can be filtered by editing the configuration of ds and calling ds.apply_filter():

# register filtering operations
amin, amax = ds["area_um"].min(), ds["area_um"].max()
ds.config["filtering"]["area_um min"] = (amax + amin) / 2
ds.config["filtering"]["area_um max"] = amax
ds.apply_filter()  # this step is important!

This will update the boolean array ds.filter.all, which can be used to extract the filtered data:

area_um_filtered = ds["area_um"][ds.filter.all]

It is also possible to create a hierarchy child of this dataset that only contains the filtered data:

ds_child = dclab.new_dataset(ds)

The hierarchy child ds_child is dynamic, i.e. when the filters in ds change, ds_child is updated accordingly after calling ds_child.apply_filter().
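
For example (a minimal sketch; the "deform max" threshold of 0.1 is an arbitrary illustrative value):

# tighten the filter on the parent dataset
ds.config["filtering"]["deform max"] = 0.1
ds.apply_filter()
# propagate the updated filter to the hierarchy child
ds_child.apply_filter()
print("number of events in the child:", len(ds_child))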

Non-scalar features do not support fancy indexing (i.e. ds["image"][ds.filter.all] will not work). Use a for-loop to extract them:

for ii in range(len(ds)):
    image = ds["image"][ii]
    mask = ds["mask"][ii]
    # this is equivalent to ds["bright_avg"][ii]
    bright_avg = np.mean(image[mask])
    print("average brightness of event {}: {:.1f}".format(ii, bright_avg))

If you need more information to get started on your particular problem, you might want to check out the examples section and the advanced scripting section.