Software
Introduction

Counting fluorescent spots by hand is difficult, so we designed a tool based on Python and the PyQt5 framework. The tool decides which points in an image belong to fluorescent spots by finding the pixels whose gray values exceed a threshold, set at the upper bound of a normal-distribution confidence interval. To make the choice of confidence interval well founded, we also developed a training-testing tool, similar in spirit to a machine learning workflow, that selects a suitable confidence interval.

Prerequisites
  1. Open a terminal:
     On Linux or macOS: search for "Terminal". On Windows: search for "Windows PowerShell". You will only need a few command-line commands:

    Change directory with cd: for example, type cd BUCT-China/ to enter the "BUCT-China" directory, or cd .. to move back to the parent directory.
    List directory contents with ls: this command lists the items in the current directory.
    Show the current path with pwd: this command displays the path of your current location.

  2. Ensure Python is installed on your computer:

    On Windows:

    Open the terminal and type py. This starts the Python interpreter, which displays the installed Python version. You should have at least version 3.8. Then press Ctrl+Z (you will see ^Z) and hit Enter to leave the interpreter and return to the command line.

    On Linux/macOS:

    Open the terminal and type python --version (or python3 --version if python is not found). This prints the installed Python version and returns directly to the command line. You should have at least version 3.8.

    If Python is not installed:

    You can download it from the Python Download Page (version 3.8 or higher).

  3. Ensure you have an up-to-date version of pip:

    On Windows:

    In your terminal, type:

    py -m pip install --upgrade pip

    On Linux/macOS:

    pip install --upgrade pip
Installation
  1. Access the code:
     Navigate to your chosen directory using the terminal and execute the following command:

    git clone https://gitlab.igem.org/2024/software-tools/buct-china

    Or download it from our GitLab: 2024 Competition / Software Tools / BUCT-China · GitLab

  2. Use the “cd” command to enter the “buct-china/” folder:

    cd buct-china

  3. Install the necessary dependencies by typing the following command in the terminal:

    Windows:

    py -m pip install -r requirements.txt

    macOS/Linux:

    pip install -r requirements.txt

  4. Launch the tool by typing the following command in the terminal:

    python3 findLight.py

    or

    python findLight.py
How we developed it
First Attempt

Initially, we tried Otsu's thresholding method to identify the bright spots. It is one of the most commonly used methods for threshold segmentation. However, this first attempt did not yield promising results: the number of detected bright spots was far too high.
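For readers unfamiliar with the method, here is a minimal sketch of Otsu thresholding followed by spot counting. It is not the code from our repository: the function names `otsu_threshold` and `count_spots` are hypothetical, the image is assumed to be an 8-bit grayscale NumPy array, and SciPy's `ndimage.label` is used for the connected components.

```python
import numpy as np
from scipy import ndimage


def otsu_threshold(gray):
    """Return the Otsu threshold of an 8-bit grayscale image (uint8 array)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                 # weight of the class "<= t"
    w1 = 1.0 - w0                        # weight of the class "> t"
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]
    # Between-class variance for every candidate threshold t
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(between))


def count_spots(gray, threshold):
    """Count connected groups of pixels brighter than the threshold."""
    _, n_spots = ndimage.label(gray > threshold)
    return n_spots


# Synthetic example: dark background with two bright 3x3 spots
image = np.full((50, 50), 20, dtype=np.uint8)
image[5:8, 5:8] = 200
image[30:33, 40:43] = 200
t = otsu_threshold(image)
print(t, count_spots(image, t))
```

Otsu always splits the histogram into exactly two classes. On a clean synthetic image like this one that works, but when the background is noisy and dominates the image, the split can fall inside the background, which is consistent with the explosion in spot count we observed.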

Second Attempt

We then tried morphological erosion and dilation together with the Sobel and Canny edge detection operators, varying one parameter at a time while holding the others fixed (the control variable method).

By varying the size of the Gaussian kernel in this way, we found that the Canny operator with a kernel size of 7 gave slightly better results. Morphological operations, in contrast, had minimal effect and were sometimes counterproductive. Even so, the results still showed far too many bright spots.
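As a generic illustration of how a morphological operation can work against spot counting, the sketch below applies an opening (erosion followed by dilation) to a binary mask. This is not our pipeline: it uses SciPy's `ndimage.binary_opening` with a 3x3 structuring element, both chosen here only for the example.

```python
import numpy as np
from scipy import ndimage

# Binary mask with one single-pixel spot and one 4x4 cluster
mask = np.zeros((12, 12), dtype=bool)
mask[2, 2] = True
mask[6:10, 6:10] = True

# Opening = erosion followed by dilation with the same structuring element
structure = np.ones((3, 3), dtype=bool)
opened = ndimage.binary_opening(mask, structure=structure)

_, n_before = ndimage.label(mask)
_, n_after = ndimage.label(opened)
print(n_before, n_after)
```

Any spot smaller than the structuring element disappears after the opening, so a genuine but small bright spot is removed together with the noise.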

Based on this, we began to consider a fundamental issue: Is there a problem with Otsu's calculation method?

Core Issue

To get to the root of the problem, we held multiple discussions with members of the wet-lab group. We concluded that Otsu's method is indeed ill-suited here; the algorithm should adapt to the characteristics of the images and of the experiment. The reasons are as follows:

  • Otsu is very sensitive to noise, and what counts as noise differs between the experimental view and the data-processing view. Some slightly gray points in the image are actually clusters of bright spots, but Otsu may treat them as noise and classify them as background.

  • Otsu's division of the image into foreground and background does not match our experimental concept and does not apply to our samples. Forcing the brightness levels of our images into just two classes, background and bright spot, loses bright spots of uneven brightness.

  • Fluorescent probe images come from experiments conducted by different experimenters each time. Slight differences in sample conditions lead to slight differences in the imaging results, and the luminance is easily affected.

The core point is that the method should adapt to the image characteristics and experimental features. To test this view, we ran an experiment that used only the grayscale value as the filtering criterion: we binarized each image with a fixed grayscale threshold and marked points with grayscale values above the threshold as "bright spots." We then used connected components to identify complete "bright spot clusters," which are the positive points we want. We used two batches of samples, taking the first 10 images from each batch (each batch contained a single TIFF file). For each image, we tried thresholds from 250 down to 160.

The experiment succeeded: the first batch of samples gave good results with a threshold around 170, and the second batch with a threshold around 200. This suggests that the solution might be simpler than we initially anticipated.
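The fixed-threshold experiment can be sketched as follows. This is an illustration rather than our actual script: the helper names `count_clusters` and `sweep_thresholds`, the step of 10 between thresholds, and the use of 8-connectivity are all assumptions made for the example.

```python
import numpy as np
from scipy import ndimage


def count_clusters(gray, threshold):
    """Count connected clusters of pixels brighter than a fixed threshold."""
    mask = gray > threshold
    # 8-connectivity: diagonal neighbours belong to the same cluster (assumption)
    structure = np.ones((3, 3), dtype=int)
    _, n_clusters = ndimage.label(mask, structure=structure)
    return n_clusters


def sweep_thresholds(gray, start=250, stop=160, step=-10):
    """Return {threshold: cluster count} for thresholds from start down to stop."""
    return {t: count_clusters(gray, t) for t in range(start, stop - 1, step)}


# Synthetic example: one saturated spot and one dimmer spot on a gray background
image = np.full((40, 40), 100, dtype=np.uint8)
image[5:8, 5:8] = 255
image[20:23, 20:23] = 180
counts = sweep_thresholds(image)
for threshold, n in counts.items():
    print(threshold, n)
```

In this toy image the dimmer spot only appears once the threshold drops below its gray value, which mirrors why different batches needed different thresholds.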

However, manually set thresholds alone are far from sufficient; such a method is neither elegant nor scientific. We need a calculation method tailored to fluorescent probe images, one that is simple, uses a single judgment criterion, and is constrained by statistically meaningful conditions.

Solution: Statistical Method

We then took a bolder step: we abandoned common image thresholding algorithms such as Otsu and turned to statistics instead. Initially, we treated the grayscale values at all coordinates of the image as a single set and calculated its standard deviation and variance. At this stage, our thinking was still within the framework of the Otsu algorithm.

The results improved somewhat but still fell short of what we wanted. We then wondered whether we could fit a statistical distribution to our samples. Using the fit method of several candidate distributions in Python, we fitted the grayscale values at the pixel coordinates of our images. We fitted several samples, and the vast majority conformed to a log-normal distribution, so we tried threshold segmentation based on a log-normal distribution. However, the computed upper threshold of the log-normal distribution easily becomes infinite, making it impossible to capture bright spots.

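import warnings

import numpy as np
from scipy import stats

# pixel_values: 1-D array of the image's grayscale values, defined elsewhere.

# Candidate distributions to compare
distributions = [
    stats.norm,
    stats.gamma,
    stats.lognorm,
    stats.beta,
    stats.weibull_min,
]

best_distribution = None
best_params = None
best_sse = np.inf

# Empirical density, computed once and compared against every candidate
hist, bin_edges = np.histogram(pixel_values, bins=100, density=True)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2

for distribution in distributions:
    try:
        # Fitting can emit many runtime warnings; silence them during the fit
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            params = distribution.fit(pixel_values)

        # SciPy returns (shape parameters..., loc, scale)
        arg = params[:-2]
        loc = params[-2]
        scale = params[-1]

        if scale == 0:
            continue

        # Evaluate the fitted PDF at the bin centers so it lines up with hist
        pdf = distribution.pdf(bin_centers, *arg, loc=loc, scale=scale)

        # Sum of squared errors between the empirical and fitted densities
        sse = np.sum((hist - pdf) ** 2)

        if sse < best_sse:
            best_distribution = distribution
            best_params = params
            best_sse = sse
    except Exception as e:
        print(f"Error fitting {distribution.name}: {str(e)}")
        continue

if best_distribution is not None:
    print(f"Best fitting distribution: {best_distribution.name}")
else:
    print("No distribution could be fitted successfully.")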

We consulted relevant materials and found that the log-normal distribution resembles the normal distribution in shape; the main difference is that it is skewed, with a longer tail to the right. We therefore tried using the confidence interval of the normal distribution to identify the bright spots, and the results were very good. In particular, a confidence interval of 99.994% generalized well across uniformly illuminated images, so we set 99.994% as the default confidence interval for processing images.
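The threshold implied by a normal confidence interval can be sketched as below. This is a simplified illustration, not the code in findLight.py: the function names are hypothetical, the mean and standard deviation are taken over all pixels of the image, and the interval is assumed to be two-sided, so that 99.994% corresponds to roughly four standard deviations above the mean.

```python
import numpy as np
from scipy import ndimage, stats


def normal_upper_threshold(gray, confidence=0.99994):
    """Upper bound of a two-sided normal confidence interval of the gray values."""
    values = gray.astype(float).ravel()
    mean, std = values.mean(), values.std()
    # Two-sided interval: (1 - confidence) / 2 of the mass lies in the right tail
    z = stats.norm.ppf(1.0 - (1.0 - confidence) / 2.0)
    return mean + z * std


def detect_spots(gray, confidence=0.99994):
    """Return the bright-pixel mask and the number of connected clusters."""
    mask = gray.astype(float) > normal_upper_threshold(gray, confidence)
    _, n_clusters = ndimage.label(mask)
    return mask, n_clusters


# Synthetic example: noisy background with one bright 5x5 spot
rng = np.random.default_rng(0)
image = rng.normal(100.0, 5.0, size=(200, 200))
image[90:95, 90:95] = 200.0
mask, n_clusters = detect_spots(image)
print(normal_upper_threshold(image), n_clusters)
```

Because the threshold is derived from each image's own mean and spread, it moves with the overall brightness of the image instead of being fixed by hand.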

However, this confidence interval only applies to uniformly illuminated images, is very sensitive to noise, and places very high demands on experimental precision. We need a way to determine a reasonable confidence interval separately for each experimental design.
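One way such a per-experiment choice could be made is sketched below. This page does not describe the internals of our training-testing tool, so everything here is an assumption for illustration: the function names, the candidate list, the two-sided interval, and the idea of choosing the confidence level that best reproduces manually counted spots on a set of training images.

```python
import numpy as np
from scipy import ndimage, stats


def count_spots(gray, confidence):
    """Count clusters above the upper bound of a two-sided normal interval."""
    values = gray.astype(float)
    z = stats.norm.ppf(1.0 - (1.0 - confidence) / 2.0)
    threshold = values.mean() + z * values.std()
    _, n_spots = ndimage.label(values > threshold)
    return n_spots


def mean_count_error(images, true_counts, confidence):
    """Mean absolute difference between detected and manually counted spots."""
    return float(np.mean([abs(count_spots(img, confidence) - n)
                          for img, n in zip(images, true_counts)]))


def choose_confidence(images, true_counts, candidates):
    """Return the candidate with the lowest error on the training images."""
    errors = [mean_count_error(images, true_counts, c) for c in candidates]
    return candidates[int(np.argmin(errors))]


# Synthetic training image: three bright spots and two faint ones
image = np.full((100, 100), 100.0)
for r, c in [(10, 10), (30, 60), (80, 20)]:
    image[r:r + 2, c:c + 2] = 200.0
for r, c in [(50, 50), (70, 80)]:
    image[r:r + 2, c:c + 2] = 110.0

candidates = [0.99, 0.999, 0.99994]
print(choose_confidence([image], [5], candidates))
print(choose_confidence([image], [3], candidates))
```

The chosen level would then be checked with `mean_count_error` on held-out test images from the same experimental design before being used as that design's default.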