Classification and counting of cells in brightfield microscopy images: an application of convolutional neural networks – Scientific Reports

The present work is an offshoot of previous work by our group, published in Scientific Reports, in which we used a CNN to quantify the number of cells present in microscopy images (ref. 14). Our regression algorithm showed good performance and accuracy in two of the three strains tested, demonstrating that not all cells can be quantified equally well by this technique. We therefore present in this manuscript the development of a model capable of identifying which cell lineage is present in each image, based on a classification algorithm. CNNs are widely used for image data. They are built from convolutional layers, which apply filters to detect specific features in image regions. These features are then combined and processed in subsequent layers, including pooling and fully connected layers, to perform tasks such as classification, object detection, or segmentation. Although the model has a “simple” construction, it was able to solve the problem, and therefore no complex modifications were necessary.

Image database

The images used were acquired in projects analyzed with the Harmony software (version 3.5), embedded in an automated High Content Screening (HCS) microscopy system. Only phase contrast images were selected. Images of the A549, HUH7_denv, 3T3, VERO6, THP1, SH-SY5Y, A172 and HUH7_mayv cell lines were used. Light contrast adjustments (highlighting the nuclear marking) and background correction (setting the image’s background) were performed in Harmony.

Processing environment

We used the Google Colab Integrated Development Environment (IDE) because of its large memory (currently 12.72 GB of RAM and 107.77 GB of disk). For processing, we imported several libraries for the Python v9 programming language. All data, unique materials, documentation, and code used in the analysis are available at Dataset: Ferreira, E. K. G. D. & Silveira, Guilherme F. 2023. “Data-Analysis-Laboratory/Microscopy-Image-Analysis-Classification-Script-Article: 1.0.0”. Zenodo, accessible at the link:

Segmentation and increase of the image bank

Data augmentation was used to increase the number of images in the database. The orientation of the images was changed (0°, 90°, 180° or 270°), and the images were also scaled down to 75%, 50% and 25% of their original size (Fig. 2). The images were then resized to 200 × 200 pixels to allow analysis by the algorithm. All of these images were saved in a single database.
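The augmentation described above can be sketched as follows. This is a minimal illustration and not the authors' script: the function names `resize_nearest` and `augment` are hypothetical, the nearest-neighbour resize stands in for whatever image library was actually used, and combining every rotation with every scale (including the unscaled original) is an assumption.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array (stand-in for a library resize)."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows[:, None], cols]

def augment(img, scales=(1.0, 0.75, 0.5, 0.25), target=200):
    """Return rotated and rescaled variants, each resized to target x target."""
    variants = []
    for k in range(4):  # 0, 90, 180 and 270 degrees
        rotated = np.rot90(img, k)
        for s in scales:
            h = max(1, int(round(rotated.shape[0] * s)))
            w = max(1, int(round(rotated.shape[1] * s)))
            scaled = resize_nearest(rotated, h, w)  # reduce to s of the original size
            variants.append(resize_nearest(scaled, target, target))
    return variants
```

Under these assumptions, a single input image yields 16 variants (4 rotations × 4 scales), all with the 200 × 200 shape expected by the network.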

Kernel application before the model

The images were fairly homogeneous, and the model sometimes found it difficult to differentiate between them. To work around this, filters were applied to highlight the most relevant characteristics of some images. This was performed only for the SH-SY5Y, HUH7_mayv, HUH7_denv, and A549 lineages (Fig. 4). Several kernels were tested, and the best results were obtained with the Sharpen kernel, which accentuates the edges of the image. It adds contrast to edges, accentuating light and dark areas, using a 3 × 3 matrix similar to the edge detection kernel with a core value of 5 (ref. 28).
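To illustrate how a 3 × 3 sharpening kernel acts on an image, the sketch below applies the commonly used sharpen matrix with a central value of 5. The exact matrix used in the study is the one shown in Fig. 4a, so the values here are an assumption, as are the function name `apply_kernel`, the edge padding, and the clipping to the 0 to 255 range.

```python
import numpy as np

# Common sharpen kernel (assumed; the study's kernel is given in Fig. 4a).
SHARPEN = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]], dtype=float)

def apply_kernel(img, kernel=SHARPEN):
    """Filter a 2-D grayscale image with a 3 x 3 kernel, keeping its size."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, 1, mode="edge")  # repeat border pixels
    out = np.zeros_like(img)
    h, w = img.shape
    for dr in range(3):
        for dc in range(3):
            # Add the neighbour at offset (dr - 1, dc - 1), weighted by the kernel.
            out += kernel[dr, dc] * padded[dr:dr + h, dc:dc + w]
    return np.clip(out, 0, 255)
```

Because the kernel weights sum to 1, flat regions are left unchanged, while pixels on either side of an edge are pushed further apart in intensity.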

Figure 4

Adding Kernel to image preprocessing. (a) Kernel Sharpen applied to images. (b) Images of SH-SY5Y, HUH7_mayv, HUH7_denv and A549 strains after kernel application.

Model validation

For CNN validation, 10% of the images were randomly set aside, and the remaining 90% were used for training and testing. Of the remaining images, approximately 70% were used to train the CNN and 30% to test it. Table 2 shows the number of images in each bank.

Table 2 Number of images separated for each bank.
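The split described in this section could be sketched as below. The function name `split_dataset`, the fixed seed, and the rounding of the bank sizes are assumptions made for illustration. The source only states the approximate proportions.

```python
import random

def split_dataset(items, holdout=0.10, train_frac=0.70, seed=0):
    """Shuffle items, set aside a validation bank, then split the rest."""
    items = list(items)
    random.Random(seed).shuffle(items)  # reproducible random order
    n_val = round(len(items) * holdout)
    validation, rest = items[:n_val], items[n_val:]
    n_train = round(len(rest) * train_frac)
    train, test = rest[:n_train], rest[n_train:]
    return train, test, validation
```

With 1000 images, for example, this gives 100 validation images, 630 training images and 270 test images, with no image shared between banks.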

Classification model

The images were saved and identified with the name of their lineage. To create the classes, the name of each lineage was replaced with an integer value and used to create categorical classes ranging from 0 to 7.
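A minimal sketch of this label encoding is shown below. The cell line names come from the image database described earlier, but the order in which integers are assigned and the helper name `one_hot` are assumptions.

```python
# Cell lines from the image database; the integer order is an assumption.
CELL_LINES = ["A549", "HUH7_denv", "3T3", "VERO6",
              "THP1", "SH-SY5Y", "A172", "HUH7_mayv"]

# Map each lineage name to an integer class from 0 to 7.
CLASS_INDEX = {name: i for i, name in enumerate(CELL_LINES)}

def one_hot(name, n_classes=len(CELL_LINES)):
    """Return the categorical (one-hot) vector for a lineage name."""
    vec = [0] * n_classes
    vec[CLASS_INDEX[name]] = 1
    return vec
```

Each image label then becomes a vector of length eight with a single 1 at the position of its class, which matches a softmax output of eight classes.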

Model evaluation based on accuracy metrics

Four possible outcomes were considered to evaluate the accuracy of the classification model. These were the true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).

Confusion matrix

From the confusion matrix, the accuracy measures the number of correct classifications of the model in relation to the total number of observations. TPi corresponds to the number of true positives in class i.

N is the total number of observations.

$$Accuracy=\frac{TP_{1}+TP_{2}+\dots +TP_{n}}{N}$$


The precision is the ratio of true positives to the total number of observations predicted as positive for the class.

FPi corresponds to the number of false positives in class i.

$$Precision_{i}=\frac{TP_{i}}{TP_{i}+FP_{i}}$$



The recall is the ratio of true positives to the total positive observations in the class.

FNi corresponds to the number of false negatives in class i.

$$Recall_{i}=\frac{TP_{i}}{TP_{i}+FN_{i}}$$



The F1-score is the harmonic mean of precision and recall, which balances the two metrics when the classes are imbalanced.

$$F1_{i}=2\times \frac{Precision_{i}\times Recall_{i}}{Precision_{i}+Recall_{i}}$$
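As a worked illustration of these metrics, the sketch below derives accuracy and the per-class precision, recall and F1-score from a confusion matrix. The function name `per_class_metrics` and the convention that rows are true classes and columns are predicted classes are assumptions. The example matrix in the usage note is invented and is not a result from the study.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of class-i observations predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    report = []
    for i in range(n):
        tp = cm[i][i]
        fp = sum(cm[r][i] for r in range(n)) - tp  # predicted i, truly another class
        fn = sum(cm[i]) - tp                       # truly i, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report.append({"precision": precision, "recall": recall, "f1": f1})
    return accuracy, report
```

For the toy matrix `[[8, 2, 0], [1, 7, 2], [0, 1, 9]]`, the accuracy is 24/30 = 0.8, and class 0 has a precision of 8/9 and a recall of 8/10.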


ROC curve

The ROC (Receiver Operating Characteristic) curve is the graphical representation of the performance of the classification model in terms of its True Positive Rate (TPR) and False Positive Rate (FPR). The ROC curve is constructed by plotting the TPR as a function of the FPR at different classification threshold values.

$$TPR=\frac{TP_{i}}{TP_{i}+FN_{i}}$$
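The construction of the curve can be sketched for a single class treated one-vs-rest. The sketch assumes the standard definition of the false positive rate, FPR = FPi / (FPi + TNi), which is not written out in the text, and the function name `roc_points` is hypothetical. The labels and scores in the usage note are invented.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per threshold, for a one-vs-rest problem.

    labels: 1 if the observation belongs to class i, otherwise 0.
    scores: predicted probability of class i for each observation.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))  # (FPR, TPR) at threshold t
    return points
```

For labels `[1, 1, 0, 1, 0, 0]` and scores `[0.9, 0.8, 0.7, 0.6, 0.4, 0.2]`, the curve starts at (0, 0), passes through (1/3, 2/3) at a threshold of 0.7, and ends at (1, 1).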


Regression model

As the target, the number of cells corresponding to each image, as reported by the HCS, was recorded. This was used as the observed value and was reduced in the same proportion as the images. It was used for the supervised training of the models and, subsequently, for testing against the predicted values.

Model evaluation based on accuracy metrics

The Mean Absolute Error (MAE), Mean Squared Error (MSE), and R² score were used to evaluate how closely the models' predictions matched the observed values. However, only MSE was used during model training.

MSE is the mean, \(\frac{1}{n}\sum_{i=1}^{n}\), of the squared errors \(\left( Y_{i} - \hat{Y}_{i} \right)^{2}\):

$$MSE=\frac{1}{n}\sum_{i=1}^{n}{\left(Y_{i}-\hat{Y}_{i}\right)}^{2}$$
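The three regression metrics can be computed as in the sketch below, which uses the standard definitions of MAE and R² alongside the MSE formula above. The function name `regression_metrics` is hypothetical, and the cell counts in the usage note are invented for illustration.

```python
def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, R2) for observed and predicted cell counts."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return mae, mse, r2
```

For observed counts `[100, 150, 200, 250]` and predictions `[110, 140, 210, 240]`, every error has magnitude 10, so MAE = 10, MSE = 100 and R² = 1 − 400/12500 = 0.968.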


The first layer (Conv2D) was configured with kernel_size = 3 and the Rectified Linear Unit (ReLU) activation function. Other activation functions (LeakyReLU, Tanh, and Sigmoid) were also tested, but ReLU had the best performance. The same parameters were used in sequence with the MaxPooling2D layers, ending with a softmax output of eight classes. The same settings were used for the regression models, except that the network’s last layer was changed to a single output neuron with the ReLU activation function, which represents the number of cells in the image. The model.summary() method was used to summarize the model information (Table 3).

Table 3 Parameters for CNN architecture development.