Watershed (image processing)

Last updated

In the study of image processing, a watershed is a transformation defined on a grayscale image. The name refers metaphorically to a geological watershed, or drainage divide, which separates adjacent drainage basins. The watershed transformation treats the image it operates upon like a topographic map, with the brightness of each point representing its height, and finds the lines that run along the tops of ridges.

Contents

There are different technical definitions of a watershed. In graphs, watershed lines may be defined on the nodes, on the edges, or hybrid lines on both nodes and edges. Watersheds may also be defined in the continuous domain. [1] There are also many different algorithms to compute watersheds. Watershed algorithms are used in image processing primarily for object segmentation purposes, that is, for separating different objects in an image. This allows for counting the objects or for further analysis of the separated objects.

Definitions

In geology, a watershed is a divide that separates adjacent catchment basins.

Watershed by flooding

The idea was introduced in 1979 by S. Beucher and C. Lantuéjoul. [2] The basic idea consisted of placing a water source in each regional minimum in the relief, to flood the entire relief from sources, and build barriers when different water sources meet. The resulting set of barriers constitutes a watershed by flooding. A number of improvements, collectively called Priority-Flood, have since been made to this algorithm. [3]

Watershed by topographic distance

Intuitively, a drop of water falling on a topographic relief flows towards the "nearest" minimum. The "nearest" minimum is that minimum which lies at the end of the path of steepest descent. In terms of topography, this occurs if the point lies in the catchment basin of that minimum. The previous definition does not verify this condition.

Watershed by the drop of water principle

Intuitively, the watershed is a separation of the regional minima from which a drop of water can flow down towards distinct minima. A formalization of this intuitive idea was provided in [4] for defining a watershed of an edge-weighted graph.

Inter-pixel watershed

S. Beucher and F. Meyer introduced an algorithmic inter-pixel implementation of the watershed method, [5] given the following procedure:

  1. Label each minimum with a distinct label. Initialize a set S with the labeled nodes.
  2. Extract from S a node x of minimal altitude F, that is to say F(x) = min{F(y)|y  S}. Attribute the label of x to each non-labeled node y adjacent to x, and insert y in S.
  3. Repeat Step 2 until S is empty.

Topological watershed

Previous notions focus on catchment basins, but not to the produced separating line. The topological watershed was introduced by M. Couprie and G. Bertrand in 1997, [6] and beneficiate of the following fundamental property. A function W is a watershed of a function F if and only if W ≤ F and W preserves the contrast between the regional minima of F; where the contrast between two regional minima M1 and M2 is defined as the minimal altitude to which one must climb in order to go from M1 to M2. [7] An efficient algorithm is detailed in the paper. [8]

Watershed algorithm

Different approaches may be employed to use the watershed principle for image segmentation.

Meyer's flooding algorithm

One of the most common watershed algorithms was introduced by F. Meyer in the early 1990s, though a number of improvements, collectively called Priority-Flood, have since been made to this algorithm, [9] including variants suitable for datasets consisting of trillions of pixels. [10]

The algorithm works on a gray scale image. During the successive flooding of the grey value relief, watersheds with adjacent catchment basins are constructed. This flooding process is performed on the gradient image, i.e. the basins should emerge along the edges. Normally this will lead to an over-segmentation of the image, especially for noisy image material, e.g. medical CT data. Either the image must be pre-processed or the regions must be merged on the basis of a similarity criterion afterwards.

  1. A set of markers, pixels where the flooding shall start, are chosen. Each is given a different label.
  2. The neighboring pixels of each marked area are inserted into a priority queue with a priority level corresponding to the gradient magnitude of the pixel.
  3. The pixel with the lowest priority level is extracted from the priority queue. If the neighbors of the extracted pixel that have already been labeled all have the same label, then the pixel is labeled with their label. All non-marked neighbors that are not yet in the priority queue are put into the priority queue.
  4. Redo step 3 until the priority queue is empty.

The non-labeled pixels are the watershed lines.

Example of a marker-supported watershed transformation for a population of pharmaceutical pellets. Watershed lines are superimposed in black on the CT image stack. FrDo Capsule VolSegmentation Watershed.jpg
Example of a marker-supported watershed transformation for a population of pharmaceutical pellets. Watershed lines are superimposed in black on the CT image stack.

Optimal spanning forest algorithms (watershed cuts)

Watersheds as optimal spanning forest have been introduced by Jean Cousty et al. [12] They establish the consistency of these watersheds: they can be equivalently defined by their “catchment basins” (through a steepest descent property) or by the “dividing lines” separating these catchment basins (through the drop of water principle). Then they prove, through an equivalence theorem, their optimality in terms of minimum spanning forests. Afterward, they introduce a linear-time algorithm to compute them. It is worthwhile to note that similar properties are not verified in other frameworks and the proposed algorithm is the most efficient existing algorithm, both in theory and practice.

Graph cuts

In 2007, C. Allène et al. [13] established links relating Graph Cuts to optimal spanning forests. More precisely, they show that when the power of the weights of the graph is above a certain number, the cut minimizing the graph cuts energy is a cut by maximum spanning forest.

Shortest-path forests

The image foresting transform (IFT) of Falcao et al. [14] is a procedure for computing shortest path forests. It has been proved by J. Cousty et al. [15] that when the markers of the IFT corresponds to extrema of the weight function, the cut induced by the forest is a watershed cut.

Random walker

The random walker algorithm is a segmentation algorithm solving the combinatorial Dirichlet problem, adapted to image segmentation by L. Grady in 2006. [16] In 2011, C. Couprie et al. proved that when the power of the weights of the graph converge toward infinity, the cut minimizing the random walker energy is a cut by maximum spanning forest. [17]

Hierarchies

A hierarchical watershed transformation converts the result into a graph display (i.e. the neighbor relationships of the segmented regions are determined) and applies further watershed transformations recursively. See [18] for more details. A theory linking watershed to hierarchical segmentations has been developed in [19]

Notes

  1. L. Najman and M. Schmitt. Watershed of a continuous function. In Signal Processing (Special issue on Mathematical Morphology.), Vol. 38 (1994), pages 99–112
  2. Serge Beucher and Christian Lantuéj workshop on image processing, real-time edge and motion detection (1979). http://cmm.ensmp.fr/~beucher/publi/watershed.pdf
  3. Barnes, R., Lehman, C., Mulla, D., 2014. Priority-flood: An optimal depression-filling and watershed-labeling algorithm for digital elevation models. Computers & Geosciences 62, 117–127. doi : 10.1016/j.cageo.2013.04.024
  4. J. Cousty, G. Bertrand, L. Najman and M. Couprie. Watershed Cuts: Minimum Spanning Forests and the Drop of Water Principle, IEEE Transactions on Pattern Analysis and Machine Intelligence 31(8) pp. 1362-1374, 2009,
  5. Serge Beucher and Fernand Meyer. The morphological approach to segmentation: the watershed transformation. In Mathematical Morphology in Image Processing (Ed. E. R. Dougherty), pages 433–481 (1993).
  6. M. Couprie, G. Bertrand. Topological gray-scale watershed transform. In Proc. of SPIE Vision Geometry V, volume 3168, pages 136–146 (1997). http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.3.7654&rep=rep1&type=pdf
  7. G. Bertrand. On topological watersheds. Journal of Mathematical Imaging and Vision, 22(2–3), pages 217–230 (2005).
  8. Michel Couprie, Laurent Najman, Gilles Bertrand. Quasi-linear algorithms for the topological watershed. Journal of Mathematical Imaging and Vision, Springer Verlag, 2005, 22 (2-3), pp.231-249.
  9. Barnes, R., Lehman, C., Mulla, D., 2014. Priority-flood: An optimal depression-filling and watershed-labeling algorithm for digital elevation models. Computers & Geosciences 62, 117–127. doi : 10.1016/j.cageo.2013.04.024
  10. Barnes, R., 2016. Parallel priority-flood depression filling for trillion cell digital elevation models on desktops or clusters. Computers & Geosciences. doi : 10.1016/j.cageo.2016.07.001
  11. Doerr, F. J. S., & Florence, A. J. (2020). A micro-XRT Image Analysis and Machine Learning Methodology for the Characterisation of Multi-Particulate Capsule Formulations. International Journal of Pharmaceutics: X, 2, 100041. https://doi.org/10.1016/j.ijpx.2020.100041
  12. Jean Cousty, Gilles Bertrand, Laurent Najman, and Michel Couprie. Watershed Cuts: Minimum Spanning Forests and the Drop of Water Principle. IEEE Transactions on Pattern Analysis and Machine Intelligence. 31 (8). August 2009. pp. 1362–1374.
  13. Cédric Allène, Jean-Yves Audibert, Michel Couprie and Renaud Keriven : "Some links between min-cuts, optimal spanning forests and watersheds", Image and Vision Computing, 2009.
  14. Falcao, A.X. Stolfi, J. de Alencar Lotufo, R. : "The image foresting transform: theory, algorithms, and applications", In PAMI, 2004
  15. Jean Cousty, Gilles Bertrand, Laurent Najman, and Michel Couprie. Watershed cuts: thinnings, shortest-path forests and topological watersheds. IEEE Transactions on Pattern Analysis and Machine Intelligence. 32 (5). 2010. pp. 925–939.
  16. Grady, L.: "Random walks for image segmentation". PAMI, 2006
  17. Camille Couprie, Leo Grady, Laurent Najman and Hugues Talbot, "Power Watersheds: A Unifying Graph-Based Optimization Framework”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 33, No. 7, pp. 1384-1399, July 2011
  18. Laurent Najman, Michel Schmitt. Geodesic Saliency of Watershed Contours and Hierarchical Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 1996, 18 (12), pp.1163-1173.
  19. Laurent Najman. On the equivalence between hierarchical segmentations and ultrametric watersheds. Journal of Mathematical Imaging and Vision, Springer Verlag, 2011, 40 (3), pp.231-247.

Related Research Articles

<span class="mw-page-title-main">Flood fill</span> Algorithm in computer graphics to add color or texture

Flood fill, also called seed fill, is a flooding algorithm that determines and alters the area connected to a given node in a multi-dimensional array with some matching attribute. It is used in the "bucket" fill tool of paint programs to fill connected, similarly-colored areas with a different color, and in games such as Go and Minesweeper for determining which pieces are cleared. A variant called boundary fill uses the same algorithms but is defined as the area connected to a given node that does not have a particular attribute.

<span class="mw-page-title-main">Minimum spanning tree</span> Least-weight tree connecting graph vertices

A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. That is, it is a spanning tree whose sum of edge weights is as small as possible. More generally, any edge-weighted undirected graph has a minimum spanning forest, which is a union of the minimum spanning trees for its connected components.

<span class="mw-page-title-main">Image segmentation</span> Partitioning a digital image into segments

In digital image processing and computer vision, image segmentation is the process of partitioning a digital image into multiple image segments, also known as image regions or image objects. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.

In computer vision or natural language processing, document layout analysis is the process of identifying and categorizing the regions of interest in the scanned image of a text document. A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order. Detection and labeling of the different zones as text body, illustrations, math symbols, and tables embedded in a document is called geometric layout analysis. But text zones play different logical roles inside the document and this kind of semantic labeling is the scope of the logical layout analysis.

Simple interactive object extraction (SIOX) is an algorithm for extracting foreground objects from color images and videos with very little user interaction. It has been implemented as "foreground selection" tool in the GIMP, as part of the tracer tool in Inkscape, and as function in ImageJ and Fiji (plug-in). Experimental implementations were also reported for Blender and Krita. Although the algorithm was originally designed for videos, virtually all implementations use SIOX primarily for still image segmentation. In fact, it is often said to be the current de facto standard for this task in the open-source world.

<span class="mw-page-title-main">Scale-space segmentation</span>

Scale-space segmentation or multi-scale segmentation is a general framework for signal and image segmentation, based on the computation of image descriptors at multiple scales of smoothing.

Connected-component labeling (CCL), connected-component analysis (CCA), blob extraction, region labeling, blob discovery, or region extraction is an algorithmic application of graph theory, where subsets of connected components are uniquely labeled based on a given heuristic. Connected-component labeling is not to be confused with segmentation.

<span class="mw-page-title-main">Livewire Segmentation Technique</span>

Livewire, is a segmentation technique which allows a user to select regions of interest to be extracted quickly and accurately, using simple mouse clicks. It is based on the lowest cost path algorithm, by Edsger W. Dijkstra. Firstly convolve the image with a Sobel filter to extract edges. Each pixel of the resulting image is a vertex of the graph and has edges going to the 4 pixels around it, as up, down, left, right. The edge costs are defined based on a cost function. In 1995, Eric N. Mortensen and William A. Barrett made some extension work on livewire segmentation tool, which is known as Intelligent Scissors.

As applied in the field of computer vision, graph cut optimization can be employed to efficiently solve a wide variety of low-level computer vision problems, such as image smoothing, the stereo correspondence problem, image segmentation, object co-segmentation, and many other computer vision problems that can be formulated in terms of energy minimization. Many of these energy minimization problems can be approximated by solving a maximum flow problem in a graph. Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the maximum a posteriori estimate of a solution. Although many computer vision algorithms involve cutting a graph, the term "graph cuts" is applied specifically to those models which employ a max-flow/min-cut optimization.

<span class="mw-page-title-main">Active contour model</span>

Active contour model, also called snakes, is a framework in computer vision introduced by Michael Kass, Andrew Witkin, and Demetri Terzopoulos for delineating an object outline from a possibly noisy 2D image. The snakes model is popular in computer vision, and snakes are widely used in applications like object tracking, shape recognition, segmentation, edge detection and stereo matching.

<span class="mw-page-title-main">Spectral clustering</span> Clustering methods

In multivariate statistics, spectral clustering techniques make use of the spectrum (eigenvalues) of the similarity matrix of the data to perform dimensionality reduction before clustering in fewer dimensions. The similarity matrix is provided as an input and consists of a quantitative assessment of the relative similarity of each pair of points in the dataset.

In computer vision, maximally stable extremal regions (MSER) are used as a method of blob detection in images. This technique was proposed by Matas et al. to find correspondences between image elements from two images with different viewpoints. This method of extracting a comprehensive number of corresponding image elements contributes to the wide-baseline matching, and it has led to better stereo matching and object recognition algorithms.

The image segmentation problem is concerned with partitioning an image into multiple regions according to some homogeneity criterion. This article is primarily concerned with graph theoretic approaches to image segmentation applying graph partitioning via minimum cut or maximum cut. Segmentation-based object categorization can be viewed as a specific case of spectral clustering applied to image segmentation.

Image segmentation strives to partition a digital image into regions of pixels with similar properties, e.g. homogeneity. The higher-level region representation simplifies image analysis tasks such as counting objects or detecting changes, because region attributes can be compared more readily than raw pixels.

The random walker algorithm is an algorithm for image segmentation. In the first description of the algorithm, a user interactively labels a small number of pixels with known labels, e.g., "object" and "background". The unlabeled pixels are each imagined to release a random walker, and the probability is computed that each pixel's random walker first arrives at a seed bearing each label, i.e., if a user places K seeds, each with a different label, then it is necessary to compute, for each pixel, the probability that a random walker leaving the pixel will first arrive at each seed. These probabilities may be determined analytically by solving a system of linear equations. After computing these probabilities for each pixel, the pixel is assigned to the label for which it is most likely to send a random walker. The image is modeled as a graph, in which each pixel corresponds to a node which is connected to neighboring pixels by edges, and the edges are weighted to reflect the similarity between the pixels. Therefore, the random walk occurs on the weighted graph.

In applied mathematics, lambda-connectedness deals with partial connectivity for a discrete space.

In the practice of digital image processing Alexandre X. Falcao, Jorge Stolfi, and Roberto de Alencar Lotufo have created and proven that the Image Foresting Transform (IFT) can be used as a time saver in processing 2-D, 3-D images, and moving images.

Graph cut optimization is a combinatorial optimization method applicable to a family of functions of discrete variables, named after the concept of cut in the theory of flow networks. Thanks to the max-flow min-cut theorem, determining the minimum cut over a graph representing a flow network is equivalent to computing the maximum flow over the network. Given a pseudo-Boolean function , if it is possible to construct a flow network with positive weights such that

Semi-global matching (SGM) is a computer vision algorithm for the estimation of a dense disparity map from a rectified stereo image pair, introduced in 2005 by Heiko Hirschmüller while working at the German Aerospace Center. Given its predictable run time, its favourable trade-off between quality of the results and computing time, and its suitability for fast parallel implementation in ASIC or FPGA, it has encountered wide adoption in real-time stereo vision applications such as robotics and advanced driver assistance systems.

Clément Farabet is a computer scientist and AI expert known for his contributions to the field of deep learning. He served as a research scientist at the New York University. He serves as the Vice President of Research at Google DeepMind and previously served as the VP of AI Infrastructure at NVIDIA.

References