Geschäftsstelle Schloss Dagstuhl Universität des Saarlandes Postfach 15 11 50 D-66041 Saarbrücken Germany e-mail: office@dag.uni-sb.de
Reinhard Klette, Franc Solina, Walter G. Kropatsch
Ruzena Bajcsy University of Pennsylvania Vision/Perception is NOT "l'art pour l'art"; that is, it is not for its own sake. Vision serves a PURPOSE/TASK. Typically we consider the following tasks: 1. Vision for Manipulation 2. Vision for Mobility 3. Vision for Recognition 4. Vision for Communication. Tasks (1, 2) are denoted as WHERE questions and tasks (3, 4) as WHAT questions. We consider the question of Representation to be the most important problem in Visual Perception. In turn, Representation implies selection and construction of MODELS. Models must exist on many different levels: 1. Sensory level: models of the transduction mechanism; models of radiometric effects that result from interaction between the observer, light and the scene; geometric model of the optics; 2. Signal level: filters, linear and non-linear; 3. Topological and geometric level of the objects and scene; 4. Material properties as they can be extracted from color and motion; 5. Kinematic properties, such as movable parts; 6. Identification of Dynamic Systems, such as fluids and flexible materials; 7. Models of Functionality. Open Problems: 1. What is observable from Vision only? If we can answer this question, it will imply what we must assume or measure in order to have a completely identifiable system. 2. The world is continuous with natural discontinuities. An open question is how to identify these discontinuities. This is the classical problem of SIGNAL to SYMBOL conversion. The added difficulty is that this conversion must not be static but must be able to change dynamically modulo task and context. 3. Biological systems are redundant, non-orthogonal, partially overlapping in their functionality, partially dependent and correlated. Engineering systems, on the other hand, are orthogonal, independent and uncorrelated. What is needed is a new calculus of non-orthogonal, partially dependent systems.
Tatjana Belikova Russian Academy of Sciences The task of extracting and locating objects on a complex background is considered. Several models of objects known up to their random parameters are proposed. They were used to develop linear filters that are optimal by the criteria of least mean square error and maximum signal-to-noise (S/N) ratio, in order to extract objects on the complex background or to improve the S/N ratio. The output of the latter filter was used to discriminate points at object locations. Parametric and nonparametric estimates of the signal values were used for this purpose. For parametric estimation we used maximum likelihood estimation to distinguish pixel values belonging to two different components that have different mean and deviation values. For nonparametric estimation we used an analysis of order statistics (rank-ordered local gray values) to estimate the mean value of each component and to reduce the deviation of the object and background signals. These methods were helpful for extracting and locating microcalcifications for early breast cancer diagnosis.
Dimitry Chetverikov Hungarian Academy of Sciences Recently, growing attention has been paid to the investigation of oriented (anisotropic) textures. This interest has been supported by the discovery of the important role played by a few dominant high-level texture features, including directionality, in the attentive perception of texture patterns by humans. Features based on the co-occurrence probability matrix (CPM) and the gray-level difference histogram (GLDH) have traditionally been viewed as powerful texture analysis tools that are, however, less suitable for detailed anisotropy analysis, because for small discrete spacings one cannot set the fine angular resolution necessary for detailed directionality analysis. We propose a straightforward and computationally efficient extension of CPM and GLDH to arbitrary angle and spacing and apply the extended GLDH features to the analysis of texture anisotropy. Furthermore, we consider the possibility of investigating the symmetry of a texture pattern via the symmetry properties of a polar diagram (anisotropy indicatrix) describing the anisotropy of the pattern. Results of pilot experiments with real-world textures are shown and directions of further research are discussed.
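The idea of evaluating a gray-level difference feature at an arbitrary angle and spacing can be sketched as follows. This is only an illustration under stated assumptions: bilinear interpolation is assumed as the way to reach non-integer displacements (the abstract does not say how the extension is done), the mean absolute difference is used as the single GLDH feature, and the names bilinear, mean_abs_difference and indicatrix are hypothetical.

```python
import math

def bilinear(img, x, y):
    """Sample img (a list of rows) at a real-valued position (x, y)."""
    h, w = len(img), len(img[0])
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def mean_abs_difference(img, angle_deg, spacing):
    """Mean absolute gray-level difference for one displacement vector.

    The displacement is (spacing*cos(angle), spacing*sin(angle)); pixels
    whose displaced partner falls outside the image are skipped.
    """
    h, w = len(img), len(img[0])
    dx = spacing * math.cos(math.radians(angle_deg))
    dy = spacing * math.sin(math.radians(angle_deg))
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            xs, ys = x + dx, y + dy
            if 0 <= xs <= w - 1 and 0 <= ys <= h - 1:
                total += abs(img[y][x] - bilinear(img, xs, ys))
                count += 1
    return total / count if count else 0.0

def indicatrix(img, spacing, n_angles):
    """Feature value as a function of angle (a polar diagram of anisotropy)."""
    angles = [360.0 * k / n_angles for k in range(n_angles)]
    return [(a, mean_abs_difference(img, a, spacing)) for a in angles]
```

For a pattern of vertical stripes the feature is large for horizontal displacements and close to zero for vertical ones, so the polar diagram is strongly elongated along one axis.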
Konstantinos Daniilidis Christian-Albrechts University Kiel Attentive vision encompasses selective sensing in space, time, and resolution. Decreasing space and time complexity by selection arises as a practical necessity in building vision systems able to sense and act in real time. Attention does not only mean the control of the degrees of freedom of the sensorial apparatus. It necessitates selection of the appropriate representation as well as of the proper state subspace in order to accomplish a specific task in a more efficient and robust manner. We do not discuss here how attention is achieved: what to select and how to design an oculomotor control loop. Our interest is in the benefits of attention for accomplishing a motion-related task. We concentrate on two aspects of attention regarding motion: fixation and space-variant polar and log-polar representation. Overcoming the limited field of view and bounding the retinal velocity of a moving object are obvious advantages of holding the gaze fixated on a moving object. We show that fixation enables an object-centered representation for the solution of the structure from motion problem. This representation stabilizes the estimation of lateral object translation, which is confounded with the rotation in a camera-centered representation. Furthermore, fixation enables the use of scaled orthography for a distant object, thus leading to an affine motion field. Building upon existing methods, we show how the direction of translation can be obtained from the oculomotor control inputs (camera rotation), which is supported by theories on efference copy and positive feedback. The introduction of the log-polar representation decouples the translation along the optical axis from the rotation about it. We show, in contrast to existing results, that an already known function of the local motion parallax depends on the local slope of the surface.
Furthermore, it turns out that the advantages regarding motion estimation are not in the logarithmic but in the polar nature of the space-variant representation. However, a log-polar transformation of the motion field facilitates independent motion detection if the observer is frontally translating.
Ulrich Eckhardt*, Longin Latecki* and Albrecht Hübler** * University Hamburg; ** Wolfsburg There are mainly three reasons for dealing with thin subsets of the digital plane Z^2: - Such sets are generated by algorithms for thinning binary images, - Thin sets are discrete analogs of curves in the plane, - In order to understand 3-D structures and algorithms it becomes necessary to critically revisit the known 2-D theory. First we classify digital sets which are considered to be "thin" in some sense. There are 8- and 4-curves, contours (oriented boundaries of digital sets) and so-called graph sets. It could be shown quite recently (Latecki, Eckhardt, Rosenfeld, 1994) that under rather mild conditions each digital set can be reduced to a topologically equivalent graph set. It is also attractive to investigate families of thin sets. This is important for studying digital analogs of circles (or, equivalently, rotations of the digital plane), for defining niveau lines in gray-scale pictures and for morphological operations. Specifically, one may ask under which conditions erosion is the inverse operation to dilation. One result of these investigations is a complete classification of simple and of nonsingular coverings of the digital plane by 8- (or 4-) curves and also a classification of singular points with respect to morphological operations (Eckhardt, Hübler, 1993). These latter points lead to a "morphological skeleton" of a digital set which has the property of exact reconstructability but generally does not have the same topological properties as the original set.
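The question of when erosion undoes dilation can be made concrete with a small sketch on finite subsets of Z^2. The code below is only an illustration, not the classification from the talk: sets are represented as Python sets of (x, y) tuples, and the names dilate, erode and SQUARE are hypothetical.

```python
def dilate(points, element):
    """Minkowski sum of a digital set with a structuring element."""
    return {(x + dx, y + dy) for (x, y) in points for (dx, dy) in element}

def erode(points, element):
    """All positions p such that p + e lies in the set for every e in element."""
    # Any such p must have the form q - e with q in the set, so it is enough
    # to test these candidates.
    candidates = {(x - dx, y - dy) for (x, y) in points for (dx, dy) in element}
    return {(x, y) for (x, y) in candidates
            if all((x + dx, y + dy) in points for (dx, dy) in element)}

# 3x3 square structuring element (a pixel together with its 8-neighbors).
SQUARE = {(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)}

single = {(0, 0)}
pair = {(0, 0), (2, 0)}

# Erosion exactly inverts dilation for the single point ...
print(erode(dilate(single, SQUARE), SQUARE) == single)
# ... but for two points at distance 2 the gap between them is filled in.
print(sorted(erode(dilate(pair, SQUARE), SQUARE)))
```

Erosion after dilation always contains the original set; the second case shows that the containment can be strict, which is why conditions on the set are needed.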
Jan Flusser, Tomas Suk and Stanislav Saic Academy of Sciences of the Czech Republic The paper is devoted to the feature-based recognition of blurred images, acquired by a linear shift-invariant imaging system, against an image database. The proposed approach consists of describing images by features which are invariant with respect to blur (that is, with respect to the system PSF) and recognizing images in the feature space. In contrast to the complicated and time-consuming "blind-restoration" approach, we need neither PSF identification nor image restoration. Thanks to this, our approach is much more efficient. Two sets of invariants based on image moments are introduced in this paper: one set for symmetric blur, the other one for linear motion blur. The derivation of the invariants is a major theoretical result of the paper.
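Why moments can yield blur invariants may be easier to see in a one-dimensional toy case. The sketch below is not the set of invariants derived in the paper; it only checks the elementary fact that the third central moment of a signal is unchanged by convolution with a symmetric kernel whose weights sum to one, while the second central moment is not. The function names and the sample data are assumptions made for illustration.

```python
def convolve(signal, kernel):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def central_moment(signal, order):
    """Central moment of a sequence, taken about its centroid (not mass-normalized)."""
    mass = sum(signal)
    centroid = sum(i * v for i, v in enumerate(signal)) / mass
    return sum((i - centroid) ** order * v for i, v in enumerate(signal))

signal = [1.0, 4.0, 2.0, 0.0, 3.0, 7.0]
psf = [0.25, 0.5, 0.25]          # symmetric, sums to one
blurred = convolve(signal, psf)

# The third central moment survives the blur; the second one grows.
print(central_moment(signal, 3), central_moment(blurred, 3))
print(central_moment(signal, 2), central_moment(blurred, 2))
```

A feature built from such a quantity can be compared between a sharp database image and a blurred query without estimating the PSF.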
3D Scene reconstruction using a regional Approach
Andre Gagalowicz INRIA-Rocquencourt We discuss the problem of 3D indoor scene interpretation from an a priori given stereo pair of images. We stress the importance of the existence of an a priori given model of the 3D space and only study the case of a global model of this space. The proposed method consists in the use of a cooperative analysis/synthesis technique: an analysis (vision) task proposes a complete 3D model of the 3D scene incorporating geometric and photometric information. A synthesis algorithm is run afterwards, using the complete 3D model as input, and produces a synthetic stereo pair of the portion of this model possibly seen by the left and right cameras. The difference between the natural and synthetic stereo pairs is used to produce a better "complete" model. We first consider the "learning" phase, in which we incorporate a model to interactively and visually construct and control the global model of the 3D space. In the analysis part, we discuss the construction of a pipeline involving image segmentation, region matching, stereo reconstruction, and geometric and photometric interpretation of the "scene", leading to the construction of a "good" complete model of the part of the 3D scene available in the stereo pair. An extension to the case of a local vision problem involving an active procedure is briefly discussed as a conclusion.
Symmetric Bi- and Trinocular Stereo
Georgy Gimel'farb Academy of Sciences of the Ukraine Tradeoffs between the theoretically justified and the heuristic sides of the symmetric approach to intensity-based computational stereo are discussed. Under this approach a desired continuous optical surface is reconstructed from given stereo images as a bunch of epipolar profiles, each profile being obtained by maximizing a measure of similarity between intensities in the images and the ortho-image (estimated coloring) of the reconstructed surface points using dynamic programming (DP) techniques. In our previous papers this measure was deduced primarily under a simple Bayesian maximum a posteriori probability (MAP) decision using probability models relating the profile coloring to the corresponding intensities, with due account of symmetries between the stereo images, independent allowable distortions of the images, possible discontinuities in each image because of partial occlusions of the surface, etc. Computational stereo belongs to the domain of ill-posed inverse photometric problems because, in principle, many surfaces can produce the same stereo pair or triple of images. So it is impossible to reconstruct precisely the real surface which has given the obtained stereo images. Nonetheless, some theoretical models and heuristics can be introduced to bring the reconstructed surface close enough to the one perceived visually from the sole stereo pair or triple (or, what is the same, to approach human vision accuracy under this very restrictive condition). The theoretical base of computational stereo can be refined by modeling the profile geometry to describe more or less probable surface shapes, by deducing compound Bayesian decisions that are more adequate for solving stereo problems than the traditional MAP decision and are realized by similar DP techniques, and by introducing a unified scheme of symmetric bi- and trinocular stereo.
But to cope with discontinuities in the images, some suitable heuristics for estimating the coloring in the monocularly visible points of the surface and for defining signal similarity for them are necessary. Rather good experimental results for real stereo pairs have been obtained with the similarity measure being a weighted linear combination of two similar measures: one between the intensities in the stereo images and the estimated surface coloring, and one between the two rectified stereo images themselves.
Decision Algorithms for Model-Based Vision Problems
Gregory Hager Yale University Many vision problems reduce to the problem of making a decision expressed as inequality constraints on an appropriate parametric model. This talk formalizes this class of problems and then presents an algorithm that is correct and complete for them. This algorithm is then extended to cover problems involving segmentation and also to address unstructured problems. Finally, it is observed that the use of low-level spatial organization processes is crucial for the effective use of these algorithms.
Improvement of the Curvature Computation
Vaclav Hlavac, Tomas Pajdla, Milos Sommer Czech Technical University An improvement of curvature computation for digitized curves was presented. The standard scheme, i.e. computing curvature by convolution with a truncated Gaussian, was studied. First, we show that the systematic bias caused by curvature smoothing can be removed. Second, we demonstrate that about 25 % of the error has its roots in other phenomena (i.e. anisotropy of the raster, limited size of the Gaussian, numerical integration of the convolution, and discretization).
Information technologies for image processing in real-time
Volodymyr Hrytsyk Academy of Sciences of the Ukraine An important unsolved problem in the analysis of complex scenes with motion is real-time implementation. An approach to solving the problem, based on mathematical models, a method of fast feature calculation, and control, is proposed for dynamic images and complex scenes. Theorems are given which determine a constructive method for the synthesis of neuron-like and systolic computing structures, allowing the real-time implementation of recursive and parallel algorithms for image processing and scene analysis. A high efficiency of recursive-parallel systems for image processing is demonstrated.
Computation of mosaic images using an approximate 3D model
Pascal Jaillon, Annick Montanvert TIMC-IMAG Grenoble Image mosaicking consists in fusing images acquired from different places to build a global view of a scene. Diverse techniques provide mosaic images for satellite applications or painting reconstructions. We propose a method to mosaic images lying on three-dimensional surfaces, avoiding the computation of a 3D model of the surface. A coarse model of the surface and the parameters of the projection (acquisition view point, optical axis) are used to flatten the images. Then the images are merged with a 2D mosaicking technique. Finally, the resulting 2D image is mapped onto the evaluation of the 3D surface. This allows visualization from any view point. Such an approach can take the perspective distortion into account, so that discontinuities along the junction line are reduced. Depending on the application, we propose to apply corrections to the original images or to Laplacian images. This mosaicking strategy is applied to satellite images, to paintings on vaults, and in microscopy.
Shape Reconstruction for Central Projection
Reinhard Klette Technical University Berlin The talk deals with the geometric models used for shape reconstruction where a rotating disc is used in front of a pinhole camera. For camera calibration the method by [Tsai 1986] was implemented and optimized (e.g. with respect to the number of calibration planes and points in each plane). It is described how to use the calibration results for shape reconstruction for the case of objects on the rotating disc where motion vectors are used as input. Three approaches were studied. For the point-based approach, the computation of accurate dense motion fields is the critical issue. An extensive evaluation of differential methods for optical flow computation was performed. On the other hand, shape reconstruction based on (assumed) accurate optical flow fields could be realized very precisely. For the feature-based approach (e.g. edges by the Canny operator), the epipolar constraint of stereo cameras was modified for the rotating disc. Depth may be reconstructed at traced feature points, and even the rotation angle may be calculated by tracing one (!) point in two consecutive images using the calibration results. For the region-based approach, integrative constraints for shape reconstruction proved to be numerically quite unstable. Several theoretical results (e.g. shape from area and centroids of corresponding regions) were derived for future implementation.
Color Vision for Stereo Correspondence
Andreas Koschan Technical University Berlin Problem solving in digital image processing without color information is sometimes difficult or even impossible, as for example in the following cases: highlight detection, correspondence analysis in stereo images, image segmentation, etc. On the other hand, the necessity for color research often arises directly from the application (e.g., identification of color codes on resistors, food analysis, traffic sign recognition, etc.). In this paper it is shown that stereo matching results can be considerably improved when using color information. The Block Matching technique has been extended to the so-called Chromatic Block Matching technique because of its efficiency already shown for gray value images. Furthermore, it has been shown that results can be further improved when employing the I1I2I3 color space instead of the RGB solid. No significant influence has been found yet between the color measures and the results. In summary, we believe that precise dense depth maps can be obtained more easily when applying this Chromatic Block Matching technique to color stereo images.
Optimal Statistical Filtering of gray-level Images
Vladimir Kovalevsky Technische Fachhochschule Berlin A method for eliminating random noise is suggested which works as follows. The distribution of the gray values in a sliding window is approximated by up to four normal distributions. The parameters of the distributions and the probability P_k(g) that a gray value g belongs to distribution k are estimated by an iteration method suggested by Schlesinger many years ago. Then the gray value of the central pixel of the window in the output image is set equal to the mean value of one of the normal distributions. This distribution is selected according to the maximum a posteriori probability P_k(g). The results of the filtering are compared with those of the sigma-filter.
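The filtering step can be sketched for a single window as follows. This is a minimal illustration under stated assumptions: a plain expectation-maximization style iteration stands in for Schlesinger's method, whose details the abstract does not give; the number of components is fixed by the list of initial means; and the names normal_pdf, fit_mixture and filter_center as well as the variance floor min_var are hypothetical.

```python
import math

def normal_pdf(g, mean, var):
    return math.exp(-(g - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_mixture(values, initial_means, iterations=30, min_var=1.0):
    """Fit a mixture of normal distributions to the gray values of one window."""
    k = len(initial_means)
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((g - overall_mean) ** 2 for g in values) / n
    means = list(initial_means)
    variances = [max(min_var, overall_var)] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # Posterior probability of each component for every gray value.
        resp = []
        for g in values:
            p = [weights[j] * normal_pdf(g, means[j], variances[j]) for j in range(k)]
            total = sum(p) or 1e-300          # guard against underflow
            resp.append([pj / total for pj in p])
        # Re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue                      # empty component: leave unchanged
            means[j] = sum(r[j] * g for r, g in zip(resp, values)) / nj
            var = sum(r[j] * (g - means[j]) ** 2 for r, g in zip(resp, values)) / nj
            variances[j] = max(min_var, var)
            weights[j] = nj / n
    return weights, means, variances

def filter_center(values, center, initial_means):
    """Output value for the central pixel: mean of its most probable component."""
    weights, means, variances = fit_mixture(values, initial_means)
    post = [w * normal_pdf(center, m, v) for w, m, v in zip(weights, means, variances)]
    best = max(range(len(post)), key=post.__getitem__)
    return means[best]

window = [10, 12, 11, 9, 10, 11, 200, 198, 202, 201, 199]
print(filter_center(window, 202, [0.0, 255.0]))   # close to the bright mean
print(filter_center(window, 9, [0.0, 255.0]))     # close to the dark mean
```

Because the output is the mean of one component rather than of the whole window, a window that straddles an edge does not blur the two sides together.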
Properties of Pyramidal Representations
Walter G. Kropatsch Technical University of Vienna The categorization of different components generalizes the classical concept of image pyramids and provides a powerful tool for efficient image analysis. Three different components of image pyramids are distinguished: their structure, the contents of their cells and the processes that operate on them. Different applications impose different requirements on the processing of the data. There are several engineering decisions to be made. The properties of the three different components of a pyramidal system are discussed and illustrated by examples. New results and research trends give an overview of the current state of the art.
Robust Recovery of Structures in Images
Ales Leonardis University of Ljubljana The significance of detecting geometric parametric structures has long been realized in the vision community. In this paper, a reliable and efficient method for extracting geometric parametric structures is presented. The method consists of two intertwined procedures, namely model recovery and model selection. The first procedure systematically recovers parametric models in an image, creating a redundant set of possible descriptions, while the model-selection procedure searches among them to produce an optimal result in terms of the objective function. The reliability of the recovery procedure which builds the parametric models is ensured by an iterative procedure that performs data classification and parameter estimation simultaneously. The overall relative insensitivity to noise and minor changes in the input data is achieved by considering many competing solutions and selecting those that produce the simplest description. The selection procedure is defined as a quadratic Boolean problem, and the solution is sought by the WTA (winner-takes-all) technique. The presented method is efficient for two reasons: firstly, it is designed as a search which utilizes intermediate results as guidance toward the final result, and secondly, it combines model recovery and model selection in a computationally efficient procedure. The proposed method proved to be successful for recovering parametric surface models and volumetric models (superquadrics) in range images and parametric curve models in edge images.
A structure-probabilistic approach to edge detection and adaptive filtering
Roman M. Palenichka Academy of Sciences of the Ukraine An edge detection operator can be efficiently used to control image segmentation and filtering. For this purpose an approach based on direct estimation of the edge probability is proposed. The approach considers a two-stage detection procedure. At the first stage an image segment is tested for uniformity by evaluating the probability of a uniform (smooth) segment. If this segment is non-smooth, a second test is applied, during which the edge probability is evaluated. The edge position can be selected as the point of maximal computed probability in a given neighborhood. This method can be successfully used for binary segmentation as a thresholding procedure with a floating threshold. The updated threshold value at each point depends on its value at the previous point as well as on the edge probability at this point. This approach is based on a structural mathematical image model composed of two components. The first one is the intensity trend and the second component represents fluctuations. For fast implementation of this method, a fast recursive algorithm is proposed to calculate such local image features as mean value, median and variance.
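The recursive computation of local features can be illustrated in one dimension. The sketch below is an assumption about what such a recursion may look like, not the algorithm of the talk: it updates running sums when the window moves by one sample, covers only mean and variance (the median is not handled), and the function name is hypothetical.

```python
def sliding_mean_variance(signal, width):
    """Mean and variance of every window of the given width.

    The sums of values and of squared values are updated recursively:
    one sample enters and one leaves per step, so the cost per window
    does not depend on the window width.
    """
    total = sum(signal[:width])
    total_sq = sum(v * v for v in signal[:width])
    results = []
    for start in range(len(signal) - width + 1):
        if start > 0:
            leaving = signal[start - 1]
            entering = signal[start + width - 1]
            total += entering - leaving
            total_sq += entering * entering - leaving * leaving
        mean = total / width
        results.append((mean, total_sq / width - mean * mean))
    return results

for mean, variance in sliding_mean_variance([1, 2, 3, 4, 10], 3):
    print(mean, variance)
```

In two dimensions the same idea is usually applied first along rows and then along columns of the running sums.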
An Incremental Learning System for Interpretation of Images
Petra Perner*, Walter Pätzold** *HWTK Leipzig, **KWT Dresden Defect classification by image-based techniques is an important issue in quality assurance and nondestructive testing. The solution of the problem is usually complex and context dependent. A domain-specific interpretation of the problem is required. Thus, the acquisition, representation and use of problem-specific knowledge in combination with image processing facilities is a central point. The main problem in defect classification arises because generalized knowledge is mostly lacking. Therefore knowledge-based techniques are necessary which can work on single instances of the problem domain and learn new instances during the use of the system. This leads to the case-based reasoning paradigm. In the paper, we propose an architecture for a case-based reasoning system for image interpretation. We discuss our approach on the problem domain of ultrasonic image interpretation. The application is characterized by structural representation. The signal-to-symbol transformation for spatial knowledge is described as well as the case representation. For the determination of similarity between two structural representations we propose structural similarity and describe the algorithm for calculating it. For the case base, we chose to use a hierarchical representation. Therefore, we developed an algorithm which can incrementally learn this hierarchical representation.
Computer Vision and Mathematical Morphology
Jos Roerdink University of Groningen An extension of mathematical morphology is investigated incorporating symmetry and invariance concepts essential for computer vision applications. Classical morphology uses image transformations which are translation invariant. In many applications other forms of symmetry are involved, for example polar symmetry or invariance under perspective transformations. We extend morphology to such cases by considering any homogeneous space (G, X), where X is a set and G a group acting transitively on X. Morphological transformations can then be constructed as mappings of the Boolean algebra P(X) (the power set of X) to itself which are invariant under the group G. Examples are the plane with the Euclidean motion group or the sphere with the rotation group. We discuss how this might be used for introducing projective invariance in morphology. Finally, a sketch is given of a preliminary attempt at a morphological description of 3D surfaces using concepts from differential geometry.
Globally convergent nonlinear diffusion networks for early vision
Christoph Schnörr, Rainer Sprengel University of Hamburg A class of minimization problems is considered to model nonlinear, transition-preserving data-reduction processes for early vision. These problems are non-discretely formulated and always have a unique minimizing solution that depends continuously on the data. Approximate solutions based on the Galerkin method converge as the discretization parameter tends to zero. The computation of approximate solutions can be done by a globally convergent and highly parallel relaxation procedure or, in principle, by a globally asymptotically stable analog network. The relationship of a prototype minimization problem to the nonlinear diffusion approach of Perona and Malik and its variational formulation due to Nordström is discussed. It is shown that the parametrization of our diffusion coefficient can be used to control the trade-off between smoothing and preserving data transitions. The stability of the localization of data transitions in parameter space is demonstrated, and a criterion to select these transitions is presented. Finally, the application of the general principle to locally computed motion data is considered, and two corresponding functionals and corresponding numerical results are discussed.
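For orientation, the Perona and Malik type of diffusion mentioned above can be sketched in one dimension. The code is not the variational method of the talk; it is a plain explicit scheme with the diffusivity 1 / (1 + (gradient / contrast)^2), where the parameter names contrast and step, the boundary handling (no flux across the ends) and the sample signal are assumptions made for illustration.

```python
def diffusivity(gradient, contrast):
    """Close to 1 for small gradients, close to 0 across strong transitions."""
    return 1.0 / (1.0 + (gradient / contrast) ** 2)

def diffuse(signal, contrast, step=0.25, iterations=100):
    """Explicit nonlinear diffusion of a 1-D signal with no-flux boundaries."""
    u = [float(v) for v in signal]
    for _ in range(iterations):
        # Flux between neighboring samples i and i + 1.
        flux = [diffusivity(u[i + 1] - u[i], contrast) * (u[i + 1] - u[i])
                for i in range(len(u) - 1)]
        updated = []
        for i in range(len(u)):
            right = flux[i] if i < len(u) - 1 else 0.0
            left = flux[i - 1] if i > 0 else 0.0
            updated.append(u[i] + step * (right - left))
        u = updated
    return u

noisy_step = [0, 1, 0, 1, 0, 1, 100, 101, 100, 101, 100, 101]
smoothed = diffuse(noisy_step, contrast=5.0)
print([round(v, 2) for v in smoothed])
```

The small oscillations on each side are flattened, while the large transition in the middle is kept almost intact; the contrast parameter sets which differences count as transitions. With a step size of at most 0.5 the scheme keeps all values within the range of the input, and the sum of the samples is preserved.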
Banach Constructor and Image Compression
Wladyslaw Skarbak Polish Academy of Sciences The Banach constructor is defined as a concept unifying special cases of deterministic fractal modeling. The fractal compression of digital images is presented as a Banach constructor defined by a patchwork. The patchwork concept is a formal mathematical model which allows for a compact definition of the fractal operator, the specification of a condition for its contractivity (for all u norms, 1 <= u <= infinity), and the formulation of conditions ensuring the required fidelity of the reconstructed image. The fast fractal compression algorithm (FFC) is based on patchworks which are affine (with contrast and scaling fixed), sparse, and local. While the known fractal compression schemes (Jacquin, Jacobs et al., Barnsley) require an encoding time 100-1000 times greater than the decoding time, FFC gives high-quality images of natural scenes with this ratio not exceeding 10. Formulas for the affine, contrast-fixed transforms which perform the best fit of two digital patches are given for Minkowski u norms with u = 1, u = 2, and u = infinity. Experiments confirm the superiority of the quadratic norm in the quality-time tradeoff.
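A small sketch may help to see what a best fit with fixed contrast looks like in the three norms. It relies only on standard facts and is not taken from the talk: with the contrast fixed, the only free parameter is the gray-level offset, and the offset minimizing the u-norm of the residual is the median of the differences for u = 1, their mean for u = 2, and the midpoint of their range for u = infinity. The function name and the sample patches are hypothetical.

```python
import math
import statistics

def best_offset(source, target, contrast, norm):
    """Offset o minimizing || target - (contrast * source + o) || in the given norm."""
    differences = [t - contrast * s for s, t in zip(source, target)]
    if norm == 1:
        return statistics.median(differences)
    if norm == 2:
        return statistics.mean(differences)
    if norm == math.inf:
        return (min(differences) + max(differences)) / 2.0
    raise ValueError("only the norms 1, 2 and infinity are handled")

source = [0, 4, 8, 12]
target = [10, 13, 16, 40]     # the last value is an outlier
for u in (1, 2, math.inf):
    print(u, best_offset(source, target, contrast=0.75, norm=u))
```

The example shows how differently the norms react to a single outlying pixel: the median ignores it, the mean is pulled towards it, and the midrange is dominated by it.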
On Separability Problems in Computational Geometry and their Applications
F. Sloboda, B. Zatko Slovak Academy of Sciences Properties of the external and internal shortest paths of a simple polygon were described. Further properties of the shortest path in a polygonally bounded compact set were described, and the separability problem of two disjoint polygons was investigated.
Segmentation with Volumetric Models
Franc Solina University of Ljubljana A new approach to reliable and efficient recovery of part-level descriptions from range images is presented. It is shown that a set of superquadric volumetric models can be directly recovered from unsegmented range data. Superquadric models are an extension of ellipsoids that cover a continuum of shapes including parallelepipeds and cylinders as well. The approach is based on the recover-and-select paradigm by Leonardis that consists of two intertwined processes: model recovery and model selection. In the model recovery process a redundant set of superquadrics is initiated in the image and allowed to grow. Recovered models are selected using an MDL-like criterion which results in the simplest overall description.
Steerable Filters for Attentive Visual Processing
Gerald Sommer* and Markus Michaelis** *Christian-Albrechts-Universität Kiel **GSF-MEDIS-Institut, Neuherberg Junctions of lines or edges are important visual cues in various fields of computer vision. They are characterized by the existence of more than one orientation at one single point, the so-called keypoint. In this work we investigate the performance of highly orientation-selective functions to detect multiple orientations and to characterize junctions. A quadrature pair of functions is used to detect lines as well as edges and to distinguish between them. An associated one-sided function with an angular periodicity of 360° can distinguish between terminating and non-terminating lines and edges which constitute the junctions. To calculate the response of these functions in a continuum of orientations and scales, a method is used that was introduced recently by P. Perona [8]. These functions are called steerable filters. They seem to be the natural kind of operators which can be adapted in their degrees of freedom in the attention stage under the control of a task. It is shown that their response to local structures can be used to make the recognized structures explicit. This is the way for knowledge-based computer vision. In behavior-based systems (attentive systems) these responses are used as implicit representations which constitute the input to associative memories that fuse several local hints into a global recognition of structure.
Theoretical Foundations of Anisotropic Diffusion in Image Processing
J. Weickert Universität Kaiserslautern A frequent problem in low-level vision consists of eliminating noise and small-scale details from an image while still preserving or even enhancing the edge structure. Nonlinear anisotropic diffusion filtering using an adapted diffusion tensor offers one possibility to achieve these goals. We sketch the essential ideas of this technique and demonstrate its advantages compared to isotropic and nonlinear diffusion. Although exhibiting an edge-enhancing potential, the proposed method provides a scale-space fulfilling several architectural, information-reducing and invariance properties. Furthermore, it leads to well-pronounced edges with stable locations across a wide range of scales. It is shown that most of the restoration and scale-space properties carry over from the continuous to the discrete case. Applications are presented ranging from preprocessing of medical images and postprocessing of numerical results containing fluctuations to visualizing quality-relevant features for the grading of wood surfaces and fleece.
Stability and Likelihood of Views of Three Dimensional Objects
Daphna Weinshall, Michael Werman and Naftali Tishby The Hebrew University of Jerusalem Can we say anything general about the distribution of two-dimensional views of general three-dimensional objects? In this paper we present a first formal analysis of the stability and likelihood of two-dimensional views (under weak perspective projection) of three-dimensional objects. This analysis is useful for various aspects of object recognition and database indexing. Examples are Bayesian recognition and image interpretation; indexing to a three-dimensional database by invariants of two-dimensional images; the selection of "good" templates that may reduce the complexity of correspondence between images and three-dimensional objects; and ambiguity resolution using generic views. We show the following results: (1) Neither the stability nor the likelihood of views depends on the particular distribution of points inside the object; they both depend on only three numbers, the three second moments of the object. (2) The most stable and the most likely views are the same view, which is the "flattest" view of the object; moreover, there is no other view which is even locally the most stable or the most likely view. Under orthographic projection, we also show: (3) the distance between one image and another does not depend on the position of its viewpoint with respect to the object; it depends only on the (geodesic) distance between the viewpoints on the viewing sphere. We demonstrate these results with real and simulated data.
Model-free texture segmentation based on distances between first-order statistics
Piero Zamperoni Technische Universität Braunschweig This contribution focuses on the key role of the gray value distribution (first-order statistics), taken as a whole, for characterizing homogeneous regions of an image and for detecting region borders for image segmentation purposes. Where the discriminating power of the first-order statistics is not sufficient, information on the spatial relationships of the gray values must also be extracted from the second-order statistics. The aim of this study is to develop efficient methods for measuring a "degree of diversity" between pairs of symmetrical and equal-sized subwindows of the observation window, centered on the current pixel P. The maximum degree of diversity, measured among 4 subwindow pairs with different orientations, is the edge value attributed to P. Repeating this procedure for all the pixels, one obtains an edge map, which represents the first and the most critical step in a segmentation process. For measuring the "degree of diversity", several approaches have been investigated, all based upon well-known pattern recognition and statistical methods, as for instance: - Distances (Minkowski, Canberra, Tanimoto distance, scalar product, and other specially developed distances) between the vectors of the rank-ordered gray values of the two subwindows; - Distances (Kolmogorov, Bhattacharyya, Patrick-Fischer) between the estimated gray value density functions of the two subwindows; - Measures of cluster concentration in the gray value space and in the 2-dimensional space obtained by considering also a spatial distribution feature; - Nonparametric Wilcoxon-type two-population tests; - Distances between the estimated locations and between the estimated scales in the paired subwindows. Experimental results obtained with all the approaches mentioned above, on natural images featuring problematic textures (remote sensing, radar, ultrasonic, nuclear medicine) and on synthetic images, are shown and illustrated.
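The edge value described above can be sketched for one of the listed measures. The code makes several assumptions that the abstract leaves open: the observation window is a square of the given radius, the four subwindow pairs are obtained by splitting it along the horizontal, vertical and two diagonal directions, and the degree of diversity is the city-block (Minkowski order 1) distance between the rank-ordered gray values, divided by the subwindow size. The function name is hypothetical.

```python
def edge_value(img, x, y, radius=2):
    """Maximum diversity among four pairs of symmetric subwindows around (x, y).

    img is a list of rows; (x, y) must lie at least `radius` pixels
    away from the image border.
    """
    offsets = [(dx, dy)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    # Each function is negative on one subwindow and positive on the other;
    # offsets where it is zero lie on the dividing line and are left out.
    splits = [lambda dx, dy: dx,
              lambda dx, dy: dy,
              lambda dx, dy: dx + dy,
              lambda dx, dy: dx - dy]
    best = 0.0
    for side in splits:
        first = sorted(img[y + dy][x + dx] for dx, dy in offsets if side(dx, dy) < 0)
        second = sorted(img[y + dy][x + dx] for dx, dy in offsets if side(dx, dy) > 0)
        distance = sum(abs(a - b) for a, b in zip(first, second)) / len(first)
        best = max(best, distance)
    return best

step_image = [[0 if col < 3 else 100 for col in range(7)] for _ in range(7)]
print(edge_value(step_image, 3, 3))
```

Because whole rank-ordered vectors are compared rather than means, two regions with the same mean but different gray value distributions still produce a nonzero edge value.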
CITR: last update: 22 April 1998