Model-Based Clustering for Image Segmentation and Large Datasets Via Sampling

Ron Wehrens and Lutgarde M.C. Buydens
University of Nijmegen

Chris Fraley and Adrian E. Raftery
University of Washington

Technical Report no. 424
Department of Statistics
University of Washington

February 13, 2003

Ron Wehrens is associate professor and Lutgarde M.C. Buydens is professor, both at the Department of Analytical Chemistry, University of Nijmegen, Toernooiveld 1, 6525 ED Nijmegen, Netherlands. Email: rwehrens|lbuydens@sci.kun.nl; web: www.sci.kun.nl/cac. Chris Fraley is a research staff member and Adrian E. Raftery is Professor of Statistics and Sociology, both at the Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195-4322; Email: fraley|raftery@stat.washington.edu; Web: www.stat.washington.edu/fraley|raftery. The research of Fraley and Raftery was supported by NIH grant 1R01CA094212-01, and by the DoD Multidisciplinary University Research Initiative (MURI) program administered by the Office of Naval Research under Grant N00014-01-10745.

Abstract

The rapid increase in the size of data sets makes clustering all the more important to capture and summarize the information, at the same time making clustering more difficult to accomplish. If model-based clustering is applied directly to a large data set, it can be too slow for practical application. A simple and common approach is to first cluster a random sample of moderate size, and then use the clustering model found in this way to classify the remainder of the objects. We show that, in its simplest form, this method may lead to unstable results. Our experiments suggest that a stable method with better performance can be obtained with two straightforward modifications to the simple sampling method: several tentative models are identified from the sample instead of just one, and several EM steps are used rather than just one E step to classify the full data set. We find that there are significant gains from increasing the size of the sample up to about 2,000, but not from further increases. These conclusions are based on the application of several alternative strategies to the segmentation of three different multispectral images, and to several simulated data sets.

Keywords: EM algorithm; MRI image; Remote sensing; Sampling

Contents

1 Introduction
2 Model-Based Clustering
3 Sampling Methods
4 Data and Simulations
  4.1 Assessment of Results
  4.2 Data
  4.3 Simulation Design
  4.4 Software and Hardware
5 Results
  5.1 Stability of Model Selection
  5.2 Stability of Segmentations
  5.3 Accuracy
  5.4 Timings
6 Discussion

List of Tables

1 Most frequently selected models in strategy I. The columns indicate different sample sizes. Numbers in brackets indicate how often a particular model was selected. The true model for Simul12 and Simul12N is VEV12; the true model for Simul6 and Simul6N is VEV6. Simul12N and Simul6N contain 1500 noise points (5 percent).
2 The most frequently selected models in strategy III. See caption of Table 1.
3 The most frequently selected models in strategy IV. The range of models considered is much smaller than with the other strategies (see text).
4 Approximate timings (MRI data set) for the different strategies (minutes, user time, Pentium III 1 GHz processor). In this table, only the three most elaborate models (EEV, VEV and VVV) are considered with 4-14 clusters. The numbers cited are means of ten repeated clusterings.

List of Figures

1 Segmentations based on clustering of samples of 500 pixels of a set of four congruent MRI images of a patient with a brain tumour. Colors are permuted to give maximal visual similarity.
2 Strategies to apply model-based clustering for large data sets. The dashed box contains operations on the sample only. INIT: initialization by model-based hierarchical clustering; EM: application of the EM algorithm to find cluster parameters and classifications (max. 100 steps); MS: model selection; EM1: one iteration of the EM algorithm to classify pixels not in the sample.
3 The three data sets, plotted in order of increasing size. The T1-weighted MRI image of a patient with a brain cancer behind the left eye (left), an image of a St. Paulia (middle), and a false-color image of the remote sensing data of the Duursche Waarden (right).
4 Adjusted Rand indices for the comparison of segmented images from Figure 1: 1 vs. 2; 3 vs. 2; and 7 vs. 6. Colors are based on the segmentation of images 2, 2 and 6, respectively. Each line corresponds to at most 100 pixels.
5 Agreements between clusterings from ten different samples, indicated by mean values of the adjusted Rand index. One standard deviation above and below the mean value are also shown.
6 Classification agreements for the real data sets, as measured by mean adjusted Rand indices. The fraction of pixels taken into account is governed by the uncertainty of the classification (y-axis): the label 0.05, e.g., means that only pixels that are classified with an uncertainty smaller than 0.05 in all ten replicated segmentations are taken into account. Sample size is depicted on the x-axis.
7 Classification agreements for the simulated data sets, as measured by mean adjusted Rand indices. See caption of Figure 6.
8 Stable classifications with adjusted Rand index greater than 0.9 (strategy II): for the three images, the uncertainty thresholds are .35, .2 and .5, respectively.
9 Loglikelihoods of the final model selections. Means are plotted for the ten repeated samples; plus or minus one standard deviation is given as well. The gray lines in the simulated data sets show the loglikelihood of the "true" solution.
10 Agreement with "true" class labels for the simulated data sets (adjusted Rand index).
11 Agreement with "true" class labels (adjusted Rand index), dependent on the certainty of the classification.

1 Introduction

Today, data are generated at unprecedented speed. The growing size of data sets and data bases has increased the need for good clustering methods to capture and summarize the information. An example is the segmentation of multispectral images, where the objective is to group similar pixels, and to assess how many different groups there are. Typically, three to ten congruent images or bands containing complementary information are recorded, often containing tens of thousands of pixels per image. This places constraints on clustering techniques with respect to memory usage and computing time.

Many different clustering methods have been described (Jain and Dubes 1988; Kaufman and Rousseeuw 1989). Model-based clustering (McLachlan and Basford 1988; Banfield and Raftery 1993; Fraley and Raftery 2002b; McLachlan and Peel 2000) is one of the more recent developments, and has shown very good performance in a number of fields (Mukherjee, Feigelson, Babu, Murtagh, Fraley, and Raftery 1998; Dasgupta and Raftery 1998; Yeung, Fraley, Murua, Raftery, and Ruzzo 2001; Wang and Raftery 2002), including image analysis (Campbell, Fraley, Murtagh, and Raftery 1997; Campbell, Fraley, Stanford, Murtagh, and Raftery 1999; Stanford and Raftery 2002; Wehrens, Simonetti, and Buydens 2002).
As implemented in these applications, and in available software (McLachlan, Peel, Basford, and Adams 1999; Fraley and Raftery 2002a), model-based clustering consists of fitting a mixture of multivariate normal distributions to a data set by maximum likelihood using the EM algorithm, possibly with geometric constraints on the covariance matrices, and an additional component to allow for outliers or noise. Since the likelihood surface typically has many local maxima, initialization of the EM algorithm is a very important issue. Model-based hierarchical clustering (Banfield and Raftery 1993) has been found to provide good initializations.

Model-based hierarchical clustering generally requires storage and computing time at least proportional to the square of the dimension of the data, so that both space and time are limiting factors in its application to large data sets. Another problem is that when the size of the data set reaches a certain threshold, it is not possible to keep all of the required quantities in memory at the same time, forcing a dramatic and abrupt increase in necessary computational resources. This threshold varies with computer hardware and software, and data dimension, but at the current time it is typically on the order of several thousand objects.

Various approaches to the problem of clustering large data sets have been proposed, including initialization by clustering a sample of the data (Banfield and Raftery 1993; Fayyad and Smyth 1996; Maitra 2001), and using an initial crude partitioning of the entire data set (Posse 2001; Tantrum, Murua, and Stuetzle 2002). The simplest and perhaps most widely applied approach is to apply the clustering method first to a small simple random sample from the data, and then apply the resulting estimated model to the full data set using discriminant analysis (Banfield and Raftery 1993). The discriminant analysis can be carried out very easily in the model-based clustering framework by using a single E step (Fraley and Raftery 2002b).

Unfortunately, this easily implemented strategy may lead to unstable segmentations when used in its simplest form, as illustrated in Figure 1.

Figure 1: Segmentations based on clustering of samples of 500 pixels of a set of four congruent MRI images of a patient with a brain tumour. Colors are permuted to give maximal visual similarity.

In this figure, ten random samples of 500 pixels are used to cluster an MRI data set of a patient with a brain tumour. The number of clusters, selected by the method (see below), varies between 4 and 9.
Although several features are preserved, such as the tumour region (in dark blue, behind the eye) and the cerebrospinal fluid (green), the variation is quite large.

In this paper we show that the appealingly simple approach based on clustering a sample of the data can be modified to give good and stable results, with two straightforward changes. For the image data sets we consider, we obtain good results by tentatively selecting several models based on the sample rather than just one, and by running several EM steps on the full data set rather than just one E step. We find that performance improves when the size of the sample is increased up to about 2,000, but that beyond that there is little gain. To reach this conclusion, we considered a range of sample sizes and several strategies of varying computational cost. Comparisons were based on three typical real-world data sets, and several realistic simulations.

In Section 2, we give a brief overview of model-based clustering, and in Section 3 we propose several strategies for model-based clustering in large data sets such as those that arise in image segmentation. The image data we use and the design of our simulations are described in Section 4. Segmentations using different sample sizes and strategies are compared in Section 5, based on the likelihoods of the segmented images, the stability of the clusters, and the accuracy of the results in the simulated cases. Finally, recommendations are made.

2 Model-Based Clustering

In model-based clustering, individual clusters are described by multivariate normal distributions, where the class labels, parameters and proportions are unknown. Maximum likelihood estimates for the resulting model can be obtained using the Expectation-Maximization (EM) algorithm (Dempster, Laird, and Rubin 1977; McLachlan and Krishnan 1997). Given an initial guess for the cluster means µ_k, covariances Σ_k, and proportions τ_k for all clusters, one can calculate the conditional probability that object i belongs to cluster k:

    z_ik = τ_k φ_k(x_i | µ_k, Σ_k) / ∑_{j=1}^K τ_j φ_j(x_i | µ_j, Σ_j),

where φ is the multivariate normal (Gaussian) density. This is the expectation step, or E-step, of the EM algorithm. The maximization step (M-step) consists of estimating the parameters µ, Σ and τ from the data and the conditional probabilities z_ik. The E- and M-steps iterate until convergence. Finally, each object is classified into the class in which it has the highest conditional probability.
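To make the E-step concrete, the following minimal sketch (base R; the function and variable names are ours for illustration, and are not part of the mclust software used later in this report) computes the matrix of conditional probabilities z_ik for given mixture parameters:

    ## Illustrative E-step for a Gaussian mixture (base R only).
    ## x: n x p data matrix; mu: list of K mean vectors;
    ## Sigma: list of K covariance matrices; tau: K mixing proportions.
    mvnorm_density <- function(x, mu, Sigma) {
      p <- length(mu)
      R <- chol(Sigma)                          # Sigma = t(R) %*% R
      d <- forwardsolve(t(R), t(sweep(x, 2, mu)))
      maha <- colSums(d^2)                      # squared Mahalanobis distances
      exp(-0.5 * maha) / ((2 * pi)^(p / 2) * prod(diag(R)))
    }

    e_step <- function(x, mu, Sigma, tau) {
      num <- sapply(seq_along(tau), function(k)
        tau[k] * mvnorm_density(x, mu[[k]], Sigma[[k]]))
      num / rowSums(num)                        # z[i, k] = P(object i in cluster k)
    }

An M-step would then re-estimate τ_k, µ_k and Σ_k as weighted proportions, means and covariances of the data, with the z_ik as weights, and the two steps would be alternated until the loglikelihood stabilizes.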
Good initialization of the EM algorithm is very important, since the method may converge to different values depending on where it is started, because the likelihood surface usually has multiple local maxima. For the initialization, we apply fast hierarchical model-based clustering (Fraley 1998), the default in the mclust software (Fraley and Raftery 2002a).

If there are no cross-cluster constraints on the cluster shapes and sizes, each cluster is described by 1 + p + p(p+1)/2 parameters (the proportion, mean and covariance matrix, respectively). The covariance matrix for the k-th cluster can be expressed in the form

    Σ_k = λ_k D_k A_k D_k^T,    (1)

where λ_k describes the volume of the cluster, D_k is the matrix of eigenvectors, governing the orientation of the cluster, and A_k is a diagonal matrix, proportional to the eigenvalues, which determines the shape of the cluster. Banfield and Raftery (1993) proposed cross-cluster equality constraints on any or all of the cluster volumes, orientations or shapes based on this decomposition, as a way of limiting the number of parameters in the model in a geometrically intuitive way. One such model constrains all clusters to have the same shape, but allows cluster volumes and orientations to vary. This is called the VEV model (Variable volume, Equal shape, Variable orientation; Fraley and Raftery 1999). A completely unconstrained model is denoted by VVV (Fraley and Raftery 1999). For a discussion of all possible combinations of constraints based on the decomposition (1), see Celeux and Govaert (1995). Each set of constraints corresponds to a different clustering criterion: for example, if the clusters are restricted to be spherical and identical in volume, the criterion is the same as that used in Ward's clustering and standard k-means clustering (Celeux and Govaert 1995; Fraley and Raftery 1998).
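As an illustration of the decomposition (1), the sketch below constructs a two-dimensional covariance matrix from chosen volume, shape and orientation parameters; the numerical values are invented for the example.

    ## Building Sigma_k from equation (1); example values only.
    theta  <- pi / 6                             # orientation angle
    D      <- cbind(c(cos(theta), sin(theta)),   # eigenvector (rotation) matrix
                    c(-sin(theta), cos(theta)))
    A      <- diag(c(2, 0.5))                    # shape, normalized so det(A) = 1
    lambda <- 3                                  # volume
    Sigma  <- lambda * D %*% A %*% t(D)

The cross-cluster constraints then amount to sharing some of these factors across clusters: in the VEV model, for instance, all clusters share the shape matrix A, while each cluster k has its own volume λ_k and orientation D_k.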
The model-based clustering framework can be extended in a natural way to model noise and outliers (Banfield and Raftery 1993; Fraley and Raftery 2002b): an extra "noise" class is described by a constant component density over the whole data region. If the densities of all other components are lower than the density of the noise class, an object will be classified as noise. This corresponds to modelling the noise with a homogeneous Poisson process. An initial estimate of the noise is needed.

To select the optimal clustering model (defined by both the cross-cluster constraints and the number of clusters), several measures have been proposed (for an overview, see e.g. McLachlan and Peel 2000). In several applications, the BIC approximation to the Bayes factor (Schwarz 1978; Kass and Raftery 1995) has performed quite well (Fraley and Raftery 1998; Dasgupta and Raftery 1998; Stanford and Raftery 2000). The strategy employed here thus consists of several steps (Fraley and Raftery 1998): first perform model-based hierarchical clustering for initialization; then perform EM for several values of the number of clusters and with several sets of constraints on the covariance matrices of the clusters; finally, select the combination of model and number of groups that leads to the highest BIC value.

The model-based clustering framework also provides a measure of the certainty of a particular classification. Basically, a classification has low uncertainty if one of the k conditional cluster membership probabilities for each data point is close to 1, and the other (k − 1) conditional probabilities are close to 0. Bensmail et al. (1997) quantified this notion by defining the uncertainty of the classification of object i to be

    u_i = 1 − max_k z_ik.

The uncertainty of the complete clustering may then be estimated by averaging over the uncertainty of all objects.
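Given the matrix of conditional probabilities produced by an E-step (for instance, the illustrative e_step function above), the hard classification and this uncertainty measure are immediate:

    ## z: n x K matrix of conditional probabilities z_ik.
    labels <- max.col(z)               # class with the highest z_ik
    u      <- 1 - apply(z, 1, max)     # u_i = 1 - max_k z_ik
    mean(u)                            # average uncertainty of the clustering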
For large data sets, the usual strategy is to apply model-based clustering to a random sample from the data set of a size that can be clustered comfortably (Banfield and Raftery 1993; Fraley and Raftery 2002b). The parameters of the clusters, found in the sample, can be used to classify the remainder of the objects by the application of a single E-step for the entire data set, which is very quick.

3 Sampling Methods

The segmentations in Figure 1 were obtained by clustering ten random samples of 500 pixels each to obtain ten sets of cluster parameters, and for each random sample performing one E-step to classify all other pixels into one of the clusters. The variability that is so apparent can have several causes. First, the sample size may be too small, so that clusters are not well described. In particular, there is a danger that one misses small clusters with sample sizes that are too small. To investigate the effect of sample size, five different sample sizes are compared: 500, 1000, 1500, 2000 and 2500 pixels, respectively. With current software and hardware, 2500 pixels can be clustered within a reasonable time.

Second, there may be a problem in going from the clustered sample to a complete segmented image. We will compare four strategies. The strategy described earlier, applying one E-step using the cluster parameters from the clustered sample, will be called strategy I.

Figure 2: Strategies to apply model-based clustering for large data sets. The dashed box contains operations on the sample only. INIT: initialization by model-based hierarchical clustering; EM: application of the EM algorithm to find cluster parameters and classifications (max. 100 steps); MS: model selection; EM1: one iteration of the EM algorithm to classify pixels not in the sample.

It may be beneficial to do several EM steps on the complete image; this is feasible in terms of computational cost. The second strategy (II) extends strategy I by doing an additional EM optimization for at most 100 steps for the selected model, taking into account all pixels. The sample, however, may be too small to pick the correct model. The third strategy (III) therefore does at most 100 EM steps for the five best models, selected on the basis of the training set, and from these five eventually selects the best using the whole data set. Finally, the fourth strategy (IV) uses the sample only for the initialization, and does 100 EM steps for all models considered. Only then is the best model selected. Strategy IV can be viewed as a gold standard, but it is much more computationally expensive than the other ones. The four strategies are summarized in Figure 2. Each experiment is performed ten times with different random samples; strategies I-IV were implemented in exactly the same way for each experiment.
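In outline, strategy I can be sketched as follows. Here fit_sample is a hypothetical placeholder for the model-based clustering of the sample (hierarchical initialization, EM, and BIC-based model selection), not an actual mclust function, and e_step is the illustrative function defined above.

    ## Strategy I in outline: cluster a sample, classify the rest in one E-step.
    idx <- sample(nrow(x), 2000)                  # random sample of pixels
    fit <- fit_sample(x[idx, ])                   # placeholder: returns mu, Sigma, tau
    z   <- e_step(x, fit$mu, fit$Sigma, fit$tau)  # one E-step over all pixels
    segmentation <- max.col(z)

Strategies II-IV differ only in what happens after this point: II continues with full-data EM iterations for the selected model, III does so for the five best candidate models before selecting among them, and IV runs full-data EM for every model considered.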
4 Data and Simulations

4.1 Assessment of Results

Several criteria are used to assess the effects of the sample size and the strategy. One aspect is the stability of the clustering: ideally, the results should be independent of the initial sample. This means that the models selected in the ten repeated experiments should be similar, with, ideally, the same model being selected in most or all of the experiments. It also means that, even if there are differences between the selected models, the final classifications should be similar. The latter is assessed by calculating the adjusted Rand index (Rand 1971; Hubert 1985). One can also consider the likelihoods of the final segmentations. Some samples may lead to local maxima, which may be easily recognized. This gives an indication of what fraction should be used in the training phase and what strategy is best.

A second aspect is the accuracy of the segmentation: how similar are the estimated clusters to the "true" clusters? We use simulated data to assess this accuracy, which depends on sample size and clustering strategy. Since we know the "correct" model, we can count the number of times the correct model is picked under the four strategies and with the varying sample sizes. Also, we can simply count the errors in the classifications, since the "true" classification is known. Again, the adjusted Rand index is used to quantify the agreement between cluster labels and "true" labels.
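The adjusted Rand index can be computed from the contingency table of two label vectors. The following base-R sketch is our own illustration of the standard formula, not the code used for the experiments:

    ## Adjusted Rand index between two clusterings a and b (label vectors).
    adjusted_rand <- function(a, b) {
      tab  <- table(a, b)
      sij  <- sum(choose(tab, 2))            # pairs together in both clusterings
      si   <- sum(choose(rowSums(tab), 2))   # pairs together in a
      sj   <- sum(choose(colSums(tab), 2))   # pairs together in b
      expd <- si * sj / choose(sum(tab), 2)  # expected value under chance
      (sij - expd) / ((si + sj) / 2 - expd)
    }

An index of 1 indicates identical partitions, and values near 0 indicate agreement no better than chance, so the index corrects for the number and sizes of the clusters.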
4.2 Data

Three real data sets will be used. The first is a set of four congruent MRI images (T1-weighted, T2-weighted, proton-density and gadolinium-enhanced) of a patient with a brain tumour. In this data set, uninteresting regions (eyes, the skull and the region outside the head) have been removed, leaving 23712 pixels. The T1-weighted image is depicted in the left plot of Figure 3. The second data set is an RGB image of a St. Paulia flower with 268 columns and 304 rows; again, pixels from the background have been removed, so that the data set has 45656 pixels. The RGB image is plotted in Figure 3 (middle plot). The final data set (RS) is a 256 by 256 remote sensing image of an area in The Netherlands, the Duursche Waarden. It is recorded by an airborne CASI scanner, and consists of 9 spectral bands. A false-color image is shown in the right plot of Figure 3. (Bands 6, 3 and 1 are used for red, green and blue, respectively.) The vertical discontinuity, slightly left of the center, is due to the fusion of two flight lines.

Figure 3: The three data sets, plotted in order of increasing size. The T1-weighted MRI image of a patient with a brain cancer behind the left eye (left), an image of a St. Paulia (middle), and a false-color image of the remote sensing data of the Duursche Waarden (right).

4.3 Simulation Design

Four data sets were simulated by randomly drawing from a series of multivariate normal distributions obtained from clustering the MRI images. The first pair of simulated data sets (Simul12 and Simul12N) was based on the model that led to the highest full-image segmentation loglikelihood in strategy I (N = 2500): (VEV, 12). This is the VEV model with 12 clusters. From the model parameters, 30,000 points were generated randomly (in four dimensions, like the MRI data set). The Simul12N data set was derived from the Simul12 data set by replacing five percent of the pixels by uniformly distributed noise. The second pair of simulated data sets (Simul6 and Simul6N) was based on a model with fewer clusters, in this case the (VEV, 6) model (MRI, N = 500, sample 7). Again, in the noise data set (Simul6N), five percent of the original pixels were replaced by uniformly distributed noise. In modelling the noise case, our initial estimate of the noise is based on the "true" noise, so that this is the best possible initialization. In practice, an imperfect initial estimate of the noise is likely to lead to a decrease in accuracy.

4.4 Software and Hardware

All experiments were performed in R (Ihaka and Gentleman 1996) version 1.6.0, using the 2002 version of mclust by Fraley and Raftery (Fraley and Raftery 1999; Fraley and Raftery 2002a). Mclust considers ten parametrizations of the cluster covariance matrices (two spherical models, four diagonal models and four ellipsoidal models) (Fraley and Raftery 2002a). Adjusted Rand indices and associated plots are programmed in R. Scripts for the application of strategies I-IV are available as supplementary material (www.sci.kun.nl/cac/people/rwehrens/software).

We used an i686 (Pentium III, 1.0 GHz) computer, running RedHat Linux (kernel version 2.4.18-4smp). The stability of the EM calculations was checked by performing EM runs (maximally 100 steps) on a data matrix with permuted rows (RS data, so 65,536 rows), starting from models initialised on ten random samples of 500 pixels each. In all cases, differences between loglikelihoods of the unpermuted and permuted data were on the order of 10^-8, so there seem to be no stability problems in the EM steps. For strategies II and III, fewer than 100 EM steps were needed to reach convergence in all cases; for strategy IV, this was the case for all relevant models. Occasionally, inappropriate models did not converge within 100 steps.

5 Results

For all data sets, we considered clusterings with 1 to 20 clusters, using strategies I-III. Because of time constraints, fewer possibilities were considered for strategy IV: in the MRI case 4-18 clusters, for the RS data set 4-11 clusters, and for the St. Paulia image 3-15 clusters. For the smaller data sets, all ten model parametrizations available in mclust were considered in strategies I-III; for the RS data set, only the four most elaborate models were considered (EEE, EEV, VEV and VVV). For strategy IV, only the EEV, VEV and VVV models were considered.

Table 1: Most frequently selected models in strategy I. The columns indicate different sample sizes. Numbers in brackets indicate how often a particular model was selected. The true model for Simul12 and Simul12N is VEV12; the true model for Simul6 and Simul6N is VEV6. Simul12N and Simul6N contain 1500 noise points (5 percent).

    Sample size  500          1000            1500        2000          2500
    MRI          VEV6,7,9(2)  VEV7(5)         VEV10(3)    VEV7(3)       VEV8(3)
    St. Paulia   VEV4(3)      VEV7(3)         VEV7,9(3)   VEV8(4)       VEV9(4)
    RS           VVV4(8)      VVV6(5)         VVV6(9)     VVV6,7(5)     VVV9(4)
    Simul12      VEV4(4)      VEV8,9(4)       VEV7,8(3)   VEV10-13(2)   VEV12(5)
    Simul12N     VEV4(3)      VVV5,VEV7,8(2)  VEV9(5)     VEV9(4)       VVV8,VEV9(3)
    Simul6       VEV6(5)      VEV6(5)         VEV6(7)     VEV6(6)       VEV6(8)
    Simul6N      VEV5(4)      VEV6(4)         VEV6,7(4)   VEV6(6)       VEV7(6)

5.1 Stability of Model Selection

The images in Figure 1, obtained by applying strategy I with a sample size of 500, correspond to seven different clustering models, among which the (VEV, 6), (VEV, 7) and (VEV, 9) models occur twice. Results like this are summarized in Table 1. Sample sizes range from 500 to 2500. For all images, the complexity of the selected model (i.e. the number of clusters) is observed to increase with the sample size. Apparently, more clusters are needed to describe the sample. This suggests either that some smaller clusters are missed with the smallest samples, or that the data are not exactly normally distributed and that including more Gaussian components in the mixture leads to a better fit. The effect is clearest for the RS data but is also found in the MRI and St. Paulia images, although it is not so clear from Table 1: the variability in the selected models is much larger than with the RS image. In the case of a sample of 500 pixels from the St. Paulia image, models with 3-6 clusters were selected; in the case of samples of 2500 pixels, the models selected had 7-15 clusters. The true model for the simulated data sets Simul12 and Simul12N is known to be (VEV, 12), and models close to this are selected only for samples of at least 2000 pixels. On the other hand, all strategies and all sample sizes are able to pick the correct model (or a close one) for the (VEV, 6) models of the Simul6 and Simul6N data sets.

Model selection for strategy II is the same as for strategy I, so we do not consider strategy II further in this section. Strategy III consists of doing EM on the whole image for the five models with the highest BIC values for the sample. This invariably leads to more complex models than strategy I. If a VVV model is selected in strategy I, strategy III will typically select a VVV model with one or two extra clusters or, less often, a VEV model with 5-6 extra clusters; if a VEV model is chosen by strategy I, strategy III will pick a VVV model with the same number of clusters, or with one or two extra clusters, or a VEV model with more clusters (see Table 2). The general trends, however, do not change: a larger sample leads to a more complex cluster model, and the variability in the models is much smaller for the RS image than for the other two images. If anything, the variability in the selected models is larger in strategy III than in strategy I. For the simulated data with 12 clusters, small samples lead to an underestimation of the complexity of the model. For samples of 1000 pixels and more, the models selected are approximately correct; this is an improvement compared to strategy I. The simulated data with six clusters seem to be overfit slightly: instead of a VEV model, a VVV model is usually selected.

Table 2: The most frequently selected models in strategy III. See caption of Table 1.

    Sample size  500       1000      1500         2000           2500
    MRI          VEV9(4)   VEV10(5)  VEV11(3)     VEV13,VVV8(3)  VEV10,12,13,VVV10(2)
    St. Paulia   VEV6(4)   VEV10(3)  VVV8(3)      VVV9(2)        VEV14(4)
    RS           VVV5(7)   VVV8(8)   VVV9(8)      VVV10(5)       VVV10,11(5)
    Simul12      VEV6(4)   VEV10(5)  VEV11,13(3)  VEV12,13(3)    VEV12,14(3)
    Simul12N     VEV6(6)   VEV9(5)   VEV10(5)     VEV11(4)       VEV9,VEV11,12(2)
    Simul6       VVV6(4)   VVV6(5)   VVV6(7)      VVV6(5)        VVV6(8)
    Simul6N      VVV6(3)   VVV6(3)   VVV7,8(3)    VVV6(4)        VVV7(4)

Strategy IV is the most computationally expensive one; the sample is used only to initialize the clustering, and EM is performed using the full data set for all models. The results, using a limited set of models and a restricted range of numbers of clusters, are summarized in Table 3. For the Simul6 data set the correct model is retrieved in most of the runs; the model selected for the Simul12 data set is usually slightly more complex than the model that generated the data. In the real data sets, in almost all cases the most complex cluster model possible is selected.

Table 3: The most frequently selected models in strategy IV. The range of models considered is much smaller than with the other strategies (see text).

    Sample size  500          1000      1500      2000      2500
    MRI          VVV18(6)     VVV18(8)  VVV18(7)  VVV18(8)  VVV18(5)
    St. Paulia   VVV15(6)     -         -         -         -
    RS           VEV15(5)     -         -         -         -
    Simul12      VEV13,14(3)  VEV13(4)  VEV15(5)  VEV13(5)  VEV13(3)
    Simul6       VEV6(7)      VEV7(5)   VEV6(7)   VEV6(7)   VEV6(8)

5.2 Stability of Segmentations

More important than the actual models selected is the question of how different the segmentation of the complete image is for the different samples. To assess this, for each data set the clusterings from the ten samples are compared with the adjusted Rand indices; the result is the mean value of all possible 45 comparisons. To present an intuitive calibration of the scale of the adjusted Rand index, a graphical impression is presented in Figure 4. The comparisons are between the first and second, third and second, and seventh and sixth segmented images of Figure 1, respectively. The adjusted Rand index of 0.73 for the last comparison is the highest found in this set of segmented images. One can see an obvious