Remote Sensing for Land & Resources, 2018, 30(2): 38-44 doi: 10.6046/gtzyyg.2018.02.05

Technology and Methodology

Remote sensing image retrieval based on sparse local invariant features

HU Yiqun, ZHOU Shaoguang, YUE Shun, LIU Xiaoqing

School of Earth Science and Engineering, Hohai University, Nanjing 211100, China

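Corresponding author: ZHOU Shaoguang (born 1966), male, associate professor, mainly engaged in research on remote sensing image processing and road network extraction. Email: zhousg1966@126.com

First author: HU Yiqun (born 1990), female, master's student, with research interests in photogrammetry and remote sensing. Email: 1174679344@qq.com

Received: 2016-10-31   Revised: 2017-02-14   Online: 2018-06-15

Fund supported: National Natural Science Foundation of China project "Research on extraction methods for urban road networks in high-resolution remote sensing images", No. 41271420/D010702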

Abstract

To strengthen the representational power of local features of remote sensing images and to make full use of sparse decomposition over an over-complete dictionary, this paper proposes a new remote sensing image retrieval method that builds a visual dictionary from sparse representation features. First, local invariant features are extracted from the training remote sensing image database, an over-complete dictionary is trained on the large set of local features, and the sparse representations obtained under the updated dictionary are taken as the feature description of each image. Second, a visual dictionary is constructed from the sparse representation features, and sparse histogram features are obtained by spatial pyramid matching. Finally, an SVM classification model is trained on the sparse features; the model outputs the images that belong to the same category as the query image, similarity matching is carried out within that image set, and the images most similar to the query are returned to complete the retrieval. Experimental results show that the features extracted by the new method not only retain the robustness of local invariant features but also provide the necessary semantic information, making the method practical and applicable in image retrieval.

Keywords: local invariant features ; over-complete dictionary ; sparse representation ; SVM classification model ; image retrieval


Cite this article

HU Yiqun, ZHOU Shaoguang, YUE Shun, LIU Xiaoqing. Remote sensing image retrieval based on sparse local invariant features[J]. Remote Sensing for Land & Resources, 2018, 30(2): 38-44 doi:10.6046/gtzyyg.2018.02.05

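0 Introduction

As the volume of remote sensing data grows geometrically, how to browse large remote sensing image archives quickly and retrieve targets or images of interest efficiently has become a focus of attention and one of the most pressing problems in the remote sensing community. For remote sensing image databases, conventional text-based search is of very little use. To retrieve from such databases more accurately and efficiently, content-based image retrieval (CBIR) has been widely applied to remote sensing image retrieval in recent years[1,2]. As in conventional image processing, feature extraction is the core of retrieval, and features are generally divided into low-level visual features and high-level semantic features. Traditional retrieval methods rely mainly on low-level visual features of the image (such as texture, color and shape)[3], but these methods have limitations for remote sensing images with complex scenes and numerous targets. Local invariant features, namely the scale invariant feature transform (SIFT)[4,5], are robust and independent of particular scenes and targets, and have therefore also been applied to remote sensing image retrieval[6]. Because SIFT features carry insufficient semantic information, Yang et al.[7] proposed combining local invariant features through the bag of visual words (BOVW)[8] model, building an image representation by constructing a visual dictionary and applying spatial pyramid matching (SPM)[9]. In remote sensing data retrieval, combining the SIFT features of an image with the BOVW model gives better retrieval performance[10], so the combination is widely used in remote sensing applications such as image classification and image retrieval.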

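However, over the past 10 years content-based remote sensing image retrieval has demanded more and more semantic information from images, and traditional feature extraction cannot fully express the semantic content of an image. How to link the semantic information of an image with machine-extracted low-level visual features has therefore become a major challenge for content-based retrieval in remote sensing. Meanwhile, research on sparse representation models[11] has had a profound influence on fields such as remote sensing image processing and computer vision. Sparse representation over an over-complete dictionary is an image description model that represents, or approximately represents, the original image as a linear combination of a few atoms from the dictionary; in effect, these few atoms capture the main structure and essential properties of the image. A sparse representation model not only yields an effective sparse expression of the image but can also reveal its semantic information. Mohamadzadeh et al.[12] proposed an image retrieval method based on sparse representation, which retrieves images using the sparse representation of combined shape and texture features; they argued that sparse representation features can reduce retrieval time and memory use, simplify the search and find the required images as far as possible. Compared with the sparse representation of SIFT features, however, the sparse representation of these two combined features takes more time and involves a more complicated procedure, which lowers retrieval efficiency.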

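The retrieval method proposed in this paper builds the BOVW model from sparse representations of image SIFT features. This describes the image signal effectively and can improve retrieval accuracy and efficiency, and the extracted sparse representation features are broadly applicable in image retrieval.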

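1 Principle of image retrieval

The framework of the proposed retrieval system based on sparse local invariant features is shown in Fig.1. The workflow has 3 stages: building the sparse representation feature database, learning the support vector machine (SVM) classification model, and retrieving for a query image. In the first stage, the large number of SIFT features extracted from each image in the remote sensing image database are sparsely decomposed, and the resulting sparse representations are used directly as the image features, forming a large feature database. In the second stage, training and test images drawn at random from the database are used to learn and optimize the best classification model, which guides the subsequent retrieval stage. In the third stage, the sparse local invariant features of the query image are first fed into the trained SVM classification model; the model then determines the semantic category of the query image; finally, similarity matching is carried out within the categories output by the classifier, and the results are ranked by distance and evaluated to complete the retrieval.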

Fig.1   Remote sensing image retrieval system based on sparse local invariant features


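1.1 Sparse representation

In recent years sparse representation has become one of the popular topics in remote sensing image processing[13,14]. In this paper all data are real-valued.

Suppose the input signal is $b \in \mathbb{R}^m$. Signal decomposition expresses $b$ as a linear combination of $n$ basic atoms $a_i \in \mathbb{R}^m$ ($1 \le i \le n$):

$$b = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = Ax, \tag{1}$$

where $A = [a_1, a_2, \ldots, a_n] \in \mathbb{R}^{m \times n}$ is the over-complete dictionary and $x = (x_1, x_2, \ldots, x_n)^T \in \mathbb{R}^n$ is the vector of sparse coefficients. The optimal sparse representation of Eq. (1) is obtained by solving the $l_1$-norm minimization problem in Eq. (2):

$$x = \arg\min_x \frac{1}{2}\|b - Ax\|_2^2 + \lambda \|x\|_1, \tag{2}$$

where $\lambda$ is the regularization parameter, $\|\cdot\|_2$ is the $l_2$ norm and $\|\cdot\|_1$ is the $l_1$ norm.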
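As an illustration of how a problem of the form of Eq. (2) can be solved numerically, the following is a minimal sketch using iterative soft thresholding (ISTA). The paper does not say which solver it uses, so the choice of ISTA, the function names `soft_threshold`, `ista` and `objective`, and the random test data are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, n_iter=500):
    """Approximately minimise 0.5 * ||b - A x||_2^2 + lam * ||x||_1."""
    # Step size 1/L, where L = ||A||_2^2 bounds the curvature of the smooth term.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)            # gradient of the least-squares term
        x = soft_threshold(x - grad / L, lam / L)
    return x

def objective(A, b, x, lam):
    """Value of the cost in Eq. (2) at x."""
    return 0.5 * np.sum((b - A @ x) ** 2) + lam * np.sum(np.abs(x))

# Illustrative data only: m = 20, n = 50, so the dictionary is over-complete.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
A /= np.linalg.norm(A, axis=0)              # unit-norm atoms
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]      # a 3-sparse coefficient vector
b = A @ x_true
x_hat = ista(A, b, lam=0.05)
print(objective(A, b, x_hat, 0.05), np.count_nonzero(x_hat))
```

Larger values of `lam` push more coefficients to exactly zero at the cost of a larger reconstruction error, which is the trade-off that $\lambda$ controls in Eq. (2).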

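1.2 SIFT feature extraction from remote sensing images based on sparse representation

Like ordinary images, remote sensing images exhibit statistical correlation between pixels, so they generally contain a large amount of redundant information. How to remove or reduce this redundancy while extracting SIFT features, and how to describe the image effectively, is the motivation for SIFT feature extraction based on sparse representation. First, the SIFT features of each image are extracted with the method of Lowe[5]: the image is divided by a uniform grid, the patch size is fixed, and a feature vector is computed for each patch (dense SIFT). Then, with the feature vectors of each image as the original signals, the KSVD algorithm[15] is used to train an over-complete dictionary and the orthogonal matching pursuit (OMP) algorithm is used for sparse coding[16], which gives the sparse representation of the image's SIFT feature set.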

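1.2.1 KSVD algorithm

Because the over-complete dictionary built by KSVD is learned from the training data themselves, these data can be represented well. The algorithm is a generalization of K-means clustering based on the singular value decomposition of matrices.

Let the training SIFT feature set be $Y = \{y_i\}_{i=1}^{M}$ ($Y \in \mathbb{R}^{N \times M}$), where each feature is an $N$-dimensional vector. The aim of the algorithm is to learn from the training data an optimal over-complete dictionary $D = \{d_1, d_2, \ldots, d_K\}$ ($D \in \mathbb{R}^{N \times K}$, $N < K$) to represent $Y$, where $d_1, d_2, \ldots, d_K$ are the $K$ words of the dictionary.

The sparse representation error of the training features can be written as

$$\|Y - DX\|_2^2 = \Big\|Y - \sum_{j=1}^{K} d_j x_T^j\Big\|_2^2 = \Big\|\Big(Y - \sum_{j \ne k} d_j x_T^j\Big) - d_k x_T^k\Big\|_2^2 = \|E_k - d_k x_T^k\|_2^2, \tag{3}$$

where $x_T^k$ is the $k$-th row of the sparse coefficient matrix $X$ and $E_k = Y - \sum_{j \ne k} d_j x_T^j$ is the residual when the $k$-th word is left out. Updating $d_k$ and $x_T^k$ so that $d_k x_T^k$ approximates $E_k$ more closely minimizes the expression above.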
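The single-word update implied by Eq. (3) can be sketched as a rank-1 approximation of $E_k$ obtained from its singular value decomposition. The sketch below follows the usual K-SVD formulation in restricting the update to the signals that already use word $k$, so that the sparsity pattern is preserved; that restriction, the function name `ksvd_update_atom` and the random test data are assumptions not spelled out in the text.

```python
import numpy as np

def ksvd_update_atom(Y, D, X, k):
    """Update word d_k and row x_T^k in place via a rank-1 SVD of E_k."""
    users = np.flatnonzero(X[k, :])         # signals that currently use word k
    if users.size == 0:
        return                              # unused word: leave it unchanged
    # E_k = Y - sum_{j != k} d_j x_T^j, restricted to the columns in `users`.
    E_k = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
    U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
    D[:, k] = U[:, 0]                       # new unit-norm word
    X[k, users] = s[0] * Vt[0, :]           # new coefficients on the same support

# Illustrative data only.
rng = np.random.default_rng(1)
Y = rng.standard_normal((8, 30))            # N = 8, M = 30
D = rng.standard_normal((8, 12))            # K = 12 words
D /= np.linalg.norm(D, axis=0)
X = rng.standard_normal((12, 30)) * (rng.random((12, 30)) < 0.2)
err_before = np.linalg.norm(Y - D @ X)
ksvd_update_atom(Y, D, X, 0)
err_after = np.linalg.norm(Y - D @ X)
print(err_before, err_after)
```

Because the previous $d_k x_T^k$ is itself a rank-1 candidate, the best rank-1 approximation cannot increase the error, so sweeping this update over all $k$ never makes the fit worse for a fixed sparsity pattern.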

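1.2.2 OMP algorithm

OMP is used to decompose the SIFT features sparsely over the over-complete dictionary. It is a greedy pursuit algorithm whose main idea is to find the few words on which the image has the largest projection in the over-complete dictionary, approximating the original image step by step. At each step OMP selects the best-matching word, orthogonalizes the selected words with the Gram-Schmidt procedure, and then projects the image onto the space spanned by these orthogonal atoms. In sparse decomposition, OMP offers high accuracy, converges quickly and requires little computing time. With OMP, the local feature set $y$ of an image is obtained after $N$ decompositions as

$$y = \sum_{K=0}^{N-1} a_K x_K + R_N y, \quad \langle R_N y, x_K \rangle = 0, \; K = 0, \ldots, N-1, \tag{4}$$

where $x_K$ is the component obtained in the $K$-th decomposition, $a_K$ is the coefficient of that component, and $R_N y$ is the residual after $N$ decompositions.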
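A minimal sketch of the greedy loop described above is given below. Instead of an explicit Gram-Schmidt step it refits all selected words by least squares, which yields the same orthogonal projection and therefore the same orthogonality condition as in Eq. (4). The function name `omp`, the stopping rule and the test data are assumptions for illustration.

```python
import numpy as np

def omp(D, y, sparsity):
    """Greedy sparse code of y over dictionary D (columns are unit-norm words)."""
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Select the word with the largest projection of the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break                           # nothing new to add
        support.append(k)
        # Least-squares refit = orthogonal projection of y onto the chosen words.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x, residual

# Illustrative data only: a signal built from two dictionary words.
rng = np.random.default_rng(2)
D = rng.standard_normal((16, 40))
D /= np.linalg.norm(D, axis=0)
y = 2.0 * D[:, 5] - 1.0 * D[:, 20]
x, r = omp(D, y, sparsity=4)
print(np.flatnonzero(x), np.linalg.norm(r))
```

The `sparsity` argument plays the role of the number of decompositions $N$ in Eq. (4): each pass adds at most one word and leaves a residual orthogonal to every word chosen so far.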

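2 Retrieval workflow based on sparse local invariant features

2.1 Feature extraction

2.1.1 SIFT features

Each remote sensing image in the database is $M \times N$ pixels. Dividing the image into non-overlapping grid cells of $a \times a$ pixels gives $(M/a) \times (N/a)$ cells. The patch size is set to $2a \times 2a$, and one descriptor, that is, one feature vector, is computed per patch. For the computation, each patch is divided into $(a/2) \times (a/2)$ bins, and an 8-dimensional SIFT feature is extracted from each bin, so the feature vector of a patch has $(a/2) \times (a/2) \times 8 = 2a^2$ dimensions. Shifting a patch by one grid cell along the row gives a new patch; this continues to the edge of the image, after which the patch moves down by one cell. In the end, one remote sensing image is represented by $(M/a - 1) \times (N/a - 1)$ feature vectors of $2a^2$ dimensions each.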
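The counts in this paragraph can be checked with a few lines of arithmetic. The sketch below is only a worked example: the helper name `dense_sift_layout` is hypothetical, and the grid size a = 8 is an illustrative choice that the text does not state.

```python
def dense_sift_layout(M, N, a):
    """Patch count and descriptor length for the grid scheme described above."""
    if M % a or N % a or a % 2:
        raise ValueError("this sketch assumes a is even and divides both M and N")
    patches = (M // a - 1) * (N // a - 1)   # 2a x 2a patches shifted by a pixels
    bins = (a // 2) * (a // 2)              # spatial bins per patch
    dim = bins * 8                          # 8 values per bin, i.e. 2 * a**2
    return patches, dim

# a = 8 is an assumed value; 256 x 256 is the image size used in the experiments.
print(dense_sift_layout(256, 256, 8))       # (961, 128)
```

With these assumed values a 256 × 256 image yields 31 × 31 = 961 patches, each described by a 128-dimensional vector.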

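2.1.2 Sparse representation of SIFT features

Let $Y_i$ be the feature data set of the $i$-th image, $i = 1, \ldots, m$, where $m$ is the number of images in the database. KSVD is used to obtain an over-complete dictionary $D$ for each image class; let $D_k$ be the over-complete dictionary of the $k$-th class of training images, $k = 1, \ldots, n$, where $n$ is the number of classes in the image data set. Given the feature data set $Y$ of an image, the over-complete dictionary $D$ of its class and the sparsity $L$, OMP is used to reconstruct the image feature data, giving the sparse coefficients $X_i$ ($i = 1, \ldots, m$) of each image over the dictionary of its class. In this paper the sparse coefficients $X_i$ obtained by decomposing an image over the over-complete dictionary are used directly as the low-level features.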

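2.2 Feature modeling

After the sparse local features of all images in the image set have been obtained, K-means clustering is applied to the features of the local regions or patches. Each cluster center is treated as a visual word of the visual dictionary, represented by the code word formed from the features of that cluster center; this is the feature quantization step. All visual words together form the visual vocabulary, which corresponds to a code book. The size of the dictionary is the number of words, and each word is represented by a $2a^2$-dimensional feature vector. Every feature of an image is mapped to a word of the visual dictionary by computing feature distances. The number of occurrences of each visual word among the features of an image is then counted to obtain the bag of features (BOF) of that image. Once the BOF of every image has been extracted, the spatial pyramid matching model is used to obtain a global pyramid histogram feature for each image; this feature is in fact a sparse vector.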
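The quantization and pyramid steps can be sketched as follows, assuming a vocabulary has already been obtained by K-means. The sketch counts words on 1×1, 2×2 and 4×4 grids and concatenates the counts; the per-level weights of spatial pyramid matching are omitted for brevity, and the function names, the 16-dimensional random features and the reading of "3 levels" as these three grids are assumptions.

```python
import numpy as np

def assign_words(features, vocabulary):
    """Index of the nearest visual word (Euclidean distance) for each feature."""
    d2 = ((features[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def pyramid_histogram(words, xy, vocab_size, levels):
    """Concatenate word histograms over 1x1, 2x2, 4x4, ... grids.

    xy holds patch centres scaled to [0, 1); levels is the number of grids.
    """
    parts = []
    for level in range(levels):
        cells = 2 ** level
        cx = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        # One histogram of length vocab_size per grid cell, laid out back to back.
        flat = (cy * cells + cx) * vocab_size + words
        parts.append(np.bincount(flat, minlength=cells * cells * vocab_size))
    return np.concatenate(parts)

# Illustrative data only: 961 random features, 200 random "words".
rng = np.random.default_rng(3)
vocabulary = rng.standard_normal((200, 16))
features = rng.standard_normal((961, 16))
xy = rng.random((961, 2))
h = pyramid_histogram(assign_words(features, vocabulary), xy, 200, 3)
print(h.shape)                              # (4200,)
```

With 200 words and three grids the vector has 200 × (1 + 4 + 16) = 4200 entries, most of which are zero for any single image, which is consistent with the remark that the pyramid histogram is a sparse vector.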

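2.3 Classification and retrieval

This paper uses an SVM classification model for semantic retrieval, attempting to establish a link between low-level image features and high-level semantic information. First, from the features extracted for images of different classes, the SVM classifier learns how the classes are represented, that is, the different semantic information to be expressed; the trained classification model is saved, and the corresponding low-level features are extracted from the query image. Then, the trained model assigns these low-level features to a semantic category, confining the query image to the corresponding image categories. Finally, Euclidean-distance retrieval is carried out within these categories. To prevent classification errors from corrupting the retrieval result, the 3 most similar classes returned by the classifier are taken as the classification result of the query image; during retrieval the query image only needs to be compared with the database images that belong to these 3 classes, and the most similar images are returned.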
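The final matching step can be sketched as below. The sketch assumes that the classifier supplies one score per class for the query (for example SVM decision values) and keeps the 3 best classes before ranking by Euclidean distance; the function name `retrieve`, the score vector and the random database are hypothetical.

```python
import numpy as np

def retrieve(query_vec, class_scores, db_vecs, db_labels, top_classes=3, top_k=20):
    """Rank database images from the best-scoring classes by Euclidean distance."""
    keep = np.argsort(class_scores)[::-1][:top_classes]   # highest scores first
    candidates = np.flatnonzero(np.isin(db_labels, keep))
    dists = np.linalg.norm(db_vecs[candidates] - query_vec, axis=1)
    order = np.argsort(dists)[:top_k]
    return candidates[order], dists[order]

# Illustrative data only: 5 classes with 10 images each.
rng = np.random.default_rng(4)
db_vecs = rng.random((50, 8))
db_labels = np.repeat(np.arange(5), 10)
scores = np.array([0.1, 0.9, 0.3, 0.7, 0.2])   # hypothetical per-class scores
idx, d = retrieve(db_vecs[12], scores, db_vecs, db_labels, top_k=5)
print(idx, db_labels[idx])
```

Restricting the search to a few classes shrinks the candidate set, and keeping 3 classes rather than 1 leaves room for the correct class to be found when the top prediction is wrong.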

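3 Experimental results and analysis

3.1 Experimental data and evaluation measures

The experiments use the publicly available Merced Land Use Dataset, a high spatial resolution remote sensing image library containing 21 classes of satellite images. Each class contains 100 images; 10 images are selected at random from each class for training and the remaining 90 are used for testing. Every image is 256 pixels × 256 pixels. Sample images are shown in Fig.2. In the feature modeling step, the number of visual words is set to $M = 200$ and the number of spatial pyramid levels to $L = 3$ throughout, because an improvement in precision and recall under identical parameters is enough to demonstrate the advantage of the proposed method.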

Fig.2   Sample remote sensing images from Merced Land Use Dataset


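3.2 Classification and retrieval results and comparison

Fig.3 visualizes the retrieval results of the proposed method, taking a query for highways as an example. An image is chosen arbitrarily from the highway class, the database images are sorted by similarity in descending order, and the 20 images most similar to the query are returned, with wrongly retrieved images marked in bold. As Fig.3 shows, only the image in Fig.3(q) is wrong: it belongs to the golf course class, while all the other images are correctly retrieved highway images.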

Fig.3   Visualization results of the new method in this paper


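The proposed method is compared with the traditional SIFT-based retrieval method and with a retrieval method based on texture sparse features. The texture sparse features are obtained by cutting each image into non-overlapping blocks of 8 pixels × 8 pixels and then applying dictionary training and sparse coding to the texture features of the blocks, which gives the texture sparse representation of each block. For all 3 methods, classification accuracy and the Kappa coefficient are used to evaluate the SVM classification, and precision and recall are used to evaluate retrieval performance.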
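For reference, overall accuracy and the Kappa coefficient can be computed from a confusion matrix as in the sketch below. It assumes the standard definitions, which the text does not spell out; the function name and the 2-class confusion matrix are made up for the example.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of class-i samples predicted as class j."""
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2
    return observed, (observed - expected) / (1 - expected)

cm = [[45, 5], [10, 40]]                    # made-up 2-class example
oa, kappa = overall_accuracy_and_kappa(cm)
print(oa, kappa)
```

Kappa discounts the agreement that would occur by chance, so it is always somewhat lower than the overall accuracy when the classifier is better than random.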

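For ease of comparison, the classification accuracy and Kappa coefficient of the 3 retrieval methods (traditional SIFT, texture sparse features and sparse local invariant features) were computed (Tab.1); their SVM classification results are compared in Fig.4.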

Tab.1  Comparison of the classification results of the three methods

Method                     Average accuracy    Overall accuracy    Kappa
SIFT                       0.6236              0.5927              0.5793
Texture sparse features    0.8455              0.8270              0.8183
Proposed method            0.8804              0.8661              0.8594



Fig.4   Comparison of the SVM classification results of the three methods


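As Tab.1 and Fig.4 show, with a training-to-test image ratio of 1:9 the proposed method maintains an average classification accuracy of 88.01%, which demonstrates that classification with sparse local invariant features outperforms the other 2 methods. The method classifies images of many classes correctly in most cases and provides fairly accurate semantic category information, and when classification is correct it achieves high precision and recall. The SVM classification model is therefore built on the sparse representation of SIFT features and used to guide content-based image retrieval. Retrieval is evaluated with the performance measures most widely used in image retrieval systems, namely precision, recall and the corresponding precision-recall curve (Fig.5).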

Fig.5   Precision-Recall curve (smoothed curve)


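As Fig.5 shows, precision and recall depend on and constrain each other: raising precision lowers recall, and vice versa. Overall, the closer a curve lies to the upper right, the better the retrieval performance of the method. The proposed method therefore has an advantage in both precision and recall.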
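Precision and recall at a cut-off of k returned images can be computed as in the following sketch. The example mirrors the highway query of Fig.3, where 19 of the 20 returned images are correct and the error is at position (q), the 17th; the function name and the count of 99 relevant images (a class of 100 with the query itself excluded) are assumptions.

```python
def precision_recall_at_k(ranked_relevance, n_relevant, k):
    """ranked_relevance[i] is True if the i-th returned image is relevant."""
    hits = sum(ranked_relevance[:k])
    return hits / k, hits / n_relevant

# 20 returned images, only the 17th is wrong; 99 relevant images is an assumption.
ranked = [True] * 16 + [False] + [True] * 3
p, r = precision_recall_at_k(ranked, n_relevant=99, k=20)
print(p, r)
```

Sweeping k from 1 up to the size of the database and plotting the resulting pairs gives a precision-recall curve of the kind shown in Fig.5.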

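To take the ranking of the retrieved images into account, the average normalized modified retrieval rank (ANMRR), widely used in MPEG-7 standardization, is also adopted. ANMRR ranges from 0 to 1, and smaller values indicate better retrieval. The computation is described in detail in reference [12].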
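A sketch of ANMRR under the commonly cited MPEG-7 definition is given below. The paper defers the details to reference [12], so the exact variant used there may differ; the window size K = min(4·NG, 2·GTM) and the penalty rank 1.25·K for images outside the window are assumptions taken from that common definition, and the function names are hypothetical.

```python
def nmrr(ranks, gtm):
    """ranks: 1-based retrieval rank of each ground-truth image for one query."""
    ng = len(ranks)
    k = min(4 * ng, 2 * gtm)                # only the first k results count
    penalised = [r if r <= k else 1.25 * k for r in ranks]
    avr = sum(penalised) / ng               # average (penalised) rank
    return (avr - 0.5 * (1 + ng)) / (1.25 * k - 0.5 * (1 + ng))

def anmrr(rank_lists):
    """Mean NMRR over all queries; gtm is the largest ground-truth set size."""
    gtm = max(len(r) for r in rank_lists)
    return sum(nmrr(r, gtm) for r in rank_lists) / len(rank_lists)

# One perfect query and one query that misses every relevant image.
print(anmrr([[1, 2, 3], [100, 200, 300]]))
```

A query whose relevant images occupy the top ranks scores 0 and a query that misses all of them scores 1, which matches the stated range and the rule that smaller is better.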

Evaluation by ANMRR (Tab.2) reflects both the number of relevant images retrieved and their ranking.

Tab.2  Comparison of the ANMRR values of the three methods

Measure    SIFT      Texture sparse features    Proposed method
ANMRR      0.5977    0.4732                     0.4113



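The quantitative comparison of ANMRR values in Tab.2 shows that the retrieval performance of the proposed method is clearly better than that of the other 2 remote sensing image retrieval algorithms.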

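4 Conclusions

Building on the effectiveness and applicability of the sparse representation model, this paper studied a new remote sensing image retrieval method that combines sparse local invariant features with the bag of visual words model, addressing the storage difficulty and computational complexity of traditional local features. First, with the large number of local invariant features of each image as the original data, KSVD is used to learn an over-complete dictionary and OMP to obtain the sparse coefficient matrix, which replaces the original dense local invariant features. Next, the bag of visual words model and spatial pyramid matching are used to obtain a new histogram vector as the final global representation of each image. Finally, the best SVM classification model is introduced: the sparse features of the query image are fed in to determine its category, and similarity matching within that category completes the retrieval. Experiments show that, compared with traditional local feature retrieval methods, the new method improves retrieval accuracy while greatly reducing the number of local invariant features that must be stored, raises precision and recall, and opens up a new line of work for sparse representation models in remote sensing image retrieval.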

References

Du P J, Chen Y H, Tang H, et al. Study on content-based remote sensing image retrieval[C]//Proceedings of 2005 IEEE International Geoscience and Remote Sensing Symposium. Seoul: IEEE, 2005.

Cheng Q M. Remote Sensing Image Retrieval Technologies[M]. Wuhan: Wuhan University Press, 2011.

Dos Santos J A, Penatti O A B, Torres R D S. Evaluating the Potential of Texture and Color Descriptors for Remote Sensing Image Retrieval and Classification[R]. Technical Report-IC-09-47, 2009.

Nandhini R, Joel T. Geographic image retrieval using local invariant features with euclidean distance[J]. IEEE International Journal for Research and Development in Engineering, 2014: 222-225.

Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110. DOI:10.1023/B:VISI.0000029664.99615.94

Wu R H, Li S Z, Zou F M. Image retrieval based on SIFT features[J]. Application Research of Computers, 2008, 25(2): 478-481. DOI:10.3969/j.issn.1001-3695.2008.02.049

Yang Y, Newsam S. Geographic image retrieval using local invariant features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2013, 51(2): 818-832. DOI:10.1109/TGRS.2012.2205158

Karakasis E G, Amanatiadis A, Gasteratos A, et al. Image moment invariants as local features for content based image retrieval using the bag-of-visual-words model[J]. Pattern Recognition Letters, 2015, 55: 22-27. DOI:10.1016/j.patrec.2015.01.005

Zhou W X, Shao Z F, Hou J H. Remote sensing imagery retrieval method based on visual attention model and local features[J]. Geomatics and Information Science of Wuhan University, 2015, 40(1): 46-52. DOI:10.13203/j.whugis20130130

Lazebnik S, Schmid C, Ponce J. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories[C]//Proceedings of 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2006: 2169-2178.

Yang J C, Wright J, Huang T S, et al. Image super-resolution via sparse representation[J]. IEEE Transactions on Image Processing, 2010, 19(11): 2861-2873. DOI:10.1109/TIP.2010.2050625

Mohamadzadeh S, Farsi H. Content-based image retrieval system via sparse representation[J]. IET Computer Vision, 2016, 10(1): 95-102. DOI:10.1049/iet-cvi.2015.0165

Olshausen B A, Field D J. Sparse coding with an overcomplete basis set: A strategy employed by V1?[J]. Vision Research, 1997, 37(23): 3311-3325. DOI:10.1016/S0042-6989(97)00169-7

Wright J, Ma Y, Mairal J, et al. Sparse representation for computer vision and pattern recognition[J]. Proceedings of the IEEE, 2010, 98(6): 1031-1044. DOI:10.1109/JPROC.2010.2044470

Aharon M, Elad M, Bruckstein A. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation[J]. IEEE Transactions on Signal Processing, 2006, 54(11): 4311-4322. DOI:10.1109/TSP.2006.881199

Huo H. Biological Vision-Inspired Feature Extraction and Object Detection for High Resolution Remote Sensing Images[D]. Shanghai: Shanghai Jiao Tong University, 2014.
