Frequency Layered Color Indexing for Content-Based Image Retrieval



IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 12, NO. 1, JANUARY 2003

Guoping Qiu and Kin-Man Lam

Abstract—Image patches of different spatial frequencies are likely to have different perceptual significance as well as reflect different physical properties. Incorporating such a concept is helpful to the development of more effective image retrieval techniques. In this paper, we introduce a method which separates an image into layers, each of which retains only pixels in areas with similar spatial frequency characteristics, and uses simple low-level features to index the layers individually. The scheme associates indexing features with perceptual and physical significance, thus implicitly incorporating high-level knowledge into low-level features. We present a computationally efficient implementation of the method, which enhances the power and at the same time retains the simplicity and elegance of basic color indexing. Experimental results are presented to demonstrate the effectiveness of the method.

Index Terms—Color indexing, content-based image retrieval, human vision, signal analysis.

I. INTRODUCTION

REPRESENTATION is fundamental in both biological and computational vision systems [23]. An effective, efficient and suitable representation is the key starting point to building image processing and computer vision systems. In many ways, the success or failure of an algorithm depends greatly on an appropriately designed representation. In the computer vision community, it is common practice to classify representation schemes as low-level, intermediate-level and high-level. Low-level deals with pixel-level features, high-level deals with abstract concepts, and intermediate-level deals with something in between. Whilst low-level vision is fairly well studied and we have a good understanding at this level, mid- and high-level concepts are very difficult to grasp, and certainly extremely difficult to represent using computer bits. In the signal processing community, an image can be represented in the time/spatial domain and in the frequency/spectral domain. Both time domain and frequency domain analysis technologies are very well developed [21], [22]. A signal/image can be represented as a time sequence or as transform coefficients of various types: Fourier, Wavelet, Gabor, KLT, etc. These coefficients often provide a convenient way to interpret and exploit the physical properties of the original signal. Exploiting well-established signal analysis technology to represent and interpret vision concepts could be a fertile area for making progress.

The human vision system is extremely sophisticated and powerful. Building machines that can mimic humans' ability to process visual information is the “Holy Grail” of vision research. Even though much about the human vision system remains unknown, many theories of biological vision exist, e.g., [15]–[19]. These human vision theories could provide guidance to building practical engineering solutions to vision tasks, e.g., [12].

Manuscript received April 26, 2002; revised September 16, 2002. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Tamas Sziranyi.

G. Qiu is with the School of Computer Science, The University of Nottingham, Nottingham NG8 1BB, U.K., and also with the Center for Multimedia Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong.

K.-M. Lam is with the Center for Multimedia Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong.

Digital Object Identifier 10.1109/TIP.2002.806228

Content-based image and video indexing and retrieval have been a popular research subject in many fields related to computer science for over a decade [1], [2]. Of all the challenging issues associated with the indexing and retrieval tasks, “retrieval relevance” [20] is probably the most difficult to achieve. The difficulties can be explained from a number of perspectives. First, relevance is a high-level concept and is therefore difficult to describe numerically, i.e., using computer bits. Secondly, traditional indexing approaches mostly extract low-level features in a low-level fashion, and it is therefore difficult to represent relevance using low-level features. Because low-level features can bear no correlation to high-level concepts, the burden of relevant retrieval has to be on high-level retrieval strategies, which is again hard. We believe one way in which one can make progress is to develop numerical representations (low-level features) that not only have clear physical meanings but also can be related to high-level perceptual concepts. From an engineering point of view, such a representation should be easy to compute, efficient to store, and should also support simple and effective retrieval.

There is evidence to suggest that the human vision system consists of frequency sensitive channels [14]–[18]. In other words, when we see the visual world, we perform some form of frequency analysis among many other complicated and not yet understood processes. Following the frequency analysis argument, it can be understood that when a subject is presented with an image, she will “decompose” the image into various frequency components and process each component with a different processing channel (presumably in a parallel fashion). It is convenient to view such a process as decomposing the image into different layers, each layer consisting of an image the same size as the original one, but with only pixels in areas within a certain spatial frequency range retained in each layer, i.e., a bandpass filtered version of the original image. On each layer, only those grid positions where the pixels have a certain “busyness” will have values, and other grid positions will be empty. It should be noted that we are not suggesting such “analyses” occur in the human visual system; we are simply using such a representation, which we believe is more informative and illuminating in our current application.

1057-7149/03$17.00 © 2003 IEEE


Fig. 1. Schematic of frequency layered image indexing.

Fig. 2. Computationally efficient layer classification method.

In this paper, we apply the physiological and psychophysical ideas of multiple spatial frequency channels [14]–[18] and digital signal analysis (processing) techniques [13], [20]–[23] to the development of an efficient and effective image indexing method for content-based image retrieval. The rest of the paper is organized as follows. In Section II, we briefly review related literature on representing image features for content-based image retrieval. In Section III, we present the motivation and the framework for representing an image in frequency classified layers. Section IV presents an image-indexing algorithm based on the frequency layered representation. Section V presents experimental results and Section VI concludes our presentation.

II. BACKGROUND

In the past decade, content-based image retrieval has attracted extensive research interest [1], [2]. Content description is the crucial first step and many methods have been proposed in the literature. Simple color indexing, or the color histogram method [4], has been shown to be very effective. The advantage of using the color histogram is that it is very simple and easy to implement. The color histogram is also insensitive to scale, rotation and translation. Furthermore, it is robust against occlusion and changes in camera viewpoint. Studies have shown color histogram based image retrieval to be very successful for small databases [4]. On the other hand, when the databases contain tens of thousands of images, color histogram based methods are less effective [1].

Fig. 3. Magnitude of the Laplacian as a function of the input frequency.

Fig. 4. Example images and their frequency-classified layers. A white color indicates that the pixel is absent from that position.

One of the problems of the color histogram method is that only color information is recorded and information regarding the spatial positions of the colors is not included. Consequently, two images with very different spatial color layouts can have the same color histogram. When the database is large, the chances of (visually) different images having similar histograms increase, thus reducing the effectiveness of the method.
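The loss of spatial layout can be seen on a toy example. The sketch below is only an illustration: the two 4x4 "images" and the string color labels are made up, and `color_histogram` is a hypothetical helper, not code from the paper.

```python
from collections import Counter


def color_histogram(image):
    """Count how often each color label occurs, ignoring pixel positions."""
    return Counter(pixel for row in image for pixel in row)


# Two 4x4 toy images that use the same colors in very different layouts.
left_right = [["red", "red", "blue", "blue"] for _ in range(4)]
checkerboard = [
    ["red" if (r + c) % 2 == 0 else "blue" for c in range(4)]
    for r in range(4)
]

# The layouts differ, yet the histograms are identical.
different_layout = left_right != checkerboard
same_histogram = color_histogram(left_right) == color_histogram(checkerboard)
```

Both images contain eight red and eight blue pixels, so a retrieval system that compares histograms alone cannot tell them apart.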

Recognizing this problem, many authors have proposed different approaches to remedy the scheme. One such approach is to divide an image into subimages and describe each subimage with a separate color histogram [5], [6]. The drawback of such methods is that they are computationally expensive and the storage overheads are high. Also, these methods cannot accommodate translation and rotation of color regions. A more sophisticated approach is to use image segmentation [7]. Although this technique can identify regions more accurately, the difficulty associated with image segmentation makes it complicated both in terms of feature extraction and matching. It has been argued that image retrieval will probably not require accurate image segmentation [1].

Fig. 5. Accumulated histogram of Laplacian images.


Fig. 6. Examples of the precision query images. When an image in (a) was used as the query, the unique correct answer is its corresponding image in (b), or vice versa.

Another line of work designed to improve the use of color in image retrieval is to incorporate local spatial relations. In [8], each pixel is classified as coherent or noncoherent, based on whether the pixel and its neighbors have similar colors. Such an approach can distinguish widely scattered pixels from closely clustered pixels, thus increasing the discrimination power. A similar method, known as the color structure descriptor, is defined in the MPEG-7 standard [9]. An 8 [...]


Fig. 7. Precision query and correct answer images. The numbers indicate the rank of the correct answer for the two methods.

[...] and color vision. Luminance vision is able to detect sharp edges and the fine details of patterns and textures in the image, whereas color vision is left to “fill in” the color of objects and forms [19]. This may suggest that frequency analysis occurs mostly in the luminance channel.

Digital signal processing researchers have developed a wealth of technologies to analyze physical phenomena such as the sharpness/roughness of a signal/image. The most effective way is frequency analysis; technologies ranging from linear filters to filter banks have been well studied [21], [22]. A busy/sharp area is associated with higher frequency components and a flat area has lower frequency distributions. A busy area may be associated with textured surfaces or object boundaries; a flat area may be associated with backgrounds or the interior of objects. Therefore, a red color in a flat area may signify a red background or large red objects with a flat surface, and a similar red color in a busy area may indicate red colored textured surfaces or red object boundaries.

It is therefore clear, from the point of view of human perception, that different frequency components of the visual stimulus may be treated differently by the visual system. Physically, different frequency components of the signal may correspond to different objects or object parts. Consequently, different frequency components of the visual stimulus may not only have different perceptual significance, they may also correspond to different physical objects. We therefore reason that the process of judging the similarity of images may occur as follows: a frequency analysis is performed and the visual stimuli are decomposed into different frequency components; the corresponding frequency components are compared and then a final similarity score is generated. We must stress that we are not suggesting such processes happen in the visual system; we believe such reasoning makes “engineering” sense, i.e., it helps us in the development of useful systems, as will be demonstrated in the rest of the paper.

Based on these observations and reasoning, we propose a spatial frequency layered approach to image indexing. A schematic is illustrated in Fig. 1. Let [...] be the input image array, [...] the [...]th layer image corresponding to the [...]

[...] Empty otherwise (4)

where [...] is a Laplacian image. The magnitude of the image indicates the sharpness of the image areas. A busy area, which contains many sharp changes, will result in large differences between the original image and the Gaussian smoothed image. A flat area will result in small differences between the original and the Gaussian image. Therefore the pixel magnitude in the Laplacian is an indication of the roughness of the image area surrounding the pixel. The scheme therefore effectively classifies pixels in areas with similar roughness into the same layer.
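The classification described above (smooth with a Gaussian, take the difference from the original as the Laplacian, then threshold its magnitude) can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names, the kernel width `sigma`, the kernel `radius`, the border handling and the three threshold values are all assumptions chosen for the toy example.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Convolve every row with the kernel, replicating border pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for c in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                cc = min(max(c + k - radius, 0), width - 1)
                acc += w * row[cc]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_smooth(image, sigma=1.0, radius=2):
    """Separable Gaussian smoothing: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma, radius)
    rows_done = smooth_rows(image, kernel)
    return transpose(smooth_rows(transpose(rows_done), kernel))


def classify_layers(luminance, thresholds, sigma=1.0, radius=2):
    """Return one layer index per pixel from the Laplacian magnitude.

    The Laplacian is taken as original minus Gaussian-smoothed image.
    thresholds must be ascending (assumed values, not from the paper);
    layer 0 is the flattest and layer len(thresholds) the busiest.
    """
    smoothed = gaussian_smooth(luminance, sigma, radius)
    labels = []
    for row, smooth_row in zip(luminance, smoothed):
        label_row = []
        for value, smooth_value in zip(row, smooth_row):
            magnitude = abs(value - smooth_value)
            # Count how many thresholds the magnitude reaches.
            label_row.append(sum(magnitude >= t for t in thresholds))
        labels.append(label_row)
    return labels


# Toy luminance image: a vertical step edge between two flat regions.
step_image = [[0.0] * 4 + [100.0] * 4 for _ in range(6)]
step_labels = classify_layers(step_image, thresholds=[5.0, 15.0, 25.0])
```

On this step image the pixels far from the edge fall into the flattest layer and the pixels next to the edge fall into the busiest one, which is the behavior the text describes. Three thresholds give four layers, matching the four-layer setup used in the experiments.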

Fig. 8. Example query images and their correct answers. The first image in each group was used as the query and the rest as the correct answers.

We can formally justify the scheme as well. For convenience, we perform our analysis on 1-D signals. Let the Gaussian filter's impulse response be [...] and [...]

In the frequency domain, the spectral function of [...] passing through a linear filter whose frequency response function is [...]
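As a hedged sketch of the 1-D argument, the following derivation uses assumed notation ($x$ for the signal, $g$ for the Gaussian impulse response, $l$ for the Laplacian signal, $\sigma$ for the Gaussian width) and a continuous-time Gaussian; it is not a reconstruction of the paper's own equations, only a derivation consistent with the statement that the Laplacian is the difference between the original and the Gaussian smoothed signal.

```latex
\begin{align*}
l(t) &= x(t) - (g * x)(t)
  && \text{Laplacian: original minus smoothed signal} \\
L(\omega) &= X(\omega)\,\bigl[1 - G(\omega)\bigr]
  && \text{convolution becomes multiplication} \\
G(\omega) &= e^{-\sigma^{2}\omega^{2}/2}
  && \text{for } g(t) = \tfrac{1}{\sigma\sqrt{2\pi}}\, e^{-t^{2}/(2\sigma^{2})} \\
\bigl|H(\omega)\bigr| &= 1 - e^{-\sigma^{2}\omega^{2}/2}
  && \text{with } H(\omega) = 1 - G(\omega)
\end{align*}
```

Under these assumptions $|H(\omega)|$ is zero at $\omega = 0$ and increases monotonically toward 1, so a larger Laplacian magnitude corresponds to higher input frequency, which is consistent with the behavior captioned in Fig. 3 and with thresholding the magnitude to separate frequency layers.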

An explanation of (13) is in order. Instead of building one histogram, we build multiple color histograms for an image, each taking colors from pixels in areas with similar sharpness. It is therefore immediate that (13) will be more powerful than the original simple color histogram. This approach not only indexes the color, but also associates colors with their surface roughness. As argued in the previous section, such an association not only has perceptual significance, it also makes sense physically. Fig. 4 shows some examples of image layers. Such a representation allows not only matching the entire contents of the image, but also giving different weightings to different layers depending on the user's requirements. For example, for the first image in Fig. 4, if a user is more interested in retrieving cheetahs, then more weighting can be given to the features of the last two layers and less weighting to the first two layers.

Computationally, the scheme is very simple. The filtering and classification (thresholding) can be done very fast. The dimension of the color histogram does not have to be high. We implemented our experiments with four-layer, 64-color LCI (the indexing feature is a 256-dimensional vector, which has similar complexity to other state-of-the-art methods, e.g., the color correlogram [10]) and observed very good performance. We report detailed experiments in the next section.
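The per-layer histograms and the layer weighting described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's (13): the function names are hypothetical, the histograms are normalized by the total pixel count, and a weighted L1 distance stands in for whatever similarity measure the authors used.

```python
def layered_color_index(color_ids, layer_ids, n_layers=4, n_colors=64):
    """Concatenate one color histogram per frequency layer.

    color_ids and layer_ids are same-sized 2-D lists holding, for every
    pixel, its quantized color index and its layer index. Normalizing by
    the pixel count is an assumption made for this sketch.
    """
    feature = [0.0] * (n_layers * n_colors)
    n_pixels = 0
    for color_row, layer_row in zip(color_ids, layer_ids):
        for color, layer in zip(color_row, layer_row):
            feature[layer * n_colors + color] += 1.0
            n_pixels += 1
    return [count / n_pixels for count in feature]


def weighted_l1_distance(feat_a, feat_b, layer_weights, n_colors=64):
    """L1 distance between two features with one weight per layer."""
    total = 0.0
    for layer, weight in enumerate(layer_weights):
        start = layer * n_colors
        stop = start + n_colors
        for a, b in zip(feat_a[start:stop], feat_b[start:stop]):
            total += weight * abs(a - b)
    return total


# Two 2x2 toy images with identical colors but opposite layer assignments:
# in image A color 0 is flat and color 1 is busy, in image B the reverse.
colors = [[0, 0], [1, 1]]
feat_a = layered_color_index(colors, [[0, 0], [3, 3]])
feat_b = layered_color_index(colors, [[3, 3], [0, 0]])

equal_weights = weighted_l1_distance(feat_a, feat_b, [1.0, 1.0, 1.0, 1.0])
busy_only = weighted_l1_distance(feat_a, feat_b, [0.0, 0.0, 0.0, 1.0])
```

A plain 64-bin color histogram would treat the two toy images as identical, whereas the 256-dimensional layered feature separates them. Setting some layer weights to zero, as in `busy_only`, mimics a user who cares only about the busiest layer.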

V. EXPERIMENTAL RESULTS

To build the database, we used four layers and a 64-color quantizer to build the layered color indexing. Each image was therefore represented by a (4 [...]. The same 64 colors were used in all experiments. Therefore the complexity of the two methods was the same and the colors used were exactly the same.

Fig. 9. Average recall performances of the color correlogram method and the new LCI method.

A. The Database

The database used in our experiment consisted of 20 000 color images from the commercially available Corel color photo collection. To evaluate the retrieval precision and recall performances, we also collected three sets of testing data. The first set consisted of 328 pairs of similar images that were used to test retrieval precision. The second set consisted of 99 query images, and for each query there were three to 30 “relevant” or “correct” answers; this set was used to test the recall ability of the techniques. These two testing sets came from various sources: some images were photographed by us and some were from various sites on the Internet. The third query and answer set came from three categories in the Corel database itself and was used to test the category query ability of the techniques.

B. Retrieval Precision

In this experiment, we collected 328 query images, each of which had a unique answer. Some examples of the queries and corresponding answers are shown in Fig. 6. To measure the performance, we define several numerical measures similar to those used in [10]. Let [...] returns. For small [...] (15)

Similar to [4], we define an [...] measure as (17). A lower rank of the correct answer contributes more to the measure and therefore a larger value of [...] indicates better performance.

Fig. 10. Example images from the three categories in the Corel database. (a) Lions, (b) Tigers, and (c) Cheetah.

A method is good if it has small [...] and divided by the number of queries. The results of [...] and [...]

Fig. 11. Category query performance. (a) Results on the Lion category. (b) Results on the Tiger category. (c) Results on the Cheetah category. (d) Aggregated result over all three categories. For the Tiger category, the two methods performed similarly, but for the other two categories, our new method gave better performance. Overall, our method performed better.

[...] complexity, LCI clearly outperformed the color correlogram (CC). Fig. 7 shows some example query images and their correct answers, together with the rank positions of the correct answers. For the vast majority of the queries, LCI gave much better performance. There were cases, however, where CC gave better performance; such examples are also given in this figure.
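For a test set in which every query has a unique correct answer, rank-based scoring can be sketched as below. The helper names, the L1 distance and the two summary statistics (fraction of queries answered within the first n returns, and average rank) are assumptions made for illustration; they are generic measures in the spirit of this subsection, not the paper's own definitions.

```python
def l1_distance(feat_a, feat_b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(feat_a, feat_b))


def rank_of_answer(query_feature, answer_id, database, distance=l1_distance):
    """1-based rank of the correct answer when items are sorted by distance.

    database maps an image id to its feature vector.
    """
    ordered = sorted(
        database,
        key=lambda item_id: distance(query_feature, database[item_id]),
    )
    return ordered.index(answer_id) + 1


def fraction_within(ranks, n):
    """Fraction of queries whose correct answer is in the first n returns."""
    return sum(rank <= n for rank in ranks) / len(ranks)


def average_rank(ranks):
    """Mean rank of the correct answers; smaller is better."""
    return sum(ranks) / len(ranks)


# Toy database with 1-D features; the ids and values are made up.
toy_database = {"a": [0.0], "b": [1.0], "c": [5.0]}
toy_rank = rank_of_answer([0.9], "a", toy_database)
```

In the toy database, item "b" is closer to the query than the correct answer "a", so the correct answer is returned second.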

C. Recall Ability

In this experiment, we collected 99 query images, each of which had a set of hand-labeled “correct” answers. The number of correct answers for different queries ranged from three to 30. Some example queries and their correct answers are shown in Fig. 8. Let [...]th query image and [...] be the [...] is a weighted score of how many correct answers are returned in the first [...]
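As a rough illustration of scoring how many correct answers appear among the first returns, the sketch below computes a plain, unweighted recall. This is an assumption made for simplicity and is not the weighted score defined in this subsection; `recall_at` and `mean_recall_at` are hypothetical helper names.

```python
def recall_at(returned_ids, relevant_ids, n):
    """Fraction of the relevant items found in the first n returns."""
    relevant = set(relevant_ids)
    hits = sum(1 for item in returned_ids[:n] if item in relevant)
    return hits / len(relevant)


def mean_recall_at(results, n):
    """Average recall over (returned_ids, relevant_ids) pairs."""
    return sum(recall_at(ret, rel, n) for ret, rel in results) / len(results)


# Toy ranked list with made-up ids: two of the three relevant items
# appear in the first three returns.
toy_recall = recall_at(["a", "x", "b", "y"], ["a", "b", "c"], 3)
```

Averaging such a score over all queries for a range of n gives a curve of the kind captioned in Fig. 9.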


VI. CONCLUDING REMARKS

In this paper, we have introduced a new method for color indexing. It can be considered as an extension to classic color indexing. Based on human vision theories and digital signal analysis, we argued that image patches of different spatial frequencies would have different perceptual as well as physical significance. We incorporated such an argument into the development of an efficient image indexing scheme which has been shown to give excellent performance in content-based image retrieval. The method significantly enhances the power of color indexing but at the same time retains its simplicity and elegance.

REFERENCES

[1] A. W. M. Smeulders et al., “Content-based image retrieval at the end of the early years,” IEEE Trans. Pattern Anal. Machine Intell., vol. 22, pp. 1349–1380, Dec. 2000.
[2] Y. Rui, T. S. Huang, and S.-F. Chang, “Image retrieval: Current techniques, promising directions and open issues,” J. Vis. Commun. Image Represent., vol. 10, pp. 39–63, 1999.
[3] P. K. Kaiser and R. M. Boynton, Human Color Vision. Washington, DC: Opt. Soc. Amer., 1996.
[4] M. Swain and D. Ballard, “Color indexing,” Int. J. Comput. Vis., vol. 7, pp. 11–32, 1991.
[5] K. Hirata and T. Kato, “Rough sketch-based image information retrieval,” NEC Res. Develop., vol. 34, pp. 263–273, 1992.
[6] W. Hsu, T. S. Chua, and H. K. Pung, “An integrated color-spatial approach to content-based image retrieval,” in Proc. 1995 ACM Multimedia Conf., pp. 305–313.
[7] C. Carson et al., “Blobworld: A system for region-based image indexing and retrieval,” in Proc. Int. Conf. Visual Information Systems, 1999.
[8] G. Pass and R. Zabih, “Histogram refinement for content-based image retrieval,” in Proc. IEEE Workshop on Applications of Computer Vision, 1996, pp. 96–102.
[9] “MPEG7 FCD,” Singapore, ISO/IEC JTC1/SC29/WG11, 2001.
[10] J. Huang et al., “Spatial color indexing and applications,” Int. J. Comput. Vis., pp. 245–268, 1999.
[11] G. Qiu et al., “A binary color vision framework for content-based image indexing,” in Recent Advances in Visual Information Systems, S.-K. Chang et al., Eds. New York: Springer, 2002, vol. LNCS 2314, pp. 50–60.
[12] A. Mojsilovic et al., “Matching and retrieval based on the vocabulary and grammar of color patterns,” IEEE Trans. Image Processing, vol. 9, pp. 38–54, Jan. 2000.
[13] P. J. Burt and E. H. Adelson, “The Laplacian pyramid as a compact image code,” IEEE Trans. Commun., vol. COM-31, pp. 532–540, 1983.
[14] F. W. Campbell and J. G. Robson, “Application of Fourier analysis to the visibility of gratings,” J. Physiol., vol. 197, pp. 551–566, 1968.
[15] C. B. Blakemore and F. W. Campbell, “On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images,” J. Physiol., vol. 204, pp. 237–260, 1969.
[16] F. W. Campbell, G. F. Cooper, and C. Enroth-Cugell, “The spatial selectivity of the visual cells of the cat,” J. Physiol., vol. 203, pp. 223–235, 1969.
[17] C. B. Blakemore, J. Nachmias, and P. Sutton, “The perceived spatial frequency shift: Evidence for frequency-selective neurones in the human brain,” J. Physiol., vol. 210, pp. 727–750, 1970.
[18] R. L. De Valois, D. G. Albrecht, and L. G. Thorell, “Spatial frequency selectivity of cells in macaque visual cortex,” Vis. Res., vol. 22, pp. 545–559, 1982.
[19] K. T. Mullen and F. A. A. Kingdom, “Color contrast in form perception,” in The Perception of Color, P. Gouras, Ed. New York: Macmillan, 1991.
[20] Y. Rui et al., “Relevance feedback: A power tool for interactive content-based image retrieval,” IEEE Trans. Circuits, Syst., Video Technol., pp. 644–655, Sept. 1998.
[21] A. N. Akansu and R. A. Haddad, Multiresolution Signal Decomposition. New York: Academic, 1992.
[22] R. Roberts and C. Mullis, Digital Signal Processing. Reading, MA: Addison-Wesley, 1987.
[23] M. Jenkins and L. Harris, Computational and Psychophysical Mechanisms of Visual Coding. Cambridge, U.K.: Cambridge Univ. Press, 1997.

Guoping Qiu received the B.Sc. degree in electronic measurement and instrumentation from the University of Electronic Science and Technology of China, Chengdu, in July 1984 and the Ph.D. degree in electrical and electronic engineering from the University of Central Lancashire, Preston, U.K., in 1993.

He is currently a Lecturer of computer science at the School of Computer Science and IT, University of Nottingham, Nottingham, U.K. Before joining Nottingham in October 2000, he was a Lecturer at the School of Computing, the University of Leeds, U.K., and the School of Computing and Mathematics, the University of Derby, U.K. His general research interests are image and signal analysis and processing algorithms for visual computing, including image databases, content-based image and video retrieval; filtering theory and practice for image enhancement, super-resolution and high dynamic imaging; and visual object recognition.

Kin-Man Lam received the Associateship in electronic engineering with distinction from the Hong Kong Polytechnic University (formerly Hong Kong Polytechnic) in 1986. He won the S. L. Poa Scholarship for overseas studies and received the M.Sc. degree in communication engineering from the Department of Electrical Engineering, Imperial College of Science, Technology, and Medicine, U.K., in 1987. In 1993, he undertook a Ph.D. degree program in the Department of Electrical Engineering at the University of Sydney, Australia, and won an Australia Postgraduate Award for his studies. He received the Ph.D. degree in August 1996 and was awarded the IBM Australia Research Student Project Prize.

He joined TechTrend E.&C., Ltd., as an Application Engineer working on micro-supercomputers in 1987. From 1990 to 1993, he was a Lecturer at the Department of Electronic Engineering of the Hong Kong Polytechnic University, teaching various subjects on computer architecture and parallel processing. In October 1996, he was appointed as an Assistant Professor at The Hong Kong Polytechnic University. He became Associate Professor in February 1999. He was also the Secretary of the Hong Kong Special Session for the China 14th National Conference on Circuits and Systems, which was held in Fuzhou, China, in April 1998, and the Secretary of the 2001 International Symposium on Intelligent Multimedia, Video, and Speech Processing organized by the Centre for Multimedia Signal Processing, Hong Kong Polytechnic. He was also a Program Committee Member and Session Chair of the 2002 Conference on Visual Communications and Image Processing, which was held in San Jose, CA, in January 2002. He was a Guest Editor for a Special Issue on Biometric Signals for the EURASIP Journal on Applied Signal Processing in 2002.

Dr. Lam was selected as the “best teacher” of the 1997/1998 academic year in the Department of Electronic Engineering. He is the Secretary of the IEEE Hong Kong Chapter of Signal Processing, the Secretary of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03) to be held in Hong Kong, and the Principal Member of Technical Programs of the 2002 7th International Conference on Control, Automation, Robotics and Vision (ICARCV 2002) to be held in Singapore. His current research interests include video processing, computer vision and architecture, digital TV and pattern recognition. He was also actively involved in professional activities between 1987 and 1993; in particular, he was the Secretary and publication secretary of several international conferences and the Secretary of the IEEE Hong Kong Chapter of Signal Processing between 1992 and 1993. He was Session Chairman and member of the Technical Program Committee of the IEEE Symposium on Circuits and Systems (ISCAS'97), which was held in Hong Kong in June 1997.
