Professional Documents
Culture Documents
UNIVERSITY
UNDERGRADUATE
PROJECT (THESIS)
06122604
10.03.2910.07.09
()
............................................................................................................................ 1
............................................................................................................................ 3
ABSTRACT ................................................................................................................... 4
.................................................................................................................. 6
1.1 ..................................................................................................................................... 6
1.2 ......................................................................................... 8
1.2.1 .......................................................................................... 8
1.2.2 ................................................................................ 13
1.2.3 ........................................................................................................ 14
1.3 ............................................................................................................... 14
............................................................................ 17
2.1 ........................................................................................................................... 17
2.2 ....................................................................................................... 17
............................................................................ 19
3.1 ....................................................................................................................... 19
3.2 ........................................................................................................... 20
3.3 ....................................................................... 22
........................................................................................................ 23
4.1 ........................................................................................................................... 23
4.2 ........................................................................................................................... 23
4.3 ................................................................................................................... 25
4.3.1 ............................................................................................ 25
4.3.2 ............................................................................................ 26
4.3.3 ............................................................................................ 27
4.3.4 .................................................................................................... 27
4.4 ........................................................................................................................... 28
4.4.1 .................................................................................................................... 28
4.4.2 ........................................................................................................ 29
........................................................................................................ 31
5.1 .......................................................................................................................... 31
5.2 CBIR ............................................................................ 32
5.3 CBIR ................................................................................................ 33
5.3.1 .................................................................................................... 33
5.3.2 .................................................................................................................... 34
5.3.3 .................................................................................................................... 35
5.3.4 ............................................................................................................ 36
5.3.5 ................................................................................................ 36
5.4 .............................................................................................................. 37
()
................................................................................................................ 39
.......................................................................................................................... 41
...................................................................................................................... 42
.......................................................................................................................... 44
............................................................................................................................. 44
................................................................................................................................. 55
................................................................................................................................. 55
................................................................................................................................. 66
()
,,
VS2008 Windows
()
ABSTRACT
Content-based image retrieval, image analysis is a research field. It is called the
CBIR or called content-based visual information retrieval. Content-based image
retrieval aims to check the image in a given context, based on the content of
information or specify search criteria, in the image database to search for and find a
consistent query the corresponding picture.
In recent years, multimedia technology and network technology along with the rapid
development of the traditional keyword-based information retrieval technology does
not meet the requirements, therefore, content-based image retrieval technique came
into being, and aroused wide attention. More and more research on content-based
image retrieval system was developed, and made some research. Such as QBIC
(Query By Image Content), Visualseek, Webseek.
With content-based image retrieval technology, beginning to image retrieval
applications, but for different application areas, there are still a lot of technical issues
remain unresolved. Currently, network technology is already mature, carry out
scientific education through the network means to improve the public's scientific
literacy is necessary. Shanghai has many historical relics of resources to carry out on
digital museum system, universal museum knowledge, for sharing resources,
protection of valuable museum resources to enhance public understanding of history
knowledge in the interest of heritage, of great importance.
The Digital Museum is currently available based on user input keywords to find the
appropriate information, but the returned information is often non-human means.
Because people may have only a general impression of heritage, did not know that
this is what dynasties, with what meanings. So how heritage rough impression of
people in large numbers to find the appropriate Museum accurate information to help
()
people understand the history of the relevant institutions to promote heritage, display
artifacts are meaningful.
Using VS2008, in the Windows environment to develop a practical content-based
retrieval system artifacts. System can be entered by the user or their own heritage
picture drawn sketch, automatically extracting the shape features, then retrieve with
a similar heritage, and return relevant information.
()
1.1
20 70
()
90
content-based image
retrieval
GoogleYahoo MSN
query by
image example
1)
()
2)
3)
2
3 4
5
1.2
1.2.1
()
1.2.1.1
:
(1)(ColorHistogram),
,
,
RGB,CIE,HSI,HSV ,
,,
[5],
(2)(Color Correlogram)
,,
()
,,,
(3)(Color Moment)
,
,
(4)(Color Coherence Vectors,CCV)
,
,
1.2.1.2
Tamura
,,
,
:
(1)
,
,Haralick
[9],
,
Tamura
10
()
[10],
:(Coarseness)(Contrast)
(Directionality)(Linelikeness)(Regularity)(Roughness)
Tamura :
,()
Tamura QBIC MARS
(2)
,
Carlucci[9]
,Lu and Fu[9]
, 99 ,
,
(3),
,
[13](Multi-Resolution Simultane-ous Autoregressive,MRSA)
MRSA ,
(4),[15]
Gabor [14](PyramidWaveletTransform,PWT)[11]
(Tree Wavelet Trans-form,TWT)[11]Manjunath and Ma[14]
,Gabor , TWT PWT,
MRSA,,
1.2.1.3
11
()
,,
:
,
(1),
,
[16],
,
[20] ,
Delaunay ,
,
[17],,
(2)
,,
[18]
,,
,,[19]
,,
12
()
,,
,
1.2.1.4
,
,,
,
,
,
,,
1 ,
,,[17]
[29]
:
,
Tamura Gabor
,
1.2.2
13
()
1.2.3
semantic gap
relevance feedback
image segmentation
1.3
IBM QBIC
QBIC IBM
QBIC IBM QBIC UIUC
MARS MIT PhotobookUC Berkeley Digital Library Project
Columbia VisualSEEk
CBIR CT
14
()
()
:
(1)
(2)
(3)
(4)
()
15
()
(5)
,:
(1),
,
(2)
,
,
,,,
,
(3),
,
,
,,
,
,,
16
()
2.1
,,
2.2
Jain , ,
, ,
Jain , ,
400
, ,
17
()
Fourier
Descriptors) ,
, ,
,
,
,
(Local) ( Global)
, ,
, ,
,
18
()
3.1
Moment Invariants
R R p+q
xC , yC
Hu [13] 7
[14]
Yang Albregtsen Green
Kapur
[15]Gross Latecki
19
()
[15][16]
3.2
Fourier shape descriptors
xSyS 0
N-1 N
curvature
functioncentroid distancecomplex coordinates
function
K(s)
(s)
xC, yC
20
()
DC
|F-i| = |Fi |
Fi i
DC
MM 2n = 64
21
()
3.3
0.2 0.8
22
()
4.1
4.2
:
2
:
1.
2.
()
23
()
3.
4.
5.
()
6.
7.
;;
24
()
4.3
4.3.1
25
()
3
3 filename size 8
double[]bin
4.3.2
public class HuNode
{
public string filename;
public int size = 7;
public double[] bin;
public HuNode()
{
bin = new double[size];
}
};
4
size Hu 7
7
26
()
4.3.3
cvLoadImage
Image_Re Image_Im DFT
4.3.4
FD
7
27
()
4.4
4.4.1
28
()
4.4.2
OpenCV Canny
29
()
30
()
5.1
CBIR
Web of ScienceSCI1997-SSCI2000-A&HCI2001-CPCI1997-
TI=(image OR(image re-trieval)OR(content based
image retrieval)OR CBIR)AND TS=((performance evaluat*)OR(performance
as-sess*)) LISAACMEBSCOhostEmeraldSpringLink
10 10 CBIR
1996
2002 CBIR
CBIR
CBIR
CBIR
10
CBIR
CBIR
31
()
5.2 CBIR
Corel
Guojun Lu
32
()
N M
NM
5.3 CBIR
CBIR
CBIR
5.3.1
11
P=|Ra|/|A| R=|Ra|/|R|=|A|-|Ra|/|A|
Tan Kian-Lee Pnormal
Rnormal
33
()
N L
ri i
CBIR
CBIR
CBIR CBIR
5.3.2
Sameer Antani
CBIR
P CBIR Dm
Td
34
()
CBIR
N Td N1 Td
(N3)
M=(N2) Td (M1)(N3)
Td 250 Td
31250
CBIR
5.3.3
Tk
k=12M qTi N
Pj 19
RR 100%
CBIR
SPj
35
()
5.3.4
s Si
E CBIR
CBIR
5.3.5
Henning Muller 22
N NR
K1
K2 KNRRi Ki
Ri=1N[01] 0
Henning Muller 0.5
36
()
PR
CBIR
Suc-cess of Target Search STS
average match percentile AMPtau
5.4
,
,
,
,
, .
, ,
654
,
(query by example), (query
im age) , ,
, 10
, ( retrieval rate)
.
Rate = N~/N
, N~,
QSN N QS
,
, ,
37
()
38
()
MPEG-7
MPEG-7
CBIR ,
,, CBIR
, CBIR :
(1)CBIR
,
(Relevance Feedback)
,,
,,,
,
(2) MPEG-7 MPEG MPEG-7
(Multimedia Content Interface)[21,22]
,
(Visual Descriptor)
,
MPEG-7
(3)(Region-based)
(Object Level)[34]
,
,,
39
()
(4),
,
(5)
40
()
41
()
[1] 2003
12
[2]
200504
[3]
200110
[4]
200404
[5]
200407
[6]
200512
[7]
200407
[8] T. Sikora,The MPEG-7 Visual Standard for Content Description
An Overview,IEEE Trans. Circuits Syst. Video Technol., vol. 11, pp.
696-702, June 2001
[9] John R. Smith and Shih-fu Chang,VisualSEEk: a fully automated
content-based image query system,ACM Multimedia,1996,8798
[10] Jain A K, Vailaya A. Image Retrieval Using Color and Shape. Pattern
Recognition ,1996 ,29 (8)
[11] Zahn C T, Roskies R Z.Fourier Descriptors for Plane Closed Curves.
IEEE Transactions on Computers,1972,(21)
[12] Rui Y etal. Modified Fourier Descriptor for Shape Representation:
A Practical Approach. Proc. 1st International Workshop on Image Database
and Multimedia Search ,Amsterdam ,the Netherlands ,1996
42
()
43
()
using System;
using System.Collections.Generic;
using System.Collections;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Numeric;
using Emgu.CV;
using Emgu.CV.Structure;
using Emgu.CV.CvEnum;
using Emgu.Util;
using System.IO;
namespace CBIR
{
public partial class Form1 : Form
{
string HuFeature;
string FDFeature;
string ImgSearch;
string[] picImg = new string[10];
public class FDNode
{
public string filename;
public int size = 8;
public double[] bin;
public FDNode()
{
bin = new double[size];
}
};
public class HuNode
44
()
{
public string filename;
public int size = 7;
public double[] bin;
public HuNode()
{
bin = new double[size];
}
};
public struct ImgNode
{
public string filename;
public double dist;
};
public void GetFD(ref FDNode node)
{
IntPtr img = CvInvoke.cvLoadImage(node.filename,
LOAD_IMAGE_TYPE.CV_LOAD_IMAGE_GRAYSCALE);
IntPtr image_Re = CvInvoke.cvCreateImage(CvInvoke.cvGetSize(img),
IPL_DEPTH.IPL_DEPTH_64F, 1);
IntPtr image_Im = CvInvoke.cvCreateImage(CvInvoke.cvGetSize(img),
IPL_DEPTH.IPL_DEPTH_64F, 1);
IntPtr Fourier = CvInvoke.cvCreateImage(CvInvoke.cvGetSize(img),
IPL_DEPTH.IPL_DEPTH_64F, 2);
IntPtr dst = CvInvoke.cvCreateImage(CvInvoke.cvGetSize(img),
IPL_DEPTH.IPL_DEPTH_64F, 2);
CvInvoke.cvConvertScale(img, image_Re, 1, 0);
CvInvoke.cvZero(image_Im);
CvInvoke.cvMerge(image_Re, image_Im, IntPtr.Zero, IntPtr.Zero, Fourier);
CvInvoke.cvDFT(Fourier, dst, CV_DXT.CV_DXT_FORWARD, 0);
double DFTfactor = CvInvoke.cvGet2D(dst, 0, 0).v0;
45
()
46
()
{
string[] part = filename.Split('\\');
string[] name = part[part.Length - 1].Split('.');
string ret = name[0] + "_canny.jpg";
return ret;
}
public void canny(string filename)
{
IntPtr img = CvInvoke.cvLoadImage(filename,
LOAD_IMAGE_TYPE.CV_LOAD_IMAGE_GRAYSCALE);
IntPtr imgCanny = CvInvoke.cvCreateImage(CvInvoke.cvGetSize(img),
IPL_DEPTH.IPL_DEPTH_8U, 1);
CvInvoke.cvCanny(img, imgCanny, 50, 150, 3);
//CvInvoke.cvSaveImage(cannyName(filename), imgCanny);
CvInvoke.cvNamedWindow("image");
CvInvoke.cvShowImage("image", imgCanny);
CvInvoke.cvReleaseImage(ref img);
CvInvoke.cvReleaseImage(ref imgCanny);
}
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
}
private void button1_Click(object sender, EventArgs e)
{
OpenFileDialog fileDialog = new OpenFileDialog();
if (fileDialog.ShowDialog() == DialogResult.OK)
47
()
HuFeature = fileDialog.FileName;
textBox1.Text = HuFeature;
}
private void button2_Click(object sender, EventArgs e)
{
OpenFileDialog fileDialog = new OpenFileDialog();
if (fileDialog.ShowDialog() == DialogResult.OK)
FDFeature = fileDialog.FileName;
textBox2.Text = FDFeature;
}
private void button3_Click(object sender, EventArgs e)
{
OpenFileDialog fileDialog = new OpenFileDialog();
if (fileDialog.ShowDialog() == DialogResult.OK)
ImgSearch = fileDialog.FileName;
textBox3.Text = ImgSearch;
}
public double CalNodeDistFD(ref FDNode node1, ref FDNode node2)
{
double dist = 0;
for (int i = 0; i < node1.size; i++)
dist += (node1.bin[i] - node2.bin[i]) * (node1.bin[i] - node2.bin[i]);
return Math.Sqrt(dist);
}
public double CalNodeDistHu(ref HuNode node1, ref HuNode node2)
{
double dist = 0;
for (int i = 0; i < node1.size; i++)
dist += (node1.bin[i] - node2.bin[i]) * (node1.bin[i] - node2.bin[i]);
return Math.Sqrt(dist);
}
public void CalDatabaseDistFD(string database, ref FDNode node1, ref ImgNode[]
imgNode, double weight, ref int size)
{
48
()
49
()
if (i == -1)
imgNode[size].filename = word;
else
node2.bin[i] = Double.Parse(word);
i++;
}
imgNode[size++].dist += CalNodeDistHu(ref node1, ref node2) * weight;
}
}
private void button4_Click(object sender, EventArgs e)
{
Object selectedItem = comboBox1.SelectedItem;
FDNode nodeFD = new FDNode();
HuNode nodeHu = new HuNode();
nodeFD.filename = nodeHu.filename = ImgSearch;
pictureBox1.Image = Image.FromFile(ImgSearch);
//canny(ImgSearch);
int method = comboBox1.Items.IndexOf(selectedItem);
const int ImgSize = 10000;
ImgNode[] imgNode = new ImgNode[ImgSize];
for(int i = 0; i < ImgSize; i++)
imgNode[i].dist = 0;
int size = 0;
if (method == 0)
{
GetHu(ref nodeHu);
CalDatabaseDistHu(HuFeature, ref nodeHu, ref imgNode, 1, ref size);
}
else if (method == 1)
50
()
{
GetFD(ref nodeFD);
CalDatabaseDistFD(FDFeature, ref nodeFD, ref imgNode, 1, ref size);
}
else
{
GetFD(ref nodeFD);
CalDatabaseDistFD(FDFeature, ref nodeFD, ref imgNode, 0.9, ref size);
size = 0;
GetHu(ref nodeHu);
CalDatabaseDistHu(HuFeature, ref nodeHu, ref imgNode, 0.1, ref size);
}
for(int i = 0; i < size; i++)
for(int j = i + 1; j < size; j++)
if (imgNode[i].dist > imgNode[ j].dist)
{
ImgNode t = imgNode[i];
imgNode[i] = imgNode[ j];
imgNode[ j] = t;
}
pictureBox2.Image = Image.FromFile(imgNode[0].filename);
pictureBox3.Image = Image.FromFile(imgNode[1].filename);
pictureBox4.Image = Image.FromFile(imgNode[2].filename);
pictureBox5.Image = Image.FromFile(imgNode[3].filename);
pictureBox6.Image = Image.FromFile(imgNode[4].filename);
pictureBox7.Image = Image.FromFile(imgNode[5].filename);
pictureBox8.Image = Image.FromFile(imgNode[6].filename);
pictureBox9.Image = Image.FromFile(imgNode[7].filename);
pictureBox10.Image = Image.FromFile(imgNode[8].filename);
pictureBox11.Image = Image.FromFile(imgNode[9].filename);
for (int i = 0; i < 10; i++)
picImg[i] = imgNode[i].filename;
/*
for (int i = 0; i < 10; i++)
canny(imgNode[i].filename);
*/
}
51
()
52
()
GetHu(ref node);
sw.Write(filename);
for (int i = 0; i < node.size; i++)
{
sw.Write(" ");
sw.Write(node.bin[i]);
}
sw.WriteLine();
}
sr.Close();
sw.Close();
}
private void pictureBox1_Click(object sender, EventArgs e)
{
canny(ImgSearch);
}
private void pictureBox2_Click(object sender, EventArgs e)
{
canny(picImg[0]);
}
private void pictureBox3_Click(object sender, EventArgs e)
{
canny(picImg[1]);
}
private void pictureBox4_Click(object sender, EventArgs e)
{
canny(picImg[2]);
}
private void pictureBox5_Click(object sender, EventArgs e)
{
canny(picImg[3]);
}
53
()
54
()
55
()
I.INTRODUCTION
Nowadays personal digital photos are becoming a commonand valuable form of
personal information. As a large numberof family photos and other personal images
pile up, usersencounter severe difficulties with the management and retrievalof them,
especially when they want to find a desired oneamong tens of thousands of photos
using just a simple query.Traditional photo management ways based on
filefolders/albums are far from such requirement. As a result,effective management
of these large personal photo collectionsis becoming indispensable.
56
()
57
()
Mary, Tom), and Event (eg.Gathering, Sport, Visit) etc., are important cues for
photobrowsing and researching. By the frequency statistics of tagsappearing on the
Flickr website, we chose some day to dayvocabularies often used by people to mark
their photos. About120 initialized concepts are included in the
FamilyAlbumontology, in which about 12 are regarded as core top-levelconcepts. By
the bottom-up method, these concepts arearranged hierarchically in the ontology
referencing to thestructure of WordNet. In the hierarchy, the relationshipbetween
concepts can be subsumption. As show in fig. 1(a), theleft diagram is a hierarchy of
Photo class. In the right diagram,PhotoID, TakeTime, Width etc. are DataType
properties ofPhoto class. These intrinsic attributes are assigned to eachconcept
identifying it as a unique one in the whole knowledgeframework. Meanwhile, Object
properties are defined to bracethe globe knowledge network, such as
hasTarget,Event_occurTime etc. These extrinsic attributes representthe semantic
relationships between abstract concepts. Theframework of FamilyAlbum is shown in
figure 1(b).
58
()
59
()
instances
If there are appropriate matching
Then go to step 7
Else go to step 4
Step 4: Calculate the similarities based onWordNet between the key phrase and
every classconcept in the ontology
Step 5: Find out the best match concept
Step 6: Create an instance using the key phrase for the best match concept
Step 7: Connect the photo and the instance withproperties defined in the
FamilyAlbum ontology
Step 8: IF all of the key phrases are computed
Then complete annotation process
Else go to step 3
WordNet is a lexical reference system, which organizesEnglish nouns, verbs,
adjectives and adverbs into synonym sets,each representing one underlying concept.
We compute thesimilarity between a key phrase and a concept in theFamilyAlbum
ontology using javasimlib, a Java-based tool thatcomputes the similarity between
words (or synsets) over theWordNet hierarchies based on an information theoretic
metric[11]. Given two words or synsets, javasimlib returns a valuebetween 0 and 1,
where 1 indicates the highest similarity.
C. Metadata and SVM Based Automatic Annotation
An image file created by a digital camera usually containsEXIF metadata in the
file header that includes the mostimportant parameters of camera settings when the
photographwas taken. Different camera manufacturers often produce adifferent set of
EXIF parameters. The parameters we found tobe useful for photo annotations are:
date and time, f-stop,exposure time, flash and focal length.
The date and time parameter could inference some semanticconcepts related
time, such as, Calenda (eg. April 5, 2007),season (eg. spring, autumn) and time
60
()
period (eg. dawn,morning) etc. The EXIF metadata together with the colormoment
(CM) features of images also can be utilized toproduce the semantic concepts of
scene classification. In our work, the libSVM library [12] with the RBF kernel is
used totrain and classify photos. Both the CM features and the EXIFfeatures are
extracted and combined as the input of classifier.The photos are grouped into indoor,
outdoor daytime andoutdoor night automatically. The instances of these conceptswill
be created automatically in the ontology for photoannotation when they are
importing the system. Each photo isallowed to own multiple annotations.
D. Face Detection Based Automatic Annotation
Images contain rich semantic information. Inference of thesemantic concepts
from image content is another way toautomatically annotate photos. Despite the
limitation ofcomputer vision technology, there are still some comparativelymature
techniques that could be used in photo managementsystems, such as face detection.
In this paper, face detectionalgorithm provided by Intels OpenCV library are used
tolocate faces in a photo with reasonable accuracy. Bycalculating the numbers of
faces in the photo, the system canclassify the photos into portrait, group, crowd
andscenery photos etc. Meanwhile, the instances of these classconcepts are created
in the ontology and linked with thecorresponding photo instance by the system
automatically.
E. Semi-automatic Dual-level Annotation Approach
Though we can automatically detect and infer someimportant concepts from text,
EXIF metadata and results offace detection, there are still many difficulties to obtain
moreconcepts with higher accuracy. As a result, a fully automaticphoto annotation
system is unrealistic till now.
A semi-automatic dual-level annotation approach is alsoproposed in our work.
The first layer is a rough annotationlayer, in which the Nature Language Process
(NLP) technique,the face detection technique and EXIF information are used
toextract concepts automatically. The other layer is the accurateannotation layer, in
61
()
which the users can modify inappropriateannotations and create new annotations for
photos manually.The approach of dual-level annotation leverages thetediousness of
fully manual annotation and the inaccuracy offully automatic annotation.
62
()
text items areextracted key words and matched with the concepts in theontology.
Compared with manual matching, the precisions ofthe algorithm on the three
databases are shown in table 2.
63
()
REFERENCES
[1] Kang, H., Shneiderman, B. Visualization Methods for Personal PhotoCollections:
Browsing and Searching in the PhotoFinder. In Proc. ICME2000, New York City,
64
()
65
()
Yanmei Chai, Xiaoyan Zhu, Sen Zhou, Yiting Bian, Fan Bu, Wei Li and Jing Zhu
100084
/ EXIF
OntoAlbum
66
()
/
/
/
PhotoFinder
Show&TellShoeboxOHare
50
FamilyAlbum Ontology
FamilyAlbum
67
()
1995
Flickr
120
FamilyAlbum 12
WordNet
1aPhotoID
TakeTime
hasTargetEvent_occurTime
FamilyAlbum 1b
68
()
1
2 W[]
3 wW[]
7
4
4 WordNet
69
()
5
6
7 FamilyAlbum
8 3
WordNet
FamilyAlbum javasimlib Java
synsets WordNet [11]
synsetsjavasimlib 0 1 1
SVM
EXIF
EXIF
2007
4 5 EXIF
libSVM RBF
EXIF
OpenCV
70
()
EXIF
NLP EXIF
A. SVM
2376 HP
3 833 911
420 107 145
60 1
B.
3
71
()
URLhttp://dancephotography.com/
http://www.twin-springs.com
C.
1433
3
D OntoAlbum
OntoAlbum 2
72
()
GUI OntoAlbum 3
OWL
W3C SPARQL
OWL DL
Person 3
3 OntoAlbum
73
()
EXIF
OntoAlbum
74