Abstract:
When working with digital images, one often needs to search for a specific image on the
basis of its visual contents. Content based image retrieval (CBIR) is one of the modern
ways to search huge digital image repositories for specific images. With the growing
usage of the World Wide Web, CBIR has been widely adopted in websites, software and
database systems. For this reason, CBIR has increasingly attracted the interest of
application designers and researchers who seek to enhance its efficiency and keep pace
with its growing demand. Both the sizes of image databases and the range of image
retrieval applications are increasing, and more sophisticated algorithms and techniques
are being developed to meet the increasingly complex requirements of content based image
retrieval applications. Different CBIR systems take different approaches to finding
images based on their contents and differ in their performance and accuracy measures.
Five algorithms
have been proposed in this thesis and are briefly described below. Quantitative analysis
of the proposed algorithms has been performed with the widely used and accepted measures
of precision and recall, and a comparison has also been performed with other
state-of-the-art descriptors, which demonstrates the reliability and superiority
of the proposed methods.
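The precision and recall measures referred to above can be illustrated with a minimal
sketch. The function names and the idea of cutting a ranked list at k images are
assumptions made for illustration only; they are not taken from the thesis.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: list of distinct image ids returned by the system.
    relevant:  collection of image ids that are truly relevant to the query.
    """
    relevant = set(relevant)
    hits = sum(1 for image_id in retrieved if image_id in relevant)
    # Precision: share of retrieved images that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant images that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def precision_recall_at_k(ranked, relevant, k):
    """Evaluate only the first k images of a ranked result list (hypothetical helper)."""
    return precision_recall(ranked[:k], relevant)
```

Evaluating the same ranked list at several values of k shows how precision and recall
trade off as more images are retrieved.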
The first proposed algorithm, “An Efficient Content Based Image Retrieval using EI
Classification and Color Features”, is an effective method for image search and retrieval.
It decomposes an image into cells and then extracts its edges so that the same number of
features is obtained, which reduces the chance of missing edge features. A large number
of edge features helps to find more relevant images. Pixels are classified into edge
pixels and inner pixels, and features are selected from the edge pixels to populate the
database. Moreover, color differences are used to cluster retrieved results of similar
color. The criterion for best performance is to find the maximum number of real edges so
that more relevant images are retrieved. For comparison, the algorithm is first analyzed
with color features only and then with color and shape features combined. The results
show that the combined features obtain higher accuracy. The proposed algorithm is robust
against image scaling, rotation and variation, but it does not perform as well on highly
complicated images. The average precision and recall rates are 90% and 68% respectively.
The allowed image dimensions are between 640x480 pixels and 1024x840 pixels.
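The idea of decomposing an image into cells and classifying pixels as edge or inner
pixels can be sketched as follows. This is only an illustration under stated assumptions:
the simple gradient test, the threshold value, the grid size and the per-cell feature
(fraction of edge pixels) are all hypothetical choices, not the method of the thesis.

```python
def classify_pixels(gray, threshold=50):
    """Label each pixel True (edge) or False (inner) with a simple gradient test.

    gray: grayscale image as a list of rows of integer intensities (assumed format).
    """
    h, w = len(gray), len(gray[0])
    edge = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Differences to the right and lower neighbours, clamped at the border.
            gx = gray[y][min(x + 1, w - 1)] - gray[y][x]
            gy = gray[min(y + 1, h - 1)][x] - gray[y][x]
            edge[y][x] = abs(gx) + abs(gy) > threshold
    return edge


def cell_edge_features(gray, rows=4, cols=4, threshold=50):
    """Return rows*cols features: the fraction of edge pixels in each grid cell."""
    h, w = len(gray), len(gray[0])
    edge = classify_pixels(gray, threshold)
    features = []
    for r in range(rows):
        y0, y1 = r * h // rows, (r + 1) * h // rows
        for c in range(cols):
            x0, x1 = c * w // cols, (c + 1) * w // cols
            count = sum(edge[y][x] for y in range(y0, y1) for x in range(x0, x1))
            area = max((y1 - y0) * (x1 - x0), 1)  # guard against empty cells
            features.append(count / area)
    return features
```

Because the grid has a fixed number of cells, the feature vector has the same length for
images of any size, which is what makes the features comparable between images.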
In the second proposed algorithm, “Powerful Descriptor for Image Retrieval Based on
Angle Edge and Histograms”, a new angle orientation histogram has been introduced. By
applying the Pythagorean theorem to the image, very useful characteristics have been
obtained for image matching, search and retrieval. This technique gives satisfactory
results even for complicated images, such as images containing one or more complex
objects or a natural scene with many objects. It divides the image by angle for finer
detail and extracts inner and edge
pixels. It then computes a histogram to carefully analyze the changes of edges in the
image. The technique is also robust against image scaling, rotation and translation for
both simple and complicated images. The criterion for best performance is the retrieval
of the most complicated images, such as those containing more than one animal. It is
compared with the existing techniques AOP, ARP, MPEG-7, SIFT and EPOH, and the results
show that the proposed methodology achieves greater relevancy. Because it uses a center
point in the process of image decomposition, the method is suitable for images of
different sizes: it always divides the image into equal parts and produces the same
number of features, which remain comparable even between images of different size. This
technique raises the precision and recall rates to 94% and 71% respectively.
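The role of the center point can be illustrated with a small sketch that bins edge
pixels into equal angular sectors around the image center. The function name, the number
of sectors and the normalization are assumptions made for illustration; this is not the
descriptor defined in the thesis.

```python
import math


def angle_histogram(edge_points, width, height, sectors=8):
    """Normalised histogram of edge pixels over equal angular sectors.

    edge_points: iterable of (x, y) edge pixel coordinates (assumed input format).
    The sectors are measured around the image centre, so the number of features
    is always `sectors`, whatever the image size.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    hist = [0] * sectors
    for x, y in edge_points:
        # Angle of the pixel relative to the centre, mapped into [0, 2*pi).
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        index = min(int(angle / (2 * math.pi) * sectors), sectors - 1)
        hist[index] += 1
    total = sum(hist)
    # Dividing by the total makes the histogram independent of the edge count.
    return [count / total for count in hist] if total else [0.0] * sectors
```

In this sketch, scaling an image about its center leaves the angle of every edge pixel
unchanged, so the normalized histogram stays the same.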
The third proposed algorithm, “Content based Image Retrieval by Combining
Multiple Features of Shape and Color”, introduces a new shape feature extraction
model named the symmetry, area, direction angle and arc length features (SADAF) model.
The SADAF model is generic in nature. In this approach, objects are first located by
image segmentation. The SADAF model is then applied to extract the visual shape
contents of the image. After the image matching, searching and retrieval process, the
retrieved results are ordered by their color distance from the query image, calculated
through color histograms. The algorithm is compared with the existing techniques FD, CSS
and SRD, and the best performance results are obtained for the proposed methodology.
SADAF is robust to image alterations such as changes in size, scale and orientation
because it uses symmetry, area, direction angle and arc length as image features. The
average precision and recall rates are 79% and 68% respectively. The algorithm has a
higher processing time, which is directly proportional to the number of images to be
retrieved, so it initially sets 50 images to be retrieved for a query image. In this
algorithm, when the number of images to be retrieved is low, precision increases and
recall decreases.
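Ordering retrieved results by their color distance from the query image can be sketched
as below. The quantization into 4 levels per channel, the L1 distance and all function
names are assumptions chosen for illustration; the thesis may use a different color
space, histogram or distance.

```python
def color_histogram(pixels, bins=4):
    """Normalised joint RGB histogram with `bins` levels per channel.

    pixels: list of (r, g, b) tuples with values in 0-255 (assumed input format).
    """
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        index = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
        hist[index] += 1.0
    n = len(pixels)
    return [value / n for value in hist] if n else hist


def histogram_distance(h1, h2):
    """L1 distance between two histograms (0 means identical color distribution)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def rerank_by_color(query_pixels, candidates, bins=4):
    """Sort candidate image ids by color distance to the query, closest first.

    candidates: dict mapping an image id to its list of (r, g, b) pixels.
    """
    query_hist = color_histogram(query_pixels, bins)
    return sorted(
        candidates,
        key=lambda image_id: histogram_distance(
            query_hist, color_histogram(candidates[image_id], bins)
        ),
    )
```

Here the shape-based matching stage is assumed to have already produced the candidate
set; the color histogram only decides the order in which the candidates are shown.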
In the fourth proposed algorithm, “Content Based Image Retrieval by Shape, Color and
Relevance Feedback”, image content is identified using the combined features of
shape, color and relevance feedback, which is a key operation for a successful CBIR
system. By adopting the strategy of combining multiple features of shape, color and
relevance feedback for the retrieval of images, very successful results have been
obtained. For true image representation, a semantically accurate representation of shape
is essential to achieve correct image matching and retrieval, so shape is used as the
primary feature to identify the relevant images, whereas color and relevance feedback
are used as supporting features to make the system more efficient and accurate. A good
balance between precision and recall is necessary for better system performance and a
higher degree of relevancy, so the best performance of the proposed algorithm is
achieved when the image is segmented more accurately. It presents an average result of
0.79 against FD, CSS, ART and IM. Average precision for 60 retrieved images is 88%,
while for 100 it decreases to 68%. Thus, although the technique is good at finding
relevant images, its precision decreases as the number of images to be retrieved
increases.
The fifth proposed algorithm, “Content Based Image Retrieval Using Combined
Features of Shape, Color and Relevance Feedback”, presents a unique approach based on
the combined features of shape, color and relevance feedback. A new and effective
methodology has been introduced for shape calculation and representation. Shape
features are estimated through second derivative, least squares polynomial and shape
coding methods. Color is estimated through the max-min mean of neighborhood intensities.
The methodology for relevance feedback is also novel because it is based on determining
the key points of the query image, which helps to retrieve the best matched images. The
index table is automatically populated with features taken from the query and database
images, without involving the user. The criterion for best performance is the highest
probability counted for an image. The proposed algorithm has been compared with
existing color histogram based techniques: CCM; HSI color histogram and CCM; HSI color
histogram, CCM and edge histogram descriptor; and HSI color histogram, CCM, edge
histogram descriptor and relevance feedback. A precision of 90% and a recall of 45% have
been obtained for the proposed methodology. It is the best among all the proposed
techniques, as its performance does not decrease when the number of retrieved images
increases, and the results also show that the best matched images are
retrieved through it.
The criterion for best performance in CBIR is high precision together with high recall.
The results show that the proposed methods achieve a good balance of precision and
recall in minimum retrieval time, along with robustness against geometric attacks, the
lack of which has been a major drawback in the current literature. The proposed
algorithms achieve precision rates of 66%-100% and recall rates of 68%-80%. Some
recommendations are given at the end of the thesis to overcome various limitations
present in the field of CBIR.