Abstract:
Content-based image retrieval (CBIR) techniques are used to retrieve similar images from
image repositories by utilizing the visual contents of the images. Over the last few years, the
bag-of-visual-words (BoVW) model has been the most commonly used model for image retrieval
and has achieved promising results in terms of accuracy and effectiveness. However, the BoVW
model still has some problems: for example, an image is represented as an orderless global
histogram of visual words, which neglects the spatial layout of the image. Spatial information
is an important
component that provides discriminating details for accurate retrieval of images. In this thesis,
three novel approaches to image representation are presented that construct histograms of
visual words over appropriately selected semantic regions of an image. Standard image
databases are used to assess the efficiency of the proposed approaches. The following
approaches are presented in this dissertation:
A novel image representation is presented that combines local and global information in the
form of histograms of visual words. The global information is obtained by constructing a
histogram of visual words over the whole image, while the local information is obtained by
constructing a histogram of visual words over a local rectangular region of the image. The
local histogram represents the spatial information of salient objects. To verify the
performance of the proposed approach, a number of experiments are conducted on standard
image databases (Corel-A, Caltech-256, and Ground truth). The results show that the proposed
image representation significantly enhances the effectiveness of image retrieval.
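
As a rough illustration of the idea above, the following minimal sketch builds one histogram of visual words over the whole image and one over a rectangular region, then joins them. It assumes the keypoints have already been quantized into visual-word indices; the function names, the toy data, the rectangle coordinates, and the choice to concatenate the two histograms are hypothetical and are not taken from the thesis.

```python
import numpy as np

def bovw_histogram(words, k):
    """Count occurrences of each of the k visual words and L1-normalize."""
    hist = np.bincount(words, minlength=k).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def global_local_representation(words, xy, rect, k):
    """Concatenate a whole-image histogram with a histogram over one rectangle.

    words : visual-word index of each keypoint (assumed precomputed)
    xy    : (n, 2) keypoint coordinates
    rect  : (x0, y0, x1, y1), a hypothetical local region of interest
    """
    x0, y0, x1, y1 = rect
    inside = (
        (xy[:, 0] >= x0) & (xy[:, 0] < x1) &
        (xy[:, 1] >= y0) & (xy[:, 1] < y1)
    )
    global_hist = bovw_histogram(words, k)
    local_hist = bovw_histogram(words[inside], k)
    return np.concatenate([global_hist, local_hist])

# Toy data: six keypoints, a dictionary of three visual words.
words = np.array([0, 1, 1, 2, 2, 2])
xy = np.array([[1, 1], [5, 5], [6, 4], [9, 9], [5, 6], [2, 8]], dtype=float)
rep = global_local_representation(words, xy, rect=(4, 4, 8, 8), k=3)
```

The resulting vector has length 2k: the first k entries describe the whole image and the last k entries describe only the keypoints that fall inside the rectangle.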
Based on the semantic similarity within an image, a second image representation is proposed
that splits the image into two rectangular regions and constructs a histogram of visual words
for each region, which adds spatial information to the inverted index of the BoVW-based image
representation. With this representation, separate visual words are obtained for the upper
and lower rectangular regions of an image, which improves image retrieval performance.
To verify the proposed approach, extensive experiments are conducted
on the Corel-A and Ground truth image databases, and the results demonstrate the robustness
of the proposed approach.
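
The effect of splitting an image into upper and lower regions can be sketched with a toy example. The snippet below assumes precomputed visual-word indices, an equal split at half the image height, and unnormalized counts; all names and values are hypothetical. It shows two layouts that an orderless global histogram cannot tell apart but that the two-region representation separates.

```python
import numpy as np

def upper_lower_representation(words, ys, height, k):
    """Concatenate histograms of the upper and lower halves of an image.

    Assumes image coordinates in which y grows downward, and an equal
    split at height / 2 (an illustrative choice).
    """
    upper = ys < height / 2.0
    hist_upper = np.bincount(words[upper], minlength=k)
    hist_lower = np.bincount(words[~upper], minlength=k)
    return np.concatenate([hist_upper, hist_lower])

# Two toy images with the same visual words but vertically flipped layouts.
words = np.array([0, 0, 1, 1])
ys_a = np.array([10.0, 20.0, 80.0, 90.0])
ys_b = 100.0 - ys_a

global_hist = np.bincount(words, minlength=2)   # identical for both images
rep_a = upper_lower_representation(words, ys_a, 100, k=2)
rep_b = upper_lower_representation(words, ys_b, 100, k=2)
```

Both images share the same global histogram, while `rep_a` and `rep_b` differ because each visual word is counted separately in the upper and lower regions.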
To overcome the problems of overfitting on large dictionary sizes and the lack of spatial
information, and to reduce the computational cost, a new image representation based on the
weighted average of triangular histograms (WATH) is also introduced. The image is divided
into four triangular regions in order to incorporate spatial information into the inverted
index of the BoVW-based image representation, and a histogram of visual words is
computed from each triangular region. An appropriate weight is assigned to each histogram
in order to eliminate the aforementioned problems. The assigned weights reduce both the size
of the dictionary, by reducing the non-salient visual words, and the computational cost. The
proposed approach also provides consistent performance on large dictionary sizes. The
quantitative and qualitative analysis conducted on two image databases (Corel-A and
Corel-1500) shows the robustness of the proposed approach compared with recent image
retrieval approaches.
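
A weighted average of triangular histograms could look roughly like the sketch below. The abstract does not specify how the four triangles are formed or how the weights are chosen, so this sketch assumes the image is cut along its two diagonals and uses arbitrary placeholder weights; the function names and toy data are likewise hypothetical.

```python
import numpy as np

def triangle_index(xy, width, height):
    """Assign each keypoint to one of four triangles formed by the two
    image diagonals (an assumed partition): 0 top, 1 right, 2 bottom, 3 left."""
    u = xy[:, 0] / width
    v = xy[:, 1] / height          # y grows downward in image coordinates
    above_main = v < u             # above the top-left to bottom-right diagonal
    above_anti = v < 1.0 - u       # above the bottom-left to top-right diagonal
    region = np.full(len(xy), 3)   # default: left triangle
    region[above_main & above_anti] = 0
    region[above_main & ~above_anti] = 1
    region[~above_main & ~above_anti] = 2
    return region

def wath(words, xy, width, height, k, weights):
    """Weighted average of the four per-triangle histograms (length k)."""
    region = triangle_index(xy, width, height)
    hists = np.stack(
        [np.bincount(words[region == r], minlength=k) for r in range(4)]
    ).astype(float)
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * hists).sum(axis=0) / w.sum()

# Toy data: one keypoint in each triangle of a 100 x 100 image.
words = np.array([0, 1, 1, 2])
xy = np.array([[50, 10], [90, 50], [50, 90], [10, 50]], dtype=float)
regions = triangle_index(xy, 100, 100)
rep = wath(words, xy, 100, 100, k=3, weights=[1, 2, 2, 1])  # placeholder weights
```

Under these assumptions the final vector keeps the dictionary length k rather than growing to 4k, which is one way a weighted average can add spatial information without enlarging the representation.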
Keywords: Content-based image retrieval (CBIR); Bag-of-visual-words (BoVW); Local and
global histograms; Rectangular spatial histograms; Weighted triangular histograms.