I have been playing around with imgSeek for a while. It is quick, easy, and a nice entry point to the Content-Based Image Retrieval field. Apart from the problems I had getting it to work on my Mandriva box, I have no complaints except for this:
sim(X,X) < 1 (quite often, I would say, in my 100-image test)
Oh, my God! I had never seen this before (except as a bug). I have been working on Information Retrieval for years, and seeing this is like seeing P(X) = 2 or so. How is it possible that the similarity between an object and itself is less than one on a [0,1] scale?
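For the record, this is the kind of sanity check I have in mind. It is only a rough sketch: `find_reflexivity_violations`, `cosine`, and `broken` are made-up names for illustration, the feature vectors are toy data, and none of this is imgSeek's actual API or its similarity measure.

```python
import math

def find_reflexivity_violations(items, sim, tol=1e-9):
    """Return (key, score) pairs for which sim(x, x) is not 1 within tol."""
    violations = []
    for key, item in items.items():
        score = sim(item, item)
        if abs(score - 1.0) > tol:
            violations.append((key, score))
    return violations

def cosine(a, b):
    """Toy similarity: cosine of two non-negative feature vectors, in [0, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def broken(a, b):
    """Deliberately unsound similarity: never reaches 1, even for identical inputs."""
    return 0.9 * cosine(a, b)

# Hypothetical feature vectors standing in for images.
images = {"a.jpg": [1.0, 2.0, 3.0], "b.jpg": [0.5, 0.1, 4.0]}

print(find_reflexivity_violations(images, cosine))  # no violations expected
print(find_reflexivity_violations(images, broken))  # every image is reported
```

Running a check like this over the whole test collection would at least tell how many images are affected before reading the paper.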
Ok, imgSeek is based on the paper:
Charles E. Jacobs, Adam Finkelstein, and David H. Salesin. Fast Multiresolution Image Querying. Proceedings of SIGGRAPH 95. pp. 277-286, August 1995.
I am afraid I will have to dig into the paper if I want to understand how this is possible. If the similarity values are not sound, I will not be able to use them for a weighted k-nearest-neighbor approach to pornographic image detection.
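To show why sound similarity values matter here, below is a minimal sketch of similarity-weighted k-nearest-neighbor voting. The function name `weighted_knn`, the labels, and the scores are all assumptions made up for illustration; this is one common variant of the technique, not something taken from imgSeek or from the paper.

```python
from collections import defaultdict

def weighted_knn(neighbors, k):
    """Classify from (similarity, label) pairs.

    Keeps the k most similar neighbors and lets each one vote for its
    label with a weight equal to its similarity score.
    """
    top = sorted(neighbors, key=lambda pair: pair[0], reverse=True)[:k]
    votes = defaultdict(float)
    for similarity, label in top:
        votes[label] += similarity
    # The label with the largest accumulated weight wins.
    return max(votes, key=votes.get)

# Hypothetical similarities between a query image and labeled images.
neighbors = [
    (0.90, "flagged"),
    (0.50, "clean"),
    (0.45, "clean"),
    (0.10, "clean"),
]

print(weighted_knn(neighbors, k=1))  # only the closest neighbor votes
print(weighted_knn(neighbors, k=3))  # two weaker "clean" votes outweigh one "flagged"
```

Since each vote is weighted by the raw score, a measure that cannot even give 1 for identical images makes the weights hard to trust, which is exactly the worry.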