The goal of our project was to build a mobile app that lets users search through Visual Meta products just by taking a picture. As you can imagine, the system was split into two steps.
The first step is the client side: the mobile app that captures the picture and sends it to the server. The second step is the server side, where the image processing happens: digest the image sent over an HTTP request to a Spring Boot server layer, extract its features, search our index, and send a fast response back to the client.
In the next sections we will go through some details of each step.
Our tip: if you want to build a simple mobile app, go for the AngularJS and Ionic combination. (We had some issues with Android security policies, but after some empirical research and Stack Overflow digging you will find the holy grail.)
Allowing users to search for a product on our platform by taking a picture with their mobile phone is very challenging, most notably because of the cluttered backgrounds and arbitrary orientations of the photos users take.
As a first step, we downloaded around 53,500 images of boots and hats for men and women to build our index. Throughout the implementation and experiments, we used a Nexus 5 as our test device.
Our initial approach was to use ORB (Oriented FAST and Rotated BRIEF) from the OpenCV library, which implements both a keypoint detector and a descriptor extractor. Its rotation invariance should allow us to detect an object even when its orientation differs (see the graph below).
However, it has problems when the background is similar to the texture of a product: it matches the image against the background instead of the main object.
We tried different matching methods and different distance calculations, but none of the results were very satisfying, and matching was also quite slow. Since we need to return a result within a second of the user submitting an image, this was not the way to go.
As an alternative, we tried the perceptual hashing approach, which constructs a hash that gives near-identical values for similar images. Rather than trying to eliminate the background effect, we cropped the image automatically by detecting its contour, with a bit more help from the user cropping on the client side.
As an additional preprocessing step (data augmentation), we also rotated each image into four different orientations, so that the index returns the most similar image regardless of its orientation.
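The four-orientation augmentation can be sketched in a few lines of pure Python; `hash_fn` stands in for whichever perceptual hash is used, and the helper names are ours:

```python
def rotate90(pixels):
    """Rotate a 2D pixel grid 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]

def orientation_hashes(pixels, hash_fn):
    """Hashes of the image at 0, 90, 180 and 270 degrees; indexing all four
    lets a query match regardless of how the photo was rotated."""
    hashes = []
    for _ in range(4):
        hashes.append(hash_fn(pixels))
        pixels = rotate90(pixels)
    return hashes

# Toy demo with a stand-in hash function (just the grid as a tuple).
grid = [[1, 2], [3, 4]]
hashes = orientation_hashes(grid, lambda p: tuple(map(tuple, p)))
print(len(set(hashes)))  # 4 distinct orientations for an asymmetric image
```

The cost is a 4x larger index, which is a reasonable trade for not having to estimate the query image's orientation at search time.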
The results were very interesting, both in performance and in retrieval quality. Building the index from more than 50K images took less than 2 minutes, and a brute-force search took less than 2 seconds. With the optimizations we proposed after the hackathon, plus proper infrastructure, those numbers are very attractive for a production-ready application.
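The brute-force search itself is just a linear scan over the index, keeping the entry with the smallest Hamming distance to the query hash (a sketch; the index layout and product IDs are made up for illustration):

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def search(index, query_hash):
    """index: list of (product_id, hash) pairs.
    Returns the (product_id, distance) closest to query_hash."""
    best_id, best_dist = None, None
    for product_id, h in index:
        d = hamming(h, query_hash)
        if best_dist is None or d < best_dist:
            best_id, best_dist = product_id, d
    return best_id, best_dist

# Toy index of two products; the query hash matches the first one exactly.
index = [("boot-1", 0b1010), ("hat-2", 0b1111)]
result = search(index, 0b1010)
print(result)  # ('boot-1', 0)
```

Since each comparison is one XOR and a popcount, scanning 50K hashes is cheap, which is consistent with the sub-2-second search times we saw.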
Within 3 days we built, from scratch, a mobile image-based search engine using simple techniques and great support from the chosen technologies, which we definitely recommend to any other developer.
Wen-Ru Sheu (Backend Developer)
Claudio Augusto do Lago Villar (Product Manager)
Ionic Framework: http://ionicframework.com/
Spring Boot: http://projects.spring.io/spring-boot/