Object recognition in 3D scenes with the use of the SIFT algorithm dedicated for mobile platforms

Author: Karol Matusiak (2012)

The aim of this master's thesis was to implement a mobile application that identifies everyday objects in images recorded by the device's built-in camera. Image identification methods with relatively low computational complexity were analyzed, and the requirements of the application were defined. Based on this analysis, two versions of the application were implemented: one using the standard Scale-Invariant Feature Transform (SIFT), and one combining SIFT with the Features from Accelerated Segment Test (FAST) algorithm for keypoint detection. Both algorithms detect stable characteristic features of images. SIFT builds descriptors that are largely invariant to the conditions of image recording, such as rotation, noise, and changes in scale and brightness. Mobile phones running the Android operating system were selected as the target platform.
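To illustrate why FAST is cheap enough for keypoint detection on a phone, the segment test can be sketched in a few lines. This is a minimal sketch, not the thesis implementation: the 16-pixel circle of radius 3 is the standard FAST layout, while the run length `n=9`, the `threshold` value, and the function names are assumptions chosen for illustration.

```python
# Offsets (dx, dy) of the 16 pixels on a Bresenham circle of radius 3,
# listed in order around the circle.
CIRCLE = [
    (0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
    (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3),
]


def has_circular_run(flags, n):
    """Return True if flags holds n consecutive True values, wrapping around."""
    run = 0
    # Appending the first n - 1 flags lets a run cross the end of the list.
    for flag in flags + flags[: n - 1]:
        run = run + 1 if flag else 0
        if run >= n:
            return True
    return False


def is_fast_corner(image, x, y, threshold=20, n=9):
    """Segment test for the pixel at column x, row y of a grayscale image.

    image is a list of rows of intensities. The caller must keep (x, y) at
    least 3 pixels away from every border. threshold and n are assumed
    values, not taken from the thesis.
    """
    center = image[y][x]
    ring = [image[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [p > center + threshold for p in ring]
    darker = [p < center - threshold for p in ring]
    # A corner needs n contiguous ring pixels all brighter or all darker.
    return has_circular_run(brighter, n) or has_circular_run(darker, n)
```

The test only compares 16 intensities against the center pixel, with no image pyramid or gradient computation, which is consistent with the speed-up reported for the FAST-based version.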
The application is operated through the device's touch screen, and images are captured with its built-in digital camera. The user can build a database of patterns and compare recorded images against these patterns using a modified nearest neighbor classifier.
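The comparison of a recorded image against the pattern database can be sketched as nearest neighbor matching of descriptors. The abstract does not say how the classifier was modified, so the sketch below assumes one common variant, the distance ratio test with a ratio of 0.8; the function names, the `min_matches` parameter, and the toy two-dimensional descriptors are hypothetical (real SIFT descriptors have 128 dimensions).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_matches(query, pattern, ratio=0.8):
    """Count query descriptors whose nearest neighbor in pattern is distinctive.

    A match is accepted only if the nearest distance is smaller than
    ratio times the second-nearest distance (assumed modification).
    """
    if len(pattern) < 2:
        return 0  # the ratio test needs two candidate neighbors
    matches = 0
    for q in query:
        best, second = sorted(euclidean(q, p) for p in pattern)[:2]
        if best < ratio * second:
            matches += 1
    return matches


def classify(query, database, ratio=0.8, min_matches=2):
    """Return the name of the pattern with the most matches, or None."""
    best_name, best_count = None, 0
    for name, pattern in database.items():
        count = count_matches(query, pattern, ratio)
        if count > best_count:
            best_name, best_count = name, count
    return best_name if best_count >= min_matches else None


# Usage with toy descriptors: the query lies close to the "mug" pattern.
database = {
    "mug": [[0, 0], [10, 10]],
    "book": [[100, 100], [120, 80]],
}
print(classify([[0.5, 0], [10, 9.5]], database))
```

Rejecting ambiguous neighbors trades a few missed matches for fewer false ones, which matters when the database holds only a handful of patterns per object.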
The application was tested on 4 objects, whose patterns were created and stored in the database; 12 test photos were then compared against the database contents. The tests showed that the version using FAST runs 3.5 times faster than the version using standard SIFT. Both versions achieved the same rate of correct detections, 83%, although standard SIFT was more accurate. The summary assesses the application's performance and proposes possibilities for its improvement and further development.
The application can be used as an aid for people with visual impairments, as a city or tourist guide, or in virtual reality systems.