Unsupervised semantic indoor scene classification for robot vision based on context of features using Gist and HSV-SIFT
Regular Research Article
06 Aug 2013
Faculty of Systems Science and Technology, Akita Prefectural University, Akita, Japan
Abstract. This paper presents an unsupervised scene classification method for actualizing semantic recognition of indoor scenes. Background and foreground features are respectively extracted using Gist and color scale-invariant feature transform (SIFT) as feature representations based on context. We used hue, saturation, and value SIFT (HSV-SIFT) because of its simple algorithm with low calculation costs. Our method creates bags of features for voting visual words created from both feature descriptors to a two-dimensional histogram. Moreover, our method generates labels as candidates of categories for time-series images while maintaining stability and plasticity together. Automatic labeling of category maps can be realized using labels created using adaptive resonance theory (ART) as teaching signals for counter propagation networks (CPNs). We evaluated our method for semantic scene classification using KTH's image database for robot localization (KTH-IDOL), which is popularly used for robot localization and navigation. The mean classification accuracies of Gist, gray SIFT, one class support vector machines (OC-SVM), position-invariant robust features (PIRF), and our method are, respectively, 39.7, 58.0, 56.0, 63.6, and 79.4%. The result of our method is 15.8% higher than that of PIRF. Moreover, we applied our method for fine classification using our original mobile robot. We obtained mean classification accuracy of 83.2% for six zones.