The Bag of Visual Words (BoW) is truly a classic approach, originally proposed by Sivich and Sisserman in 2003 [ Paper ]. He was one of the first to deviate from previous methods that preferred global descriptors over local functions such as SIFT and SURF. I recommend continuing with the implementation of this classic pipeline if you are just starting to learn about object detection and recognition.
source share