I am helping a farm cluster its roosters into groups according to their crow, so that roosters with similar crows live together. The farmer wants to know whether roosters pick up behavior from one another. If they do, then whenever he gets a new rooster he will put it in a group of well-behaved roosters, hoping they will have a good influence on the newcomer. My job is to record how similar the crows are within each group, then compare the results after a few weeks to see whether similarity within the groups is growing.
My idea is to write a program that gives a similarity score for two input WAV files. Each rooster can then be paired with its closest match, similar pairs can be merged into larger groups, and the process ends with several groups.
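For illustration, the pairing-then-merging idea could be sketched roughly as below. This is only a sketch under assumptions: `similarity` is a hypothetical callback standing in for the WAV comparison (higher score means more alike), the scores in `SCORES` are made up, and averaging the scores between two groups is just one possible way to compare groups.

```python
from itertools import combinations


def group_by_similarity(names, similarity, n_groups):
    """Merge the two most similar groups repeatedly until n_groups remain.

    `similarity(a, b)` is a hypothetical callback returning a score for two
    recordings (higher means more alike); it stands in for the WAV comparison.
    """
    groups = [[name] for name in names]

    def group_score(g1, g2):
        # Assumed rule: average the scores over all cross-group pairs.
        scores = [similarity(a, b) for a in g1 for b in g2]
        return sum(scores) / len(scores)

    while len(groups) > max(n_groups, 1):
        # Pick the pair of groups with the highest average similarity.
        i, j = max(
            combinations(range(len(groups)), 2),
            key=lambda ij: group_score(groups[ij[0]], groups[ij[1]]),
        )
        merged = groups[i] + groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)]
        groups.append(merged)
    return groups


# Made-up scores purely for illustration (not measured from real recordings).
SCORES = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.2,
    frozenset(("A", "D")): 0.1,
    frozenset(("B", "C")): 0.3,
    frozenset(("B", "D")): 0.2,
    frozenset(("C", "D")): 0.8,
}


def lookup(a, b):
    return SCORES[frozenset((a, b))]


groups = group_by_similarity(["A", "B", "C", "D"], lookup, n_groups=2)
print(groups)
```

With these made-up scores, roosters A and B are paired first, then C and D, which leaves two groups.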
I have recorded several crows from 3 roosters and analyzed them using spectrograms (two crows per rooster):
Rooster A: (two spectrograms)

Rooster B: (two spectrograms)

Rooster C: (two spectrograms)

Before calculating the similarity, I would like to divide each crow into segments, so that each segment carries a single characteristic frequency (which will later be used to calculate the similarity). My current approach:
Step 1: where the intensity line is broken (a silent gap), split the sound at the gap.
Step 2: where a significant change in frequency occurs, treat that time as a segment boundary.
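For illustration, the two steps could be sketched roughly as follows. This is a minimal sketch under assumptions, not a tested solution for real crows: the samples are assumed to be already decoded from the WAV file into floats between -1 and 1, frequency is estimated crudely from the zero-crossing rate rather than from a spectrogram, and the names `frame_len`, `silence_rms` and `freq_jump` and their default values are hypothetical and would need tuning.

```python
import math


def frame_features(samples, rate, frame_len):
    """Return (rms, rough frequency estimate in Hz) for each frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        # A sine of frequency f crosses zero about 2*f times per second.
        freq = crossings * rate / (2.0 * frame_len)
        feats.append((rms, freq))
    return feats


def segment(samples, rate, frame_len=400, silence_rms=0.02, freq_jump=1.3):
    """Split a recording into (start_sec, end_sec, mean_freq) segments.

    Step 1: a frame quieter than `silence_rms` closes the current segment.
    Step 2: a frame whose frequency differs from the segment's running mean
    by more than the ratio `freq_jump` starts a new segment.
    """
    feats = frame_features(samples, rate, frame_len)
    segments = []
    start, freqs = None, []
    # The extra silent frame at the end flushes the last open segment.
    for i, (rms, freq) in enumerate(feats + [(0.0, 0.0)]):
        silent = rms < silence_rms
        if freqs:
            mean = sum(freqs) / len(freqs)
            ratio = max(freq, mean) / max(min(freq, mean), 1e-9)
            if silent or ratio > freq_jump:
                segments.append(
                    (start * frame_len / rate, i * frame_len / rate, mean)
                )
                freqs = []
        if not silent:
            if not freqs:
                start = i
            freqs.append(freq)
    return segments


def tone(freq, seconds, rate):
    """Synthetic sine tone used only to exercise the sketch."""
    n = int(seconds * rate)
    return [0.5 * math.sin(2 * math.pi * freq * k / rate) for k in range(n)]


RATE = 8000
signal = (
    tone(500, 0.2, RATE)      # first sound
    + [0.0] * int(0.1 * RATE)  # silent gap (step 1)
    + tone(500, 0.2, RATE)    # second sound ...
    + tone(1000, 0.2, RATE)   # ... that jumps in frequency (step 2)
)
segments = segment(signal, RATE)
for seg in segments:
    print(seg)
```

On this synthetic signal the sketch yields three segments: one before the gap, and two after it, split where the tone jumps from about 500 Hz to about 1000 Hz. Real recordings are noisier, so the thresholds and the zero-crossing estimate may well be too crude there.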
I am not sure whether the steps described above are sufficient. I would appreciate any better suggestions on how to improve the segmentation. Are there any established methods or algorithms for my situation? Thanks!