added more information

This commit is contained in:
2025-04-07 00:40:21 +02:00
parent af81c82d18
commit a9d3d10da9
16 changed files with 27549 additions and 2515 deletions

View File

@@ -31,6 +31,7 @@ Another path is this paper I found while searching:
This paper employed ORB algorithm on the spectrogram image, which is interesting. But the paper specifically says that it is tested for music identification. Not ASMR audio. Although I am sure that a spectrogram is just another image for the ORB algorithm. But usual length for ASMR audio ranges from short minutes to hours long audio. And I am not sure if ORB is able to handle such extreme image proportions (extremely large images, with the audio length proportional to the X dimension of the image).
One of the ways I came up is to probably chop the audio into pieces, and then running the ORB algorithm to extract the features, that way we don't end up with extraordinary image sizes for the spectrogram, but I am not sure of its effectiveness. So I will also have to experiment with that.
We can always just use existing Music fingerprinting programs like Shazam or Chromaprint? But I highly doubt its effectiveness.
So my current approach will be experimenting these two ways using the local DLSite audio that I have. And compare the results between each other.