I’m having trouble finding the “perfect” data format for a video recommendation system based on brain.js. Assume we have the following code:
import brain from 'brain.js';
const net = new brain.NeuralNetwork();
const input = []; // training data; the variants are shown below
net.train(input);
Let’s say we want to track which videos were shown to the user in a list and which ones he actually watched. I’ll get data like this:
{input:{'video-1':1,'video-2':1,'video-3':1},output:{'video-3':1}}
In this case, the user saw video 1, video 2 and video 3 in the list and decided to play video 3.
Now, I do the following:
net.run({'video-3':1});
This will output video-3 as the recommendation. This is logical, but what I want to find out is which other videos the user played when video 3 was in his list. So I would change the data format to the following:
{input:{'video-1':1,'video-2':1},output:{'video-3':1}}
In this case, I have all other videos as input and the ones that have been played as output. This works fine if the user has played only one video, but I run into another issue if the user has multiple output values:
{input:{'video-1':1},output:{'video-2':1,'video-3':1}}
Okay, this one is not perfect, because video 2 could also be a recommendation for video 3 and vice versa (the user has also seen videos 2 and 3 in his list, but we filtered both out of the input because they are in the output). So in the end, I decided to create a separate object for each output:
[
{input:{'video-1':1,'video-3':1},output:{'video-2':1}},
{input:{'video-1':1,'video-2':1},output:{'video-3':1}}
]
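The splitting step described above could be sketched roughly like this. It is only an illustration: `toTrainingSamples`, `shown` and `played` are hypothetical names that are not part of brain.js, and it assumes each session is available as a list of shown video ids plus a list of played ids.

```javascript
// Hypothetical helper: build one training sample per played video.
// `shown`  = ids of all videos the user saw in the list (assumed shape)
// `played` = ids of the videos the user actually played (assumed shape)
function toTrainingSamples(shown, played) {
  return played.map((target) => {
    const input = {};
    for (const id of shown) {
      // Keep every other shown video as input, including other played ones.
      if (id !== target) input[id] = 1;
    }
    return { input, output: { [target]: 1 } };
  });
}

const samples = toTrainingSamples(
  ['video-1', 'video-2', 'video-3'],
  ['video-2', 'video-3']
);
// samples[0] -> { input: { 'video-1': 1, 'video-3': 1 }, output: { 'video-2': 1 } }
// samples[1] -> { input: { 'video-1': 1, 'video-2': 1 }, output: { 'video-3': 1 } }
```

The resulting array has the same shape as the list above, so it could be passed to net.train() as-is.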
Then I do:
net.run({'video-1':1});
This will output something like this:
video-2: 0.51xxxx
video-3: 0.49xxxx
Great, I have both videos as possible recommendations. But if I add more videos that match, the individual scores decrease. For example, if we get 5 matches, the output would be something like this:
video-2: 0.2xxx
video-3: 0.2xxx
video-4: 0.2xxx
video-5: 0.2xxx
In my app, I check how close the values are to 1 to determine whether a match is valid. So in the end, I would need an output like this:
video-2: 0.95
video-3: 0.92
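To make the validity check concrete, here is a minimal sketch of what it looks like. The function name `validMatches` and the cutoff of 0.9 are assumptions for illustration only; the real app may use a different rule.

```javascript
// Hypothetical validity check: keep only videos whose score is close to 1.
// `scores` is the object returned by net.run(); the 0.9 cutoff is an assumed value.
function validMatches(scores, cutoff = 0.9) {
  return Object.entries(scores)
    .filter(([, score]) => score >= cutoff) // drop everything below the cutoff
    .sort(([, a], [, b]) => b - a)          // best match first
    .map(([id]) => id);
}

validMatches({ 'video-2': 0.95, 'video-3': 0.92 });
// -> ['video-2', 'video-3']

validMatches({ 'video-2': 0.2, 'video-3': 0.2, 'video-4': 0.2, 'video-5': 0.2 });
// -> []
```

With a check like this, the five-match output above produces no recommendations at all, which is exactly the problem.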
My question now is: which kind of data format or network would be the right one to solve this?
