How to forward audio/video data from a user’s browser (client) to a server?

What I am Doing

I have a JavaScript file that uses getUserMedia() to capture audio and video from the client’s computer. How do I use Python to apply computer vision to the video feed from the camera and speech recognition to the audio feed from the microphone, both of which are captured with getUserMedia()?

Why do I want to use Python?

I want to use Python because I have already written all the code to run this locally on my machine using Flask. To make it work for users on other computers, though, I have to use JavaScript to access the client’s mic and cam. I can’t use Python for that, because Python running on the server would detect the server’s mic and cam instead of the client’s.

What I Tried

Attempt #1

I tried using js2py to translate the .js file into a .py file:

import js2py
js2py.translate_file('static/sketch.js', 'sketch.py')
from sketch import sketch
# I use the getUserMedia() function from the .py file after that

I got an error, which I found out was caused by the .js file being too large.

Attempt #2

I looked at these MDN articles:

https://developer.mozilla.org/en-US/docs/Web/Guide/AJAX

https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest

https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Client-side_web_APIs/Fetching_data

I got really confused about how to use Ajax, so I didn’t even try it because I didn’t know what to do. I am leaning strongly towards thinking this could solve my problem, but I could be wrong, which is why I am asking this question.
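
To make the Ajax idea concrete, here is a rough sketch of the kind of Flask endpoint I imagine the browser posting recorded data to. It is only a guess at the shape of a solution, not something I have working: the route name /upload, the form field name "chunk", and the in-memory list received_chunks are all assumptions made up for illustration. The browser side would presumably record the getUserMedia() stream and send each recorded piece to this route as multipart form data.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Hypothetical in-memory store for received data. A real app would more likely
# write to disk or hand the bytes to the vision / speech code.
received_chunks = []


@app.route("/upload", methods=["POST"])  # "/upload" is an assumed route name
def upload():
    # Assumption: the browser sends the recorded blob as multipart form data
    # under the field name "chunk".
    chunk = request.files.get("chunk")
    if chunk is None:
        return jsonify(error="missing 'chunk' field"), 400

    data = chunk.read()          # raw bytes of the uploaded audio/video piece
    received_chunks.append(data)
    return jsonify(received_bytes=len(data))
```

If something like this is the right direction, the existing Python processing code could be called on data inside upload() instead of reading from a local mic or cam.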

Underlying Question

My question is: how do I deliver audio/video data from a user’s browser to a server?