I have been scraping an investment site with Python and Selenium, but due to system changes I will no longer be able to use Python, and the only option I seem to have is Node.js / Node-RED.
Anyhow, I seem to have managed to log in to the site with the script below, but what I get back is basically a blank page saying JavaScript is needed. If I try to load one of the URLs from the network list that comes up in Chrome Developer Tools, I generally get a 401 error.
The website I go to in a browser to log in is https://app.raizinvest.com.au/auth/login
I have found from the recording feature in Chrome that the login information is POSTed to “https://api.raizinvest.com.au/v1/sessions”
Back in the web browser, it then takes me to the page “https://app.raizinvest.com.au/?activeTab=today”, which displays my current funds among a bunch of other irrelevant stuff.
If I load that same page after my login in Node.js, I get a "Need JavaScript" page. That matches what I see when I view the HTML of that page in the browser, so I am happy I have logged in successfully.
What I need to know is how to get the info I want. If I look at the list of items in the Network tab back in the browser, there is an item account_summary with its URL reported as ‘https://api.raizinvest.com.au/mobile/v1/account_summary’, but if I try to load that URL from my script I get a 401 error. Likewise, if I click on it in the browser I get an unauthorised error (see screenshot).
My script is as follows. Note: I have obviously changed my login details, and for the UDID I am just making up a random one manually at the moment, but I am aware there is an add-on to generate one dynamically.
For context, if relevant: the first console output is JSON containing TOKEN and User_uuid. I am not sure whether this is remembered and transmitted automatically in the background or if I need to send it explicitly.
const superagent = require('superagent').agent();
const harvest = async () => {
  let dashboard = await superagent
    .post('https://api.raizinvest.com.au/v1/sessions')
    .send({ email: 'fam...@outlook....', password: 'w5.....', udid: 'bc28eff9-0cf2-414d-a17b-01ec20d53bc3' })
    .set('Content-Type', 'application/json');
  console.log(dashboard.text);
  let dashboard2 = await superagent.get('https://app.raizinvest.com.au/?activeTab=today');
  console.log(dashboard2.text);
  let dashboard3 = await superagent.get('https://api.raizinvest.com.au/mobile/v1/account_summary');
  console.log(dashboard3.text);
};
harvest();
Any advice on what to look into would be appreciated. I understand that it is difficult to solve because it is behind a login, but suggestions on where I should look would help.
I tried what I have posted above and get a 401 error. I get the same when I click that request in Chrome, so I am unsure what method to use to access it.
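One untested idea is to send the token from the login response explicitly with the API request, rather than relying on the agent to carry it as a cookie. The sketch below is only a guess at how that might look. The `Authorization: Bearer` header is an assumption, not something I have confirmed for this API (the real header name should be visible under Request Headers for account_summary in the Network tab). The token field name, and the helper names `extractToken`, `authHeaders` and `fetchAccountSummary`, are also made up for illustration. It uses Node's built-in `crypto.randomUUID()` for the UDID, assuming the API accepts any random UUID as it appears to with my hand-made one.

```javascript
const { randomUUID } = require('crypto'); // built into Node.js 14.17+

// Hypothetical helper: pull the token out of the /v1/sessions response text.
// The exact field name is a guess ('token' or 'TOKEN').
function extractToken(sessionText) {
  const body = JSON.parse(sessionText);
  const token = body.token || body.TOKEN;
  if (!token) throw new Error('No token field in session response');
  return token;
}

// ASSUMPTION: the API expects a Bearer token in the Authorization header.
// Check the real request headers in Chrome DevTools and adjust if needed.
function authHeaders(token) {
  return { Authorization: `Bearer ${token}`, Accept: 'application/json' };
}

// Log in, then call the API endpoint directly with the token attached.
async function fetchAccountSummary(email, password) {
  const agent = require('superagent').agent(); // loaded only when a request is made
  const session = await agent
    .post('https://api.raizinvest.com.au/v1/sessions')
    .send({ email, password, udid: randomUUID() })
    .set('Content-Type', 'application/json');
  const summary = await agent
    .get('https://api.raizinvest.com.au/mobile/v1/account_summary')
    .set(authHeaders(extractToken(session.text)));
  return summary.body; // parsed JSON, if the response is JSON
}
```

It would be called as `fetchAccountSummary('me@example.com', 'secret').then(console.log)`. If this still returns a 401, the header name or scheme is probably different from what I have assumed here.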