I am writing a script that uses Puppeteer to scrape data from a web page, with the code below:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto("https://www.info.gov.hk/gia/wr/202404/26.htm")
  const bulletin_urls = await page.$$('div.leftBody');
  for (const bulletin_url of bulletin_urls) {
    try {
      const bulletin_name = await page.evaluate(el => el.textContent, bulletin_url)
      console.log(bulletin_name)
      const single_url = await page.evaluate(el => el.getAttribute("href"), bulletin_url)
      console.log(single_url)
    } catch (err) {
      // errors are silently ignored here
    }
  }
  await browser.close()
})();
I want to scrape two pieces of information for ALL PRESS WEATHER:
- the name of each bulletin
- the URL of each hyperlink.
I succeed at scraping the first but fail at the second: all I get is null. I tried amending my code to:
const single_url = await page.evaluate(el => el.querySelector(".NEW").getAttribute("href"), bulletin_url)
console.log(single_url)
However, this returns only the first URL, not all of them. How can I collect the URLs of all the hyperlinks in one command? Any suggestions would be appreciated.
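
For reference, here is a minimal sketch of one possible approach, not a confirmed fix. It rests on two DOM facts: a `div` has no `href` attribute of its own, so `getAttribute("href")` on it returns null, and `querySelector` only ever returns the first match. The sketch assumes the links are `<a>` elements nested inside `div.leftBody` (I have not verified the page's markup), and the names `collectLinks` and `scrapeBulletins` are hypothetical helpers made up for illustration. It uses `page.$$eval`, which passes every matching element to one function in a single call.

```javascript
// Hypothetical helper: turns anchor elements into { name, url } records.
// Puppeteer runs it inside the page, so it must not use outer variables.
function collectLinks(anchors, baseUrl) {
  return anchors
    .map(a => ({ name: (a.textContent || '').trim(), href: a.getAttribute('href') }))
    .filter(item => item.href) // skip anchors without an href
    .map(item => ({
      name: item.name,
      // resolve relative hrefs against the page URL
      url: new URL(item.href, baseUrl).href,
    }));
}

// Hypothetical wrapper: opens the page and collects every link in one call.
async function scrapeBulletins(pageUrl) {
  const puppeteer = require('puppeteer'); // loaded only when this is called
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(pageUrl);
    // ASSUMPTION: the bulletin links are <a> tags inside div.leftBody.
    return await page.$$eval('div.leftBody a[href]', collectLinks, pageUrl);
  } finally {
    await browser.close(); // always close the browser, even on errors
  }
}
```

Calling `scrapeBulletins("https://www.info.gov.hk/gia/wr/202404/26.htm").then(console.log)` should then print one record per link, provided the selector assumption holds. If the links carry a specific class such as `.NEW`, the selector can be narrowed accordingly.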