I’m struggling to find a way to click the “All” option in a dropdown list and scrape all of the content on that page. I have come across a few similar posts, but they differ from my situation: my `<select>` has no `id` or label, and its `name` attribute is empty.
The website I’m scraping is https://www.nba.com/stats/players/traditional?Season=1998-99&SeasonType=Regular+Season
<div class="Pagination_pageDropdown__KgjBU">
<div class="DropDown_content__Bsm3h">
<label class="DropDown_label__lttfI">
<p data-no-label="true"></p>
<div class="DropDown_dropdown__TMlAR">
<select name="" class="DropDown_select__4pIg9">
<option value="-1">All</option>
<option value="0">1</option>
<option value="1">2</option>
<option value="2">3</option>
<option value="3">4</option>
<option value="4">5</option>
<option value="5">6</option>
<option value="6">7</option>
<option value="7">8</option>
<option value="8">9</option>
</select>
…
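To double-check that the class is the only usable hook (since `id` is missing and `name` is empty), I ran a quick standard-library sanity check on a trimmed copy of the markup above:

```python
from html.parser import HTMLParser

# Trimmed copy of the <select> from the page source above.
SNIPPET = '<select name="" class="DropDown_select__4pIg9"><option value="-1">All</option></select>'

class SelectFinder(HTMLParser):
    """Collects the attributes of the first <select> tag it sees."""
    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if tag == "select" and self.attrs is None:
            self.attrs = dict(attrs)

finder = SelectFinder()
finder.feed(SNIPPET)
print(finder.attrs)  # {'name': '', 'class': 'DropDown_select__4pIg9'}
```

So the only stable-looking handle on the element is the `DropDown_select__4pIg9` class.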
My current code looks like this:
```python
import asyncio
from playwright.async_api import async_playwright
from NBA.spiders.nba import NbaScraping

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        page = await browser.new_page()
        spider = NbaScraping()
        spider.start_requests()
        for url in spider.start_urls:
            await page.goto(url)
            selector = page.locator('.Pagination_pageDropdown__KgjBU.DropDown_select__4pIg9')
            await selector.wait_for()
            page.select_option(value='-1')
            page.wait_for_timeout(5000)
        await browser.close()

asyncio.run(main())
```
I expect to scrape all the content shown when the “All” option of the dropdown is selected. I have spent a week on this problem and couldn’t find a solution that works. I’m still new to this, so forgive me if the question sounds naive.
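For what it’s worth, my current reading of the Playwright docs is that `.A.B` only matches a single element carrying both classes, while here the two classes sit on different elements (the wrapper `div` and the `select`), so a descendant selector with a space would be needed; `select_option` and `wait_for_timeout` are also coroutines that need `await`, and `select_option` is called on the locator, not the page. Below is a sketch of what I think the corrected flow should look like. The table selector at the end is only a guess at the page’s structure, and I haven’t confirmed this fixes everything:

```python
import asyncio

# Descendant selector (note the space): the pagination wrapper and the
# <select> are different elements, so their classes must not be chained.
SELECTOR = ".Pagination_pageDropdown__KgjBU .DropDown_select__4pIg9"

URL = ("https://www.nba.com/stats/players/traditional"
       "?Season=1998-99&SeasonType=Regular+Season")

async def main():
    # Imported here so the constants above can be inspected without
    # Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        page = await browser.new_page()
        await page.goto(URL)

        dropdown = page.locator(SELECTOR)
        await dropdown.wait_for()
        # select_option is a coroutine on the locator, so it needs await.
        await dropdown.select_option(value="-1")
        await page.wait_for_timeout(5000)  # let the full table render

        # Guessed selector for the stats table rows.
        rows = await page.locator("table tbody tr").all_inner_texts()
        print(len(rows))
        await browser.close()

# Run with: asyncio.run(main())
```

Is this the right way to target the `select` and trigger the “All” option?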