I’m building a web scraper to extract public data from a table on a website.
This is my code:
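from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()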
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)
driver.get(url)
# find enter button and click on it
agree_enter_button = driver.find_element(By.PARTIAL_LINK_TEXT, 'AGREE')
driver.execute_script("arguments[0].click();", agree_enter_button)
# find search fields
from_date_input = driver.find_element(By.XPATH, '//input[@id="criteria_file_date_start"]')
to_date_input = driver.find_element(By.XPATH, '//input[@id="criteria_file_date_end"]')
full_name_input = driver.find_element(By.XPATH, '//input[@id="criteria_full_name"]')
# fill the fields with inputs
driver.execute_script("arguments[0].value = arguments[1];", full_name_input, first_name + ' ' + last_name)
driver.execute_script("arguments[0].value = arguments[1];", from_date_input, from_date)
driver.execute_script("arguments[0].value = arguments[1];", to_date_input, thru_date)
# find search button and click on it
search_button = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, '//a[@class="btn btn-success w-40"]')))
driver.execute_script("arguments[0].click();", search_button)
It seems to be working OK until I click the search button.
When I click the button manually, it takes about 5 seconds for the table to load with all the records via JavaScript.
But when I run the same steps from the code, the page stays stuck on the loading screen (I disabled headless mode to watch it, and the loading screen never goes away).

If I wait for the loading screen to pass, it never does.
When I don’t wait and read the table right away like this:
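To make the kind of wait concrete, here is a minimal sketch of a plain polling loop with a timeout. The helper name `wait_until` and its default values are my own for illustration and are not part of Selenium; Selenium's `WebDriverWait` does the same job, and this version only spells out the logic.

```python
import time


def wait_until(condition, timeout=15.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Hypothetical usage with Selenium; the selector is an assumption built
# from the `grid` id on the page:
# rows = wait_until(
#     lambda: driver.find_elements(By.CSS_SELECTOR, "#grid tbody tr"),
#     timeout=15,
# )
```

With a loop like this, a page that never finishes loading ends in a `TimeoutError` instead of an empty result, which at least separates "not loaded yet" from "loaded but empty".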
html = driver.page_source
df = pd.read_html(html)  # note: read_html returns a list of DataFrames, one per <table>
I get an empty table with the wrong shape.
When I don’t wait and read the tbody of the table like this:
results_set = driver.find_element(By.ID, 'panel_resultset')
content = results_set.find_element(By.CLASS_NAME, 'collapse-content')
grid = content.find_element(By.ID, 'grid_container')
scroll = grid.find_element(By.ID, 'grid_scroll')
table = scroll.find_element(By.ID, 'grid')
tbody = table.find_element(By.TAG_NAME, 'tbody')
trs = tbody.find_elements(By.TAG_NAME, 'tr')
I get 0 rows in trs (probably because the table’s markup already exists but its rows haven’t been loaded yet?).
It seems to me that the problem is the loading screen: it finishes quickly when I visit the site manually but gets stuck when I use Selenium WebDriver.
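One difference from manual use that I have not ruled out: setting `value` through JavaScript does not fire the `input` and `change` events that typing does. Whether this page depends on those events is purely an assumption on my part. A minimal sketch of filling a field and dispatching the events by hand (the helper name `set_value_with_events` is hypothetical; `execute_script` is the same WebDriver call I already use above):

```python
# JavaScript that sets the value and then fires the events a real
# keystroke would normally trigger.
SET_VALUE_AND_NOTIFY = """
const el = arguments[0];
el.value = arguments[1];
el.dispatchEvent(new Event('input', { bubbles: true }));
el.dispatchEvent(new Event('change', { bubbles: true }));
"""


def set_value_with_events(driver, element, value):
    """Set an input's value via JavaScript and dispatch input/change events."""
    driver.execute_script(SET_VALUE_AND_NOTIFY, element, value)


# Hypothetical usage with the fields from my code:
# set_value_with_events(driver, full_name_input, first_name + ' ' + last_name)
# set_value_with_events(driver, from_date_input, from_date)
# set_value_with_events(driver, to_date_input, thru_date)
```

Typing with `element.send_keys(...)` would be another way to get the same events, at the cost of going through the visible UI.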
Any help on solving this will be much appreciated!!