I have a streamlit app, and I want it to display a pdf in an iframe. My functionality requirements for my pdf viewer/iframe are:
- I want the pdf to open to a particular (parameterizable) page
- I want the pdf to open with particular (parameterizable) quote/text already highlighted
- I want the pdf to be scrollable in the viewer/iframe
The requirements above led me to go with (or attempt to go with) pdf.js, instead of the streamlit-pdf-viewer custom component.
I’ve stripped down my streamlit app to the following minimal app.py, which just includes three buttons/links that unsuccessfully attempt to display the pdf viewer per my requirements:
import urllib.parse  # "import urllib" alone does not guarantee urllib.parse is loaded

import streamlit as st
import streamlit.components.v1 as components


def main():
    st.title("Hello from streamlit-n-pdfjs!")

    # Locations of my (public) r2 bucket and the document in it I want to view
    BUCKET_URL = "https://pub-ec8aa50844b34a22a2e6132f8251f8b5.r2.dev"
    DOCUMENT_NAME = "FINAL_REPORT.pdf"

    # "Media stored in the folder ./static/ relative to the running app file is served at path app/static/[filename]"
    # ^ https://docs.streamlit.io/develop/concepts/configuration/serving-static-files
    local_pdfjs_path = "./app/static/pdfjs-4-9-155-dist/web/viewer.html"

    # Attempt to link to the doc using the pdf.js viewer
    PAGENUM = 100
    HIGHLIGHT_QUOTE = 'exited the stage'
    ENCODED_QUOTE = urllib.parse.quote(HIGHLIGHT_QUOTE)
    FULL_DOC_URL = f"{BUCKET_URL}/{DOCUMENT_NAME}#page={PAGENUM}&search={ENCODED_QUOTE}"
    pdfjs_viewer = f"{local_pdfjs_path}?file={FULL_DOC_URL}"

    # Clicking the link below opens the correct pdf to the correct page in a new tab,
    # but does not search/highlight the quote text, and of course does not open in an iframe
    st.markdown(f"[link to the doc I can't get to open in iframe w/ buttons below]({FULL_DOC_URL})")

    # Clicking the button below says "404: Not Found"
    if st.button("Show PDF in iframe with highlights, via pdfjs viewer"):
        components.iframe(pdfjs_viewer, height=800, scrolling=True)

    # Clicking the button below opens an iframe border, but
    # just says "this page has been blocked by chrome" inside
    if st.button("Show PDF in iframe with highlights, via regular url w/ encoded params"):
        components.iframe(FULL_DOC_URL, height=800, scrolling=True)


if __name__ == "__main__":
    main()
…and the latest (unzipped) release/code of pdf.js vendor’d into my repo under vendor/ like so:
|streamlit_n_pdfjs
|--app.py
|--vendor
|----pdfjs-4-9-155-dist
|------web
|--------viewer.html
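For reference, the Streamlit docs quoted in app.py say that files under ./static/ next to the app file are served at app/static/[filename]. Below is a minimal sketch of a check for whether a given app/static/... URL path has a matching file on disk. The helper name `static_url_to_local_path` is hypothetical (it is not part of Streamlit), and the sketch assumes the mapping is exactly the one the docs describe.

```python
from pathlib import Path


def static_url_to_local_path(app_dir: str, url_path: str) -> Path:
    """Map an 'app/static/...' URL path to where the file would need to live on disk.

    Hypothetical helper, not a Streamlit API. It assumes the rule from the docs:
    ./static/<name> (relative to the app file) is served at app/static/<name>.
    """
    # Drop empty segments and a leading "./" so "./app/static/x" and "app/static/x" match.
    parts = [p for p in url_path.split("/") if p not in ("", ".")]
    if parts[:2] != ["app", "static"]:
        raise ValueError(f"not a static-serving path: {url_path!r}")
    return Path(app_dir, "static", *parts[2:])


if __name__ == "__main__":
    viewer = static_url_to_local_path(
        ".", "./app/static/pdfjs-4-9-155-dist/web/viewer.html"
    )
    print(viewer, "exists" if viewer.is_file() else "is missing")
```

Run from the repo root with the layout above, where pdf.js sits under vendor/ rather than static/, this would report the file as missing, which may be related to the 404.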
Finally, I have a .streamlit/config.toml with:
[server]
enableStaticServing = true
and I launch my streamlit app locally with:
PYTHONPATH=. streamlit run app.py
What I get is documented in the code above, but for clarity, there are three links/buttons shown in the app (none of which meet my requirements):
- A link that opens the pdf file in a new tab at the given page, but without the desired text/quote highlighted. It deliberately does not use an iframe; it is mainly there to test that the pdf/doc is available at the url
- A button that attempts to launch an iframe with the document using pdf.js, but just triggers a 404 message
- A button that attempts to launch an iframe with the document just via default browser pdf rendering from the url, but which triggers a message “this page has been blocked by Chrome”
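As a point of comparison for the pdf.js button, here is a minimal sketch of one way the viewer URL could be composed: the document URL is percent-encoded into the `file` query parameter, and the page/search options go in the fragment of the viewer URL itself rather than inside the `file` value. The `page`, `search` and `phrase` fragment parameters are assumptions based on the pdf.js viewer's documented URL options and should be checked against the vendored version. `build_viewer_url` is a hypothetical helper, not part of pdf.js or Streamlit.

```python
from urllib.parse import quote, urlencode


def build_viewer_url(viewer_path: str, pdf_url: str, page: int, phrase: str) -> str:
    """Compose a pdf.js viewer URL (hypothetical helper, a sketch only)."""
    # Encode the whole document URL so its own ':', '/', '#' or '&' cannot be
    # mistaken for part of the viewer's URL.
    file_param = quote(pdf_url, safe="")
    # Options for the viewer go after the viewer's own '#'.
    # quote_via=quote encodes spaces as %20 rather than '+'.
    fragment = urlencode(
        {"page": page, "search": phrase, "phrase": "true"}, quote_via=quote
    )
    return f"{viewer_path}?file={file_param}#{fragment}"
```

With the values from app.py this would be called as `build_viewer_url(local_pdfjs_path, f"{BUCKET_URL}/{DOCUMENT_NAME}", PAGENUM, HIGHLIGHT_QUOTE)`. It only changes how the URL is built; it does not address whether viewer.html itself is reachable.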