Open pdf in pdf-js viewer from streamlit app

I have a streamlit app, and I want it to display a pdf in an iframe. My functional requirements for the pdf viewer/iframe are:

  • I want the pdf to open to a particular (parameterizable) page
  • I want the pdf to open with a particular (parameterizable) quote/text already highlighted
  • I want the pdf to be scrollable in the viewer/iframe

The requirements above led me to go with (or attempt to go with) pdf.js, instead of the streamlit-pdf-viewer custom component.

I’ve stripped down my streamlit app to the following minimal app.py, which just includes three buttons/links that unsuccessfully attempt to display the pdf viewer per my requirements:


import streamlit as st
import streamlit.components.v1 as components
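import urllib.parse  # import the submodule explicitly; a bare "import urllib" is not guaranteed to expose urllib.parse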

def main():
    st.title("Hello from streamlit-n-pdfjs!")

    # Locations of my (public) r2 bucket and the document in it I want to view
    BUCKET_URL = "https://pub-ec8aa50844b34a22a2e6132f8251f8b5.r2.dev"
    DOCUMENT_NAME = "FINAL_REPORT.pdf"

    # "Media stored in the folder ./static/ relative to the running app file is served at path app/static/[filename]"
    # ^ https://docs.streamlit.io/develop/concepts/configuration/serving-static-files
    local_pdfjs_path = "./app/static/pdfjs-4-9-155-dist/web/viewer.html"

    # Attempt to link to the doc using the pdf.js viewer
    PAGENUM = 100
    HIGHLIGHT_QUOTE = 'exited the stage'
    ENCODED_QUOTE = urllib.parse.quote(HIGHLIGHT_QUOTE)
    FULL_DOC_URL = f"{BUCKET_URL}/{DOCUMENT_NAME}#page={PAGENUM}&search={ENCODED_QUOTE}"

    pdfjs_viewer = f"{local_pdfjs_path}?file={FULL_DOC_URL}"

    # Clicking the link below opens the correct pdf to the correct page, but does not search/highlight the quote text,
    # ...and of course does not open in an iframe
    st.markdown(f"[link to the doc I can't get to open in iframe w/ buttons below]({FULL_DOC_URL})") # opens doc in a new tab but doesn't search/highlight quote

    # Clicking the button below shows "404: Not Found" inside the iframe
    if st.button("Show PDF in iframe with highlights, via pdfjs viewer"):
        components.iframe(pdfjs_viewer, height=800, scrolling=True)

    # Clicking the button below opens an iframe border, but 
    # just says "this page has been blocked by chrome" inside
    if st.button("Show PDF in iframe with highlights, via regular url w/ encoded params"):
        components.iframe(FULL_DOC_URL, height=800, scrolling=True)
    

if __name__ == "__main__":
    main()
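
For reference, here is a minimal sketch of one way the viewer URL could be assembled. It assumes the stock pdf.js viewer reads `file` from the query string and reads `page`, `search`, and `phrase` from the URL fragment; that is my understanding of the default viewer, not something verified in this app. `build_viewer_url` is a hypothetical helper, not part of Streamlit or pdf.js.

```python
from urllib.parse import quote


def build_viewer_url(viewer_path: str, pdf_url: str, page: int, phrase: str) -> str:
    """Build a pdf.js viewer URL (hypothetical helper, for illustration only).

    Assumption: the viewer takes the document location from the `file` query
    parameter and the page/search settings from its own URL fragment.
    """
    # Percent-encode the whole document URL so its ':' and '/' characters
    # cannot be confused with the viewer's own query string.
    file_param = quote(pdf_url, safe="")
    # The fragment belongs to the viewer URL, not to the document URL.
    fragment = f"page={page}&search={quote(phrase, safe='')}&phrase=true"
    return f"{viewer_path}?file={file_param}#{fragment}"
```

Calling `build_viewer_url(local_pdfjs_path, f"{BUCKET_URL}/{DOCUMENT_NAME}", PAGENUM, HIGHLIGHT_QUOTE)` would give a single string to pass to `components.iframe`. Whether the viewer is then allowed to load a file from another origin is a separate question that this sketch does not address.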

…and I have the latest (unzipped) release of pdf.js vendored into my repo under vendor/, like so:


|streamlit_n_pdfjs
|--app.py
|--vendor
|----pdfjs-4-9-155-dist
|------web
|--------viewer.html
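
To make the static-serving rule quoted in the code comment concrete, here is a small sketch that maps a local file to the path it would be served at. It assumes only what the quoted docs line says (files under `./static/` next to the app file are served at `app/static/[filename]`); `served_path` is a hypothetical helper, and `/repo` is a placeholder directory.

```python
from pathlib import Path, PurePosixPath


def served_path(app_dir: Path, local_file: Path) -> str:
    """Return the URL path for a file under <app_dir>/static (hypothetical helper).

    Raises ValueError if the file is not inside the static folder, since the
    quoted rule only covers files located there.
    """
    static_dir = app_dir / "static"
    # relative_to() raises ValueError when local_file is outside static_dir.
    rel = local_file.relative_to(static_dir)
    return str(PurePosixPath("app/static") / rel.as_posix())
```

Under that rule, a file at `static/pdfjs-4-9-155-dist/web/viewer.html` would map to `app/static/pdfjs-4-9-155-dist/web/viewer.html`, while a file under `vendor/` has no mapping at all and the helper raises `ValueError`.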

Finally, I have a .streamlit/config.toml with:

[server]
enableStaticServing = true

and I launch my streamlit app locally with:

PYTHONPATH=. streamlit run app.py

What I get is documented in the code above, but for clarity, there are three links/buttons shown in the app (none of which meet my requirements):

  1. A link that opens the pdf file in a new tab at the given page, but without the desired text/quote highlighted. It deliberately does not use an iframe; it mainly tests that the pdf/doc is available at the url
  2. A button that attempts to launch an iframe with the document using pdf.js, but just triggers a 404 message
  3. A button that attempts to launch an iframe with the document just via default browser pdf rendering from the url, but which triggers a message “this page has been blocked by Chrome”