I’m developing a Next.js app that lets users view their .docx files in the browser. The goal is to convert the file from Word to HTML with proper formatting. PS: the rendered HTML should not be editable.
I’m currently using mammoth.js, but it doesn’t seem to be working. Below is the component:
import { useState } from "react";
import mammoth from "mammoth";
import { saveAs } from "file-saver";
// Assumed path to the CSS module; adjust to match the project layout.
import styles from "./page.module.css";

export default function Home() {
  const [htmlContent, setHtmlContent] = useState("");

  const handleFileUpload = async (event) => {
    // Note: this try/catch only covers the synchronous setup below,
    // not errors thrown later inside the async reader.onload callback.
    try {
      const file = event.target.files[0];
      if (
        file &&
        file.type ===
          "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
      ) {
        const reader = new FileReader();
        reader.onload = async (e) => {
          const arrayBuffer = e.target.result;
          const options = {
            // Map Word styles to HTML elements
            styleMap: [
              "p.Heading1 => h1",
              "p.Heading2 => h2",
              "p.Heading3 => h3",
              "p.Heading4 => h4",
              "p.Heading5 => h5",
              "p.Heading6 => h6",
              "p[style-name='Title'] => h1.title",
              "p[style-name='Subtitle'] => h2.subtitle",
              "p[style-name='Quote'] => blockquote",
              "table => table.table-bordered",
              "p => p.paragraph",
              "r[style-name='Bold'] => strong",
              "r[style-name='Italic'] => em",
            ],
            // Inline embedded images as base64 data URIs
            convertImage: mammoth.images.imgElement((image) => {
              return image.read("base64").then((imageBuffer) => {
                return {
                  src: "data:" + image.contentType + ";base64," + imageBuffer,
                };
              });
            }),
          };
          const result = await mammoth.convertToHtml(
            { arrayBuffer },
            options
          );
          setHtmlContent(result.value);
        };
        reader.readAsArrayBuffer(file);
      }
    } catch (error) {
      console.error(error);
    }
  };

  const handleSaveHtml = () => {
    const blob = new Blob([htmlContent], { type: "text/html;charset=utf-8" });
    saveAs(blob, "converted-document.html");
  };

  return (
    <>
      <main className={styles.main}>
        <input type="file" onChange={handleFileUpload} accept=".docx" />
        {htmlContent && (
          <div>
            <h2>Converted HTML:</h2>
            <div
              dangerouslySetInnerHTML={{ __html: htmlContent }}
              style={{
                backgroundColor: "#fff",
                color: "#000",
                padding: "20px",
                borderRadius: "8px",
                lineHeight: "1.6",
                fontFamily: "Arial, sans-serif",
              }}
            />
            <button onClick={handleSaveHtml} style={{ marginTop: "20px" }}>
              Save HTML
            </button>
          </div>
        )}
      </main>
    </>
  );
}
Issue:
The problem is that when I upload a .docx file, the conversion does not work as expected. The HTML output is either not rendered at all or rendered incorrectly. There are no specific errors in the console, but the document’s content is not being displayed properly in the browser.
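To tell "not rendered at all" apart from "rendered incorrectly", it may help to look at which tags actually end up in the converted string. The sketch below is only an illustration: `countTags` is a hypothetical helper (not part of mammoth.js), and `sample` is a hand-written string standing in for `result.value`.

```javascript
// Hypothetical helper (not part of mammoth.js): counts the opening tags
// found in an HTML string, keyed by lowercase tag name.
function countTags(html) {
  const counts = {};
  // Matches opening tags such as <p>, <h1 class="title"> or <img src="...">.
  // Closing tags like </p> are skipped because they start with "</".
  const tagPattern = /<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const name = match[1].toLowerCase();
    counts[name] = (counts[name] || 0) + 1;
  }
  return counts;
}

// Hand-written sample standing in for result.value
const sample =
  '<h1 class="title">Report</h1>' +
  '<p class="paragraph">Hello</p>' +
  '<p class="paragraph">World</p>';

console.log(countTags(sample)); // { h1: 1, p: 2 }
```

If the counts show the expected headings and tables but the page still looks wrong, the cause is more likely styling than the conversion itself. An empty object would suggest the conversion produced no markup at all.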
What I’ve Tried:
I’ve ensured that the file being uploaded is a valid .docx file.
I’ve verified that the file type is correctly identified as “application/vnd.openxmlformats-officedocument.wordprocessingml.document”.
I’ve tried adjusting the styleMap to ensure proper mapping of Word styles to HTML elements.
I’ve checked the browser console for any errors, but nothing significant shows up.
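Two of the checks above can be made a little more systematic. The sketch below assumes that `file.type` may be an empty string when the browser cannot determine the type, and that the object `mammoth.convertToHtml` resolves to has the shape `{ value, messages }`, where each message carries a `type` of "warning" or "error". `isDocx` and `summarizeResult` are hypothetical helper names, and the sample objects are hand-written, not real output.

```javascript
const DOCX_MIME =
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

// Hypothetical helper: accept a file by MIME type, falling back to the
// file extension when the browser reports an empty type.
function isDocx(file) {
  if (!file) return false;
  if (file.type === DOCX_MIME) return true;
  return file.type === "" && /\.docx$/i.test(file.name || "");
}

// Hypothetical helper: summarize an object shaped like a mammoth
// conversion result, i.e. { value: string, messages: [{ type, message }] }.
function summarizeResult(result) {
  const messages = result.messages || [];
  return {
    isEmpty: result.value.trim() === "",
    warnings: messages
      .filter((m) => m.type === "warning")
      .map((m) => m.message),
    errors: messages.filter((m) => m.type === "error").map((m) => m.message),
  };
}

// Hand-written sample standing in for a real conversion result
const sampleResult = {
  value: "<p>Hello</p>",
  messages: [{ type: "warning", message: "Unrecognised paragraph style" }],
};

console.log(isDocx({ name: "report.docx", type: "" })); // true
console.log(summarizeResult(sampleResult));
```

Inside the `reader.onload` callback this could be used as `console.log(summarizeResult(result))` right after the conversion, since warnings returned in `messages` do not show up in the console on their own.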
Environment:
Next.js version: 14.2.5
mammoth.js version: 1.4.2
file-saver version: 2.0.5
Browser: Chrome 104
Question:
What could be causing mammoth.js to fail in converting the DOCX file to HTML properly, and how can I fix this issue to ensure that the HTML is rendered correctly in the browser?