How can I remove tags but keep their inner HTML contents using DOMDocument?

I have some terribly formed HTML, thanks to MS Word 10’s “save as htm, html”. Here’s a sample of what I’m trying to sanitize.

<html xmlns:v="urn:schemas-microsoft-com:vml"... other xmlns>
    <head>
        <meta tags, title, styles, a couple comments too (they are irrelevant to the question)>
    </head>
    <body lang=EN-US link=blue vlink=purple style='tab-interval:36.0pt'>
        <div class=WordSection1>
            <h1>Pros and Cons of a Website</h1>
            <p class=MsoBodyText align=left style='a long irrelevant list'><span style='long list'><o:p>&nbsp;</o:p></span></p>(this is a sample of what it uses as line breaks. Take note of the <o:p> tag).
            <p class=MsoBodyText style='margin-right:5.75pt;line-height:115%'>
                A<span style='letter-spacing:.05pt'> </span>SAMPLE<span style='letter-spacing:.05pt'> </span>TEXT
            </p>
        </div>
        <div class=WordSection2>...same pattern in div 1</div>
        <div class=WordSection3>...same...</div>
   </body>
</html>

What I need from all of this is:

<div>...A SAMPLE TEXT</div>
<div>...same pattern in div 1</div>
<div>...same...</div>

What I have so far:

$dom = new DOMDocument;
$dom->loadHTML($filecontent, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
$body = $xpath->query('//html/body');
$nodes = $body->item(0)->getElementsByTagName('*');
foreach ($nodes as $node) {
    if($node->tagName=='script') $node->parentNode->removeChild($node);
    if($node->tagName=='a') continue;
    $attrs = $xpath->query('@*', $node);
    foreach($attrs as $attr) {
        $attr->parentNode->removeAttribute($attr->nodeName);
    }
}
echo str_ireplace(['<span>', '</span>'], '', $dom->saveHTML($body->item(0)));

It gives me:

<body lang="EN-US" link="blue" vlink="purple" style="tab-interval:36.0pt">
    <div>
        <h1>Pros and Cons of a Website</h1>
        <p><p> </p></p>
        <p>A SAMPLE TEXT</p>
    </div>
    <div>...same pattern in div 1</div>
    <div>...same...</div>
</body>

That is fine, except that I want the body tag gone. I also want the h1 and its content removed, but when I write:

if($node->tagName=='script' || $node->tagName=='h1') $node->parentNode->removeChild($node);

something weird happens:

<p><p> </p></p> becomes <p class="MsoBodyText" ...all that very long stuff I was trying to remove in the first place><p> </p></p>
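
A possible explanation, offered as a guess rather than a confirmed diagnosis: getElementsByTagName() returns a live DOMNodeList, so removing a node inside the foreach can shift the list and cause the element right after the removed one to be skipped, which would leave its attributes untouched. A minimal sketch that copies the list into an array before looping; the shortened sample markup and the variable names $html and $result are made up for illustration:

```php
<?php
// Hypothetical, shortened stand-in for the Word export.
$html = '<html><body><div class="WordSection1"><h1>Title</h1>'
      . '<p class="MsoBodyText" align="left">first</p>'
      . '<p class="MsoBodyText">second</p></div></body></html>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$body = $dom->getElementsByTagName('body')->item(0);

// iterator_to_array() takes a static snapshot of the live list,
// so removing nodes no longer disturbs the iteration.
$nodes = iterator_to_array($body->getElementsByTagName('*'), false);

foreach ($nodes as $node) {
    if ($node->tagName === 'script' || $node->tagName === 'h1') {
        $node->parentNode->removeChild($node);
        continue; // nothing more to do for a removed node
    }
    // Snapshot the attributes too before removing them one by one.
    foreach (iterator_to_array($node->attributes, false) as $attr) {
        $node->removeAttribute($attr->nodeName);
    }
}

// Serialize only the children of <body>, so the body tag itself is left out.
$result = '';
foreach ($body->childNodes as $child) {
    $result .= $dom->saveHTML($child);
}
echo $result;
```

With the snapshot in place, the paragraph that follows the removed h1 should lose its class and align attributes like every other element.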

I’ve come across some very good answers like:

How to get innerHTML of DOMNode? I don’t know how to properly implement Haim Evgi’s answer or Keyacom’s answer. Marco Marsala’s answer is the closest I got, but the divs all kept their classes.
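
For the "remove the tag but keep what is inside it" part, one approach is to move each child of the unwanted element in front of it and then remove the now-empty element. This is a sketch under assumptions and is not taken from any of the linked answers. The helper name unwrapNode(), the shortened sample markup, and the rule "keep div, drop h1 and script, unwrap everything else" are all assumptions for illustration:

```php
<?php
// Hypothetical helper: replace an element with its own children.
function unwrapNode(DOMNode $node): void
{
    $parent = $node->parentNode;
    if ($parent === null) {
        return; // already detached, nothing to unwrap into
    }
    while ($node->firstChild !== null) {
        // insertBefore() moves the child out of $node, so the loop terminates.
        $parent->insertBefore($node->firstChild, $node);
    }
    $parent->removeChild($node);
}

// Hypothetical, shortened stand-in for the Word export.
$html = '<html><body lang="EN-US"><div class="WordSection1"><h1>Title</h1>'
      . '<p class="MsoBodyText">A<span style="letter-spacing:.05pt"> </span>SAMPLE'
      . '<span> </span>TEXT</p></div><div class="WordSection2">more</div></body></html>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$body = $dom->getElementsByTagName('body')->item(0);

// Work on a static copy of the node list because the loop edits the tree.
foreach (iterator_to_array($body->getElementsByTagName('*'), false) as $node) {
    if ($node->parentNode === null) {
        continue;
    }
    if ($node->tagName === 'h1' || $node->tagName === 'script') {
        $node->parentNode->removeChild($node);       // drop tag and content
    } elseif ($node->tagName === 'div') {
        foreach (iterator_to_array($node->attributes, false) as $attr) {
            $node->removeAttribute($attr->nodeName); // keep tag, strip attributes
        }
    } else {
        unwrapNode($node);                           // drop tag, keep content
    }
}

// Emit only what is inside <body>, which leaves the body tag out.
$out = '';
foreach ($body->childNodes as $child) {
    $out .= $dom->saveHTML($child);
}
echo $out;
```

Because the p and span elements are unwrapped instead of being stripped with str_ireplace(), the output should contain only bare div elements with their text.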