How to Parse HTML Response Without Loading Any Images

Or: How I stopped worrying and learned to createHTMLDocument.

---

TL;DR

If you want to parse an HTML response without loading unnecessary resources such as images or scripts, use DOMImplementation’s createHTMLDocument() to create a new document that is not connected to the one the browser is currently rendering, yet behaves just like a normal document.

---

As a frontend developer, you can’t always rely on RESTful APIs that return well-formatted JSON you can do whatever you like with. Sometimes you just have to work with HTML responses, no matter how bad that sounds.

While working on our latest project I came across an interesting case. One of the project’s requirements was to preload part of the content of related pages (basically a hero section). The solution was maybe not the best, but it was quite simple:

  1. Send an AJAX request for the desired page.
  2. Get its whole HTML in the response (a partial was not an option here).
  3. Create a new element or jQuery object from it, just to find the needed section.
  4. Append that section to your current document when needed.

Simple, works, we can go home now. Well, nope.

Later on, while testing network usage, I realised that some images were being loaded that should not have been. What the-? They’re not even in the DOM. Actually, they were there: not rendered or attached, but created in the context of the current document.

Setting innerHTML on an element created with document.createElement(), or calling $(data) with jQuery, creates nodes in the current document. The browser then treats them like the rest of the page, which means loading the resources inside them, such as images.

So, what’s the solution to that?

I’ve read whole stories about replacing src attributes with dummy data and restoring them later, removing <img> elements with RegEx (sic!), storing them somewhere and recreating them when needed, even using web workers for better performance. None of this crap.

Better and easier solution

Create an entirely new document that is not connected to the current one. A little research reveals that this is possible and super easy to use in our case.
The document object exposes the document.implementation.createHTMLDocument() method, which is intended for just that. Best of all, elements created inside this new, detached document don’t cause the browser to load anything extra, and we can traverse them with our favourite methods and then attach the result to our current document.

Here’s a basic code snippet showing how easy it is to work with a new document, where data is the HTML string from the response:

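function insertPreview (data) {
    // Create a detached document that is not connected to the current page.
    var newHTMLDocument = document.implementation.createHTMLDocument('preview');
    // Create the wrapper first and set innerHTML in a separate statement,
    // so that `el` holds the element rather than the HTML string.
    var el = newHTMLDocument.createElement('div');
    el.innerHTML = data;
    var $pageHTML = $(el);
    var $pageHero = $pageHTML.find('.page-section--hero');
    // Appending moves the hero section into the current document.
    $('.page--next').append($pageHero);
}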

Cool, huh? Even cooler, it's supported by all modern browsers, and by IE9+ as well.
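
If jQuery is not available, the same idea can be sketched with plain DOM methods. Treat this as an illustration under a few assumptions rather than the code used in the project: extractSection and its optional sourceDocument parameter are hypothetical names that do not appear in the original snippet, and the selector is simply the one used above.

```javascript
// Hypothetical helper (not part of the original snippet): parse an HTML
// string inside a detached document and return the first element that
// matches `selector`, or null when nothing matches.
function extractSection(html, selector, sourceDocument) {
    // Assumption: in a browser the global `document` is used; the optional
    // third argument only exists so a different document can be supplied.
    var doc = sourceDocument || document;

    // The new document is not connected to the page being rendered.
    var detached = doc.implementation.createHTMLDocument('preview');

    // Parse the response in the detached document, not the current one.
    detached.body.innerHTML = html;

    // Query the detached document like any other document.
    return detached.querySelector(selector);
}
```

Calling extractSection(data, '.page-section--hero') and passing a non-null result to appendChild() on the target container moves the section into the current page. Only at that point, once the node belongs to the current document, should the browser start fetching its images.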

About the author

Michal Pierzchala

Michal Pierzchala has been a member of Xfive since 2013. Part of the Jest core team. Passionate about modern web technologies, enthusiast of traveling, sailing and future space exploration.

Comments (8)

Carl

Wow. That's actually very cool. It's a really nice way to do things.

Thanks for sharing

Aug 13, 2015

Botsonen

It looks like:

var $pageHTML = $(el);

still tries to load all the images (Chrome, Windows). So creating a new document doesn't seem to solve the original problem.

Oct 10, 2015

Michal Pierzchala

Hi Botsonen,
The feature was thoroughly tested, especially on Chrome (on OS X, though that should not be very different from the Windows version), so it's hard for me to tell why it's not working for you. But it definitely should!

If possible please send a jsbin/jsfiddle/whatever live example of your code and I'll try to help.

Cheers,
Michal

Oct 12, 2015

Steve Mark

Thanks for sharing this with us. I'm trying this experiment on my latest site.

Oct 30, 2015

Jerzy

Have you heard of document.createDocumentFragment? It's much leaner.

Apr 13, 2016

Jean-Baptiste

This didn't work (still getting images loaded):
var el = newHTMLDocument.createElement('div').innerHTML = data;
var $pageHTML = $(el);

This worked:
var el = newHTMLDocument.createElement('div');
el.innerHTML = html;
var $pageHTML = $(el);

Dec 07, 2016

adipiciu

I could not get it to work, I don't know why. But I found another clean solution, using DOMParser:
var parser = new DOMParser();
var doc = parser.parseFromString(ajaxResp.responseText, "text/html");
...

Hope it helps someone

Apr 03, 2017

Jérémie

It didn't work for me on the latest stable version of Firefox.

Jean-Baptiste's way did work, though.

Aug 02, 2018
