How to parse HTML response without loading any images

by Michal Pierzchala on August 12, 2015
Published in Web Development 8 Comments

Or: How I stopped worrying and learned to createHTMLDocument.

—–

TL;DR

If you want to parse an HTML response without loading unnecessary resources such as images or scripts, use DOMImplementation’s createHTMLDocument()
to create a new document that is not connected to the one currently rendered by the browser, yet behaves just like a normal document.

—–

There are times when, as a frontend developer, you can’t rely on RESTful APIs that return well-formatted JSON you can do whatever you like with. Sometimes you just have to work with HTML responses, no matter how bad that sounds.

While working on our latest project I came across an interesting case. One of the project’s requirements was to preload some content (basically the hero section) from the corresponding pages. The solution was maybe not the best, but it was quite simple:

  1. Send an AJAX request for the desired page.
  2. Get its whole HTML in the response (a partial was not an option here).
  3. Create a new document element or jQuery object from it, just to find the needed section.
  4. Append that section to the current document when needed.

Simple, works, we can go home now. Well, nope.

Later on, while testing network usage, I realised that some images were being loaded that should not have been. What the-? They’re not even in the DOM. Actually, they were there: not rendered or attached, but created in the context of the current document.

Setting innerHTML on an element created with document.createElement(), or calling $(data) with jQuery, creates nodes owned by the current document. The browser treats them like the rest of the page, which means it starts loading the resources they reference, such as images and, depending on how the markup is inserted, scripts.
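
The effect is easy to reproduce. The sketch below is a minimal illustration, not code from the project: parseInDocument and its optional doc parameter are names assumed here, and the parameter exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper (an assumption for illustration): parses `html` inside
// a <div> created by `doc`, which defaults to the page's own document.
function parseInDocument(html, doc) {
    doc = doc || document;
    // The <div> is never attached to the page, but it is still owned by `doc`.
    var container = doc.createElement('div');
    // When `doc` is the current page's document, <img> elements parsed here
    // start downloading right away.
    container.innerHTML = html;
    return container;
}
```

Called in a browser as parseInDocument('<img src="/hero.jpg">') (the path is a placeholder), this should show up as an image request in the network panel, even though the returned <div> never becomes part of the page.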

So, what’s the solution to that?

I’ve read whole stories about replacing src attributes with dummy data and restoring them later, removing <img> elements with RegEx (sic!), storing them somewhere and recreating them when needed, or even using web workers for better performance. None of this crap.

Better and easier solution

Create an entirely new document that is not connected to the current one. A little research reveals that this is possible and super easy to use in our case.
The Document object provides the document.implementation.createHTMLDocument() method, which is intended to do just that. Best of all, creating new elements inside our entirely new, detached document doesn’t cause the browser to load anything extra. We can traverse it with our favourite methods and then attach what we need to our current document.

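Here’s a basic code snippet showing how easy it is to work with the new document, where data is the HTML string from the response:

function insertPreview (data) {
    // Create a detached document; parsing HTML inside it does not load images.
    var newHTMLDocument = document.implementation.createHTMLDocument('preview');
    // Create the element first and assign innerHTML separately, so that `el`
    // is the element itself. A chained assignment would store the raw string
    // in `el`, and jQuery would then parse it in the current document.
    var el = newHTMLDocument.createElement('div');
    el.innerHTML = data;
    var $pageHTML = $(el);
    var $pageHero = $pageHTML.find('.page-section--hero');
    $('.page--next').append($pageHero);
}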

Cool, huh? What's cooler, it's supported by all modern browsers, including IE9+.
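
For projects without jQuery, the same idea can be sketched with plain DOM methods. This is a minimal sketch under stated assumptions rather than the code used in the project: extractSection, its selector argument and the optional doc parameter are names made up here for illustration, and doc exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper (not from the original project): parses `html` in a
// detached document and returns the first element matching `selector`,
// or null when nothing matches.
function extractSection(html, selector, doc) {
    doc = doc || document;
    // The new document is not connected to the page, so parsing into it
    // should not start any image downloads.
    var detached = doc.implementation.createHTMLDocument('preview');
    detached.body.innerHTML = html;
    return detached.querySelector(selector);
}

// Possible usage in a browser, mirroring insertPreview():
// var hero = extractSection(data, '.page-section--hero');
// if (hero) { document.querySelector('.page--next').appendChild(hero); }
```

Appending the returned element to the page moves it into the current document, and only at that point should the browser begin fetching its images.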

About the author

Michal Pierzchala

Michal Pierzchala has been a member of Xfive since 2013. Part of the Jest core team. Passionate about modern web technologies, enthusiast of traveling, sailing and future space exploration.

Comments

Carl August 13, 2015

Wow. That's actually very cool. It's a really nice way to do things.

Thanks for sharing

Botsonen October 10, 2015

It looks like:

var $pageHTML = $(el);

still tries to load all the images (Chrome, Windows). So creating a new document doesn't seem to solve the original problem.

Michal Pierzchala October 12, 2015

Hi Botsonen,
The feature was thoroughly tested, on Chrome especially (on OS X though, but it's not very different from the Windows version), so it's hard for me to tell why it's not working for you. But it definitely should!

If possible please send a jsbin/jsfiddle/whatever live example of your code and I'll try to help.

Cheers,
Michal

Steve Mark October 30, 2015

Thanks for sharing this with us. I'm trying this experiment on my latest site.

Jerzy April 13, 2016

Have you heard of document.createDocumentFragment? It's much leaner.

Jean-Baptiste December 7, 2016

This didn't work (still getting images loaded):
var el = newHTMLDocument.createElement('div').innerHTML = data;
var $pageHTML = $(el);

This worked:
var el = newHTMLDocument.createElement('div');
el.innerHTML = html;
var $pageHTML = $(el);

adipiciu April 3, 2017

I could not get it to work, I don't know why. But I found another clean solution, using DOMParser:
var parser = new DOMParser();
var doc = parser.parseFromString(ajaxResp.responseText, "text/html");
...

Hope it helps someone

Jérémie August 2, 2018

It didn't work for me on the latest stable version of Firefox.

Jean-Baptiste's way did work, though.
