Web Document Rendering Algorithm
HTML 5 outline algorithm
In HTML 4, using div tags to build and describe a document had many limitations. First of all, divs act as block-level elements, which is not necessarily bad—until we start nesting multiple divs. Once several divs are nested, it becomes very difficult to distinguish between divs used for presentation purposes and those used for sectioning the page and grouping content into sections, without using classes and IDs. Also, it was not possible in HTML 4 to define sections with information that referred to the entire site as a whole.
HTML 5 introduced new semantic elements to help better structure documents: header, hgroup, article, section, aside, footer, nav, and others. The purpose of these new elements was not to replace the old div, but to work alongside it and with headings to better organize web page content and the relationships between different types of content on the page. Traditionally, document structure and sectioning (document outline) were created using h1–h6 tags.
The algorithm starts from the "body" tag, which is considered the root of the document, then traverses the document through all nodes to establish the structure. Each time a new section is found, it is used to schematically create the document structure. If the section has a heading, it is used as the name of that section.
For a long time, HTML 5 specifications stated that developers no longer needed to worry about which heading tag number (h1–h6) was used when combined with sectioning elements. What mattered was the nesting level of these headings within sectioning elements. This concept was announced with great fanfare in all sorts of books, articles, and expert talks. In reality, this algorithm was never fully implemented by all browsers, and furthermore, HTML 5.2 does not recommend using this algorithm.
We know we should use HTML 5 to build websites, but we also know that one of the more difficult parts to understand is how to divide document content using the new tags: section, article, aside, and nav. To understand how to use these new tags, we must understand the document rendering algorithm as well as possible.
Understanding the document rendering algorithm can be a challenge, but in the end, it's worth all the effort. You will no longer wonder whether to use the "section" tag or the "div" tag—you will immediately know how to make the right choice. Moreover, you will understand why these elements are used, and this knowledge of semantics is the greatest benefit you'll gain from learning how the algorithm works.
What is the document rendering algorithm? Known in the HTML5 specification as the outline algorithm, it is a mechanism for generating the schematic structure (a summary, a sketch) of the web page, based on how the elements on the page are marked up. Every web page has a structure, and we will reveal this structure using the semantic analyzer introduced below.
Let's begin a series of examples to help us understand how the rendering algorithm works.
Let's imagine the following structure:
1 We sell beverages
1.1 Beer
1.1.1 Blonde from Vaslui
1.1.2 Brunette from Bârlad
1.1.3 Redhead from Bacău
1.2 Wine
1.2.1 Fetească Neagră
1.2.2 Muscat
1.2.3 Tămâioasă
A simple, clean, easy-to-read structure with a clear hierarchy. To make things even simpler, I'll tell you that only two things influence a document's outline:
- heading elements (h1–h6 and hgroup),
- sectioning elements (section, article, aside, and nav).
Obviously, sectioning content is the new way to structure a page in HTML5, but before that, let's see how we should use headings to achieve this structure.
Creating structure using headings (h1–h6)
To create the structure above, we could use the following markup:
<div>
<h1>We sell beverages</h1>
<h2>Beer</h2>
<h3>Blonde from Vaslui</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
<h3>Brunette from Bârlad</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
<h3>Redhead from Bacău</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
<h2>Wine</h2>
<h3>Fetească Neagră</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
<h3>Muscat</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
<h3>Tămâioasă</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
<p>All beverages sold by our company are quality certified!</p>
</div>
It's that simple. The structure above is created using headings of different levels. To check the document outline, copy the code above and paste it into the left window of the HTML semantic analyzer.
A structure created using H1–H6 tags generates implicit sections. Each heading creates its own implicit section, and any new heading of a lower level than the previous one creates another nested section within the higher-level (parent) section.
An implicit section ends when another heading of the same level or a higher level appears in the structure. In our example, the "Beer" section ends when the "Wine" section begins, and each section containing information about a specific type of beer ends when the next section with another type of beer begins.
Below you can see a clear example of an implicit section that ends with a heading of the same level and another section that ends with a heading of a higher level.
<div>
<!-- Start of implicit section -->
<h3>Blonde from Vaslui</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
<!-- A new implicit section starts here, so the above section is closed -->
<h3>Brunette from Bârlad</h3>
</div>
<div>
<!-- Start of implicit section -->
<h3>Blonde from Vaslui</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
<!-- A new implicit section starts here, with a higher-level heading -->
<h2>Brunette from Bârlad</h2>
</div>
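The rule for implicit sections can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the analyzer's code: the names outline_from_headings and render are invented for this sketch, and the headings are passed in as (level, title) pairs instead of being parsed from HTML.

```python
def outline_from_headings(headings):
    """Nest (level, title) pairs the way implicit sections nest (1 = h1 ... 6 = h6)."""
    root = {"title": None, "level": 0, "children": []}
    open_sections = [root]  # currently open implicit sections, outermost first
    for level, title in headings:
        # A heading of the same or a higher level (same or smaller number)
        # closes every open section down to its future parent.
        while len(open_sections) > 1 and open_sections[-1]["level"] >= level:
            open_sections.pop()
        node = {"title": title, "level": level, "children": []}
        open_sections[-1]["children"].append(node)
        open_sections.append(node)
    return root["children"]


def render(nodes, prefix=""):
    """Turn the nested outline into numbered lines such as '1.2.1 Title'."""
    lines = []
    for index, node in enumerate(nodes, start=1):
        number = f"{prefix}{index}"
        lines.append(f"{number} {node['title']}")
        lines.extend(render(node["children"], number + "."))
    return lines
```

Feeding it the heading levels of the beverages example (h1, h2, h3, h3, h3, h2, h3, h3, h3) should reproduce the numbered structure from the start of the article.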
Creating Structure Using Sectioning Elements
After seeing how a structure can be created using headings, let's mark up the page again—this time using HTML5 sectioning elements:
<div>
<h6>We sell beverages</h6>
<section>
<h1>Beer</h1>
<article>
<h1>Blonde from Vaslui</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
</article>
<article>
<h5>Brunette from Bârlad</h5>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
</article>
<article>
<h2>Redhead from Bacău</h2>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
</article>
</section>
<section>
<h6>Wine</h6>
<article>
<h3>Fetească Neagră</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</article>
<article>
<h3>Muscat</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</article>
<article>
<h1>Tămâioasă</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</article>
</section>
<p>All beverages sold by our company are quality certified!</p>
</div>
I didn't choose the headings this way because I've lost my mind, but to make something as obvious as possible—namely, that this time, the structure is defined by the sectioning elements and not by the headings.
Copy the code above and paste it again into the semantic analyzer, and you'll see that the headings have no effect on the page structure. The section, article, aside, and nav elements are the ones that shape the page, and in this case, they are called explicit sections.
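To see why the heading numbers stop mattering, here is a minimal Python sketch that builds the structure from sectioning elements alone. It is an illustration, not a full implementation: the names ExplicitOutline and explicit_outline are invented here, only the first heading of each section is used as its name, and implicit sections are deliberately ignored.

```python
from html.parser import HTMLParser

SECTIONING = {"section", "article", "aside", "nav"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class ExplicitOutline(HTMLParser):
    """Nest sections by element nesting only; the heading number is never read."""

    def __init__(self):
        super().__init__()
        self.root = {"title": None, "children": []}  # stands for the body
        self.stack = [self.root]                     # open explicit sections
        self.buffer = None                           # text of the heading being read

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            node = {"title": None, "children": []}
            self.stack[-1]["children"].append(node)
            self.stack.append(node)
        elif tag in HEADINGS:
            self.buffer = []

    def handle_data(self, data):
        if self.buffer is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag in SECTIONING and len(self.stack) > 1:
            self.stack.pop()
        elif tag in HEADINGS and self.buffer is not None:
            text = "".join(self.buffer).strip()
            self.buffer = None
            # Simplification: only the first heading names the section.
            if self.stack[-1]["title"] is None:
                self.stack[-1]["title"] = text


def explicit_outline(html):
    parser = ExplicitOutline()
    parser.feed(html)
    parser.close()
    return parser.root
```

Run on the markup above, the result should have "We sell beverages" at the root with "Beer" and "Wine" as its children, regardless of which h1 to h6 tags were used.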
One of the most discussed features of HTML5 is that multiple H1 headings are allowed, and now you've seen why. It's not an open invitation to mark every heading with an H1 tag, but a recognition that the document structure is created using sectioning elements, and now each section has its own heading structure.
The HTML5 specifications regarding headings and sections are very clear on this point:
☞ Sections may contain headings of any rank, but authors are strongly encouraged to either use only h1 elements, or to use elements of the appropriate rank for the section's nesting level.
Until browsers—and especially screen readers—implement and understand that sectioning elements create sub-sections, using multiple H1 headings is less safe compared to using a heading structure that logically reflects the document's content, as you can see below:
<div>
<h1>We sell beverages</h1>
<section>
<h2>Beer</h2>
<article>
<h3>Blonde from Vaslui</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
</article>
<article>
<h3>Brunette from Bârlad</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
</article>
<article>
<h3>Redhead from Bacău</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.</p>
</article>
</section>
<section>
<h2>Wine</h2>
<article>
<h3>Fetească Neagră</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</article>
<article>
<h3>Muscat</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</article>
<article>
<h3>Tămâioasă</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</article>
</section>
<p>All beverages sold by our company are quality certified!</p>
</div>
I must draw your attention to the paragraph "All beverages sold by our company are quality certified!". In the earlier example that relies only on headings, this paragraph is an implicit part of the section created by the heading "Tămâioasă". A sighted reader can clearly see that the text applies to the entire document and not just to the "Tămâioasă" section, but the outline, and therefore a screen reader, cannot.
In the example created with sectioning elements, this issue is resolved quite simply: the paragraph sits outside the section elements, so it belongs to the top-level section "We sell beverages".
Let's mix the structures a bit
Let's see what happens when implicit and explicit sections are combined. As long as you remember that implicit sections can be included in explicit sections—but not the other way around—you'll be fine. For example, the following structure works well and is perfectly correct:
<h1>We sell beverages</h1>
<section>
<h2>Wine</h2>
<h3>Fetească Neagră</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
<h3>Muscat</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
<h3>Tămâioasă</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</section>
The resulting structure is shown below:
1 We sell beverages
1.1 Wine
1.1.1 Fetească Neagră
1.1.2 Muscat
1.1.3 Tămâioasă
However, don't expect to get the same structure by inserting an explicit section inside an implicit section. It won't work!
The sectioning element <article> will simply close the implicit section created by the heading and will create a very different structure, as you can see below:
<div>
<h1>We sell beverages</h1>
<h2>Wine</h2>
<article>
<h3>Fetească Neagră</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</article>
<article>
<h3>Muscat</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</article>
<article>
<h3>Tămâioasă</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</article>
<p>All beverages sold by our company are quality certified!</p>
</div>
The code above will produce the following structure:
1 We sell beverages
1.1 Wine
1.2 Fetească Neagră
1.3 Muscat
1.4 Tămâioasă
There is no way to make the explicit sections created by the article tag become sub-sections of the implicit section "Wine". Headings can be used to organize the content of sectioning elements, but not the other way around.
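The interaction between implicit and explicit sections can be sketched in Python as follows. This is a simplified reading of the algorithm, not the specification's full text: the class names Section, Target and OutlineSketch and the functions outline and render are invented for this sketch, the body is assumed to be the only sectioning root, and hgroup is not handled.

```python
from html.parser import HTMLParser

SECTIONING = {"section", "article", "aside", "nav"}
HEADING_LEVEL = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class Section:
    def __init__(self, parent=None):
        self.title = None     # heading text, if the section has one
        self.level = None     # 1 for h1 ... 6 for h6
        self.children = []
        self.parent = parent  # enclosing implicit section, if any


class Target:
    """The body or one sectioning element, with its own list of sections."""
    def __init__(self):
        self.sections = [Section()]
        self.current = self.sections[0]


class OutlineSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.targets = [Target()]  # targets[0] stands for the body
        self.heading = None        # (level, text parts) while inside a heading

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.targets.append(Target())
        elif tag in HEADING_LEVEL:
            self.heading = (HEADING_LEVEL[tag], [])

    def handle_data(self, data):
        if self.heading is not None:
            self.heading[1].append(data)

    def handle_endtag(self, tag):
        if tag in SECTIONING and len(self.targets) > 1:
            finished = self.targets.pop()
            parent = self.targets[-1]
            # An explicit section attaches to the parent's last top-level
            # section, never to an implicit section opened by a heading.
            parent.current = parent.sections[-1]
            parent.current.children.extend(finished.sections)
        elif tag in HEADING_LEVEL and self.heading is not None:
            level, parts = self.heading
            self.heading = None
            self.add_heading(level, "".join(parts).strip())

    def add_heading(self, level, title):
        target = self.targets[-1]
        if target.current.title is None:
            section = target.current         # first heading names the section
        elif level <= target.sections[-1].level:
            section = Section()              # same or higher level: new top-level section
            target.sections.append(section)
        else:
            parent = target.current
            while level <= parent.level:     # close implicit sections of same or higher level
                parent = parent.parent
            section = Section(parent)
            parent.children.append(section)
        section.title, section.level = title, level
        target.current = section


def render(sections, prefix=""):
    lines = []
    for index, section in enumerate(sections, start=1):
        number = f"{prefix}{index}"
        lines.append(f"{number} {section.title or 'Untitled section'}")
        lines.extend(render(section.children, number + "."))
    return lines


def outline(html):
    parser = OutlineSketch()
    parser.feed(html)
    parser.close()
    return render(parser.targets[0].sections)
```

The key line is the one that resets parent.current when a sectioning element closes: it is the reason the article elements end up as siblings of "Wine" rather than as its children.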
Important to Follow
Untitled Sections
Until now, we haven't discussed the nav and aside tags at all, but you should know that they behave similarly to the section and article tags.
If you have secondary content on the page that is related to the main topic—let's say, in our example, techniques for producing beverages—you could mark this content with the aside tag, which will result in the creation of an explicit section.
It is not absolutely necessary to use headings for nav and aside; when they have none, they generate untitled sections.
Try the following code in the semantic analyzer:
<div>
<nav>
<ul>
<li><a href="#">Home</a></li>
<li><a href="#">Beer</a></li>
<li><a href="#">Wine</a></li>
</ul>
</nav>
<h1>We sell beverages</h1>
<section>
<h2>Beer</h2>
</section>
<section>
<h2>Wine</h2>
</section>
</div>
The nav element appears as an untitled section, which is not necessarily serious and is not considered an error in HTML5. However, it is recommended to use headings for every sectioning element in the document, the main reason being improved accessibility.
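If you want to spot untitled sections without opening the analyzer, a rough check can be written in Python. This is a hedged sketch, not a validator: the names UntitledFinder and untitled_sections are invented here, and a heading is simply credited to the innermost open sectioning element.

```python
from html.parser import HTMLParser

SECTIONING = {"section", "article", "aside", "nav"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class UntitledFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_sections = []  # [tag, has_heading] per open sectioning element
        self.untitled = []       # tags of sections that closed without a heading

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.open_sections.append([tag, False])
        elif tag in HEADINGS and self.open_sections:
            # The heading belongs to the innermost open section.
            self.open_sections[-1][1] = True

    def handle_endtag(self, tag):
        if tag in SECTIONING and self.open_sections:
            name, has_heading = self.open_sections.pop()
            if not has_heading:
                self.untitled.append(name)


def untitled_sections(html):
    """Return the tag names of sectioning elements that contain no heading."""
    finder = UntitledFinder()
    finder.feed(html)
    finder.close()
    return finder.untitled
```

For the markup above, the function should report only the nav element, since both section elements carry an h2.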
On the other hand, using section and article elements without headings should be strictly avoided.
If you're unsure when to use these two tags, a good rule is to follow the logic of the document and check whether the content naturally calls for a heading of its own. If not, then the wisest choice would be to use the old div.
The HTML5 specifications do not clearly state that sectioning elements must have a heading:
The section element represents a generic section of a document or application. A section, in this context, is a thematic grouping of content, typically with a heading.
However, the interpretation I give to the text above—especially the phrase "typically with a heading"—is that I would need a damn good reason not to use a heading for sections. I absolutely cannot understand it as a suggestion to ignore the use of headings with the new HTML5 elements.
Still, the specifications go even further and provide an example of using the article tag as markup for a blog's comment section, of course without a heading. So exceptions do exist. In conclusion, when using the "section" and "article" tags on a page, make sure you have a solid reason not to include a heading as well.
Root Sections
If you've been paying attention, you'll remember I said that sectioning elements cannot create a sub-section of an implicit section. Yet in the last example, the H1 heading "We sell beverages", which is not inside any sectioning element, is immediately followed by the "Beer" section, and that section does become a sub-section of the H1 heading.
This happens because of root sections (sectioning roots). A sectioning root creates its own structure in the document, separate from the rest of the content, and that structure is not inherited from or passed on to its ancestors. As the specifications indicate, sectioning elements create sub-sections of the nearest root ancestor or the nearest parent sectioning element:
☞ Sectioning content elements are always considered subsections of their nearest ancestor sectioning root or their nearest ancestor element of sectioning content, whichever is nearest.
The body element is a root element.
For example, in the code below, the H1 heading is a root heading, and the "section" element is a sub-section of the root element "body":
<h1>We sell beverages</h1>
<section>
<h2>Wine</h2>
<h3>Fetească Neagră</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
<h3>Muscat</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
<h3>Tămâioasă</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing.</p>
</section>
There are five other root elements:
- blockquote
- details
- fieldset
- figure
- td
One important thing to note is that although these elements can contain their own internal structure of headings and sections, that structure will not affect the structure of their ancestors.
Finally, one last thing about the root element that brings a bit of joy: the first heading in the document that is not inside a sectioning element is the one that will serve as the document's title. Check the code below in the semantic analyzer.
<section>
<h1>This is an H1 heading.</h1>
</section>
<h6>This H6 heading is the first outside the section.</h6>
<h1>This H1 heading is the last.</h1>
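The rule for the document's title can also be checked with a short Python sketch. It is an approximation under stated assumptions: the names TitleFinder and document_title are invented here, only section, article, aside and nav are treated as sectioning elements, and other sectioning roots such as blockquote are not taken into account.

```python
from html.parser import HTMLParser

SECTIONING = {"section", "article", "aside", "nav"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class TitleFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0     # number of currently open sectioning elements
        self.title = None  # first heading found outside every sectioning element
        self.parts = None  # text pieces of the heading being read

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag in HEADINGS and self.depth == 0 and self.title is None:
            self.parts = []

    def handle_data(self, data):
        if self.parts is not None:
            self.parts.append(data)

    def handle_endtag(self, tag):
        if tag in SECTIONING and self.depth > 0:
            self.depth -= 1
        elif tag in HEADINGS and self.parts is not None:
            self.title = "".join(self.parts).strip()
            self.parts = None


def document_title(html):
    """Return the first heading outside all sectioning elements, or None."""
    finder = TitleFinder()
    finder.feed(html)
    finder.close()
    return finder.title
```

For the three headings above it should pick the H6 text, because that heading is the first one outside the section element, whatever its number. A return value of None corresponds to an untitled document.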
Untitled Documents
If there is no heading in the root of the document (headings inside sectioning elements do not count), then the document itself will be untitled. This issue is quite serious and can occur either due to negligence or, paradoxically, from paying too much attention to how sectioning elements should be used.
In Conclusion
The logic behind the document rendering algorithm can be difficult to understand, and the specifications themselves are also quite hard to grasp—even after repeated readings.
However, if you remember the basic concepts, such as the fact that section, article, aside, and nav elements create sub-sections in the web document, then you're 90% on the right track.
It's a good idea, after marking up content with sectioning elements, to check the result in a semantic analyzer or using other tools in your text editor, because with enough practice and repetition, you'll come to understand the rendering algorithm very well. Once you understand the algorithm, every web page you create will be semantically well-structured and robust.
