HTML and captions
If you have encountered images on the Web (and also if somehow you have not) then you might have heard of concepts like alt-text, or perhaps title text, or the general idea of captioning images.
When it comes to images and image-like elements in HTML5, there are actually several ways of attaching things that could be called captions. They have slightly different purposes, and present differently in user agents. This is a brief overview of what they are, and what they are meant for.
The title attribute
title is a global attribute (meaning available on all HTML elements) which allows for adding what is termed advisory information to HTML elements. An example is in listing 1.
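A minimal sketch of the attribute in use might look like the following. The element contents, URL, and file name are placeholders invented for illustration.

```html
<!-- Hypothetical examples; the URL, file name, and wording are placeholders. -->
<p title="Advisory information about this paragraph">
  Hover over this paragraph to see a tool-tip (in user agents that have them).
</p>

<a href="https://example.com/" title="Opens the example site">A link with title text</a>

<img src="cat.jpg" alt="A grey cat asleep on a windowsill" title="Taken last summer">
```

Note that the image still carries its own alt text; the title value is extra information, not a replacement for it.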
The advisory information is usually displayed in the form of a tool-tip—a box that pops up when hovering your mouse pointer over something. The attribute can therefore be used to add tool-tips to things like paragraphs, links, or, yes, images.
There is an immediately obvious problem here: some user agents don't really have tool-tips. Touchscreen devices, for one, usually do not have mouse pointers, and support for displaying title text in browsers for such devices may be rather lacking. Even on an ordinary desktop computer with a mouse-like pointing device, title text often has poor discoverability. The standard itself discourages reliance on the title attribute for these reasons.
Title text does have its uses: in interfaces where it is already expected, or on things like buttons with explanatory tool-tips. In general, though, it is not a good way to caption images by itself.
The alt attribute
<img> elements (which are the usual way of embedding images in an HTML document) can have an alt attribute, which defines the element's fallback content. Listing 2 provides an example.
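As a rough illustration, an image with fallback content could be marked up as below. The file name and the wording of the alt text are assumptions made up for this sketch.

```html
<!-- Hypothetical file name and wording, for illustration only. -->
<img src="sales-chart.png"
     alt="Bar chart: sales rose from 10 units in January to 40 units in April">
```

The alt text here tries to convey what the chart says, so that a reader who never sees the image still gets the information.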
Fallback content, as the name indicates, is the content presented to the user if the main content cannot be presented. An obvious case of this is screen readers: they cannot display images, so instead they read out the alt text. A less obvious case is text mode browsers like Lynx, or situations where network problems or explicit settings mean a more ordinary browser cannot display a particular image.
Of note is the fact that the HTML5 standard defines different semantics for missing and empty alt attributes. An alt attribute set to an empty string (alt="") means that the image is decorative or supplemental, and so user agents incapable of rendering images can skip it entirely, as if it did not exist. On the other hand, <img> elements with no alt attribute at all are considered part of the content, but with no alternate textual representation. A text mode browser or a screen reader can skip images with empty alt text entirely, while for images with a missing alt attribute, it should indicate that the image is there but does not have fallback content. Note that instead of <img> tags with empty string alt attributes, the general recommendation is to include images via CSS if they are non-essential parts of the UI, or otherwise purely decorative.
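The three cases can be sketched side by side. All file names and the class name fancy-heading are hypothetical, chosen only for this example.

```html
<!-- Decorative: empty alt, so non-visual user agents may skip it. -->
<img src="divider.png" alt="">

<!-- Part of the content, but no text alternative is available:
     user agents should indicate that an image is present. -->
<img src="upload-1234.jpg">

<!-- Purely decorative image supplied through CSS instead of <img>. -->
<style>
  .fancy-heading {
    background-image: url("flourish.png");
    background-repeat: no-repeat;
    padding-left: 40px;
  }
</style>
<h2 class="fancy-heading">Chapter one</h2>
```

With the CSS approach, the decoration never appears in the document's content at all, so there is nothing for a screen reader or text mode browser to skip.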
alt text is generally not expected to be displayed in a tool-tip. It is not meant to provide supplemental information, but rather an alternate representation of the content. People, however, often conflate the two, likely due to a historical quirk: old versions of Internet Explorer used to display alt text in tool-tips, as if it were title text. Microsoft's more modern browsers no longer do this, but some people still hold the expectation of that behavior.
Figures and their captions
A good way to include a caption with an image in HTML5 is to use the <figure> element. A <figure> element can include a <figcaption> element, which, as the name hints, contains the caption for the figure.
<figure>s can include things besides images; in fact, they can include pretty much anything. This means that your figures can be, for example, math formulas or code samples (like, say, a sample of how to use figure elements). They provide a way to mark up some content that is relevant to, but separate from, the main text on the page. The standard describes it as content that is "self-contained (like a complete sentence)".
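Two small sketches may help here: one figure wrapping an image, and one wrapping a code sample. File names and caption wording are invented for illustration.

```html
<!-- A figure containing an image; the file name is a placeholder. -->
<figure>
  <img src="lighthouse.jpg" alt="A white lighthouse on a rocky shore at dusk">
  <figcaption>Figure 1: the lighthouse at dusk.</figcaption>
</figure>

<!-- A figure containing a code sample instead of an image. -->
<figure>
  <pre><code>&lt;figure&gt;
  &lt;img src="photo.jpg" alt="Replacement text"&gt;
  &lt;figcaption&gt;A caption&lt;/figcaption&gt;
&lt;/figure&gt;</code></pre>
  <figcaption>Listing: a sample of how to use figure elements.</figcaption>
</figure>
```

In both cases the image keeps its alt attribute; the caption accompanies the content, while the alt text stands in for it.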
The advantage of using <figure>s is that they are semantic HTML: elements within the figure are distinguished as separate from the rest of the content, and the use of <figcaption> makes it clear what is being captioned. Unlike title attributes, <figcaption> elements can contain any flow content, are generally displayed like any other text, and can be styled with CSS, which gives them better discoverability.
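As one possible illustration of both points, the sketch below puts a link inside a caption and applies some simple styling. The specific style rules, URL, and file name are assumptions, not recommendations from the standard.

```html
<style>
  /* Hypothetical styling; element selectors keep the sketch short. */
  figure {
    margin: 1em auto;
    padding: 0.5em;
    border: 1px solid #ccc;
    max-width: 30em;
  }
  figcaption {
    margin-top: 0.5em;
    font-style: italic;
    text-align: center;
  }
</style>

<figure>
  <img src="bridge.jpg" alt="A stone bridge over a narrow river">
  <figcaption>
    The old bridge, seen from the <a href="https://example.com/map">east bank</a>.
  </figcaption>
</figure>
```

The border groups the image and caption visually, which mirrors the relationship the markup already expresses.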
Summary
To sum up: generally, all <img> elements should have an alt attribute, even if it is an empty string. The alt text should not provide supplemental description of the image, but rather it should be the replacement text, to be used when the image cannot be displayed.
In situations where it makes sense to call something a figure, use the <figure> element, together with <figcaption> marking up the caption. Style with CSS as needed, showing the figure–caption relationship visually as well.
Use the title attribute sparingly, ideally in situations where it's already expected. Keep in mind that it is less discoverable than the alternatives, and might actually be impossible to read with some user agents.
Further reading
Standard documents: