Display objects (figures, tables, boxes, math)

Status: DRAFT


JATS4R would be grateful for any input on this document during the draft period from anyone interested in how to tag display objects using JATS XML. To comment, please add your comments to the Display Objects working document, which is publicly available. The deadline for adding comments is 25 April 2017.


Common: <label>, <caption>, <title>, <alt-text>, <long-desc>, <alternatives>, <permissions>, <graphic>, <media>, <object-id>

Specific: <fig>, <fig-group>, <table-wrap>, <table-wrap-group>, <boxed-text>, <chem-struct-wrap>, <disp-formula-group>


These recommendations describe best practices for tagging display objects in JATS, which include (but are not limited to) the following:

  • <boxed-text>
  • <disp-formula>, <disp-formula-group>, <inline-formula>
  • <fig>, <fig-group>
  • <graphic>, <inline-graphic>
  • <table-wrap>, <table-wrap-group>

There are two parts: the first part of the recommendations deals with those elements and attributes that are common to all display objects, and the second part deals with recommendations specific to certain display objects.

Additional reading

  1. Best practices related to tagging captions from the JATS tag library (v1.1): http://jats.nlm.nih.gov/publishing/tag-library/1.1/element/caption.html
  2. Best practices related to accessibility from the JATS tag library (v1.1): http://jats.nlm.nih.gov/publishing/tag-library/1.1/element/alt-text.html
  3. Specifications (XML and HTML) and the production of identifiers in XML: an analysis by Nick Nunes (HighWire): https://gist.github.com/nine9ths/e5878f0b5f4a462f870ec6a67980b2cf


Part A: Recommendations on elements and attributes common to all display objects

  1. @id. To cross-reference an object (for instance, to link a mention in the text to the object), the object must have an ID that is unique within the document. See Additional reading item 3, above, on using identifiers in XML.
    [[Validator result: warning]].
  2. @content-type. This is a general-use attribute that is available on many elements in JATS. It has no controlled or suggested values and is available for internal use by publishers within their systems. It is intended to let an element be treated in a distinct way; for example, to specify information classes, semantic categories, or functions for grouping elements. It can also be used to preserve semantic tagging that can’t be captured directly with an element in JATS. Because @content-type does not affect general machine readability, JATS4R has no particular recommendations on the use of this attribute.
  3. @orientation. This attribute does not affect machine reusability or exchange and is therefore not covered here.
  4. @position. Use this attribute to denote how the display object can be processed. A value of “anchor” indicates that the object must remain in its exact position in the text flow. A value of “float” indicates that the object is not anchored and may be moved, for example to a new column, a new window, a new page, or the end of the document. A value of “margin” indicates that, in print, the object should be placed in the margin or gutter; online, the object should remain closely associated with the text. A value of “background” indicates that the object (typically an image) is displayed visually “behind” the narrative text. If the attribute is not set, the default is “float”. See also the notes on @position for the child element <graphic> in the JATS documentation.
  5. @specific-use. Use this attribute when making usage or processing distinctions. @specific-use should not be used to specify differences in appearance or formatting. It has no controlled or suggested values and is intended for internal use by publishers within their systems and not for general machine readability.
  6. @xml:lang. The default language of a JATS article is English, so @xml:lang is only needed on <article> when the language of the article as a whole is not English. This value is inherited by all other elements in the article, so only use @xml:lang on objects within the document if their language differs from that of the article as a whole (e.g., for a translated table). When setting the value, use an appropriate two-letter, lowercase value to represent the language. The value should come from the IANA language subtag registry: http://www.iana.org/assignments/language-subtag-registry (e.g., “en” for English, “fr” for French). For more information on best practices related to this attribute and the indication of scripting (such as for Asian languages that may be represented in more than one script form), see the JATS tag library notes on this attribute. [[Validator tool result: check whether the value of xml:lang exists in the language registry; if not, then an error ]]
  7. @xml:base. @xml:base provides a base URI for identifiers in the XML document or a part of the XML document. For example, if a document has an @xml:base of “https://jats.nlm.nih.gov/”, a URI inside the document “publishing/rationale.html” would be processed as if it were “https://jats.nlm.nih.gov/publishing/rationale.html”. Values of @xml:base are inherited, meaning that if there is an @xml:base on a document and on a part of the document, uses inside the part of the document take the value of the part, not the whole. [[Validator tool result: if xml:base is not a URI, result is an error ]]
  8. <object-id>. Some publishers assign component DOIs to specific parts of the article or to associated items such as figures or videos. In such instances the DOI must be identified using <object-id> with @pub-id-type="doi". [[Validator result: check that it is a DOI (2 digits, then “.”, then 4 or 5 digits, then “/” and a suffix) ]]
  9. Labels: <label>. If the object has a label (e.g., “Fig. 1”, “Table 2”, “Box 1.1”) and this information is to be captured in the XML (rather than generated by a style sheet), then contain this information within <label>, and not within <caption>. [[Validator result: warning if label-like text is found in the caption rather than in <label> ]]
  10. Titles: <caption>, <p>, <title>. If the object has a title that is obviously separate from the main caption text (for example, if the first sentence is the title), then contain this information within the <title> element inside <caption>. Put the remainder of the caption within <p>, after <title>.
    Do not use labels (such as “Fig. 1”) within <title> or <caption><p>; use <label> instead (see Recommendation 9, above). [[Validator result: warning if sentence starts with “Fig”/“Table”/“Box” ]]
  11. <alt-text>. This element should be reserved for accessibility uses, such as assistive reading devices, and need not be a direct copy of the caption text. If used, this element should contain a piece of text that describes the object in a useful way and reflects what the reading device will tell the user about the object as a whole; the text itself is not displayed to the reader.
    • Further notes on usage for alternative text: <alt-text> can be considered analogous to @alt in HTML. Alternative text is text that is associated with an image and that serves the same purpose and conveys the same essential information as the image. In situations where the image is not available to the reader, perhaps because they have turned off images in their web browser or are using an assistive device to help with a visual impairment, the alternative text ensures that no information or functionality is lost. For example, if an image is used as a link, then the alternative text should contain linking information. For complex images, such as bar charts, the alternative text should describe the relative heights of the bars so that the reader can understand the intent of the image.
  12. <long-desc>. This element should be reserved for accessibility uses, such as assistive reading devices. If used, this element should contain a URI that points to a long description of the object, which is reflective of the image’s function and intended purpose (see the note on usage for alternative text, above). Include a short textual description within <long-desc> which ends with an untagged URI pointing to the location of the actual long description, and repeat the URI in the @xlink:href attribute so that a link can be made. Note that this element can be considered analogous to @longdesc in HTML, which is intended to contain a URL pointing to a long description of the object.
  13. Alternative forms: <alternatives>. If more than one form is used to represent the same object, then use <alternatives> to contain the various forms. For example, you may have different image types (.gif, .eps, etc.) to represent a given figure, or you may have a table that is represented with both a graphic (.gif) and XHTML or OASIS CALS XML. As another example, you may have one version of the graphic intended for instructors and a different one intended for students. In all cases, it is conceptually a single object that is represented in a variety of forms (image, XML, etc.). Use @specific-use to indicate what each form of the object is meant for. For example, @specific-use="web-version" vs @specific-use="print-version", or alternatively, @specific-use="for-instructors" vs @specific-use="for-students", or @specific-use="low-res" vs @specific-use="high-res". The values of @specific-use are publisher-specific and do not affect machine reusability (see Recommendation 5, above). [[Validator result: ]]
  14. Permissions: <permissions>. If the permissions on the object (i.e., the licensing or the copyright or both) are different from those of the parent article, then use the <permissions> element to contain this information, and follow the JATS4R recommendations for permissions tagging.
  15. Mime-types: @mimetype and @mime-subtype. These attributes identify the type of a media object or graphic and must be used for media (<media>); see Part B, Recommendation 1, below. For graphics and inline graphics, either use @mime-subtype or provide the file extension in the XML. The values should come from the media types registry of the Internet Assigned Numbers Authority (IANA), which can be confusing to the uninitiated. The registry is split into sections for 10 mimetypes: application; audio; font; example; image; message; model; multipart; text; video. Each section has a three-column table: column 1, “Name”, contains the name of the mime-subtype (for example, within “video”, “mp4” is listed here); column 2, confusingly headed “Template”, contains the full mime-type that should be used (for example, video/mp4); and column 3 provides a reference. “mp4”, for example, is listed under application, audio, and video, so it is important to determine which one applies. If you split mime-type and mime-subtype in your XML document, the term before the slash is the mime-type and the term after it is the mime-subtype.
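
As a rough illustration of how several of the Part A recommendations might combine on one object, the sketch below tags a single hypothetical figure. Every id, DOI, file name, URL, and piece of caption text is a placeholder invented for this sketch, and the fragment assumes that the xlink namespace prefix is declared on the enclosing <article>, as is usual in JATS.

```xml
<!-- Hypothetical figure; all ids, DOIs, file names, and URLs are placeholders. -->
<fig id="fig1" position="float">
  <!-- Rec. 8: component DOI assigned to the figure -->
  <object-id pub-id-type="doi">10.5555/example.article.fig1</object-id>
  <!-- Rec. 9: the label is kept out of the caption -->
  <label>Fig. 1.</label>
  <!-- Rec. 10: title first, remaining caption text in <p> -->
  <caption>
    <title>Mean bar height by treatment group.</title>
    <p>Error bars show one standard deviation.</p>
  </caption>
  <!-- Rec. 11: short description for assistive devices -->
  <alt-text>Bar chart with three bars; the control bar is about half the height of the two treatment bars.</alt-text>
  <!-- Rec. 12: short text ending in the URI, with the URI repeated in @xlink:href -->
  <long-desc xlink:href="https://example.org/fig1-description.html">Full description of the bar chart: https://example.org/fig1-description.html</long-desc>
  <!-- Rec. 13 and 15: one object in two forms; MIME attributes and file extensions identify the type -->
  <alternatives>
    <graphic specific-use="print-version" mimetype="image" mime-subtype="tiff" xlink:href="fig1.tif"/>
    <graphic specific-use="web-version" mimetype="image" mime-subtype="png" xlink:href="fig1.png"/>
  </alternatives>
  <!-- Rec. 14: only needed when the figure's permissions differ from the article's -->
  <permissions>
    <copyright-statement>© 2017 Example Rights Holder</copyright-statement>
    <license xlink:href="https://creativecommons.org/licenses/by-nc/4.0/">
      <license-p>This figure is reused under a CC BY-NC 4.0 licence.</license-p>
    </license>
  </permissions>
</fig>
```

The @specific-use values here are the ones suggested in Recommendation 13; a publisher could equally use its own internal values, since they are not meant for general machine readability.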

Part B: Recommendations on mark-up for specific display objects or object parts

  1. <graphic> vs <media>. Use <graphic> for still images only; use <media> for objects such as video and audio files.
  2. Figures (<fig>). A figure is a display object that generally contains one or more images. It should be named (with <label>) and may contain a title and caption. A single figure may contain multiple image files, but there may only be one caption. If each of the images has its own caption, use <fig-group> (see Recommendation 5, below).
  3. Tables (<table-wrap>). A table is a display object that generally contains one or more arrays of information. It should be named (with <label>) and may contain a title and caption. A single table may contain multiple arrays, but there may only be one caption. If each of the arrays has its own caption, use <table-wrap-group> (see Recommendation 5, below).
  4. Tables v. figures or boxes. If the object is actually a table (i.e., an array of information) but is represented graphically (i.e., with an image), then do not use <fig> to contain the image. Instead, contain the <graphic> element referencing the image within <table-wrap>. [[Validator result: ]]
  5. Group elements (<fig-group>, <table-wrap-group>, and <disp-formula-group>). The object grouping elements should only be used for a collection of objects that are presented as a larger object. There should be a label, caption, and title at the group level, and each of the child objects should be complete objects with their own label, caption, and title.
    For example, a figure group can be a group of related figures, each with its own label and caption, that are all part of a single set for which there is a label and caption that applies to the entire set. A figure with more than one image (e.g., Parts A-D provided as separate files) is not a <fig-group> unless those parts are each a <fig> with its own caption and title. A multipart figure that has only a single title/caption should be tagged as just <fig>.
    Similarly, a table group can be a group of related tables that has a caption and/or title that applies to the set as a whole. A <table-wrap> may contain many instances of tabular material (<table>), but it should not be a <table-wrap-group> unless each is a complete <table-wrap> with its own label, caption, and title. A multipart table that has only a single title/caption should be tagged as just <table-wrap>.
    [[Validator result: Error if a <table-wrap-group>, <fig-group>, or <disp-formula-group> does not contain a non-empty label, caption/p, or caption/title. Error if a <table-wrap-group>, <fig-group>, or <disp-formula-group> does not contain more than one of the corresponding child objects, each with its own label, caption/p, or caption/title. ]]
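
The Part B recommendations could be tagged along the following lines. This is only a sketch: the ids, labels, caption text, and file names are all hypothetical, and the fragment assumes that the xlink namespace prefix is declared on the enclosing <article>.

```xml
<!-- Hypothetical markup; ids, labels, caption text, and file names are placeholders. -->

<!-- Rec. 5: a figure group in which the set and each member figure are complete objects -->
<fig-group id="fg3" position="float">
  <label>Figure 3.</label>
  <caption>
    <title>Growth curves under two temperature regimes.</title>
    <p>Each panel is a complete figure with its own label and caption.</p>
  </caption>
  <fig id="fg3a" position="float">
    <label>Figure 3A.</label>
    <caption><title>Growth at the lower temperature.</title></caption>
    <graphic xlink:href="fig3a.png"/>
  </fig>
  <fig id="fg3b" position="float">
    <label>Figure 3B.</label>
    <caption><title>Growth at the higher temperature.</title></caption>
    <graphic xlink:href="fig3b.png"/>
  </fig>
</fig-group>

<!-- Rec. 4: a table supplied as an image still goes in <table-wrap>, not <fig> -->
<table-wrap id="tab2" position="float">
  <label>Table 2.</label>
  <caption><title>Sampling sites.</title></caption>
  <graphic xlink:href="table2.png"/>
</table-wrap>

<!-- Rec. 1: video uses <media> with both MIME attributes; still images use <graphic> -->
<media id="vid1" mimetype="video" mime-subtype="mp4" xlink:href="video1.mp4">
  <label>Video 1.</label>
  <caption><title>Time-lapse recording of the growth experiment.</title></caption>
</media>
```

If the two panels of Figure 3 shared a single caption, the same content would instead be one <fig> containing two <graphic> elements.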


Example 1: a table with a title and some caption text

<table-wrap>
    <label>Table 1.</label>
    <caption><title>Results for 25-year simulations for the most unfavorable of the four low survival scenarios, the most optimistic of the four high survival conditions.</title>
      <p>Whole numbers are total number of individuals or items (e.g., &#x0201C;fecundity&#x0201D; is the number of eggs per female; &#x0201C;Female spawners at EQ&#x0201D;, &#x0201C;Total fry&#x0201D;, &#x0201C;Total parr&#x0201D;, and &#x0201C;Total smolt&#x0201D; are total numbers of each life stage). &#x0201C;Beta&#x0201D; is the inverse of the number of fry at which the density-dependent fry-to-parr survival rate is reduced to Alpha/2. &#x0201C;Spawner-to-spawner&#x0201D; is the number of adult spawners of both sexes that are produced by the total number of adult female spawners in the preceding generation. Decimal fractions less than 1.0 are either survival rates or proportions of individuals in a specific life stage.</p></caption>
    <!-- table body omitted -->
</table-wrap>

Example 2: A table with only a title

<table-wrap id="tab1" position="anchor">
    <label>Table 1.</label>
    <caption><title>Designed criterion for the set of faults.</title></caption>
    <!-- table body omitted -->
</table-wrap>

Example 3: A figure with alternative graphical representations

<fig id="f2" orientation="portrait" position="float">
    <label>Fig. 2.</label>
    <caption><p>Numerical simulation response of the 12-pulse LCC-HVDC transmission system subjected to test includes healthy and generator side single-line-to-ground fault conditions by the DWT diagnostic tool. (<italic>a</italic>) Three-phase line-to-line voltage measured on the generator side. (<italic>b</italic>) Three-phase line current measured on the generator side. (<italic>c</italic>) DC-link measured voltage. (<italic>d</italic>) DC-link measured current. (<italic>e</italic>) Discrete wavelet transform response.</p></caption>
    <alternatives>
        <graphic specific-use="print" xlink:href="facets-2015-ab13f002.tif"/>
        <graphic specific-use="web" xlink:href="facets-2015-ab13f002.gif"/>
    </alternatives>
</fig>

Example 4: A figure with a DOI in <object-id>

<fig id="elementa.000120.f001" position="float">
    <object-id pub-id-type="doi">10.12952/journal.elementa.000120.f001</object-id>
    <label>Figure 1.</label>
    <caption><title>Map of the study area and location of sampling sites in 2013 and 2014.</title>
      <p>Samples were taken in (A) Kanajorsuit Bay, Greenland (N 64.44632, W 51.57724), between 27 March and 5 April, 2013, and in (B) Kobbefjord, Greenland (N 64.15340, W 51.42275), between 12 and 21 March, 2014.</p></caption>
    <graphic position="float" mimetype="image" xlink:type="simple" xlink:href="journal.elementa.000120.f001.png"/>
</fig>