Data availability statements

Status: Published
Version: 1
License: this recommendation document is licensed under CC BY-ND 2.0 UK

Provenance

JATS4R subgroup. Members (in alphabetical order):

Ton Bos, Elsevier; Paul Donohoe, SpringerNature; Melissa Harrison, eLife (Chair); Christina Von Raesfeld, PLoS; Kelly McDougall, MIT Press

Context

<back>, <sec>, <title>, <ref-list>, <element-citation>, <mixed-citation>

@sec-type, @publication-type, @specific-use

Description

There are three types of data that may be associated with an article:

  • Generated data: included or referenced data that were generated for the study
  • Analysed data: referenced data that were analysed for the study but were not generated for the study
  • Non-analysed data: referenced data that were neither generated nor analysed for the study 
    Note: Most publishers discourage the citation of non-analysed data but there are exceptions. For example, this kind of data may be used by authors to acknowledge that other researchers have done similar work but have used different methods or techniques; in this case, the authors of this paper have not analysed the other researchers’ data but are aware of their work and need to cite it for completeness of the record

Full, structured citations should be provided for any referenced data; i.e. externally archived generated data, analysed data, and non-analysed data.

Recommended practice is to indicate the location(s) of any data that were generated or analysed for the study (that is, not non-analysed data) in a Data Availability Statement (DAS). Some publishers may use the term “Data Accessibility Statement” for this content. We recommend excluding non-analysed data from the DAS and, instead, including them only in the main reference list of the article. For any data that are not provided within the paper (i.e. as a table or a supplementary file/source data file), the DAS should provide access to the external data locations, either as direct links or as links to the relevant references in the article’s reference list(s). If any generated or analysed data are not publicly available, the DAS should provide reasons for the omission.

This recommendation excludes guidance for software.
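
The rules in this description can be summarised as a small decision sketch. The names below (DataType, belongs_in_das, needs_full_citation, included_in_paper) are hypothetical and chosen for illustration only; they are not part of JATS or of any JATS4R tool.

```python
from enum import Enum


class DataType(Enum):
    """The three types of data described above (hypothetical names)."""
    GENERATED = "generated"
    ANALYSED = "analysed"
    NON_ANALYSED = "non-analysed"


def belongs_in_das(data_type):
    """Generated and analysed data are indicated in the DAS; non-analysed data are not."""
    return data_type in (DataType.GENERATED, DataType.ANALYSED)


def needs_full_citation(data_type, included_in_paper=False):
    """Referenced data need a full, structured citation.

    Assumption: only generated data can be 'included in the paper'
    (e.g. as a table or supplementary file), in which case no citation
    to an external location is needed.
    """
    return not (data_type is DataType.GENERATED and included_in_paper)


for data_type in DataType:
    print(data_type.value, belongs_in_das(data_type), needs_full_citation(data_type))
```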

Recommendation

  1. <back>. Contain the DAS in a <sec> element within <back>. 

    [[Validator tool result: if <sec @sec-type="data-availability"> is found outside <back>: ERROR]]
  2. @sec-type="data-availability". Use this attribute on the <sec> containing the DAS.

    [[Validator tool result: if any <sec> anywhere in the XML document has a @sec-type with one of the following values: "Data availability", "Data Availability", "Data-Availability", "data availability statement", "Data availability statement", "Data Availability Statement", "data-availability-statement", "Data-availability-statement", "Data-Availability-Statement", "Data_Availability", "data_availability-statement", "Data_availability-statement", "Data_Availability-Statement", "data_availability", "Data Accessibility", "Data accessibility": ERROR]]
  3. <title>. Use a <title> element to contain the title (“Data Availability”), and contain the substance of the text within <p>.

    [[Validator tool result: if <title> is not found, or its content does not match “Data Availability”: ERROR]]
  4. <ref-list>. References for the data. There are four options for capturing references to data, as follows:
    1. Contain the references within the main <ref-list> for the article (as recommended by the Force11 Publishers Early Adopters Expert Group, and as recognised by Google Scholar)
    2. Contain references to data within a <ref-list> directly in the <sec @sec-type="data-availability"> element
    3. Insert the data citations solely as <element-citation> or <mixed-citation> elements directly in the <sec @sec-type="data-availability"> element
    4. Contain the data references within a sub-level <ref-list> at the end of the article
  5. @publication-type="data" on <element-citation> or <mixed-citation>. Use this attribute on all <element-citation> or <mixed-citation> elements that contain references to data.

    [[Validator tool result: see validator tool rules for JATS4R recommendation on Data citations]]
  6. @specific-use on <element-citation> or <mixed-citation>. For publishers who elect to collect such granularity in their workflow, see the table below for four @specific-use attributes recommended for JATS XML. For publishers who use the Relation Type method for Crossref deposits we’ve provided a mapping in the table.

| Data type (@specific-use) | Description | Map to this Crossref Relation Type |
|---|---|---|
| "supporting" | Data that support the study’s findings. Use this generic value if you do not wish to further distinguish whether the supporting data were generated or analyzed | "references" |
| "generated" | Supporting data that were generated for the study | "isSupplementedBy" |
| "analyzed" | Supporting data that were analyzed (but not generated) for the study | "references" |
| "non-analyzed" | Referenced data that were neither generated nor analyzed for the study | "references" |
| (none) | Data type is not indicated (no @specific-use value is supplied) | "references" |
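
The mapping from @specific-use to Crossref Relation Type could be applied in a deposit workflow along these lines. This is a minimal sketch: the function name crossref_relation_type is hypothetical, and raising an error for an unrecognised value is an assumption, since the recommendation does not say how such values should be handled.

```python
# Mapping taken from the table in this recommendation.
CROSSREF_RELATION_TYPES = {
    "supporting": "references",
    "generated": "isSupplementedBy",
    "analyzed": "references",
    "non-analyzed": "references",
}


def crossref_relation_type(specific_use=None):
    """Return the Crossref Relation Type for a @specific-use value.

    A missing @specific-use maps to "references", as in the table.
    Assumption: any other unrecognised value is rejected with ValueError.
    """
    if specific_use is None:
        return "references"
    try:
        return CROSSREF_RELATION_TYPES[specific_use]
    except KeyError:
        raise ValueError("unrecognised @specific-use value: %r" % specific_use)


for value in ("supporting", "generated", "analyzed", "non-analyzed", None):
    print(value, "->", crossref_relation_type(value))
```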

Examples

Example 1: Data files provided in the paper

<back>
. . .
<sec sec-type="data-availability">
<title>Data Availability</title>
<p> All data are provided within the paper and its supporting information.</p>
</sec>
<ref-list>
<title>References</title>
...
</ref-list>
</back>

Example 2: Data files provided in the paper, with links to the table and supplementary files

<back>
. . .
<sec sec-type="data-availability">
<title>Data Availability</title>
<p> All data are provided in <xref ref-type="table" rid="table1">
Table 1</xref> and <xref ref-type="supplementary-material" rid="data1">Datasets S1</xref> and <xref ref-type="supplementary-material" rid="data2">S2</xref>.</p>
</sec>
<ref-list>
<title>References</title>
...
</ref-list>
</back>

Example 3: Data externally archived – full citations included in the main reference list

<back>
. . .
<sec sec-type="data-availability">
<title>Data Availability</title>
<p>The data analysis file and all annotator data files are available in the Figshare repository, <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.1285515">https://doi.org/10.6084/m9.figshare.1285515</ext-link> [<xref ref-type="bibr" rid="pone.0167292.ref032">32</xref>]. The measured and simulated Euler angles, and the simulation codes are available from the Dryad database, <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5061/dryad.cv323">https://doi.org/10.5061/dryad.cv323</ext-link> [<xref ref-type="bibr" rid="pone.0167292.ref033">33</xref>]. Microarray data are deposited in the Gene Expression Omnibus under accession number <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE70542">GSE70542</ext-link> [<xref ref-type="bibr" rid="pone.0167292.ref034">34</xref>].</p>
</sec>
<ref-list>
<title>References</title>
[Insert references here]
</ref-list>
</back>

Example 4: Data externally archived – full citations included as a ref list within the DAS

<back>
. . .
<sec sec-type="data-availability">
<title>Data Availability</title>
<p>The following datasets were generated or analyzed for this study:</p>
<ref-list>
<ref id="pone.0167830.data001">
<label>D1</label>
<element-citation publication-type="data" specific-use="generated">
<name>
<surname>Read</surname> 
<given-names>K</given-names>
</name>
<data-title>Sizing the Problem of Improving Discovery and Access to NIH-funded Data: A Preliminary Study (Datasets)</data-title>
<source>Figshare</source>
<year>2015</year>
<pub-id pub-id-type="doi" assigning-authority="figshare" xlink:href=
"https://doi.org/10.6084/m9.figshare.1285515">https://doi.org/10.6084/m9.figshare.1285515</pub-id>
</element-citation>
</ref>
<ref id="pone.0167830.data002">
<label>D2</label>
<element-citation publication-type="data" specific-use="analyzed">
<name>
<surname>Kok</surname> 
<given-names>K</given-names>
</name>
<name>
<surname>Ay</surname> 
<given-names>A</given-names>
</name>
<name>
<surname>Li</surname> 
<given-names>L</given-names>
</name> 
<data-title>Genome-wide errant targeting by Hairy</data-title>
<source>Dryad Digital Repository</source>
<year>2015</year>
<pub-id pub-id-type="doi" assigning-authority="dryad" xlink:href=
"https://doi.org/10.5061/dryad.cv323">https://doi.org/10.5061/dryad.cv323</pub-id>
</element-citation>
</ref>
<ref id="pone.0167830.data003">
<label>D3</label>
<element-citation publication-type="data" specific-use="analyzed">
<name>
<surname>Hoang</surname> 
<given-names>C</given-names>
</name>
<name>
<surname>Swift</surname> 
<given-names>GH</given-names>
</name> 
<name>
<surname>Azevedo-Pouly</surname> 
<given-names>A</given-names>
</name>
<name>
<surname>MacDonald</surname> 
<given-names>RJ</given-names>
</name>
<data-title>Effects on the transcriptome of adult mouse pancreas (principally acinar cells) by the inactivation of the Ptf1a gene in vivo</data-title>
<source>NCBI Gene Expression Omnibus</source>
<year>2015</year>
<pub-id pub-id-type="accession" assigning-authority="NCBI" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE70542">GSE70542</pub-id>
</element-citation>
</ref>
</ref-list>
</sec>
<ref-list>
<title>References</title>
...
</ref-list>
</back>
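
A downstream consumer could collect the data citations from a DAS marked up as in Example 4 roughly as follows. This is a sketch using Python's standard xml.etree.ElementTree module; the function name data_citations and the shortened SAMPLE fragment are illustrative assumptions, and the xlink namespace is declared on <back> only so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# Shortened fragment based on Example 4 (author names omitted).
SAMPLE = """<back xmlns:xlink="http://www.w3.org/1999/xlink">
<sec sec-type="data-availability">
<title>Data Availability</title>
<ref-list>
<ref id="pone.0167830.data001">
<element-citation publication-type="data" specific-use="generated">
<data-title>Sizing the Problem of Improving Discovery and Access to NIH-funded Data: A Preliminary Study (Datasets)</data-title>
<source>Figshare</source>
<pub-id pub-id-type="doi" xlink:href="https://doi.org/10.6084/m9.figshare.1285515">https://doi.org/10.6084/m9.figshare.1285515</pub-id>
</element-citation>
</ref>
<ref id="pone.0167830.data003">
<element-citation publication-type="data" specific-use="analyzed">
<data-title>Effects on the transcriptome of adult mouse pancreas (principally acinar cells) by the inactivation of the Ptf1a gene in vivo</data-title>
<source>NCBI Gene Expression Omnibus</source>
<pub-id pub-id-type="accession" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE70542">GSE70542</pub-id>
</element-citation>
</ref>
</ref-list>
</sec>
</back>"""

CITATION_TAGS = {"element-citation", "mixed-citation"}


def data_citations(back):
    """List the data citations found inside any data-availability <sec>."""
    found = []
    for sec in back.iter("sec"):
        if sec.get("sec-type") != "data-availability":
            continue
        # Walk every descendant so citations inside <ref-list> or <p> are both found.
        for el in sec.iter():
            if el.tag in CITATION_TAGS and el.get("publication-type") == "data":
                found.append({
                    "specific_use": el.get("specific-use"),
                    "title": el.findtext("data-title"),
                    "pub_id": el.findtext("pub-id"),
                })
    return found


for citation in data_citations(ET.fromstring(SAMPLE)):
    print(citation["specific_use"], citation["pub_id"])
```

Because the function walks all descendants of the DAS section, the same sketch would also pick up citations placed directly in <p> elements.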
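
Example 5: Data externally archived – full citations included as <element-citation> elements directly in the DAS

<back>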
. . .
<sec sec-type="data-availability">
<title>Data Availability</title>
<p>The data analysis file and all annotator data files are available in the Figshare repository:
<element-citation publication-type="data" specific-use="supporting">
<name>
<surname>Read</surname>
<given-names>K</given-names>
</name>
<data-title>Sizing the Problem of Improving Discovery and Access to NIH-funded Data: A Preliminary Study (Datasets)</data-title>
<source>Figshare</source>
<year>2015</year>
<pub-id pub-id-type="doi" assigning-authority="figshare" xlink:href=
"https://doi.org/10.6084/m9.figshare.1285515">https://doi.org/10.6084/m9.figshare.1285515</pub-id>
</element-citation></p>
<p>The measured and simulated Euler angles, and the simulation codes are available from the Dryad database:
<element-citation publication-type="data" specific-use="supporting">
<name>
<surname>Kok</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ay</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>L</given-names>
</name>
<data-title>Genome-wide errant targeting by Hairy</data-title>
<source>Dryad Digital Repository</source>
<year>2015</year>
<pub-id pub-id-type="doi" assigning-authority="dryad" xlink:href=
"https://doi.org/10.5061/dryad.cv323">https://doi.org/10.5061/dryad.cv323</pub-id>
</element-citation>
</p>
<p> Microarray data are deposited in the Gene Expression Omnibus under accession number GSE70542:
<element-citation publication-type="data" specific-use="supporting">
<name>
<surname>Hoang</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Swift</surname>
<given-names>GH</given-names>
</name>
<name>
<surname>Azevedo-Pouly</surname>
<given-names>A</given-names>
</name>
<name>
<surname>MacDonald</surname>
<given-names>RJ</given-names>
</name>
<data-title>Effects on the transcriptome of adult mouse pancreas (principally acinar cells) by the inactivation of the Ptf1a gene in vivo</data-title>
<source>NCBI Gene Expression Omnibus</source>
<year>2015</year>
<pub-id pub-id-type="accession" assigning-authority="NCBI" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE70542">GSE70542</pub-id>
</element-citation>
</p>
</sec>
<ref-list>
<title>References</title>
...
</ref-list>
</back>

Example 6: Data externally archived – full citations included in a separate data reference list

<back>
. . .
<sec sec-type="data-availability">
<title>Data Availability</title>
<p>The data analysis file and all annotator data files are available in the Figshare repository, <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.1285515">https://doi.org/10.6084/m9.figshare.1285515</ext-link> [<xref ref-type="bibr" rid="pone.0167292.data001">D1</xref>]. The measured and simulated Euler angles, and the simulation codes are available from the Dryad database, <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5061/dryad.cv323">https://doi.org/10.5061/dryad.cv323</ext-link> [<xref ref-type="bibr" rid="pone.0167292.data002">D2</xref>]. Microarray data are deposited in the Gene Expression Omnibus under accession number <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE70542">GSE70542</ext-link> [<xref ref-type="bibr" rid="pone.0167292.data003">D3</xref>].</p>
</sec>
<ref-list>
<title>References</title>
...
</ref-list>
<ref-list>
<title>Data References</title>
[Insert references here]
</ref-list>
</back>

Example 7: Data cannot be made publicly available

<back>
. . .
<sec sec-type="data-availability">
<title>Data Availability</title>
<p>Ethical restrictions according to the Japanese Ethical Guidelines for Human Genome/Gene Analysis Research (http://www.lifescience.mext.go.jp/files/pdf/n796_00.pdf, page 33) prevent public sharing of individual genotype data. All summarized data are available upon request. Data requests may be sent to the UMIN IRB (irb@xxxxxxxx.jp).</p>
</sec>
<ref-list>
<title>References</title>
...
</ref-list>
</back>

Example 8: No data was generated (This seems unlikely, but it’s possible, e.g. for theoretical works)

<back>
. . .
<sec sec-type="data-availability">
<title>Data Availability</title>
<p> During the course of this research no data was analysed, reused or generated.</p>
</sec>
<ref-list>
<title>References</title>
...
</ref-list>
</back>
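
The validator rules given for items 1 to 3 of the Recommendation could be checked against fragments like the examples above roughly as follows. This is a minimal sketch, not the JATS4R validator itself: the function name check_das and the error messages are hypothetical, and treating any <back> ancestor (rather than only a direct parent) as "within <back>" is an assumption.

```python
import xml.etree.ElementTree as ET

# @sec-type values that the recommendation's validator rule flags as errors.
REJECTED_SEC_TYPES = {
    "Data availability", "Data Availability", "Data-Availability",
    "data availability statement", "Data availability statement",
    "Data Availability Statement", "data-availability-statement",
    "Data-availability-statement", "Data-Availability-Statement",
    "Data_Availability", "data_availability-statement",
    "Data_availability-statement", "Data_Availability-Statement",
    "data_availability", "Data Accessibility", "Data accessibility",
}


def check_das(article_xml):
    """Return a list of error messages for the DAS rules in items 1 to 3."""
    root = ET.fromstring(article_xml)
    errors = []
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    for sec in root.iter("sec"):
        sec_type = sec.get("sec-type")
        if sec_type in REJECTED_SEC_TYPES:
            errors.append("non-recommended sec-type value: %r" % sec_type)
        if sec_type != "data-availability":
            continue
        # Item 1: the DAS <sec> must sit inside <back> (any ancestor; an assumption).
        node, inside_back = sec, False
        while node in parent:
            node = parent[node]
            if node.tag == "back":
                inside_back = True
                break
        if not inside_back:
            errors.append("data-availability <sec> found outside <back>")
        # Item 3: a <title> child whose text is exactly "Data Availability".
        title = sec.find("title")
        if title is None or "".join(title.itertext()).strip() != "Data Availability":
            errors.append("<title> missing or not 'Data Availability'")
    return errors


example = (
    '<article><back><sec sec-type="data-availability">'
    "<title>Data Availability</title>"
    "<p>All data are provided within the paper and its supporting information.</p>"
    "</sec></back></article>"
)
print(check_das(example))
```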

History

Working: 12 January, 2017–20 April, 2017
Committee review: NA
Pending decision: 21 April 2017. The group proposed to the JATS Standing Committee that a new element be created: <data-availability>, to be contained within <back>, with the same content model as <sec>. This request was denied in November 2017.
Working: November 2017–April 2018
Public review: 30 April, 2018–31 May, 2018
Committee review: Nov 2018–Dec 2018
Published: 28 December, 2018

Updated on September 22, 2020
