#GenR Software Citation Round-up

A concluding summary of headline issues from the Open Science theme of ‘software citation’. Software citation is an important building block in the future of Open Science and has run as Generation R’s launch editorial theme. As with all of Gen R’s editorial topics, the issue will be revisited as major developments occur.

What is intrinsically important about software citation?

The main reason appears to be that, until recently, no one had indexed or cataloged research software. If nobody had cataloged publications for the last fifty years, it would be unimaginable. Yet for software this has been the case: for the last half-century there has been virtually no widespread, systematic indexing of software or of its citation in literature. There are exceptions; the Astrophysics Source Code Library, started in 1999, is one worth mentioning (ASCL >1999).

Open Science is changing this practice to one where software citation is actively encouraged. More generally, Open Science is moving into a phase where the open research infrastructures that underpin the free flow of knowledge are being designed to link the many types of ‘digital object’ in the research lifecycle.

Benefits: Replication crisis & improving research software

The ability to cite research software, and to have a record of its citation in literature that can be indexed, is vital to improving the quality of research, especially in light of the ongoing ‘replication crisis’, and just as importantly to improving ‘research software’ itself.

Software citation has received concentrated research and development attention for roughly the last six years. As a result, the practices of software citation and software archiving, and their integration into journal publishing systems, are moving at a healthy pace.

Before and after: Software citation support

Before | After
No consistent guidance on required metadata fields | Minimum guidelines for software citation now exist: Citation File Format (CFF), CodeMeta, etc.
No journal guidelines | Journal guidelines: e.g., Nature, American Astronomical Society (AAS) (‘AAS – American Astronomical Society’ n.d.)
How citations appeared in journals: low quality (Howison and Bullard 2016) | Potentially improving
Free-to-use archive repositories on an institution-by-institution basis | Zenodo, based at CERN, now offers a free-to-use DOI service
Long-term preservation archives: institution specific | Software Heritage offers a global service for source code and, as of September 2018, a direct submission service (Software Heritage n.d.)
Versioning systems offering cryptographic IDs: yes, but not mainstream | GitLab (FOSS) and GitHub are now easy to use
Learning resources: n/a | Library Carpentry, Open Science MOOC
Policy: n/a | Recommendations on the development, use and provision of Research Software, Alliance of German Science Organizations (Katerbow and Feulner 2018)
Working groups and literature: n/a | Many, see the Gen R Zotero group

Table 1: An indicative list of necessary parts needed to have a functioning research software citation ecology

Moving to mandatory code submissions for journals

Although the practice of software citation has only recently become the ‘new normal’, journals such as Nature (‘Nature Software Submission Guidelines’ 2018) and AAS have for some time been issuing guidance on how to submit code and software and how to cite software in article submissions. Currently the submission of code and software is voluntary, but what is now a recommendation will likely become mandatory within a couple of years.

Credit and roles

Software differs from other media types in that it changes over time, with a lifecycle in use that ranges from half a year when something is experimental to decades when it is in widespread use. Not only is its longevity a defining feature, but there are also many contributors and contributor roles to consider. It is the mutable nature of software that has seen concepts of:

Who needs to cite software and how to cite software

It’s important to know that it’s not only ‘coders’, those who make software or modify a version release, who should be citing code. The main group who should be citing research software is researchers who are writing papers. The rule of thumb is to cite software as you would any other source or literature, and to include it in the main list of citations.

Who needs to cite software
The software maker
Software contributor – code, documentation or in some other role
A contributor who has made a minor contribution
In software reuse
A person making a data set
An author listing a source in literature

Table 2: Who should be citing software?

How to cite software depends on your discipline, your existing workflow, or contributor guidelines, for example those of a journal. Software citation is also in its early days, so guidelines are still being worked out, and recommendations are quite often for a minimum standard, which you may well want to exceed.
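As a rough illustration of citing software like any other source, the sketch below assembles a one-line reference from a handful of fields. The function name, the choice and order of fields, and all example values are assumptions made for illustration; this is not an official style, and a journal or discipline may ask for something different.

```python
def format_software_citation(authors, title, version, year, identifier):
    """Return a one-line reference for a piece of software.

    The field choice and ordering are illustrative, not an official style.
    """
    # Authors are given as "Family, Given" strings, so join with semicolons.
    author_text = "; ".join(authors)
    return f"{author_text}. {title} (version {version}). {year}. {identifier}."


# Hypothetical example values, not a real software record.
reference = format_software_citation(
    authors=["Doe, Jane", "Roe, Richard"],
    title="ExampleTool",
    version="1.2.0",
    year=2018,
    identifier="https://example.org/exampletool",
)
print(reference)
```

In practice a citation manager such as Zotero would usually do this formatting; the point is only that a software reference carries the same kinds of fields as a literature reference, plus a version.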

Where to find citation information for a piece of software | Notes
CodeMeta file | Describes the metadata associated with a software object using JSON’s linked data (JSON-LD) notation (CiteAs)
CITATION file – Citation File Format |
README BibTeX |
README DOI |
Use a DOI from Zenodo, or other provider |
Use an identifier from Software Heritage |
R DESCRIPTION |
Create an entry in a citation manager like Zotero | Zotero has a software category and will generate many common citation styles
Cite a software paper | Programmers can write a research paper to describe a software concept
Cite a specific line number of code from a repository |
Cite the hash of a specific commit |
Cite a specific GitHub or GitLab repository | This can be the URL of the repository

Table 3: Where to find citation information for a specific piece of software (NB: some of the above are debatable) (‘CiteAs: Credit for All the Things!’ n.d.)
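To make the CodeMeta entry in the table more concrete, here is a minimal sketch of reading citation fields out of a CodeMeta record with Python's standard `json` module. The record itself is hypothetical (every value is a placeholder), and the helper name `citation_fields` is an assumption for illustration; real `codemeta.json` files often carry many more properties.

```python
import json

# Hypothetical codemeta.json content; every value is a placeholder.
CODEMETA_TEXT = """
{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "name": "ExampleTool",
  "version": "1.2.0",
  "identifier": "https://example.org/exampletool",
  "codeRepository": "https://example.org/exampletool.git",
  "author": [
    {"@type": "Person", "givenName": "Jane", "familyName": "Doe"}
  ]
}
"""


def citation_fields(codemeta_text):
    """Pull the fields a reference needs out of a CodeMeta record."""
    record = json.loads(codemeta_text)
    authors = record.get("author", [])
    if isinstance(authors, dict):  # a single author may appear as one object
        authors = [authors]
    return {
        "name": record.get("name"),
        "version": record.get("version"),
        "identifier": record.get("identifier"),
        "authors": [
            f"{a.get('familyName', '')}, {a.get('givenName', '')}"
            for a in authors
        ],
    }


fields = citation_fields(CODEMETA_TEXT)
print(fields)
```

The sketch treats the file as plain JSON and does not resolve the JSON-LD context, which is enough for pulling out a name, version, identifier and author list.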

For the coders

For those making software, it is worth seeing the Open Science MOOC ‘Module 5: Open Research Software and Open Source’, as this covers all the steps necessary to make your code citable (Open Science MOOC 2018).
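One small step towards citable code is shipping a CITATION file in the Citation File Format mentioned above. The sketch below writes out a minimal one as text. The helper name `render_cff`, the `cff-version` value, and all example values are assumptions for illustration; the result has not been checked against the CFF schema, so treat it as a starting point rather than a validated file.

```python
def render_cff(title, version, date_released, authors, doi=None):
    """Render a minimal CITATION.cff as text.

    authors is a list of (family_name, given_name) tuples.
    Quoting is naive: values containing double quotes are not escaped.
    """
    lines = [
        "cff-version: 1.2.0",  # assumed format version
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        f"date-released: {date_released}",
        "authors:",
    ]
    for family, given in authors:
        lines.append(f'  - family-names: "{family}"')
        lines.append(f'    given-names: "{given}"')
    if doi:
        lines.append(f'doi: "{doi}"')
    return "\n".join(lines) + "\n"


# Hypothetical project details, not a real release.
cff_text = render_cff(
    title="ExampleTool",
    version="1.2.0",
    date_released="2018-07-31",
    authors=[("Doe", "Jane")],
)
print(cff_text)
```

Saved as `CITATION.cff` in the root of a repository, a file like this gives readers one obvious place to look for the preferred citation.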

A note on Software Papers

Although citing the ‘source code’ and associated executables is going to be of more use for ensuring replication in general-purpose research, the ‘software paper’ deserves a mention for those involved in making and designing software and systems, as it is of value for learning more about the objectives, motivations, and influences of a project. The software paper is a place where the developers of software can describe their project, and it is used to make software citable in conventional literature citation workflows.

Staying up-to-date with software citation

From studying current activity in the software citation ‘space’, a small map has been produced showing groups, organizations, initiatives, etc.

Software citation: A map


Figure: The map is produced using the FOSS vector graphics software LibreOffice Draw, only because the author is not that satisfied with the results from dedicated diagram software. Sources are here and can be edited and updated. Suggestions for more visually informative authoring packages are always welcome. View a PDF here. Credit: Simon Worthington, 2018, CC BY 4.0, DOI: 10.25815/z9ra-6k38

Software citation literature

A software citation sources library on Zotero (65 items + counting) https://www.zotero.org/groups/1838445/generation_r/items/collectionKey/MA9C4EEC

Some useful literature (a selection)

Overview

Smith, Arfon M., Daniel S. Katz, and Kyle E. Niemeyer. ‘Software Citation Principles’. PeerJ Computer Science 2 (19 September 2016): e86. https://doi.org/10/bw3g.

Credit and attribution

Katz, Daniel S., and Arfon M. Smith. ‘Transitive Credit and JSON-LD’. Journal of Open Research Software 3, no. 1 (5 November 2015). https://doi.org/10/gc7q6b.

‘CRediT Home’. n.d. Accessed 1 August 2018. https://casrai.org/credit/.

Habits of citation

Howison, James, and Julia Bullard. ‘Software in the Scientific Literature: Problems with Seeing, Finding, and Using Software Mentioned in the Biology Literature’. Journal of the Association for Information Science and Technology 67, no. 9 (1 September 2016): 2137–55. https://doi.org/10/f87339.

Journal submission guidelines

‘Nature Software Submission Guidelines’, 2018. https://s3-service-broker-live-19ea8b98-4d41-4cb4-be4c-d68f4963b7dd.s3.amazonaws.com/documents/GuidelinesCodePublication.pdf.

‘AAS – American Astronomical Society’. Accessed 1 August 2018. http://journals.aas.org/policy/software.html.

Learning resources: How to make your code citable

‘Module-5-Open-Research-Software-and-Open-Source: Module 5: Open Research Software and Open Source’. 2017. Reprint, Open Science MOOC, 31 July 2018. https://github.com/OpenScienceMOOC/Module-5-Open-Research-Software-and-Open-Source.

References

ASCL. ‘Browsing Codes’. Astrophysics Source Code Library, >1999. https://ascl.net/code/all.

‘AAS – American Astronomical Society’. Accessed 1 August 2018. http://journals.aas.org/policy/software.html.

Howison, James, and Julia Bullard. ‘Software in the Scientific Literature: Problems with Seeing, Finding, and Using Software Mentioned in the Biology Literature’. Journal of the Association for Information Science and Technology 67, no. 9 (1 September 2016): 2137–55. https://doi.org/10/f87339.

Software Heritage. ‘Software Heritage – Mission’. Accessed 1 August 2018. https://www.softwareheritage.org/mission/.

Katerbow, Matthias, and Georg Feulner. ‘Recommendations on the Development, Use and Provision of Research Software’, 16 March 2018. https://doi.org/10.5281/zenodo.1172988.

‘Nature Software Submission Guidelines’, 2018. https://s3-service-broker-live-19ea8b98-4d41-4cb4-be4c-d68f4963b7dd.s3.amazonaws.com/documents/GuidelinesCodePublication.pdf.

Katz, Daniel S., and Arfon M. Smith. ‘Transitive Credit and JSON-LD’. Journal of Open Research Software 3, no. 1 (5 November 2015). https://doi.org/10/gc7q6b.

‘CRediT Home’. n.d. Accessed 1 August 2018. https://casrai.org/credit/.

‘CiteAs: Credit for All the Things!’ Accessed 1 August 2018. http://citeas.org/sources.

‘Module-5-Open-Research-Software-and-Open-Source: Module 5: Open Research Software and Open Source’. 2017. Reprint, Open Science MOOC, 31 July 2018. https://github.com/OpenScienceMOOC/Module-5-Open-Research-Software-and-Open-Source.

DOI: 10.25815/GF21-MT06

Citation format: The Chicago Manual of Style, 17th Edition

Worthington, Simon. ‘#GenR Software Citation Round-Up’, 2018. https://doi.org/10.25815/GF21-MT06.


Posted by Simon Worthington

Editor in chief Generation R. Book Liberationist. Working at the Open Science Lab, TIB – German National Library of Science and Technology, Hannover.
