Citations - To Require or Not to Require

Hey all, we’ve had an interesting discussion about what we require in the review process. I’d like to spark a conversation about citations.

See initial discussion:

I think we all agree that having a citable piece of software is important, but there are some very valid concerns about requiring it as part of the pyOpenSci review process. Please see the link above for more. Does anyone have thoughts on how to handle citations in the review process? I believe that rOpenSci requires it and often uses Zenodo. @noamross, is that correct? @ocefpaf had some very good points, however, about the problems that requiring it would cause for some people. Do we want to strongly suggest that a package have a citation associated with it, but not enforce it? That would be similar to how we might handle code coverage, which is a totally different thread that I will post about in the future.

Pinging @choldgraf @npch @leouieda, and really everyone, on this!

Thanks for opening this discussion @lwasser. I’m 100% in favor of citations and their value is unquestionable. However, I believe we should recommend and not require them b/c there are some corner cases where a software citation may become a dispute between a Research Software Engineer (RSE), who wrote the software, and a scientist who wrote the science behind the software. When they are “disconnected” the dispute is easy to solve b/c one can just cite the other. Unfortunately, there are cases where the RSE works for the scientist in question and the scientist does not want the software to “shadow” their paper by becoming “citable.”

While this may be rare (I sure hope so), it happened to me and I had to remove a DOI from a package I created some time ago. Therefore, requiring software in PyOpenSci to have an official citation may block some projects. (Also, most scientific packages are not citable at the moment; we have to use URLs/repositories and releases/commits to cite them.)


I fully agree with your thoughts @ocefpaf! I know one thing that ROS does is that they have stipulations for things like code coverage that allow for edge cases that might otherwise block someone from participating. Perhaps we could consider implementing something like that: language that suggests citations but allows for flexibility. Wouldn’t it be great if we could in some way help people create tools that are more easily citable? (Some might not even know how to go about setting citations up!)

ROpenSci doesn’t currently require Zenodo, but JOSS does and we probably will soon, too. We currently require a codemeta.json file (https://codemeta.github.io/), and we’re moving towards being more stringent about other citation requirements (e.g. the R-package-specific CITATION file), too. We just want to put together the automated tooling to do all these things at once rather than impose a bunch of new steps on authors.
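For anyone who hasn’t seen one, here is a rough sketch of what a minimal codemeta.json could contain, built from Python so it could be scripted. The package name, version, author, and repository URL are placeholders rather than a real project, and `build_codemeta` is a hypothetical helper, not part of any tool. The keys come from the CodeMeta 2.0 vocabulary; a real file would usually carry more of them (license, description, and so on).

```python
import json


def build_codemeta(name, version, repo_url, given_name, family_name):
    """Return a minimal CodeMeta 2.0 record as a dict (illustrative only)."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "codeRepository": repo_url,
        "author": [
            {
                "@type": "Person",
                "givenName": given_name,
                "familyName": family_name,
            }
        ],
    }


# Placeholder values, assumed purely for illustration.
record = build_codemeta(
    "examplepkg",
    "0.1.0",
    "https://example.org/examplepkg",
    "Ada",
    "Example",
)

# Serialize to the text that would be saved as codemeta.json.
codemeta_text = json.dumps(record, indent=2)
print(codemeta_text)
```

Writing `codemeta_text` to a file named codemeta.json at the repository root is the usual convention.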

At RO we tend to think of proper citation practices as being paramount, and that we should act as a force to motivate such practices and not let them slide because of such co-author concerns. That said, most frameworks enable you to link the software citation with the relevant publication to deal with these issues. For instance, Zenodo lets you specify that software isSupplementTo another publication (See here), and you can put the preferred citation in a CITATION file.
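As a sketch of how that link might be expressed, the snippet below builds the kind of metadata Zenodo reads from a `.zenodo.json` file in a repository, with the software marked as a supplement to a paper. The title and the DOI are made-up placeholders, and `build_zenodo_metadata` is a hypothetical helper; only the `upload_type`, `related_identifiers`, `identifier`, and `relation` keys are taken from Zenodo’s deposit metadata format.

```python
import json


def build_zenodo_metadata(title, paper_doi):
    """Return minimal Zenodo deposit metadata linking software to a paper."""
    return {
        "title": title,
        "upload_type": "software",
        "related_identifiers": [
            # Declares that this software record supplements the paper.
            {"identifier": paper_doi, "relation": "isSupplementTo"}
        ],
    }


# Placeholder title and DOI, assumed purely for illustration.
metadata = build_zenodo_metadata("examplepkg", "10.1234/example.paper")

# Text that would be saved as .zenodo.json in the repository root.
zenodo_text = json.dumps(metadata, indent=2)
print(zenodo_text)
```

With this in place, the software record points at the paper, so the scientist’s publication stays visible next to the software DOI instead of being overshadowed by it.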


Late to the party!

I think that this should be a “should” for now, moving towards a “must” at some point in the future once the practice of getting a citable persistent identifier becomes easier and more uniform.

A follow-up question is what having a citable piece of software actually means. It could be any of the following:

  1. The software has a citation of any sort whatsoever, following the software citation principles

  2. The software is citable using a globally unique persistent identifier, but the software itself may not be in a preservation repository. Current example might be a GitHub PURL.

  3. The software is citable using a globally unique persistent identifier, and is in a preservation repository, but the PID might not be a DOI. Current example might be software stored in a repository that uses ARKs, or the identifiers used by Software Heritage.

  4. The software is citable using a DOI, and is in a preservation repository. Current example might be Zenodo.