Image: FORCE11 2018 Montreal Conference, group photo
I attended the FORCE11 annual conference, an event with very broad coverage of scholarly communications, with a mission in mind: to see which decentralized web (DWeb) research projects had matured enough to be reusable in the working context of scholars. Most DWeb systems and services are in an alpha phase, an early R&D state where things are still experimental and not meant for large-scale professional use. In an alpha phase the objective is to carry out R&D that tests a set of assumptions and improves the system so that it can move on to become a beta system, and then a full release. My very real concern is that almost all of the DWeb systems being proposed don’t know enough about how scholars and academia work, and instead use very thin models of scholarly workflows. That in turn greatly reduces the chances of adoption, of moving through the development phases, or of solving the big problems in science communications.
The conference took place in early October, with every one of the three hundred and fifty available places booked out. One of the conference organizers pointed this out at the start of the proceedings, telling everyone to take a seat quickly as there might not be any left soon, for real. The conference and the other FORCE11 offerings, like the online working groups and the summer school, are valuable meeting points and peer-learning resources for gauging the state of play in a wide variety of Open Science areas. The conference is made up of a good portion of librarians and LIS people, senior industry people who might be product owners, innovation managers, or VPs, and then the rest of us. This ‘rest of us’ is made up of researchers who mainly seem to follow a common academic career arc: having left our initial scholarly specialization, we have been drawn down the rabbit-hole of Open Science R&D by the usual question and corresponding response. Why is everything so broken? Let’s just fix it.
FORCE11 as a network started with a conference named “Beyond the PDF” at the University of California San Diego (USA) in 2011, which was followed by a workshop at Schloss Dagstuhl – Leibniz Center for Informatics (DE) in 2011 called “Future of Research Communications (FoRC)”. The original name “Beyond the PDF” still holds true as a philosophy for FORCE11. To quote the conference webpage from 2011: “Beyond the PDF was meant to capture a common philosophy, not necessarily to be taken literally” (Beyond the PDF 1 2011). But for the DWeb sector it is also a literal and core concern. To go beyond the PDF, digital objects need to be atomized to create new, granular digital research artifacts. This means capturing a much wider and more heterogeneous array of digital objects, refactoring metrics and measurement, and starting to address the parity of other knowledge outputs, like software, with the research paper (AKA the PDF). I point to this FORCE11 history and philosophy because these are exactly the questions about the future of science communications, described then and now, that the DWeb looks to address: how to describe, verify, validate, and access research outputs as they increase in volume and complexity.
At this year’s conference in Montreal there were three occasions to sample DWeb research. The first was a two-hour workshop at Concordia University Library titled “Blockchain in Scholarly Communication”. The second was a presentation on the use of DAT (Robinson 2018) at the California Digital Library for distributing multiple copies of data for secure storage. The third was a presentation by Sarven Capadisli on the Dokieli software (Capadisli 2018) and Linked Data Notifications. The three instances of DWeb came with different technologies and are looking to address a wide array of research questions. My central concern across this cross section of DWeb research was how digital objects and things can be identified machinically. For example: how can I validate my authorship, an identity as a person or organization, or that a file is a copy of the validated original? The reason for asking about validated machinic identity that is 100%, or close to 100%, reliable is that this is what is needed if you want to sustainably publish or use open data, or other open content, in a decentralized way.
In the “Blockchain in Scholarly Communication” workshop, moderated by Michael Conlon, VIVO Project Director, three presentations were given.
Karmen Condic-Jurkic, who founded MDbox, a research project in the field of computational chemistry, talked about the complex implications of the use of blockchain technology and the accompanying incentive-based economics. To a large part this complexity arises specifically in blockchain, as opposed to DAT or LDN, because ideas about cybernetic systems (systems of feedback loops) are being used by proponents of blockchain to encode social relations and apply, or impose, behavioral economics. It is worth noting that the socioeconomic dimension of blockchain is not intrinsic to the technology but is instead a social and political milieu of ideas applied to it.
Kevin McCurry of the startup ARTiFACTS gave a show and tell of their expanded publishing and search platform, which looks to fully open up the depth of field of what is captured in the research publishing cycle. ARTiFACTS shows the greatest insight into scholarly workflows and conventions, something sorely in deficit in nearly all other startups in the blockchain for science space. Since the ARTiFACTS founders are all ex-publishing executives, this intimate knowledge of scholarly communications makes perfect sense. It is telling, but not surprising, that they had to leave the corporate publishing world to be able to innovate. It is worth pointing out that corporate publishing, which includes all of publishing outside of academia, is the largest creative industry globally and is extremely profitable (European Commission 2012). Hence there is no incentive to innovate, and actually quite the opposite, as for the profitable corporate publisher stasis is the modus operandi. For example, it’s no accident that the publishing industry has not been the one to bring out the leading eReader devices, and instead it’s been down to the likes of Amazon to drive sub-standard reading experiences like the Kindle.
Michael Conlon spoke on “Scholarly Wallet”, a blockchain-based research project looking at how to decentralize scholars’ records of works and contributions (AKA reputation) by recording these using blockchain technology, and then allowing research information management systems like VIVO to access, use, and aggregate these records following the ‘distributed scholarly compound objects’ (DISCO) (Hanson et al. 2015) model of data use.
Scholarly Wallet presentation. Garcia, Alexander, and Michael Conlon. ‘Blockchain for OpenScience’, 11 October 2018. https://doi.org/10.6084/m9.figshare.7193228.v1.
Danielle Robinson of Code for Science and Society (CS&S) gave a talk called “A decentralized scholarly commons” on a research project using DAT technology to address the issue of data sustainability by keeping multiple copies of data with collaborating institutions, while working within the constraints of existing infrastructures. The participating institutions in this ongoing research (Dat Blog 2018) are the California Digital Library, the San Diego Supercomputer Center, and the Internet Archive. CS&S have developed the DAT protocol, a cryptographic versioning system using a Merkle tree (Merkle 1987) like model, which in effect gives you persistent identifiers (PIDs) and a change log. It is exactly this type of ‘research in the open field’ that all of the different DWeb technologies need to emulate: to test assumptions, to give feedback that improves systems, and importantly to give others the ability to validate the claims made by advocates of specific systems, so that they can adopt the software for use or carry out their own research.
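To make the link between a Merkle tree like model, identifiers, and a change log more concrete, here is a minimal sketch in Python. It is an illustration only and not the DAT protocol or its data format: the choice of SHA-256, the leaf and node prefixes, the sample records, and the names `merkle_root` and `versions` are all assumptions made for this example.

```python
import hashlib
from typing import List, Sequence


def _h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption chosen for this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root(chunks: Sequence[bytes]) -> str:
    """Return a hex digest that identifies the exact content of `chunks`."""
    if not chunks:
        raise ValueError("need at least one chunk")
    # Leaves and inner nodes get different prefixes so they cannot be confused.
    level: List[bytes] = [_h(b"\x00" + chunk) for chunk in chunks]
    while len(level) > 1:
        next_level: List[bytes] = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                next_level.append(_h(b"\x01" + pair[0] + pair[1]))
            else:
                # An unpaired node is carried up to the next level unchanged.
                next_level.append(pair[0])
        level = next_level
    return level[0].hex()


# Hypothetical dataset, published twice: the second version adds one record.
data_v1 = [b"sample-001,4.2\n", b"sample-002,3.9\n"]
data_v2 = data_v1 + [b"sample-003,4.7\n"]

# The ordered list of roots acts as a simple change log.
versions = [merkle_root(data_v1), merkle_root(data_v2)]

# Anyone holding a copy can recompute the root and compare it to the
# published one to check that the copy matches that version.
print("copy matches v1:", merkle_root(list(data_v1)) == versions[0])
print("v1 and v2 differ:", versions[0] != versions[1])
```

Because the root is derived only from the content, the same bytes always produce the same identifier wherever the copy is held, which is the property the multi-institution storage scenario relies on. A real system would also need signatures and a way to discover peers, which this sketch leaves out.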
Finally, in the FORCE11 coverage of DWeb research was the presentation by Sarven Capadisli of his software project Dokieli. The session was titled “Social Scholarly Web” and profiled the open research Sarven has been doing on Dokieli since 2015 using Linked Data Notifications, a W3C Recommendation of the Social Web Working Group. The Dokieli research and software specifically looks to stay within the bounds of open standards that are backed by the W3C, to ensure that the technologies being used are sustainable, can work with the Web, and ideally can be supported by all types of vendors and service providers. This is unlike DAT and blockchain, which are not tested in this way to work with the Web and so face the issue of gaining trust by other means. GenR has covered Sarven’s work and Dokieli in detail in an interview from the same FORCE11 conference, which traces the history of this open research and its relationship to the Solid DWeb project led by Tim Berners-Lee.
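As a rough picture of what a Linked Data Notification looks like on the wire, the sketch below builds a notification and the HTTP request a sender would POST to a receiver’s inbox. The LDN specification asks senders to POST a JSON-LD body with the `application/ld+json` content type. Everything else here is an assumption for illustration: the use of the ActivityStreams `Announce` type, the `example.org` URLs, and the function names. It is not Dokieli’s code, and nothing is sent over the network.

```python
import json
from typing import Any, Dict


def build_notification(actor: str, obj: str, target: str) -> Dict[str, Any]:
    """Build a JSON-LD notification (the vocabulary choice is an assumption)."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": actor,    # who is sending the notification
        "object": obj,     # the thing being announced, e.g. a review
        "target": target,  # the resource the notification is about
    }


def build_request(inbox_url: str, notification: Dict[str, Any]) -> Dict[str, Any]:
    """Describe the POST request a sender would make to the inbox."""
    return {
        "method": "POST",
        "url": inbox_url,
        "headers": {"Content-Type": "application/ld+json"},
        "body": json.dumps(notification, indent=2).encode("utf-8"),
    }


# Hypothetical URLs: a reviewer announces a review of an article.
notification = build_notification(
    actor="https://example.org/people/reviewer#me",
    obj="https://example.org/reviews/1",
    target="https://example.org/articles/dweb-report",
)
request = build_request("https://example.org/inbox/", notification)

print(request["method"], request["url"])
print(request["body"].decode("utf-8"))
```

In practice the sender first discovers the inbox URL from the target resource, and the resulting request can be handed to any HTTP client. Since the notification is plain Linked Data over HTTP, it stays within the open Web standards that the Dokieli work emphasizes.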
The conference provided useful examples of good practice in terms of what I think is needed in DWeb development, with Code for Science and Society and Dokieli both carrying out their research and software development in the field and out in the open. And as a very open network, the FORCE11 working groups, with their regular online meetings, published papers, and networks of contacts, provide an extremely valuable resource for collaborating with scientists and learning more about how scholarly communication works. Did I find reusable DWeb systems? Close, but not quite yet. As stated, most DWeb systems are in an alpha state, so they are not intended for wider use. That said, if I put myself in the shoes of a librarian, for example, who might be willing to make use of something that is still in development, what I really must have to enable this type of adoption is more examples of R&D outputs and open research. I would need to see: more well-documented experiments in the field; case studies written up; benchmarking; evidence of design research methods having been applied; UX/UI studies; and installation documentation that works.
Citation format: The Chicago Manual of Style, 17th Edition
Worthington, Simon. “Research or Perish! The Decentralized Web and Open Research. A Report from the FORCE11 2018 Montreal Conference,” 2018. https://doi.org/10.25815/7GPE-G631.
‘FORCE2018! Conference’. FORCE11, 10 February 2018. https://www.force11.org/meetings/force2018.
‘FORCE11’. Accessed 21 October 2018. https://www.force11.org/.
‘FORCE11 Active Groups’. FORCE11. Accessed 9 November 2018. https://www.force11.org/groups.
‘FORCE11 Scholarly Communication Institute (FSCI)’. FORCE11, 10 November 2017. https://www.force11.org/fsci/2018.
Admin, FORCE11. ‘Guiding Principles for Findable, Accessible, Interoperable and Re-Usable Data Publishing Version B1.0’. FORCE11, 10 September 2014. https://www.force11.org/fairprinciples.
Admin, FORCE11. ‘Beyond the PDF’. FORCE11, 18 May 2015. https://www.force11.org/meetings/beyond-pdf-1.
Wadern, Schloss Dagstuhl-Leibniz-Zentrum für Informatik GmbH, 66687. ‘Schloss Dagstuhl : About Dagstuhl’. Accessed 9 November 2018. https://www.dagstuhl.de/en/.
Admin, FORCE11. ‘FORC Workshop 2011 Dagstuhl’. FORCE11, 29 September 2015. https://www.force11.org/meetings/forc-workshop-2011-dagstuhl.
‘FORCE2018: Blockchain in Scholarly Communication’. Accessed 9 November 2018. https://force2018.sched.com/event/FYrb/blockchain-in-scholarly-communication.
Robinson, Danielle. ‘A Decentralized Scholarly Commons’. 11 October 2018. https://doi.org/10.5281/zenodo.1458149.
Capadisli, Sarven. ‘Social Scholarly Web’, 12 October 2018. http://csarven.ca/presentations/social-scholarly-web/.
Troncoso, Carmela, Marios Isaakidis, George Danezis, and Harry Halpin. ‘Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments’. ArXiv:1704.08065 [Cs], 26 April 2017. https://doi.org/10/gfc88f.
‘Conlon, Michael’. Accessed 9 November 2018. http://openvivo.org/display/orcid0000-0002-1304-8447.
‘MDbox Demo’. Accessed 9 November 2018. http://www.mdbox.org/.
‘ARTiFACTS – A Blockchain Platform for Scientific & Academic Research’. ARTiFACTS. Accessed 28 September 2018. https://artifacts.ai/.
European Commission. ‘Statistical, Ecosystems and Competitiveness Analysis of the Media and Content Industries: The Publishing Industry’, 2012. http://is.jrc.ec.europa.eu/pages/ISG/documents/BookReportwithcovers.pdf.
Garcia, Alexander, and Michael Conlon. ‘Blockchain for OpenScience’, 11 October 2018. https://doi.org/10.6084/m9.figshare.7193228.v1.
‘VIVO – Connect. Share. Discover’. Duraspace.org. Accessed 9 November 2018. https://duraspace.org/vivo/.
Hanson, K.L., T. DiLauro, and M. Donoghue. ‘The RMap Project: Capturing and Preserving Associations amongst Multi-Part Distributed Publications’. In Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, 281–282. ACM, 2015. https://doi.org/10.1145/2756406.2756952.
‘Code for Science and Society’. Accessed 20 March 2018. https://codeforscience.org/.
‘Data Sharing between Institutions’. Dat Project Blog, 24 April 2018. https://blog.datproject.org/2018/04/24/data-sharing-at-institutions-and-beyond-with-dat/.
Merkle, Ralph C. ‘A Digital Signature Based on a Conventional Encryption Function’. In Advances in Cryptology – CRYPTO ’87, A Conference on the Theory and Applications of Cryptographic Techniques, Santa Barbara, California, USA, August 16-20, 1987, Proceedings, edited by Carl Pomerance, 369–378. Lecture Notes in Computer Science 293. Springer, 1987. https://doi.org/10.1007/3-540-48184-2_32.
‘Linked Data Notifications’. Accessed 25 October 2018. https://www.w3.org/TR/ldn/.
‘Socialwg – W3C Wiki’. Accessed 25 October 2018. https://www.w3.org/Social/WG.
Worthington, Simon. “An Interview with Sarven Capadisli, Dokieli-Developer, on Autonomous Linked Research” 2018. https://doi.org/10.25815/9VBE-ZF15.
‘Solid’. Accessed 22 October 2018. https://solid.mit.edu/.