Maria Levchenko is a community manager for Europe PMC – a global database for life science and biomedical research literature, a partner in PubMed Central International, and the designated repository for the open access publication mandates of 27 life science funders.
Marta Hoffman-Sommer: Europe PubMed Central (Europe PMC) is a repository for publications from the life sciences domain. Could you briefly explain the main goals of Europe PMC? How does it differ from PubMed and PMC? Are there any specific advantages for a European-based researcher in using the Europe PMC search site?
Maria Levchenko: The mission of Europe PMC is to build open access, full-text scientific literature resources and deliver world-class literature services. We believe that novel tools and applications based on text and data mining will enrich the content that we host and improve the way we interact with the scientific literature. Therefore, we develop Europe PMC as an open innovation platform, enabling contributors, such as text miners and developers, to showcase the outputs of their work.
Europe PMC is part of the PubMed Central International (PMCI) initiative. Together with PMC USA and PMC Canada it constitutes a network of digital archives that provide free access to published peer-reviewed biomedical and health research literature. All nodes share their locally deposited manuscripts within the network, while offering different functionalities to their users. Europe PMC combines the power of both PubMed and PubMed Central as a one-stop shop for both abstracts and full-text articles that can be accessed through a single search interface. Europe PMC also hosts a large variety of other content, such as books, patents, biomedical theses and clinical guidelines. In addition to 27 million PubMed abstracts, Europe PMC covers further sources, such as Chinese Biological Abstracts and Agricola records, bringing the total number of abstracts to 32 million. What distinguishes Europe PMC from PubMed Central are several novel features, including advanced text and data mining tools, integrated ORCID IDs (unique identifiers from the ORCID foundation that distinguish academic authors), and a Grant Finder for accessing grant information from the 27 international life science funders supporting Europe PMC. New developments are highlighted on the Europe PMC blog.
The content of Europe PMC is not limited by geographical location and includes scientific literature from anywhere in the world. We hope that life science researchers all over the globe can benefit from our services.
Are there any connections between Europe PMC and OpenAIRE (or other European e-infrastructures)? Is there any exchange of data and/or metadata going on?
Europe PMC is the largest data provider for OpenAIRE, supplying more than 3.8 million documents. OpenAIRE also utilizes the open API and public web service from Europe PMC to identify FP7 and Horizon 2020 funded research and to gather the associated metadata. Finally, the External Links Service provided by Europe PMC enables OpenAIRE and similar infrastructures to link to Europe PMC records from related resources, such as full texts of articles in repositories harvested by OpenAIRE.
What is the content acquisition policy for Europe PMC? Can any author from the life sciences domain - who wishes to make their work more visible - individually deposit their article in Europe PMC? If not, what requirements should he/she meet and why?
There are several routes for content to be added to Europe PMC. Fully participating publishers deposit the complete contents of each issue or volume, while a selective deposit option is available for hybrid journals that publish a subset of articles open access.
Europe PMC Funders' Group organisations mandate that published research, arising from the research grants they award, must be made available through Europe PMC, typically within six months of being published. Any researcher supported by at least one of the Europe PMC Funders can submit the final peer-reviewed author manuscript for inclusion in Europe PMC using the Europe PMC plus deposition service. Some publishers will deposit the peer-reviewed manuscript for free on behalf of authors for the articles acknowledging funding from the Europe PMC Funders.
For all life science authors, the easiest way of ensuring visibility for their work is simply publishing in an open access journal participating in PMC, which automatically makes the full text of their article available in Europe PMC. A list of journals which signed a PMC participation agreement can be found here.
How does Europe PMC add value to the scientific literature it presents?
We see literature as a bridging mechanism for wider research infrastructure, combining all associated information and helping to transform it into knowledge. To actualize this vision we focus on three major directions: author services, data integration, and text-mining.
Europe PMC works closely with the ORCID foundation to ensure credit attribution for authors. You can search the resource by ORCID ID to find papers by a particular author. Europe PMC provides a tool for scientists to add their published works to their ORCID record, and to date almost 3.5 million articles have been claimed by more than 350,000 biomedical researchers. We also generate author profiles for researchers with ORCIDs, with citation and publication graphs, showing how many articles were published open access. Finally, we provide links to related resources – alternative metrics, post-publication peer reviews from Publons, or lay audience summaries on Kudos and Wikipedia, enabling researchers to show their impact in a number of ways.
Fostering connections between scientific data and literature is a big part of our work. Publications in Europe PMC are programmatically linked to relevant records in a number of databases, including Uniprot, European Nucleotide Archive (ENA) and Protein Data Bank Europe (PDBe), with the list constantly growing. Information discovery is facilitated by directly linking out from biological entities and data citations in the text identified with the help of text-mining approaches.
We have developed a SciLite text mining tool to support scientists and database curators in their literature research. SciLite highlights text mined biological terms, displaying those annotated entities as an overlay on scientific articles in Europe PMC. Annotations are linked to the corresponding data resources, allowing the user to locate the underlying data in a straightforward way. SciLite makes it easier to scan articles for key concepts and helps to quickly grasp the essence of an article.
Europe PMC links from the scientific literature to numerous curated medical and biological databases as well as through DOIs to cited datasets. These datasets may reside in uncurated data repositories, relying on author self-deposit (e.g. Zenodo, Dryad, Figshare). Do you know how often these kinds of datasets are cited? What are your predictions on the future of data repositories that accept uncurated datasets - will they play a significant role in life sciences?
Uncurated repositories are often tailored for data provenance, in contrast to curated databases that structure data for re-use. This can result in different citation rates for curated and uncurated data. Currently, there are significantly more articles in Europe PMC linked to PDBe records, compared to those containing Dryad links (98 000 vs 11 000). Only time will tell whether uncurated datasets will pick up speed with regard to citation.
However, uncurated repositories are indispensable when it comes to new data formats that cannot be easily accommodated by the current structured archives. This lack of structure comes at a price: as the amount of biological data keeps growing exponentially, it becomes increasingly fragmented and scattered across different places. One database addressing these challenges is BioStudies, a new data service at EMBL-EBI which acts as a data container, consolidating all the data from a particular study and making it easy to find and reuse. It links to datasets in established repositories, while also hosting unstructured data. Such an arrangement is especially useful for multi-omics experiments, where different types of data can be produced. Europe PMC links to BioStudies records from scientific papers and provides input in the form of text-mined supplemental information and accession numbers. We believe that BioStudies, due to its focus on life sciences as well as its flexible structure, can adapt to community requirements better than a repository that caters to all domains and disciplines.
Do you think the way PMC International functions would be a good model also for other research areas (e.g. humanities or social sciences), or does every research community need to develop its own model of functioning for literature databases?
PMC International has adopted a system initially designed for genomic data providers. For instance, the European Nucleotide Archive of the European Molecular Biology Laboratory (EMBL), DNA DataBank Japan (DDBJ), and GenBank at NCBI form the International Nucleotide Sequence Database Collaboration. All three databases routinely exchange deposited data, while offering different interfaces and functionality best suited to the scientific community they serve. Such a system ensures archive stability and safekeeping of the stored information, while allowing deposited content to be enriched with local guidelines and related resources. We believe that this arrangement has its benefits for the diverse stakeholders that are invested in the research cycle, providing flexibility of choice for a preferred resource.
The most valuable peer review is based on clear criteria and guidelines, allowing a good dialogue between the researcher, the evaluator and the representatives of the scientific committee of the publication channel, and ultimately leading to the improvement of the research to be published (...). Any form of open initiative contributing to the generalisation of such a system is valuable to me.
Ioana Galleron is a Senior lecturer in French language and literature at Université de Bretagne-Sud. She is involved in research evaluation projects, such as EvalHum. Since April 2016, she has been the Chair of the COST Action CA15137.
Michał Starczewski: What is the difference between the evaluation systems in STEM and SSH disciplines? How are the European Union and its member states facing this issue? Is it possible to find common criteria for all academic disciplines? Should we try?
Ioana Galleron: While systems of evaluation differ with regard to their principles (ex-post/ex-ante, performance based/size based), there are no separate systems evaluating STEM and SSH disciplines. This is not a problem in itself, except when it comes to some of the methods applied, since neither the SSH nor the STEM disciplines form homogeneous groups, and there are enormous differences, with regard to publication habits, between biology and mathematics, for instance, as well as between the SS and H, and within these broad areas. There is strong evidence that quantitative metrics such as impact factor, H-index and the like do not reflect quality, or even allow productivity to be observed correctly, in the SSH, and therefore should not be used for evaluating these disciplines, all the more so as the data is incomplete and not representative. However, such methods are still in use, partly because national evaluation systems are often influenced by STEM scientists and inspired by what goes for these disciplines, partly because of a belief that competition to publish in international journals will improve research, partly because it produces easy to use numbers, and partly because the alternatives appear unclear or not very reliable. Indeed, SSH research evaluation has to improve its protocols, which are perceived by many SSH communities as being unfair, ill-adapted and non-transparent; also, opposition to the very principle of evaluation is to be found more often in the SSH communities, for ideological reasons, or because of traditions of collegiality. From an epistemological point of view, it has even been suggested that STEM and SSH disciplines have very different positions with regard to evaluation, since in STEM there is a long tradition of progress related to the discussion and the refutation of the findings by other researchers, while such a paradigm is less frequent in the SSH.
This suggests that, while the two fields may converge towards certain common criteria, there are intrinsic differences to be taken into account when evaluating each of them.
In short, SSH evaluation is a complicated issue, both for the evaluators and for the evaluated. And while some European states have acknowledged the situation and are looking for appropriate solutions, in many other places this is not the case. A European initiative for the SSH is therefore much needed, so as to create a space for discussion between those who are experimenting with fairer and better-adapted procedures for the SSH, and to give the issue the visibility it deserves.
You are the Chair of the COST Action European Network for Research Evaluation in the Social Sciences and the Humanities (ENRESSH). Could you describe the aims of this project?
The “European Network for Research Evaluation in the Social Sciences and the Humanities” is a COST Action starting in April 2016 and ending in April 2020, aimed at proposing clear best practices in the field of SSH research evaluation. The Action brings together various experts such as researchers in evaluation studies, policy makers and members of evaluation units, as well as researchers from SSH disciplines. The project compares strands of work dedicated to SSH research evaluation, currently under development in different parts of Europe, in order to avoid unnecessary duplication and to upscale results; it started with exchanges of experience in order to build a picture of practices across Europe and the state of the art in research on evaluation. During the subsequent years, the project team will organise conferences, workshops and meetings to bridge the gap between scholars in SSH research evaluation, research managers and policy makers.
I would also like to take this opportunity to mention the EvalHum Initiative. This is a broad-based association that seeks to bring together all those working in the field of research evaluation and impact in the social sciences and humanities. If this sounds pretty similar to ENRESSH, that is not surprising: it was EvalHum members who proposed the Action. However, the two are strictly separate, and the association will continue when the COST Action is over. It is also EvalHum that runs the successful RESSH conference series, with the next taking place in Antwerp in July 2017.
Sometimes the concepts used to describe scholarly communication mean very different things in different countries. This is the case of ‘monograph’, ‘paper’, ‘review’ etc. In some countries a monograph is legally defined by the number of pages, while in others by the number of authors. How much confusion does this cause on the international level? Do you think that some controversies would be easier to solve if people were more aware of this ambiguity?
Too often, evaluation procedures focus upon formal attributes of publications, without being aware that they are very rough proxies for quality, that labels (‘(peer) reviewed’, ‘university press’, ‘book’) can cover very different realities, and that numbers (of pages, of signs, of contributors…) are not relevant in themselves. There is no reason to look down on proceedings of a conference as being in principle “less good” than a single-authored book, and no reason to push towards publishing papers rather than books – or vice-versa! – since quality can come in many sizes and shapes. Such shortcuts or trends would make sense only if there were a form of consensus, within the SSH communities, about the most suitable types of publication for certain types of scientific results, and if one could have a fair amount of confidence in the thoroughness of the quality checking carried out by a scientific committee (be it of a journal, a conference, a publishing house, a reading committee, etc.). Unfortunately, when one looks at the reality behind the words, especially from a pan-European point of view, one realises these two conditions are far from being met.
So, I don’t know if this will help to solve controversies, but being aware of what kind of processes are involved in the publication of a scientific book in, for example, Riga, as compared to what happens with a book in Venice, will probably help us to progress collectively towards common standards, and to promote better science rather than to keep imposing (publication) choices on scholars.
Does open science make evaluation more effective? How?
Open science will change research practices in the SSH, and can impact evaluation methods and protocols. But several questions also arise: are SSH scholars aware of the open access debate and its implications, and do SSH scholars have the financial means to publish in open access? Article Processing Charges (APCs) are a problem for what is often unfunded research, and the cost of book publishing is prohibitive. However, to return to the question, the very fact that certain scholars choose to publish their research and, moreover, their research data openly should be taken into account in an evaluation grid. Open science is also better disseminated, and therefore easier to reuse and to refute, two other important aspects which can be evaluated in turn.
Moreover, open science allows researchers, in certain countries, to shortcut or to go beyond a publishing system which is, for the reasons mentioned above, quite often dysfunctional in the SSH.
How could data sharing be evaluated? Which aspect is the most important: immediate access, legal licenses or full metadata?
Data sharing is a way to make the research process more transparent and to make the reviewers' job easier. At the same time, it can avoid unnecessary duplication, since studies can be replicated or re-analysed from another perspective. It is not easy, however, to conceive a protocol for evaluating data sharing, and we will probably need some time to see how scholars engage with the issue.
From an evaluation point of view, metadata are crucial; they are also, maybe, easier to share and to protect. In order to build robust evaluation protocols, we need to know what happens in the field, and to date this knowledge is very incomplete concerning the SSH, at the national as well as the European level. Also, the situation varies greatly from one discipline to another, with psychology and economics being much better covered by the international databases, while research in languages and literatures is lagging far behind. Having access to the metadata – not only of the data but also of various publications – would allow us to progress towards a European database of SSH production, so as to observe publication habits, collaborative practices, and even thematic trends on a larger and more robust scale.
Communications technology is changing very fast. Scholarly communication is changing as well. Do you think that research evaluation should keep up or should it be a little more conservative and evolve at its own pace?
Research evaluation has to be aware of research processes and the scientific habits in the communities it evaluates, so yes, for credibility’s sake it has to keep pace with the changes in scholarly communication. There is little room, in the current evaluation protocols, for scientific blogs, contributions to newsletters, and in most countries for many forms of involvement with the society at large, to give but some examples. However, all these attempts to communicate otherwise, within and without academia, create room for innovation. In the SSH, there is a huge potential unspotted by decision makers, in many cases because they are scrutinising indicators and results which are not actually reflecting what really happens in the field.
On the other hand, let’s also bear in mind that indicators used in research evaluation need time to be tested and proven to be working, and that running after each trend doesn’t make for a very robust evaluation protocol.
What is the role of social media in scholarly communication? Should academic institutions invest in tools counting tweets or posts on Facebook, such as Altmetric?
Social media have done a lot of good for the SSH, since they have allowed us to communicate with the general public about our topics and research, something which did not often happen with the traditional media, more focused on discoveries from the ‘hard’ sciences. Also, as mentioned above, ‘traditional’ metrics do not work very well for the SSH. An SSH scholar with a very low H-index is not necessarily an unproductive scholar, and the number of downloads of his or her papers in repositories can demonstrate this. However, one must be aware that altmetrics would not reflect exactly what happens in the field either, because of the conservatism in many communities. Excellent researchers who do not communicate through Twitter, posts on Facebook or other academic social media can still be widely read in libraries. Also, data from many Web 2.0 platforms are not reliable enough, and, like all data, they can be shaped by the powerful sciences (physics and medicine) so as to count what they are good at (stories of discovery and curing cancer) rather than our SSH stories, in which people are often more interested. In short, institutions need to be aware of scholarly communication through social media, but also careful in interpreting the information they get, and cautious not to impose behaviours where these are not natural.
You might say that there is a chasm between science and society. 'Ordinary people' don’t know what science is about and they can’t participate in it. At the same time scientists often are seen as closed in an ivory tower, ignoring the real problems of today’s world. However, “citizen science” projects aim to build bridges and connect both sides. Do you think that science is becoming more egalitarian?
I am always surprised when I hear about a “chasm” between science and society, and even more surprised by people focusing on our (supposed) ivory tower, rather than, say, on bankers in their golden one… Be that as it may, is this a new phenomenon, and is this something to worry about? After all, ‘ordinary people’ living in the 16th century were even less aware of science, as the great majority were unable to read, and scientists appeared to many as dangerous. Science is better perceived and even understood in our modern, educated societies – and maybe the feeling that scientists are not open enough to society is related to this greater capability of the ‘lay public’ to understand science, which makes people, in turn, more interested in participating in it. Science is not rejecting people, but it asks, and it is in its very nature to do that, for knowledge and skills so as to participate in it. There are also a lot of great stories of societal engagement of scholars from all disciplines, and especially from the SSH, showing that we are far from being as far above the madding crowd as the traditional saying claims.
From a sociological point of view, what we see is a steady and continuous change of the community of researchers, whose origins and backgrounds are more and more diverse; so, in this sense also science seems to become more ‘egalitarian’.
There are lots of ideas how to improve peer review and make it more transparent. What kinds of open peer review seem to be the most valuable for you?
In spite of known limitations, traditional peer review has proved useful in generating better science, provided it is done properly; what is called peer review by certain SSH publishing houses or journals is sometimes a very light process, resulting in one line (or even one word!) appreciations of pieces of research. This is not to deny the very good job performed by numerous journals or publishing houses but, in many other cases, who gets published depends less on the quality of the work than on the personal relationships some researchers maintain with certain editors, or, even worse, on the capability to pay to be published. I tend therefore to say that the most valuable peer review is, above all, the real one (as opposed to what I call “mocking” or “pretence” peer-review), based on clear criteria and guidelines, allowing a good dialogue between the researcher, the evaluator and the representatives of the scientific committee of the publication channel, and ultimately leading to the improvement of the research to be published. The recent proposals by the open access biology journal eLife open interesting perspectives in allowing a pre-publication dialogue between the anonymous reviewers and the researcher, as well as extended forms of peer review where stakeholders participate in evaluating research with a societal mission. More generally, any form of open initiative contributing to the generalisation of such a system is valuable to me.
- Michał Starczewski
When humanists say to me: you're just making us change our practices because technology has changed, I get a bit jumpy for two reasons. Firstly, the new technology of open, non-rivalrous dissemination, is much more like the things we are trying to do with scholarship than rivalrous forms. Secondly, such an argument assumes that paper, books, the codex, and other material forms are not themselves some kind of technology that has determined our practices. Nobody ever talks, really, or at least not enough, about the way in which academic discourses have been shaped by the material forms of dissemination within which they have existed for most of their lives.
Martin Eve is a Professor of Literature, Technology and Publishing at Birkbeck, University of London. He founded the Open Library of Humanities, a charitable organisation dedicated to publishing open access scholarship. He is also a steering-group member of the OAPEN-UK project, a research project gathering evidence on the open access scholarly monograph publishing in the humanities and social sciences. He is developing several digital humanities projects.
Michał Starczewski: You are the author of the book “Open Access and the Humanities”. What differences are there between the OA revolution in the humanities and in the sciences?
Martin Eve: The usual way in which open access is framed in the humanities is that it “lags behind” the sciences, but this creates a number of new problems. Why, some humanists ask, should the humanities just follow whatever the natural sciences are doing? Others ask why technological change should drive academic practice. Another set fear the influence of open licensing, which they claim may promote plagiarism (I do not believe this). Still others point to the problem of economics: far less work receives funding in the humanities and Article Processing Charges (APCs) are not readily available. Finally, others point to the fact that there isn't actually a straightforward divide between “the humanities” and “the sciences”, even on OA. Indeed, the discipline of chemistry is very poor at open access while philosophy has had a culture of pre-prints for some time.
So, there are differences in what humanists do and how it is communicated, but I often feel these are overstated. We all write because we want to be read and we know that paywalls pose a barrier to broader readership. That said, we do have a culture of monographs in the humanities that are substantially harder to make open access than articles and journals...
The discourse and practice in OA is focused on articles and journals. Meanwhile, for researchers in the humanities, monographs are often much more important than articles. One might say that the main conclusion from the Jisc OAPEN-UK final report on OA monograph publishing is that it is too early to recommend any specific model. What are the obstacles?
While I don't have the space here to go into every detail, there is a set of social and economic challenges around monographs that were extremely well explored in a recent report by Geoff Crossick for HEFCE in the UK. Central to these challenges are the economics. A separate report recently issued in the USA for the Andrew W. Mellon Foundation found that the cost was “$30,000 per book for the group of the smallest university presses to more than $49,000 per book for the group of the largest presses.” At this type of cost, it becomes very difficult to support a model such as a Book Processing Charge (borne by the author/institution/funder). There is also the thorny problem of trade books, the still-underexplored issues of how OA books are used (in comparison to print), and the reticence of some tenure and promotion committees to admit born-digital manuscripts.
You have founded the Open Library of Humanities, a charitable organisation dedicated to publishing open access scholarship with no author-facing article processing charges (APCs). Could you explain how it works? Could it be a model for other institutions across the world? Are you going to publish monographs in this model as well?
The OLH works on a model of distributed library subsidy. So, instead of an author paying us ~$600 when an article has been accepted, we instead solicit contributions from libraries around the world that look like (but are not) a subscription. Libraries currently pay around $1000 per year to support our 15 journals. However, everything we publish is open access, so libraries are not “buying access” or a subscription or anything like that. They are supporting a platform that could not otherwise exist. It is a non-classical economic model but it seems to be working as around 200 libraries have currently signed up and we have seen a 100% renewal rate in our second year. We do intend to move to monographs, but this is further off. We are more interested, for now, in flipping subscription journals away from a paywalled mode and into our model. This can be achieved by journals either leaving their current publisher, or by us covering the APCs of that journal in the future. In this way, we get around the funding problems in the humanities for OA.
After the “Finch Report” the UK turned towards the Gold Route of OA. The findings of the monitoring of this policy are as follows: the majority of articles have been published in the most expensive, hybrid journals. The Wellcome Trust reported that 30% of the articles for which they had paid processing fees, were not available when the Trust checked. What went wrong with the OA policy in the UK?
I don't really think it's fair to judge whether policies have “gone wrong” at this stage and it depends upon what you wanted to achieve in the first place. If the goal was to achieve OA and for it to be cheaper than a subscription model, then yes, there are some problems emerging here. But if the goal is to achieve open access, even if it costs more, then the policy is working well. I personally think that, in the long run, we need a system that is more sensitive to the budgetary pressures of academic libraries (and I believe that academic publishing should be a not-for-profit enterprise). But the different policies in the UK – the gold RCUK policy and the HEFCE green policy – are combining to create a culture where OA is the norm. To say that these policies haven't worked after four years (RCUK) and six months (HEFCE) is a little rash.
How do you see the future role of scientific publishers in the context of OA? Do researchers need publishers to organise peer review and ensure high quality?
I tend to think about publishing in terms of the necessary labour here. I do not support the idea that, under capitalism, people should work for free. If people are performing a service, then they deserve to be remunerated for that. The labour in publishing, therefore, is labour like any other. Publishers perform a variety of tasks that I think it would be foolhardy to discard and that require payment: peer-review organization, typesetting, proofreading, copyediting, digital preservation, platform maintenance, marketing, legal advice, identifier assignments, curation, the list goes on. Whether or not these “ensure high quality” is something of which I'm unsure. I regard peer review with deep scepticism and believe that it is more often a panacea than a rigorous gatekeeping method. Indeed, I recently wrote about the problems of predictive excellence with a group of others.
Do you think that the open data issue is as important in the humanities as in other disciplines? Is it a feasible scenario that humanities will be based on digital data? Are we witnessing a “digital turn”?
What's interesting here, I think, is that the term “data” is not well understood in the humanities. It implies a type of processing of quantitative material that most humanists don't encounter. Yet, at the same time, we all work with artefacts that could be called “data”. So, when I'm speaking with colleagues about this, I tend to use the word “evidence” or “paratext” to refer to data. I say, if you are writing about a nineteenth-century novel and you made a series of notes on this, the novel itself and your notes could both be considered data and might be valuable to someone else. That said, data sharing is controversial in many disciplines, so the fact that the humanities haven't leapt upon this is nothing to alarm us for now.
Is openness a necessary feature of the digital environment in the humanities?
It is not, sadly. As is evidenced by the fact that people have put up paywalls online around research material, it is perfectly possible to operate a closed digital environment for the humanities. That said, there is something interesting about this that always strikes me (drawing on the astute remarks of Peter Suber in his book, Open Access, from MIT Press). Knowledge, ideas and words are infinitely copyable without the original owner ever losing them. If I tell you something that I know, then you know it too and we are both richer. Digital technology that allows infinite copying is directly in line with this way of thinking. So when humanists say to me: you're just making us change our practices because technology has changed, I get a bit jumpy for two reasons. Firstly, the new technology of open, non-rivalrous dissemination is much more like the things we are trying to do with scholarship than rivalrous forms. Secondly, such an argument assumes that paper, books, the codex, and other material forms are not themselves some kind of technology that has determined our practices. Nobody ever talks, really, or at least not enough, about the way in which academic discourses have been shaped by the material forms of dissemination within which they have existed for most of their lives.
e-Infrastructure is not always interoperable. The information often can’t be distributed among different tools. The problem is serious when researchers work with GLAM (cultural institutions such as galleries, libraries and museums) resources. Is it possible to use common standards that make e-infrastructure interoperable? What are the main obstacles to using such standards?
It is, of course, possible to create common standards for e-infrastructure. However, the challenge here is that we have a highly distributed set of actors all with different end goals. Does Elsevier see itself as benefiting from working in an open, interoperable way, in the same way as a small, born-open-access book press? Does Wiley get the same benefit from being interoperable as an institutional repository? I'd argue that different stakeholder desires condition the degree of interoperability here as much as any technological aspect.
You are working on three new books. Could you please tell us something about these projects?
Certainly. The first book that is currently in a full draft state is called The Anxiety of Academia and it looks at the ways in which the concepts of critique, legitimation, and discipline are used by a set of contemporary novels to pre-anticipate the way in which academics will read such novels. The second is called The Aesthetics of Metadata and this project, which is about 50% complete, reads a series of contemporary novels for the way in which they represent metadata-like structures. So, for example, I here look at Mark Blacklock's book on the Yorkshire Ripper hoaxer in the UK and the way in which accents, writing, and location all play a role in the hunt. I also look at the false footnotes in Mark Z. Danielewski's House of Leaves alongside the objects from a ruined future in Emily St. John Mandel's Station Eleven. Finally, the last book I'm working on for now is called Close-Reading with Computers, and this is also about 50% complete. This book is an exploration of the ways in which various methods from the field of computational stylometry can be used to advance the hermeneutic study of contemporary fiction, centring on David Mitchell's Cloud Atlas. I am attempting to publish all of these books through an open access route.
- Michał Starczewski
Personally, I find the green/gold debate rather limiting, as it imagines a future where the formats and pathways for sharing research map on to what we have had for many decades now. But researchers are actively pursuing completely new routes for sharing their findings and knowledge, not all of which fall into the remit of the traditional publishing paradigms.
Dr. Jennifer Edmond is Co-Director of the Trinity Centre for Digital Humanities at Trinity College Dublin. She represents the Digital Research Infrastructure for the Arts and Humanities (DARIAH) on the Open Science Policy Platform, a high-level advisory group established by the European Commission. Her primary research focus is the impact of technology on humanities research.
On May 27th the European Commission established the Open Science Policy Platform, an expert group that will “provide advice about the development and implementation of open science policy in Europe”. You represent Digital Research Infrastructure for Arts and Humanities (DARIAH) in the group. Could you please tell us what goals you wish to achieve in the OSPP?
I believe that making science more open will bring great benefits, for researchers, for industry, and for society as a whole. At the same time, I am very concerned that our understanding of what ‘open science’ means is being developed very much from a STEM perspective. The arts and humanities have much to offer as well, but these disciplines start out from a very different baseline, with our ‘raw’ data (to the extent you can call it that at all) often held by cultural heritage institutions, and our strong traditions of sustained argumentation and individual research leading to different types of results. To be truly a success, open science must encompass these practices as well as those of ‘hard’ sciences. In this, DARIAH can be a strong facilitator, so it is my primary goal to represent the humanists’ traditions on the platform and find ways in which DARIAH can support more openness without discounting or underestimating the strength of the humanistic suite of epistemic strategies.
According to the Amsterdam Call for Action on Open Science, two pan-European goals should be achieved by 2020: full OA for all scientific publications and a fundamentally new approach towards the reuse of research data. Will the OSPP be engaged in finding a way to do this?
Yes, these are two of the major items on the OSPP agenda. But the OSPP is also very mindful that embedding these policies into the research cultures of Europe will require wider systemic change. The OSPP is therefore also tasked to envision structures to support the ‘downstream’ issues that will inevitably arise from greater openness, such as new skills requirements, integration of citizen science, new ways to evaluate scientific production and research integrity in a shifting landscape, to give just a few examples: all of these issues are tightly intertwined with how, when and where we share research results.
The Amsterdam Call for Action on Open Science omits the Green Route of OA. Do you think the balance between green and gold OA is possible? How can we achieve it?
I think it is inevitable that we will have to have a multifaceted approach to ensuring wide access to research results – the publishing ecosystem is already too diverse and too broad and it is growing and changing all of the time. Personally, I find the green/gold debate rather limiting, as it imagines a future where the formats and pathways for sharing research map on to what we have had for many decades now. But researchers are actively pursuing completely new routes for sharing their findings and knowledge, not all of which fall into the remit of the traditional publishing paradigms. I think in time we will move beyond the green and gold duality to more of a rainbow, with researchers having a greater choice of options suitable to different types of outputs, and our understanding of the real utility of the instruments we currently have to facilitate open access, from open repositories to APCs, will become focussed.
Do you think that OA policy will remain within a competence of EU member states or will the EU harmonize this area through regulations or directives?
Horizon2020 has long been a major influencer for national research policy development, but it is important to remember that H2020 policy is developed via a political process involving expertise and experiences from the member states. I would foresee a continuation of this trend toward harmonization between European and national policies, but always one that develops as a balance between bottom up input and top down incentives.
DARIAH is a European Research Infrastructure Consortium (ERIC) developing an infrastructure for digital humanities. How can researchers across Europe participate in this endeavour?
DARIAH is a large and vibrant community, which is continuously growing and launching new initiatives. Countries that are already DARIAH members will have a national committee or organizing structure – researchers who want to get involved in DARIAH would be able to connect there, learn more about what DARIAH does and how it supports research and ultimately get involved. If a country is not a DARIAH member, then the DARIAH central offices can advise as to how to lobby for your country to join or to point toward the open services available to all.
Digital humanists often declare that openness is an important value for them. What are the benefits of openness in the digital humanities?
First of all, the fact that our methods and sources can become quite intertwined in the digital humanities makes us more aware of how important it is that data move fluidly. A traditional research project can be based on records locked away and virtually hidden in an archive, but a digital one cannot. Second, the investments we make in our projects and the platforms we offer them on are such that we must look for them to become sustainable. Sustainability means many things to many people, but to me it is primarily about reuse, which is in turn about openness – in terms of technical standards, rights, adaptability, knowledge organization, and on a host of other planes. Finally, digital humanists often work in a more public space, certainly one that interacts with other disciplines, and very often one where research is made directly accessible to the public. This openness has become a fairly widely accepted part of what digital humanities is, for the richness it brings and the impact it facilitates.
The issue of Text-and-Data Mining is intensely discussed at EU level at present. What regulations are desirable from an open science and digital humanities point of view?
At this point I would settle for transparency: the biggest barrier at the moment is more that the situation is so unclear than that there are specific barriers. If we can first establish what the conditions are underlying the right to mine, how we might actually be 100% sure where we stand with a given data set or approach, then we could start to build from this baseline in a balanced and responsible way. Once this is accomplished, we will need more good quality content to become openly available, which I think would become a fast emerging infrastructural space as soon as a clearer rights environment became established.
How deep is the “digital turn” in the humanities? Does it change the principles of research?
In some ways the digital turn is deeper than we care to admit: very few scholars would work today without using tools like JSTOR or keeping hybrid notes and records of their research. But much of the answer to this question depends on what you expect of the digital. For some it is transforming their work utterly – this is still a minority, however, and like any methodological approach, I would never expect this deep adoption of digital tools to be something everyone does, just something that everyone appreciates the place of. The nature of the available tools complicates my response to this question as well. Currently, many digital tools are being adapted from outside of the humanities research ecosystem. This works for some, but in many cases, these tools do not take the deep practices of humanities research into account. So I think the digital turn is yet to reach its full potential for the humanities, and will remain incompletely realised until we create research environments that truly encompass the deeply engrained, embodied and serendipitous aspects of humanities research.
What chances do you see for broader cooperation between humanities researchers and STEM researchers in the digital environment? Which tools from the open science toolkit could stimulate such cooperation?
For me, the excitement of digital humanities stems from the way it offers a translational approach to humanities research, bringing an applied edge to what is generally very basic research, allowing that work to interface with other users, disciplines and perspectives. If we can find more effective ways to share and validate knowledge in ‘real time,’ then this potential will grow. Open data will be a key driver for this, but not on its own. More important will be, for example, developing a capacity in Europe to recognize new professional pathways in between those of the librarian, the researcher and the technician. We need these hybrid people: they are the bridges that create dialogue and bring disciplines together. There are good people moving into these spaces in an ad hoc way, but there aren’t enough of them, and they often face artificial barriers to their mobility between expertise spaces and into positions of authority. If we can better optimise the overall system to allow for such intermediaries to be effective, then cooperation will grow significantly.
- Michał Starczewski
Our special guest today is Stevan Harnad, a prominent figure in the Open Access movement. Author of the famous 'Subversive Proposal', founder of 'Psycoloquy' and the journal 'Behavioral and Brain Sciences', creator and administrator of AmSciForum, one of the main coordinators of the CogPrints initiative – the list could be stretched far beyond that – he doesn't really need an introduction for anyone not wholly a stranger to the story of the Open Access movement. A cognitive scientist specialising in categorization, communication and consciousness, Harnad is Professor of cognitive sciences at the Université du Québec à Montréal and University of Southampton, external member of the Hungarian Academy of Sciences and doctor honoris causa, University of Liège. But even his polemics with John Searle about the Chinese Room didn't become as famous and influential as his Open Access advocacy.
Open Access Archivangelist
It's often said that it all began in 1994 with Harnad’s 'Subversive Proposal': a call to fellow academics to upload all their previously published research output to open access repositories, thus making it freely accessible to anyone with Internet access. Yet Stevan Harnad's adventure with open scholarly publishing began before that – as far back as 1978, when he founded the Open Peer Commentary journal 'Behavioral and Brain Sciences'. The journal was unique in the way it complemented traditional peer review with open peer commentary: copies of each accepted article were sent to about 100 experts in the fields it touched. Their short commentaries were then co-published with the target article along with the author's replies. As Harnad made clear in his 2007 interview with Richard Poynder, although the journal was published in the paper era and was thus technologically incapable of becoming anything close to what later became known as Open Access – it made Stevan wonder about ways in which more people could benefit from open peer commentary. So when in the middle of the 1980's Harnad was exposed to the emerging Usenet, his maturing ideas at last met the right technology. Harnad called the idea 'skywriting'. Open access to scholarly literature was then the only logical conclusion – a necessary condition for skywriting.
Stevan Harnad was with the Open Access movement from the very beginning, longer even than the term itself existed (the term 'Open Access' was introduced in 2002 with the Budapest Open Access Initiative). For a number of years he expressed his thoughts about the state of Open Access in particular and academic publishing in general in the American Scientist Open Access Forum (now the “Global Open Access List” (GOAL)) as well as on his blog 'Open Access Archivangelism'.
100% Green Gratis OA
Harnad's long-standing advocacy for “Green” Open Access (OA) is well known. According to him, the fundamental priority is for academics to fill their institutional research repositories. Once all published research output is openly available via this “Green Road” without delay (embargo), academic publishing will have to be modified in order to survive. The emerging model (“Fair Gold”) will be Open Access, with journal publishers' roles reduced to their sole remaining essential function: managing peer review. Much of what we do toward attaining Libre (CC-BY) Gold OA before academic output reaches 100% Gratis (toll-free) Green OA is premature, redundant and may even delay the transformation of academic publishing to Fair Gold OA (playing into the hands of publishers who are trying to delay OA as long as they can).
It seems, however, that the long era of Harnad's 'Archivangelism' for Open Access is coming to an end. Earlier this year, 22 years after the 'Subversive Proposal', Harnad made it quite clear via Twitter that he is about to quit Open Access Advocacy.
Tomasz Lewandowski contacted Stevan and asked him about his decision, its context and his plans for the future. Stevan was kind enough to give us an interview summing up his career as Open Access advocate.
Tomasz Lewandowski: Can you tell us a bit about your research?
Stevan Harnad: My research is on how the brain learns new categories and how that changes our perception as well as on the “symbol grounding problem” (how do words get their meaning?) and the origin and adaptive value of language. I also work on the Turing Test (how and why can organisms do what they can do? what is the causal mechanism?) and on consciousness (the “hard problem” of how and why organisms feel). Apart from that I work on open-access scientometrics (how OA increases research impact and how OA mandates increase OA). I also edit the journal Animal Sentience: An Interdisciplinary Journal on Animal Feeling and I am beginning to do research on animal sentience and human empathy.
Concerning your now rather famous tweet about your retirement as Open Access Archivangelist: what was the context of this decision? What's next?
The context (if you look at the tweet conversation) was that Mike Eisen (co-founder of PLoS) was implying that copyright law and lawyers consider the requesting and receiving of reprints or preprints to be illegal. I think that is nonsense in every respect. It is not illegal, it is not considered illegal, and even if it were formally illegal, everyone does it, preventing it would be unenforceable, and no one has challenged it for over a half century!
So I replied that this was wishful thinking on Mike’s part. (He is an OA advocate, but also co-founder and on the board of directors of a very successful Gold OA publisher, PLoS. There is a conflict of interest between publishers (whether they be TA [Toll Access] publishers or OA publishers) and the advocates of Green OA or eprint-sharing. So what I meant was that Mike was wishing it to be true that eprint-sharing was somehow illegal, and hoping that it would not happen, as it conflicts with the interests of getting researchers to pay to publish in Gold OA journals -- rather than to publish in TA journals and self-archive -- if they want OA).
Mike replied (equally ironically) that I was the champion of wishful thinking. At first I was going to reply in the same vein, with a light quip. But then I decided, no, it’s true: I had long wished for all refereed research to be Green OA, and my wish has not been fulfilled. So I simply stated the fact: That he is right, I have lost and I have given up archivangelizing.
If it turns out that the wish is nevertheless fulfilled eventually, all the better. But if it is overpriced Gold OA (“Fool’s Gold”) that prevails instead, well then so be it. It’s still OA.
My own scenario for a rational transition to “Fair Gold” OA via Green OA has been published and posted many times, and it may eventually still turn out to be the path taken, but for the past few years I find that all I am doing is repeating what I have already said many times before.
So I think suffering animals need me much more than the research community does. This does not mean I will not be around to say or do what needs to be said or done, for OA, if and when there is anything new I can say or do. But the repetition I will have to leave to others. I’ve done my part.
Bekoff, M., & Harnad, S. (2015). Doing the Right Thing: An Interview With Stevan Harnad. Psychology Today.
Was there any point during your time as an OA activist that you felt it is all going in the right direction and your vision will soon become true?
Quite a few times: First in 1994, when I made the subversive proposal; I thought it would just take a year or two and the transition to universal self-archiving would be complete. Then I thought commissioning CogPrints would do the trick. Then making CogPrints OAI-compliant. Then creating OA EPrints software; then demonstrating the OA citation advantage; then designing Green OA mandates by institutions and funders; then designing the copy-request Button; then showing which mandates were effective; then debunking Fool's Gold and the Finch Report.
But now I see that although the outcome is optimal, inevitable and obvious, the human mind (and hand) are just too sluggish and there are far, far more important things to devote my own time to now. I've said and done as much as I could. To do more would just be to repeat what has already been said and done many times over.
Carr, L., Swan, A. & Harnad, S. (2011) Creating and Curating the Cognitive Commons: Southampton’s Contribution. In, Simons, Maarten, Decuypere, Mathias, Vlieghe, Joris and Masschelein, Jan (eds.) Curating the European University. Universitaire Pers Leuven 193-199.
In an interview you gave in 2007 to Richard Poynder you draw a vision of something one might call an intrinsic history of ideas of Open Access. In the beginnings of the modern Open Access movement there were, according to what you said, two main streams of thought. One stream was concerned with accessibility to scholarly literature, the other - with its affordability. In the former stream, one can position your BBS and arXiv and other early OA initiatives. In the second stream, there is Ann Okerson and her efforts to make scholarly literature more affordable for universities (though perhaps not for the broader public). And although not primarily concerned with Open Access, the search for a more affordable scholarly journal financing model eventually led to the APC model (paid Gold OA). So then the accessibility movement became the Green Road to Open Access and the affordability movement - the Gold Road to Open Access.
Now, when things are put this way, I think we can see the inner tension of the Open Access movement more clearly. A possibility arises that these two streams within the Open Access movement were not that compatible. Could you elaborate on that? If you look at the Open Access Movement as an offspring of two separate problems: one related to the accessibility of the scholarly literature and the other related to its affordability, do you think that OA was ever really a single, coherent movement at all?
First of all, two pertinent details: (1) the APC (paid Gold OA) cost-recovery model was already there, explicitly, in the 1994 Subversive Proposal – as was the assumption that universal Green OA self-archiving must come first. (2) Ann Okerson was not particularly an advocate of APCs as the solution to the affordability problem but of licensing.
Affordability is just an aspect of the accessibility problem: If there were no accessibility problem -- if there were no need for all researchers to have access to all research, or if they somehow already had it -- then affordability would not be a problem, or a very minor one. Conversely, if there were no affordability problem, then accessibility would not be a problem.
But affordability was always primarily a problem experienced directly by institutional librarians (the "serials crisis") whereas accessibility was a problem experienced directly by researchers. The solution for affordability seemed to be lower journal prices whereas the solution for accessibility was for researchers to provide to their final refereed drafts the open access that the online era had made possible -- by self-archiving them in their institutional repositories (i.e., what came to be called "Green OA").
The ultimate solution, of course, was (1) universal Green OA self-archiving followed by (2) universal journal subscription cancellation by institutions, (3) the cutting of all obsolete journal products and services and their costs by publishers, and (4) a transition to author-institutional payment for the remaining essential cost (managing peer review) up front (what came to be called "Gold OA").
But this optimal Gold outcome was from the very beginning (already in the 1994 Subversive Proposal) predicated on first providing universal Green OA as the source of the access and the driver of the cancellations, downsizing, and conversion to Gold OA. Without providing Green first, the only way to get to Gold OA is to pay the inflated price of pre-Green "Fool's Gold" OA, which does not solve the affordability problem, leaving all obsolete products and services bundled into the inflated price per article. And even the notion of a global "flip" of all the planet's journals to Fool's Gold OA is obviously incoherent to anyone who thinks it through.
So the rush for a pre-emptive solution to the affordability problem has become a Fool's Gold Rush. Only if institutions and funders first mandate Green OA globally can there be a viable, stable transition to affordable, scalable, sustainable "Fair Gold" OA.
Harnad, S (2014) The only way to make inflated journal subscriptions unsustainable: Mandate Green Open Access. LSE Impact of Social Sciences Blog 4/28.
You once defined Open Access as 100% Open Access - meaning it's either 100% or not at all, because only 100% will make the traditional publishers fall. This definition is fair enough but in reference to some of the previous questions I think we might rather need an operational one. So let me ask you - what would need to happen for you to say "Hey, today we have Open Access in the academic world"?
See above: "Only if institutions and funders first mandate Green OA globally can there be a viable, stable transition to affordable, scalable, sustainable "Fair Gold" OA". Without 100% Green OA, journals are not cancellable.
Piece-wise local transitions to (Fool's) Gold OA (by country, institution, funder, field or publisher) not only add to the overall costs of access while subscriptions continue everywhere else, but they divert attention from what really needs to be done, which is for all funders and institutions to mandate Green OA (with deposit required immediately upon acceptance for publication plus either immediate OA or the copy-request-Button). In contrast, unlike Fool's Gold OA, Green OA can be mandated piece-wise (by country, institution or funder).
Sale, A., Couture, M., Rodrigues, E., Carr, L. and Harnad, S. (2014) Open Access Mandates and the "Fair Dealing" Button. In: Dynamic Fair Dealing: Creating Canadian Culture Online (Rosemary J. Coombe & Darren Wershler, Eds.).
What will the world of scholarly communication look like after 100% Open Access is established? How would this influence the entire model of scholarly communication?
Once Green OA is universally mandated and provided, there will be the transition to Fair Gold OA, with peer-review being the only remaining service provided by publishers, and paid for by institutions out of a fraction of their subscription cancellation savings. Once research papers are all open and text-minable, open data will soon follow, and with it open science. The rate of progress and collaboration in research will be greatly enhanced and we will have a rich battery of OA metrics for monitoring and measuring research progress, productivity, and currents of influence.
Harnad, S. (2013) The Postgutenberg Open Access Journal (revised). In: Cope, B. and Phillips, A. (eds.) The Future of the Academic Journal (2nd edition). Chandos.
Your phrase stating that Elsevier was "on the side of angels" when it comes to embracing OA took on a life of its own. You took that position in 2007 after Elsevier's policy on Green Open Access was introduced. You still maintained it even when the Cost of Knowledge boycott was at its peak. Even when in 2013 Elsevier excluded from its policy researchers who were under institutional mandates, you "continued to attest that". As late as 2015, Michael Eisen still seemed to hold a grudge toward you for this statement. Could you once more recall the context in which this phrase was coined and what exactly it meant? Do you still continue to attest that?
The "side of the angels" quip was always a ruse, designed to keep Elsevier from trying to embargo Green OA for as long as possible by throwing them a token credit to use as PR amidst their onslaught of blame (from librarians and authors). Elsevier knew it, I knew it, and so did anyone else with a realistic sense of what was going on, and what was at stake.
I also did not believe in boycotts (and their failure every time has borne me out) but in mandates (though they have not yet prevailed as I had hoped either).
But much less trivially, although it was just as obvious and inevitable as OA that publishers would use every trick possible to try to stave off Green OA for as long as possible, it should be obvious that publishers are not the real obstacles to OA. The real obstacles are precisely the ones who will benefit directly from OA the most: researchers. (The biggest indirect beneficiary is of course the tax-paying public that supports the research and researchers.)
If researchers worldwide had not been so sluggish, timid and obtuse, and had provided Green OA of their own accord as of 1994 (as computer scientists and physicists had already been doing then for over a decade, taking advantage of each new online means of providing OA as it appeared, completely oblivious to what publishers might think or say about it) then we would have long reached the optimal and inevitable by now.
But most researchers didn't. So we are still busy adopting OA mandates (many of them weak, hence ineffectual) and trying to get their details right:
Vincent-Lamarre, P, Boivin, J, Gargouri, Y, Larivière, V & Harnad, S (2016) Estimating open access mandate effectiveness: The MELIBEA Score. Journal of the Association for Information Science and Technology (JASIST), 67.
On the one hand, big legacy publishers are embracing Open Access more and more - by introducing open access options to their old journals, by establishing policies for self-archiving and by creating new open access journals. All this has been happening ever since Springer bought BioMed Central back in 2008. On the other hand, their revenues have stayed as high as before, or have even increased. You yourself wrote many posts concerning the phenomenon of "double-dipping". In reference to your answer to the last question - could you comment more broadly on big for-profit scholarly publishers and their relation to Open Access? Maybe you have some predictions about the near future of the business that you would like to share?
I continue to believe that it is virtually irrelevant what publishers say or do. The sole retardant is researchers; their institutions and funders can ensure that they do the right (optimal, inevitable) thing -- though it is too late now to get them to have done it as soon as it was possible!
Publishers' Fool's Gold OA options are just distractions, designed to delay the optimal and inevitable outcome for as long as possible (and publishers know this full well).
So it all depends on how soon effective Green OA mandates by institutions and funders get adopted globally: Only the universal availability of Green OA will make journal subscriptions cancellable, thereby forcing publishers to cut all their remaining obsolete Gutenberg-era products and services (like the print edition, the online edition, archiving and access-provision) and their costs, downsize to the sole remaining essential service of PostGutenberg peer-reviewed journal publishers (namely, the management of the peer review, which researchers provide for free, just as they provide their research for free), and convert to Fair-Gold OA fees in order to recover their minimal costs.
Harnad, S. (2010) No-Fault Peer Review Charges: The Price of Selectivity Need Not Be Access Denied or Delayed. D-Lib Magazine 16 (7/8).
You said that even should the overpaid "Fool's Gold" prevail, it would still be Open Access. So (in reference to the two streams of the Open Access Movement) - is it for you accessibility above affordability? Accessibility no matter the costs?
If the planet were to opt for universal Fool's Gold instead of mandating Green OA and attaining OA plus Fair Gold that would be fine with me. Fools will always be parted needlessly from their money, and my only real objective all along was universal OA, as soon as possible.
The reasons I oppose and mock the Fool's Gold Rush, however, are precisely the ones I've described: Fool's Gold is bloated, unscalable, unaffordable and unsustainable -- hence (I infer) unattainable. And the result is that it is diverting attention and energy from the only route to universal OA that I believe will work, and that route is Green OA self-archiving, mandated globally by all research institutions and funders.
There are numerous Open Access policies all around the world, introduced by research organisations, funders or various other stakeholders. ROARMAP currently lists over 700 policies. Back in 2007 you seemed to think that all that is needed for 100% Green Open Access is open access mandates. Now we have legal Open Access mandates introduced and working in practice - and there are over 500 more of them than back in 2007. What makes you think we still aren’t any closer to our goal after all?
We are closer, but not nearly as close as we could and should be, because there are still far from enough Green OA mandates, and many of them are needlessly weak and ineffectual.
What does an ideal, strong and effectual OA mandate look like? Are there any mandates like that out in the wild?
The essential features of an effective Green OA mandate are the following.
(1) It must require deposit immediately upon acceptance for publication (not after an embargo).
(2) It must require deposit of the author's refereed, accepted final draft (not the publisher's PDF).
(3) It must require deposit in the author's institutional repository (not institution-externally).
(4) Immediate deposit must be made a prerequisite for research performance evaluation.
(5) The repository must implement the copy-request Button.
(6) The immediate-deposit need not be immediate-OA (as long as the Button is implemented).
Harnad, Stevan (2015) Open Access: What, Where, When, How and Why. In: Ethics, Science, Technology, and Engineering: An International Resource eds. J. Britt Holbrook & Carl Mitcham, (2nd edition of Encyclopedia of Science, Technology, and Ethics, Farmington Hills MI: MacMillan Reference).
As far back as 10 years ago you thought that progress in self-archiving was far too slow. In the paper "Opening Access by Overcoming Zeno's Paralysis" you diagnosed the academic community as overwhelmed by what you called "Zeno's Paralysis". As I understand it, then, you maintain that the problem is psychological in nature. There are others, however, who maintain that the problem is more systemic in nature. The whole "publish or perish" scientific communication system that has emerged over the last few decades has too many intrinsic incentives that guide researchers in the wrong directions and too few incentives that would direct them towards OA. Looking back over the years, has your diagnosis changed? Who is to be blamed - the scientists or the system they work in?
The ones to blame are (1) the scientists themselves, for not providing Green OA of their own accord, unmandated, and (2) their institutions and funders, for being so sluggish in mandating it, and so slow to optimize their mandates.
The systemic problems of research funding, publication and assessment (peer review, publication lag, publish or perish, impact factors, research evaluation, data-mining, re-use licensing, etc.) are real enough, but they are not access problems -- and it was, and continues to be a big mistake to conflate them with the much simpler, focused problem of providing immediate toll-free online access to refereed research to all would-be users.
You mean researchers specifically as researchers or researchers as human beings? If it’s only about being a researcher, then what makes this particular group as sluggish as, according to you, they now are? Yet you said previously it’s about the “human mind”.
I'd change it now from "human mind" to "academic" mind (though the more general case can probably be made too, as an academic is merely a human being in a certain kind of profession...).
I have to confess that I don't understand why it's taking academics so long. Some say it's because they are already overworked, but I think that's a self-serving view and probably not true about most academics. Besides, self-archiving takes next to no time per paper (and even with "publish or perish" academics don't publish that many papers per year!)
But I've taken a stab at trying to diagnose and catalogue the many causes of "Zeno's Paralysis" in the BOAI self-archiving FAQ. There are at least 38 of them at last count. The top two are laziness and fear of publishers.
Harnad, S. (2006) Opening Access by Overcoming Zeno's Paralysis, in Jacobs, N., Eds. Open Access: Key Strategic, Technical and Economic Aspects, Chapter 8. Chandos.
Nowadays you don’t hear about plain Open Access that often. I mean the term is still widely used and is a highly recognisable mark. Yet it is often either traded for “Public Access” - especially when gratis Open Access is meant - or, when libre Open Access comes into consideration, incorporated into Open Science. Is Open Access, according to you, an inseparable part of Open Science (as Open Data perhaps is)? Or is it rather a separate goal, achievable by separate means, that you think is being thrown into a big, loose and fuzzy bag labeled “Open Science” for purely rhetorical reasons?
Not only is universal toll-free online access to refereed research (OA) -- Gratis, Green OA -- the first and foremost goal, but it is and has long been completely within the research community's immediate reach. It just has not been grasped. We cannot have Libre OA and CC-BY till we first have Gratis OA. And we cannot have Open Science without Libre OA and CC-BY. And Open Data, even if CC-BY, are of limited use if the refereed articles based on them are not OA.
So just as the optimal and inevitable outcome has been delayed by the pre-emptive Fool's Gold Rush, so it has been delayed by trying to reach pre-emptively for Libre OA, CC-BY and Open Science without first troubling to mandate universal Green, Gratis OA. (I’ve called this “Rights Rapture.”)
Harnad, S. (2013). Worldwide open access: UK leadership? Insights, 26(1).
Apart from Open Access, what do you mainly do as a researcher?
My research is on how people acquire categories. To categorize is to do the right thing with the right kind of thing: to approach it, avoid it, eat it, mate with it, manipulate it, name it, describe it. Categories are kinds, and our brains need to find the features that distinguish the members from the nonmembers of each category relevant to our survival and success.
I say “our brains” do it because often we categorize without knowing how we are doing it. My field, cognitive science, is devoted to "reverse-engineering" the mechanisms in our brains that generate our capacity to do all the things we can do. The ultimate goal is to create a model that can pass the Turing Test, a model that is able to do all the things we can do.
In the lab people learn new categories (e.g., new kinds of shapes) by trial and error, with feedback signaling to them whether they are right or wrong. We measure what is going on in their brains as they learn, and we also model the process with computer-simulated neural networks that are trying to learn the same categories.
We share the capacity to learn categories from direct trial-and-error experience with many other species, but that is not the only way to learn categories. Our species is unique in that we can also learn categories verbally: Someone else who knows which features distinguish the members from the non-members of a new category can tell us. Almost all the words in a dictionary are the names of categories. And every word in a dictionary is defined. So if there is a word whose meaning you don't know, you can look up its definition. But what if you don't know the meaning of the words in its definition? You can look those up too. But it can't continue like that indefinitely, otherwise you would eventually cycle through the whole dictionary without having learned anything. This is called the "symbol grounding problem." Some of the meanings of some words, at least, must have been grounded the old way that we share with all other species -- via the direct trial-and-error experience we study and model in the lab -- in order to ground the meaning of a new category learned through verbal definition alone.
How many words need to be already grounded in experience -- and which ones -- so that all the rest can be learned from verbal definition alone? This is another problem we work on, by doing graph-theoretic analysis of dictionaries. The number is surprisingly small, under 1500 words for the biggest dictionaries we have analyzed so far. The grounding words tend to be learned earlier, more frequent and more concrete than the rest of the words in the dictionary. We think this may also provide some clues about the evolutionary origin of language as well as its adaptive function: Language is what allowed our species to acquire infinitely more new categories than any other species, and far more quickly and safely, by combining the names of the already grounded ones into definitions or descriptions of new ones, conveyed by those who already know the new category to those who do not. It is also what made science possible -- and it is also what led to Open Access. If 300,000 years ago we had “charged” one another a toll for access to information about new categories, language would never have evolved. (Nor would money!)
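As a purely illustrative aside, the question posed above can be sketched on a toy example. The Python below is a minimal sketch under stated assumptions, not the method or data used in the research described: the eight-word `TOY_DICTIONARY` and the function names are hypothetical, and the brute-force search only works for tiny dictionaries. It treats a word as learnable once every word in its definition is already known, and looks for a smallest set of directly grounded words from which all the rest can be learned by definition alone.

```python
from itertools import combinations

# Hypothetical toy dictionary: each word maps to the words used in its definition.
# It contains three definitional cycles: thing/object, not/no, alive/dead.
TOY_DICTIONARY = {
    "thing": {"object"},
    "object": {"thing"},
    "not": {"no"},
    "no": {"not"},
    "alive": {"not", "dead"},
    "dead": {"not", "alive"},
    "animal": {"alive", "thing"},
    "dog": {"animal"},
}


def learnable_from(grounded, dictionary):
    """Return every word that is grounded or learnable from definitions alone."""
    known = set(grounded)
    changed = True
    while changed:
        changed = False
        for word, definition in dictionary.items():
            # A word becomes known once all its defining words are known.
            if word not in known and definition <= known:
                known.add(word)
                changed = True
    return known


def is_grounding_set(grounded, dictionary):
    """True if every dictionary word can be reached from the grounded words."""
    return learnable_from(grounded, dictionary) == set(dictionary)


def smallest_grounding_set(dictionary):
    """Brute-force search for a smallest grounding set (toy sizes only)."""
    words = sorted(dictionary)
    for size in range(len(words) + 1):
        for candidate in combinations(words, size):
            if is_grounding_set(candidate, dictionary):
                return set(candidate)
    return set(words)


if __name__ == "__main__":
    print(is_grounding_set({"thing", "not"}, TOY_DICTIONARY))
    print(smallest_grounding_set(TOY_DICTIONARY))
```

In this toy dictionary one word from each definitional cycle has to be grounded directly, so three grounded words are enough to reach all eight. Exhaustive search of this kind does not scale to real dictionaries, which is one reason the actual analyses rely on graph-theoretic techniques.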
The connections between my research on the two ways of acquiring categories and the need for open access were mapped out in an interview with Richard Poynder a decade ago.
Poynder, R. & Harnad S. (2007) From Glottogenesis to the Category Commons. The Basement Interviews.
Blondin-Massé, A., Harnad, S., Picard, O. & St-Louis, B. (2013) Symbol Grounding and the Origin of Language: From Show to Tell. In: Lefebvre C, Comrie B & Cohen H (Eds.) Current Perspective on the Origins of Language, Benjamin.
Turing Test: effective or not? Does written communication on unspecified subjects have to involve processes we could safely call cognitive, or will chatbots stay what they are today? And today they are apparently much like chess-playing algorithms: essentially a set of clever heuristic rules and vast libraries of optimal move sequences. ELIZA wasn’t even that (meaning it had no library and the heuristic rules weren’t that clever) and still it fooled quite a few human testers.
The Turing Test is not a 10-minute chatbot test. Nor is it about "fooling" anyone. It is a scientific attempt to reverse-engineer cognition: to discover its underlying causal mechanisms. Turing's criterion is performance capacity. The model has to be able to do anything and everything a normal human can do, indistinguishably from a human (for a lifetime, if need be). Turing's insight is that if the mechanism can do everything we can do, indistinguishably from any of us, then we have no better or worse reason for affirming or denying that it has a mind than we have for affirming or denying it of any of us.
But the Turing Test comes at several levels. The best-known one, "T2," is Turing-indistinguishable verbal capacity (tested via email only). But we have many other capacities, and our verbal capacities are almost certainly grounded in them, as I described above: "T3" requires Turing-indistinguishability not just in verbal capacity, but in the capacity to interact with the world of things that words refer to. Hence T3 is Turing-indistinguishability in robotic (sensorimotor) capacity. (One can also require T4, Turing-indistinguishability in neural activity inside the head, but this is probably needlessly over-demanding.)
Harnad, S. (1992) The Turing Test Is Not A Trick: Turing Indistinguishability Is A Scientific Criterion. SIGART Bulletin 3(4) (October 1992) pp. 9 - 10.
Harnad, S. (2014) Turing Testing and the Game of Life: Cognitive science is about designing lifelong performance capacity not short-term fooling. LSE Impact Blog 6/10 June 10 2014.
You acknowledge Turing’s statement that T2 and T3 are exactly what we do with other human beings in order to know whether they have minds or not. Yet you say that the test lasts for an entire lifetime, if needed. Does it mean we can’t really be sure if other humans have minds? Is assuming another person’s rationality just a courteous convention?
No, the lifetime capacity is just to rule out short-term tricks that really do just fool people. The Turing Test has two criteria: The first is that the model has to have our full performance capacity; the second is that we cannot tell it apart from a real person exercising that capacity. People can be fooled in the short-term, so it's important that neither the test nor the capacity be just short-term. But in practice I think that any robot that could interact with us (and the world) indistinguishably for a few days would probably be able to do it for a lifetime (i.e., it would probably have our full capacity).
In his "Return from the Stars" Stanisław Lem briefly depicts a robot cemetery. Various dysfunctional "automatons" (as he called them) are waiting in a kind of giant hangar to be melted down into recyclable materials. The process is fully automated and supervised only by other robots. When the protagonist (an astronaut who has come back from a 10-year voyage to Fomalhaut and, due to time dilation, faces a brave new world 127 years later back on Earth) strays into the hall, he witnesses a wide range of the dysfunctional robots' behaviours. One appears to be particularly resourceful and, in order to avoid being melted down, desperately poses as a man wrongfully taken to be a machine. Another one is apparently praying.
You're a cognitivist and much of your research has been devoted to problems of Artificial Intelligence. If AI showed such behaviour as portrayed by Lem, would you fight for the rights of robots as well as for the rights of animals? Even if, after all, Searle was right and they all turned out to be just "Searle's Chinese rooms"?
Yes, if ever we have robots that are T3-indistinguishable from us I would conclude that they are, like us, sentient, and I would fight for their right to be free of needless human-inflicted suffering.
But are you not aware of the irony of segueing into speculations about science fiction when there is a stark reality very much like this one that is transpiring, unopposed, everywhere, at every moment, as we speak, not with robots but with living, feeling members of species other than our own?
Harnad, S. (2014) Animal pain and human pleasure: ethical dilemmas outside the classroom. LSE Impact Blog 6/13 June 13 2014.
In the Cambridge Declaration on Consciousness the mirror test is mentioned as an important way to distinguish between certain classes of animals according to their intelligence. As a cognitivist, what can you tell a layman about those animals that “pass” the mirror test?
It is actually trivial to design a robot that can pass the mirror test (locate and manipulate parts of its body using the mirror) and it certainly does not mean that the robot is conscious. To be “conscious,” by the way, means to feel. And even animals that don't recognize themselves in the mirror feel (i.e., they are sentient). So I consider the mirror test as just a test of some higher-order cognitive capacities. The real question is whether an entity feels. There is as much evidence that other mammals and other higher vertebrates feel as there is that preverbal children feel. And almost as much evidence that lower vertebrates and invertebrates feel.
Harnad, Stevan (2016) Animal sentience: The other-minds problem Animal Sentience 2016.001.
Can AI play any significant role in Open Science? Or maybe it is playing one now?
It could, if we had open science (but we don't!). AI and deep learning are already being applied to data-mining the tiny fragment of the scientific corpus that is online and open, but there is much more to come -- once we have OA.
Dror, I. and Harnad, S. (2009) Offloading Cognition onto Cognitive Technology. In Dror, I. and Harnad, S. (Eds) (2009): Cognition Distributed: How Cognitive Technology Extends Our Minds. Amsterdam: John Benjamins.
Now the other way round. Can Open Science (or maybe some particular aspect of it; Open Data for example) play a significant role in AI development?
Of course it could -- just as it could play a significant role in the development of any area of science. But for that you need open science. And for Open Science, we first need OA...
So instead of dreaming about the potential benefits of OS, we should first grasp the Green Gratis OA that has long been within our reach.
Without being too specific we might say we are witnessing a certain decline in liberal democratic trends all over the world. We could even speak of a crisis of the liberal democratic model. Can this situation influence the way the current science system works? And if yes, then to what degree?
The most flagrant example of this among the liberal western democracies is transpiring right now, in the heart of the EU, in my country of birth, Hungary. (Your own country, Poland, alas looks like the next one to follow suit.)
The current Hungarian regime's first attempt at an assault on science in 2011 failed, fortunately, but it's a fair harbinger of what is in store for science and scientists if the anti-democratic regimes' assault on democracy and human rights is not successfully resisted.
Nevertheless, in the end you didn't wholly abandon Open Access. You are currently engaged in scientometrics of Open Access publications. Could you make our readers more familiar with this branch of knowledge? Is there something you learned lately in this area which might change your view on Open Access?
Metrics has its proponents and its detractors. But if you think it through, what Bradley said of metaphysics -- "the man who is ready to prove that metaphysics is wholly impossible... is a brother metaphysician with a rival theory" -- is just as true of metrics: Metrics just means measures. Academics don't like having their performance evaluated by metrics like publication counts or citation counts (they don't like being evaluated at all) but we can only gain if we enrich our repertoire of metrics as well as validating their predictive power against a face-valid criterion -- which in the case of research evaluation might be peer rankings (another metric!).
The OA corpus offers the potential for measuring and validating many new metrics, field by field, including: (1) download counts, (2) chronometrics (growth- and decay-rate parameters for citations and downloads), (3) Google PageRank-like citation counts, recursively weighted (citations from highly cited articles or authors get higher weights), (4) co-citation analysis, (5) hub/authority metrics, (6) endogamy/exogamy metrics (narrowness/width of citations across co-authors, authors and fields), (7) text-overlap and other semiometric measures, (8) prior research funding levels, doctoral student counts, etc.
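To make item (3) in the list above concrete, the following Python is a minimal sketch of a PageRank-style, recursively weighted citation score. It is a generic illustration, not the metric as implemented in any particular study: the citation graph is made up, the function names are hypothetical, and the damping factor of 0.85 is simply the conventional PageRank choice, assumed here.

```python
def raw_citation_counts(citations):
    """Count incoming citations; `citations` maps each paper to the papers it cites."""
    counts = {}
    for citing, cited_list in citations.items():
        counts.setdefault(citing, 0)
        for cited in cited_list:
            counts[cited] = counts.get(cited, 0) + 1
    return counts


def weighted_citation_scores(citations, damping=0.85, iterations=50):
    """PageRank-style scores: a citation counts for more if the citing paper scores highly."""
    papers = sorted(set(citations) | {c for cited in citations.values() for c in cited})
    n = len(papers)
    score = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Every paper receives a small baseline share on each round.
        new_score = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            cited_list = citations.get(p, [])
            if cited_list:
                # A paper passes its current score on, split among the papers it cites.
                share = damping * score[p] / len(cited_list)
                for cited in cited_list:
                    new_score[cited] += share
            else:
                # A paper that cites nothing spreads its score evenly over all papers.
                share = damping * score[p] / n
                for q in papers:
                    new_score[q] += share
        score = new_score
    return score


# Hypothetical citation graph: P1 and P2 each receive exactly one citation,
# but P1's citation comes from "hub", which is itself cited three times.
TOY_CITATIONS = {
    "x": ["hub"],
    "y": ["hub"],
    "z": ["hub"],
    "hub": ["P1"],
    "obscure": ["P2"],
}

if __name__ == "__main__":
    print(raw_citation_counts(TOY_CITATIONS))
    print(weighted_citation_scores(TOY_CITATIONS))
```

On this toy graph the raw citation counts of P1 and P2 are identical, while the recursively weighted score ranks P1 above P2 because the paper citing it is itself well cited.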
But for all this, we need one thing first: Universal Green OA.
Let's say I haven't abandoned OA. I've just had my say (many times over) and am now waiting patiently for the global research community to get its act together.
Harnad, S. (2008) Validating Research Performance Metrics Against Peer Rankings. Ethics in Science and Environmental Politics 8 (11) doi:10.3354/esep00088 The Use And Misuse Of Bibliometric Indices In Evaluating Scholarly Performance.
You have lately devoted yourself to the fight for animals' rights. Can I ask you about the philosophico-ethical background of this part of your activities? What's your main argument for the animal rights?
The only animal "right" for which I am fighting is the right of sentient organisms to be free of needless human-inflicted suffering. And it is not an abstract philosophical issue but the greatest crime of humanity. (Humanity's greatest crime against humanity was the Holocaust. But its greatest crime tout court is the Eternal Treblinka we are inflicting on nonhuman animals.)
Why is it the greatest crime? Because we do it even though (at least in the developed world today) the horrors we inflict on animals are necessary neither for our survival nor for our health. And they are indeed horrors: indescribable, unspeakable, unpardonable horrors.
There is no horror that we inflict on nonhuman animals that we have not also inflicted on humans. But the fundamental difference is that we have decided that inflicting these horrors on humans is wrong, we have made them illegal, and all but sociopaths and sadists would never dream of inflicting them on people -- whereas inflicting them on animals is not only legal, but most of the human population acquiesces and collaborates in inflicting them, by demanding and sustaining the resulting products.
The only hope for the countless tragic victims of this crime of crimes is that the decent majority, once it is made aware of two fundamental facts -- (1) the true horror of what it entails and (2) that it is totally unnecessary -- will realize that it is wrong, just as it did with rape, murder, violence and slavery, and will renounce and outlaw it, just as it did with rape, murder, violence and slavery.
Rather than continuing to bang on about OA (which is a foregone conclusion in any case, and only a matter of time), I want to devote my efforts to hastening the end of this monstrous animal agony, inflicted needlessly by humans, and far more urgent (for the victims). Ironically, part of the solution here turns out to be an open access matter too: CCTV cameras, videotaping the horrors in the slaughterhouses and web-streaming the evidence openly online, for public inspection through crowd-sourcing.
Harnad, S. (2015a) To Close Slaughterhouses We Must Open People's Hearts. HuffPost Impact Canada July 2 2015.
Harnad, S. (2015b) Taste and Torment: Why I Am Not a Carnivore. Québec Humaniste.
Patterson, C. (2002). Eternal Treblinka: Our treatment of animals and the Holocaust. Lantern Books.