It All Leads to Open Science


Maciej Chojnowski: My guest today is Professor Włodzisław Duch: a physicist, cognitivist and Under-Secretary of State in the Ministry of Science and Higher Education. Good morning, Minister.

Włodzisław Duch: Good morning.

Minister – in the repository run by Nicolaus Copernicus University in Toruń, your home university, there are over 20 publications that are either authored or co-authored by you, and are now available in Open Access. But the Nicolaus Copernicus University (UMK) does not have any regulations that would oblige researchers to deposit their works in an institutional repository. Why did you decide to use this opportunity yourself?

There is some encouragement coming from the rector's office, so I think the regulations will be passed, eventually. I have been sharing my works for a very long time. My first contact with foreign science, which allowed me to co-author a work with a colleague from Copenhagen, was in 1986... so that's ancient history. Since then I have come to appreciate the advantages of an Internet connection and the opportunities that come from making our works freely accessible online – both in terms of availability and advertising. That's why we set up our very first web server as early as 1993. I was working on a certain European project, and even then websites were the best way of disseminating scientific works. Since that time I have set up my own account and I've been opening every research output I could. There are works that are not copyrighted, various notes, presentations from lectures, etc. Many people started to contact me, asking questions, but also proposing cooperation. Many European projects that we were invited to happened precisely because somebody noticed those publications. I've been working mostly in computer science for the past 20 years. In that area, most publications are conference papers. They are not as easily accessible as journal papers. So if we want our work to be read – not just stand on the shelf and look pretty – we have to advertise that work. Then we can hope that someone will want to read it, that we're not writing it in vain.

So, one could say that you are one of the pioneers of Open Access in Poland?

At some point most of the Internet traffic at the UMK was generated by my website. That has changed now, because almost everybody has their own website. But in the 90s, even the mid-90s, very few people would make use of that.

I would like to address the Ministry's policy concerning Open Access. In 2012 the then Minister of Science and Higher Education, Barbara Kudrycka, declared that at the turn of 2015/2016 60 percent of all research funded from public money is to be openly accessible. Also, in July last year the Conference of Rectors of Academic Schools in Poland and the Presidium of the Polish Academy of Sciences collectively supported the idea by publishing a statement in which Open Access was declared as the right solution, requiring further regulations, both at the political and infrastructural level. And now, in August 2014, I would like to ask you: what steps does the Ministry intend to take to make these regulations happen? And another question: do you believe those declarations from two years ago can be fulfilled?

It is not an easy task... or at least more difficult than we'd thought. I have been with the Ministry for a relatively short time – since April – but as soon as I got here, I started investigating what was happening with those declarations. And so, we made our first step – we distributed a certain document proposing how to implement those ideas, with a specific schedule. It was sent out for public consultations, and in September we will hold a conference to summarize all the feedback for this document. There is bound to be a lot of criticism, some things might be difficult to carry out, but we still believe we can confirm this schedule and implement it in a timely manner. For instance, can we force the grantees to put everything in Open Access? It is all a question of costs... and, of course, there are different forms of Open Access.

Repositories store initial versions of publications, which often are not subject to copyright – there are no issues here. However, in science we try to uphold a certain level of quality. It can be provided by journals, which can operate in either a closed or an open way. Currently, the good Open Access ones charge quite a lot for each publication. However, we did make a step forward by setting up a Virtual Library of Science, which is publicly accessible and allows users to view all kinds of scientific journals published by Nature Publishing, Elsevier, Wiley or Springer. In some cases, like with Springer, we negotiated an opportunity for Polish authors to publish for free in Open Access in the journals whose licences we bought. So far, so good, right? Everybody should make sure that their publication is available to the widest possible audience.

With other journals, those from Nature Publishing or Elsevier, the publication costs would be considerable. Just look at the costs of library subscriptions. Many universities or research units try to save money here, because they have access to publications through the Virtual Library of Science. We can buy either national or consortium licences – the latter only if the number of recipients is relatively small. Now, if we obliged everyone to publish in Open Access, it would be necessary to pump a lot of money into grants designed for publication. That is not really possible, because there are no additional funds. However, we can still lobby for open publication through our evaluation system, directed either at units or at individual researchers, encouraging them to make their work as widely accessible as possible. When you put your works in Open Access, the chance of being cited is higher. If we take citations and accessibility into account in our evaluation, we will have a soft point of pressure. Can we gather enough funds in next year's budget to support publishing in highly ranked Open Access journals? We'll see.

I have an idea which was not yet included in the document we sent out. The idea is that we ourselves should create some platforms that would function in connection with the journals. Platforms that will publish scientific books. There is a lot to do here, especially in the humanities and social sciences, because social and historical publications tend to have a very limited circulation. And if they're published on paper only, they end up in some libraries, and few people ever have a chance to reach them. And so, the money seeps through the cracks – there isn't much return we get from that. Polish authors are under-appreciated because their works are not easily accessible. The idea is to prevent this by creating repositories – and we have a couple of those – to share books, publications, and journals. In Poland we have about 2,000 journals and, according to Wikipedia, this includes 121 historical journals. It is a great number, but it does not influence the general view of Polish science. There are people – very competent people – who see Polish science as weak. They complain about it being somewhere past the four- or five-hundredth mark in the ranking. They are often professors at prestigious universities. And when you ask them: “What did you do to put your university higher in the ranking?”, they have no answer. Their works are stored on shelves, in libraries, well-hidden. No one can index them properly, so the chance for the university to go up in the ranking is virtually nil. So we need to look at ourselves first and answer: “What did I do for my university to perform better in the ranking?”.

To do better, you need to have a more efficient platform. We will have this opportunity – on one hand, there are repositories; on the other, we would like to have open journals that will verify their content in terms of quality and will slowly gain recognition. The idea of a journal platform is developing – there are several such initiatives. The biggest one is part of the SYNAT and Open Science projects run by the Interdisciplinary Centre for Mathematical and Computational Modelling (ICM) in Warsaw. ICM is involved with supporting and creating such repositories and platforms. There are over 400 journals that agreed for their content to be published there. Some of those journals are private, which causes copyright and administrative disputes, but some journals are already on this platform. There are several universities running their own platforms. UMK in Toruń currently hosts 46 of its own and 40 external journals. The Jagiellonian University has several dozen journals, and so does the Polish Academy of Sciences (PAN). A consortium of research institutes from PAN created the RCIN repository, which already cooperates with about 40 institutes. It is a massive-scale movement. A lot of journals are being scanned. From what I've heard, the National Library is already scanning 1,100 out of the total number of 1,900 Polish scholarly journals. So the information does get out into public circulation. We expect this to give us a number of advantages, including better benchmarks to measure the quality of journals. And in our evaluations of the units' scientific activity, we would be able to create meaningful indices and set a scoring system that would measure how often a journal is cited and how it is viewed. But most importantly – the authors will gain recognition. Maybe new citations will appear and everything will go in the right direction.

We may even start thinking about transferring the journals themselves – I have already talked to representatives of Elsevier and they are willing to transfer our best journals from our own open platforms to the ScienceDirect platform. Not for free, of course, but I think that costs will not be excessive, and it could help some of our journals to become visible at an international level. Now for the other possibility: over the last dozen years we saw the emergence of mega-journals such as PLoS ONE. There are at least seven consortia that include scientific publications in their portfolios. They create platforms whose operating model is very different. It is not a typical journal with an editorial board – it is a journal which only decides whether a given work makes sense and is worth accepting. Evaluation of the work itself is performed in the course of the discussion that occurs after it appears on the platform. This completely changes the paradigm of science evaluation. The method might not be popular in Poland – yet – but PLoS ONE is a highly valued platform, publishing works from many different areas. Works published there are cited often, so I hope that would be the direction we can go. Creating our own platforms, helping to set up such mega-platforms in the future – this requires a wider discussion in the academic circles. I hope it could begin in September, when we receive the critical feedback concerning our proposals. We are determined to strengthen the position of our science and make this step forward. I hope we can achieve a lot during the next year, and that it won't be all promises.

Minister, can we please talk more about the division you have mentioned? I mean the division into green Open Access (repositories) and gold Open Access (open journals). You talked about platforms which would serve as a centre for sharing different journals, but I would like to speak of the repositories again. There are several internationally renowned repositories – like arXiv or PubMed Central. Does the Ministry's strategy differentiate between these two roads of Open Access, or will one of them be somehow privileged?

We are not trying to grant privilege to anything. arXiv is always the starting point for a discussion. After that, proper publications appear and are verified by specialists who engage in such discussions. I imagine that our own repositories should also serve as a basis for discussion. This method of evaluating publications will keep changing, but the criteria right now are clear – we award points for the quality of journals in which papers are published. It would be very hard to change, saying: “go and publish in repositories, everyone, anywhere”. A verification needs to occur so we can tell good science from bad science. And judging by experience with our academic staff, generating a large number of citations in a repository is pretty easy. It would be more difficult to be cited in Nature, for example. But we are not really counting on it. I think a repository should be the first step towards a real publication – in a journal. The journal can be traditional – which limits the number of recipients – or virtual, open and easily accessible. Right now, thanks to national licences, we have a sort of Open Access, for which the Ministry pays a hefty sum. The academic circles don't always want to remember that. Right now, the biggest single item in our budget is the price that allows everyone to click and download for example Nature publications. Downloading a single work costs almost 50 PLN, so we are really pumping money into this. But there are many countries that don't have this access. Yesterday I met with the minister from Ukraine and their delegation complained about a lack of such access. They didn't invest in a similar open library. For them, Open Access really is a blessing. It allows them to reach any publication that's in there. I don't think we'd put more pressure on open repositories than journals. Both solutions are developing. We shall see what happens after the September conference.

So, you are hopeful, then? I wanted to ask you about the reception of Open Access in the academic circles in Poland. On one hand, there are EU recommendations which seem a bit like bureaucratic nagging. Researchers work elsewhere, and may not want to implement the ideas of civil servants. On the other hand, you are a prime example of how opening one's own publications can bring advantages, and not just abstract ones, like intangible social profits on a very wide scale, either economic or academic, but very concrete opportunities for the researcher. The researcher's name and academic background become ever more recognised, and the citation index increases. If we apply specially designed mechanisms to make it all easier, it would completely change the landscape of science, both in terms of access to publications and in terms of evaluation. Do you think Polish scientists are aware of those potential advantages?

It depends on the person. On one hand, there are areas in science which have no problems with that, and researchers already do publish in the best journals, which are properly indexed. The opposite applies mostly to social sciences and humanities. Researchers there are often not used to this, and mostly publish in book form. Their academic circle is pretty closed, watertight. As a result, we have no idea what is happening, how many people read those works, etc. Lately, I was rather shocked when I heard one of our most prominent philosophers, from one of the best universities, speak on the subject. When he heard that his book was to be published in electronic form, he immediately withdrew his consent. So now, I'm asking – does he only want the book to look good on a shelf? Or would he like people to read and discuss it? Because if it's the latter, then the book's form does matter. A printed text would not be read by quite as many people.

I'm interested in a lot of research myself – sociology, history, etc. I would gladly read a lot of papers, but I have no time to go to a library. Look at what's happening at universities – very rarely do you see a young researcher in a library. I remember an older professor from New Zealand who used to work in our institute for a dozen years. He always talked about the importance of paper and libraries. Myself, I have never seen him in a library, he was forever in front of a screen reading from arXiv. So – people say one thing and then do the other. They read from the screen, because it's convenient. I read everything on my tablet, because it's convenient and because I wouldn't be able to carry that many books and papers. My shelves are still crammed with books and photocopies. Even if I have something there, finding it on the net is still faster. The speed of reaching information, the accessibility – that's the main difference. I think even the people who have great barriers can still see that it all leads to open science.

Moreover, there are several very far-reaching ideas concerning social sciences and humanities. There are large European projects, such as DARIAH. The idea is to create a common, European digital space for the arts and humanities. There are a lot of interdisciplinary projects. People tend to get wrapped up in their own specialisations, while the opposite should be true. We should cooperate, drawing on various sources, texts, video documents, audio documents, etc. Various things in electronic form, which we cannot have in print. They will allow specialists to work together, creating new quality. An initiative group in Poland has recently written a paper on how those platforms for social sciences and humanities could help to integrate all science areas at the European level. We really have much to be proud of, but cannot express it properly. There are foreigners writing on the history of Poland, because our history books are not published in foreign languages. The key goal is to share this heritage. If we start creating open journals, they can be written in Polish, but they should at least have abstracts in the major international languages, especially English, so they would be indexed properly.

The information on our science, and whatever is happening with it, should be widely available. Good search engines are the key. There are a number of ideas on how to approach this. One of the leading American universities, Rice University, had a project called Connexions. The idea was that people would write modules – like a single lecture – put them into a repository, and then anyone could use them, for instance by putting them in a course book. It became a huge success – books are printed on demand. It is actually very popular now. This could revolutionize the whole approach to creating textbooks, journals and specialized works. But the key is a good search system that scales decently. It is easy to put a hundred works in a repository, but when the hundred becomes tens of thousands – then organisation is the key.

Open Access, in its basic understanding, is about abolishing the technical and financial barriers. So, at first glance, it's about making publications freely accessible on the Internet. But it is a very basic understanding of this concept. It can also be understood as abolishing legal barriers. What is it really about? When the legal barriers are removed, it becomes possible not only to absorb a given material passively, but also to reuse it, as we see fit. Text A could be included in a set of texts that are then automatically processed – what we call “text mining”. It could also be used for educational purposes. There are no limitations resulting from traditionally understood copyrights. Now I would like to ask this: does the Ministry consider both of these aspects – traditionally termed “gratis Open Access” and “libre Open Access” – equally important?

The Ministry cannot tell the academic circles what to do. We have to discuss it with a wider audience to gauge the mood. I think it is good to take a step forward based on existing materials. In some areas, before real-time communication, people kept doing the same things over and over. It all went in circles, instead of going forward. The idea is to use the material available. As Isaac Newton put it: “If I have seen further it is by standing on the shoulders of giants”. The “giants” might be an exaggeration, but we do use a lot of old material. For instance, the pictures I find on the web can later be used in publications. Quite recently I wanted to use a neurobiology atlas with pictures of the brain. I wrote to the authors, and they told me that each slide I used in my presentation would cost me $90. A single slide! This one lecture would have cost me thousands because I needed a lot of pictures. On the other hand, freely shared materials such as pictures are pretty common and make our life easier. People upload their video lectures, share their illustrations, etc. We build upon what we have and this gives us progress. It is sometimes a mixed blessing. I get a great idea, try to check it out, and learn that somebody else has already done it. In the past, I would have started to develop it myself and probably lost years of work. It's better to know immediately.

I think both roads have advantages, but sharing everything causes intellectual property to become blurred. I'm not that attached to my ideas, I have too many to keep track of. Sometimes I would toss some ideas out there, forget about them, and later discover people developing them, not remembering where they came from. Tough, I say. We are working for the common goal, not just for our personal credit. But I know that approaches tend to differ. People who are still developing their careers would like to get credit for their creative effort. So, I think the limited access – gratis Open Access – would also be popular. But everybody should specify how much of their work can be reused by others. But then the libre OA gives an opportunity to reuse works, with full credit to the author. There are benefits to this too, mainly citations. We toss ideas, someone develops them, and our pioneer work resurfaces. But then the citations stop, because the works become so advanced. Nobody reaches back to the original sources. There are people who are wrongly considered pioneers – and they aren't, it's just that nobody investigated far enough.

I think we can talk about Open Access as part of a wider initiative that is Open Science. In this area there is talk not only of OA articles, but also of open data, open peer review or even so-called “open notebook science”. While open data is ever more present in science funding, like in the pilot programme in Horizon 2020, the other two ideas – open peer review and open notebook science – are not widely practiced. How do you evaluate the chances of success for these initiatives?

I think the road will be long and rocky before they really catch on. It is discussed on many forums where data is collected. Open data is very useful if you want to hold a contest. A couple of years ago I held a contest myself. It was about eye movement analysis, conducted on people with various illnesses. The data was collected by psychiatrists, and we held a very interesting contest. The people analysing those signals were able to extract very compelling diagnostic information. But data processing is very tedious. Another thing we did with our colleagues from the US was to prepare hospital discharge summaries, which could be generated automatically, so the patient would know what had been done to them, and how much it had cost. Three companies made their own annotations and we prepared a final version. Even processing textual data is very labour-intensive. The data on neuroimaging is not easily shared. Every piece of equipment is specific – sometimes an electrode does not make proper contact, or a channel does not work.

There is a lot of talk on how this data should be shared. The idea of big data is becoming ever more popular. The problem is that most researchers are quite egoistical, they only ask: “What do I get out of this?”. If I create some data, then I get cited often. Now we have whole repositories for training algorithms, machine learning. A whole set of data is shared and it may contain some sort of information – perhaps, how to tell two illnesses apart. Two sorts of cancer, etc. Now, people will try to analyse this data, someone will have a good idea. A lot of people try, and the folks who collected the data get cited. There are some benefits, yes, but sometimes it's terribly arduous to prepare the data. So the whole issue is moving forward rather slowly. When it comes to, say, electroencephalography, in Warsaw, there is a group in neuroinformatics who have been doing this for years. Some of the data comes from contests, but not much. Accessing data is difficult for them – they need to know what equipment was used, was there much noise, artefacts, other things. But in broadly understood science, where things can be done together... If it's completely open, it might cause some trouble. Many people do not show their findings until they're published, because in some areas there's a lot of competition. Elsewhere, they're a bit more willing to share. But with areas involving innovation, the solutions close to being implemented by the industry... The industry would not like them to be distributed. In drug research, there is really fierce competition and all the findings are confidential. Even the authorities who later allow certain drugs to be put on the market have trouble getting full information on them, because drug companies are so determined to protect data from competitors.

There are a lot of issues stemming from intellectual property rights, but there is progress in this area. People are trying to cooperate. In 1994 I witnessed the final stage of the Fifth Generation project – large, intelligent computers made by the Japanese. And the conclusion was that the era of big projects, where a couple of companies create their own institute, must end, and that virtual, distributed research groups which cooperate using the right tools will do the same job more efficiently, without building any central structures. In our European projects, there is talk of opening virtual research institutes – maybe in Poland as well. The Foundation for Polish Science is engaged with this, and maybe the groups working in such units will use open notebooks. But they will more likely use them within their closed circuit. Unless it is research requiring wide cooperation – then we'll try to open it as much as we can. There is room for improvement on both sides.

I would like to ask you one more question about infrastructure. Implementing Open Access is not only a question of passing regulations, is it? It is vital to provide an infrastructure that will allow researchers to work efficiently. Real action steps. Moreover, they should not only be consistent at a national, but also European, or even global level. So – how do you see the current status of this infrastructure in Poland right now? What needs to be done? What has been done?

Certainly a lot. There are a lot of projects which need coordinating, and that's a job for the Ministry. Many universities create repositories without agreeing on standards, and that may cause trouble later. It is an ailment of all big IT projects: everyone wants to have things immediately, and the end products are not always compatible. But we are building these systems on a larger and larger scale. We are building central systems, like POL-on, which contain information on all publications and what's happening at universities, what equipment is used. There are large data repositories appearing that are to be maintained by the Ministry of Administration and Digitization. There are numerous platforms for repositories and journals, and we will try to encourage them to abide by certain standards. The biggest endeavour is SYNAT, a project coordinated by the Interdisciplinary Centre for Mathematical and Computational Modelling. I hope it can be used to achieve much on a central level, and the local efforts can be integrated top-down. It will require a lot of agreements between creators of local and central platforms. Luckily, there aren't that many. But the numbers are growing, and there are the European projects for the social sciences and humanities, such as the DARIAH infrastructure.

The infrastructure will need to be pretty advanced, because its functionality is very innovative. It not only provides tools for cooperation and access to indexed and scanned sources, but also opportunities to do science. Like what Google did with culturomics. It's very interesting – with so many scanned resources, you can follow a lot of trends. When did certain phrases originate, how did they change, etc. It's very useful to linguists and historians. Broadly speaking – wide access to information appeared, because a lot of books had been scanned. And this includes old books, which are no longer copyrighted. It caused a completely new field of study to appear – a discipline called “culturomics”. The Institute of Literary Research of the Polish Academy of Sciences created its own Digital Humanities Centre – I think digital humanities have a great opportunity to develop now. It's not only repositories, there is also automatic analysis of various links and connections.

There are new opportunities that emerge thanks to artificial intelligence systems, such as speech-to-text tools that allow radio broadcasts, TV programmes, etc. to be archived. Also: new possibilities of analysing natural language, like the CLARIN project carried out at the Wrocław University of Technology. There are different ideas and projects in different circles, but if they are to cause some deeper change, they should be consistent with one another. I hope that the Centre for Open Science will play a leading part here, helping to integrate those local projects.

To wrap up, I would like to ask you about Open Access in a global context. Open Access is not only limited to Europe or North America. In every part of the globe, there is at least one institution that implements Open Access. There are many solutions – different countries adopt different policies and do not always agree on how particular solutions should be implemented on a national level. It is different for Denmark and, say, Argentina. Does any of the international solutions appeal to you in particular?

There is a series of European endeavours on Open Access, such as PASTEUR4OA (Open Access Policy Alignment Strategies for European Union Research). The project only started in February this year, and it is to last for just three years. There is also SPARC, which works for Open Access in Europe, and we even have our own representative there. It's Bożena Bednarek-Michalska, Deputy Director of the UMK library; she's very active in the Open Science domain and organises a lot of events. There are many ideas on how to unify the open structure on a large scale. We, as the Ministry, try to allocate our funds rationally, directing them for the maximum benefit of Polish science and society in general. But we are not omniscient – these issues need to be discussed by the academic circles, not just in Poland, but within all those other organisations. I hope we can reach a consensus and start building common platforms. Without them, everyone will end up with something incompatible. Then it would be hard to build a search system that would encompass it all.

Minister, I would like to thank you for a very interesting conversation.

I would like to thank you too.


The interview was held in Warsaw, 28th August 2014.
