Open Access to Peer-Reviewed Research through Author/Institution Self-Archiving: Maximizing Research Impact by Maximizing Online Access
S Harnad
Chaire de Recherche du Canada, Centre de Neuroscience de la Cognition (CNC), Universite du Quebec a Montreal, Montreal, Quebec, H3C 3P8, Canada
The optimal conditions for researchers are:
• Online availability of the entire full-text refereed research corpus
• Availability on every researcher's desktop, everywhere, 24 hours a day
• Interlinking of all papers and citations
• Fully searchable, navigable, retrievable, impact-rankable research papers
• For free, for all, forever
All of this will come to pass. The real question is “How soon?” Will we still be compos mentis and fit to benefit from it, or will it only be for the Napster generation? Future historians, posterity, and our own still-born potential scholarly impact are already poised to chide us in hindsight. What can the research community do to hasten the inevitable arrival of these optimal conditions? Here are some recent concepts that may help.
During the transition from the Gutenberg (on-paper) to the Post-Gutenberg (online) era, several changes have occurred in scientific and scholarly publication. We have to take note of five critical distinctions:
• Distinguish the non-give-away literature from the give-away literature: This is the most important Post-Gutenberg distinction of all. It is what makes this small, refereed research literature anomalous (~24,000 refereed journals, ~2,500,000 articles annually) - fundamentally unlike the bulk of the written literature. Its authors do not seek, nor do they receive, royalties or fees for their writings. The only thing these authors seek is research “impact”, which comes from accessing the eyes and minds of all potentially interested fellow-researchers, so that they can read, use, cite, apply, and build upon their work.
• Distinguish income (arising from article sales) from impact (arising from article use): Unlike all other authors, researchers derive their income not from the sale of their research reports but from the scholarly/scientific impact of their reported findings: how much they are read, used, cited, applied and built upon by other researchers. Hence all toll-based access-barriers are impact-barriers, and hence income-barriers, for research and researchers, restricting their potential impact to only those research institutions that can and do pay the access-tolls.
• Distinguish between copyright protection against theft-of-authorship (plagiarism) and copyright protection against theft-of-text (piracy): Copyright law offers protection from plagiarism, which is a matter of concern for both “non-give-away” and “give-away” authors. In contrast, theft of text (piracy) does not concern “give-away” authors, but “non-give-away” authors would like to prevent it. Copyright laws offer hardly any protection from piracy.
• Distinguish self-publishing (vanity press) from self-archiving (of published, refereed research): The essential difference between unrefereed research and refereed research is quality control (peer review) and its certification (by an established peer-reviewed journal of known quality). Although researchers have always wished to give away their peer-reviewed research findings, they still wish them to be peer-reviewed, revised (if necessary), and then certified as having met established quality standards. The self-archiving of refereed research should in no way be confused with self-publishing, for it includes, as its most important component, the online self-archiving, free for all, of peer-reviewed, published research papers.
• Distinguish unrefereed preprints from refereed postprints: Eprint archives (“eprints” = preprints + postprints), consisting of research papers self-archived online by their authors, are not, and have never been, merely “preprint archives” for unrefereed research. Authors can self-archive therein all the embryological stages of the research they wish to report, from the pre-refereeing preprint, through successive revisions, to the peer-reviewed, journal-certified postprint. These could be complemented with any subsequent corrected, revised, or otherwise updated drafts (post-postprints), as well as any commentaries or responses linked to them. These are all just way-stations along the scholarly skywriting continuum. See http://www.eprints.org/self-faq/
Subscription/Site-License/Pay-Per-View Tolls: The impact/access-barriers
Subscription/License/Pay-Per-View (S/L/P) tolls are the access-barriers. They therefore act as impact-barriers, constraining researchers in sharing their research. http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0003.gif Tolls are the journal publisher's means of recovering costs and making a fair profit. High costs were inescapable in the expensive and inefficient on-paper Gutenberg era. But today, in the online Post-Gutenberg era, continuing to do it all the old Gutenberg way, with its high costs, is unjustifiable and should no longer be the obligatory feature that it used to be. The only essential service still provided by journal publishers (for this anomalous, give-away literature in the Post-Gutenberg era) is peer review. In the online era there is no longer any necessity, and hence no longer any justification, for continuing to hold the refereed research hostage to access-tolls bundled with whatever add-ons they happen to pay for.
Quality Control and Certification: peer review
Peer review is not a luxury for research and researchers, for certification is essential. Without peer review, the research literature would be neither reliable nor navigable, its quality uncontrolled, unfiltered, un-sign-posted, unknown, and unaccountable. But the peers who review it for the journals are researchers themselves, and they review it for free, just as the researchers report it for free. So it must be made quite clear that the only real quality-control cost is that of implementing the peer review, not actually performing it. Estimates as well as the real experience of online-only journals (e.g., Journal of High Energy Physics http://jhep.cern.ch/; Psycoloquy http://www.cogsci.soton.ac.uk/psycoloquy/) have shown that the peer review implementation cost is quite low - about 1/3 (c. $500) of the total amount that the world's institutional libraries (or rather, the small subset of them that can afford any given journal at all!) are currently paying every year per article, jointly, in access tolls (c. $1500).
Separating peer review service-provision from eprint access-provision (and from optional add-ons)
Researchers need not and should not wait until journal publishers voluntarily decide to separate the provision of the essential peer review service from all the other optional add-on products (on-paper version, publisher's PDF version, deluxe enhancements) before their give-away refereed research can at last be freed of all access- and impact-barriers. All researchers can free their own refereed research now, virtually overnight, by taking the matter into their own hands; they can self-archive it in their institutional Eprint Archives: http://www.eprints.org/. Access to the eprints of their refereed research is then immediately freed of all toll-barriers, forever, and its research impact is at last maximized.
Interoperability: The Open Archives Initiative (OAI)
Papers self-archived by their authors in their institutional Eprint Archives can be accessed by anyone, anywhere, with no need to know their actual location, because all Eprint Archives are compliant with the Open Archives Initiative (OAI) meta-data tagging protocol for interoperability: http://www.openarchives.org
Because of their OAI-compliance, the papers in all registered Eprint Archives can be harvested and searched by Open Archive Services such as Citebase http://citebase.eprints.org/help/, the Cross Archive Searching Service http://arc.cs.odu.edu/, and OAIster http://oaister.umdl.umich.edu/o/oaister/, providing seamless access to all the eprints across all the Eprint Archives, as if they were all in one global, virtual archive.
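To make the harvesting idea concrete, here is a minimal sketch in Python of how a service might build an OAI request and merge the records returned by several archives into one list. It assumes the OAI-PMH version 2.0 conventions (the ListRecords verb, the oai_dc metadata prefix, and the standard OAI and Dublin Core XML namespaces). The archive addresses, identifiers, titles, and the helper names (list_records_url, parse_records, merge_archives, sample_response) are hypothetical, and the sample responses are simplified stand-ins rather than output from any real archive.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Standard XML namespaces used by OAI-PMH 2.0 and unqualified Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for one archive's OAI base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text):
    """Return (identifier, title) pairs found in one ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        records.append((identifier, title))
    return records


def merge_archives(responses):
    """Combine the records of several archives into one 'virtual archive' list."""
    merged = []
    for xml_text in responses:
        merged.extend(parse_records(xml_text))
    return merged


def sample_response(identifier, title):
    """Simplified, hypothetical ListRecords response holding a single record."""
    return f"""
<OAI-PMH xmlns="{OAI_NS}">
  <ListRecords>
    <record>
      <header><identifier>{identifier}</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="{DC_NS}">
          <dc:title>{title}</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


if __name__ == "__main__":
    # Hypothetical archive address; a real harvester would fetch this URL.
    print(list_records_url("http://archive-a.example.org/oai"))
    responses = [
        sample_response("oai:archive-a.example.org:1", "Preprint A"),
        sample_response("oai:archive-b.example.org:7", "Postprint B"),
    ]
    for identifier, title in merge_archives(responses):
        print(identifier, title)
```

The sketch parses canned responses so that it runs without network access; a real service would also need to handle resumption tokens, error responses, and incremental harvesting, none of which is shown here.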
Eight steps are described that would free the entire refereed corpus, forever, immediately:
The first four are not hypothetical in any way; they are guaranteed to free the entire refereed research literature (~24,000 journals, ~2,500,000 articles annually) from its access/impact-barriers right away. The only thing that researchers and their institutions need to do is to take these first four steps. The next four steps are hypothetical predictions, but nothing hinges on them; the refereed literature will already be free for everyone as a result of steps i-iv, irrespective of the outcome of steps v-viii.
i. Universities install and register OAI-compliant Eprint Archives (http://www.eprints.org).
The Eprints software is free and GNU open-source. It is quick and easy to install and maintain; it is OAI-compliant. Eprint Archives are all interoperable with one another and can hence be harvested and searched as if they were all in one global “virtual” archive of the entire research literature, both pre- and post-refereeing.
ii. Authors self-archive their pre-refereeing preprints and post-refereeing postprints in their own university's Eprint Archives.
All researchers must self-archive their papers therein if the literature is to be freed of its access- and impact-barriers. Self-archiving is quick and easy; it need only be done once per paper.
iii. Universities subsidize a first start-up wave of self-archiving by proxy where needed.
Self-archiving is quick and easy, but there is no need for it to be held back if any researcher feels too busy, tired, old or otherwise unable to do it himself. Library staff or students can be paid to “self-archive” the first wave of papers by proxy on their behalf (http://eprints.st-andrews.ac.uk/proxy_archive.html).
iv. The Give-Away corpus is freed from all access/impact-barriers online.
Once a critical mass of researchers has self-archived, the refereed research literature is at last free of all access- and impact-barriers, as it was always destined to be. http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0004.gif
Steps i-iv are sufficient to free the refereed research literature. We can also speculate as to what may happen after that, but these are really just guesses. This is what might happen:
v. Users will prefer the free version
It is likely that once a free, online version of the refereed research literature is available, all researchers will prefer to use the free online versions. Note that it is quite possible that there will always continue to be a market for the toll-based options (on-paper version, publisher's online PDF, deluxe enhancements) even though most users use the free versions.
vi. Publisher toll revenues shrink as Institutional toll savings grow
It is possible that libraries may begin to cancel journals, and as institutional toll savings grow, journal publisher toll revenues will shrink. The extent of the cancellation will depend on the extent to which there remains a market for the toll-based add-ons, and for how long. If the toll-based market stays large enough, nothing else need change.
vii. Publishers downsize to become providers of peer-review service + optional add-on products?
It will depend entirely on the size of the remaining market for the toll-based options whether and to what extent journal publishers will have to cut costs and downsize to provide only the essentials: The only essential, indispensable service is peer review.
viii. Peer-review service costs on outgoing research funded out of toll-savings on incoming research?
If publishers can continue to cover costs and make a decent profit from the toll-based optional add-ons market, without needing to downsize to peer-review service-provision alone, nothing much changes. But if publishers do need to abandon providing the toll-based products and to scale down instead to providing only the peer-review service, then universities, having saved 100% of their annual access-toll budgets, will have plenty of annual windfall savings from which to pay for their own researchers' continuing (and essential) annual journal-submission peer-review costs (1/3). The rest of their savings (2/3) could be spent as they wish (e.g., on books - plus a bit for Eprint Archive maintenance).
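The 1/3 versus 2/3 split in step viii follows directly from the per-article figures quoted earlier (c. $1500 in tolls, c. $500 for peer review). The sketch below works through that arithmetic in Python. Multiplying the per-article figures by the ~2,500,000 articles published annually is an extrapolation made here purely for illustration; the resulting totals are rough orders of magnitude, not figures stated in the text, and the function and variable names are hypothetical.

```python
# Approximate figures quoted in the text.
ARTICLES_PER_YEAR = 2_500_000      # ~2,500,000 refereed articles annually
TOLLS_PER_ARTICLE = 1500           # c. $1500 paid jointly in access tolls per article
PEER_REVIEW_PER_ARTICLE = 500      # c. $500 to implement peer review per article


def toll_savings_breakdown(articles, tolls_per_article, review_per_article):
    """Split the total toll budget into peer-review costs and remaining savings."""
    total_tolls = articles * tolls_per_article
    review_cost = articles * review_per_article
    return {
        "total_tolls": total_tolls,
        "peer_review": review_cost,
        "remaining": total_tolls - review_cost,
        "review_share": review_cost / total_tolls,
    }


if __name__ == "__main__":
    breakdown = toll_savings_breakdown(
        ARTICLES_PER_YEAR, TOLLS_PER_ARTICLE, PEER_REVIEW_PER_ARTICLE
    )
    for name, value in breakdown.items():
        print(name, value)
```

Under these assumptions the peer-review share is one third of the toll budget regardless of the article count, since the count cancels out of the ratio.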
There is a great deal of concern about copyright in the digital age, and some of it may not be easily resolvable. Apart from the protection against plagiarism and assurance of priority that all authors seek, the only other “protection” the give-away author of refereed research reports seeks is the protection of his give-away rights! (The intuitive model for this is advertisements: would an advertiser want to lose his right to give away his ads for free, diminishing their potential impact by charging for access to them?)
There is now no longer any need for the authors of refereed research to worry about exercising their give-away rights, for they can do it legally, even under the most restrictive copyright agreement, by using the following strategy.
Self-archive the pre-refereeing preprint
Self-archiving the preprint is the critical first step. Even before it has been submitted to a journal, your intellectual property is incontestably your own, and not bound by any future copyright transfer agreement. So archive the preprints (as physicists have been doing for 12 years now, with over 250,000 papers).
[Note that some journals have, apart from copyright policies, which are a legal matter, “embargo policies,” which are merely policy matters (non-legal). Invoking the “Ingelfinger (Embargo) Rule,” some journals state that they will not referee (let alone publish) papers that have previously been “publicised” in any way, whether through conferences, press releases, or online self-archiving. The Ingelfinger Rule, apart from being directly at odds with the interests of research and researchers, and having no intrinsic justification whatsoever - other than as a way of protecting the journals' current revenue streams - is not a legal matter, and is unenforceable. The “Ingelfinger Rule” is under review by journals in any case; Nature http://npg.nature.com/pdf/05_news.pdf has already dropped it; Science will probably follow suit too.]
Submit the preprint for refereeing and, at acceptance, try to fix the copyright transfer agreement to allow self-archiving
Copyright transfer agreements take many forms. Whatever the wording is, if it does not explicitly permit online self-archiving, modify it so that it does. Here is a sample way to word it (http://cogprints.soton.ac.uk/copyright.html): “I hereby transfer to [publisher or journal] all rights to sell or lease the text (on-paper and online) of my paper [paper-title]. I retain only the right to self-archive it publicly online on my institution's website.”
About 35% of journals already formally support self-archiving of the preprint and 20% support self-archiving of the refereed postprint; many others will agree if asked: (http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm).
If the above is successful, self-archive the refereed postprint
Some journals, however, will respond that they decline to publish your paper unless you sign their copyright transfer agreement verbatim. In such cases, sign their agreement and proceed to the next step.
If the above is unsuccessful, archive and link a “corrigenda” file to the already-archived preprint
Your pre-refereeing preprint has already been publicly self-archived prior to submission, and is not covered by the copyright agreement, which pertains to the revised final (“value-added”) draft. Hence all you need to do is to self-archive a further file, linked to the archived preprint, which simply lists the corrections that the reader may wish to make in order to conform the preprint to the refereed, accepted version.
This simple strategy is both feasible and legal - and sufficient to free the entire current refereed corpus of all access/impact-barriers immediately!
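A corrigenda file of the kind described above could be drafted by hand, but it could also be generated mechanically. The following Python sketch is one possible approach, not a procedure prescribed in the text: it uses the standard-library difflib module to list the line-level changes that turn a preprint into the refereed version. The function name make_corrigenda and the two sample texts are hypothetical.

```python
import difflib


def make_corrigenda(preprint_lines, refereed_lines):
    """List the line-level changes that conform the preprint to the refereed version."""
    diff = difflib.unified_diff(
        preprint_lines,
        refereed_lines,
        fromfile="preprint",
        tofile="refereed version",
        lineterm="",
    )
    return list(diff)


if __name__ == "__main__":
    # Hypothetical sample texts, one sentence per line.
    preprint = ["We tested 20 subjects.", "The effect was large."]
    refereed = ["We tested 24 subjects.", "The effect was large."]
    for line in make_corrigenda(preprint, refereed):
        print(line)
```

Lines prefixed with "-" are to be removed from the preprint and lines prefixed with "+" are to be added; the resulting list can be saved as a file and linked to the archived preprint.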
The freeing of their present and future refereed research from all access- and impact-barriers forever is now entirely in the hands of researchers. Physicists have already shown the way.
It is hoped that distributed, institution-based self-archiving, as a powerful and natural complement to central, discipline-based self-archiving, will now broaden and accelerate the self-archiving initiative, putting us all over the top at last, with the entire distributed corpus integrated by the glue of interoperability (http://www.openarchives.org).
As to the past (retrospective) literature: The preprint+corrigenda strategy will not work there, but as retrospective journal literature brings virtually no revenue, most publishers will agree to the author self-archiving after a sufficient period (6 months to 2 years) has elapsed. Moreover, for the really old literature, it is not clear whether online self-archiving was covered by the old copyright agreements at all. And if all else fails for the retrospective literature, a variant of the preprint+corrigenda strategy will still work: simply do a revised 2nd edition! Update the references, rearrange the text (and add more text and data if you wish). For the record, the enhanced draft can be accompanied by a “de-corrigenda” file, stating which of the enhancements were not in the published version.
Universities: Install Eprint Archives, mandate them; help in author start-up
Universities should create institutional Eprint Archives (e.g., CalTech) for all their researchers. They should also mandate that they be filled. It is already becoming normal practice for faculty to keep and update their institutional CVs online; it should be made standard practice by both research institutions and research funders as well as research analyzers and assessors that all CV entries for refereed journal articles are linked to their archived full-text version in the university's Eprint Archive. Here is a model and free software for adopting such a standardized CV: http://paracite.eprints.org/cgi-bin/rae_front.cgi
Universities need to mandate the self-archiving of all peer-reviewed research output in order to maximize its research impact for exactly the same reasons they currently mandate publishing it (and indeed as the quite natural Post-Gutenberg extension of “publish or perish”: “publish with maximized research impact, through self-archiving”). For a model university/departmental self-archiving policy statement, see: http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html.
For researchers who feel too busy, tired, old, or inadequate to self-archive their papers, a modest start-up budget to pay library experts or students to do it for them would be a small amount of money very well-invested. It will only be needed to get the first wave over the top; from then on, the momentum from the enhanced access and impact will maintain itself, and self-archiving will become as standard a practice as email.
But what needs energetic initial promotion and support is the first wave. If (i) the enhanced visibility, accessibility and usability of their own research output and its resulting enhanced impact on the research of others, plus (ii) the enhanced access for their own researchers to the research output of others, are not incentive enough for universities to promote and support the self-archiving initiative energetically, they should also consider that it will be an investment in (iii) a potential solution to their serials crisis and hence the possible recovery of 2/3 of their annual serials (toll) budget.
Libraries: Maintain the University Eprint archives; help in author start-up
Libraries are the most natural allies of researchers in the self-archiving initiative to free the refereed journal literature. Not only are they groaning under the yoke of the growing serials budget crisis, but librarians are also eager to establish a new digital niche for themselves, once the journal corpus is online. Maintaining the Eprint Archives, and facilitating the all-important start-up wave of self-archiving (by being ready to do “proxy” self-archiving on behalf of authors who feel they cannot do it themselves) will be a critical role for libraries to play.
1. Trained library staff should help in showing the faculty how to self-archive papers in the university Eprint Archive (it is very easy). http://library.caltech.edu/evdv/CODA.ppt
2. The library staff should also offer to help in doing “proxy” self-archiving, on behalf of authors who feel that they are personally unable (too busy or technically incapable) to self-archive. Authors need to supply their digital full-texts in word-processor form: the digital archiving assistants can do the rest (usually only a few dozen key/mouse-strokes per paper). http://eprints.st-andrews.ac.uk/proxy_archive.html
3. The librarians, collaborating with web system staff, should be involved in ensuring the proper maintenance, backup, mirroring, upgrading, and migration that ensures the perpetual preservation of the university Eprint Archives. Mirroring and migration should be handled in collaboration with counterparts at all other institutions supporting OAI-compliant Eprint Archives.
Students: Stay the course! Surf! The future is yours!
Students are well advised to keep doing what they do naturally: favour material that is freely accessible on the Web. This will not net them very much of the non-give-away literature, but it will put consumer pressure on the non-give-away research literature, especially as these students come of age, and become researchers in their turn.
Publishers: Support self-archiving
1. Explicitly allow and encourage your authors to self-archive their pre-refereeing preprints. One potential model is Nature's embargo statement: “Nature does not wish to hinder communication between scientists... Neither conferences nor pre-print servers constitute prior publication.”
2. Also explicitly allow and encourage your authors to self-archive their peer-reviewed postprints. One potential model is the American Physical Society's copyright statement: “The author(s) shall have the following rights... The right to post and update the Article on e-print servers as long as files prepared and/or formatted by APS or its vendors are not used for that purpose. Any such posting made or updated after the acceptance of the Article for publication shall include a link to the online abstract in the APS journal or to the entry page of the journal.”
In this critical transitional time between the paper and online eras, refereed journal publishers are best-advised to concede graciously on self-archiving, as the American Physical Society (APS) and so many other publishers are doing, rather than attempting instead to use copyright or embargo policy to prevent or retard self-archiving. A much better policy is to accept and support what is undeniably the optimal outcome for research, researchers, and their institutions in the online era, namely, their research impact maximized through toll-free access for all its would-be users. Publishers can confirm their support for open access by becoming Romeo “blue/green” publishers (as 55% of journal publishers already are): http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm
Government/Society: Mandate public archiving of public research worldwide
1. Mandate that the research that is publicly funded must not merely be published but it must be publicly accessible online (whether through self-archiving, open-access journals, or both).
2. Make it part of grant applications that CVs and bibliographies citing the applicant's prior work should contain links to the online free full-text.
The Government and society should support the self-archiving initiative, reminding themselves that most of this give-away research has been supported by public funds, with the support explicitly conditional on making the research findings public.
The beneficiaries will not just be research and researchers, but society itself, inasmuch as research is supported because of its potential benefits to society. Researchers in developing countries and at the less affluent universities and research institutions of the developed countries will benefit even more from toll-free access to the research literature than the better-off institutions, but it is instructive to remind ourselves that even the most affluent institutional libraries cannot afford most of the refereed journals! So open access to it all will benefit all institutions. And on the other side of barrier-free access to the work of others, all researchers, even the most affluent, will benefit from the barrier-free impact of their own work on the work of others. Moreover, a toll-free, interoperable, digital research literature will not only radically enhance access, navigation (e.g., citation-linking) and impact, and thereby improve research productivity and quality, but it will also spawn new ways of monitoring and measuring impact, productivity and quality (e.g., download impact, links, immediacy, comments, and the higher-order dynamics of a citation-linked corpus) that can be analyzed from preprint to post-postprint.
(see also Peter Suber's fuller timeline at the Free Online Scholarship site: http://www.earlham.edu/~peters/fos/timeline.htm )
Psycoloquy (Refereed On-Line-Only Journal) (1989): http://www.cogsci.soton.ac.uk/psycoloquy
“Scholarly Skywriting” (1990): http://cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad90.skywriting.html
Physics Archive (1991): http://arxiv.org
“PostGutenberg Galaxy” (1991): http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad91.postgutenberg.html
“Interactive Publication” (1992): http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad92.interactivpub.html
Self-Archiving (“Subversive”) Proposal (1994): http://www.arl.org/scomm/subversive/toc.html
“Tragic Loss” (Odlyzko) (1995): http://www.research.att.com/~amo/doc/tragic.loss.txt
“Last Writes” (Hibbitts) (1996): http://www.law.pitt.edu/hibbitts/lastrev.htm
NCSTRL: Networked Computer Science Technical Reference Library (1996): http://cs-tr.cs.cornell.edu
University Provosts' Initiative (1997): http://library.caltech.edu/publications/ScholarsForum/
CogPrints: Cognitive Sciences Archive (1997): http://cogprints.soton.ac.uk
Journal of High Energy Physics (Refereed On-Line-Only Journal) (1998): http://jhep.cern.ch/
Science Policy Forum (1998): http://www.sciencemag.org/cgi/content/full/281/5382/1459
American Scientist Forum (1998): http://amsci-forum.amsci.org/archives/september98-forum.html, http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Amsci/subject.html
OpCit: Open Citation Linking Project (1999): http://opcit.eprints.org
E-biomed: Varmus (NIH) Proposal (1999): http://www.nih.gov/about/director/pubmedcentral/pubmedcentral.htm
Open Archives Initiative (1999): http://www.openarchives.org
Cross-Archive Searching Service (2000): http://arc.cs.odu.edu
Eprints: Free OAI-compliant Eprint-Archive-creating software (2001): http://www.eprints.org
Citebase: Scientometric Search Engine (2001): http://citebase.eprints.org/
FOS: Free Online Scholarship Movement (2001): http://www.earlham.edu/~peters/fos/timeline.htm
BOAI: Budapest Open Access Initiative (2002): http://www.soros.org/openaccess
UK RAE Reform Proposal: http://www.ariadne.ac.uk/issue35/harnad/
Berlin Declaration (2003): http://www.ecs.soton.ac.uk/~harnad/Temp/berlin.htm