Scholarly Communication Training for Librarians: OA Repository Services

Hello! This is the guide for all your scholarly communication training needs

Module 2: Open Access Repository Services

In this module, we will focus on Georgia College's institutional repository, The Knowledge Box. We will learn about OA policies and requirements and common data formats, seek to understand the complex world of OA copyright and licensing, and learn how to work with researchers.

Knowledge

Open Access Policies and Requirements

Many higher education institutions today have an open access mandate or policy, and most major research funders require outputs to be open access. An open access mandate/policy means that all of the scholarly output of a university or college must be, or is strongly encouraged to be, published open access. A research funder OA requirement means that in order for a project to receive funding, the results and data must be made available to the public free of charge. 

To find out which institutions have an open access mandate, browse the ROARMAP or COAPI. To find out which funders have an open access requirement, browse the "Research Funder Open Access Requirements" page at MIT or SHERPA/JULIET.

Currently, neither GCSU nor USG has mandated open access for scholarly output.

Data Formats

Research data comes in many varied formats: text, numeric, multimedia, models, software languages, discipline specific (e.g. crystallographic information file (CIF) in chemistry), and instrument specific. See the chart below for a sampling of data formats you may find in an institutional repository.

Type of Data: File Formats
Quantitative tabular data with extensive metadata
  • SPSS portable format (.por)
  • delimited text with a command ("setup") file (SPSS, Stata, SAS, etc.) containing metadata information
Quantitative tabular data with minimal metadata
  • comma-separated values (CSV) file (.csv)
  • tab-delimited file (.tab)
  • widely-used formats, e.g. MS Excel (.xls/.xlsx), MS Access (.mdb/.accdb), dBase (.dbf) and OpenDocument Spreadsheet (.ods)
Geospatial data
  • ESRI Shapefile (essential: .shp, .shx, .dbf ; optional: .prj, .sbx, .sbn)
  • geo-referenced TIFF (.tif, .tfw)
  • CAD data (.dwg)
  • tabular GIS attribute data
  • ESRI Geodatabase format (.mdb)
  • MapInfo Interchange Format (.mif) for vector data
Qualitative data
  • eXtensible Mark-up Language (XML) text according to an appropriate Document Type Definition (DTD) or schema (.xml)
  • Rich Text Format (.rtf)
  • plain text data, UTF-8 (Unicode; .txt)
  • plain text data, ASCII (.txt)
  • Hypertext Mark-up Language (HTML) (.html)
  • widely-used proprietary formats, e.g. MS Word (.doc/.docx)
  • LaTeX (.tex)
Digital images
  • TIFF version 6 uncompressed (.tif)
  • JPEG (.jpeg, .jpg)
  • TIFF (other versions; .tif, .tiff)
  • JPEG 2000 (.jp2)
  • Adobe Portable Document Format (PDF/A, PDF) (.pdf)
Digital audio
  • Free Lossless Audio Codec (FLAC) (.flac)
  • Waveform Audio Format (WAV) (.wav)
  • MPEG-1 Audio Layer 3 (.mp3) - spoken word audio only
  • MPEG-1 Audio Layer 3 (.mp3)
  • Audio Interchange File Format (AIFF) (.aif)
Digital video
  • MPEG-4 High Profile (.mp4)
  • motion JPEG 2000 (.mj2)
Documentation and scripts
  • Rich Text Format (.rtf)
  • Open Document Text (.odt)
  • HTML (.htm, .html)
  • plain text (.txt)
  • widely-used proprietary formats, e.g. MS Word (.doc/.docx) or MS Excel (.xls/.xlsx)
  • XML marked-up text (.xml) according to an appropriate DTD or schema, e.g. XHTML 1.0
  • PDF/A or PDF (.pdf)

 

Source:

Adapted from Research Data Services: Data Types & File Formats (OSU Libraries)
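As a rough illustration of how the chart above might be used when triaging deposits, the Python sketch below looks up the data types associated with a file's extension. The names `FORMAT_CHART` and `data_types_for` are hypothetical, the lookup covers only a subset of the chart, and none of this is part of The Knowledge Box itself. Because some extensions (such as .dbf or .pdf) appear under more than one data type, each extension maps to a list.

```python
from pathlib import Path

# Hypothetical lookup built from a subset of the chart above.
# Some extensions are listed under more than one data type, so each maps to a list.
FORMAT_CHART = {
    ".por": ["Quantitative tabular data with extensive metadata"],
    ".csv": ["Quantitative tabular data with minimal metadata"],
    ".tab": ["Quantitative tabular data with minimal metadata"],
    ".dbf": ["Quantitative tabular data with minimal metadata", "Geospatial data"],
    ".shp": ["Geospatial data"],
    ".tif": ["Geospatial data", "Digital images"],
    ".rtf": ["Qualitative data", "Documentation and scripts"],
    ".txt": ["Qualitative data", "Documentation and scripts"],
    ".flac": ["Digital audio"],
    ".wav": ["Digital audio"],
    ".mp4": ["Digital video"],
    ".pdf": ["Digital images", "Documentation and scripts"],
}


def data_types_for(filename: str) -> list[str]:
    """Return the chart's data types for a file name, or [] if the extension is not listed."""
    extension = Path(filename).suffix.lower()  # ".CSV" and ".csv" are treated alike
    return FORMAT_CHART.get(extension, [])


if __name__ == "__main__":
    for name in ["survey.csv", "parcels.dbf", "notes.xyz"]:
        print(name, "->", data_types_for(name))
```

An empty result only means the extension is missing from this partial lookup, not that the format is unsuitable for deposit.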

Understanding

Copyright and Licensing Issues

There are several different issues that arise when dealing with artifacts deposited in the institutional repository: quality, intellectual property, research ethics, and privacy. For the purpose of this guide, we will be considering the question of intellectual property as it pertains to copyright and licensing.

First, there are two types of works deposited in The Knowledge Box: published and unpublished.

Published works

  • Post-print (accepted manuscript) version: the manuscript after it has undergone peer review, the author has made necessary changes, and the article has been accepted for publication.
  • Final published version (version of record, publisher’s PDF): the PDF generated by the publisher after copyediting and layout.

Unpublished works

  • Pre-prints: the original manuscript as submitted by the author to the publisher.
  • Electronic Theses and Dissertations (ETDs).
  • Conference slides, posters, capstone projects, and presentations are also considered unpublished works.

Is an author even allowed to deposit their work in the IR?

The vast majority of publishers will allow a pre-print to be posted in the author’s IR or on their personal website. It is considered best practice to add a note to the pre-print stating that it has been “Submitted to Journal XYZ.”

Once the article is accepted for publication, many (but not all) publishers will allow the author to keep the pre-print online, but with a different citation: “This is the pre-peer-reviewed version of the following article: [give citation of published version].”

An increasing number of publishers will allow the post-print version to be deposited in the author’s IR. As with pre-prints, most publishers require a citation: “This is a peer-reviewed, electronic version of the following article: [give citation of published version]. The article is available online at [URL].” There may also be an embargo imposed on the post-print version of the article.
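The version statements described above follow a simple pattern that depends on where the manuscript is in the publication process. The Python sketch below, modeled on those statements, shows one way a repository workflow might generate them. The function name `version_statement` and the stage labels are hypothetical, and the exact wording a publisher requires varies, so always check the publisher's own policy.

```python
def version_statement(stage: str, journal: str = "", citation: str = "", url: str = "") -> str:
    """Build a version statement for a deposited manuscript (illustrative wording only)."""
    # Drop a trailing period so the statement does not end up with ".." after the citation.
    citation = citation.rstrip(".")
    if stage == "submitted":
        # Pre-print that has been submitted but not yet accepted.
        return f"Submitted to {journal}."
    if stage == "pre-print":
        # Pre-print kept online after the article has been accepted.
        return f"This is the pre-peer-reviewed version of the following article: {citation}."
    if stage == "post-print":
        # Accepted manuscript after peer review.
        return (
            "This is a peer-reviewed, electronic version of the following article: "
            f"{citation}. The article is available online at {url}."
        )
    raise ValueError(f"Unknown stage: {stage!r}")


if __name__ == "__main__":
    print(version_statement("submitted", journal="Journal XYZ"))
```

An embargo, where the publisher imposes one, would still need to be tracked separately; this sketch only produces the statement text.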

Who owns this copyright?

Determining ownership (warranting): Although the owner of the work should take responsibility for its content, the library still needs to verify the copyright ownership of an unpublished work. It must also ensure that the work will not expose the institution to undue risk of legal action (for example, because it includes copyrighted materials).

What contract does the library make with authors?

When an author deposits a work into an IR, they typically agree to certain conditions. For example, when students or faculty deposit items into The Knowledge Box, they agree to the following:

  • Grant to GCSU a non-exclusive, irrevocable, and perpetual right to retain, reproduce, digitize, and distribute the deposited work in whole or in part, in and from its electronic format, without fee.
  • GCSU may remove the work for professional or administrative reasons, or if it is found to violate copyright laws.
  • GCSU may make and keep more than one copy of the work and may migrate the work to any medium or format for the purpose of preservation.

Georgia College does NOT require a transfer of copyright to the college. Authors retain ownership of their own copyrights.

 

Source:

Gilman, I. (2013). Institutional Repositories. In Library Scholarly Communication Programs: Legal and Ethical Considerations. Oxford: Chandos.

Ability

Working With Researchers

The Knowledge Box is an institutional repository (IR), which is an electronic system that captures, preserves, and provides access to the digital work products of a community. Works are generally not "published" by the IR; that is, they are published elsewhere and deposited in the IR. Currently, the only work that could be considered "published" by The Knowledge Box is The Corinthian.

While IRs have many benefits that should be persuasive to faculty, faculty remain reluctant to deposit their work into them. It appears that IRs "fail to be compelling and useful to authors and owners of content" [1]. Also, small academic institutions “face unique challenges” in recruiting content for their IRs, in particular “repositories at smaller institutions face further difficulties unique to an academic environment that emphasizes teaching” [2]. So how do we get faculty to deposit their work in The Knowledge Box?

We know what faculty DON'T want: to know how The Knowledge Box works or what it actually is [1]. They also don't want to hear the terms "institutional repository," "IR," "metadata," or "open source" [3]. They don't want the emphasis to be on the institution at the expense of the individual.

What they DO want, however, is relatively simple. They want the emphasis to be on the work that they do. They want other people to find, use, and cite the work that they put into The Knowledge Box [1]. A personalized, tailored approach works best.

Below are some talking points you can use, based on what we know faculty want... 

...to maintain a digital archive of their research

  • When depositing in The Knowledge Box, faculty maintain ownership of their own work and control who sees it.
  • Items in The Knowledge Box will be preserved far into the future, safe from loss or damage. The same cannot be said for items uploaded to ResearchGate or Academia.edu, as these sites do not provide the same sorts of services, such as supporting open metadata or long-term preservation.

...to share their research with others

  • The Knowledge Box will make their own work easily accessible to others on the web through Google searches.
  • Faculty will be able to give out links to their work so that they do not have to spend time finding files and sending them out as email attachments.
  • Studies indicate that depositing work in an open access venue, like The Knowledge Box, increases access to and visibility of research output. See also: a summary of open access citation advantage studies and a bibliography of studies on the effect of OA on citation impact.
  • Most publishers include the right to deposit in an IR in their contracts, and many others will accept an author addendum that asks for the right to do so.

Finally, here are a few tips on language [3] to use with faculty:

  • When discussing advantages of The Knowledge Box, say:
    • Their scholarly work is all in one place
    • No broken links
    • Permanent access
    • File preservation
  • When discussing negotiating with a publisher for the right to deposit in The Knowledge Box, say:
    • Greater impact, greater visibility, and greater availability of grey literature
    • More citations
    • Better access for international and interdisciplinary colleagues

 

Sources:

[1] Foster, N.F., & Gibbons, S. (2005). Understanding faculty to improve content recruitment for institutional repositories. D-Lib Magazine, 11(1).

[2] Wu, M. (2015). The future of institutional repositories at small academic institutions: Analysis and insights. D-Lib Magazine, 21(9/10), 8.

[3] Fuchs, S., & Brannon, P. (2008). Developing effective scholarly communication advocates: A case study [PowerPoint slides]. 
