It is the mission of the Texas Digital Library (TDL) to enable digital initiatives in support of research, scholarship, and learning in Texas. As a part of this mission, the TDL endeavors to collect, preserve, and disseminate scholarly materials for the benefit of both producers and consumers of academic research and scholarship. The Texas Data Repository, TDL’s instance of the Dataverse software, which encompasses all of the Dataverse collections of its member institutions, is the digital resource intended to address a consortium-level need for publishing, managing, and providing access to research-generated data sets. The following Digital Preservation Policy describes the extent to which the TDL will support sustainable access to the digital research data and related content deposited in the Texas Data Repository.
The preservation objectives of the Texas Data Repository are:
Part of the TDL’s vision in establishing a consortium Dataverse repository installation is to make research materials freely available to anyone, anywhere, and at any time. The TDL is an advocate for open access to scholarly work including research data. The incentives to researchers for publishing and preserving their research data in the Texas Data Repository are:
The TDL accepts the responsibility to preserve and provide access to research data, including associated metadata and documentation, that are properly deposited in the Texas Data Repository (TDR). This responsibility includes the provision of digital means to preserve and ensure ongoing access to said content for a minimum period of ten years after it is deposited in the TDR Dataverse repository. Long-term preservation of TDR Dataverse repository content, beyond the ten-year retention period, is subject to the TDL’s selection criteria, appraisal of the content, and the budgetary and technical resources necessary to meet this goal. Metadata for content removed from the TDR Dataverse repository, regardless of reason or retention period, may be preserved for an undetermined period of time after said content’s removal.
The Texas Data Repository content will be selected and appraised according to the following preservation priorities and levels of commitment:
Additionally, the Texas Data Repository will accept data submissions of any format, but provides full support (i.e., data exploration, analysis, and meta-analysis via the TwoRavens suite of statistical tools) only for tabular data, preferably in the following formats:
These files can be in compressed ZIP format at ingest; however, they may not exceed 2 GB in size. Any individual file uploaded to the repository must be under 4 GB, though uploads over 2 GB, and some below that threshold, may be slow or stall due to variables outside of TDL's control. Please email email@example.com if you are having trouble uploading files. If you have files over 4 GB, we will consider support options on a case-by-case basis and in consultation with your institutional TDR liaison. Please see http://guides.dataverse.org/en/latest/user/tabulardataingest/index.html and http://guides.dataverse.org/en/latest/user/dataset-management.html for more specific information on data set and metadata formats.
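Depositors who want to check a file against the per-file thresholds above before uploading could use something like the following minimal sketch. It is illustrative only and is not part of the Texas Data Repository or Dataverse software: the function name `classify_upload`, the returned labels, and the choice to count a gigabyte as 1024^3 bytes are all assumptions, since the policy does not state how sizes are measured. The sketch does not cover the separate limit on compressed ZIP files.

```python
# Hypothetical pre-upload check; names and labels are illustrative only.
GB = 1024 ** 3  # assumption: binary gigabytes; the policy does not specify

MAX_FILE_BYTES = 4 * GB      # individual files must be under 4 GB
SLOW_UPLOAD_BYTES = 2 * GB   # uploads over 2 GB may be slow or stall


def classify_upload(size_bytes: int) -> str:
    """Classify a file size against the per-file thresholds in the policy."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes >= MAX_FILE_BYTES:
        # Too large for a standard upload; handled case by case.
        return "contact-liaison"
    if size_bytes > SLOW_UPLOAD_BYTES:
        # Allowed, but the upload may be slow or stall.
        return "may-be-slow"
    # Note: the policy warns that some smaller uploads may also be slow.
    return "ok"


if __name__ == "__main__":
    for size in (500 * 1024 ** 2, 3 * GB, 5 * GB):
        print(size, classify_upload(size))
```

In practice the size would come from the local file system, for example `os.path.getsize(path)`.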
Texas Data Repository provides basic, bit-level preservation through fixity checks and secure backup of deposited content (See also Information Security). Further and more in-depth digital preservation activities and services must be provided by a digital preservation program at the institution where the research data was originally generated.
The TDL has an official backup strategy that requires all digital content to be:
The TDL systems also provide security services key to basic digital preservation, namely access control, network monitoring and protection, encryption, and system updates (see Information Security Policy). There are currently no institutional limitations on the overall quantity of data that can be stored on TDL servers, only limitations on the size of individual files (4 GB) uploaded via the Texas Data Repository application and a recommendation that datasets not exceed 10 GB.
The Dataverse software's best practices for data management and preservation include:
The TDL systems infrastructure includes bit-level fixity checking via the Amazon EBS and S3 hosting services. For more information about security, backups, and integrity checking, see also Information Security.
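For readers unfamiliar with the term, a fixity check compares a checksum computed from a file's current bytes with a checksum recorded earlier; a mismatch indicates that the bits have changed. The sketch below shows the general idea from a depositor's side, using Python's standard `hashlib` module. It does not describe how TDL, Amazon EBS, or S3 implement fixity checking; the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_hex, algorithm="sha256"):
    """Return True if the file's current checksum matches the recorded one."""
    return file_checksum(path, algorithm) == expected_hex.strip().lower()
```

A depositor might record `file_checksum(path)` before upload and run `verify_fixity` on a later download of the same file to confirm that the two copies are bit-for-bit identical.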
The Dataverse Project, “Harvard Dataverse Preservation Policy,” http://best-practices.dataverse.org/harvard-policies/harvard-preservation-policy.html
Purdue University Research Repository (PURR), “PURR Digital Preservation Policy,” https://purr.purdue.edu/legal/digitalpreservation
Texas Digital Library, “Our Mission and Vision,” https://www.tdl.org/strategic-plan/vision/
Preserving digital Objects With Restricted Resources, “Tool Grid,” http://digitalpowrr.niu.edu/tool-grid/
Digital Curation Centre, “DataVerse,” http://www.dcc.ac.uk/resources/external/dataverse
Harvard Dataverse, “UCLA Social Science Data Archive Dataverse,” http://dataarchives.ss.ucla.edu/archive%20tutorial/archivingdata.html
Harvard’s Institute for Quantitative Social Science (IQSS), “About TwoRavens,” http://datascience.iq.harvard.edu/about-tworavens
University of North Carolina – The Odum Institute, “Digital Preservation Policies,” http://www.irss.unc.edu/odum/contentSubpage.jsp?nodeid=629
Harvard Dataverse Project, “User Guide: Tabular Data File Ingest,” http://guides.dataverse.org/en/latest/user/tabulardataingest/index.html
Elizabeth Quigley, IQSS-Harvard University, “The Expanding Dataverse,” http://dataverse.org/files/dataverseorg/files/introduction_to_dataverse.pdf?m=1447352697