...

7. Data Integrity and Authenticity

Digital Object Management

R7. The repository guarantees the integrity and authenticity of the data.

Statement of Compliance Level:

  • The guideline has been fully implemented for the needs of the repository.

“The TDL actively addresses the need to ensure the accuracy, integrity, authenticity, and permanence of the digital content that it manages, as well as the security of the services and platforms that it provides.”

...

Description and Details of Version Control Strategy:

When initially published, a dataset is automatically assigned to the category “Version 1.” Any subsequent changes to the dataset result in the creation of a new version. A ‘small’ change (correcting a typo) would create a “Version 1.1”. A ‘large’ change (adding a new data column) creates a “Version 2.0”. Adding a new file automatically creates a “Version 2.0”. All versions can be made public.
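The numbering rule described above can be illustrated with a minimal sketch. It is not the repository's implementation: the function name `next_version`, the tuple representation, and the treatment of "Version 1" as (1, 0) are assumptions made only for illustration.

```python
from typing import Tuple

Version = Tuple[int, int]  # (major, minor); "Version 1" is treated as (1, 0) here


def next_version(current: Version, change: str) -> Version:
    """Return the version that follows `current` for a 'small' or 'large' change."""
    major, minor = current
    if change == "small":   # e.g. correcting a typo
        return (major, minor + 1)
    if change == "large":   # e.g. adding a data column or a new file
        return (major + 1, 0)
    raise ValueError(f"unknown change type: {change!r}")


def label(version: Version) -> str:
    """Format a version tuple the way the text writes it, e.g. 'Version 1.1'."""
    return f"Version {version[0]}.{version[1]}"


if __name__ == "__main__":
    v = (1, 0)                       # initial publication
    v = next_version(v, "small")     # typo fix
    print(label(v))
    v = next_version(v, "large")     # new file added
    print(label(v))
```

Under these assumptions a typo fix on the initial version yields "Version 1.1", and a later file addition yields "Version 2.0".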

...

  • Dublin Core (DC) and Open Archives Initiative (OAI)

  • ISO 8601 for date entry

  • GeoNames for geospatial metadata

  • Data Documentation Initiative for social science/humanities metadata

  • SIMBAD astronomical database and FITS for astronomy and astrophysics metadata

  • NCBI Taxonomy & NCBO Bioportal for life sciences metadata

  • ISA-Tab for biomedical http://guides.dataverse.org/en/latest/user/appendix.html

    Provenance:

    Depositors must log in to the TDR service through their respective TDL member institution. As an example, University of Texas at Austin affiliates must provide their UT-EID and password via Shibboleth single sign-on. If logging in for the first time, depositors must agree to the General Terms of Use before being allowed to create an account.

    Users are required to provide the TDR with accurate and complete registration information. Depositors’ first and last names and affiliation are displayed connected to their uploads. Terms of Use

    Each dataset is assigned a persistent ID (DOI), and the corresponding metadata is part of the complete digital object.

    The TDR also tracks/records information when a registered or non-registered guest downloads a file. “When you download a file from Texas Data Repository, our software collects user account data such as your name, username, email, institution and position if provided (or the session ID data for guest users) and accompanying download data such as the time of the download.”

...

8. Appraisal

Digital Object Management

R8. The repository accepts data and metadata based on defined criteria to ensure relevance and understandability for data users.

Statement of Compliance Level:

  • The guideline has been fully implemented for the needs of the repository.

Collection Policies

The TDR does not have a specific collection development policy with respect to data deposit and “encourages data deposit from all disciplines and can accept any type of data file.”

...

While the TDR does not publish a list of preferred formats, it does advise depositors to provide data in non-proprietary formats, e.g. CSV or XML, in order to ensure broader use for research. Additional features and support for certain types of data files exist.

https://texasdigitallibrary.atlassian.net/wiki/spaces/TDRUD/pages/289079303/Frequently+Asked+Questions#FrequentlyAskedQuestions-FAQNine

...

Depositors grant all necessary permissions and required licenses to the TDR to make submitted or deposited content available for archiving, preservation and access, within the site. Among others this includes permission to “store, translate, copy or re-format the content in any way to ensure its future preservation and accessibility, and improve usability.” Terms of Use

...

9. Documented Storage Procedures

Digital Object Management

R9. The repository applies documented processes and procedures in managing archival storage of the data.

Statement of Compliance Level:

  • The guideline has been fully implemented for the needs of the repository.

User Documentation:

User documentation, processes, and procedures are available on the TDR’s Atlassian wikispace:

https://texasdigitallibrary.atlassian.net/wiki/spaces/TDRUD/overview

This wikispace tracks changes over time so users can access and compare processes and procedures as the TDR evolves.

...

The TDL has an official backup strategy in which TDL retains:

  • the copy of the data residing on the production server, which is an Amazon S3 volume;

  • nightly snapshots that can be used to restore the entire service to a particular date within the preceding month;

  • and one snapshot from each month, retained for one year.

    Snapshot backups are stored in Amazon Elastic Block Store (EBS) snapshots, which is replicated storage with regular systematic data integrity checks. Information Security
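    The retention rule above (nightly snapshots for the preceding month, one snapshot per month for a year) can be sketched as follows. This is an illustration only, not the TDL's tooling: the function name, the 31-day and 365-day windows, and the choice of the earliest snapshot in each calendar month as the monthly one are all assumptions.

```python
from datetime import date, timedelta
from typing import Iterable, List

NIGHTLY_WINDOW = timedelta(days=31)    # assumption: "preceding month" is about 31 days
MONTHLY_WINDOW = timedelta(days=365)   # assumption: "one year" is about 365 days


def snapshots_to_keep(snapshot_dates: Iterable[date], today: date) -> List[date]:
    """Return the snapshot dates that the sketched policy would retain."""
    keep = set()
    months_seen = set()
    for d in sorted(snapshot_dates):
        age = today - d
        if age < timedelta(0):
            continue  # ignore snapshots dated in the future
        if age <= NIGHTLY_WINDOW:
            keep.add(d)  # nightly snapshot still inside the one-month window
        # assumption: the earliest snapshot of each calendar month is the monthly one
        month_key = (d.year, d.month)
        if month_key not in months_seen:
            months_seen.add(month_key)
            if age <= MONTHLY_WINDOW:
                keep.add(d)
    return sorted(keep)


if __name__ == "__main__":
    today = date(2020, 6, 15)
    nightly = [today - timedelta(days=n) for n in range(400)]
    kept = snapshots_to_keep(nightly, today)
    print(len(kept), "snapshots retained out of", len(nightly))
```

    A real deployment would normally rely on the storage provider's own lifecycle configuration rather than custom code; the sketch only makes the two retention windows explicit.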

    Risk Management:

    The TDL actively addresses the need to ensure the accuracy, integrity, authenticity, and permanence of the digital content that it manages, as well as the security of the services and platforms that it provides. The TDL ensures the security of its Dataverse instance as follows:

  • System Security

  • Data Integrity

  • Regulatory and Legal Considerations

    Information Security

    Consistency across Archival Copies:

    Checksums are generated for data files upon ingest (UNF for tabular data, MD5 for other files). Datasets are assigned persistent URLs and DOIs. Changes made to a dataset create new versions. The DOI always resolves to the most current published version.

    Storage Media Monitoring:

    Amazon Web Services is responsible for monitoring the status of the servers.

    https://aws.amazon.com/compliance/data-center/controls/

...

10. Preservation Plan

Digital Object Management

R10. The repository assumes responsibility for long-term preservation and manages this function in a planned and documented way.

Statement of Compliance Level:

  • The guideline has been fully implemented for the needs of the repository.

The Texas Data Repository’s preservation plan is outlined in the Digital Preservation and Security section of the TDR’s policies. The Texas Digital Library “accepts the responsibility to preserve and provide access to research data, including associated metadata and documentation that is properly deposited in the Texas Data Repository. This responsibility includes the provision of digital means to preserve and ensure ongoing access to said content for a minimum of ten years after it is deposited in Dataverse.” https://texasdigitallibrary.atlassian.net/wiki/spaces/TDRUD/pages/291635428/Digital+Preservation+Policy

...

An AIP is the set of content and metadata managed by a preservation repository, and organized in a way that allows the repository to perform preservation services. Upon ingest, the TDR uses the tool JHOVE to identify aspects of file formats. Provenance data is provided through the depositor and his/her affiliated institution. Amazon Web Services performs both fixity and cyclic redundancy checks. Backup plans and access rights information are all documented. https://texasdigitallibrary.atlassian.net/wiki/spaces/TDRUD/pages/291635428/Digital+Preservation+Policy

https://aws.amazon.com/s3/faqs/#Durability_.26_Data_Protection

...

11. Data Quality

Digital Object Management

R11. The repository has appropriate expertise to address technical data and metadata quality and ensures that sufficient information is available for end users to make quality-related evaluations.

Statement of Compliance Level:

  • The guideline has been fully implemented for the needs of the repository.

The TDR does not review datasets before the sets are uploaded and then subsequently published. As such, the TDR does not endorse, take responsibility for, or make any representations or warranties for any user uploads. Terms of Use

...

And members can always provide feedback on any aspect of the TDR to their institution’s designated liaison.

...

12. Workflow

Digital Object Management

R12. Archiving takes place according to defined workflows from ingest to dissemination.

Statement of Compliance Level:

  • The guideline has been fully implemented for the needs of the repository.

Workflows/Process Descriptions:

...

For example, one can find, among others, the Service Level Agreement, Terms of Use, User Guide, Policies, and FAQs all in one location: https://texasdigitallibrary.atlassian.net/wiki/spaces/TDRUD/overview

These Confluence pages include page history functionality, so a user can track changes to workflows over time. The Service Level Agreement also states that the TDL will “provide timely reporting to data repository liaisons regarding any system issues.” TDR MOUs and SLAs

...

The repository does make appraisal decisions with respect to long-term preservation (after the 10-year minimum period). The selection criteria are listed in the Digital Preservation Policy section. https://texasdigitallibrary.atlassian.net/wiki/spaces/TDRUD/pages/291635428/Digital+Preservation+Policy

Checksums are generated for data upon ingest. After download, users can generate their own checksums and compare.
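A user-side comparison of this kind might look like the minimal sketch below. It assumes the file is a non-tabular file for which an MD5 value is published; it does not compute UNF values for tabular data. The function names and the way the published checksum is obtained are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_md5: str) -> bool:
    """Compare a local file's MD5 with the checksum shown by the repository."""
    return md5_of_file(path) == published_md5.strip().lower()
```

A mismatch would suggest that the downloaded copy differs from the file as ingested, for example because the download was truncated or the file was modified afterwards.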

...

13. Data Discovery and Identification

Digital Object Management

R13. The repository enables users to discover the data and refer to them in a persistent way through proper citation.

Statement of Compliance Level:

  • The guideline has been fully implemented for the needs of the repository.

The Texas Data Repository has a search box on the left side of the screen that searches within the dataverse. There is also an advanced search feature available that provides faceted metadata searching.

...

o Dainer-Best, Justin, 2018, "Replication data and materials for Positive Imagery Training Increases Positive Self-Referent Cognition in Depression", doi:10.18738/T8/RHEMGW, Texas Data Repository Dataverse, V3, UNF:6:FgY50+UEDA/95sPKids5WA==
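The `doi:` identifier in a citation like the one above can be turned into a resolvable link through the public doi.org resolver. The helper below is a minimal sketch; the function name `doi_to_url` is hypothetical and is not part of the repository software.

```python
def doi_to_url(doi: str) -> str:
    """Turn a 'doi:'-prefixed identifier into a https://doi.org/ link."""
    identifier = doi.strip()
    if identifier.lower().startswith("doi:"):
        identifier = identifier[4:]  # drop the "doi:" scheme prefix
    return f"https://doi.org/{identifier}"


if __name__ == "__main__":
    print(doi_to_url("doi:10.18738/T8/RHEMGW"))
```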

...

14. Data Reuse

...

R14. The repository enables reuse of the data over time, ensuring that appropriate metadata are available to support the understanding and use of the data.

Statement of Compliance Level:

  • The guideline has been fully implemented for the needs of the repository.

The Texas Data Repository requires a minimum of 9 metadata fields to be completed before a dataset can be uploaded and subsequently published. Those fields are:

...

TDR policies and the Terms of Use agreement require depositors to grant to the repository all necessary permissions and required licenses to make the content [the depositor] submit[s] or deposit[s] available for archiving, preservation and access, within the site. This includes permission to “store, translate, copy or re-format the content in any way to ensure its future preservation and accessibility, and improve usability and/or protect respondent confidentiality.” Terms of Use

...