Roadmap

Public TDR Roadmap and Annual Activities

This is a living document that outlines the TDR service development and enhancement priorities of the TDR Steering Committee and the TDL.

2024-25 Priorities

User-facing resources

  • Video tutorials
    • Finding and downloading data
    • README
    • Using archive/zip files
    • How to update metadata
    • How to use bulk file uploader tools
    • How to create thumbnails
    • Tips on searching the TDR dataverse
    • Uploading folders
    • Dataset templates
    • Best practices for research data management
  • Documentation in Confluence
    • APIs (for researchers)
    • RAQ/FAQs for TDR SC liaisons on Confluence
  • Format Management Best Practices

Data retention

  • Phase 3
    • Establish criteria for deaccessioning
    • Develop tools for implementation
  • Phase 4
    • Propose a policy and workflow
    • Make recommendations to TDR Steering Committee and Harvard Dataverse

Larger data

Sensitive data & Data curation

  • ON HOLD for this year
  • Sensitive Data Documentation requests:
    • Sensitive data recommendations and guidance (general)
    • Tips for anonymization/de-identification
    • Collected institutional guidance for handling data types

2023-24 Priorities

Data retention

  • Review proposed Dataverse documentation about data retention handling in the system
  • Recommend data retention and deaccessioning strategies for the TDR SC

Sensitive data 

  • Gather, review and summarize resources about managing sensitive data in repositories and share them on the TDR wiki
    • Resources may include
      • Workarounds applied by other Dataverse users
      • Options for depositors with sensitive data to de-identify or use a workaround 
      • Policy examples (e.g., "share metadata only")
      • Data tags
  • Recommend sensitive data strategies for the TDR SC

Larger data

2022-23 Priorities

Sensitive data 

  • Change the structure of TDR to accomplish this. The TDR SC can make recommendations for the TDL roadmap; note that implementation depends on TDL's budget and staff availability
  • Create workarounds
    • Create a flowchart for depositors with sensitive data with different options to de-identify or use a workaround
    • Policy - for example, workarounds like 'just share metadata'
  • Data tags
    • Collect and share resources, review and summarize
    • Determine TDR recommendations
  • See also https://guides.dataverse.org/en/latest/developers/big-data-support.html

Larger data




Deprecated as of May 2022

Outreach & Training

  • Outreach to Dataverse software adopters. Use the data provided to the TDR liaison in reports to reach out (thank-you notes, etc.) to researchers to strengthen relationships and to recruit.
  • Create targeted, discipline-specific guidance, metadata recommendations and DMPTool templates for researchers
  • Documentation and training materials refresh (2021-22) (especially for the adjectival changes to 'dataverse')
  • Annual activities:
    • Present on/promote the repository at a minimum of 3 conferences (disciplinary and library)
      • Add annual info about this to the annual report template
    • Webinar series or in-person event
      • Continue webinar series (approximately 2 per term)
  • Carpentries cohort continuance (branding, future training, etc.)

Larger datasets 

Datasets >10GB

  • Define and document recommended data limits per institution
    • TDR WG creating policy and procedures
  • Consider payment system for more space

Assessment

  • User-facing assessment survey
  • Complete CoreTrustSeal certification - ongoing
  • Google Analytics
  • Annual activities:
    • TCDL report on usage
  • Carpentries Pilot Report (2021) - by Needs Assessment group of the Carpentries Cohort and Leads

Accessibility

  • Policy guidance for audio/visual material
  • TDL is working on platform accessibility audits in the coming year, creating and sharing VPATs, and future work

Curation

  • DMPTool - has anyone reviewed it since it was redone? (Research Outputs is now a separate tab, and you can register the DMP with a researcher ID such as ORCID)
  • Investigate existing curation tools 

System Integrations

  • Ongoing
    • R
    • GeoBlacklight
    • Previewers
  • Consider and prioritize
    • ORCID - https://github.com/IQSS/dataverse/issues/4236
    • OJS - cite the growing number of journal datasets in Dataverse; consider engaging PKP
    • DSpace - await stable release (linking publications with their data)
    • Vireo - await stable release
    • Software containers
      • WholeTale or Code Ocean
      • Docker
    • OSF (follow the lead of the UVA work)

Other

  • Sensitive data
    • Carpentries curriculum building


Deprecated Public TDR Roadmap and Annual Activities

This is a living document that outlines the TDR service development and enhancement priorities of the TDR Steering Committee and the TDL.

Outreach & Training

  • Outreach to Dataverse software adopters. Use the data provided to the TDR liaison in reports to reach out (thank-you notes, etc.) to researchers to strengthen relationships and to recruit.
  • Expand TDR engagement to include more than one staff person at each institution. Establish a mechanism for TDR SC members to assist each other with outreach projects and do collective outreach - Done
  • Create targeted, discipline-specific guidance, metadata recommendations and DMPTool templates for researchers
  • Annual activities:
    • Present on/promote the repository at a minimum of 3 conferences (disciplinary and library)
      • Add annual info about this to the annual report template
    • Webinar series or in-person event (Carpentries Project team in 2021)
      • Continue webinar series (approximately 2 per term)

Larger datasets 

Datasets >10TB

  • Define and document recommended data limits per institution
    • TDR WG creating policy and procedures
  • Develop at least one Remote Storage Agent for larger datasets - Done
    • pilot at TACC/UTLibs - Done
    • goal of one complete and one in progress in FY2021 - Done and ongoing at UT
  • Consider payment system for more space

Assessment

  • Complete CoreTrustSeal certification - ongoing
  • Review reports against needs assessment report
  • Google Analytics
  • Annual activities:
    • TCDL report on usage

Accessibility

  • Policy guidance for audio/visual material

System Integrations

  • Consider and prioritize
    • ORCID
    • OSF (follow the lead of the UVA work)
    • Vireo - await stable release
    • OJS - cite the growing number of journal datasets in Dataverse; consider engaging OJS group
    • DSpace - await stable release
    • GeoBlacklight 
    • DMPTool
    • WholeTale or Code Ocean
    • R - Capstone project
    • Docker or other software containers



DEPRECATED DEVELOPMENT ROADMAP BELOW

A * indicates work towards the goal in FY2017-18.

A # indicates work towards the goal in FY2018-19.

FY 2019/20 completed:

  • Created Assessment and Training & Outreach Standing Committees
  • Establishing a Project Management System (and file naming convention) for our work
  • Confluence documentation tidy-up and re-org including DMPTool language and policy review
  • Designed a new TDR logo
  • Presented on/promoted the repository at several conferences (disciplinary and library)
  • Linked to Dataverse user guides in Confluence TDR user documentation where Harvard’s are more robust 
  • Developing Onboarding Materials for new Steering Committee members
  • Improved assessment tools and reporting
  • Followed Dataverse development towards Making Data Count compliance - implementing Summer 2020 with upgrade to 4.20
  • Assessment - TCDL report on usage
  • Hosted 5 speakers for our Webinar Series


Training researchers

  • Develop training curriculum (data sharing generally and dataverse-specific)*
  • Share resources/slides/experiences*#
  • Create research guide for reuse by TDR institutional members#
  • Create discipline-specific guidance and templates
  • Create more in-depth user guides and link to Dataverse user guides where Harvard’s are more robust#

Training librarians

  • Dataverse training to develop expertise*#
  • Good data sharing practices and issues that may arise, e.g. rights issues, privacy and restricted data, proper citation.*#
  • Develop training curriculum
    • Workshops#
    • Videos*#
    • Open forum for discussion and questions, platform TBD (e.g., Slack, Google Groups)*

Assessment

  • Needs assessment - researchers in institutions
  • Measure the impact of data reuse
  • Audit for optimum repository functionality
  • Identify good assessment tools*#
  • Identify dataverse use metrics*#
  • Data curation and best practices in curation (TXState)

Compile tools and marketing materials for users

  • Collect links to members' existing resources*#
  • Put resources in an open and accessible location*#
  • Create mechanism for member feedback and submission#
  • Compile plans for early-adopter thank you/welcome messages
  • Strategies for intra-campus partnerships (e.g. with grant networks, sponsored projects)#
  • Connect funder requirements with the TDR service. Create language regarding the TDR services to use in data management plans (and for DMPTool templates)*

Support for different use cases

  • Generate/provide example use cases to foresee technical and training needs at different scales (individual researchers, labs, centers, institutes) and for different user abilities.#
  • Develop a method for gathering/capturing use cases, including when and why users chose not to participate*#

Large datasets

  • Watch for Dataverse functionality for integrating with e.g., TACC*#
  • Payment system for more space?

Inter-institutional collaborations

  • Account requests for non-TDL users collaborating with researchers at TDL member institutions, possibly with limited permissions to specific sub-dataverses*
  • Expand third-party authentication capabilities (OAuth2 Google login)*
  • Work with IQSS at Harvard to refine access to specific Institutional and sub-dataverses#

Accessibility

  • Accessibility assessment#
  • Policy guidance for audio/visual material

Refine system functionality and documentation over time

  • Provide feedback to Dataverse Community*#
  • PII and HIPAA#
  • Integrations
    • ORCID
    • OSF (follow the lead of the UVA work)
    • Vireo
    • OJS
    • DSpace
    • Chronopolis#

Present on/promote the repository at disciplinary conferences (not just library conferences)