Roadmap
Public TDR Roadmap and Annual Activities
This living document outlines the TDR service development and enhancement priorities of the TDR Steering Committee and the TDL.
2024-25 Priorities
User-facing resources
- Video tutorials
- Finding and downloading data
- README
- Using archive/zip files
- How to update metadata
- How to use bulk file uploader tools
- How to create thumbnails
- Tips on searching the TDR dataverse
- Uploading folders
- Dataset templates
- Best practices for research data management
- Documentation in Confluence
- APIs (for researchers)
- RAQ/FAQs for TDR SC liaisons on Confluence
- Format Management Best Practices
Data retention
- Phase 3
- Establish criteria for deaccessioning
- Develop tools for implementation
- Phase 4
- Propose a policy and workflow
- Make recommendations to TDR Steering Committee and Harvard Dataverse
Larger data
- Recommendations document: https://docs.google.com/document/d/1FWQAUcc6U5k4q2uIYxXRFQtxi9ke6l0Xhb8PY9BsBKI/edit
- (see also https://guides.dataverse.org/en/latest/developers/big-data-support.html)
Sensitive data & Data curation
- ON HOLD for this year
- Sensitive Data Documentation requests:
- sensitive data recommendations and guidance (general)
- tips for anonymizing/deidentification
- collected institutional guidance for handling of data types
2023-24 Priorities
Data retention
- Review proposed Dataverse documentation about data retention handling in the system
- Recommend data retention and deaccessioning strategies for the TDR SC
Sensitive data
- Gather, review and summarize resources about managing sensitive data in repositories and share them on the TDR wiki
- Resources may include
- Workarounds applied by other Dataverse users
- Options for depositors with sensitive data to de-identify or use a workaround
- Policy examples (e.g., "share metadata only")
- Data tags
- Recommend sensitive data strategies for the TDR SC
Larger data
- (see also https://guides.dataverse.org/en/latest/developers/big-data-support.html)
- TDR WG to create larger data service policy and procedures
- Consider payment system for more space
- Connect with TACC, who is currently experimenting with improvements on the proof of concept application
2022-23 Priorities
Sensitive data
- Change the structure of TDR to accomplish - TDR SC can make recommendations for the TDL roadmap. Note that implementation is based on TDL's budget and staff availability
- Create workarounds
- Create a flowchart for depositors with sensitive data with different options to de-identify or use a workaround
- Policy - for example, workarounds like 'just share metadata'
- Data tags
- collect and share resources, review and summarize
- determine TDR recommendations
- See also https://guides.dataverse.org/en/latest/developers/big-data-support.html
Larger data
- (see also https://guides.dataverse.org/en/latest/developers/big-data-support.html)
Datasets >10GB
- Provide workaround options
- Flowchart for large data (like break down data into smaller units)
- ?Define and document recommended data limits per institution
- TDR WG creating policy and procedures
- ?Consider payment system for more space
deprecated as of May 2022
Outreach & Training
- Outreach to Dataverse software adopters. Use the data provided to the TDR liaison in reports to reach out (thank you notes, etc) to researchers to strengthen relationships and to recruit.
- Create targeted, discipline-specific guidance, metadata recommendations and DMPTool templates for researchers
- Documentation and training materials refresh (2021-22) (especially for the adjectival changes to 'dataverse')
- Annual activities:
- Present on/promote the repository at a minimum of 3 conferences (disciplinary and library)
- add annual info about this to annual report template
- Webinar series or in-person event
- continue Webinar series (roughly 2 per term)
- Carpentries cohort continuance (branding, future training, etc)
Larger datasets
Datasets >10GB
- Define and document recommended data limits per institution
- TDR WG creating policy and procedures
- Consider payment system for more space
Assessment
- User facing assessment survey
- Complete CoreTrustSeal certification - ongoing
- Google Analytics
- Annual activities:
- TCDL report on usage
- Carpentries Pilot Report (2021) - by Needs Assessment group of the Carpentries Cohort and Leads
Accessibility
- Policy guidance for audio/visual material
- TDL is working on platform accessibility audits in the coming year, creating and sharing VPATs, and future work
Curation
- DMPTool - has anyone reviewed it since it was redone? (Research Outputs is a separate tab, and you can register the DMP with a researcher ID: ORCID, etc.)
- Investigate existing curation tools
- Dashboard
- Analyze the content from Data Curation Network (DCN)
- Previewers - learn more and document
- GeoBlacklight
- Others in the Dataverse Community
- ORCiD - https://github.com/IQSS/dataverse/issues/4236
System Integrations
- Ongoing
- R
- GeoBlacklight
- Previewers
- Consider and prioritize
- ORCiD - https://github.com/IQSS/dataverse/issues/4236
- OJS - cite growing number of journal datasets in Dataverse - consider engaging PKP
- DSpace - await stable release (linking publications with their data)
- Vireo - await stable release
- Software containers
- WholeTale or Code Ocean
- Docker
- OSF (follow the lead of the UVA work)
Other
- sensitive data
- Carpentries curriculum building
deprecated Public TDR Roadmap and Annual Activities
Outreach & Training
- Outreach to Dataverse software adopters. Use the data provided to the TDR liaison in reports to reach out (thank you notes, etc) to researchers to strengthen relationships and to recruit.
- Expand TDR engagement to include more than one staff person at each institution. Establish mechanism for TDR SC to assist each other with outreach projects and do collective outreach – done
- Create targeted, discipline-specific guidance, metadata recommendations and DMPTool templates for researchers
- Annual activities:
- Present on/promote the repository at a minimum of 3 conferences (disciplinary and library)
- add annual info about this to annual report template
- Webinar series or in-person event (Carpentries Project team in 2021)
- continue Webinar series (roughly 2 per term)
Larger datasets
Datasets >10TB
- Define and document recommended data limits per institution
- TDR WG creating policy and procedures
- Develop at least one Remote Storage Agent for larger datasets - Done
- pilot at TACC/UTLibs - Done
- goal of one complete and one in progress in FY2021 - Done and ongoing at UT
- Consider payment system for more space
Assessment
- Complete CoreTrustSeal certification - ongoing
- Review reports against needs assessment report
- Google Analytics
- Annual activities:
- TCDL report on usage
Accessibility
- Policy guidance for audio/visual material
System Integrations
- Consider and prioritize
- ORCiD
- OSF (follow the lead of the UVA work)
- Vireo - await stable release
- OJS - cite growing number of journal datasets in Dataverse - consider engaging OJS group
- DSpace - await stable release
- GeoBlacklight
- DMPTool
- WholeTale or Code Ocean
- R - Capstone project
- Docker or other software containers
DEPRECATED DEVELOPMENT ROADMAP BELOW
A * indicates work towards the goal in FY2017-18.
A # indicates work towards the goal in FY2018-19.
FY 2019/20 completed:
- Created Assessment and Training & Outreach Standing Committees
- Establishing a Project Management System (and file naming convention) for our work
- Confluence documentation tidy-up and re-org including DMPTool language and policy review
- Designed a new TDR logo
- Presented on/promoted the repository at several conferences (disciplinary and library)
- Linked to Dataverse user guides in Confluence TDR user documentation where Harvard’s are more robust
- Developing Onboarding Materials for new Steering Committee members
- Improved assessment tools and reporting
- Followed Dataverse development towards Making Data Count compliance - implementing Summer 2020 with upgrade to 4.20
- Assessment - TCDL report on usage
- Hosted 5 speakers for our Webinar Series
Training researchers
- Develop training curriculum (data sharing generally and dataverse-specific)*
- Share resources/slides/experiences*#
- Create research guide for reuse by TDR institutional members#
- Create discipline-specific guidance and templates
- Create more in-depth user guides and link to Dataverse user guides where Harvard’s are more robust #
Training librarians
- Dataverse training to develop expertise*#
- Good data sharing practices and issues that may arise, e.g. rights issues, privacy and restricted data, proper citation.*#
- Develop training curriculum
- Workshops#
- Videos*#
- Open forum for discussion and questions TBD e.g. Slack, Google groups.*
Assessment
- Needs assessment - researchers in institutions
- Measure the impact of data reuse
- Audit for optimum repository functionality
- Identify good assessment tools*#
- Identify dataverse use metrics*#
- Data curation and best practices in curation (TXState)
Compile tools and marketing materials for users
- Collect links to members' existing resources*#
- Put resources in an open and accessible location*#
- Create mechanism for member feedback and submission#
- Compile plans for early-adopter thank you/welcome messages
- Strategies for intra-campus partnerships (e.g. with grant networks, sponsored projects)#
- Connect funder requirements with the TDR service. Create language regarding the TDR services to use in data management plans (and for DMPTool templates)*
Support for different use cases
- Generate/provide example use cases to foresee technical and training needs at different scales (individual researchers, labs, centers, institutes) and for different user abilities.#
- Develop method for gathering/capturing use cases, including when and why users chose not to participate*#
Large datasets
- Watch for Dataverse functionality for integrating with e.g., TACC*#
- Payment system for more space?
Inter-institutional collaborations
- Account request for non-TDL users collaborating with researchers at TDL member institutions, possibly with limited permissions to specific Sub-Dataverses*
- Expand third-party authentication capabilities (OAuth2 Google login)*
- Work with IQSS at Harvard to refine access to specific Institutional and sub-dataverses#
Accessibility
- Accessibility assessment#
- Policy guidance for audio/visual material
Refine system functionality and documentation over time
- Provide feedback to Dataverse Community*#
- PII and HIPAA#
- Integrations
- ORCiD
- OSF (follow the lead of the UVA work)
- Vireo
- OJS
- DSpace
- Chronopolis#
Present on/promote the repository at disciplinary conferences (not just library conferences)
- See dissemination resources for librarians: Librarians - Presentation and Resources