Web Archiving Texas Resources
- UK National Archives - Automate Web Archiving Quality Assurance Without a Programmer: https://blog.nationalarchives.gov.uk/automate-web-archiving-quality-assurance-without-programmer/
- International Internet Preservation Consortium (IIPC) - Its website contains several useful resources, including introductory material about web archiving, content about legal issues, case studies, reports and presentations, a bibliography (current only through 2009), collection development policies, and a page listing tools and software.
- The IIPC also maintains the Awesome Web Archiving list: https://github.com/iipc/awesome-web-archiving
- The University of North Texas (UNT) Libraries are well known for their robust web archiving program. You can learn about their policies, processes, and projects, including their participation in the End of Term project, on their website (current up to 2014, with a condensed tools list).
- The Community Owned Digital Preservation Tools Registry (COPTR) offers links to several web archiving tools for crawling as well as static capture, though it is neither complete nor regularly maintained.
- The Internet Archive hosts the largest collection of archived web content and is also the home of Archive-It.
- The United States' Library of Congress offers an overview of its web archiving program as well as collection development policy documentation and format recommendations via https://www.loc.gov/programs/web-archiving/about-this-program/.
- The UK's National Archives offers web archiving guidance (2011) as well as a website with advice for website creators and managers on making their content more capturable.
- Wikipedia maintains a list of web archiving initiatives worldwide.
- Contemporary tools and resources excluded from some of the resource lists above:
- Webrecorder is a web archiving service anyone can use for free to save web pages as standard WARC files (see the warcio sketch after this list): https://webrecorder.io/
- Social Feed Manager is open source software that harvests social media data and web resources from Twitter, Tumblr, Flickr, and Sina Weibo. https://gwu-libraries.github.io/sfm-ui/
- Documenting the Now develops tools and builds community practices that support the ethical collection, use, and preservation of social media content: https://www.docnow.io/
- Conifer is a web archiving service that creates an interactive copy of any web page that you browse, including content revealed by your interactions such as playing video and audio, scrolling, clicking buttons, and so forth. https://conifer.rhizome.org/
- ArchiveSocial connects directly to social networks such as Facebook and Twitter to capture and preserve the content on your pages, ensuring you can easily respond to records requests.
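Webrecorder and Conifer both save captures as standard WARC files that you can download and inspect locally. As a minimal sketch of that kind of inspection, the following Python snippet lists the captured URLs in a WARC using the warcio library (maintained by the Webrecorder project); the file name example.warc.gz is a placeholder for your own download.

```python
# Minimal sketch: list the captured URLs in a WARC downloaded from
# Webrecorder/Conifer. Requires `pip install warcio`; the file name
# below is a placeholder for your own capture.
from warcio.archiveiterator import ArchiveIterator

with open("example.warc.gz", "rb") as stream:
    for record in ArchiveIterator(stream):
        # 'response' records hold the archived server responses;
        # request, metadata, and warcinfo records are skipped here.
        if record.rec_type == "response":
            uri = record.rec_headers.get_header("WARC-Target-URI")
            status = record.http_headers.get_statuscode() if record.http_headers else ""
            print(status, uri)
```

Because WARC is a standard format, the same pattern works for WARCs downloaded from Archive-It or produced by crawlers such as Heritrix.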
- Quality Control Documentation:
- NYARC https://sites.google.com/site/nyarc3/web-archiving/quality-assurance
- Archive-It Quality Assurance Overview: https://support.archive-it.org/hc/en-us/articles/208333833-Quality-Assurance-Overview
- Collaborative Web Archiving
- Ivy Plus Libraries Confederation https://library.columbia.edu/collections/web-archives/Ivy_Plus_Libraries.html
Policies
- University of Texas at San Antonio (UTSA) web archives policy: https://lib.utsa.edu/specialcollections/sites/specialcollections/files/2020-09/WebArchives_Policy_2020-08-20.pdf
- Texas A&M University (TAMU) Web Archiving Methods and Collection Guidelines
WARC Preservation
- UNT's py-wasapi-client, a command-line client for downloading WARCs via the WASAPI data transfer API that Archive-It (AIT) supports (see the sketch after this list): https://github.com/unt-libraries/py-wasapi-client
- For safekeeping: An automated preservation workflow for Archive-It content, Adriane Hanson, University of Georgia, from the 2020 Archive-It Partner Meeting [slides] [recording]
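For context on what py-wasapi-client wraps: Archive-It exposes WARC downloads through the WASAPI data transfer API, and the client handles the querying and downloading for you. The sketch below is a rough illustration of querying that API directly with Python's requests library; the collection ID and credentials are placeholders, and the endpoint URL reflects Archive-It's documented WASAPI service.

```python
# Minimal sketch of listing downloadable WARCs from Archive-It's WASAPI
# endpoint with the `requests` library. The collection ID and credentials
# below are placeholders; in practice py-wasapi-client handles this
# query plus the actual file downloads.
import requests

WASAPI_URL = "https://warcs.archive-it.org/wasapi/v1/webdata"  # Archive-It WASAPI endpoint
AUTH = ("your-archive-it-username", "your-password")           # placeholder credentials

response = requests.get(
    WASAPI_URL,
    params={"collection": 12345},  # placeholder collection ID
    auth=AUTH,
)
response.raise_for_status()

for webdata_file in response.json().get("files", []):
    # Each entry describes one WARC file: its name, size, checksums,
    # and one or more download locations.
    print(webdata_file["filename"], webdata_file["locations"][0])
```

Downloading and checksum verification can then be scripted against each file's listed location, which is the part the UNT client and the University of Georgia workflow above automate.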