A friend’s daughter is tasked with developing a Web-accessible archive for a multi-year collection of material that has been generated by an organization within a university. All of the material will be public, so there are no security issues and everything can be indexed by search engines. Ideally all of this can be maintained by non-programmers from Web browsers and minimal technical effort will be required for setup (though perhaps some programming would be useful/needed for an ingestion step).
The material is a mixture of PDFs, images, text, etc. She found some interesting software targeted at this very problem. Examples:
- CollectiveAccess (free open-source cataloging software designed for museums)
- PastPerfect ($1,245/year; neither free nor open-source)
- ArchivesSpace (open-source, but not free?)
All of these provide for comprehensive tagging of each item, Boolean searches, etc. But I wonder/worry that these are overkill. The collection is not especially valuable and I don’t know if people will want to take the trouble to craft elaborate queries.
I was thinking that she might be better off using standard WordPress. Every item in the archive can become a WordPress post dated whenever the item was created (maybe this can be done via a batch process inserting things into the WordPress tables). She and anyone else involved in the project can tag items with however many tags make sense. At that point users can:
- search with Google
- search by date (WordPress lets you go back and look at posts by date)
- search by tag
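For the batch-ingestion step, here is a rough sketch of what back-dating items as posts might look like. It goes through the WordPress REST API (`/wp-json/wp/v2/posts`) rather than writing to the database tables directly, which is my substitution and not something I have tested against her site. The `ArchiveItem` class, the function names, the noon timestamp, and the tag-name-to-ID mapping are all hypothetical; it also assumes each file has already been uploaded somewhere and has a URL, and that the site accepts an application password over HTTP Basic auth.

```python
import base64
import html
import json
import urllib.request
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ArchiveItem:
    """Hypothetical record for one archived item (not a WordPress type)."""
    title: str
    created: date          # when the item was originally created
    file_url: str          # assumed: the file is already uploaded somewhere
    tags: list = field(default_factory=list)


def build_post_payload(item, tag_ids):
    """Build the JSON body for a new post, back-dated to the item's creation date.

    tag_ids maps tag names to existing WordPress tag IDs; the posts
    endpoint takes tag IDs, so names without an ID are skipped here.
    """
    link = '<p><a href="{}">{}</a></p>'.format(
        html.escape(item.file_url, quote=True), html.escape(item.title)
    )
    return {
        "title": item.title,
        "content": link,
        # Noon is an arbitrary choice; only the day is known for most items.
        "date": item.created.isoformat() + "T12:00:00",
        "status": "publish",
        "tags": [tag_ids[name] for name in item.tags if name in tag_ids],
    }


def build_request(site, user, app_password, payload):
    """Prepare (but do not send) the POST request that would create the post."""
    token = base64.b64encode(f"{user}:{app_password}".encode()).decode()
    return urllib.request.Request(
        site.rstrip("/") + "/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually create the posts, one would loop over the items and pass each prepared request to `urllib.request.urlopen`. Keeping the payload-building separate from the network call makes it easy to do a dry run over the whole collection and eyeball the dates and tags first.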
One advantage of WordPress over the above systems that are built for archiving is that WordPress is much more popular and constantly being improved (changed, anyway!). There are plugin modules available, e.g., to improve full-text searching through PDFs. For those who already have a museum collection organized, there is even a “Culture Object” plugin that is designed to import a collection into WordPress.
Readers: Better ideas?