Henry Kissinger Project – Ingest Statistics

This is a brief update with ingest statistics for the Henry Kissinger project. Once digitized, the collection will contain approximately 1,700,000 digital objects from approximately 12,800 folders.

The ingest process includes both manual and automated steps. The Digital Library Programming group is responsible for the automated steps, which essentially involve creating a Ladybird object and then publishing that object to Hydra. At this time, all objects are being ingested in a manner that prevents them from being exposed in the public Hydra interface (FindIT.library.yale.edu). The plan is to “turn on” the collection all at once, which is a better approach when a collection is very large and very complex. If materials were instead made available a little at a time, in what would sometimes seem like a random order, researchers might have a difficult time using the collection.

As of Feb 16:

  • 339,041 – the number of objects ingested into Hydra
  • 4,266 – the number of folders ingested out of approximately 12,800
  • 7 – the number of digital files that make up an object in Hydra
  • 2,377,553 – the actual number of files ingested into Hydra
  • 792,655 – total objects ingested into Hydra
  • 5,548,585 – total number of files currently in Hydra
  • 10.856 seconds – the average time it takes an object to ingest into Hydra
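
The counts above can be cross-checked with a few lines of arithmetic. The sketch below multiplies each object count by the seven files per object and compares the result with the reported file count. The variable and function names are illustrative only and do not come from Ladybird or Hydra.

```python
FILES_PER_OBJECT = 7  # files that make up one object in Hydra


def expected_files(objects: int, files_per_object: int = FILES_PER_OBJECT) -> int:
    """Number of files implied by an object count at a fixed files-per-object ratio."""
    return objects * files_per_object


# Figures as of Feb 16 (names are illustrative)
objects_ingested = 339_041
files_ingested = 2_377_553
total_objects = 792_655
total_files = 5_548_585

print(expected_files(objects_ingested))                   # 2373287
print(files_ingested - expected_files(objects_ingested))  # 4266
print(total_files - expected_files(total_objects))        # 0
```

The totals match the seven-file ratio exactly. The first pair differs by 4,266, which happens to equal the number of folders ingested; reading that as one additional file per folder is only a guess and should be treated as an assumption.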

The last statistic is the one we focus on the most. At the current rate, ingesting the entire collection would take approximately 213 days. For each tenth of a second that the average changes, the completion time shifts by roughly 47 hours (1,700,000 objects × 0.1 seconds). If ingest were to suddenly start taking 11.8 seconds per object, the approximate completion time would rise to 232 days.
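
The projection is simple enough to reproduce. The sketch below assumes that objects are ingested one at a time, around the clock, and that the estimate covers all 1,700,000 objects rather than only those remaining. These assumptions and the function names are ours for illustration, not a description of how the ingest pipeline actually schedules work.

```python
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400

TOTAL_OBJECTS = 1_700_000  # approximate size of the collection


def days_to_ingest(total_objects: int, seconds_per_object: float) -> float:
    """Days needed to ingest every object serially at a constant average rate."""
    return total_objects * seconds_per_object / SECONDS_PER_DAY


def hours_per_tenth_second(total_objects: int) -> float:
    """Change in total ingest time, in hours, when the average shifts by 0.1 s."""
    return total_objects * 0.1 / SECONDS_PER_HOUR


print(round(days_to_ingest(TOTAL_OBJECTS, 10.856), 1))   # 213.6
print(round(days_to_ingest(TOTAL_OBJECTS, 11.8), 1))     # 232.2
print(round(hours_per_tenth_second(TOTAL_OBJECTS), 1))   # 47.2
```

Because the total scales linearly with the per-object time, even small changes in the average add up to days across a collection of this size.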

Hydra Project Milestone: Automation

The Digital Library & Programming group is pleased to announce that we’ve hit a major milestone in the development of the Hydra digital repository system at YUL. Communication and syncing between repository system components became fully automated at the end of September 2014. This automation applies not just to work on the infrastructure built for the Kissinger papers, but to all Ladybird/Hydra interaction.

Automation like this allows metadata and objects to travel within the Hydra system without manual intervention, which in turn allows Library IT to focus more intensely on structural and workflow development. As a Project Hydra partner, Yale is now in a position to share this work with the Project Hydra community and to empower those members to scale up their own repository ingest services.