This is just a brief update to offer some ingest statistics related to the Henry Kissinger project. The digitized project will contain approximately 1,700,000 digital objects from approximately 12,800 folders.
The ingest process includes both manual and automated steps. The Digital Library Programming group is responsible for the automated steps, which mainly involve creating a Ladybird object and then publishing that object to Hydra. At this time, all objects are being ingested in a manner that prevents them from being exposed in the public Hydra interface (FindIT.library.yale.edu). The plan is to “turn on” the collection all at once, which is a better approach when a collection is very large and very complex. Otherwise, researchers might have a difficult time using the collection if materials were made available a little at a time, in what would sometimes seem like a random order.
As of Feb 16:
- 339,041 – the number of objects ingested into Hydra
- 4,266 – the number of folders ingested out of approximately 12,800
- 7 – the number of digital files that make up an object in Hydra
- 2,377,553 – the actual number of files ingested into Hydra
- 792,655 – total objects ingested into Hydra
- 5,548,585 – total number of files currently in Hydra
- 10.856 seconds – the average time it takes an object to ingest into Hydra
The last statistic is the one we focus on the most. At the current rate, the time to ingest the entire collection is approximately 213 days. For each 1/10th of a second that this rate fluctuates, the completion time increases or decreases by roughly 47 hours (1,700,000 objects × 0.1 seconds). If ingest were to suddenly start taking 11.8 seconds per object, it would push the approximate completion time to 232 days.
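The arithmetic behind these estimates can be sketched in a few lines of Python. This is only an illustration, not part of the ingest tooling: the function names are made up for the example, and it assumes that all of the approximately 1,700,000 objects are ingested one after another at the average per-object time.

```python
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400

# Approximate size of the whole collection, taken from the figures above.
TOTAL_OBJECTS = 1_700_000


def ingest_days(total_objects: int, seconds_per_object: float) -> float:
    """Days needed to ingest every object sequentially at a fixed average time."""
    return total_objects * seconds_per_object / SECONDS_PER_DAY


def hours_per_tenth_second(total_objects: int) -> float:
    """Hours added (or saved) when the average time shifts by 0.1 seconds."""
    return total_objects * 0.1 / SECONDS_PER_HOUR


if __name__ == "__main__":
    for rate in (10.856, 11.8):
        days = ingest_days(TOTAL_OBJECTS, rate)
        print(f"{rate} s/object: about {days:.1f} days")
    swing = hours_per_tenth_second(TOTAL_OBJECTS)
    print(f"each 0.1 s change: about {swing:.1f} hours")
```

Under those assumptions the sketch gives about 213.6 days at 10.856 seconds per object and about 232.2 days at 11.8 seconds, with each 0.1 second shift worth about 47 hours.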