A Rose by Any Other Naming Convention


In working on the Digital Accessioning Service workflow, we ran into an interesting question: How do you label the physical media? It seems like a simple issue, but it becomes more complicated when you consider the range of media types and the risks of applying labels directly to media. The service requires participating units to label media prior to submission for accessioning, both to confirm that we are associating the content with the correct description and to ensure that the physical media can be linked back to the content in the future. It’s important that these labels last as long as the media itself.

Although the service requires that all media be labeled prior to submission, we do not prescribe a specific method for labeling. The final decision on how to apply labels is left to the special collection unit that owns the media. We tested the following labeling methods in order to provide guidance on how units may label media in a way that ensures permanence and minimizes the risk of damaging the media.

The labeling question is twofold: how do we apply the label, and what should the naming convention be? Each piece of media requires a unique identifier, preferably on the media itself rather than on a case or other container. The service uses this unique identifier to confirm metadata prior to imaging. The identifier is also used as the filename for the disk image and related photographs.


First Question: What is the best naming convention?

First idea: Accession Number + Sequential Number (AccessionNumber-001, AccessionNumber-002, etc.)

This plan would rely on existing accession numbers.  In this case half of the work is done for us since many disks have already been assigned an accession number when the larger collection was accessioned.  Using this number for labeling disks also applies semantic meaning to the label.  A disk with an accession number based label could quickly be connected to the larger accession record.  An accession number identifier also allows for more flexibility between units, which may use different accession numbering techniques.

One disadvantage to accession number labeling is that legacy naming conventions have already been written on some processed materials, some of which include a legacy accession number+sequential number convention.  If there are multiple identifiers on a single disk (both legacy naming conventions, and labels assigned by the record creator) there is a risk that the service will use the incorrect label information.  

Another potential drawback to this naming convention is that accession numbers are unique to a given collecting library or museum, but not necessarily unique across all YUL/M collections. Since we accept media from a number of libraries and museums there is no guarantee that the existing accession number will be unique to the service.  Ideally, all identifiers used by the service would be unique to avoid a potential mix-up of material from two different repositories.
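To make the uniqueness concern concrete, here is a minimal sketch of how accession numbers submitted by several units could be checked for overlaps before imaging. It is an illustration only, not part of the service: the function name, the unit names, and the accession number format are all hypothetical, and treating numbers as equal regardless of case or surrounding whitespace is an assumption.

```python
from collections import defaultdict


def find_collisions(submissions):
    """Return accession numbers that more than one unit has submitted.

    `submissions` is an iterable of (unit, accession_number) pairs.
    Numbers are compared after trimming whitespace and upper-casing,
    which is an assumption about how units write them.
    """
    units_by_number = defaultdict(set)
    for unit, accession_number in submissions:
        units_by_number[accession_number.strip().upper()].add(unit)
    return {
        number: sorted(units)
        for number, units in units_by_number.items()
        if len(units) > 1
    }


# Hypothetical units and accession numbers, for illustration only.
submissions = [
    ("Unit A", "2015-M-012"),
    ("Unit B", "2015-m-012"),
    ("Unit B", "2015-M-040"),
]
collisions = find_collisions(submissions)
print(collisions)
```

A check like this only flags the overlap; deciding how to disambiguate the two repositories would still be a policy question.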

Second idea: Barcodes

Barcodes have the advantage that most units already own a barcode scanner and those that don’t can easily purchase one that connects through a standard USB port. The barcode is unique and would remove the risk of someone incorrectly entering a unique ID as it goes through the service.  Any time a long number is rekeyed multiple times there is the risk of mistakes.  The barcode is also small enough that there is little risk of covering existing label information.  The only exception is very small flash drives or memory cards, which would already require external housing for labeling purposes.  Another advantage is that barcodes would obviously answer the question: “Which of these numbers is the correct one to enter?”  As media has been collected for over twenty years here at Yale University Libraries and Museums, the naming conventions have changed and legacy naming conventions still linger on older media.  A barcode workflow would remove the risk of mistakenly entering a legacy identifier.

One major disadvantage of the barcode is that it has no semantic meaning, and is instead a completely random number. Another disadvantage is that our system for descriptive records (ArchivesSpace) already has a field titled “barcode” that refers to the barcoding system applied to containers for managing storage locations of physical items.  The final drawback to applying barcodes is that it requires an adhesive label, which could fall off the media and potentially damage the media and the drive used for reading the media.


Decision Time: Accession Number + Sequential Number Wins!

Ultimately we decided that the semantic value and human readability of the accession number label outweighed the ease of scanning a barcode.  The service will most likely be copying the identifier directly from a metadata spreadsheet, which minimizes the risk of entering an incorrect identifier.
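As a rough sketch of the chosen convention, the snippet below builds labels of the form AccessionNumber-001 and splits a label back into its two parts. The function names and the sample accession number are hypothetical, and the three-digit zero padding is inferred from the examples above rather than from a documented rule.

```python
import re

# Accession number, a hyphen, then exactly three digits at the end.
LABEL_PATTERN = re.compile(r"^(?P<accession>.+)-(?P<sequence>\d{3})$")


def media_labels(accession_number, count):
    """Build labels such as AccessionNumber-001 for `count` pieces of media."""
    return [f"{accession_number}-{n:03d}" for n in range(1, count + 1)]


def parse_label(label):
    """Split a label into (accession_number, sequence_number)."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"label does not follow the convention: {label!r}")
    return match.group("accession"), int(match.group("sequence"))


# "2015-M-012" is a made-up accession number.
labels = media_labels("2015-M-012", 3)
print(labels)
print(parse_label(labels[1]))
```

Copying labels from a spreadsheet and passing them through a check like `parse_label` would catch a malformed sequence number, though it cannot tell whether the accession number itself is the right one.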


Second Question: How do we apply our decided-upon naming convention to the media?

First idea: Archival adhesive labels

This plan would require the archivist to write the naming information on an archival label (often used to label folders) and apply it directly to the physical media.  The largest drawback of adhesive labels is their fixed size.  It is unlikely that we would be able to find room for this type of label on a 3.5 inch floppy disk without covering existing label information written by the creator or other archivists.  This is especially problematic if the media may be displayed in an exhibit in the future. Another issue with this kind of label is the risk that it could fall off the media, requiring an archivist to re-image the disk to determine which collection it belongs to.  It’s also possible for the label to fall off while the disk is in use, potentially damaging the disk and the drive used to read it.

Second idea: Pencil

This plan would require the archivist to write the naming information directly on the media in pencil. Most of our plans for labeling require permanent markings on the media, which could be problematic for exhibiting material, if the naming convention were to change over time, or if someone made a mistake in the original labeling process. Marking the media in pencil would remove the risk of making a mistake permanent on special collection material. However, the impermanence of pencil is as much a drawback as an advantage. Pencil markings smudge and can become illegible, which could require re-imaging to determine the contents of the media. Pencils also create dust, which could make its way through the plastic casing and damage the internal disk as well as the drive mechanism used to read the media.

Bentley Historical Library (BHL) provides guidance on the risks related to labeling in their IMLS-funded report on Removable Media and the Use of Digital Forensics. They recommend against “writing on floppy disks with pencils or ballpoint pens, touching the tape, or affixing labels” because of the internal disk’s fragility.

Third idea: Standard Permanent Marker

Photograph of 3.5 Inch Floppy Disk with fully written label

Some record creators use every inch of their labels, leaving little room for archivists to apply their own naming conventions


This plan would require the archivist to write the naming information directly on the media with a permanent marker. The standard Sharpie marker is already in use for labeling optical media in many units, so this would be an easy transition. However, a black marker isn’t visible on all media types and may require the archivist to find space on an already packed label.



Fourth idea: Silver/Black Paint Pen

This plan would require the archivist to write the naming information directly on the media with an oil-based paint pen. This would require both a silver and a black pen so that markings would be visible on media of all colors. These markings would be permanent and, once dried, would not risk smudging. The main drawback to this plan concerns optical media. The BHL report recommends against writing on optical media with solvent-based markers because they “can penetrate the hard layer and deform, discolor or corrode the disc, causing permanent reading problems for the laser.”


Decision Time: Paint Marker Wins! (with a caveat)

Ultimately we decided there should be slightly different rules for different media. Solvent-based markers are the best solution for media in hard plastic casings (such as floppy disks and hard drives), but could damage optical media. The Council on Library and Information Resources recommends felt-tip water-based markers for optical media, because these are the least likely to have a damaging chemical reaction. For optical media, we decided to use a permanent marker and to apply labels only to the inner circle, which does not contain data. This limits the risk of corroding the media.


In Conclusion

The Digital Accessioning Service is making recommendations for how a unit may apply labels to their media; however, the final decision is left to the unit. We hope that standardized naming conventions and label applications will aid in quality assurance and long-term intellectual control of physical media. By using the naming convention as the label for physical media and as the filename for disk images, packaged transferred files, and associated photographs of media, we can ensure that the digital files will be associated with the media for the long term.
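One way to picture the quality assurance benefit is a check that every file produced during imaging can be traced back to a label on a piece of media. The sketch below is a hypothetical illustration: the function name, the file extensions, and the underscore suffix used for photographs (for example `_front`) are assumptions, not conventions the service has published.

```python
from pathlib import PurePath


def match_files_to_labels(filenames, labels):
    """Group filenames under the media label they start with.

    A file matches a label when its name without the extension equals
    the label, or starts with the label followed by an underscore
    (an assumed suffix style for photographs).
    Returns (matched, unmatched).
    """
    matched = {label: [] for label in labels}
    unmatched = []
    for name in filenames:
        stem = PurePath(name).stem
        label = next(
            (l for l in labels if stem == l or stem.startswith(l + "_")),
            None,
        )
        if label is None:
            unmatched.append(name)
        else:
            matched[label].append(name)
    return matched, unmatched


# Made-up labels and filenames, for illustration only.
labels = ["2015-M-012-001", "2015-M-012-002"]
filenames = [
    "2015-M-012-001.img",
    "2015-M-012-001_front.jpg",
    "2015-M-012-002.img",
    "untitled.img",
]
matched, unmatched = match_files_to_labels(filenames, labels)
print(matched)
print(unmatched)
```

Any file that lands in `unmatched` is one that could no longer be connected to its physical media by name alone.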


The Final Label

Labels written on test media

CD with label written in Sharpie on inner circle

3.5 and 5.25 inch floppy disks with labels written in oil paint pens

Introducing the Digital Accessioning Service

Anyone who has worked with a computer for the past decade or longer probably has a few files saved on floppy disks, zip disks, CDs, and other assorted storage devices. Modern media like CDs and flash drives may still be accessible, but older disks are often unreadable by modern machines, left to languish in attics and filing cabinets. 

The libraries and museums at Yale University are no different. Much of the digital material we have acquired over the last twenty years is trapped on legacy media that require special hardware and software to access. Despite the growing presence of born-digital archival material in Yale’s special collections, until now we have not created a system-wide approach to processing these holdings. Repositories with adequate funding and expertise are already providing access to born-digital material, but many of these activities have been ad hoc, and the procedures differ based on available technology and expertise. Some repositories have postponed acquiring and processing born-digital media, deciding to wait for the libraries to create a holistic approach.

In January 2015 the Born Digital Working Group was formed to address this need and determine how the different libraries and museums can pool resources and expertise to find a path forward for born-digital archival materials. Our vision is to provide the same level of stewardship for born-digital holdings as is devoted to our physical collections. One of the priority goals is to establish a centralized Digital Accessioning Service for Yale special collections to capture files in a way that maintains their archival integrity and package them for ingest into the Digital Preservation System. This service is still in the beginning stages as we test software and hardware, draft documentation, and ensure that we are ready to begin accessioning archival material. The service will be housed in our new Di Bonaventura Family Digital Archaeology and Preservation Lab, allowing us to provide accessioning services for born-digital media from across Yale University Libraries and Museums.

View of the Digital Archaeology and Preservation Lab

Last April the Beinecke’s Technical Services department and the Preservation Department moved to a new facility that includes the shared Digital Archaeology and Preservation Lab. The space currently hosts disk imaging for special collections as well as disk imaging for the general collections managed by Euan Cochrane, Yale’s Digital Preservation Manager. The new lab has more room for collaborations and sharing expertise with staff and visitors. There is additional space and shelving for storage of supplies, tools, and media awaiting accessioning. The lab houses two workstations devoted to disk imaging for special collections, two workstations devoted to imaging disks from the general collections, and additional write blockers.

The digital preservation team uses the lab to house legacy computers that can be used to view files in their original environment, interact with digital content that requires original hardware, and test and validate digital preservation approaches such as emulation and migration against the original content executing on contemporaneous legacy hardware. The digital preservation team is also creating disk images of the legacy computers which will be attached to emulated versions of the original hardware environments. Using images of the original hard drives with the emulated hardware will help to enable accurate validation of emulated hardware by removing one source of difference between the two environments.

Mac Classic II attached to digital forensic station for disk imaging

We use BitCurator, Forensic Toolkit (FTK), and KryoFlux to create disk images in the lab. Each offers different advantages for working with different media types. The extra room also means we can avoid the pesky traffic jams that came up in our former lab space in Sterling Memorial Library.

Now that we have the technology in place, we are working on documentation. That includes reference guides for write blockers, how-to manuals for disk imaging, and workflows to explain how media will make their way from collections through the accessioning process. All of this will inform the service, which I will lead as the Digital Accessioning Archivist with the guidance of the Beinecke’s Digital Archivist and the Born Digital Working Group. I’ll be the new go-to person for born-digital accessioning here at Yale University’s Libraries and Museums. For the next month I’ll be focused on getting all our docs in a row so that we can begin accessioning digital media from archival collections across campus. I will also be visiting the various special collections and archives across campus to familiarize myself with the collections and processes in place, so we can make born-digital accessioning a seamless part of archival processing here at Yale. As the born-digital program continues to develop, we hope to invite the Yale community into the lab to learn more about our work with digital preservation, emulation tools, and disk imaging.