Reports, in three phases

After attending this month’s ArchivesSpace Hackathon (which was an amazing event, by the way, and deserves its own blog post or two!), I started mucking about with the new reporting features in ArchivesSpace. So far, I’ve been tackling reports in three ways:

  1. Adding new reports to ArchivesSpace that I authored in JasperSoft Studio
  2. Editing prepackaged ArchivesSpace reports with JasperSoft Studio
  3. Adding a print.css stylesheet to ArchivesSpace to emulate those “print screen” reports from Archivists’ Toolkit

Long story short, it was actually a lot easier than I expected to add new reports to ArchivesSpace (tested on versions 1.4.x). If you can write a new report with a tool like JasperSoft Studio, then you’ll be able to upload it to ArchivesSpace with ease. That said, I still wonder what other changes might be required to load new reports into ArchivesSpace that are parameterized (and so require additional user interaction before running), but I hope that those features will be added to new releases of the ArchivesSpace reports module if they aren’t already there. In any event, those types of reports will be my next step of exploration. In the meantime, this post is about what I have up and running right now.

Continue reading Reports, in three phases

Link to Training Session #1

Training Session #1: Getting Started in ASpace is coming up on Thursday at 10am. If you would like to join us from afar, use this link to access the session:

We will be using a web-based videoconferencing service called Zoom to broadcast the session. When you click on the link, Zoom will quickly install on your computer before you join the session – see this video for instructions. Note that you will need headphones or speakers to hear the session. The session will also be recorded – we will send out a link after uploading.

Also, please note that info about further trainings will be posted on our Libguide, rather than on this blog. Thank you!



Software Performance Testing Or: How to Tell If You’ve Got Bad Blood

Sometimes a girl just needs to see a specialist. Arsyn and Catastrophe (played here by Selena Gomez and Taylor Swift) used to be besties, but a betrayal results in an apparent demise and a lot of bad blood. However, all is not lost #revengegoals. We all know band-aids don’t fix bullet holes, so what’s a girl to do? With the expert advice of consultants and a little re-engineering, our protagonists reunite for a final showdown.

In the same way a person in discomfort would seek a specialist to help determine what’s wrong, YUL sought similar diagnostics to suss out the root causes of ArchivesSpace performance problems. We went live with ASpace in early June 2015, but almost immediately the application became unusable due to timeouts, system crashes, and records that took so long to render you wondered if it wasn’t too late for law school while contemplating the status bar. A battery of diagnostic tests and tools helped pinpoint the source of ASpace’s woes.

There are many tools available (commercial, free, or locally developed) to conduct performance testing. They range from simple to sophisticated, and from platform dependent to platform independent. Generally speaking, though, software performance testing is an approach to testing that uses defined or prerecorded actions that:

    • Simulate known or anticipated user behavior in an application
    • Validate business requirements to be performed by the application
    • Help pinpoint where performance breaks down or could be optimized
    • Report a set of results and measurements for comparison, troubleshooting, and benchmarking system performance
    • Run automatically via a timer or crontab, or on demand

In my opinion, testing software during development and during implementation is as important as tasting your food as you prepare it. Think of the list of ingredients as your recipe’s functional requirements. Does it need more salt? If the addition of an ingredient causes your sauce to break, do you start again or serve it as is? What if you over-engineer the cream and are stuck with butter? (I think that may be referred to as “re-branding”).

Software performance testing is critical to any development project, whether for an open-source or a vendor-developed application. This methodical approach to product testing provides an IT department with a consistent review of core functions measured throughout a product life cycle. The typical software development life cycle places the heaviest testing activity during the programming/development phase. Before staff training. Before production. It is a necessary step toward final user acceptance of the new or modified application. But I also encourage ongoing testing as functional or user requirements evolve, and as significant events occur in your application environment, such as network changes or application upgrades. Post-production, testing helps with ongoing capacity planning (data or users), and in this way it reveals itself as a useful tool not only for diagnostics but also for systems management.

There are several types of performance tests, including unit, smoke, peak, load, and soak. I think peak and load are the most common, used to measure heavy use of the application, but I love the imagery conjured by smoke and soak. Back in the day, smoke testing was quite literal: did it catch fire when you first turned it on? If not, you were good to go. (BTW, I love that this continues my cooking analogies from earlier.) These types of tests provide controlled opportunities to view system performance under a range of conditions, and provide project lead time to tune the infrastructure, software, and attendant services involved with your business process. But let’s not overlook the old eyeball test. In other words, if you see something, say something! Is the system performing as expected? Does it seem slow, sluggish, inconsistent? The front of the house is often where many non-functional requirements are measurable or observed, such as data accuracy, general usability, or system fail-over measures.

While the range of measurement tools is incredibly helpful, software can’t do everything. Knowledge of the application and user behavior falls outside the scope of these tools. We need people for that. Outlining the set of behaviors or actions to test? Also people-driven. Interpreting and resolving the test results… you get where I’m going.

Five Hundred Twenty Five Thousand Six Hundred Minutes, or how do you measure a workflow? Using one typical staff workflow (search & edit an accession record) in ASpace, we recorded these measurements:

  • ArchivesSpace backend: 6 seconds to fetch the record from the database and produce the JSON representation
  • ArchivesSpace frontend: 16 seconds to produce the HTML page for the edit form
  • User’s web browser: 2+ minutes to render the HTML page and run the JavaScript required to initialize the edit form

Each of these is a step in the process from the moment the user initiates a search in ASpace until the application renders the requested result. The first two steps are not entirely visible to the end user and represent performance on the back end. What the user is painfully aware of is the 2+ minutes it takes in the browser (their client) to get them to the next step: getting their work done.

Each of these measured steps is a jumping-off point for further analysis by IT or the developers of the software. Ultimately, some MySQL InnoDB buffer adjustments brought the first two steps (22 seconds combined) down to 5-6 seconds. A new release of the software interface introduced additional response-time improvements. Now, when we discuss response time in any tally of seconds, should anyone be fussing over that? Yeppers. When you enter a search in Google, how long do you expect to wait for search results to start filing in? If you search an OPAC or library discovery layer, same question. When the app has a multi-stop itinerary, each step should be as efficient as possible. These are standard user expectations for modern web-based tools.
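If you want to see where your own instance stands, checking the current buffer pool size is a one-query affair. This is just a minimal sketch, assuming a MySQL server on localhost and the mysql-connector-python package; the credentials and the example size in the comments are placeholders, not our actual settings.

```python
# Minimal sketch: check the InnoDB buffer pool size on the MySQL server
# behind ArchivesSpace. Host, user, and password are placeholders.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="aspace", password="changeme")
cur = conn.cursor()

# innodb_buffer_pool_size is reported in bytes
cur.execute("SHOW VARIABLES LIKE 'innodb_buffer_pool_size'")
name, value = cur.fetchone()
print(f"{name}: {int(value) / (1024 ** 3):.2f} GB")

# The size itself is normally raised in my.cnf (e.g. innodb_buffer_pool_size = 4G)
# and picked up on restart; on MySQL 5.7 and later it can also be resized at runtime.
cur.close()
conn.close()
```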

In the local case, henceforth known as “Nancy Drew and The Mystery at the Moss-Covered Mansion,” we used JMeter and Chrome Developer Tools to measure ASpace performance back to front. JMeter provided the first two measurements noted earlier with the accession record example; Chrome Developer Tools provided the third. A sample test run in JMeter is configured with variables such as threads (the number of “users” to simulate), ramp-up (the period over which those threads are started), and loop (how many times the test should be repeated). All of these values are configurable for the type of test you need to run (peak, soak, etc.) and can be directed at your dev, test, or prod instance of a service. Using Chrome Developer Tools, you can capture the time to complete browser-based actions such as loading, scripting, rendering, and painting.
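For anyone without JMeter handy, the same three knobs map onto a few lines of script. The sketch below is not JMeter and not our actual test plan, just a rough Python stand-in showing how threads, ramp-up, and loop count interact; the URL and numbers are placeholders, and you would only point something like this at a dev or test instance.

```python
# Rough sketch of a JMeter-style test run: N threads, a ramp-up period,
# and a loop count, with each request timed. The URL and the numbers
# below are placeholders for a dev/test instance, not production.
import threading
import time
import urllib.request

BASE_URL = "http://aspace-dev.example.edu:8080/"  # placeholder dev instance
THREADS = 10      # number of simulated users
RAMP_UP = 30      # seconds over which to start all threads
LOOPS = 5         # how many times each thread repeats the request

results = []

def worker(thread_num):
    # stagger thread starts across the ramp-up period
    time.sleep(RAMP_UP * thread_num / THREADS)
    for _ in range(LOOPS):
        start = time.perf_counter()
        with urllib.request.urlopen(BASE_URL) as resp:
            resp.read()
        results.append(time.perf_counter() - start)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"{len(results)} requests, avg {sum(results) / len(results):.2f}s, max {max(results):.2f}s")
```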

I was fortunate to present this work this summer at the ArchivesSpace Member Meeting during the Society of American Archivists annual conference. Although the audience was clearly peppered with Justin Bieber fans, I think the general idea got across: if T-Swift can be re-engineered, so can an ArchivesSpace implementation.

Nearly 600 million people have watched the Bad Blood video. If you are not one of them, you probably have a library card. But for those of us alumnae of Upstairs Hollywood Medical College, this song was my summer jam.

Training Session #1: Getting Started in ASpace

Our first ArchivesSpace training session for Yale users will take place from 10-11am on Thursday, November 5th in SML Lecture Hall (and online).

Session I: Getting Started in ASpace
How to add users
Where to find documentation & resources
How to stay in the loop

All training sessions will be live streamed and recorded.


Questions about the training sessions? Contact

Upcoming Training Sessions

The ArchivesSpace Training Subcommittee is pleased to announce “Phase I” of a series of training sessions for YUL staff, beginning the first week in November. Stay tuned for days & times!

All training sessions will be live streamed and recorded.

Session I: Getting Started in ASpace
How to add users
Where to find documentation & resources
How to stay in the loop

Session II: Navigating the Resource Record
What is a Resource Record in ASpace?
Required and Optimum description requirements
How required fields are indicated
DACS prompts in the Resource Record
The Notes Field

Questions about the training sessions? Contact


Mucking around in ArchivesSpace Locally

It may occasionally be part of your job to get inside the guts of ArchivesSpace so that you can test a new release, diagnose a bug, or generally get a better sense of what’s going on with the system that you rely on. Also, depending on your organizational structure and the way that resources are distributed, you may need to be in a position to help your local IT department answer questions about what’s happening with ArchivesSpace.

It’s (very smart) common practice not to give access to server environments to anyone beyond a small cadre of IT department professionals. And you wouldn’t want to experiment there anyway! Thus, you need a sandbox.

A few folks have asked for instructions about how to create this environment. In short, you’ll be installing ArchivesSpace on your local machine and then hooking it up to a copy of your production MySQL database. I’m assuming that your organization uses MySQL behind ArchivesSpace, because most do. There’s really great documentation about all of this on the ArchivesSpace GitHub page, but it may be overwhelming if you haven’t done this before. By the way, if you’ve ever managed a database before, you really don’t need this tutorial.

I’ll also talk through configuration, regular updates to the database from your production database, and things to think about when testing.
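To give a sense of what those regular database updates can look like, here is a minimal sketch of refreshing a local sandbox from a dump of production. It assumes the mysqldump and mysql client tools are installed and that you have (ideally read-only) credentials for the production host; every hostname, database name, and password below is a placeholder, so substitute your own.

```python
# Minimal sketch: refresh a local ArchivesSpace sandbox database from a
# dump of production. Hostnames, database names, and credentials are
# placeholders. Assumes the mysqldump and mysql clients are installed
# and that the local database already exists.
import subprocess

PROD_HOST = "aspace-db.example.edu"   # placeholder production DB host
DB_NAME = "archivesspace"             # placeholder database name
DUMP_FILE = "aspace_prod.sql"

# 1. Dump the production database (read-only credentials are enough).
with open(DUMP_FILE, "w") as out:
    subprocess.run(
        ["mysqldump", "--single-transaction",
         "-h", PROD_HOST, "-u", "readonly_user", "--password=changeme",
         DB_NAME],
        stdout=out, check=True)

# 2. Load the dump into the local sandbox MySQL instance.
with open(DUMP_FILE) as dump:
    subprocess.run(
        ["mysql", "-h", "localhost", "-u", "root", "--password=changeme",
         DB_NAME],
        stdin=dump, check=True)

print("Sandbox database refreshed. Restart ArchivesSpace and let it reindex.")
```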

Congratulations! You’re about to create a consequence-free environment. This means that you should feel free to mess around, take risks, and learn by breaking things. You’re going to break things. Continue reading Mucking around in ArchivesSpace Locally

Validation Scenarios

The short version:

You can validate your EAD files in bulk before importing them into ArchivesSpace. Essentially, there’s an easy way to do this and a hard way. The easy way requires that you have the oXygen XML Editor. With that, just add as many EAD files as you want to an oXygen project and then validate them all in one step. The harder way, batch validation without oXygen, is more difficult to set up; I’m not going to give detailed instructions for it in this blog post, but I will give you a general sense of how it works and point you in the right direction. Also, I’ll try to explain why you should be interested in validating your files in bulk prior to importing anything (EAD, MARC, whatever) into ArchivesSpace.
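To give that general sense right away: the non-oXygen route boils down to pointing a schema-aware parser at the EAD schema and looping over your files. Here’s a minimal sketch using Python and lxml; the directory and the location of ead.xsd are placeholders, and it assumes your files are schema-based rather than DTD-based EAD.

```python
# Minimal sketch of batch-validating EAD files before an ArchivesSpace
# import. Assumes schema-based EAD and a local copy of ead.xsd; the
# paths below are placeholders.
from pathlib import Path
from lxml import etree

schema = etree.XMLSchema(etree.parse("ead.xsd"))   # local copy of the EAD schema

for ead_file in sorted(Path("ead_exports").glob("*.xml")):
    try:
        doc = etree.parse(str(ead_file))
    except etree.XMLSyntaxError as err:
        print(f"NOT WELL-FORMED  {ead_file.name}: {err}")
        continue
    if schema.validate(doc):
        print(f"VALID            {ead_file.name}")
    else:
        # error_log collects every validation failure, with line numbers
        print(f"INVALID          {ead_file.name}")
        for error in schema.error_log:
            print(f"    line {error.line}: {error.message}")
```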

So on to the longer version:

If you’ve ever attempted to import EAD files into ArchivesSpace, you might have encountered some rather cryptic errors.  Here’s one:
Continue reading Validation Scenarios

Migration, step by step

Like Mary mentioned, we’re in! Migration was a five-week process that involved a lot of on-the-fly problem solving and the chance to really engage with how ArchivesSpace works. It also required setting alarms to check on scripts at 2:00 am. Thank goodness we’re done.

We work in a large, complex organization with a lot of local requirements. We also monkeyed around with AT quite a bit, and are moving into an ArchivesSpace environment with a lot of plug-ins. For most folks, it will be a lot easier than this. Here’s my documentation of what we’ve done and what I would think about doing differently if I were to do this again. Continue reading Migration, step by step

We’re in!

After a couple of years of planning and countless hours of discussions, compromises, development, and testing, the first seven of Yale University’s nine ArchivesSpace repositories went live on Monday, May 11.

In the final weeks of preparation…

The committee finalized (well, mostly) our policies and procedures, the controlled value lists, and the system configuration.

Maureen Callahan and Mark Custer (aka the dynamic duo), with welcome assistance from Mike Rush, Adam Shahrani, and the folks at Hudson Molonglo, ran several test migrations and methodically worked through data errors, developed and ran pre- and post-migration scripts and SQL updates, and finalized the workflows that will be undertaken for the next two migrations (the largest). I am hoping Maureen and Mark will share more about their experiences with migration in a future blog post.


Rachel Chatalbash and Mary Caldera took the lead on finishing up the Yale ArchivesSpace User Manual and handling logistics for the training sessions. We ran into system performance issues during the test run, so Steve DiSorbo (with help from Mark Triggs) worked overtime to give us more computing power and optimize performance. It worked, and our first training session (18 folks from 7 repositories, with two teachers and two floats) went off without a hitch. We have several more regular sessions scheduled and will follow up with forums and special-topic training in the summer.

Development continues as our IT team (Melissa Wisner, Yue Ji, and Steelson Smith) finalizes the programs that will allow automated location updates and integration with Aeon. Both are requirements for the last two repositories.

It’s a work in progress and we’ll wait until we are all in production before we really CELEBRATE, but thanks to the Yale special collections community, the ArchivesSpace community, and all the others who provided support and assistance throughout the process.

Keeping timestamps and creator names from AT

If you’ve already migrated from Archivists’ Toolkit to ArchivesSpace, you know that the timestamps on your records will be reset to the time of migration, and the name associated with the creator of each record will be changed to “admin.” Here at Yale, since accessioning is such a serious activity (after all, it’s the moment when we take legal, physical, and intellectual control of records), we wanted to keep that information in ArchivesSpace. At this time, we’re fine with only having this for accession records, although this technique could be modified for other records, too.

This was a project for Adam Shahrani, Mark Custer, and me. Basically, with Adam’s patient assistance, we just wrote a SQL script to select cognate values from Archivists’ Toolkit and ArchivesSpace and update the ASpace values with the AT values.

The script for that is here, on GitHub. All relevant warnings apply. Obviously, you’ll need to update schema names. The other thing is that we did this in an environment where both databases were on the same server (localhost). Doing this update across servers would be more complicated.
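For a rough sense of the shape of that update (this is not the actual script; see the GitHub link above), it boils down to joining each ASpace accession to its AT counterpart and copying the audit values across. The table and column names below are illustrative placeholders rather than the real AT or ASpace schemas, and it assumes both databases sit on the same MySQL server.

```python
# Illustrative sketch only; see the GitHub script for the real thing.
# Table and column names are placeholders, not the actual AT/ASpace schemas.
# Assumes both databases are on the same MySQL server (localhost).
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root", password="changeme")
cur = conn.cursor()

# Copy the original created/modified audit values from each AT accession
# onto the matching ASpace accession, matched on accession identifier.
cur.execute("""
    UPDATE aspace_db.accession AS asp
    JOIN at_db.accessions      AS at  ON at.accession_number = asp.identifier
    SET asp.create_time      = at.created,
        asp.user_mtime       = at.last_modified,
        asp.created_by       = at.created_by,
        asp.last_modified_by = at.last_modified_by
""")
conn.commit()
print(f"{cur.rowcount} accession rows updated")

cur.close()
conn.close()
```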

And here’s what it looks like:

[Screenshot: a migrated accession record in ArchivesSpace showing its original Archivists’ Toolkit timestamp]


Check out those sweet timestamps from 2012!