Software Performance Testing Or: How to Tell If You’ve Got Bad Blood

Sometimes a girl just needs to see a specialist. Arsyn and Catastrophe (played here by Selena Gomez and Taylor Swift) used to be besties, but a betrayal results in an apparent demise and a lot of bad blood. However, all is not lost #revengegoals. We all know band-aids don’t fix bullet holes, so what’s a girl to do? With the expert advice of consultants and a little re-engineering, our protagonists reunite for a final showdown.

In the same way a person in discomfort would seek a specialist to help determine what’s wrong, YUL sought similar diagnostics to suss out the root causes of ArchivesSpace performance problems. We went live with ASpace in early June 2015, but almost immediately the application became unusable due to timeouts, system crashes, or records that took so long to render that you wondered, while contemplating the status bar, if it wasn’t too late for law school. A battery of diagnostic tests and tools helped pinpoint the source of ASpace’s woes.

There are many tools available (commercial, free, or locally developed) to conduct performance testing. They range from simple to sophisticated, and from platform dependent to platform independent. Generally speaking, though, software performance testing is an approach to testing that uses defined or prerecorded actions that:

    • Simulate known or anticipated user behavior in an application
    • Validate business requirements to be performed by the application
    • Help pinpoint where performance breakdowns are occurring, or where performance could be optimized
    • Report a set of results and measurements for comparison, troubleshooting, and benchmarking system performance
    • Can be executed automatically by a timer or crontab, or on demand

In my opinion, testing software during development and during implementation is as important as tasting your food as you prepare it. Think of the list of ingredients as your recipe’s functional requirements. Does it need more salt? If the addition of an ingredient causes your sauce to break, do you start again or serve it as is? What if you over-engineer the cream and are stuck with butter? (I think that may be referred to as “re-branding”).

Software performance testing is critical to any development project, whether an open-source or vendor-developed application. This methodical approach to product testing provides an IT department with a consistent review of core functions measured throughout a product life cycle. The typical software development life cycle places the heaviest testing activity during the programming/development phase. Before staff training. Before production. It is a necessary step toward final user acceptance of the new or modified application. But I also encourage ongoing testing as functional or user requirements evolve, and as significant events occur in your application environment, such as network changes or application upgrades. Post-production, testing helps with ongoing capacity planning (for data or users), and in this way it reveals itself as a useful tool not only for diagnostics, but also for systems management.

There are several types of performance tests, including unit, smoke, peak, load, and soak. I think peak and load are the most common, used to measure heavy use of the application, but I love the imagery conjured by smoke and soak. Back in the day, smoke testing was quite literal: did the thing catch fire when you first turned it on? If not, you were good to go. (BTW, I love that this continues my cooking analogies from earlier.) These types of tests provide controlled opportunities to view system performance under a range of conditions, and they provide project lead time to tune the infrastructure, software, and attendant services involved with your business process. But let’s not overlook the old eyeball test. In other words, if you see something, say something! Is the system performing as expected? Does it seem slow, sluggish, inconsistent? The front of the house is often where many non-functional requirements are measurable or observed, such as data accuracy, general usability, or system fail-over measures.

While the range of measurement tools is incredibly helpful, software can’t do everything. Knowledge of the application and user behavior falls outside the scope of these tools. We need people for that. Outlining the set of behaviors or actions to test? Also people-driven. Interpreting and resolving the test results…you get where I’m going.

Five Hundred Twenty-Five Thousand Six Hundred Minutes, or how do you measure a workflow? Using one typical staff workflow in ASpace (searching for and editing an accession record), we recorded these measurements:

  • ArchivesSpace backend: 6 seconds to fetch the record from the database and produce the JSON representation
  • ArchivesSpace frontend: 16 seconds to produce the HTML page for the edit form
  • User’s web browser: 2+ minutes to render the HTML page and run the JavaScript required to initialize the edit form

Each of these is a step in the process from the moment the user initiates a search in ASpace until the application renders the requested result. The first two steps are not entirely visible to the end user and represent performance on the back end. What the user is painfully aware of is the 2+ minutes it takes in the browser (their client) to get them to the next step: actually doing their work.

Each of these measured steps is a jumping-off point for further analysis by IT or the developers of the software. Ultimately, some MySQL InnoDB buffer adjustments brought the first two steps (22 seconds combined) down to 5-6 seconds. A new release of the software interface introduced additional response-time improvements. Now, when we discuss response time in any tally of seconds, should anyone be fussing over that? Yeppers. When you enter a search in Google, how long do you expect to wait for search results to start filing in? If you search an OPAC or library discovery layer, same question. When the app has a multi-stop itinerary, each step should be as efficient as possible. These are standard user expectations for modern web-based tools.
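
For the curious, the main dial here is the size of MySQL’s InnoDB buffer pool, which determines how much of the database can be served from memory rather than disk. A minimal sketch of how you might inspect it before tuning (the 2G value below is illustrative, not our production setting):

    -- Check the configured buffer pool size and how the pool is being used.
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';

    -- The value itself is normally raised in my.cnf and picked up on restart:
    --   [mysqld]
    --   innodb_buffer_pool_size = 2G
    -- (MySQL 5.7 and later can also resize it at runtime with SET GLOBAL.)

If the buffer pool is much smaller than your working set, the database spends its time on disk reads, which shows up as exactly the kind of back-end lag described above.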

In the local case, henceforth known as “Nancy Drew and The Mystery at the Moss-Covered Mansion,” we used JMeter and Chrome Developer Tools to measure ASpace performance back to front. JMeter provided the first two measurements noted earlier in the accession record example; Chrome Developer Tools provided the third. A sample test run in JMeter is configured with variables such as threads (the number of “users” to simulate), ramp-up (the time over which those threads are started), and loop count (how many times the test should be repeated). All of these values are configurable for the type of test you need to run (peak, soak, etc.) and can be pointed at your dev, test, or prod instance of a service. Using Chrome Developer Tools, you can capture the time to complete browser-based actions such as loading, scripting, rendering, and painting.

I was fortunate to present this work this summer at the ArchivesSpace Member Meeting during the Society of American Archivists annual conference. Although the audience was clearly peppered with Justin Bieber fans, I think the general idea landed: if T-Swift can be re-engineered, so can an ArchivesSpace implementation.

Nearly 600 million people have watched the Bad Blood video. If you are not one of them, you probably have a library card. But for those of us alumnae of Upstairs Hollywood Medical College, this song was my summer jam.

Training Session #1: Getting Started in ASpace

Our first ArchivesSpace training session for Yale users will take place from 10-11am on Thursday, November 5th, in SML Lecture Hall (and online).

Session I: Getting Started in ASpace
How to add users
Where to find documentation & resources
How to stay in the loop

All training sessions will be live streamed and recorded.

Questions about the training sessions? Contact emily.ferrigno@yale.edu

Upcoming Training Sessions

The ArchivesSpace Training Subcommittee is pleased to announce “Phase I” of a series of training sessions for YUL staff, beginning the first week in November. Stay tuned for days & times!

All training sessions will be live streamed and recorded.

Session I: Getting Started in ASpace
How to add users
Where to find documentation & resources
How to stay in the loop

Session II: Navigating the Resource Record
What is a Resource Record in ASpace?
Required and Optimum description requirements
How required fields are indicated
DACS prompts in the Resource Record
The Notes Field

Questions about the training sessions? Contact emily.ferrigno@yale.edu

Mucking around in ArchivesSpace Locally

It may occasionally be part of your job to get inside the guts of ArchivesSpace so that you can test a new release, diagnose a bug, or generally get a better sense of what’s going on with the system that you rely on. Also, depending on your organizational structure and the way that resources are distributed, you may need to be in a position to help your local IT department answer questions about what’s happening with ArchivesSpace.

It’s common practice (and a very smart one) not to give anyone beyond a small cadre of IT department professionals access to server environments. And you wouldn’t want to experiment there anyway! Thus, you need a sandbox.

A few folks have asked for instructions about how to create this environment. In short, you’ll be installing ArchivesSpace on your local machine and then hooking it up to a copy of your production MySQL database. I’m assuming that your organization uses MySQL behind ArchivesSpace, because most do. There’s really great documentation about all of this on the ArchivesSpace GitHub page, but it may be overwhelming if you haven’t done this before. (By the way, if you’ve ever managed a database, you really don’t need this tutorial.)
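
As a rough sketch of the database piece (every name and password below is a placeholder; follow the current ArchivesSpace documentation for the real steps), you would create an empty local database, give ArchivesSpace a user, and then load in a dump of your production data:

    -- Placeholder names throughout; adjust to your own setup.
    CREATE DATABASE archivesspace DEFAULT CHARACTER SET utf8;
    CREATE USER 'as'@'localhost' IDENTIFIED BY 'as123';
    GRANT ALL PRIVILEGES ON archivesspace.* TO 'as'@'localhost';
    FLUSH PRIVILEGES;

    -- Load a copy of your production data (SOURCE is a mysql client command).
    USE archivesspace;
    SOURCE /path/to/production_dump.sql;

From there, the configuration step is mostly a matter of pointing AppConfig[:db_url] in your local config/config.rb at this database, which the documentation walks through.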

I’ll also talk through configuration, regular updates to the database from your production database, and things to think about when testing.

Congratulations! You’re about to create a consequence-free environment. This means that you should feel free to mess around, take risks, and learn by breaking things. You’re going to break things. Continue reading Mucking around in ArchivesSpace Locally

Validation Scenarios

The short version:

You can validate your EAD files in bulk before importing them into ArchivesSpace. Essentially, there’s an easy way to do this and a hard way. The easy way requires the oXygen XML Editor: just add as many EAD files as you want to an oXygen project and then validate them all in one step. The hard way, batch validation without oXygen, is more difficult to set up; I’m not going to give detailed instructions for it in this blog post, but I will give you a general sense of how it works and point you in the right direction. Also, I’ll try to explain why you should be interested in validating your files in bulk prior to importing anything (EAD, MARC, whatever) into ArchivesSpace.

So on to the longer version:

If you’ve ever attempted to import EAD files into ArchivesSpace, you might have encountered some rather cryptic errors.  Here’s one:
Continue reading Validation Scenarios

Migration, step by step

Like Mary mentioned, we’re in! Migration was a five-week process that involved a lot of on-the-fly problem solving and the chance to really engage with how ArchivesSpace works. It also required setting alarms to check on scripts at 2:00 am. Thank goodness we’re done.

We work in a large, complex organization with a lot of local requirements. We also monkeyed around with AT quite a bit, and are moving into an ArchivesSpace environment with a lot of plug-ins. For most folks, it will be a lot easier than this. Here’s my documentation of what we’ve done and what I would think about doing differently if I were to do this again. Continue reading Migration, step by step

We’re in!

After a couple of years of planning and countless hours of discussions, compromises, development and testing, the first seven of Yale University’s nine ArchivesSpace repositories went live on Monday May 11.

In the final weeks of preparation…

The committee finalized (well, mostly) our policies and procedures, the controlled value lists, and the system configuration.

Maureen Callahan and Mark Custer (aka the dynamic duo), with welcome assistance from Mike Rush, Adam Shahrani, and the folks at Hudson Molonglo, ran several test migrations and methodically worked through data errors, developed and ran pre- and post-migration scripts and SQL updates, and finalized the workflows that will be undertaken for the next two migrations (the largest). I am hoping Maureen and Mark will share more about their experiences with migration in a future blog post.

Meanwhile…

Rachel Chatalbash and Mary Caldera took the lead on finishing up the Yale ArchivesSpace User Manual and handling logistics for the training sessions. We ran into system performance issues during the test run, so Steve DiSorbo (with help from Mark Triggs) worked overtime to give us more computing power and optimize performance. It worked, and our first training session (18 folks from 7 repositories, with two teachers and two floats) went off without a hitch. We have several more regular sessions scheduled and will follow up with forums and special-topic training in the summer.

Development continues as our IT team, Melissa Wisner, Yue Ji, and Steelson Smith, finalize the programs that will allow automated location updates and integration with Aeon. Both are requirements for the last two repositories.

It’s a work in progress and we’ll wait until we are all in production before we really CELEBRATE, but thanks to the Yale special collections community, the ArchivesSpace community, and all the others who provided support and assistance throughout the process.

Keeping timestamps and creator names from AT

If you’ve already migrated from Archivists’ Toolkit to ArchivesSpace, you know that timestamps on your records will be reset to the time of migration, and the name associated with the creator of the record will be changed to “admin.” Here at Yale, since accessioning is such a serious activity (after all, it’s the moment when we take legal, physical, and intellectual control of records), we wanted to keep that information in ArchivesSpace. At this time, we’re fine with having this only for accession records, although this technique could be modified for other records, too.

This was a project for Adam Shahrani, Mark Custer, and me. And basically, with Adam’s patient assistance, we just wrote a SQL script to select cognate values from Archivists’ Toolkit and ArchivesSpace, and update the ASpace values with the AT values.

The script for that is here, on GitHub. All relevant warnings apply. Obviously, you’ll need to update schema names. The other thing is that we did this in an environment where both databases were on the same server (localhost). Doing this update across servers would be more complicated.
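
Purely as an illustration of the shape of that update (the database names, the Archivists’ Toolkit audit columns, and the join condition below are all assumptions on my part; trust the script on GitHub, not this sketch):

    -- Illustration only. Assumed names: the at_db and aspace_db schemas, the AT
    -- columns (created, lastUpdated, createdBy, lastUpdatedBy), and the join key.
    UPDATE aspace_db.accession AS asp
    JOIN at_db.Accessions AS atk
      ON asp.identifier = atk.accessionNumber  -- hypothetical match on the accession identifier
    SET asp.create_time      = atk.created,
        asp.user_mtime       = atk.lastUpdated,
        asp.created_by       = atk.createdBy,
        asp.last_modified_by = atk.lastUpdatedBy;

The same pattern should work for other record types, as long as you can match rows reliably between the two databases.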

And here’s what it looks like:

[Screenshot of the updated accession records, original timestamps intact]

Check out those sweet timestamps from 2012!

Reading Migration Errors and Fixing Our Data

We’ve been doing what feels like a zillion practice migrations to get our data ready for ArchivesSpace. Every time we do a test migration, we read the error reports to see what’s wrong with our AT data. From there, we clean up our AT database with the aim of a completely error-free migration when it’s time to do this for real. This is still in progress, but common errors and clean-up techniques are below.

We had to get around inherent problems with the AT -> ASpace migrator.

  •  Resource records and accession records that are linked to subjects and agents already in the database won’t migrate. This is a really, really bad one.
    Here’s what the error looks like:

    Endpoint: http://localhost:8089/agents/corporate_entities
    AT Identifier:Name_Corporate->Yale Law School.
    Status code: 400
    Status text: Bad Request
    {"error":{"names":["Agent must be unique"]}}

    Make no mistake — this is a “record save error”. It’s not just that the agent or subject is no longer linked — the whole finding aid or accession record actually isn’t migrating. That’s a no-go for us.
    Since we’re moving the records of eight repositories (in four different Archivists’ Toolkit databases) into a single ArchivesSpace instance, we knew that we wouldn’t be able to live with this error. We toyed with the idea of various hacks (prepending subjects and agents with a unique string in each database so that they wouldn’t repeat), but in the end we decided to contract with Hudson Molonglo to fix the importer. We’ll be happy to report more on that once the work is done.

Because of our advanced container management plug-in, we had to make sure that existing data met its compatibility requirements.

  • Barcodes and box numbers have to match up. If you have ten components with the same barcode where nine say box “1” and one says box “2”, the migrator can’t create a top container. We wrote a bunch of SQL reports to anticipate these problems (a rough sketch of that kind of report, covering this issue and the location conflicts below, follows this list), and we have done a ton of clean-up over the last few months to make sense of them. In most cases, this required actually pulling down the materials and checking which components belonged to which containers and what their barcodes were. Many, many thanks to my colleague Christy, who did this work.
  • Boxes can’t be assigned to more than one location (because of the laws of physics).
    Here’s what the error looks like. The relevant bits are the container_locations messages toward the end:

    Endpoint: http://localhost:8089/repositories/2/batch_imports?migration=ArchivistToolkit
    AT Identifier:RU.121
    Status code: 200
    Status text: OK
    {
     "errors": ["Server error: #<:ValidationException: {:errors=>{\"container_locations\"=>[\"Locations in ArchivesSpace container don't match locations in existing top container\"]}, :object_context=>{:top_container=>#<TopContainer @values={:id=>1963, :repo_id=>2, :lock_version=>1, :json_schema_version=>1, :barcode=>\"39002042754961\", :indicator=>\"1\", :created_by=>\"admin\", :last_modified_by=>\"admin\", :create_time=>2015-04-17 21:44:39 UTC, :system_mtime=>2015-04-17 21:44:39 UTC, :user_mtime=>2015-04-17 21:44:39 UTC, :ils_holding_id=>nil, :ils_item_id=>nil, :exported_to_ils=>nil, :legacy_restricted=>0}>, :aspace_container=>{\"container_locations\"=>[{\"ref\"=>\"/locations/124\", \"start_date\"=>\"Fri Apr 17 17:44:37 EDT 2015\", \"status\"=>\"current\"}], \"indicator_1\"=>\"1\", \"type_1\"=>\"box\"}, :top_container_locations=>[\"/locations/1\"], :aspace_locations=>[\"/locations/124\"]}}>"],
     "saved": []
    }

    This may take a bit of explanation. Basically, as I wrote before, in the AT data model, the container indicator is just a piece of data that’s associated with every component. The database has no way of knowing that each of those components called box “8” actually refer to the same thing. This can result in a lot of problems.
    One of those problems is that in the same collection, some components can be called box “8” and be assigned to one location, while other components can be called box “8” and be assigned to another location. Our migrator is trying to make sense of these containers in a more rigorous way, and it knows that the same box can’t be in two different places. Thus, it throws an error.
    I did these fixes in the database — in most cases, it’s only one or two components out of many that are associated with the errant location. We made a decision internally that we’re comfortable going with a majority-rules fix — in other words, if seventeen folders in a box are assigned to the off-site storage facility and one folder in the box is assigned to Drawer 8 in room B59, we’re pretty sure that the whole box is actually at the off-site storage facility. We have other data stores (our ILS, for instance) that we can use to double-check this data.
    If you want descriptive information about these components, this is actually kind of tricky — I’ve written about this report on my other blog, and you’re welcome to use it.

  • Box numbers have to be very, very literally the same. “17a” and “17A” aren’t the same. Neither are “8” and “8 ” (see the space?). We’ve written a pre-migration normalization script to help deal with some of these. Others we’ve fixed by updating the database, based on the error report.
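
Here is the rough shape of the pre-migration reports mentioned above. This is a sketch rather than our actual reports: the Archivists’ Toolkit table and column names (ArchDescriptionInstances, container1Indicator, locationId) are assumptions you should verify against your own schema.

    -- 1. Barcodes whose components disagree about the box number.
    SELECT barcode,
           GROUP_CONCAT(DISTINCT container1Indicator) AS indicators
    FROM ArchDescriptionInstances
    WHERE barcode IS NOT NULL AND barcode <> ''
    GROUP BY barcode
    HAVING COUNT(DISTINCT container1Indicator) > 1;

    -- 2. Barcodes whose components are assigned to more than one location.
    SELECT barcode,
           GROUP_CONCAT(DISTINCT locationId) AS locations
    FROM ArchDescriptionInstances
    WHERE barcode IS NOT NULL AND barcode <> ''
    GROUP BY barcode
    HAVING COUNT(DISTINCT locationId) > 1;

Anything these turn up is a candidate for the clean-up described above, whether that means pulling boxes to check them or applying a majority-rules fix to locations.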

Some errors are just warnings, but are really good to clean up.

  • Two collections probably shouldn’t have the same EADID. In most cases, this is a typo and easily fixed by looking at the record. (A sketch of a report for flagging duplicates like these, along with the date typos below, appears after this list.)
  • Begin dates shouldn’t be later than end dates
    Here’s the error:

    End date: 1061 before begin date: 1960, ignoring end date
    Record:: Resource Component: RU.126/ref14250

    Haha. That’s probably not from before the Battle of Hastings. We can just fix that typo.

  • Digital objects shouldn’t have the same identifier. Again, usually just a typo. The record will still migrate but the migrator will append a string to the end of the identifier to make it unique.
  • Miscellaneous other stuff. For instance:
    Endpoint: http://localhost:8089/repositories/2/batch_imports?migration=ArchivistToolkit
    AT Identifier:RU.703
    Status code: 200
    Status text: OK
    {
     "errors": ["Server error: #<:ValidationException: {:errors=>{\"notes/0/subnotes/1/items/168/label\"=>[\"Property is required but was missing\"]}}>"],
     "saved": []
    }

    Finding and fixing this was a huge pain in the neck. We had a list, deeeeeep into this collection, that was missing a label for one of the entries. Once we tracked it down, we discovered that we couldn’t edit it in the application and had to figure out how to fix it in the database. We all learned a lot that day.
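
Several of these warnings can also be caught ahead of a test migration with simple reports against the AT database. A hedged sketch only: Resources, eadFaUniqueIdentifier, resourceId, dateBegin, and dateEnd are assumed Archivists’ Toolkit names, so check them against your own schema (and run the date check against your components table as well).

    -- Collections that share an EAD ID (assumed table and column names).
    SELECT eadFaUniqueIdentifier, COUNT(*) AS how_many
    FROM Resources
    WHERE eadFaUniqueIdentifier IS NOT NULL AND eadFaUniqueIdentifier <> ''
    GROUP BY eadFaUniqueIdentifier
    HAVING COUNT(*) > 1;

    -- Begin dates that fall after their end dates (often a year typo like the 1061 above).
    SELECT resourceId, dateBegin, dateEnd
    FROM Resources
    WHERE dateBegin IS NOT NULL
      AND dateEnd IS NOT NULL
      AND dateBegin > dateEnd;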

We’re very happy to hear from others about what errors they’ve found during their migrations, and how they’ve gone about fixing them!

Ode to Nancy-Part 2

So if you don’t love Nancy Lyon by now, you will very soon. I had the chance to meet with her again and do a walkabout through the labyrinth that is MSSA storage. We started in the primary work area in the basement of Sterling. Not very high ceilings, narrow pathways, everything secured. Some things are funny, like the Fortunoff Video Archives (FVA) material I saw waiting in coolers. Regular coolers. Like the kind you might take to the beach, or that comes free with a PBR 12 pack! Lots of signs posted with something to the effect that, “Don’t even think about setting your stuff down here without clearing it with Nancy!” Why, you might ask? Nancy is in charge of space. (I don’t know if that is officially true, but it should be). Space is at such a premium in MSSA that Nancy is preoccupied with space. She wakes up in the middle of the night thinking about ways to optimize space. When people at their staff meetings start asking for more space, Nancy tells them to take their problems to Jesus!

Nancy took me from MSSA across the Wright Reading Room into an area of the building I have never seen.  I don’t know what the original purpose might have been, but it wasn’t for people. It looks like an old dark storage unit I used to overpay for in Shelton. She navigates through the area as she would her childhood home. She very proudly points out a corner area behind a brick partition. “Look, I have paper stored there!” And I say, “But Nancy, what if a spider is in there?!” To which Nancy replies, “I don’t care, we need the space.” She moves on. I’m looking over my shoulder to make sure that spider isn’t following me.

Nancy is not without generosity when it comes to space. She has loaned a set of metal shelves to a guy named Art. Art is the man who comes around the library checking to see if any light bulbs need replacing. (He’s also been known to hang a whiteboard or two). My walkabout bonus was discovering that Art’s supply is located in this basement area, and that over the years Nancy agreed they could occupy the same space and made little labels for his shelves. IDK, maybe she just really gets that people need a place for important work things.

Nancy labels shelves so folks know this is Art’s stuff.

There were some other cool treasures stored in this room. I took a few more photos:

I looked up the trophy information and learned this was a car design competition sponsored by General Motors.

An old steamer trunk.
A birdhouse and a sculpted head.