What Could Possibly Go Wrong? More on Project Management Themes as we Implement the ArchivesSpace PUI

Failure. What don’t I love about failure? Yes, it’s true, my Netflix queue is full of disaster-recovery documentaries, and for *fun* I read case studies about large-scale failures (Note to self: Get dictionary for Christmas). And I love it not for the reasons you might think. I don’t like wreckage, rubbernecking, or even commotion. I don’t ever want people to be in pain. I mean, I cover my eyes while watching Law & Order! I like examining failure because it’s an opportunity to learn how people think, and the assumptions they make. It’s human to fail, but from a young age we are socially conditioned to avoid it at all costs, which, ironically, often leads to failure.

I have a very high comfort level with failure. Always have. Hello, old friend. I’m naturally a very serious person. Not that I don’t have fun, or at least my version of it, but in general I can think too much, worry a bit too much, and when I’m all up in it, I can sometimes forget the standard social norms and graces, like polite nodding and small talk. Over time I’ve learned to give folks a heads-up about it. I might tell them not to look directly at my face when I’m thinking, because they’ll think I’m trying to kill them, but I’m not! I just have a terrible thinking face, because I might let everything physical go slack, so I can direct all my internal resources to my sequence of thought. Some of this comes with the job of being a PM. You are trained to look for potential sources of risk and failure, the one thing that everyone else might miss. So you wander a little further into the rain sometimes. What you’re doing though, is trying to protect everyone. That’s how I feel about it. It sounds strange, but by thinking about failure you are trying to bolster their success. I might issue a preface such as, “I’m going to be me for a moment here, and do six months of thinking in the next 3 minutes, so buckle up.” Sometimes I feel like I’m sticking out in the day-to-day, and making folks a bit uncomfortable with my serious outlook. So inevitably when a failure comes, or an emergency, or crisis, I don’t mind as much, because suddenly I become the most normal person in the room. When everything is going up in flames, I’m just like, “Dude, bring it!”

It’s interesting to try and pinpoint where failure begins. Failure can be cumulative or discrete. One of the most referenced examples of failure is the Challenger explosion. I have discussed this case in a group I participate in called #fail4lib, a small subset of the #code4lib group. It’s sort of a support group for people like me! Most folks are familiar with the events of that morning. What may be less known is that the failure began more than a year before that fateful morning. More than a year before the scheduled launch, engineering tests and launch simulations clearly demonstrated the O-ring materials did not consistently perform well. More than a year before that clear, blue, cold morning, a person with experience, integrity, and valid concerns spoke up and said there was a problem, but he was told by management to back down. Most people view the Challenger explosion as the result of something that went wrong that day, that moment. It’s understandable, the need to see something so horrific as a total accident. And it isn’t something anyone ever wanted to happen. Not ever. But this wasn’t accidental. This was a cumulative failure, not discrete. And the man who spoke up was certainly not the only point of failure/course correction, but I suppose for various reasons his story stands out to me. So what did we learn from this? What did NASA learn? To this day their procedures reflect the lessons learned from this event. I think an important lesson is that if someone raises a concern in their area of expertise or exposure, follow it through. Walk their beat with them, see it as close to their perspective as possible. Have a risk registry for the project and add the raised concern. The risk registry would document the stated issue, the likelihood of it occurring, and a value for what the impact would be. The risk registry also contains your organization’s strategy for dealing with it, whether Avoid, Control, Transfer, or Accept.
No one individual or event can control failure, but there are methods to mitigate or better prepare for it to reduce impact. The playwright Arthur Miller is also very reflective about the inevitability of failure in some form, and how as individuals, or organizations, we can better understand the value it can hold.
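
For anyone who wants to see the risk registry idea in concrete form, here is a minimal sketch in Python. It is only an illustration, not a standard tool: the class names, the 1-to-5 scales, and the likelihood-times-impact score are all assumptions made for the example. The four strategies are the ones named above.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    """The four responses named in the post."""
    AVOID = "Avoid"
    CONTROL = "Control"
    TRANSFER = "Transfer"
    ACCEPT = "Accept"


@dataclass
class Risk:
    issue: str          # the stated concern, in the words of whoever raised it
    likelihood: int     # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int         # assumed scale: 1 (minor) to 5 (severe)
    strategy: Strategy
    raised_by: str = ""

    @property
    def score(self) -> int:
        # Assumed convention: likelihood multiplied by impact.
        return self.likelihood * self.impact


class RiskRegistry:
    """Hypothetical in-memory registry; a spreadsheet works just as well."""

    def __init__(self):
        self._risks = []

    def add(self, risk):
        # Reject values outside the assumed 1-5 scale instead of storing them.
        for name, value in (("likelihood", risk.likelihood), ("impact", risk.impact)):
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")
        self._risks.append(risk)

    def ranked(self):
        # Highest score first, so the biggest concerns lead the review.
        return sorted(self._risks, key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    registry = RiskRegistry()
    # Illustrative entries with made-up scores.
    registry.add(Risk("Team members lack time for project work", 4, 4, Strategy.CONTROL))
    registry.add(Risk("A technical dependency slips its deadline", 2, 5, Strategy.ACCEPT))
    for risk in registry.ranked():
        print(risk.score, risk.strategy.value, risk.issue)
```

The point of the structure is less the arithmetic than the habit: every raised concern gets written down with an owner, a rough size, and a chosen response, so it cannot quietly disappear.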

There is another event a few years back that is a more discrete failure, the crash and sinking of the cruise ship Costa Concordia. First things first, 33 souls were lost. 32 during the sinking, and 1 during recovery. And you can never take that lightly. The captain (who went to trial for manslaughter and abandoning ship) strayed off the charted course at the last minute. Well, then came the recovery effort, led by a project manager from Italy. I think for a PM, getting the call to manage a project like this is like going to the show. You’ve been called up to the majors. As PM on such an effort, you have a real countdown clock, each day of the plan is costing hundreds of thousands of dollars, you have unprecedented magnitude of risk, environmental variables you cannot control, and the whole world waiting as a stakeholder. You have to be on your game. You have to think everything through to the greatest extent. You have to imagine multiple outcomes, and prepare for all of them. Managing a recovery effort, or the result of a failed project, means even less tolerance for any additional errors. Costa Concordia was two times the size of the Titanic! It carried 100 thousand gallons of fuel, and it carried supplies and chemicals for 4200 people. It sank in a fragile eco-system where a rare breed of mussel lived. Each diver working on the recovery could only stay down for 45 minutes. Talk about a project sprint! This was a half-billion dollar cruise ship, and the recovery effort cost more than triple that price tag. The force required to right the ship was one and a half times that required for a space shuttle takeoff! It took over 7 hours to right it 1/6th of the way up, and once they started, they couldn’t stop, so they worked even in the dark of night. Can you imagine getting to be the project manager on this one? I’m so nerdy about it, I think he should write a book and go on tour, and I’d be standing first in line to get his autograph.

In my experience, most projects will fail at the beginning or at the end. Why? This is when the thinking and planning require more precision and thoroughness. You must begin and end with input from more than just the project team. People may under-plan, or misunderstand the project’s purpose. People may also assume the project is an isolated event in an organization, or they may not think through what a post-rollout requires. A good PM knows the finish line is never where you think it is going to be. Post-rollout is such an important phase of the project lifecycle. Whether you scheduled a soft-launch, or just a straight cut-over to the new implementation, post-rollout still requires a plan. Ideally a smaller “release team” is prepped to monitor production for a two-to-four-week period, and also resolve and archive the project artifacts. Always remember the project doesn’t end the day you go into production. And don’t wait for failure to happen before talking about it. Make it part of your Before Action Review environmental scan. Ask straight up, “How might we fail?” “What could go wrong?” Don’t make it a taboo subject. Don’t assume you won’t be able to know how you might fail before you get started on a project. One of my earliest lessons in enterprise systems administration was to always test for failure. A colleague from IBM taught me that. When you are testing something, don’t limit scenarios to the “happy path” or the regular use of the system/application. Create tests that you know will fail/not pass go, and see what the outcome is. Did you receive an error message? Did it go into the void? Did that test item get processed anyway, even though you thought it would fail? Failure should be part of your design thinking, and project scenarios.
Just as professional sports teams hold practice games in manufactured conditions to simulate potential game-day variables (practicing with blaring music, heckling, artificial rain/snow/cold), organizations could include failure or chaos simulations during their own testing/pre-production efforts. Quite simply, before implementation try to break stuff. You’ll gain more insight into your product than you may anticipate.
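
Here is a small sketch, in Python, of what testing off the “happy path” might look like. Everything in it is hypothetical: `ingest_record` stands in for whatever step of your system accepts input, and the required fields are invented for the example. The idea is to feed it inputs you expect to fail and record which of the three outcomes you got: a clear error, silence, or an item processed anyway.

```python
def ingest_record(record):
    """Hypothetical ingest step: return the identifier or raise a clear error."""
    if not isinstance(record, dict):
        raise TypeError("record must be a dict")
    # Assumed required fields, chosen only for illustration.
    missing = [field for field in ("identifier", "title") if not record.get(field)]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    return record["identifier"]


def probe_failures(cases):
    """Run inputs that should fail and report what actually happened to each."""
    outcomes = {}
    for name, bad_input in cases.items():
        try:
            ingest_record(bad_input)
        except (TypeError, ValueError) as err:
            # The good outcome: the system said no, and said why.
            outcomes[name] = f"rejected: {err}"
        else:
            # The worrying outcome: the item went through even though it should not have.
            outcomes[name] = "processed anyway"
    return outcomes


# Inputs deliberately designed to fail.
BAD_CASES = {
    "empty record": {},
    "blank title": {"identifier": "rec-1", "title": ""},
    "not a record at all": None,
}

if __name__ == "__main__":
    for case, outcome in probe_failures(BAD_CASES).items():
        print(f"{case}: {outcome}")
```

Any case reported as “processed anyway” is worth a conversation before go-live, since it means bad data can enter the system without anyone being told.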

It is important not to see success and failure as binaries. An organization’s culture, or philosophy about failure, contributes to a project’s overall success. I’d love to encourage more open conversation about failure, and help organizations define their relationship to it and tolerance for it. There are degrees of failure. Some failure is unavoidable due to complexity, other failure is entirely preventable. There is new thinking on what is called “Intelligent Failure.” These failures are largely experimental in nature. They occur as an organization, or a team, is moving through an ideation/development stage of a product or project design. Some businesses encourage their development staff to “fail faster” or to “fail often.” Some of it depends on the type of industry you work in, and how risk-tolerant you are as an organization. For example, failing often isn’t going to work in hospitals or healthcare. There is a spectrum though. I believe every organization could, and should, have several published methods and tools in place for how to locate failure, discuss it, and ideally, learn from it. There are two interesting areas that often govern attitudes towards failure. One is the “sunk-cost fallacy.” This is when someone, or an organization, decides to keep pursuing their course of action or strategy, because they have already invested time and money into it. Even if signs, or instincts, or pure facts are presented, psychologically a decision is made to stick with it and hope for the best outcome. Another is an assumption about consent or control. Sometimes only the most vocal person in the room is listened to. Others may be sitting there, but not speaking up, even if they are holding strong opinions about whether or not to continue down a stated path. I mentioned that every organization should have tools and methods in place for helping manage failure.
In this scenario, some helpful methods would include having an agreed-upon set of decision-making criteria in advance of the project. Setting decision criteria in advance will make it less emotional if/when the time comes to make a hard choice about whether a project or decision is still working. I also think it helps to document decision making, so it can be referred to uniformly and objectively throughout the project lifecycle.

As we implement the new search and discovery platform for special collections and archives, I have several areas of potential failure in my mind. One of the most common is not having enough time. Everyone working on the project has full-time commitments within their regular job. In estimating how much time anything will take, I try to be deeply thorough in my environmental scan, making notes and considerations for other major projects and events within the same time span, including a move back in after a major renovation in Manuscripts and Archives. Not making preferred deadlines, or the slow shifting of the milestones, is always a big concern for me. As a PM, your role is to keep everything moving forward. The work, the teams, the ideas, and conversations. Sometimes you are also managing external parties and vendors, who have their own objectives and deadlines in addition to yours. This project has several technical dependencies within other areas of our infrastructure, and making sure they are also completing successfully is another area of risk. Everyone involved with the project will perceive failure in their own way, just as they’ll perceive project success in their own way. Expectation is another major area where there will be degrees of success and failure. As the PM, you must do your best to be honest about potential points of risk and failure, but also not become paralyzed by them. There is no such state as perfect. There will always be someone in the room who isn’t happy, and seated directly across from them will be someone who is absolutely delighted over your hard work. You can’t take too much of any of it to heart. Just always stick to the philosophy of bringing as much logic and kindness to every situation as possible. In closing, I should note that as much as I am lunch buddies with failure, I spend a good amount of time on the science and art of happiness too!
Whatever level you are at fighting the good fight, here is some general advice from my favorite good-hearted failure:

And always remember to put cameras in the back!

 
