Security and Sharing

Over the past month, Steelsen Smith from the Enterprise Systems and Services group had the opportunity to attend two events related to work we do in Library IT: a NERCOMP-sponsored security conference and the ILLiad international resource sharing conference.

The first was Boston College’s annual “Security Camp” – a free, one-day event for IT professionals. The 2015 agenda included lots of timely material, including presentations on identity and access management, Docker (a software packaging and containerization system), security scanning, DDoS attacks, and more. The full agenda is here.

For anyone who manages the deployment of information systems, it has been impossible to avoid Docker. In a nutshell, this technology allows users to bundle all of the interrelated parts of an application into a “container” that can then be run on a physical or virtual server. The advantage is that many code packages can share the same server without the overhead of a full virtual machine per application. The platform has proven to be robust, and the presenter (from MIT) made a great case for Docker having applications in the classroom or enterprise. The greatest strength of the solution is that an application’s dependencies (e.g., its Java version) can be updated individually without affecting its co-hosted peers. The software can also run on a hardened, read-only OS (CoreOS, for example). Docker should not be trusted as fully secure for hosting potentially hostile containers, however. The main vulnerability of the platform comes from its strength – allowing direct hardware sharing. This means that if an application is carefully written to monitor hardware activity, it can learn something about the containers it resides with. Also, if an application is able to successfully compromise the kernel, it will have access to all other containers on the machine, whereas escaping a dedicated VM would require a few extra steps.
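For the curious, here is a minimal sketch of that per-application isolation using the Docker SDK for Python; the image tags and version numbers are purely illustrative, not anything presented at the conference.

```python
# Minimal sketch of Docker's dependency isolation using the Docker SDK
# for Python (pip install docker). Image tags are illustrative -- any two
# images carrying conflicting runtimes make the same point.
import docker

client = docker.from_env()  # talks to the local Docker daemon

# Two "applications" with different interpreter versions share one host,
# but each carries its own dependencies inside its own container image.
for image in ("python:3.8-slim", "python:3.11-slim"):
    output = client.containers.run(image, "python --version", remove=True)
    print(image, "->", output.decode().strip())
```

Upgrading one image never touches the other, which is exactly the co-hosting benefit described above.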

Another interesting talk focused on handling distributed denial of service (DDoS) attacks effectively. A DDoS is a very basic attack – it drowns out legitimate website requests by triggering an overwhelming number of invalid requests (like shouting in a room where people are speaking) – and it has become surprisingly easy to launch: there are sites that will let you control their “botnets” of slave machines for a small fee. These attacks are also effective because they rely on the internal operation of fundamental internet protocols (e.g., abusing SNMP or exploiting the TCP handshake), making them hard to protect against. In fact, the two best defenses (note that firewalls are not at all helpful in a DDoS attack) involve using outside providers to manipulate the internet to deflect traffic away from you. For web requests, a CDN (content distribution network) can host your website and split it among datacenters around the world, which are collectively able to withstand an attack. For attacks based on amplification (requesting a long answer with a short question), a provider like Incapsula or NeuStar can actually intercept internet traffic for you and scrub it – for a sizable fee. While universities generally do not need to worry as much as banks, if the blogosphere takes issue with something your institution has done, a DDoS attack becomes a real possibility.
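As a back-of-the-envelope illustration of why amplification is so attractive to attackers, the arithmetic below uses assumed packet sizes and traffic volumes, not figures from the talk.

```python
# Illustrative arithmetic only: rough, assumed sizes showing why
# "short question, long answer" protocols make effective DDoS amplifiers.
query_bytes = 60        # assumed size of a small spoofed request
response_bytes = 3000   # assumed size of the response reflected at the victim

amplification = response_bytes / query_bytes
attacker_gbps = 1                       # assumed traffic the botnet can send
victim_gbps = attacker_gbps * amplification

print(f"Amplification factor: {amplification:.0f}x")
print(f"{attacker_gbps} Gbps of requests becomes ~{victim_gbps:.0f} Gbps at the target")
```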

A few weeks later came the ILLiad International Conference in Virginia Beach, VA. Mostly attended by librarians, with presentations focused on resource sharing, the conference included a number of interesting talks that applied directly to work in IT, both in our support for interlibrary loan software and in discovery.

Linked data was one of the unexpected highlights of the conference, with the vendor Zepheira giving talks on how the relationships between assets exposed by linked data can drive use. The theory is that discovery necessarily leads to increased use – therefore, the easier it is for search engines and link aggregators to discover your content, the easier it will be for users to discover it. The natural extension is that, once discovered, your resources should also be easy to request. Consolidated requesting – having your users register once and then search and request through a single interface – is one of the ideal outcomes of a library’s analysis and enhancement of its web presence.
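As a rough illustration (not anything Zepheira presented), a catalog record exposed as JSON-LD using the schema.org vocabulary might look like the sketch below; every title, name, and URL is invented.

```python
# An illustrative JSON-LD record (schema.org vocabulary) of the kind a
# library might embed in its discovery pages so search engines and link
# aggregators can pick up relationships between assets. All values are made up.
import json

record = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "Example Special Collection Item",
    "author": {"@type": "Person", "name": "Jane Example"},
    "datePublished": "1910",
    "url": "https://library.example.edu/catalog/12345",
    # A single, stable request link is one way to support consolidated requesting.
    "potentialAction": {
        "@type": "ViewAction",
        "target": "https://library.example.edu/request?item=12345",
    },
}

print(json.dumps(record, indent=2))
```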

Another useful presentation topic addressed how medical libraries handle requests from independent medical researchers and physicians. A service, Loansome Doc, allows physicians to affiliate themselves with a library to request medical articles. The library then procures those materials on their behalf. There are more differences than similarities, however, when it comes to how these requests are filled. Some libraries have a nearly automated process while others still provide highly individualized service. Some libraries allow electronic delivery to happen automatically while others require approval and payment. While it was fascinating to learn about what different medical libraries are doing, it was also interesting to think about how article requesting might work as a general service to the public – allowing the “visitor privilege” to be extended to folks elsewhere on the internet. There are no doubt serious legal considerations, but how this could be done safely is a topic of considerable interest.

In both securing information and sharing it, IT systems can help the university, and the library within it, meet institutional goals (or even just comply with regulations). These two events provided great insight into what our peers are doing (or not doing) and the results at their institutions. Far too much was covered for a single blog post, so please feel free to email me if you’re interested in notes or would like to talk about any of the agenda topics.

ArchivesSpace Passes SDR

Yale University Library is implementing an exciting new open-source platform for describing and managing special collections materials called ArchivesSpace.

Part of the implementation included working with Yale’s Information Security department to complete a Security Design Review (SDR) of the application and infrastructure. Working with John Lee from InfoSec is always a great experience. The SDR process, recently improved, is required for any new service or application brought up on the Yale network. It begins with a comprehensive questionnaire about the application, its data, and the users of the system. InfoSec then runs a series of scans (Acunetix, Nessus, etc.) to suss out any vulnerabilities that could compromise the application or the Yale network.

The reports detail the vulnerabilities and rank them as High, Medium, or Low risk. Application owners then work with InfoSec on remediation. Once all High and Medium risks are resolved, and Low-level risks are remedied or accepted as-is, the application receives final security approval. The process is informative and keeps the Library in compliance with Yale security standards.

FindIt, QuickSearch Security Design Review Completed

"data.path Ryoji.Ikeda - 4" by r2hox is licensed under CC BY-SA 2.0
data.path Ryoji.Ikeda – 4” by r2hox is licensed under CC BY-SA 2.0

Yale University Library’s FindIt and QuickSearch services have completed a Security Design Review (SDR) by the Information Security Office of Yale ITS. These systems use the Hydra repository solution as the underlying technology stack. The SDR process is used to provide recommendations for building, improving, or reengineering services to meet University policies, industry best practices, laws, and regulatory requirements. Thanks to Bob Rice for evaluating and implementing the recommendations, and to Tom Castiello and Marcus Aden from the Information Security Office for their insight and participation.

Data Storage Virtualization and Security Upgrades Continue

Library IT’s Information Architecture team continues to manage the SGI/StorHouse storage virtualization upgrade and Trusted Edge implementation, which began this week.

To oversimplify, storage virtualization is the introduction of a virtualization layer that sits between network-connected data storage spaces. This allows related files (typically digital objects) to be stored in unrelated storage areas yet presented as one continuous resource – the end user looking for and retrieving files will never see that the system they are using actually lives in separate spots.
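A deliberately oversimplified sketch of that idea is below; the object names, backends, and paths are invented for illustration and are not how StorHouse itself is organized.

```python
# Toy model of a storage virtualization layer: one logical namespace in
# front of several physical backends. Names and paths are invented.

CATALOG = {
    # logical object path -> (backend, physical location)
    "object-0001/master.tif": ("archive-backend", "/vault/a7/master.tif"),
    "object-0001/access.jpg": ("fast-disk", "/cache/a7/access.jpg"),
}

def retrieve(logical_path: str) -> str:
    """The user asks for one logical path and never sees which physical
    storage area actually holds the file."""
    backend, physical = CATALOG[logical_path]
    return f"fetched {logical_path} from {backend}:{physical}"

print(retrieve("object-0001/access.jpg"))
```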

Our StorHouse storage currently houses approximately 600 TB (terabytes) of replicated data, or about two thousand times the size of our Orbis database. Plans are to implement Trusted Edge, a policy-based, tiered storage component that will allow us to move files to specific storage based on their use and type. For example, archive-type data will move off to our archive system, StorHouse, while use copies that should be available to the end user might be moved to a faster storage area for quicker access.
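A toy version of that kind of tiering rule might look like the following; the tier names and rules are assumptions for illustration, not actual Trusted Edge configuration.

```python
# Sketch of a policy-based tiering decision: file type and usage pattern
# determine where a copy lives. Tier names and rules are assumed.

def choose_tier(file_kind: str, recently_used: bool) -> str:
    if file_kind == "archive":
        return "StorHouse (archive tier)"
    if recently_used:
        return "fast disk (access tier)"
    return "standard disk"

print(choose_tier("archive", recently_used=False))      # preservation master
print(choose_tier("access-copy", recently_used=True))   # use copy for end users
```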