Farewell YuFind: LibraryIT to retire beta service in October

The Enterprise Systems & Architecture group would like to announce that LibraryIT is retiring YuFind, the Yale University Library VuFind beta service. The link http://yufind.library.yale.edu/ will no longer be available after October 31st, 2014. Patron use was discontinued some time ago, and support for the service ceased in 2013.

YuFind was one of the Library’s first forays into a faceted interface running on a Solr index, and the lessons learned from this beta project have informed the development of the Blacklight-powered Quicksearch beta, which also uses a central Solr index.

If you have any questions, comments, or concerns, please contact raymond dot frohlich at yale dot edu.

Yale ITS to hold Tech Summit this month, Library IT to present

Join us for the inaugural Yale Technology Summit, a day-long program of conversations with Yale faculty, students, and staff working with innovative and cutting-edge technologies. The event, coordinated by Yale Information Technology Services, is free and open to all members of the Yale community.

Library and Library IT presentations at this event include:

  • Library Development for Digital Repositories: What is this Hydra Fedora stuff?
    In response to a fragmented digital collections environment developed over many years using many systems, the Yale Library has launched a project to unify digital collections within a single open source software framework using Hydra/Fedora. Michael Dula, the Library CTO, will talk about the decision to go open source with Hydra and Fedora as the underlying technologies. Topics will include Yale’s contributions to the open source Hydra community, a demonstration of initial projects, and future development plans and possibilities. 
  • Quicksearch: Universal Search at the University Library
    The Library offers several search interfaces: Orbis and MORRIS search the Library and Law Library catalogs, Articles+ searches articles, journals, and newspapers, and several other interfaces search digitized collections. The many search interfaces present a challenge to our patrons, who have to select the correct search depending on the material they need. The Library will combine several of these search interfaces into one unified ‘Quicksearch’, which over time will become a comprehensive search interface for the majority of Library resources. The Quicksearch poster session will highlight progress on the project so far. We will also provide laptops so Summit participants can try the new search for themselves.

  • Humanities Data Mining in the Library
    In response to increased scholarly demand, Yale University Library is helping humanists make sense of large amounts of digital data. In this presentation, we will highlight recent projects based on Yale-digitized data, data from large commercial vendors, and data from the Library of Congress. We’ll address 1) working with digitized collections that are subject to license & copyright, 2) thinking about both explicit metadata and latent structure in large digital collections, and 3) moving beyond text to consider machine vision and computational image analysis.

  • Preservation and Access Challenges of Born-Digital Materials
    We will provide an introduction to the scope of born-digital materials at Sterling Memorial Library and the Beinecke Rare Book & Manuscript Library, and in particular will discuss the innovative ways staff at the Yale libraries are collaborating with colleagues on different initiatives, including a digital forensics lab devoted to the capture of born-digital materials, an emulation service that can provide online access to vintage computing environments via a web browser, and a vision for digital preservation to ensure that collection materials we capture today will remain usable in the future.

Watch the conversation on #YaleTechSummit2014 on Twitter!

via 2014 Tech Summit | Yale ITS

Development Notes for 10/13 – 10/17

Just a brief update on the work of our group for the past week.

We continue our efforts in contributing to the Fedora 4 project. We use Fedora as one of the core products in our Hydra implementation. Currently we have several installations of version 3. Version 4 has been in development for a little over a year with an expected release date of June 2015. While Yale has been a financial contributor to the Fedora Commons project for many years now, we only started contributing code to the project in 2013.

The Quicksearch project is also moving along swiftly. This week the major milestone of handling CAS login was completed. This is used for some features in the Blacklight software like the bookbag and search history. CAS is generally simple to integrate with most software products as long as the link between a NetID and the local user database can be made. In the case of Blacklight, making this link became complicated because of the use of several different code libraries in the specific version of Blacklight being used for Quicksearch, which is different from the version we use for our Hydra interface, FindIt.
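As a rough illustration of the NetID-to-local-user link mentioned above, here is a minimal sketch in Python. It is not the Quicksearch code (Blacklight is a Ruby on Rails application); the `LocalUser` and `UserStore` names and the in-memory storage are assumptions made purely for illustration. The idea is that once CAS has authenticated a NetID, the application looks up the matching local account and creates one on first login, so features such as the bookbag and search history have a local user to attach to.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class LocalUser:
    """Hypothetical local account record keyed by NetID."""
    user_id: int
    netid: str


class UserStore:
    """In-memory stand-in for the application's local user database."""

    def __init__(self) -> None:
        self._by_netid: Dict[str, LocalUser] = {}
        self._next_id = 1

    def find_or_create(self, netid: str) -> LocalUser:
        """Return the local user for an authenticated NetID, creating it on first login."""
        key = netid.strip().lower()  # normalize so "ABC12" and "abc12" map to one account
        if not key:
            raise ValueError("NetID must not be empty")
        user = self._by_netid.get(key)
        if user is None:
            user = LocalUser(user_id=self._next_id, netid=key)
            self._by_netid[key] = user
            self._next_id += 1
        return user
```

Calling `find_or_create` twice with the same NetID returns the same local account, which is the property the CAS integration depends on.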

Almost all efforts this week were related to ingest operations for the Kissinger project. Some vacation time was also taken, so output this week was limited.

Yesterday we met to discuss the development of full text search for objects ingested into Hydra. The work is broken up into the following steps:
1- alter the Solr index to accept two new fields that will store full text
2- alter the Hydra ingest application to store the content of the TXT files in the new Solr fields
3- set up the Blacklight controllers to handle if and when each of the full text (FT) fields is used in user searches
4- develop the Blacklight user interface to allow the FT search option

At this point we are focused only on the first two steps; steps 3 and 4 require us to have data in place. We will move steps 1 and 2 to the test environment the week of Oct 20 and then roll these changes into production the week of Oct 27. We will do all our FT testing with the Day Missions collection, which uses a Hydra ingest package very similar to Kissinger's.
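To make the first two steps more concrete, here is a minimal sketch in Python of how an ingest step might map the content of a TXT file onto one of two full text fields. The field names `fulltext_open_txt` and `fulltext_restricted_txt` and both function names are hypothetical; the actual Solr schema and the Hydra ingest application code are not shown in this post.

```python
from pathlib import Path

# Hypothetical field names; the real schema may use different ones.
OPEN_FIELD = "fulltext_open_txt"
RESTRICTED_FIELD = "fulltext_restricted_txt"


def build_fulltext_doc(object_id: str, text: str, restricted: bool) -> dict:
    """Return a Solr-style document with the text stored in exactly one full text field."""
    field = RESTRICTED_FIELD if restricted else OPEN_FIELD
    return {"id": object_id, field: text}


def doc_from_txt_file(object_id: str, txt_path: Path, restricted: bool) -> dict:
    """Read a TXT file from an ingest package and build the document to index."""
    text = txt_path.read_text(encoding="utf-8")
    return build_fulltext_doc(object_id, text, restricted)
```

Keeping the choice of field in one small function means the ingest code decides once, per object, whether its text is open or restricted.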

This is a repost from another location that has more information on our full text search plan. Here I will give a brief overview of that plan, along with the use cases used to draft the approach.

There are two types of full text search for objects we ingest into Hydra.

The first is the simplest: OCR text from a scanned image, such as a page in a book or a manuscript. This type of text is treated as an extension of the metadata, which makes it simple to combine into search results since the text is considered open access.

The second is significantly more complex: the contents of the full text require special permission to search, so instead of being treated as an extension of the metadata, the text is treated the same way we treat files that carry special access conditions. This permission would have been granted ahead of time, so when you execute your full text search it will include results from the restricted items. This use case is currently specific to the Kissinger papers project but is being programmed to scale out as needed.

The approach we are taking is fairly simple: we place the open access full text into one Solr field and the restricted access full text into a field specifically designed for restricted content. When a search is executed, the open access text is searched in full, while the restricted text is filtered so that your search applies only to the restricted content you have been granted access to view.
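The search-time behavior can be modeled with a small in-memory sketch in Python. This is an illustration of the logic only, not the Blacklight or Solr implementation: the `IndexedObject` class, the `access_group` attribute, and the simple substring matching are all assumptions. Every object's open access text is searched, while restricted text is searched only when the user holds the matching permission.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class IndexedObject:
    object_id: str
    open_text: str = ""                 # stands in for the open access field
    restricted_text: str = ""           # stands in for the restricted field
    access_group: Optional[str] = None  # hypothetical permission guarding restricted_text


def search(term: str, objects: Iterable[IndexedObject], granted: Set[str]) -> List[str]:
    """Return ids of objects whose searchable text contains the term.

    Open text is always searched; restricted text is searched only for
    objects whose access_group is in the user's granted set.
    """
    needle = term.lower()
    hits = []
    for obj in objects:
        open_hit = needle in obj.open_text.lower()
        restricted_hit = (
            obj.access_group in granted
            and needle in obj.restricted_text.lower()
        )
        if open_hit or restricted_hit:
            hits.append(obj.object_id)
    return hits
```

A user with no grants sees matches from open access text only; a user who has been granted access to a restricted collection ahead of time also sees matches from that collection's text.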