Spotlight on Spotlight

Do you have content in Blacklight? Do you have content in other silos? Would you like to create dynamic exhibits and/or collections? Would you like to manage content, display, search, and facets in a highly configurable online interface? If you answered yes to any of these, welcome to Spotlight!

“Spotlight is open source software that enables librarians, curators, and other content experts to easily build feature-rich websites that showcase collections and objects from a digital repository, uploaded items, or a combination of the two. Spotlight is a plug-in for Blacklight, an open source, Ruby on Rails Engine that provides a basic discovery interface for searching an Apache Solr index.”

Exhibit page content can be directly tweaked from the browser.

On August 9th and 10th the Yale Center for British Art (YCBA) and Yale Library hosted the event “Spotlight on Spotlight”. We were pleased to have members of the Spotlight team here to give a full demonstration, Q&A, and developer unconference. Stu Snydman, Gary Geisler, and Chris Beer from Stanford, and Trey Pendragon from Princeton led the sessions. The main demonstration Tuesday morning included a brief history, a review of the initial use cases, context surrounding the platform, and walkthroughs of the application and its features. In the afternoon the Q&A session provided a further chance to answer questions collected from the morning presentation and to hold a live conversation. On Wednesday developers stood up individual instances of the application, exercised its extensibility using the DPLA API to import content, and held further technical discussion. After attending the event, Steve Weida, Yale Library Webmaster, commented, “Spotlight is exciting technology and has matured at a very impressive pace. Along with our commitment to Omeka, Spotlight could play a key role in the future of the Library’s web presence.”

A full recording of the demonstration is available here:
http://britishart.yale.edu/multimedia-video/27/3681

Project website with codebase and further links:
https://github.com/projectblacklight/spotlight

Event wiki:
https://wiki.duraspace.org/display/hydra/Spotlight%20on%20Spotlight%20at%20Yale

DPLA API:
https://dp.la/info/developers/codex/

YUL gets an updated interface for digital collections search and discovery

On Monday 9/21/2015, YUL’s digital collection discovery interface (findit.library.yale.edu) will go live with a new design modeled on the Quicksearch interface design. These coordinating designs let our users know that they are in the same Yale University Library web space and should expect similar functionality.

The new look and feel of digital collections search at YUL- main page (click the image to get a larger view):

Digital Collections Search look and feel

 

Search results in the new design:

Digital Collections Search look and feel - search results

 

There will also be a few new features in the digital collections search added on Monday. These include:

 

  • an Access Restrictions facet, to limit by either open or restricted access

New feature: Access facet

  • a Repository facet, to limit to and search within a specific repository at YUL

New feature added: Repository facet

  • a Call Number facet, to limit to and search within call numbers assigned to items

New feature: call number facet

 

Another feature coming soon (but not on Monday) is a date slider with a histogram visualization, which gives users the ability to limit by date range and see the frequency of hits in a given year. To see an example of a feature like this used elsewhere, click on this search of Articles+ and look to the lower left to see the date range and histogram.

Coming soon: date range

 

As always, your feedback is welcomed and appreciated. Please use the feedback link on the bottom center of the digital collections search page (or just click here), and tell us your thoughts!

Fedora 4 work at Open Repositories 2015

Open Repositories 2015 highlighted some of the interesting projects from different institutions.

One such Fedora-based project demonstrated the application and usefulness of the Linked Data Platform. The project uses graph-based metadata; research data is cataloged using the museum-specific FRBRoo and CIDOC Conceptual Reference Model (CRM) ontologies instead of classic metadata schemes. The project participants claimed that in comparison to traditional bibliographic metadata schema (such as MODS), “these vocabularies allowed us to express detailed relationships between digital objects and entities (people, places, events, concepts) in a more nuanced way”. The relationships are described in RDF and persisted in Fedora 4. Fedora 4 speaks and understands RDF natively and acts effectively as a Linked Data compliant server.

The benefit for the project is that the user interface (based in Islandora) is driven entirely by RDF and researchers and scholars can directly query the metadata graph by entering powerful queries via a SPARQL endpoint.
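To make the idea of querying the metadata graph concrete, here is a minimal sketch of how a client might prepare a request for a SPARQL endpoint. The endpoint address is hypothetical (the project's real address was not given), and the query assumes the graph contains resources typed as CIDOC CRM persons with an rdfs:label; the actual data may be modeled differently.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address, used only for illustration.
SPARQL_ENDPOINT = "http://localhost:3030/repository/sparql"


def build_people_query(limit=10):
    """Return a SPARQL query listing resources typed as CIDOC CRM persons.

    Assumes the graph uses crm:E21_Person and rdfs:label.
    """
    return (
        "PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?person ?label WHERE {\n"
        "  ?person a crm:E21_Person ;\n"
        "          rdfs:label ?label .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_request_url(SPARQL_ENDPOINT, build_people_query()))
```

The resulting URL could be fetched with any HTTP client; the sketch stops short of sending the request because the endpoint is only a placeholder.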

I also had the opportunity to present a poster exploring the integration of Fedora 4 with Sakai (known locally as Classes v2). Fedora 4 uses a backend technology (ModeShape) that is an implementation of the Java Content Repository (JCR) standard. Sakai offers some support for JCR, making direct integration feasible. Sakai content can be directly accessed in Fedora 4 and Sakai can access Fedora 4 content as if it were a part of its own datastore. Before Fedora 4, custom tools would have to be developed to tackle the linking, making inter-operability harder and less maintainable. Fedora 4 offers a new way, thanks to a modern technology stack. A number of leading institutions that run both Sakai and Fedora expressed an interest in further exploring the integration possibilities via this approach.

LDCX 2015

Approximately 70 people convened at Lathrop Library on the Stanford University campus to collaborate on the converging goals of the library, archive, and museum community at the 6th annual LDCX conference. While the schedule was ad hoc, composed of lightning talks, plenary sessions, topic groups, and informal breakouts, the issues were well rooted in the themes of linked data models, discovery applications, and digital asset management. One of the long-standing goals of the community has been bringing together individual and institutional efforts, and this was very much manifest at the conference. There was a fruitful balance of sharing past achievements, making ongoing progress, and planning for challenges to come. The Hydra stack has made its presence felt in almost every arena. Development is at a stage where best practices and design abstractions are emerging. Implementation of the Linked Data Platform (LDP) and the Portland Commons Data Model (PCDM) holds much promise as a foundation for the future. Surprisingly, there was very little coverage of digital preservation, but perhaps this is a potential vacuum to be filled later. While it is difficult to give adequate attention to everything covered, for more please check out:

Projecthydra
Spotlight
Geoblacklight
Arclight and next-gen archives
Mirador
Linked Data Platform
Portland Commons Data Model
IIIF Image and Presentation Specification
Sufia
Fedora 4
Avalon

Quarterly Report from Fedora, October – December 2014

March 3, 2015

 

Read it online: http://bit.ly/1M39u2P

Contact: David Wilcox <dwilcox@duraspace.org>

 

The Quarterly Report from Fedora, October – December 2014

 

=================

Fedora Development

=================

In the past quarter, the development team released the production release of Fedora 4.0; detailed release notes are here:

 

Fedora 4.0.0 Production Release Notes [1]

 

This significant release signals the effectiveness of an international and complex community source project in delivering a modern repository platform with features that meet or exceed current use cases in the management of institutional digital assets. Fedora 4.0 features include vast improvements in scalability, linked data capabilities, research data support, modularity, ease of use and more. Download the latest Fedora 4 release online [2].

 

Fedora 4.0 is only the first release in the 4.x line – a number of features [3], including support for Fedora 3 to Fedora 4 migrations [4], are planned for subsequent 4.x releases. The production code sprint schedule [5] includes both feature development sprints and code maintenance sprints (to address issues as they arise). Please consider contributing developer time to these sprints by contacting Andrew Woods (awoods@duraspace.org), the Fedora Technical Lead.

 

=========

Fundraising

=========

We have concluded our official annual membership campaign, which runs from early May until the end of October (though we will continue to accept new project members throughout the year whenever the opportunity arises). The annual membership goal for 2014 was $500,000, and we exceeded this goal by raising $525,083. The Fedora project has a total of 63 members; this includes 24 new members and 39 renewals from 2013 members. The Fedora Product Manager will continue to coordinate with members of the Fedora Steering Group to expand the pool of DuraSpace members supporting the Fedora project and build a sustainable funding base for the future.

 

==============================

Community Engagement and Outreach

==============================

In the past quarter, developers have continued to hold daily meetings in conjunction with development sprints, as well as weekly Fedora committer calls attended by the broader community.

 

Members of the DC Area Fedora User Group met at the National Library of Medicine in October [6] to present project updates and learn about the latest Fedora 4 developments. This group meets twice annually to stay up-to-date on Fedora-related developments in the DC area; the next meeting will be on March 31 at the USDA National Agricultural Library [7].

 

In an effort to increase international outreach, the Fedora Product Manager traveled to Melbourne, Victoria, Australia to attend the eResearch Australasia [8] conference at the end of October. This was a great opportunity to engage with the regional Fedora community face-to-face, and meet with current and prospective DuraSpace members supporting Fedora.

 

In December, representatives from Fedora 4 Beta Pilot institutions [9] participated in a well-attended panel presentation at the CNI Fall meeting [10]. The success of the pilot program was an essential part of releasing Fedora 4.0 into production, and we are currently proceeding with another round of pilot projects [11] to support Fedora 3 to Fedora 4 migrations.

 

==============

Fedora 4 Training

==============

The 4th quarter of 2014 featured three Fedora 4 training workshops. The first training workshop was held in Washington, DC on October 7 following the DC Fedora User Group meeting [12]. It was well attended (32 participants) and the feedback was very positive. The next training workshop was held in Denver, Colorado on October 16 [13] following Islandora Camp CO. Attendance for this event was capped at 30, and it was full with a waiting list. The final October training workshop was held in Melbourne, Victoria, Australia on October 31 [14] following the eResearch Australasia conference. This event had 25 attendees and was an excellent opportunity for engagement with regional Fedora community members.

 

===========================

Upcoming Conferences and Events

===========================

The annual DuraSpace Summit will take place March 11-12 in Washington, DC. This important event is an opportunity for all DuraSpace members to celebrate the successes of the past year and discuss future plans for each of the DuraSpace projects: Fedora, DSpace, and VIVO.

 

Plans are underway to develop and deliver a 3-day Fedora training event in the Fall of 2015. This event, and the materials developed to support it, will increase engagement with Fedora 4 and provide tools for community members to host their own training events all over the world.

 

=========

References

=========

[1]   https://wiki.duraspace.org/display/FF/Fedora+4.0.0+Release+Notes

[2]   https://wiki.duraspace.org/display/FF/Downloads

[3]   https://wiki.duraspace.org/display/FF/Roadmap

[4]   https://wiki.duraspace.org/display/FF/Fedora+3+to+4+Data+Migration

[5]   https://wiki.duraspace.org/display/FF/Production+Sprint+Schedule

[6]   https://wiki.duraspace.org/display/Events/Washington+D.C.+Fedora+User+Group+Meeting+-+6-7+Oct+2014

[7]   https://wiki.duraspace.org/display/Events/Washington+D.C.+Fedora+User+Group+Meeting%3A+31+March+-+1+April+2015

[8]   http://conference.eresearch.edu.au/

[9]   https://wiki.duraspace.org/display/FF/2014+Q4+Draft+Report#

[10] http://www.cni.org/events/membership-meetings/past-meetings/fall-2014/

[11] https://wiki.duraspace.org/display/FF/Fedora+3+to+4+Upgration+Pilots

[12] https://wiki.duraspace.org/display/Events/Washington+D.C.+Fedora+User+Group+Meeting+-+6-7+Oct+2014

[13] https://wiki.duraspace.org/display/Events/Denver%2C+CO+Fedora+4+Training+Workshop+-+16+Oct+2014

[14] https://wiki.duraspace.org/display/Events/eResearch+Australasia+Fedora+4+Training+Workshop+-+31+Oct+2014

 

David Wilcox

Fedora Product Manager

DuraSpace

dwilcox@duraspace.org

Skype Name: david.wilcox82

 

Debra Hanken Kurtz Appointed DuraSpace CEO

From Michele Kimpton, Chief Executive Officer, DuraSpace, and Paul N. Courant, Chair, DuraSpace Board of Directors, University of Michigan, on behalf of the DuraSpace Board of Directors

Winchester, MA. It brings us great pleasure to announce that the DuraSpace Board of Directors has chosen Debra Hanken Kurtz to serve as the new CEO for the organization. Kurtz is currently the Executive Director of the Texas Digital Library. She will begin in her new role on February 16, 2015 and establish an office in Austin, Texas to manage DuraSpace business operations.

Kurtz brings key relevant experience and skills to DuraSpace. As Executive Director of the Texas Digital Library, she managed and grew membership, operations, and services. She participates in working and planning groups for DPN and SHARE. At both Duke University and UNC Chapel Hill Libraries, Kurtz provided leadership and direction for digital collections, public websites, and early planning efforts for both libraries’ institutional repositories. She was an active partner within the Triangle Research Libraries Network and has been a voice for Kuali OLE, an open-source integrated library system built by and for academic and research libraries. Kurtz’s complete background can be found on LinkedIn.

The DuraSpace team and Board of Directors are enthusiastic about working with Kurtz and establishing the strategic direction for DuraSpace over the coming years in the rapidly evolving landscape of digital research and scholarship. Leveraging open source technology development to advance our communities’ goals and objectives will continue to be a focus for the DuraSpace organization.

All member organizations will have an opportunity to meet and talk with Kurtz at the upcoming DuraSpace Membership Summit in Washington DC on March 11-12. Meeting invitations and details will be sent out later this week to all member organizations.

The search was conducted by a committee of the Board, augmented by MacKenzie Smith, University Librarian at UC Davis, and Julia Trimmer, Manager Faculty Data Systems, Duke University.

We are grateful to everyone for their continued support throughout this process. As a result of your ongoing interest, engagement and participation DuraSpace is well-positioned to continue to serve our communities by providing leadership and innovation in the development and deployment of open source technologies and managed services that promote durable, persistent access to digital data.

The DuraSpace Board of Directors:

Dan Cohen, Executive Director, Digital Public Library of America

Mike Conlon, Co-Director, University of Florida Clinical and Translational Science Institute, and Director of Biomedical Informatics, UF College of Medicine

Paul N. Courant, Harold T. Shapiro Professor of Public Policy, Professor of Economics, and Professor of Information, the University of Michigan

Tom Cramer, Chief Technology Strategist and Associate Director of Digital Library Systems & Services, Stanford University

Charles J. Henry, President, Council on Library and Information Resources

Anne Jarvis, University Librarian, University of Cambridge

Heather Joseph, Executive Director, Scholarly Publishing and Academic Resources Coalition

Tyler Walters, Dean of University Libraries, Virginia Tech University

Laura C. Wood, Director of Tisch Library, Tufts University

Fedora 4 Committers Announced

Open source software projects typically have two governance structures: one manages the leadership and strategic roadmap, and the second manages the source code. Both are critical to the success of a software development initiative, but when working with open source software, which allows contributions from virtually anyone, management of the source code is critical to the survival of the product.

In December 2014, Fedora 4 was officially released after two years of planning and development. Yale’s contributions to the creation of Fedora 4 include financial commitments as well as staffing. From the Digital Library Programming Services group in Library IT, Osman Din and Eric James have made substantial contributions to the development of Fedora 4 since July 2013.

Now that Fedora 4 has moved from a beta platform to a production-worthy application, the Fedora Leadership Committee recommended forming a governance structure for maintaining the source code. The nominees included the top contributing programmers from all partner institutions. Osman Din from Yale was nominated and accepted a position in the Fedora Committers group. The full list with roles and responsibilities can be viewed here.

Presentation on Digital Repository Development at CNI Fall Meeting

Library IT’s CTO Michael Dula presented on digital repository development last week at the Coalition for Networked Information’s fall 2014 meeting in Washington D.C. The presentation is linked here.

via CNI.org: The Coalition for Networked Information (CNI) is dedicated to supporting the transformative promise of digital information technology for the advancement of scholarly communication and the enrichment of intellectual productivity. Some 220 institutions representing higher education, publishing, information technology, scholarly and professional organizations, foundations, and libraries and library organizations make up CNI’s members; CNI is entirely funded through membership dues. Semi-annual membership meetings bring together representatives of CNI’s constituencies to discuss ongoing and new projects, and to plan for future initiatives.

Fedora 4 development notes for November 2014

Fedora is used locally in a number of applications including our Hydra instance, Finding Aids Database, AMEEL and the Joel Sumner Smith collection. In our local instances we have been using versions of Fedora 3 since 2006. Fedora 3.8, to be released in December, will be the final release of the 3.x line, with energy at that point shifting primarily to Fedora 4.

Fedora 4 is a complete refactoring of the Fedora 3 code, now built on top of the ModeShape and JCR repository APIs, with improvements in ease of installation, scaling, and RDF support. Below is a full list of features.

Yale contributes financially to the product as a bronze member and we also contribute to the programming efforts. Osman Din and Eric James, both in Digital Library Programming Services, actively participate in development. In addition, I sit on the Fedora Leadership Committee that handles the gathering of use cases, prioritization of features being programmed as well as budget planning.

Fedora 4.0 is now undergoing a cycle of beta releases to allow institutions to begin adopting it. On November 9th Fedora 4.0 Beta 4 was released, again with an eye towards simple installation and support for performance and repository size. Fedora 4.1, to begin development in 2015, will focus on supporting the upgrade/migration process from Fedora 3.x. Some peers, including Penn State, have already begun to replace some of their Fedora 3 repositories with Fedora 4. We are also starting to think about how our migration strategies can dovetail with Fedora 4.1 for our own adoption starting around the summer of 2015.

With that little bit of background, I thought I would share the recent development notes. If you have questions about Fedora, do not hesitate to contact me (michael.friscia@yale.edu), Eric James (eric.james@yale.edu) or Osman Din (osman.din@yale.edu).

(note: Eric provided valuable editorial feedback for the above post)

======================================================

Release date: 9 November, 2014

 

We are proud to announce the fourth Beta release of Fedora 4. In the continuing effort to complete the Fedora 4 feature set, this Beta release is one of several leading up to the Fedora 4 release. Full release notes and downloads are available on the wiki: https://wiki.duraspace.org/display/FF/Fedora+4.0+Beta+4+Release+Notes.

 

==============

Release Manager

==============

Andrew Woods (DuraSpace)

 

==========

Contributors

==========

—————————-

1) Sprint Developers

 

Adam Soroka (University of Virginia)

Benjamin Armintor (Columbia University)

Chris Beer (Stanford University)

Esme Cowles (University of California, San Diego)

Giulia Hill (University of California, Berkeley)

Jared Whiklo (University of Manitoba)

Jon Roby (University of Manitoba)

Kevin S. Clarke (University of California, Los Angeles)

Longshou Situ (University of California, San Diego)

Michael Durbin (University of Virginia)

Mohamed Mohideen Abdul Rasheed (University of Maryland)

Osman Din (Yale University)

 

————————————

2) Community Developers

 

Aaron Coburn (Amherst College)

Frank Asseg (FIZ Karlsruhe)

Nikhil Trivedi (Art Institute of Chicago)

 

=======

Features

=======

—————————–

1) Removed features

In the interest of producing a stable, well-tested release, the development team identified and removed a number of under-developed features that had not been sufficiently tested and documented. These features were not identified as high priorities by the community, but they may be re-introduced in later versions of Fedora 4 based on community feedback.

 

– Namespace [1] creation/deletion endpoint

– Locks endpoint

– Workspaces other than the ‘default’

– Admin internal search endpoints

– Policy-driven storage

– Batch operations in single request

– Auto-versioning configuration option

– Sitemaps endpoint

– Writable nodetypes endpoint

 

——————

2) REST API

The REST API is one of the core Fedora 4 components, and this release brings it more in line with the emerging W3C Linked Data Platform 1.0 [2] specification. An example of this is the new tombstone functionality [3]; URIs are not supposed to be reused, so deleting a resource leaves a tombstone in its place that serves as a notification that the resource has been deleted. Child nodes of deleted resources also leave tombstones. Other examples of LDP-related REST API changes include:

 

– Support for hashed URIs [4] as subjects and objects in triples.

– Binary and binary description model changed:

    – From: binary description at /resource, and binary at /resource/fcr:content,

    – To: binary description at /resource/fcr:metadata, and binary at /resource

– Labels are required when creating new versions of resources [5].

– Content-Disposition, Content-Length, Content-Type are now available on HEAD requests [6].
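As an illustration of the binary model change listed above, the following sketch maps a resource path to the locations of its binary and its description under both layouts, and rewrites an old-style binary URL to the new form. The function names and the example base URL are hypothetical and are not part of Fedora or its client libraries.

```python
def pre_beta4_paths(resource_path):
    """Old layout: description at /resource, binary at /resource/fcr:content."""
    base = resource_path.rstrip("/")
    return {"binary": base + "/fcr:content", "description": base}


def beta4_paths(resource_path):
    """New layout: binary at /resource, description at /resource/fcr:metadata."""
    base = resource_path.rstrip("/")
    return {"binary": base, "description": base + "/fcr:metadata"}


def migrate_binary_url(old_url):
    """Rewrite a stored old-style binary URL to the new layout.

    URLs that do not end in /fcr:content are returned unchanged.
    """
    suffix = "/fcr:content"
    if old_url.endswith(suffix):
        return old_url[: -len(suffix)]
    return old_url


if __name__ == "__main__":
    # Hypothetical repository URL for illustration only.
    print(beta4_paths("http://localhost:8080/rest/collection/image1"))
```

A client that stored links to binaries before this release would need some rewrite of this kind; whether to do it at read time or as a one-off migration is a local decision.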

 

—————-

3) Ontology

The Fedora 4 ontology [7] was previously broken out into several different namespaces, but these have now been collapsed into the repository [8] namespace. Additionally, the oai-pmh [9] namespace has been added to the ontology.

 

———-

4) LDP

Fedora 4 provides native linked data functionality, primarily by conforming with the W3C Linked Data Platform 1.0 [10] specification. The LDP 1.0 test suite [11] is executed against the Fedora 4 codebase as a part of the standard build process, and changes are made as necessary to pass the tests. Additionally, integrations tests for real-world RDF assertions [12] have also been added to the codebase.

 

Recent changes to support LDP include:

 

– When serializing to RDF, child resources are included in responses [13], versus having to traverse auto-generated intermediate nodes.

– All RDF types on properties are now supported [14].

– Prefer/Preference-Applied headers have been updated [15] to match the latest requirements [16].

– RDF language types are now supported [17].

– The full range of LDP containers [18] is now supported.

– Changed terminology from:

    – object -> container

    – datastream -> non-rdf-source-description

– Replaced relationships from:

    – hasContent/isContentOf, to:

    – describes/isDescribedBy
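The Prefer header mentioned in the list above can be sketched as follows. This builds a header value in the form described by the W3C LDP 1.0 specification and RFC 7240, using the preference URIs defined in the LDP namespace. The helper name is hypothetical, and exactly which preferences a given Fedora 4 release honors is not shown here.

```python
LDP_NS = "http://www.w3.org/ns/ldp#"

# Preference names defined in the LDP 1.0 namespace.
KNOWN_PREFERENCES = {
    "PreferMinimalContainer",
    "PreferContainment",
    "PreferMembership",
}


def prefer_header(include=(), omit=()):
    """Build a Prefer header value asking for a tailored representation.

    Pass either 'include' or 'omit', not both, to avoid conflicting hints.
    """
    if include and omit:
        raise ValueError("pass either include or omit, not both")
    names = list(include or omit)
    unknown = [n for n in names if n not in KNOWN_PREFERENCES]
    if unknown:
        raise ValueError("unknown LDP preference: " + ", ".join(unknown))
    value = "return=representation"
    if names:
        hint = "include" if include else "omit"
        uris = " ".join(LDP_NS + n for n in names)
        value += '; {}="{}"'.format(hint, uris)
    return value


if __name__ == "__main__":
    # Ask for the container's own triples without containment triples.
    print("Prefer:", prefer_header(omit=["PreferContainment"]))
```

A server that honors the request is expected to answer with a Preference-Applied header, which a client can check before relying on the shape of the response.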

 

—————————

5) External modules

In addition to the core Fedora 4 codebase, there are a number of supported external modules that offer useful extensions to the repository. Two such modules are being introduced in the Fedora 4.0 Beta 4 release: Fedora 4 OAI Provider [19] and Fcrepo Camel [20].

 

The Fedora 4 OAI Provider implements the Open Archives Protocol Version 2.0 [21] using Fedora 4 as the backend. It exposes an endpoint at /oai which accepts OAI-conforming HTTP requests. A Fedora resource containing set information can be created and then exposed at the module’s endpoint, which accepts HTTP POST requests containing serialized Set information adhering to the OAI schema.
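For readers unfamiliar with OAI-PMH, a harvesting request is an ordinary HTTP request with a verb parameter. The sketch below builds such request URLs using the six verbs defined by OAI-PMH 2.0. The base URL is an assumption (a local Fedora 4 with the endpoint mounted at /oai under the REST root), and the helper is illustrative rather than part of the OAI Provider module.

```python
from urllib.parse import urlencode

# Assumed location of the endpoint; adjust to the actual deployment.
OAI_BASE_URL = "http://localhost:8080/rest/oai"

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(verb, base_url=OAI_BASE_URL, **arguments):
    """Return the GET URL for an OAI-PMH request with the given arguments."""
    if verb not in OAI_VERBS:
        raise ValueError("not an OAI-PMH verb: " + verb)
    params = [("verb", verb)] + sorted(arguments.items())
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Harvest Dublin Core records; oai_dc is the format every repository must offer.
    print(oai_request_url("ListRecords", metadataPrefix="oai_dc"))
```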

 

Fcrepo Camel provides access to an external Fedora 4 Containers API [22] for use with Apache Camel [23]. Camel is middleware for writing message-based integrations, so this component can be used to connect Fedora 4 to an extensive number of external systems [24], including Solr and Fuseki. This functionality is similar to that of the Fcrepo Message Consumer [25], except it is based on a well-maintained Apache project rather than being custom Fedora 4 code. Therefore, this component is likely to replace the Message Consumer in the future, though the Message Consumer will still be part of the Fedora 4.0 release.

 

————————

6) Admin Console

The administrative console provides a simple HTML user interface for viewing the contents of the repository and accessing functionality provided by the REST API. This release introduces support for custom Velocity templates [26] based on the hierarchy of mixin types. Now, if you create a new mixin type, the templates to be used in the admin console will include the resource’s primary type, mixin types, and parent types thereof.

 

—————–

7) Projection

The projection [27] (also known as federation) feature allows Fedora 4 to connect to external storage media via a pluggable connector framework. A read-only filesystem connector is included with this release.

 

Additionally, Fedora 4 now has standardized support for externally-referenced content [28].

 

—————————

8) Java client library

The Java Client Library [29] is an example of a module that was conceived by Fedora community members who recognized a common need and rallied to design [30] and implement the functionality. This release includes an improvement to list the children of a resource [31] in the client library.

 

———–

9) Build

A key component under the covers of Fedora 4 is ModeShape [32], one that the Fedora 4 project tracks closely. Fedora 4.0 Beta 4 includes an upgrade to the production version of ModeShape 4.0.0 [33].

 

Fedora 4 comes with built-in profiling machinery that keeps track of how many times specific services have been requested, how long each request takes to be serviced, etc. These metrics can be visualized using Graphite [34]. Because Graphite can be difficult to set up and configure [35], this release includes a Packer.io build [36] which completely automates the process of standing up a Graphite server.

 

Additionally, the pluggable role-based [37] and XACML [38] authorization modules have been pre-packaged into fcrepo-webapp-plus [39]. This project builds custom-configured fcrepo4 webapp war files that include extra dependencies and configuration options.

 

————————-

10) Test Coverage

Unit and Integration test coverage [40] is a vital factor in maintaining a healthy code base. The following are the code coverage statistics for this release.

 

– Unit tests: 66.2%

– Integration tests: 69.4%

– Overall coverage: 82.5%

 

=========

References

=========

[1]  https://wiki.duraspace.org/display/FF/Glossary#Glossary-namespaceNamespace

[2]  http://www.w3.org/TR/ldp/

[3]  https://wiki.duraspace.org/display/FF/RESTful+HTTP+API#RESTfulHTTPAPI-RedDELETEDeletearesource

[4]  https://github.com/fcrepo4/fcrepo4/commit/5c30c743bb05ef627acc90f4b037b118c7d9de9c

[5]  https://wiki.duraspace.org/display/FF/Versioning#RESTfulHTTPAPI-Versioning-BluePOSTCreateanewversionofanobject

[6]  https://wiki.duraspace.org/display/FF/RESTful+HTTP+API+-+Containers

[7]  https://github.com/fcrepo4/ontology

[8]  http://fedora.info/definitions/v4/repository

[9]  https://github.com/fcrepo4/ontology/blob/master/oai-pmh.rdf

[10] http://www.w3.org/TR/ldp/

[11] http://w3c.github.io/ldp-testsuite/

[12] https://github.com/fcrepo4/fcrepo4/pull/579

[13] https://github.com/fcrepo4/fcrepo4/pull/542

[14] https://github.com/fcrepo4/fcrepo4/pull/587

[15] https://github.com/fcrepo4/fcrepo4/pull/451

[16] http://tools.ietf.org/html/rfc7240#page-7

[17] https://github.com/fcrepo4/fcrepo4/pull/586

[18] https://github.com/fcrepo4/fcrepo4/pull/594

[19] https://github.com/fcrepo4-labs/fcrepo4-oaiprovider

[20] https://github.com/fcrepo4-labs/fcrepo-camel

[21] http://www.openarchives.org/OAI/openarchivesprotocol.html

[22] https://wiki.duraspace.org/display/FF/RESTful+HTTP+API+-+Containers

[23] https://camel.apache.org

[24] https://camel.apache.org/components.html

[25] https://github.com/fcrepo4/fcrepo-message-consumer

[26] https://velocity.apache.org/engine/releases/velocity-1.5/user-guide.html#velocity_template_language_vtl:_an_introduction

[27] https://wiki.duraspace.org/display/FF/Federation

[28] https://wiki.duraspace.org/display/FF/RESTful+HTTP+API+-+Containers#RESTfulHTTPAPI-Containers-external-content

[29] https://github.com/fcrepo4-labs/fcrepo4-client

[30] https://wiki.duraspace.org/display/FF/Design+-+Java+Client+Library

[31] https://github.com/fcrepo4-labs/fcrepo4-client/pull/12

[32] http://modeshape.jboss.org

[33] http://modeshape.jboss.org/downloads/downloads4-0-0-final.html

[34] http://graphite.wikidot.com

[35] https://wiki.duraspace.org/display/FF/Setup+a+Graphite+instance

[36] https://github.com/fcrepo4-labs/fcrepo4-packer-graphite

[37] https://wiki.duraspace.org/display/FF/Basic+Role-based+Authorization+Delegate

[38] https://wiki.duraspace.org/display/FF/XACML+Authorization+Delegate

[39] https://github.com/fcrepo4-labs/fcrepo-webapp-plus

[40] http://sonar.fcrepo.org/dashboard/index/1

Development Notes for 10/13 – 10/17

Just a brief update on the work of our group for the past week.

We continue our efforts in contributing to the Fedora 4 project. We use Fedora as one of the core products in our Hydra implementation. Currently we have several installations of version 3. Version 4 has been in development for a little over a year with an expected release date of June 2015. While Yale has been a financial contributor to the Fedora Commons project for many years now, we only started contributing code to the project in 2013.

The Quicksearch project is also moving along swiftly. This week the major milestone of handling CAS login was completed. This is used for some features in the Blacklight software like the bookbag and search history. CAS is generally simple to integrate with most software products as long as the link between a NetID and the local user database can be made. In the case of Blacklight, making this link became complicated because of the use of several different code libraries in the specific version of Blacklight being used for Quicksearch, which is different from the version we use for our Hydra interface, FindIt.

Almost all efforts this week were related to ingest operations for the Kissinger project. There was also some vacation time taken so the output this week was limited.

Yesterday we met to discuss the development for full text search for objects ingested into Hydra. The work is broken up into the following steps:
1- alter the SOLR index to accept 2 new fields that will store full text
2- alter the Hydra ingest application to store the content of the TXT files into the new SOLR fields
3- set up the Blacklight controllers to handle if/when each of the FT fields is used in user searches
4- develop the Blacklight user interface to allow the FT search option

At this point we are only focused on the first two steps. 3&4 require us to have data in place. We will be moving steps 1 & 2 to the test environment the week of Oct 20 and then roll these changes into production the week of Oct 27. We will be doing all our FT testing with the Day Missions collection which uses a Hydra ingest package very similar to Kissinger.
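Steps 1 and 2 can be pictured with a small sketch of what the ingest application might send to the index. The two field names below are hypothetical placeholders, since the actual SOLR field names are not given in this post, and the function is an illustration rather than the Hydra ingest code.

```python
# Hypothetical names for the two new full text fields.
OPEN_FIELD = "fulltext_open_t"
RESTRICTED_FIELD = "fulltext_restricted_t"


def build_fulltext_doc(object_id, txt_content, restricted):
    """Return an index document that stores the TXT content in exactly one field.

    Restricted text goes to the restricted field, everything else to the open one.
    """
    field = RESTRICTED_FIELD if restricted else OPEN_FIELD
    return {"id": object_id, field: txt_content.strip()}


if __name__ == "__main__":
    print(build_fulltext_doc("example:1", "  Sample page text.\n", restricted=False))
```

Keeping the two kinds of text in separate fields from the start means steps 3 and 4 can decide per search which field to consult.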

This is a repost from another location that has more information on our full text search plan. So I will give a brief overview of what that plan looks like with the use cases used to draft this approach.

There are two types of full text search for objects we ingest into Hydra.

The first is the simplest: OCR text from a scanned image like a page in a book or a manuscript. This type of text is treated as an extension of the metadata, making it simple to combine into search results since the text is considered open access.

The second is significantly more complex. Here the contents of the full text require special permission to search, so instead of the text being treated as an extension to the metadata, it is treated the same way we treat files that carry special access conditions. This permission would have been granted ahead of time, so when you execute your full text search it will include results from the restricted items. This use case is currently specific to the Kissinger papers project but is being programmed to scale out as needed.

So the approach we are taking is fairly simple: we place the open access full text into one SOLR field and the restricted access full text into a field specifically designed for restricted content. When the search is executed, the open access text is searched and the restricted text is filtered, so that your search is applied only to the restricted content which you have been granted access to view.
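A minimal sketch of that search-time logic follows. It assumes three hypothetical field names (one for open text, one for restricted text, and one recording which access group a restricted item belongs to) and builds a query string in standard Lucene/SOLR syntax. It is an illustration of the idea, not the Blacklight controller code.

```python
# Hypothetical field names; the real schema is not given in this post.
OPEN_FIELD = "fulltext_open_t"
RESTRICTED_FIELD = "fulltext_restricted_t"
ACCESS_FIELD = "access_group_s"


def quote_phrase(text):
    """Wrap text in double quotes, escaping backslashes and embedded quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_fulltext_query(terms, permitted_groups):
    """Search open text always, and restricted text only within permitted groups."""
    phrase = quote_phrase(terms)
    open_clause = "{}:{}".format(OPEN_FIELD, phrase)
    if not permitted_groups:
        # No special permissions: the restricted field is never consulted.
        return open_clause
    groups = " OR ".join(quote_phrase(g) for g in sorted(permitted_groups))
    restricted_clause = "({}:{} AND {}:({}))".format(
        RESTRICTED_FIELD, phrase, ACCESS_FIELD, groups
    )
    return open_clause + " OR " + restricted_clause


if __name__ == "__main__":
    print(build_fulltext_query("state dinner", ["kissinger"]))
```

Because the restricted clause is only added when the user already holds a permission, a user without one never triggers a match against restricted text at all.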