All posts by Arcadia Falcone

New documentation available for librarians and library staff

New documentation for Quicksearch is now available on the Cataloging @ Yale site, under Discovery Systems Documentation. This documentation is aimed at librarians and staff, and answers questions such as:

When I search “Author”, what MARC fields am I searching? (Quicksearch Search Indexes)

Where do the facet terms come from? (Quicksearch Facet Indexes)

Why are all the RDA records at the end when I sort results by publication date? (Quicksearch Sort Indexes)

How does Quicksearch decide which format terms to assign to a resource? (Quicksearch MARC Format Mapping)

I made a change in Orbis…when can I expect to see it in Quicksearch? And how can I make sure my record won’t fail to ingest into Quicksearch? (How and When Cataloging Data Appears in Quicksearch)

Why aren’t all my search terms in some of my results? And what special syntax can I use to improve my searches? (Quicksearch Expert Search Tips)

What really happens when I click the search button in Books+? And how does the Quicksearch relevancy ranking work? (Technical Overview of Searching in Quicksearch and Books+)

How can I keep up to date with Quicksearch developments? (Quicksearch Project Blog)

Where do I point patrons for more information? (Public Quicksearch Help Site) (work in progress)

Many thanks to Dominique Bourassa for transforming these documents into Drupal, the members of the Books+ Search Analysis Working Group for their work on documenting the Quicksearch Solr indexes, and attendees of SearchFest for informing this documentation through their presentations and questions!

Quicksearch full reload statistics

During the last week of October, Quicksearch was refreshed with a complete reload of records from Orbis and Morris, and a new indexing configuration was implemented. Following the reload, a complete audit of the Solr index identified any records that were not correctly transferred to or deleted from Quicksearch.

Orbis:

9,843,283 records extracted
9,841,924 records initially loaded
343 ingest errors for correction (a significant drop since the last reload, thanks to the ongoing efforts of CMS and other technical services staff)
Remaining difference re-extracted

Morris:

499,410 records extracted
498,941 records initially loaded
Remaining difference re-extracted

Quicksearch Ingest Activity for October

Data is through October 26th, after which the daily ingest was put on hold for a full reload of records into Quicksearch from Orbis and Morris.

Orbis:

122,633 records updated or added
1,348 records deleted
68 ingest errors

Morris:

8,370 records updated or added
156 records deleted
106 ingest errors

Changes to records from Orbis and Morris are transferred to Quicksearch daily. Errors preventing ingest are usually an empty subfield or empty delimiter.
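To illustrate the kind of problem that blocks ingest, here is a minimal Python sketch of a pre-check for empty subfields. It is not the Quicksearch ingest code: the function name `find_empty_subfields` and the simplified record layout (a list of tags, each with `(code, value)` pairs) are assumptions made for this example, and "empty delimiter" is interpreted here as a delimiter with no subfield code after it.

```python
def find_empty_subfields(fields):
    """Return (tag, code) pairs for subfields with a blank code or blank value.

    `fields` is a simplified, hypothetical stand-in for a MARC record:
    a list of (tag, [(subfield_code, value), ...]) tuples.
    """
    problems = []
    for tag, subfields in fields:
        for code, value in subfields:
            # A blank code models a delimiter with nothing after it;
            # a blank value models an empty subfield.
            if not code.strip() or not value.strip():
                problems.append((tag, code))
    return problems


# Made-up record for illustration only.
record = [
    ("100", [("a", "Example, Author")]),
    ("245", [("a", "Example title"), ("c", "")]),
    ("650", [("", "Maps")]),
]
flagged = find_empty_subfields(record)
print(flagged)
```

A check along these lines could be run over a record before saving it, so that the flagged tag and subfield code point to where the correction is needed.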

Quicksearch ingest activity for July and August

Now that the daily Quicksearch ingest audit has been up and running for several weeks, we have data to share! These statistics demonstrate not only the volume of records going into Quicksearch, but also the work that Yale and Law Library staff put into creating, editing, and managing records in Orbis and Morris. The count of records added to Quicksearch includes all records that were created, loaded, updated, or unsuppressed during this time period.

And now, the numbers:

From Orbis, 99,271 records were added to Quicksearch in July, and 89,486 in August. In July, 357 records failed ingest due to MARC errors (usually an empty subfield or rogue delimiter lurking somewhere), 77 of which were corrected by Catalog & Metadata Services staff. The remainder are preliminary order records that will be updated when the item is received, or records that triggered an error on multiple occasions. August’s error count was 107, of which 40 were corrected. Deletions numbered 2,540 in July and 1,779 in August.

From Morris, July’s additions numbered 6,944, with 224 deletions. The August counts were 17,829 records added, and 111 deleted. For both months Law maintained a clean slate with no errors logged.

Grand totals: 106,215 new or updated records in Quicksearch in July, and 107,315 more in August.

ETA: These monthly statistics are derived from a daily audit of Quicksearch loads, so if the same record is updated in Orbis or Morris several times over multiple days, each day on which it has new changes triggers re-ingest into Quicksearch and adds one to the tally.
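The counting rule in the note above can be sketched in Python as follows. This is an illustration under stated assumptions, not the audit's actual code: the function name `monthly_tally`, the date keys, and the record IDs are all made up for the example.

```python
def monthly_tally(daily_loads):
    """Summarize a month of daily loads.

    `daily_loads` maps a day label to the record IDs ingested that day
    (a hypothetical structure for this sketch).
    Returns (record_days, distinct_records).
    """
    # Each day on which a record is re-ingested adds one to the tally.
    record_days = sum(len(set(ids)) for ids in daily_loads.values())
    # The number of distinct records can therefore be smaller.
    distinct_records = len(set().union(*daily_loads.values()))
    return record_days, distinct_records


# Record "a" changes on three different days, so it counts three times.
loads = {
    "day 1": ["a", "b"],
    "day 2": ["a"],
    "day 3": ["a", "c"],
}
record_days, distinct_records = monthly_tally(loads)
print(record_days, distinct_records)
```

In this made-up case the monthly tally is 5 even though only 3 distinct records were touched, which is why the published totals measure ingest activity rather than unique records.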

New Audit Tool Keeps Quicksearch in Sync with Library Catalogs

While most MARC records transfer smoothly from Orbis and Morris to Quicksearch, sometimes things go wrong and discrepancies between the systems arise. The recent development and implementation of an automated daily audit process ensures that these errors will be identified and routed for correction swiftly.

The audit script runs each time Quicksearch processes a fresh extract of records from Orbis or Morris. The script identifies the record IDs from the processed extract files. It then contacts the Quicksearch Solr index to acquire an up-to-date list of all the record IDs in Quicksearch, as well as the date each record was last updated. By comparing the extract files to the contents of the index, the script checks that all added records are present with a current update date, and that all deleted or suppressed records have been removed. A notification e-mail immediately reports the audit results to the Quicksearch project team, and includes file attachments listing the record IDs that failed to be processed correctly. These errors are collected, analyzed to determine appropriate handling, and routed to Catalog & Metadata Services or Law Library staff for review.
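The comparison step described above can be sketched in Python. This is a minimal illustration, not the audit script itself: the function name `audit_extract`, the result keys, and the use of an in-memory dictionary in place of a live query to the Solr index are assumptions, and the IDs and dates are invented.

```python
from datetime import date


def audit_extract(added_ids, deleted_ids, index_updates, extract_date):
    """Compare one processed extract against a snapshot of the index.

    `index_updates` maps record ID -> date the record was last updated
    in the index (here a plain dict standing in for the Solr response).
    """
    # Added records that never made it into the index.
    missing = sorted(i for i in added_ids if i not in index_updates)
    # Added records that are present but were not refreshed by this extract.
    stale = sorted(
        i for i in added_ids
        if i in index_updates and index_updates[i] < extract_date
    )
    # Deleted or suppressed records that are still in the index.
    not_removed = sorted(i for i in deleted_ids if i in index_updates)
    return {"missing": missing, "stale": stale, "not_removed": not_removed}


# Invented snapshot of the index for illustration.
index_snapshot = {
    "1001": date(2015, 9, 2),
    "1002": date(2015, 8, 30),
    "1004": date(2015, 9, 2),
}
report = audit_extract(
    added_ids={"1001", "1002", "1003"},
    deleted_ids={"1004", "1005"},
    index_updates=index_snapshot,
    extract_date=date(2015, 9, 2),
)
print(report)
```

Each non-empty list in the report corresponds to a set of record IDs that would be attached to the notification e-mail for review.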

The next phase in development is for the script to send the daily audit results to a local database. That database will in turn power a publicly available visualization that updates in real time, illustrating the volume of catalog records that YUL staff create or update each day and charting their flow from Orbis and Morris into Quicksearch.

How Quicksearch Assigns Format Facet Terms to Records

Quicksearch handles formats, such as “Books” or “Audio,” in a modular, layerable structure that is very different from that of Orbis. For example, manuscript maps are not treated as a separate format from print maps or other manuscript materials, but rather have both the “Archives or Manuscripts” and “Maps & GIS” formats assigned, and can be accessed through either facet or a combination of the two. During the record ingest process, a script analyzes the MARC data to determine which format terms to apply to the record, according to conditions described in the format mapping documentation linked below. This mapping was developed collaboratively by the Quicksearch Implementation Group and the Discovery & Systems Metadata Group, drawing upon MARC documentation, local practices, metadata analysis, and Yufind discussions. It continues to be refined as Quicksearch develops.
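The layering idea can be sketched in Python: each format term has its own condition, and a record receives every term whose condition it meets. The rules below are hypothetical and deliberately simplified, testing only the MARC Leader/06 type-of-record code. The actual conditions are those in the format mapping documentation, and the names `FORMAT_RULES` and `assign_formats` are invented for this example.

```python
# Hypothetical rules keyed on Leader/06 (type of record) only.
# e = cartographic material, f = manuscript cartographic material,
# d = manuscript notated music, t = manuscript language material,
# i = nonmusical sound recording, j = musical sound recording.
FORMAT_RULES = [
    ("Maps & GIS", {"e", "f"}),
    ("Archives or Manuscripts", {"d", "f", "t"}),
    ("Audio", {"i", "j"}),
]


def assign_formats(leader):
    """Return every format term whose rule matches the record's leader."""
    record_type = leader[6]
    # Rules are independent, so one record can pick up several terms.
    return [term for term, codes in FORMAT_RULES if record_type in codes]


# A manuscript map (Leader/06 = "f") receives both terms.
manuscript_map = assign_formats("00000nfm a2200000 a 4500")
# A print map (Leader/06 = "e") receives only one.
print_map = assign_formats("00000nem a2200000 a 4500")
print(manuscript_map, print_map)
```

Because the terms are assigned independently rather than as one exclusive category, a manuscript map in this sketch is reachable through either facet or through both combined.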

Quicksearch MARC Format Mapping Documentation