Thanks to Andrew Nagy, Discovery Product Manager at Serials Solutions, for this reprint of his article from the Serials Solutions blog.  

So many people are talking about the concept of “Big Data” that it’s hard to separate the marketing hype from the true value. Big Data, loosely defined, is the ability to gather, analyze, interpret and, most importantly, act on large volumes of data to identify and solve problems.  Hospitals use it to prevent illness.  The financial industry uses it to detect credit card fraud.  Airlines use it to fill seats.  And, Amazon uses it to tell you what you might like to read next.

So what does Big Data have to do with library discovery and the Summon service?  The ability to leverage Big Data is making it possible to better understand how users perform research.

For the past decade or longer, usability testing has been the traditional process for evaluating a software application’s user experience.  In usability testing, users—existing users of the application or participants recruited “off-the-street”—are observed while completing a series of scenarios that mimic real life examples.  Usability testing has its pros and cons.  For many years, this approach has provided valuable information.  However, no matter how unobtrusive the observation mechanism, users act differently when they know they are being observed.  So how can we better understand how users are actually using a service like Summon?

Enter Big Data.  As complex software systems have evolved into a Software as a Service (SaaS) paradigm, leveraging the advantages of economies of scale to make more powerful solutions, user experience analysis models have changed.  Rather than having hundreds or thousands of users on a locally installed application, we now have millions of users working on a single common application.  This single common application can record and track user activity and store the data in large-scale data warehouses, creating an opportunity for a superior approach to user analysis.

Delivered as a SaaS solution, at the core of the Summon service is a single, unified index.  This unique architecture, which ensures that all users are searching the same system, allows us to gather and analyze user behavior data from the hundreds of millions of Summon searches performed by millions of users—including researchers from the largest and most prestigious academic and research libraries around the world.  All users searching across the same unified index, no matter how customized their local Summon site might be, is the key to capturing meaningful and interpretable data.

This data can expose behaviors that illustrate true usage of library services, as opposed to the usage of a small number of participants being observed in an unfamiliar situation such as a usability study with defined tasks.  While it is proving to be a significant breakthrough in understanding user behaviors, Big Data analysis doesn’t replace the traditional usability testing process, since this kind of analysis can only illustrate what people are doing, not what they aren’t.

What have we learned about users of the Summon service by analyzing Big Data?  One interesting observation, illustrated in the chart below, is that the most common searches are 2-4 word queries, with a long tail of much longer queries.

We see from the data that there are two main types of searches: short, broad topical queries and multi-term focused queries.  Some of the most common searches in the Summon service are things like “early childhood development”, “human resource management” and “cloud computing”, which show how users identify a theme and conduct broad, topical searches.  By coupling Big Data analysis with usability study analysis, we find that users tend to start with these broad topical search terms and narrow their results by adding terms to the query and applying filters and facets.
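For readers who want to reproduce this kind of query-length breakdown on their own search logs, here is a minimal Python sketch. It assumes the log is simply a list of query strings and that "words" are whitespace-separated tokens; the function name and the sample queries are illustrative and are not taken from Summon's actual pipeline or data.

```python
from collections import Counter


def query_length_distribution(queries):
    """Return {word_count: share_of_queries}, ignoring blank queries.

    Assumption: a "word" is any whitespace-separated token.
    """
    lengths = [len(q.split()) for q in queries if q.strip()]
    counts = Counter(lengths)
    total = len(lengths)
    return {n: counts[n] / total for n in sorted(counts)}


# Illustrative queries only; three are the examples quoted in the article,
# the last is a made-up long-tail query.
sample_queries = [
    "early childhood development",
    "human resource management",
    "cloud computing",
    "effects of early childhood development programs on adult literacy",
]

distribution = query_length_distribution(sample_queries)
for word_count, share in distribution.items():
    print(f"{word_count} words: {share:.0%} of queries")
```

On a real log, plotting these shares against word count would give the kind of chart described above, with the long tail showing up as many small shares at high word counts.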

Another interesting outcome of our analysis is that the more search terms a user has in their query, the less likely they are to abandon the search.

The chart above illustrates that there is a lower abandonment rate when a user has more search terms in their query.  A lower abandonment rate suggests that the search results are serving the user better.  This chart shows that once the user has at least 3 terms in their query, they have a much better chance of easily identifying content that looks interesting to them.
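The relationship between term count and abandonment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not define how abandonment is measured, so the sketch treats a search as abandoned when no result was clicked, and the log format, function name and sample records are all hypothetical.

```python
from collections import defaultdict


def abandonment_rate_by_terms(search_log):
    """Return {term_count: fraction of searches with no result click}.

    Assumptions: search_log is an iterable of (query, clicked) pairs, and
    a search counts as abandoned when clicked is False.
    """
    totals = defaultdict(int)
    abandoned = defaultdict(int)
    for query, clicked in search_log:
        term_count = len(query.split())
        if term_count == 0:
            continue  # skip blank queries
        totals[term_count] += 1
        if not clicked:
            abandoned[term_count] += 1
    return {n: abandoned[n] / totals[n] for n in sorted(totals)}


# Made-up records for illustration; these are not Summon usage data.
sample_log = [
    ("nursing", False),
    ("nursing", True),
    ("cloud computing", False),
    ("cloud computing", True),
    ("early childhood development", True),
    ("human resource management", True),
]

rates = abandonment_rate_by_terms(sample_log)
for term_count, rate in rates.items():
    print(f"{term_count} terms: {rate:.0%} abandoned")
```

Grouping by term count keeps the comparison simple; a production analysis would likely also need to decide how to treat query refinements within one session, which this sketch ignores.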

By combining user behavior analysis from Big Data and usability testing, we’re able to identify common scenarios that allow us to develop features and enhancements that address the observed behaviors.  For example, the development of some of the latest features of the Summon service has been informed by this type of analysis, including Best Bets, the Database Recommender, Related Search Suggestions and other features introduced with Summon 2.0.

Big Data analysis can be valuable beyond design and development; it can also play an important role in the way these features actually work.  Leveraging real-time, global Summon usage data, the Related Search Suggestions feature encourages users to expand their queries, which can lead to better research outcomes.  And because they are data-driven, these features are rapidly and continuously fine-tuned to improve over time.

28 May 2013 | Posted by Shannon Janeczek
