We have to know where we came from to know where we are going. (San Andreas Fault)
I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever the matter may be. (Thomson, 1889)
As a software tester, I have long been aware that you can't know whether you have a quality product, or whether that quality is increasing or decreasing as changes are made, unless you are measuring the quality. The same holds true for LIS services. It is impossible to know whether a data archive is providing the services its customers need unless the archive is measured on characteristics such as the breadth and depth of its content, the quality and completeness of its data and metadata, how many customers find (and use) what they need, how many requests for help navigating the catalog come in, and how many file downloads fail, for starters.
ISO standards are the right place to start when planning to measure something. For evaluating a data archive, I would rely on ISO 11620:2014, "Information and documentation–Library performance indicators" (ISO, 2014). "The standard ISO 11620 (1998) defines evaluation as a process of assessing effectiveness, efficiency, usefulness and relevancy of certain services, equipment or facilities" (Irena, 2005, p. 2). Beyond ISO, Deming's foundational work on measuring quality in products (Evans & Ward, 2007, pp. 32-33) can be leveraged for measuring quality in services (Douglas & Fredendall, 2004).
Metrics that could be collected for measuring the effectiveness of a data archive include all of those, as well as measures of the systems running behind the scenes.
The details of how the computers in an archive are performing are perhaps not as obvious as asking patrons how they feel about their experience, but the technical performance is no less important. The safety of the data in the archive should be measured: how vulnerable is it to hacking, to DNS attacks, or to natural disasters that could cause loss of media and downtime? Assessing a data archive's overall performance would also have to include collection-specific measurements (quality and completeness, as mentioned earlier) and information retrieval measurements such as precision and recall, computed against a known test set of relevant and non-relevant results (Manning, Raghavan, & Schütze, 2008, p. 140).
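To make the retrieval measurements concrete, here is a minimal sketch of computing precision and recall for a single catalog query, assuming a hypothetical hand-judged test set; the dataset IDs are placeholders, not holdings of any real archive.

```python
# Minimal sketch: precision and recall for one catalog query, scored against
# a hand-built test set of known-relevant dataset IDs.
# All IDs below are hypothetical placeholders.

relevant = {"DS-001", "DS-007", "DS-042"}             # judged relevant for the query
retrieved = ["DS-001", "DS-042", "DS-013", "DS-099"]  # what the catalog search returned

true_positives = sum(1 for item in retrieved if item in relevant)

precision = true_positives / len(retrieved)  # fraction of retrieved items that are relevant
recall = true_positives / len(relevant)      # fraction of relevant items that were retrieved

print(f"precision={precision:.2f} recall={recall:.2f}")
```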
There are many ways to measure how well an information organization is delivering to its customers. Once the metrics are defined, some can be monitored on an ongoing basis with data visualization tools like Splunk, which can display not only the performance of running systems or details about datasets but also any set of time-stamped, text-formatted data, such as a log of user complaints kept by an archivist.
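As an illustration of the kind of time-stamped, text-formatted data such a tool could ingest, here is a minimal sketch that tallies a hypothetical complaint log by category; the log format and category names are assumptions made for the example, not a real archive's schema.

```python
# Minimal sketch: tallying a time-stamped complaint log by category.
# The format (ISO timestamp, category, free text) and the category names
# are invented for illustration only.
import csv
from collections import Counter
from io import StringIO

log = StringIO(
    "2015-04-01T09:12:00,download_failure,granule would not download\n"
    "2015-04-01T10:03:00,metadata_gap,missing units in variable description\n"
    "2015-04-02T14:47:00,download_failure,timeout on a large file\n"
)

counts = Counter(row[1] for row in csv.reader(log))
for category, n in counts.most_common():
    print(category, n)
```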
When measuring an organization against standards, it is important to strike a balance: use the metrics a standard suggests, but do not enforce them to the detriment of the archive's operations. The Pareto principle suggests that checking and enforcing roughly the 20% of standards with the highest priority will account for about 80% of the evaluation's value, while chasing the remaining standards will consume most of the effort for the least useful return, as sketched below. "Just a critical few tasks account for the majority of what your visitors come to websites to do" (Sauro, 2012).
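Here is a minimal sketch of that Pareto cut, with hypothetical checks and invented weights for how much of the evaluation's value each one captures:

```python
# Minimal sketch of a Pareto cut: given checks weighted by how much of the
# evaluation's value each captures (weights invented for illustration),
# keep the smallest high-priority set that covers roughly 80% of the value.
checks = {
    "data completeness": 50,
    "metadata quality": 25,
    "search success rate": 10,
    "download reliability": 5,
    "uptime reporting": 4,
    "citation formatting": 3,
    "UI polish": 3,
}

total = sum(checks.values())
critical_few, covered = [], 0
for name, weight in sorted(checks.items(), key=lambda kv: kv[1], reverse=True):
    critical_few.append(name)
    covered += weight
    if covered / total >= 0.80:
        break

print(critical_few)  # the critical few checks worth enforcing first
```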
Finally, measuring effectiveness should keep the audience in mind. Metrics should not be designed around what the curators think the archive should do but around what the users actually want. Irena (2005) goes further:
The future of data archives does not lie in supporting generic, broadly useful services such as information access to large collections, but in supporting "customization by community", i.e. the development of services tailored to support the specific, and real, practices of different user constituencies. (p. 4)
The purpose of a data archive is to provide information to users, and assessment should always come back to that goal.
At NASA's Goddard Distributed Active Archive Center (GDAAC), I won the continuous measurable improvement (CMI) award, which reads:
Photo of CMI award.
In recognition of your outstanding deployment of the principles of continuous measurable improvement in producing a suite of Web-based tools known as GATES–the Goddard Automated Tool for Enhanced SSI&T [science software integration and test]. GATES will significantly enhance the capability of Goddard's Distributed Active Archive Center (DAAC) to integrate and test MODIS science software in preparation for the launch of EOS AM-1 by automating processes and capturing test results from a Web-based application. The team's use of process mapping, spirited teamwork, customer-driven quality, and infectious attitude toward continuous improvement exemplify a model cmi [sic] effort. – Ashok Kaveeshwar, President, Raytheon STX Corporation. April 1998
I've worked in several organizations that considered being evaluated according to standards. At NASA Ames Research Center (ARC), on the Center TRACON Automation System (CTAS) project, we considered being evaluated against the Capability Maturity Model Integration (CMMI), and in my current job at Cytobank, I laid out a plan for achieving a higher level of software testing maturity according to the international software testing standard, ISO/IEC/IEEE 29119, "Software Testing".
As mentioned above, part of evaluating a service is making sure the information is appropriate for the patrons. In the past few years, on the advice of a friend and former co-worker from NASA GDAAC, I began learning about data quality, a growing field of interest for data archives. I joined the Earth Science Information Partners Federation (ESIPFed) data quality workgroup and attended a few telecons. In "Seminar in Archives and Records Management: Electronic Records" (LIBR284), I researched the geospatial metadata standards that address data quality metrics.
In "Information Organizations and Management" (LIBR204), I was in a group that created a strategic plan for the San Mateo County Library System (SMCL), and as part of that plan, we put together measurable goals that could be used to evaluate how well the system was meeting the plan. And in "Research Methods in Library and Information Science" (LIBR285), I created a research proposal that centered around a questionnaire to ask data archive users about the usability of geoscience datasets; the responses to that questionnaire could be used to improve the products and services of any geoscience data archive.
Paper: "Evaluating Reference Sites"
Paper: "Evaluating Search Strategies"
In "Reference and Information Services" (LIBR210), I completed two exercises in searching and evaluating various databases for known, assigned items. This was good practice not only in searching but also in evaluating the success (or failure) of information organizations' catalogs and their search functions.
Discussion post: "Chat Reference Evaluation"
Also in LIBR210, we performed weekly evaluations of different types of reference services and reported on them in our discussion group. Here I present an evaluation of two chat reference services. This is the sort of qualitative evaluation that might be useful for assessing the help services of a data archive, similar to a secret shopper evaluating the quality of service in a store ("Mystery shopping", 2015).
Paper: "Strategic Plan for the San Mateo County Library System" (group project)
In LIBR204, I was in a group that evaluated an existing seven-year strategic plan for SMCL. This evidence was used in Competency 4 as proof that I have done intensive thinking about management issues and how to manage an information organization. Here I present the goals portion of the paper, pages 24-51, where we assessed the goals in the existing plan and suggested how they could be better defined so that success could be measured. Defining the goals was a joint effort among all members of the team.
Paper: "Survey Questionnaire Analysis" (group project)
I was in a group in LIBR285 that designed a survey that could be used to help draft guidelines for libraries wanting to become Web 2.0-compliant. We identified our research variables and made decisions about the format and order of the questions. I helped write the questionnaire and cover letter and participated in the design analysis; I also did all the work of putting the questionnaire on SurveyMonkey and presenting the final paper in Word format. Creating that questionnaire helped me design the dataset usability questionnaire in my research proposal (see evidence in Competency 13), and the experience will also help if I decide to survey patrons at a future data archive. All members of the team participated equally, meeting several times and consulting through comments on Google Docs to create the paper.
Announcing an assessment at an organization might meet with groans from the employees. It is, however, the only way to know if you're getting it right–if you're meeting the customers' needs, if you're meeting your goals, if you are making a positive difference in the world. Assessment can take many forms, some more painless than others, but it should be a part of the routine work of a data archive or any LIS organization. Measuring an organization's success points the way to innovation and improvement.
Davis, J. L. (2015, April 12). CMI award [Digital photo].
Douglas, T. J., & Fredendall, L. D. (2004, July 9). Evaluating the Deming management model of total quality in services [Abstract]. Decision Sciences, 35(3). Retrieved from http://onlinelibrary.wiley.com/doi/10.1111/j.0011-7315.2004.02569.x/abstract
Evans, G. E., & Ward, P. L. (2007). Management basics for information professionals (4th ed.). New York, NY: Neal-Schuman Publishers.
International Organization for Standardization (ISO). (2014, May 27). ISO 11620:2014: Information and documentation–Library performance indicators. Retrieved from http://www.iso.org/iso/catalogue_detail.htm?csnumber=56755
Irena, V. B. (2005, May). User needs [PDF document]. Retrieved from http://www.iassistdata.org/downloads/2005/b3vipavc.pdf
Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to information retrieval. New York, NY: Cambridge University Press.
Mystery shopping. (2015, April 10). Retrieved from http://en.wikipedia.org/wiki/Mystery_shopping
Reflections on the motive power of heat portrait of Lord Kelvin [Online image]. (1895). Retrieved from http://commons.wikimedia.org/wiki/File:Reflections_on_the_Motive_Power_of_Heat_portrait_of_Lord_Kelvin_1895.png
Sauro, J. (2012, September 12). Applying the Pareto principle to the user experience. Retrieved from http://www.measuringu.com/blog/pareto-ux.php
Thomson, W. (1889). Electrical units of measurement. In Nature series: Popular lectures and addresses: Constitution of matter (Vol. 1) (pp. 73-136). London: Macmillan.
Wallace, R. E. (2006, January 22). San Andreas. Retrieved from http://commons.wikimedia.org/wiki/File:San_Andreas.jpg