! ! ! WARNING ! ! !
Open, honest and probably naive content ahead.
Proceed with caution!
As I pointed out in the previous related post (link at the bottom of the text), this bi-post is my own version of brainstorming on metadata quality metrics, which was spurred by some reading I did over the past couple of weeks. So, without further ado, some more thoughts on metadata metrics… Comments are more than welcome!
Consistency: Logical consistency refers to the degree to which the metadata match the definitions of the metadata standard. In this case, the main problems defined by Ochoa & Duval were:
- Instances that include fields not defined in the standard
- Mandatory fields that are not included
- Categorical fields with values that are not sanctioned by the standard
- Combinations of values in specific fields that contradict each other
First of all, it seems that for metadata created from scratch, these problems can be avoided with technical solutions: not permitting specific values, using only drop-down menus, etc. The problems occur when existing collections need to be aggregated into platforms whose search mechanisms cannot work with the problematic metadata. Again, measuring these problems seems quite straightforward in the relevant literature. On top of that, it seems really useful when assessing collections that are candidates for harvesting. Going one step further, by looking at the problems, we may be able to infer the costs involved in metadata enrichment and correction.
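To make this a bit more concrete, here is a minimal sketch of what such a consistency check could look like. The field names and vocabularies below are invented for illustration and don't come from any real standard:

```python
# A minimal consistency check against a hypothetical, simplified standard.
# Field names and vocabularies are invented for illustration only.

REQUIRED_FIELDS = {"title", "description", "language"}
ALLOWED_FIELDS = REQUIRED_FIELDS | {"subject", "relation", "creator"}
VOCABULARIES = {"language": {"en", "el", "fr", "de"}}

def consistency_problems(record: dict) -> list[str]:
    """Return the consistency problems found in a single metadata record."""
    problems = []
    # Fields not defined in the standard
    for field in record:
        if field not in ALLOWED_FIELDS:
            problems.append(f"undefined field: {field}")
    # Mandatory fields that are not included
    for field in REQUIRED_FIELDS - record.keys():
        problems.append(f"missing mandatory field: {field}")
    # Categorical fields with values that are not sanctioned
    for field, allowed in VOCABULARIES.items():
        if field in record and record[field] not in allowed:
            problems.append(f"unsanctioned value in '{field}': {record[field]}")
    return problems

# One problematic record:
print(consistency_problems({"title": "Mona Lisa", "language": "xx", "colour": "brown"}))
```

Counting how many records in a collection return a non-empty list, and which kinds of problems dominate, is roughly what gives you an idea of the correction and enrichment costs.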
Coherence: Coherence is the degree to which all the fields describe the same object in a similar way. Loosely defined, this metric calculates the semantic distance between different free-text fields (title and description). It's not really suitable for individual records, but when applied to an entire collection it can give some indication of poor titles or poor descriptions. Again, this heavily depends on the instructions given to annotators: there are cases where annotators avoid the words that already appear in the title when describing the object, especially when they know that text search works on both the title and the description, so they think it's not best practice to repeat title words in the description (or it's simply not needed). With this in mind, low coherence can actually be a sign of high quality in some repositories.
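If I were to approximate this myself, I would probably start with something as crude as TF-IDF cosine similarity between the two fields, which is a much rougher proxy than the semantic distance used in the literature. A sketch, assuming scikit-learn is available:

```python
# A rough proxy for coherence: cosine similarity between the TF-IDF vectors
# of a record's title and description. This is not the exact measure from the
# literature, just a crude stand-in.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def coherence(title: str, description: str) -> float:
    """Similarity between title and description; 0 means no shared vocabulary."""
    vectors = TfidfVectorizer().fit_transform([title, description])
    return float(cosine_similarity(vectors[0], vectors[1])[0, 0])

print(coherence("Mona Lisa portrait", "Oil portrait of Lisa Gherardini by Leonardo da Vinci"))
```

Averaged over a whole collection this can flag suspicious records, but, as said above, a low value may just mean the annotators deliberately avoided repeating title words.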
Accessibility: This defines the level to which a metadata instance can be found and later understood. This metric is about how easy it is to find the object that is described by the metadata. In the relevant literature, this is measured by looking at the links in the metadata, either direct (through the Relation element) or indirect (through metadata such as the author). I really like this idea and I think it makes sense. On the other hand, using readability indexes to assess the difficulty of the text may work in some cases, but what happens with dense, research-intensive scientific documents? I have no epiphany here, but I guess there are ways to work around this, and lots of nice tools like the one I linked above.
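Just to show the direction, here is a naive sketch combining the two signals: counting links in a record and a very rough readability proxy (average sentence length). The field names are hypothetical, and a proper readability index (Flesch and friends) needs syllable counting, which I'd leave to a dedicated library:

```python
# Two naive accessibility signals: link counting and a crude text-difficulty proxy.
# Field names ("relation", "creator") are hypothetical, not from a specific schema.

import re

def link_count(record: dict) -> int:
    """Direct links (relation) plus indirect ones (creator/author)."""
    return len(record.get("relation", [])) + len(record.get("creator", []))

def avg_sentence_length(text: str) -> float:
    """Average words per sentence -- a very rough difficulty proxy."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)

record = {"relation": ["http://example.org/item/42"], "creator": ["Leonardo da Vinci"],
          "description": "A half-length portrait. Painted in oil on a poplar panel."}
print(link_count(record), avg_sentence_length(record["description"]))
```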
Timeliness: It relates to the degree to which a metadata instance remains current. Well, this is a difficult one. Looking at the timeliness of a metadata record through its use or visits within a repository is kind of wrong, as it depends on the search mechanisms deployed on the content, which may be biased towards specific resources. Using the change of the other metrics over time seems like a good idea. It reminded me of the way Klout works, measuring your social impact over time. On the other hand, a picture of the Mona Lisa is catalogued and annotated once. If it's not renewed over time, does this mean that the metadata instance is no longer current? I guess this also relates to the expectations of the community in a way.
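The "change over time" idea can be sketched quite simply: keep dated quality scores for a record and look at the drift. Where the scores come from (the other metrics) and how often they are recomputed is left open; this is just the shape of the idea:

```python
# A sketch of timeliness as the drift of a record's quality score over time.
# The scores themselves are assumed to come from the other metrics.

from datetime import date

def quality_trend(history: list[tuple[date, float]]) -> float:
    """Score change per year between first and last measurement (negative = decaying)."""
    history = sorted(history)
    (t0, q0), (t1, q1) = history[0], history[-1]
    years = max((t1 - t0).days / 365.25, 1e-9)
    return (q1 - q0) / years

print(quality_trend([(date(2011, 1, 1), 0.8), (date(2013, 1, 1), 0.6)]))  # about -0.1 per year
```

A record annotated once and never touched again gives you a single data point and no trend at all, which is exactly the Mona Lisa ambiguity above.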
Provenance: Last but not least, and one of my favourites, provenance. It is defined as the trust that a given community has in the source of the metadata instance. The way to measure this, at least in the relevant literature, is to sum the quality of all instances submitted by a contributor (based on the aforementioned metrics) and then divide by the number of instances submitted. This is a nice idea and it also provides a number that changes over time. On the other hand, in the cases where arguments can be made against the other metrics, those arguments undermine this metric as well. It would be ideal if the reputation of a publisher could be assessed by the use of their resources on a portal, but then again, this depends on the portal itself and a thousand other factors that cannot be controlled or measured. Again, it seems like a Klout-like approach could do the trick, but I am not quite sure how…
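The measure itself, as described, reads like a plain average, so a sketch is almost trivial (the per-instance quality scores are assumed to come from the other metrics):

```python
# Provenance as the average quality of the instances each contributor has submitted.
# Per-instance quality scores are assumed to be computed by the other metrics.

from collections import defaultdict

def provenance_scores(instances: list[dict]) -> dict[str, float]:
    """Average quality score per contributor."""
    totals = defaultdict(lambda: [0.0, 0])
    for inst in instances:
        totals[inst["contributor"]][0] += inst["quality"]
        totals[inst["contributor"]][1] += 1
    return {c: s / n for c, (s, n) in totals.items()}

print(provenance_scores([
    {"contributor": "museum_a", "quality": 0.9},
    {"contributor": "museum_a", "quality": 0.7},
    {"contributor": "archive_b", "quality": 0.5},
]))  # ≈ {'museum_a': 0.8, 'archive_b': 0.5}
```

Recomputing this as new instances arrive is what gives a number that changes over time; the weak point, as said above, is that it inherits whatever weaknesses the underlying metrics have.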
Overall, I am still on the fence as far as metadata quality metrics go. I understand them and I think they're useful, but in some cases it still feels like this should be a contextual solution. The generic automated metrics have been a huge leap forward, and now I think that for me and for others working on these topics, the challenge is to use them and put them in context, adapting them so that they truly matter. My challenge has to do with education, and I am happy to have started scratching the surface of the problem.
As I said in my previous blog post (if you haven't read it, here it is), these lines are more of a brainstorm than scientific, concrete proof. Generally speaking, I always prefer putting things out there, open to discussion and comments, because overthinking is not my thing. I hope that you share the same concerns as I do, and maybe you have some more solutions than I have. And of course I will be happy to get your feedback along with some reading material! 😉