Friday, April 27, 2018

OpenRefine workshop materials for ECFN/Nomisma

Next week is the 7th annual European Coin Find Network and Nomisma.org meeting in Valencia. I'll be leading two brief, 30-minute introductory OpenRefine workshops aimed at cleaning numismatic data and linking it to the Roman imperial coin type URIs defined in OCRE. I plan to write up the steps as a tutorial at some point, but the test materials can be accessed here:
Expect updates to this post once the workflow I intend to show in the workshop has been codified into a written tutorial.

Improving OCRE OpenRefine reconciliation with regex

I have made a slight update to improve the matching of OCRE coin types through the Numishare type-based OpenRefine reconciliation API. The API queries the coin type "title", which is indexed as a text field in Solr. As detailed in a previous blog post, reconciliation is most accurate when you reduce your reconciliation column to the RIC number alone and supply the authority, mint, or denomination as an additional property.
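As a rough illustration of that advice, here is a minimal Python sketch of a query batch in the shape used by the standard OpenRefine reconciliation protocol: the RIC number goes in "query" and the extra fields go in "properties". The helper name and the property IDs ("authority", "mint", "denomination") are assumptions made for this example, not confirmed Numishare identifiers, so check the service's own metadata before relying on them.

```python
import json


def build_reconciliation_query(ric_number, authority=None, mint=None, denomination=None):
    """Build one reconciliation query: RIC number plus optional properties.

    The property IDs below are hypothetical placeholders, not confirmed
    identifiers from the Numishare API.
    """
    candidates = (
        ("authority", authority),
        ("mint", mint),
        ("denomination", denomination),
    )
    # Keep only the properties that were actually supplied.
    properties = [{"pid": pid, "v": value} for pid, value in candidates if value]

    query = {"query": ric_number}
    if properties:
        query["properties"] = properties
    return query


# A batch maps arbitrary keys (q0, q1, ...) to individual queries.
batch = {"q0": build_reconciliation_query("14", authority="Hadrian", mint="Rome")}

# The protocol sends the batch as a JSON string in the "queries" parameter.
payload = json.dumps(batch)
print(payload)
```

OpenRefine builds this payload itself once a column is set to be sent as an additional property, so the sketch is only meant to show what the service receives.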

Previously, this approach missed many potential attributions to numbered subtypes whose parent type was never given a URI in OCRE. The Hadrianic types offer some examples: the British Museum has assigned the type number '14', but OCRE has no Hadrian 14, only 14a and 14c. The API update appends the following regex to the title field Solr search: '(\(?[a-zA-Z]\)?)?', resulting in the query "title_text:/14(\(?[a-zA-Z]\)?)?/". This matches the number followed by an optional single letter, lower-case or upper-case, that may itself be enclosed in parentheses.
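The behaviour of that suffix can be checked outside Solr. The following is a minimal Python sketch, not the Numishare implementation: the function names are made up for this example, the character class is written as [a-zA-Z] to match the stated intent of a single letter, and Python's re.fullmatch stands in for the whole-term matching that Solr applies to a regex query.

```python
import re

# Optional single letter, optionally wrapped in parentheses.
SUFFIX = r"(\(?[a-zA-Z]\)?)?"


def build_title_query(ric_number):
    """Return the Solr regex query string for a bare RIC number."""
    return f"title_text:/{ric_number}{SUFFIX}/"


def matches_type_number(ric_number, candidate):
    """Check whether a candidate type number is the RIC number itself
    or the same number followed by a subtype letter."""
    pattern = re.escape(ric_number) + SUFFIX
    return re.fullmatch(pattern, candidate) is not None


print(build_title_query("14"))

for candidate in ["14", "14a", "14c", "14(a)", "141", "14ab"]:
    print(candidate, matches_type_number("14", candidate))
```

Here "14", "14a", "14c" and "14(a)" are accepted, while "141" and "14ab" are rejected, so a collection record numbered only '14' can surface both lettered subtypes as candidates.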



When I ran the API against more than 2,000 British Museum coins of Hadrian from Rome, about 500 were matched automatically at 100%, and another 1,500 yielded two or more potential matches. Before this regex tweak, a significant portion of the coins that did not match automatically received no suggestions at all, so each one had to be matched manually by typing into the autosuggest box of the "Search for Match" function.

Friday, April 20, 2018

New updates from KENOM, Münzkabinett Berlin

Two new collection URIs have been minted in Nomisma.org for the KENOM project, and the OAI-PMH feed from KENOM has been re-harvested. Now there are more than 7,500 coins (and medals) contributed from that project, including 1 medal from Munich and 4 from Moritzburg for Art of Devastation, a corpus of World War I medals. These medals represent the first partner contributions to AoD, which to this point has consisted only of the American Numismatic Society's own collection.

The updates include 3 coins from Munich for Seleucid Coins Online.

Furthermore, a new update has been run on Berlin's contribution to Online Coins of the Roman Empire, which now includes some coins of the Gallic Empire. With these most recent updates from Berlin and KENOM, there are now 114,136 coins in OCRE.