2021-08-25

Date

Attendees

  • Melanie Cofield
  • Katie Pierce-Meyer
  • Josh Conrad
  • Mandy Ryan
  • Michael Shensky
  • Nancy Sparrow
  • Paloma Graciani Picardo
  • Robyn Mygatt
  • Sofia 
  • Beth Dodd
  • Brenna Edwards
  • Stephanie Tiedeken
  • Brittney Washington

Recording

Zoom recording and chat

Theme

  • Assessment of linked data projects

Agenda

  • Introductions and agenda overview
  • Demo: Michael Shensky will demo his current work related to assessment as applied to the Buildings of Texas project
  • Demo: Paloma Graciani-Picardo will share work with SPARQL queries
  • Ongoing discussion: assessment possibilities, questions identified and shared via Google doc: https://docs.google.com/document/d/1yRzhBPx6BG2UETEei21DJ5Swc8GVRNN8eAr7ZvphEKw/edit (anyone with the link should be able to edit)
  • Review what we've covered since topic survey in Spring, plan topics/actions for Fall meetings

Discussion items

Item | Who | Notes

Introductions, agenda overview

Melanie

  • Notes by Paloma
  • Robyn and Sofia (Blanton) are joining the group. Yay!

Current work related to assessment as applied to the Buildings of Texas project

(Demo)

Michael

  • Still a work in progress: this is his current exploration, not a final product or proposal
  • Grew out of our earlier conversations about the architecture projects going on in Wikidata and leveraging Wikidata data in our workflows
  • We need to track changes to the items we have contributed to in Wikidata before we can reliably integrate that data into our workflows
  • Revision history sample "Texas State Capitol"
  • Flag items that have been changed after we made our updates
  • Has been testing SPARQL using the Wikidata Query Service. Very valuable for getting familiar with SPARQL before starting to write scripts that do some of these tasks programmatically.
  • Looking at pages that he knows we have contributed to
  • Architecture is its own item
  • Looking at one item that is on the Architecture and see how they connect to the collection, so that he can 
  • The Wikidata Query Service is great for testing things out and experimenting, because hovering over a property gives you its label, as well as 
  • The query finds all items in Wikidata that are collections, filters for those that have an "archives at" statement, refines the results to the architecture repository, and then keeps only those connected (through "instance of" and "stated in") to the Alexander Architectural Archives
  • Retrieving both item ID and item label
  • Once the query is in place, it can be incorporated into a Python script that does additional tasks
  • Uses Atom as the text editor for his Python scripts (great tool if people want to try it out!). Has installed a package to run Python straight from Atom
  • The script submits an HTTP request to retrieve the information without having to go through the browser, and then requests the edit history information
  • It retrieves a timestamp of when things were edited, and it also retrieves an edit history (who did the edits, when and what) for each item that results from the initial SPARQL query
  • Encoding issues might produce weird characters (e.g. if the edits are translations of the item label or description into languages with non-Latin scripts)
  • Next step: query the revisions themselves based on revision ID retrieved with this query, to get further information (might require using a different API)
  • Not sure if the editing information can be queried using SPARQL, as well
  • UT GitHub (https://github.austin.utexas.edu) can be used to share code for this type of project with each other
  • Willing to share the Python script once it is polished
  • Editing history could potentially show community interest in the items we create. Would translations into other languages be an indicator? Possibly, though sometimes the edits come from bots that are applied to everything.
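
For anyone who wants to experiment before Michael's script is shared, the workflow described above can be sketched roughly as follows. This is not Michael's code. It assumes a simplified query (only the "archives at" property, P485, pointing at one repository), the MediaWiki Action API on wikidata.org as the source of edit history, and Python's standard library for the HTTP requests. The repository QID, the USER_AGENT string, and all function names are placeholders made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
API_ENDPOINT = "https://www.wikidata.org/w/api.php"
USER_AGENT = "linked-data-assessment-sketch/0.1 (example only)"  # placeholder


def build_items_query(repository_qid):
    """SPARQL for items whose 'archives at' (P485) value is the given repository."""
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P485 wd:{repository_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


def get_json(url, params):
    """Send a GET request and decode the response body as UTF-8 JSON."""
    full_url = url + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(full_url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request) as response:
        # Decoding explicitly as UTF-8 avoids garbled non-Latin labels.
        return json.loads(response.read().decode("utf-8"))


def item_ids(sparql_json):
    """Extract (QID, label) pairs from SPARQL JSON results."""
    pairs = []
    for row in sparql_json["results"]["bindings"]:
        qid = row["item"]["value"].rsplit("/", 1)[-1]
        label = row.get("itemLabel", {}).get("value", "")
        pairs.append((qid, label))
    return pairs


def revision_params(qid, limit=50):
    """Action API parameters asking for the most recent revisions of one item."""
    return {
        "action": "query",
        "prop": "revisions",
        "titles": qid,
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": str(limit),
        "format": "json",
        "formatversion": "2",
    }


def revisions_after(api_json, cutoff):
    """Return (title, revision ID, user, timestamp) for edits newer than cutoff."""
    found = []
    for page in api_json["query"]["pages"]:
        for rev in page.get("revisions", []):
            stamp = datetime.strptime(
                rev["timestamp"], "%Y-%m-%dT%H:%M:%SZ"
            ).replace(tzinfo=timezone.utc)
            if stamp > cutoff:
                found.append((page["title"], rev["revid"], rev.get("user", ""), stamp))
    return found


def flag_changed_items(repository_qid, cutoff):
    """Run the query, then collect edits made after cutoff for each item found."""
    results = get_json(
        SPARQL_ENDPOINT,
        {"query": build_items_query(repository_qid), "format": "json"},
    )
    flagged = {}
    for qid, label in item_ids(results):
        recent = revisions_after(get_json(API_ENDPOINT, revision_params(qid)), cutoff)
        if recent:
            flagged[(qid, label)] = recent
    return flagged
```

To try it, call flag_changed_items with the repository's QID and a timezone-aware datetime for the date of our last batch of updates, e.g. datetime(2021, 6, 1, tzinfo=timezone.utc). The revision IDs it returns are the starting point for the "next step" noted above.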

SPARQL work (Demo)

Paloma

  • SPARQL 4 QC - Using SPARQL queries for quality control and metrics (In Progress) - Slides: SPARQL 4 QC.pptx
  • Why quality control: Who manages the Wikidata store? Everyone/anyone can edit?  Keeping track of our contributions, managing changing links (e.g. Finding Aid migrations)
  • Query examples: How many items use the "archives at" property? (e.g. for archives anywhere (65650!), and for archives at the Ransom Center (149!))
  • Wikibase RDF mapping diagram = "Holy grail" for understanding how to build effective queries that get at different parts of the graph database
    • (see link in slide for this)
  • Explanation of HRC current practice for the "archives at" property - display in Wikidata and actual component breakdown with qualifiers for queries
  • Demo of query to find out what qualifiers are currently being used in "archives at" statements
  • Demo of query to find all items with "archives at" HRC and potential qualifiers (to gauge consistency in descriptions). Results show inconsistent use of the title and named qualifiers, which helps surface improvements for consistency
  • Demo of query to find the properties most often applied to references of "archives at" statements
  • Demo of query to retrieve values of reference properties for "archives at" the HRC (to gauge consistency in descriptions and level of completeness)
  • Demo of query to retrieve all items with references that have an HRC Finding Aid URL as a value (so far doesn't work, so it needs refinement)
  • For any of these queries, export large result sets for QC outside of Wikidata interface
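
As a rough illustration of the kinds of queries demoed above (these are not the queries from Paloma's slides), here is a sketch in Python. It assumes "archives at" is property P485 and follows the Wikibase RDF mapping: p:/ps: for statements, wikibase:qualifier and wikibase:reference to get from predicates back to properties, and prov:wasDerivedFrom for references. The repository QID is a placeholder to fill in, and the function names are made up. The last helper turns SPARQL JSON results into CSV for QC outside the Wikidata interface.

```python
import csv
import io

LABEL_SERVICE = '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'


def count_items_query(repository_qid=None):
    """Count items using 'archives at' (P485), anywhere or at one repository.

    wdt: only sees "truthy" (best-rank) statements.
    """
    target = f"wd:{repository_qid}" if repository_qid else "?repository"
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {\n"
        f"  ?item wdt:P485 {target} .\n"
        "}"
    )


def qualifier_usage_query(repository_qid):
    """Which qualifiers appear on 'archives at' statements for one repository?"""
    return (
        "SELECT ?qualifier ?qualifierLabel (COUNT(?statement) AS ?uses) WHERE {\n"
        "  ?item p:P485 ?statement .\n"
        f"  ?statement ps:P485 wd:{repository_qid} .\n"
        "  ?statement ?pq ?value .\n"
        "  ?qualifier wikibase:qualifier ?pq .\n"
        + LABEL_SERVICE
        + "}\n"
        "GROUP BY ?qualifier ?qualifierLabel\n"
        "ORDER BY DESC(?uses)"
    )


def reference_property_usage_query(repository_qid):
    """Which properties are used in references on those statements?"""
    return (
        "SELECT ?property ?propertyLabel (COUNT(?reference) AS ?uses) WHERE {\n"
        "  ?item p:P485 ?statement .\n"
        f"  ?statement ps:P485 wd:{repository_qid} .\n"
        "  ?statement prov:wasDerivedFrom ?reference .\n"
        "  ?reference ?pr ?value .\n"
        "  ?property wikibase:reference ?pr .\n"
        + LABEL_SERVICE
        + "}\n"
        "GROUP BY ?property ?propertyLabel\n"
        "ORDER BY DESC(?uses)"
    )


def bindings_to_csv(sparql_json):
    """Convert SPARQL JSON results to CSV text, one column per query variable."""
    columns = sparql_json["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for row in sparql_json["results"]["bindings"]:
        # Unbound variables are simply missing from a row; write them as blanks.
        writer.writerow([row.get(col, {}).get("value", "") for col in columns])
    return buffer.getvalue()
```

The query strings can be pasted into the Wikidata Query Service as they are, or sent to its endpoint from a script. Blank cells in the CSV make missing qualifiers or reference values easy to spot when sorting in a spreadsheet.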

Feedback from group: great examples of how to query what others are doing in Wikidata, extremely helpful for many in this group :)


Discussion

Deferred until next meeting

Looking ahead

Melanie, all

  • Defer until next meeting: Quick retrospective of topics covered since April/member survey on priority topics
  • Defer until next meeting: topical focus?
  • Fall meeting schedule - continue with same day/time on monthly basis


Action items

  • Attach Paloma's slides to notes here
  • Schedule Fall meetings and send invitations out to group