User talk:Charles Faulhaber: Difference between revisions

 
(67 intermediate revisions by 3 users not shown)
Line 1: Line 1:
== All FactGrid users ==
* [https://tinyurl.com/y9r7o5rj dynamic listing, all people on the PhiloBiblon Project and their user pages]
* [https://tinyurl.com/y3tpwdxw dynamic listing of all people with an account]
== June 2019 First Edits ==


Line 61: Line 66:


== 1st Web-Meeting, 18 May 2021 ==
Moved to [[FactGrid talk:PhiloBiblon]]
== Database Objects and pages ==
The following is no database object - it is a Wiki page with the Q-Number of an object/Item:
https://database.factgrid.de/wiki/Q254471
To create an object you need to run the New Item procedure (Menu (which you did)). Better even: run that routine with a batch Fragment.
This is the database object - It has "Item" in the url:
https://database.factgrid.de/wiki/Item:Q254471 


Here are the notes from the meeting 17 May 2021:
It makes no sense to date things on FactGrid (as you did on the talk page), as the system has a version history for everything that is far more precise and specific.


Philobiblon Meeting Notes
Maybe I can offer an online course on the database. A look at the Help section is also useful for understanding the database routines. (Improvements to the Help section are welcome.) I could give that course for anyone interested on Monday 24 May 2021 at 21:00 CET on our Webex channel (easy to use).


Dear Colleagues,
== Good descriptions ==


Here's what we have in the Workplan for this summer:
Dear Charles, as I just wrote in my e-mail: If you are using labels like "MS: Toledo: Biblioteca Capitular...." you will get hundreds of hits in the normal navigation that do not differ before character 31. Two ways out of that dilemma: (1) good, intuitive aliases and (2) descriptions that state what is special within the first 15 characters, the ones displayed in thin letters below the bold labels. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:19, 26 January 2022 (CET)


Objectives: Review of Wikibase software: standards, protocols, data formats, and implementations; comparison with PhiloBiblon schema and data dictionaries and current and desired functionalities. Scenarios to describe how contributors and users will interact with PhiloBiblon.
== Witnesses and Price ==
Functional specifications for the features required for those scenarios.
Identify a small set of particularly rich and related records (number TBD) from each of PhiloBiblon’s four databases (BETA, BIPA, BITAGAP, BITECA) and ten tables to serve as test cases.
Ongoing data clean-up in legacy PhiloBiblon
Tasks:
T1. Review Wikibase software: standards, protocols, data formats, implementations (PI, Anderson, Formentí, Simons)
T2. Develop user scenarios based on PhiloBiblon data and current and desired functionalities. (PI, academic staff, Adv. Board: Dagenais, Gullo)
T3. Create functional specification for features needed to re-create these functionalities. (PI, Anderson, Formentí)
T4. Identify set of test target records (PI, academic staff)
T5. Ongoing clean-up of legacy data (PI, academic staff)


5/18/21 Meeting Notes
Morning Charles, I am not sure whether [[Item:Q394389]] is meant to refer to a person (which we have [[Q144912]]) in a law case or to a textual witness ([[Item:Q195274]])? (The plural is odd) --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:32, 24 February 2022 (CET)


Attendance:  
I also detected these two, [[Item:Q394388]] and [[Item:Q246405]], which might be the same. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:55, 24 February 2022 (CET)
Adam Anderson (zoom host), data analyst for project, Berkeley Lecturer
Charles Faulhaber, Project PI, and (new) director of Bancroft Library,
Josep Formenti (software engineer in Barcelona, also knows NLP and webdev),
Olaf Simons, book historian at Uni Erfurt, Gotha. Wikimedia platform & FactGrid
Randal Brandt, head of cataloging at UCB, rare books,
Xavier Agenjo (director of projects for Fundación Ignacio Larramendi in Madrid, creator of more than 40 digital libraries),
Daniel Gullo (dgullo@csbsju.edu) director of collections, cataloging, creating databases of libraries, modern digital collections, controlled vocabulary for underrepresented religious traditions in the middle ages (vhimmel online database, NEH project director)
Óscar Perea Rodriguez, lecturer at USF, working with Charles on this since 2002
Jason Kovari (cataloging rare books), Cornell
Robert Sanderson, director of digital collections at Yale
Cliff Lynch, director of a small nonprofit in DC Coalition for Networked Information, in the School of information at UC Berkeley; worked on the predecessor of the CDL.
John May (phone): software dev., information management systems, designer of PhiloBiblon (software)
John Dagenais: Professor of Spanish at UCLA, degree in library science, user of PhiloBiblon


PhiloBiblon Project:
We need both textual witnesses [[Item:Q195274]] and witnesses as individuals [[Q144912]], although ours are usually witnesses of documents, not witnesses in a law case.
Goes back to 1975, as a spin-off of the Dictionary of the Old Spanish Language project at U of Wisconsin-Madison, a Spanish version of the OED. Based on contemporary uses of the language. To do this they created an in-house data base (1975 onward): Bibliography of Old Spanish Texts (BOOST), lineal ancestor of PhiloBiblon.  
In the former case, each record in our Analytic table is a textual witness.
Constant technology change. Put the collection on CD-ROM discs by 1992, with digital images + text.
In the latter case, witnesses are found as Persons Associated with a text or with a textual witness.
1994 with the internet (Netscape). Charles was teaching a course on DH computing (including gopher, OCR, etc). One of his students introduced him to the World Wide Web and he said that it wasn’t going to be important… Since then they’ve been working on keeping up with the different versions (1.0, 2.0, 3.0 = Linked Open Data with RDF).
With regard to [[Item:Q394388]] and [[Item:Q246405]], they are different in my opinion. The latter is the actual price paid from one person to another; the former is the price established by law for a book. This is useful to know for book history.
Currently, data are exported from Windows PhiloBiblon as XML files and uploaded to the server at Berkeley, where XTF (eXtensible Text Framework), run from CDL, takes the large XML files from 9 PhiloBiblon tables (uniform-title, manuscripts/editions, persons, etc.) and parses each of them into individual records for querying.
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 19:56, 24 February 2022 (CET)
Objective: to get us aligned with the Wikibase / Wiki-foundation, to piggyback on their data and technology moving forward.


Olaf Simons: FactGrid (works on Masonic institutions)
--I quote the "tasa" from the 1495 edition of Antonio de Nebrija's Latin-Spanish Dictionary:
Charles found Olaf through commentary in their blog post. Wikibase is the software behind the Wikidata project. It has been in development since 2012. Used by national libraries worldwide to control data: creating triple-based statements (two entities linked by a relation), annotating them with further statements, and developing metadata.
"Esta tassado este vocabulario por los muy altos | & muy poderosos principes el Rey & la Reyna | nuestros señores & por los del su muy alto con|sejo en cinco reales de plata."
You can see who is editing it by the minute. Each entity has a Q-number, e.g., for each person (gender, address, field of research, etc.) you can add as many statements as you want, and you can qualify these statements. You can add references to the entry (as many as you want). Used as a source for anything imaginable, we collect statements that become entries.
Using SparQL you can query this:
e.g. Illuminati: members list
Q: date of birth? and it appears as a column for each member
SparQL no longer connects text input fields. It normally connects items to items. E.g. a person is a member of a lodge, which is another item that carries its own information in statements. Each item is a database object.
We will have to translate input field information into objects, and for that to work, each object needs a statement.
First Point: We don’t deal with books, instead you have people, places and institutions connected to these books. All of these need statements to work.
Produce a network of items of relations. Consider the types of objects (e.g. Geographic names, Proper Nouns, etc.).
We’ll need to get an idea of the number of items you’ll be creating beforehand.
Should create your own team and get accounts for those members, along with managers of the accounts to run the team independently.
Charles:  
For Spanish texts (6K texts)
8K individuals
PhiloBiblon “Data clips” correspond to P properties in Wikibase.


Creating properties is not the issue. You create properties as needed and link them on a text-by-text basis.
The price of this vocabulary has been set by the most high and most powerful princes, the King and Queen our lords, and by those of their most high council in 5 silver <I>reales</I>


The real problem is to understand the types of objects you have to create, and how they’re interconnected. They need to be created before you can link them.
This is an official government action. It is different from the price one would pay to a bookdealer, where money changes hands.
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 06:09, 3 March 2022 (CET)


E.g. often we start with places, and move from there. First is to create the objects, you then interconnect it.
== Property like Items ==


Usually projects come to us with a spreadsheet…
Dear Charles, [[Item:Q394873]] - I do not quite see in what kind of statement such items can possibly occur. I do not even know what P2 or P3 statement to give them. Could you put them on an exemplary item? --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 10:39, 2 March 2022 (CET)


Foreseeable Problems:
This was based on our discussion 3/1/22 on how to use Literature (P12) to list a Q# of secondary bibliography and then to qualify that Q# with Kind of Reference (P708). You created that property in order not to use Note (P73) followed by a string. You rightly pointed out that if we used "descrito en," we would not see "described in" if we changed the language to English
The entire project is multi-lingual (using P-numbers, Q-numbers (Deutsche Nationalbibliothek GND)), which allow for different labels on these items and properties. You should use it in the language of your sources. The database currently accommodates French, English and German (the main users of the database). They put their labels in the database. You will need to make labels in Spanish and Portuguese.
See [[Item:Q164502]].
Programming issues: the software is not designed to create presentable projects at the moment. This is ongoing work: 1) the FactGrid viewer, which creates pages based on information from the database, created on the fly. You will want something similar, for your own data. Bruno Belosst is the creator, you can contact him. This is where Wikibase is currently underdeveloped.
A ‘Knowledge’ tool that is in development which can also show the metadata for each entity.
Otherwise you get the SparQL query functionality.


::We will see how we get good wordings on these. So far they sound like properties - and we even have one of the same wording. That is why I was worried whether the information was not supposed to go on the [[Property:P411]]. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:48, 3 March 2022 (CET)


Robert Sanderson to Everyone (12:32 PM)
Dear Olaf, it's clear that we are going to have doubles with the same wording in both P# and Q#. We need this flexibility in order to map PhiloBiblon into FactGrid. It is clear that we need a P# for each class (=column head), but then we have to decide whether the individual items in that class should be P# or Q#. We don't think that we can mix them: Either all P# in a given class or all Q#. If we say that we should have all P# in a given class, then we have to create those P# if they do not already exist. And you have convinced me that multiplying the number of P# is not a good idea: Occam's razor --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 07:09, 4 March 2022 (CET)
We used XTF at Getty for our archives: https://xtf.cdlib.org/
For what it’s worth, at Yale we’re doing this across the libraries, archives and museums. The current data is 45 million such entities, spanning 2.5 billion triples
uses the same process at Yale. The difficulty is understanding the modeling and doing the transformation. The database side is essentially free. We have 15M MARC records, which get turned into 45M entities, people, places, concepts, objects, images, digital things. (8 classes)
We’re not using Wikibase, but rather the main library standards by themselves
The difference between the data modeling paradigms will necessitate the creation of new identifiers
But wikibase is really good at managing external, typed identifiers :)
We should write down the things you want to refer to.. e.g.
‘ownership’ (provenance in the museum and library world) if that’s something you want to refer independently to the object.
From there you can work out the CSV tables and get them into the system
Not to be a broken record, but the modeling also determines the possibilities for the queries. If you have (for example) place-written and date-written, on a text that can have multiple authors writing at different times and at different places, then you need some way to connect author / place and time *in the model*
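The modeling point above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names (`FlatText`, `WritingEvent`, `ModeledText`) and the sample values are hypothetical and correspond to nothing in PhiloBiblon, FactGrid, or the Yale data. The flat record cannot say which author wrote where and when; the grouped record keeps author, place, and date together, so the question can be answered.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FlatText:
    """Flat model: three independent lists, so the pairing is lost."""
    title: str
    authors: List[str]
    places_written: List[str]
    years_written: List[int]


@dataclass(frozen=True)
class WritingEvent:
    """One author writing at one place at one time (hypothetical class)."""
    author: str
    place: str
    year: int


@dataclass
class ModeledText:
    """Grouped model: author, place and date are connected in the model."""
    title: str
    events: List[WritingEvent] = field(default_factory=list)

    def where_and_when(self, author: str) -> List[Tuple[str, int]]:
        # Answerable only because each event keeps its own place and year.
        return [(e.place, e.year) for e in self.events if e.author == author]


# Placeholder data, invented purely for illustration.
sample = ModeledText(
    title="Example text",
    events=[
        WritingEvent("Author A", "Place X", 1250),
        WritingEvent("Author B", "Place Y", 1275),
    ],
)
print(sample.where_and_when("Author B"))
```

In Wikibase terms, the grouping would presumably be expressed with qualifiers on a statement rather than with a separate class, but that mapping is an assumption and not something settled in these notes.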


Daniel Gullo: Biblissima is using this model.: https://data.biblissima.fr/w/Accueil
== Location of text in MS or edition ==
question about Q-numbers…
Warning: scholars use the MS-ids (manid), text-ids (texid), and other IDs established in the PhiloBiblon system.


Olaf: You will create external IDs (vital for places, e.g.) for any item in PhiloBiblon, and they will be linked to the IDs from PhiloBiblon. The Q-numbers are merged based on double entries (there’s movement, they are not deleted, but merged). There are ways of creating more key-value pairs without deleting IDs.
PhiloBiblon has a controlled vocabulary term ANALYTIC*TEXT_LOCCLASS to indicate the location in a manuscript or edition where a text is found.


This is similar to the database we’re developing…
Existing examples (e.g. [[Item:Q245760]]) link folio P100 or page P54 as statements directly to the MS or edition.
When you’re migrating your data, you want to think of it hierarchically with a controlled vocabulary: 1) Continents, 2) Geographic names, 3) institutions, 4) other entities.


Jason Kovari
PhiloBiblon, I think, needs a P# called "Location in MS or edition" to serve as column header in the CSV file to facilitate mapping to FactGrid.
Is the plan to assess what users really want? (or move forward with what you have?)
Is part of the workplan assessing the work that you initially have?
Jason Kovari (he/him) to Everyone (12:50 PM)
The word I was forgetting earlier was Affinity. There is a Wikidata Affinity Group as part of the LD4 Community: https://www.wikidata.org/wiki/Wikidata:WikiProject_LD4_Wikidata_Affinity_Group


Charles:
: This would take as a Q# Leaf ([[Item:Q369869]]), Page/Pages ([[Item:Q164536]]), or Fly Leaf ([[Item:Q395036]]).
There’s an orthogonal relationship between the ‘library world’ and what we’re doing. PhiloBiblon has evolved over 35 years in response to the needs of the user community as represented primarily by the members of the four PhiloBiblon teams (for Spanish, Portuguese/Galician-Portuguese, Catalan, Golden Age poetry). To what extent do we need to accommodate what the library world has been doing?
We’re not reimagining PhiloBiblon, but rather trying to move it into the web-based wiki world.
We want to incorporate external authority files, such as
Virtual International Authority file (VIAF): the main ID catalog (?)
Getty Thesauri
Denis Muzerelle, Vocabulaire codicologique
Rob and Olaf: Suggest starting with a small number of records (e.g. 2 or 3 codices based on your needs). This has been done.
Choose complex items (10 of each sort) and go from there.
E.g. items for each:
Books (with Q-numbers for all the books)
Toponyms
Institutions
etc.


What’s the process of getting PhiloBiblon into Wikibase? Use quick statements (or API) / spreadsheet into the machine. It takes a few hours to feed the data into the machine, but it’s simple.
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 23:40, 5 March 2022 (CET)
The benefit of a CSV import is you can use OpenRefine to augment the identifiers and reconcile data.
How many items on the largest spreadsheet?


John May: (project janitor, coder) to perform rectifications and homogenization in the Windows database. It’s straightforward to export in CSV / spreadsheet. They are exceptionally large. The problem might be that the creation of Q and P numbers can be done programmatically, but that ontology will need to be made clear.
== thy miseries of our career statements ==
Design desiderata: to accommodate ambiguity and uncertainty. So PhiloBiblon is made to be flexible and customizable.
He’ll await the orders from Olaf as to how he wants the data.
The PhiloBiblon DBMS is an n-dimensional dynamic data model (which can be normalized as needed)
ca. 60 mb for each bibliography (spreadsheet); no fixed field lengths; any structure / sub-sub-structure can contain structure - it’s all based on arrays…
6K texts
14K witnesses
John can write the code and tag the texts with P / Q numbers that would correspond to the Wikibase
PhiloBiblon is not a standard data model… so it will take some time to get the CSV…


How is PhiloBiblon used currently?
Dear Charles. The career statements are terrible so far. We started with an automatic input of German statements from lists. We then ran automatic translations, and here DeepL and Google did not care too much about the complex German spectrum. Thus Hilfsgeistlicher and Kaplan are both Chaplain in English, which is only more or less correct. We have hundreds of these. The best solution would be a systematic input of a system of occupations. There are projects that created such systems so that they can then put words in various languages on that systematic grid. I contacted a project in Cambridge which Christine Philliou's team had picked as an interesting partner, but the contact died down. Katrin Moeller in Halle is on a similar project but does not yet feel ready to make her system open source.
Charles: the primary use of PhiloBiblon currently:
For editors of texts: How many Mss / printed editions are there of a given text?
Descriptions of MSS
There are lots of other potential queries based on data in PhiloBiblon, but it is currently difficult to extract it, e.g.:
Codicology: How many Mss have gatherings of 12 leaves, and where do these Mss come from? Where and when are particular catchword or signature types used in gatherings?
Prosopography: What individuals were active in a given location in a given time period
Oscar: there’s not a lot of interaction with PhiloBiblon, although some users do download forms from our Collaborate page, fill them out, and return them for incorporation into PhiloBiblon
Xavier: demo of SparQL in a nice web editor from the Biblioteca Dixital de Galicia… (Olaf was impressed with the editor)


Olaf: FactGrid is a limited user database, so only they can change things. Everyone in the project will get an account used to change data. Users like the ones Charles describes can be given an account
So: a difficult situation. Look at all the languages of an Item to decide whether these things are really double records (mere spelling variants for instance like German Lakai and Laqai) or whether they are so far not translated with the necessary nuances. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 20:53, 26 March 2022 (CET)
There’s no central form for making ‘issues’, you just change things when you see a mistake.
People who make a change add a reference about the change. You can keep mistaken information on the database to make sure people don’t change it back. It remains on the database ‘downgraded’ as mistaken information.
Qualifiers are created for hypothetical statements.
The public can see everything and do any search but it’s not open to the public. Once you have your data in qualified triples, it will be used by different users depending on their own interests..


Charles: We want to facilitate crowd-sourcing in order to tap into Hispanists all over Europe who can describe MSS in their local libraries and add that information to PhiloBiblon directly.
Hi Olaf, if the two chaplains correspond to two different original understandings, then by all means they should be kept. I note that you make a distinction in German between Kaplan and Hilfsgeistlicher. For PhiloBiblon we'll use the latter Chaplain (Q43655). --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 21:17, 26 March 2022 (CET)
Olaf: “spread accounts” - We can do this by giving everyone interested in contributing an account.
Wikidata doesn’t have working hypotheses, they want facts, not theories. Whereas FactGrid allows for the type of knowledge (hypothetical, guess, needs to be substantiated)
You make editorial control / blocking items… You can put items on a watch list and check who edits it. Otherwise everyone is allowed to edit and we watch the items as they get edited.
Olaf will give people accounts (based on email addresses, website / institutional titles), but he can also give admin accounts and show you how to create these accounts…


Charles:
== P73 notes ==
Would it be useful to add a field in PhiloBiblon for a WikiBase Q / P number? (as we correct our data)
Yes, it’s advisable
FactGrid will show these for Wikidata numbers
this will be an interactive process, back and forth, until we get it right…


John May:
Dear Charles, The most important parameter is the present 1500 character limit. Both your strings are shorter and did not resist my input on the sandbox-Item: [[Item:Q24]]. Why could I do it and you failed? I can only guess. My most common mistake is an accidental blank space at the end. I am also not sure whether it would swallow tabs. But generally I am trying to use the property very moderately, as I realised that things get messy after three 1500-character notes. So test it again and watch out for blanks at the end. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:35, 4 April 2022 (CEST)
In the production of these spreadsheets, what are we trying to do?
E.g. there are 200 cells (with structure within, including P-numbers, and Q-numbers). In a normalized table, we would break these up, but they are not in that format yet.
How are P / Q numbers assigned?
——
Tasks:
Get familiar with FactGrid & Wikibase
Adam will be working on linking the entities between PhiloBiblon and Wikibase.
Get admin account from Olaf (so you can create accounts for the project)
Take the framework and CSV which John May will make and work with Olaf, Josep, and Jason
Once we have the CSV, contact Jason Kovari: questioning whether we need entities or just metadata for the text?
Josep will work on obtaining the entities from the texts (see interface).
Josep will work on making an API for easy input options
Rob: We should write down the primary things you want to refer to.. e.g.
‘ownership’ (provenance in the museum and library world) if that’s something you want to refer independently to the object.
From there you can work out the CSV tables and get them into the system
Everyone: Communicate all problems on the project page, so anyone can see the discussion that led to certain solutions (so our models get adopted). Transparency is key: it lets others follow your line of thinking.


== Test Items ==
== The authors ==
Here are Dan Gullo's suggestions (17 May 2021) for MSS to use as test objects:


When choosing records, I would suggest some basic criteria (I put this here before my suggestions).
Dear Charles, I am wondering about the authors. Would it make sense to import all Wikidata's Spanish authors 700 to 1900, just to have them with external identifiers and basic data?


Well-established authors that are found in multiple collections
Would there be specific external identifiers that would make it easy for you to map such an import against your authors? --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 22:57, 26 April 2022 (CEST)
Well-established authors that are found in multiple languages
Well-established authors that may be found outside of Spanish or Portuguese libraries
Complex author-title relationships, that may involve a known translator and a known author
Complex title-title relationships, perhaps a commentary on a known work.
Complex titles with multiple variants, perhaps by language
Works that are in print and manuscript
Works with established bibliography
I would suggest Gonzalo de Berceo and known works to deal with a major author


BETA bioid 1211
::The Advantage of the Input from Wikidata would be that all the place references would come with easy matching - since we have all the places with Wikidata references. So that could make the entire matching process a bit easier - if persons were easy/easier to match for you. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 07:53, 27 April 2022 (CEST)


I would suggest Isaac of Nineveh and the translations and printed editions of his works (serious issues to wrestle with here because of the need to reconcile your data so that one author moves from 3 ids in PhiloBiblon to one Qid in Wikibase)
From Max: Adding q-items for authors doesn't help us in the short run - it actually makes our task more difficult. We have been operating under the assumption that we don't need to reconcile any of the base objects except geo. That is, we intend to create all the base objects (except geo) by script without checking first to see whether there is an existing object that we could/should use. So creating a bunch of author objects from a different source would mean that we need to be more careful when creating bio objects.


BETA bioid 1186
However, given that we expect work on PB to slow down significantly when the money runs out during the summer, I believe we will choose not to create the bulk of the PB base objects in FG in the summer time frame. Our current plan is:
BETA bioid 1398
BITAGAP bioid 1079


I would suggest Pseudo-Seneca for the same reasons as Isaac of Nineveh
:::Develop OpenRefine schemas against FG live (as Olaf has agreed), using a small subset of the base objects
:::Do a dump of the existing properties in FG along with all q-objects that have a PhiloBiblon ID (P476) -- PB base objects and dataclip objects
:::Load those objects into a beefy, more durable sandbox
:::Run OpenRefine using the full CSV dumps and the schemas we developed in step 1 -- to get a real sense of the dirty data problems we will (almost certainly) encounter and of the truly "problematic" parts of the PB schema
I hope we can get through the above before the money runs out. Having more objects to reconcile against will slow us down.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 20:28, 28 April 2022 (CEST)
::::Well, it is the best solution for me. We should all learn how to do things with Open Refine. So let the expertise which you are gaining, flow. People will be eager to learn from you. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 21:28, 28 April 2022 (CEST)


BETA bioid 1192
== Duplicate ==
BITAGAP bioid 1107
BITECA bioid 6379


For particularly interesting records to think about the complexity of data migration, here are three.
Dear Charles, not sure whether these have different connotations in Spanish Q400622 / Q23209 (if not we should merge to the lower number) --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:02, 3 May 2022 (CEST)


Very complex record
and Q400617 versus Q394155


BETA manid 1567


Record which is not a complete manuscript, so how do you want to represent a part of a manuscript, and not a complete manuscript.
:: Q400622 / Q23209 should be merged. The correct Spanish label is Amante. Q400617 / Q394155 should not be merged: Q400617 is a script, Q394155 is a typeface. --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 21:26, 3 May 2022 (CEST)


BITECA manid 2554
Seems to be a duplicate as well: [[Item:Q892222]] and the older [[Item:Q393633]], best, --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 06:47, 29 March 2024 (CET)


For a printed book with complex data
::  [[Item:Q393633]] has the Label "Part of subject item". The new item [[Item:Q892222]] IsPartOf refers to the fact that the subject is part of the object, for example, an article is part of the issue of a journal. This has become standard library terminology. I cite schema.org. Here's an example of how it's used in the UC Berkeley catalog: [https://search.library.berkeley.edu/discovery/fulldisplay?docid=cdi_doaj_primary_oai_doaj_org_article_5212938fb73341c48f9fcb117beaf782&context=PC&vid=01UCS_BER:UCB&lang=en&search_scope=DN_and_CI&adaptor=Primo%20Central&tab=Default_UCLibrarySearch&query=any,contains,charles%20faulhaber&facet=rtype,include,articles&offset=0 Faulhaber article in UCB catalog]


BETA manid 1510
:: As I have worked through the implications of P1053 (* Secondary bibliography) I saw that it was possible to include values for, e.g., Creator, Title, Source, Series. This worked exactly as I had hoped, but there was no metadata. You would see the title of an article and the title of the journal or the book in which it appeared, but no explicit indication that Title was the title of the article and Source was the title of the journal. Hence: IsPartOf. Each value can then be qualified by P700 (Statement refers to) and the appropriate metadata qualifier. Here's the same article in our Sandbox, still missing the metadata qualifiers, except for author: [https://philobiblon.cog.berkeley.edu/wiki/Item:Q16466 Faulhaber article in Sandbox]


For institutions, you can use mine for fun, because you have it three times in PhiloBiblon, all with three different names.
== P740 ==


BITECA libid 1128
Dear Charles I modified the label on P740 - mostly because I want to be able to use these properties also on works of Music or art. The more open the wording the better - and I also wanted to hint at the P233 as the rival for more defined statements. Hope that fits, we can rename these things as soon as we get examples. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 22:26, 3 May 2022 (CEST)
BITAGAP libid 852
BETA libid 668


=== Do links to PhiloBiblon have to be dynamic? ===
:: Understood. Much better.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 22:43, 3 May 2022 (CEST)
I am asking this because I tried to create an External identifier from FG to PB.  


* [[Item:Q164503|Biblioteca Capitular de Toledo]]
You also mentioned today another P# possibly relevant to the relationship between two texts, but I can't remember what it was. It was not See also (P301), which you also mentioned. --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 00:07, 4 May 2022 (CEST)


The external identifier could lead directly into the data set if I had a link with a stable environment and just the ID changing. I would replace the ID position in the URL with $1 and be able to link from the number into the PB item page... Seems that does not work, as I see you do not give any links into your own database...
== Wiegendrucke ==


Just by the way: sign all comments with <nowiki>~~~~</nowiki> - that creates automatic and dated signatures when you press save. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 22:55, 19 May 2021 (CEST)
Dear Charles, can you give me a couple of Wiegendrucke Links, maybe also with VIAF links so that I understand why this is not working. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 23:12, 30 June 2022 (CEST)


== Database Objects and pages ==
== Link to GKW-ID number ==
 
 The URL for the Gesamtkatalog der Wiegendrucke website [https://data.cerl.org/istc/$1 GKW] does not open the record.
 
We want it to link to the GKW record, just as the Incunabula Short Title Catalog link does: ISTC-ID (P645) [https://data.cerl.org/istc/$1 ISTC]
 
 
See [https://database.factgrid.de/wiki/Item:Q399016  Ed.: Giovanni Boccaccio, La Fiameta, Salamanca: Juan de Porras, 1497 (Q399016)]. This is ISTC-ID [https://data.cerl.org/istc/ib00738000 ib00738000] and GKW-ID [https://gesamtkatalogderwiegendrucke.de/docs/GW04461 04461]
 
The URL here gives an error message: "This gesamtkatalogderwiegendrucke.de page can’t be found". The URL from the ISTC record [https://www.gesamtkatalogderwiegendrucke.de/docs/GW04461.htm GW04461] works there. It looks as though the .htm at the end is necessary. The question is how to plug $1 in.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 23:48, 30 June 2022 (CEST)
 
Well, not nice but part of their software. The first link is operable, the second is not - the difference is the zero which they interpolate after GW
 
# https://www.gesamtkatalogderwiegendrucke.de/docs/GW04461.htm
# https://gesamtkatalogderwiegendrucke.de/docs/GW4461.htm
 
 See my input with that zero at [[Item:Q399016]]: it works just fine. I could also rewrite the formatter URL on [[Property:P768]] to include that 0, but it won't get us anywhere, as they are running their entire system with an expected 5-digit input. See my initial test at [[Item:Q24]], where I just added their number 1. They want me to call it 00001. So maybe Max writes a script to turn all your numbers into 5-digit numbers with the required 0 digits on all smaller numbers. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:13, 1 July 2022 (CEST)
 
 You fixed it! Most of our GKW records are already correct. The GKW ID must be at least 5 digits, with left-padded zeros if necessary, but there may be more. In addition, it can begin with a letter of the alphabet, e.g. "M".--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 01:21, 2 July 2022 (CEST)
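The padding rule discussed above (at least five digits, zeros added on the left, optional leading letter such as "M") could be sketched roughly as follows. This is only an illustration, not Max's script: the function names, the formatter string (modelled on the working link quoted above, not read from Property:P768) and the sample value "M12345" are assumptions.

```python
import re

# Hypothetical formatter URL, modelled on the working link quoted in this
# thread; the actual formatter on Property:P768 may differ.
FORMATTER_URL = "https://www.gesamtkatalogderwiegendrucke.de/docs/GW$1.htm"

# Assumed shape of a GKW ID: an optional single letter followed by digits.
_GKW_ID = re.compile(r"^([A-Za-z]?)(\d+)$")


def normalize_gkw_id(raw: str) -> str:
    """Left-pad the numeric part of a GKW ID to at least five digits."""
    match = _GKW_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"unexpected GKW ID: {raw!r}")
    prefix, digits = match.groups()
    # zfill only adds zeros; numbers longer than five digits are kept as-is.
    return prefix.upper() + digits.zfill(5)


def expand_formatter(formatter: str, identifier: str) -> str:
    """Replace the $1 placeholder of a formatter URL with the identifier."""
    return formatter.replace("$1", identifier)


if __name__ == "__main__":
    # "M12345" is a made-up sample value for a letter-prefixed ID.
    for raw in ["4461", "04461", "1", "M12345"]:
        print(raw, "->", normalize_gkw_id(raw))
    print(expand_formatter(FORMATTER_URL, normalize_gkw_id("4461")))
```

Whether letter-prefixed IDs resolve under the same URL pattern is not confirmed in this thread, so the sketch only builds a link for the plain number.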
 
== P708 ==
 
 Dear Charles, I realised that Property:P708 turned into a potentially messy property. The description stated that it was designed to give information about how secondary literature relates to the object, but then you also used this statement on [[Item:Q254531]] to describe relationships between people, which we do with [[Property:P277]].
 
 My advice here is to use the general P700 property to state what kind of reference an article is making (what you planned with P708), and to use P277 to speak about the status of relationships between people, organisations, etc. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 11:49, 30 September 2022 (CEST)
 
Olaf, [[Property:P277]] "Type of relationship" is described as "status of an affiliation, e.g. lifelong membership" with a declaration stating that it is part of "FactGrid properties for Groups and organizations."
 
If we restrict it to that kind of relationship, then we could use Property:P708 to qualify relationships between people, e.g. for [[Property:P703]], Associated persons. This would mean changing its Description and making it part of "FactGrid Properties for People" and removing it as part of  "FactGrid properties to reference information."
 
If, however, we extend [[Property:P277]] to relationships between people instead of just between people and organizations, then the description would need to be changed and it would have to be part of "FactGrid properties for People" as well.
 
[[Property:P700]] would then qualify [[Property:P12]] Research literature instead of Property:P708. The latter has been used exclusively by PhiloBiblon, so changing it would only involve a small number of Items with Reference bibliography.
 
I would propose, then:
 
[[Property:P277]] to remain for people and groups.
 
[[Property:P700]] to qualify research literature.
 
Property:P708 for relationships between people to qualify [[Property:P703]] Associated persons.
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 00:55, 1 October 2022 (CEST)It woul
 
 ::We are presently discussing broader shifts - also with a look at Wikidata. Katharina Brunner and Michael Ringgaard have made me think over the past months, showing me how one can search through interconnected Wikibases if there are property equivalents. This is no argument against unique solutions, but an argument to join options where it does not make much of a difference.
 
 ::* [[Property:P277]] and [[Property:P820]] form a couple of qualifiers to state relationships (roles or types of affiliation) with greater clarity. It takes two to specify directions, e.g. in family ties such as nephew/uncle. The base statement is usually either [[Property:P91]] "Member of" (there are different forms of membership) or [[Property:P703]] "Associated (or "significant") persons".
 
 We will resolve some of our minor Properties (like [[Property:P192]] friends) and go in that direction of more unified properties and more detailed ontologies on the side of the values.
 
::* [[Property:P700]] is a general property if a statement is not exactly clear. The qualifier allows you to hint at an aspect you are referring to.
 
Looking at what you intended with Property:P708 - this is your normal case: [[Item:Q393579]] and here I would go for "Statement is referring to" ([[Property:P700]]) rather than the proposed "Kind of relationship (P708)" - look at both options on the item. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]])
 
:: Hi Olaf, if we use [[Property:P700]] Statement is referring to with regard to [[Property:P12]] Research Literature, which works for PhiloBiblon, then we have an "orphan" [[Property:P708]] Kind of relationship. It's not clear what that should be used for. In PhiloBiblon, relationships Uncle/nephew were automatic. By definition, if you specified the link Uncle/Nephew the reverse link Nephew/Uncle was created automatically. Can you point me to an exemplary Item that illustrates the use of [[Property:P277]] and [[Property:P820]] for this purpose?--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 01:17, 5 October 2022 (CEST)
:::Take a look at [[Item:Q452342]] - I am actually inclined to get rid of several family properties which we created and to move towards far more distinct items (and I guess that Bruno has the same tendency...) --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:40, 5 October 2022 (CEST)
 
== Editions and copies ==
 
Dear Charles, I took a look at [[Item:Q458438]]. "Present Holding" P329 was so far referring to Institutions. We added shelfmarks to refer to copies without creating specific items. You have now connected P329 to the copy - which looks nice, but adds a different category of objects on the property.
 
My first thought was that we needed an Item/Copy/Exemplar property P840 to list items. Maybe we don't - if we open P329 to copies...
 
I sometimes feel we should sit together over sample items to sort things out.
 
 I also created [[Property:P843]] to mark the FRBR status on each object. Was it necessary? Should we, instead, put the FRBR status on P2 instance of? The argument for a succinct property was that here we could control the precision with greater ease. Just ask whether all the objects have such a statement and make sure that it has only answers in the four FRBR categories from item to work. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 10:01, 15 November 2022 (CET)
 
 --Hi Olaf, I used your exemplary edition [[Item:Q272575]] as a model. It uses Present holding [[Property:P329]] to list the known copies of that edition. It corresponds to a FRBR Manifestation, so you could in fact add [[Property:P843]] > "Manifestation" in the FRBR scheme [[Item:Q453705]].
 
We have the problem of reciprocal properties. I think we need to indicate all the known copies of a given edition, and both P329 and [[Property:P840]] work for that, qualified with the same information (City: Library, shelfmark, old shelfmark).
 At the same time, we also need to indicate, on the individual copies, that they are Items that correspond to a particular manifestation, i.e. an edition. I assume that you created Edition/ production/ model ([[Property:P839]]) for this purpose, and I have therefore added it to the two copies of the 1497 ed. of Boccaccio in [[Item:Q398978]] and [[Item:Q418372]]. I also added "Item" [[Property:P840]] for the sake of comparison.
I used P2 for Print publication (Q16) and also for Specific copy of an edition / production ([[Item:Q3]]).
 
Shall we go over this on Thursday 11/17?--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 01:42, 16 November 2022 (CET)
 
 Good morning Charles, I tried to bring a German incunable, "Rudimentum novitiorum", as an item nearer to the standards you are establishing here. We also have an item for a single copy there: Rudimentum novitiorum [ehem. Exemplar aus der Schloßbibliothek der Grafen Chorinsky] (Q369917). It was generated from an auction catalogue, e.g. at the moment of trade. So P329 on the main page --> Rudimentum novitiorum (Lübeck [1475]). (Q369868) does not seem to fit. Please feel free to correct me.--[[User:Martin Gollasch|Martin Gollasch]] ([[User talk:Martin Gollasch|talk]]) 09:39, 16 November 2022 (CET)
 
 Hi Martin, Olaf and I are struggling with this. You've already made a distinction between the edition (manifestation) of the <i>Rudimentum Novitiorum</i> [[Item:Q369868]] and the item, a physical copy of the edition, in the example of the copy of the <i>Rudimentum Novitiorum</i> formerly belonging to the Schloßbibliothek der Grafen Chorinsky [[Item:Q369917]]. The problem is that the location of that copy is currently unknown. In PhiloBiblon we created a record precisely to capture such cases: "Unknown place, Unknown library". That way we can bring together all known copies (166 to date) whose current location is unknown. This is not particularly useful to know, but it's an artefact of our database design. In order to capture the existence of a previously known copy, it had to be linked to a library.
 
I see that FactGrid already has an example of a separate object to specify that a copy's current location is unknown: Private property, formerly Trento, Countess Klotz, whereabouts unknown ([[Item:Q256436]]).
 
You could do the same for the Chorinsky copy ([[Item:Q369917]]) and then add it to the edition using P329. In this case, that is probably the most efficient way to go, since you probably don't want to create Items for every single copy when you can simply list the holding location and shelfmark, especially since there are 105 known copies.
 
If you <em>did</em> want to create an item for every single copy, you could then use [[Property:P840]] to list surviving copies of a given edition.


This is no database object - it is a Wiki page with the Q-Number of an object:
In the case of PhiloBiblon, we have already created items for every known copy of a given edition, so it makes sense for us to use Item (P840) to link them to the edition itself. In the case of some Spanish-language incunabula we do have more than 100 known copies.


https://database.factgrid.de/wiki/Q254471
Olaf, [[Item:Q3]] and [[Item:Q369870]] should probably be merged, and I have marked them as merger candidates--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 22:03, 16 November 2022 (CET)


To create an Object you need to run the New Item procedure (Menue). better even: run that routine with a batch Fragment. Maybe I do give an online course on the Database, maybe you take a look at the Help section to understand the database routines. I could give a course for anyone interested Monday 21:00 CET.
::I'll be online tomorrow 21:30 CET. "Unknown place, Unknown library" - the equivalent item, though not under the same wording, is [[Item:Q5]] lost object. It has a number of qualifiers to bring precision into the status. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 22:39, 16 November 2022 (CET)


This is the database object:
:::Just by the way [[Item:Q395220]] is a dubious object after the last clarifications of WEMI properties and items. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 09:46, 17 November 2022 (CET)


https://database.factgrid.de/wiki/Item:Q254471 
:What is a "lost object"? When a library or archive burns down, objects may be lost. From the Cologne archive, which collapsed, we know how difficult it is to decide whether an object is damaged or lost... Private collections are not a loss, but the objects are generally unavailable to the public. From common 18th-century alba amicorum, which are usually traded at around 4-10 K€, we know that they pop up every one or two generations. Bombing raids on German cities in WW2 may be a cause of many losses or of the assumption of a loss. So, if an object was described for the last time in literature before WW1 and never after, it may be a loss, but not for sure... If it was described as in the holdings of a private noble family archive in the western part of Germany, it's probably still there. In East Germany or Poland probably not, but perhaps in a museum or archive nearby or in Russia... For me a "loss" has to be positively documented. It's the physical destruction of an object. If the loss can only be assumed, the object is "probably lost". If an object has been reported as in an unknown private hand, it's not "probably lost". Example: We do not know exactly how the Königsberg library burnt down. Maybe a firefighter grabbed an object... From the Anna-Amalia fire we know that not all objects burnt; some were only damaged and restored. In Weimar and in Cologne it's probably still undecided how many objects are irreparably lost. I started this discussion with what I think is a standard case in the digital era. We do not need Items for every single copy of an incunable in the holdings of institutions, but the lesser known ones circulating in trade and auctions between private owners/collectors need a deeper documentation and therefore an extra item.--[[User:Martin Gollasch|Martin Gollasch]] ([[User talk:Martin Gollasch|talk]]) 09:58, 17 November 2022 (CET)


best wishes, --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:51, 20 May 2021 (CEST)
::It's important to distinguish between a "lost object" in Martin's sense, i.e., one that is definitely known to be lost, as in items destroyed during WWII, and items that <em>may</em> be lost. The latter are items that were documented by a reliable source but can no longer be traced. PhiloBiblon has lots of these: MSS or copies of printed editions that were sold at auction or offered by booksellers but whose current location is unknown. They are presumed to exist, and they may turn up in the future. We list them in a place-holder record as in an "Unknown place, Unknown library" and use Literature (P12) to show where they were documented. If they do re-appear, we just need to change the current location to whatever library or institution is the new owner, using [[Property:P329]].--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 19:08, 17 November 2022 (CET)

Latest revision as of 21:15, 29 March 2024

All FactGrid users

June 2019 First Edits

Dear Charles,

Here are the exemplary items you might try to refine:

There are probably some statements I would try to use if it was my project, especially the Property:P233/+Property:P234 option to create a stemma. This is something I am not quite sure how it will work out. If you talk to computer people they might find that easy. Property:P233 states the previous position(s) in a genetic relationship and Property:P234 is a property to qualify the respective genealogical link, like translation, abbreviation and so on. We managed to get network graphs with these two properties, but I'd love to have a visualisation that exploits the P106 date marker to create the tree...

There are other issues I did not try to solve. The Property:P476 identifier should be able to link straight through into the dataset. You can configure it if you know more about how your software is creating these dynamic links. (That is also the reason why I did not create the "Uniform Title IDno" - you people will know better how to create links into your database.)

I also did not go into details with the two texts of the manuscript. Item:Q164746 has some blanks "Input here" that just indicate we could do it, but I did not create the respective items because that would open ever new boxes. The language item "Castellano" might need specification (Spanish would also have been available, but I guess you have more specific terms for different historical periods like we have in German with Middle High German, or Middle Low German).

Best Wishes --Olaf Simons (talk) 13:42, 27 May 2020 (CEST)

Thanks, Olaf,

I'm still trying to figure out how to identify Properties to use as examples in the NEH proposal.

I will use this with you rather than e-mail....

A question:

Is it possible to copy a Property and then make changes in it? For example, copy P329 "Holding archive" to create another P with the label "Holding library"?

Distinguishing Qs at the model level

Hi Olaf,

I understand "Goethe (Q5879) author (P21) Faust (Q29478)"

How do I express this in terms of a data label, linking an entity Q author (P21) to Q ext"?

How about entity human (Q5) author (P21) Q ???


I was looking at the "New Property" page. I think that we are going to have to create a lot of new properties. Will they all have to be debated and discussed before they can be added?

Thanks again. This is fun,

Charles

The statement above is the other way around: Faust has properties and one is that Goethe is its author. See Item:Q3828 for a typical letter. (We did not yet go any deeper into books). Look at our Reasonator interpretation to see why this makes sense.
You can state Faust on Goethe's side with Property:P174 works published. But we did not start with this because it would kill the Items. Some people have written some 800 letters - if we state them on the person's item that will not be beautiful, especially since we will travel around with a messy subset of qualifiers: Goethe is the author of the letter x, he sent that to Y, Y lived in Z, he sent that letter on date B... So we put all this information on the objects and leave it to searches to tell us what a person has written (for example, all the objects that have J.J.C. Bode as author, some 800).
Item:Q7 Human: you need it to state that someone is a human being - like a category in previous databases. Author is mostly used as in career statements.
If you want to feed bigger masses into FactGrid, I would recommend doing some five test items of each sort and looking at the searches they are supposed to answer. A good data model answers searches well. On this course we will create most of the Properties needed. You can create your own properties any time, but I recommend doing that only after a couple of items we created together. There are stupid mistakes one can make, like not fully considering the data type. It is also good practice to state new properties on the common page so that the others can see them and begin to use them in their own research. --Olaf Simons (talk) 08:13, 31 May 2020 (CEST)

Hi Olaf,

I did finally figure out, with help from a friend in Spain that the proper relationship is work P50 author/ written by: Libro de buen amor (Q2283127) written by (P50) Juan Ruiz (Q434597). In fact, this is how we do it in PhiloBiblon, where the basic entity is the Work, to which authors, translators, and other Associated Persons are linked.

I don't intend to create anything, neither properties nor classes, for a good long time!

I will talk to you with Jens on Tuesday

1st Web-Meeting, 18 May 2021

Moved to FactGrid talk:PhiloBiblon

Database Objects and pages

The following is no database object - it is a Wiki page with the Q-Number of an object/Item:

https://database.factgrid.de/wiki/Q254471

To create an object you need to run the New Item procedure (Menu (which you did)). Better even: run that routine with a batch Fragment.

This is the database object - It has "Item" in the url:

https://database.factgrid.de/wiki/Item:Q254471

It makes no sense to date things on FactGrid (as you did on the talk page), as the system has a version history for everything that is far more precise and specific.

Maybe I can offer an online course on the database. It is also useful to take a look at the Help section to understand the database routines. (Improvements of the Help section are welcome.) I could give that course for anyone interested on Monday 24 May 2021 at 21:00 CET on our Webex channel (easy to use).

Good descriptions

Dear Charles, as I just wrote in my e-mail: If you are using labels like "MS: Toledo: Biblioteca Capitular...." you will get hundreds of hits on the normal navigation that make no difference before character 31. Two ways out of that dilemma: good, (1) intuitive, aliases and (2) descriptions that state what is special in the first 15 characters, the ones you see on display in thin letters, below the bold font labels. --Olaf Simons (talk) 08:19, 26 January 2022 (CET)

Witnesses and Price

Morning Charles, I am not sure whether Item:Q394389 is meant to refer to a person (which we have Q144912) in a law case or to a textual witness (Item:Q195274)? (The plural is odd) --Olaf Simons (talk) 08:32, 24 February 2022 (CET)

I also detected these two, Item:Q394388 and Item:Q246405, which might be the same. --Olaf Simons (talk) 08:55, 24 February 2022 (CET)

We need both textual witnesses Item:Q195274 and witnesses as individuals Q144912, although ours are usually witnesses of documents, not witnesses in a law case. In the former case, each record in our Analytic table is a textual witness. In the latter case, witnesses are found as Persons Associated with a text or with a textual witness. With regard to Item:Q394388 and Item:Q246405, they are different in my opinion. The latter is the actual price paid from one person to another; the former is the price established by law for a book. This is useful to know for book history. --Charles Faulhaber (talk) 19:56, 24 February 2022 (CET)

--I quote the "tasa" from the 1495 edition of Antonio de Nebrija's Latin-Spanish Dictionary: "Esta tassado este vocabulario por los muy altos | & muy poderosos principes el Rey & la Reyna | nuestros señores & por los del su muy alto con|sejo en cinco reales de plata."

The price of this vocabulary has been set by the most high and most powerful princes, the King and Queen our lords, and by those of their most high council in 5 silver reales

This is an official government action. It is different from the price one would pay to a bookdealer, where money changes hands. --Charles Faulhaber (talk) 06:09, 3 March 2022 (CET)

Property like Items

Dear Charles, Item:Q394873 - I do not quite see in what kind of statement such items can possibly occur. I do not even know what P2 or P3 statement to give them. Could you put them on an exemplary item? --Olaf Simons (talk) 10:39, 2 March 2022 (CET)

This was based on our discussion 3/1/22 on how to use Literature (P12) to list a Q# of secondary bibliography and then to qualify that Q# with Kind of Reference (P708). You created that property in order not to use Note (P73) followed by a string. You rightly pointed out that if we used "descrito en," we would not see "described in" if we changed the language to English. See Item:Q164502.

We will see how we get good wordings on these. So far they sound like properties - and we even have one of the same wording. That is why I was worried whether the information was not supposed to go on the Property:P411. --Olaf Simons (talk) 08:48, 3 March 2022 (CET)

Dear Olaf, it's clear that we are going to have doubles with the same wording in both P# and Q#. We need this flexibility in order to map PhiloBiblon into FactGrid. It is clear that we need a P# for each class (=column head), but then we have to decide whether the individual items in that class should be P# or Q#. We don't think that we can mix them: either all P# in a given class or all Q#. If we say that we should have all P# in a given class, then we have to create those P# if they do not already exist. And you have convinced me that multiplying the number of P# is not a good idea: Occam's razor. --Charles Faulhaber (talk) 07:09, 4 March 2022 (CET)

Location of text in MS or edition

PhiloBiblon has a controlled vocabulary term ANALYTIC*TEXT_LOCCLASS to indicate the location in a manuscript or edition where a text is found.

Existing examples (e.g. Item:Q245760) link folio P100 or page P54 as statements directly to the MS or edition.

PhiloBiblon, I think, needs a P# called "Location in MS or edition" to serve as column header in the CSV file to facilitate mapping to FactGrid.

This would take as a Q# Leaf (Item:Q369869), Page/Pages (Item:Q164536), or Fly Leaf (Item:Q395036).
--Charles Faulhaber (talk) 23:40, 5 March 2022 (CET)

The miseries of our career statements

Dear Charles. The career statements are terrible so far. We started with an automatic input of German statements from lists. We then ran automatic translations, and here Deepl and Google did not care too much about the complex German spectrum. Thus Hilfsgeistlicher and Kaplan are both Chaplain in English, which is only more or less correct. We have hundreds of these. The best solution would be a systematic input of a system of occupations. There are projects that created such systems so that they can then put various language words on that systematic grid. I contacted a project in Cambridge which Christine Philliou's team had picked as an interesting partner, but the contact died down. Katrin Moeller in Halle is on a similar project but does not feel ready to make her system open source yet.

So: a difficult situation. Look at all the languages of an Item to decide whether these things are really double records (mere spelling variants for instance like German Lakai and Laqai) or whether they are so far not translated with the necessary nuances. --Olaf Simons (talk) 20:53, 26 March 2022 (CET)

Hi Olaf, if the two chaplains correspond to two different original understandings, then by all means they should be kept. I note that you make a distinction in German between Kaplan and Hilfsgeistlicher. For PhiloBiblon we'll use the latter Chaplain (Q43655). --Charles Faulhaber (talk) 21:17, 26 March 2022 (CET)

P73 notes

Dear Charles, The most important parameter is the present 1500 character limit. Both your strings are shorter and did not resist my input on the sandbox-Item: Item:Q24. Why could I do it and you failed? I can only guess. My most common mistake is an accidental blank space in the end. I am also not sure whether it would swallow tabs. But generally I am trying to use the property very moderately as I realised that things get messy after three 1500 character notes. So test it again and watch out for blanks in the end. --Olaf Simons (talk) 08:35, 4 April 2022 (CEST)

The authors

Dear Charles, I am wondering about the authors. Would it make sense to import all Wikidata's Spanish authors 700 to 1900, just to have them with external identifiers and basic data?

Would there be specific external identifiers that would make it easy for you to map such an import against your authors? --Olaf Simons (talk) 22:57, 26 April 2022 (CEST)

The Advantage of the Input from Wikidata would be that all the place references would come with easy matching - since we have all the places with Wikidata references. So that could make the entire matching process a bit easier - if persons were easy/easier to match for you. --Olaf Simons (talk) 07:53, 27 April 2022 (CEST)

From Max: Adding q-items for authors doesn't help us in the short run - it actually makes our task more difficult. We have been operating under the assumption that we don't need to reconcile any of the base objects except geo. That is, we intend to create all the base objects (except geo) by script without checking first to see whether there is an existing object that we could/should use. So creating a bunch of author objects from a different source would mean that we need to be more careful when creating bio objects.

However, given that we expect work on PB to slow down significantly when the money runs out during the summer, I believe we will choose not to create the bulk of the PB base objects in FG in the summer time frame. Our current plan is:

  1. Develop OpenRefine schemas against FG live (as Olaf has agreed), using a small subset of the base objects
  2. Do a dump of the existing properties in FG along with all q-objects that have a PhiloBiblon ID (P476) -- PB base objects and dataclip objects
  3. Load those objects into a beefy, more durable sandbox
  4. Run OpenRefine using the full CSV dumps and the schemas we developed in step 1 -- to get a real sense of the dirty data problems we will (almost certainly) encounter and of the truly "problematic" parts of the PB schema

I hope we can get through the above before the money runs out. Having more objects to reconcile against will slow us down.--Charles Faulhaber (talk) 20:28, 28 April 2022 (CEST)

Well, it is the best solution for me. We should all learn how to do things with Open Refine. So let the expertise which you are gaining, flow. People will be eager to learn from you. --Olaf Simons (talk) 21:28, 28 April 2022 (CEST)

Duplicate

Dear Charles, not sure whether these have different connotations in Spanish Q400622 / Q23209 (if not we should merge to the lower number) --Olaf Simons (talk) 08:02, 3 May 2022 (CEST)

and Q400617 versus Q394155


Q400622 / Q23209 should be merged. The correct Spanish label is Amante. Q400617 / Q394155 should not be merged. Q400617 is a script. Q394155 is a typeface.--Charles Faulhaber (talk) 21:26, 3 May 2022 (CEST)

Seems to be a duplicate as well: Item:Q892222 and the older Item:Q393633, best, --Olaf Simons (talk) 06:47, 29 March 2024 (CET)

Item:Q393633 has the Label "Part of subject item". The new item Item:Q892222 IsPartOf refers to the fact that the subject is part of the object, for example, an article is part of the issue of a journal. This has become standard library terminology. I cite schema.org. Here's an example of how it's used in the UC Berkeley catalog: Faulhaber article in UCB catalog
As I have worked through the implications of P1053 (* Secondary bibliography) I saw that it was possible to include values for, e.g., Creator, Title, Source, Series. This worked exactly as I had hoped, but there was no metadata. You would see the title of an article and the title of the journal or the book in which it appeared, but no explicit indication that Title was the title of the article and Source was the title of the journal. Hence: IsPartOf. Each value can then be qualified by P700 (Statement refers to) and the appropriate metadata qualifier. Here's the same article in our Sandbox, still missing the metadata qualifiers, except for author: Faulhaber article in Sandbox

P740

Dear Charles, I modified the label on P740, mostly because I want to be able to use these properties on works of music or art as well. The more open the wording the better, and I also wanted to hint at P233 as the rival for more defined statements. Hope that fits; we can rename these things as soon as we get examples. --Olaf Simons (talk) 22:26, 3 May 2022 (CEST)

Understood. Much better.--Charles Faulhaber (talk) 22:43, 3 May 2022 (CEST)

You also mentioned today another P# possibly relevant to the relationship between two texts, but I can't remember what it was. It was not See also (P301), which you also mentioned. --Charles Faulhaber (talk) 00:07, 4 May 2022 (CEST)

Wiegendrucke

Dear Charles, can you give me a couple of Wiegendrucke Links, maybe also with VIAF links so that I understand why this is not working. --Olaf Simons (talk) 23:12, 30 June 2022 (CEST)

Link to GKW-ID number

The URL for the Gesamtkatalog der Wiegendrucke website GKW does not open the record.

We want it to link to the GKW record, just as the Incunabula Short Title Catalog link does: ISTC-ID (P645) ISTC


See Ed.: Giovanni Boccaccio, La Fiameta, Salamanca: Juan de Porras, 1497 (Q399016). This is ISTC-ID ib00738000 and GKW-ID 04461

The URL here gives an error message: "This gesamtkatalogderwiegendrucke.de page can’t be found". The URL from the ISTC record GW04461 works there. It looks as though the .htm at the end is necessary. The question is how to plug $1 in.--Charles Faulhaber (talk) 23:48, 30 June 2022 (CEST)

Well, not nice, but part of their software. The first link works, the second does not; the difference is the zero which they insert after GW:

  1. https://www.gesamtkatalogderwiegendrucke.de/docs/GW04461.htm
  2. https://gesamtkatalogderwiegendrucke.de/docs/GW4461.htm

See my input with that zero on Item:Q399016 - it works just fine. I could also rewrite the formatter URL on Property:P768 to include that 0, but it won't get us anywhere, as they are running their entire system with an expected 5-digit input. See my initial test at Item:Q24, where I just added their number 1. They want me to call it 00001. So maybe Max writes a script to turn all your numbers into 5-digit numbers, with the required leading zeros on all smaller numbers. --Olaf Simons (talk) 08:13, 1 July 2022 (CEST)

You fixed it! Most of our GKW records are already correct. The GKW ID must be at least 5 digits, with left-padded zeros if necessary, but there may be more. In addition it can begin with a character of the alphabet, e.g. "M".--Charles Faulhaber (talk) 01:21, 2 July 2022 (CEST)
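
The padding rule described above could be scripted roughly as follows. This is a minimal sketch, not the script Max might write: the function names are hypothetical, the treatment of a leading letter (kept as-is, with the digits padded to five) is an assumption, and the URL pattern is the one from the working link quoted earlier in this thread, so it is only applied to purely numeric IDs.

```python
import re

# URL pattern taken from the working link quoted in this thread.
GKW_URL_TEMPLATE = "https://www.gesamtkatalogderwiegendrucke.de/docs/GW{id}.htm"

# Optional single leading letter (e.g. "M"), followed by digits only.
_ID_PATTERN = re.compile(r"^([A-Za-z]?)(\d+)$")


def normalize_gkw_id(raw: str) -> str:
    """Left-pad the numeric part of a GKW ID to at least five digits.

    Assumption: a leading letter is preserved (upper-cased) and only
    the digits after it are padded.
    """
    match = _ID_PATTERN.match(raw.strip())
    if match is None:
        raise ValueError(f"unrecognised GKW ID: {raw!r}")
    prefix, digits = match.groups()
    # zfill pads with zeros on the left and never truncates longer numbers.
    return prefix.upper() + digits.zfill(5)


def gkw_url(raw: str) -> str:
    """Build the record URL for a purely numeric GKW ID."""
    normalized = normalize_gkw_id(raw)
    if not normalized.isdigit():
        # The thread does not confirm the URL pattern for letter-prefixed IDs.
        raise ValueError(f"URL pattern unknown for prefixed ID: {normalized!r}")
    return GKW_URL_TEMPLATE.format(id=normalized)


if __name__ == "__main__":
    for value in ["4461", "1", "04461"]:
        print(value, "->", normalize_gkw_id(value), gkw_url(value))
```

IDs that already have five or more digits pass through unchanged, so running such a script over records that are already correct should be harmless.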

P708

Dear Charles, I realised that Property:P708 has turned into a potentially messy property. The description stated that it was designed to give information about how secondary literature relates to the object, but then you also used this statement on Item:Q254531 to describe relationships between people, which we do with Property:P277.

My advice here is to use the general P700 property to state what kind of reference an article is making (what you planned with P708), and to use P277 to speak about the status of relationships between people, organisations, etc. --Olaf Simons (talk) 11:49, 30 September 2022 (CEST)

Olaf, Property:P277 "Type of relationship" is described as "status of an affiliation, e.g. lifelong membership" with a declaration stating that it is part of "FactGrid properties for Groups and organizations."

If we restrict it to that kind of relationship, then we could use Property:P708 to qualify relationships between people, e.g. for Property:P703, Associated persons. This would mean changing its Description and making it part of "FactGrid Properties for People" and removing it as part of "FactGrid properties to reference information."

If, however, we extend Property:P277 to relationships between people instead of just between people and organizations, then the description would need to be changed and it would have to be part of "FactGrid properties for People" as well.

Property:P700 would then qualify Property:P12 Research literature instead of Property:P708. The latter has been used exclusively by PhiloBiblon, so changing it would only involve a small number of Items with Reference bibliography.

I would propose, then:

Property:P277 to remain for people and groups.

Property:P700 to qualify research literature.

Property:P708 for relationships between people to qualify Property:P703 Associated persons. --Charles Faulhaber (talk) 00:55, 1 October 2022 (CEST)

We are presently discussing broader shifts, also with a look at Wikidata. Katharina Brunner and Michael Ringgaard have made me think over the past months, showing me how one can search through interconnected Wikibases if there are property equivalents. This is no argument against unique solutions, but an argument to join options where it does not make much of a difference.
  • Property:P277 and Property:P820 form a pair of qualifiers to state relationships (roles or types of affiliation) with greater clarity. It takes two to specify directions, e.g. in family ties nephew/uncle. The base statement is usually either Property:P91 "Member of" (there are different forms of membership) or Property:P703 "Associated (or "significant") persons".

We will resolve some of our minor properties (like Property:P192 friends) and go in that direction of more unified properties and more detailed ontologies on the side of the values.

  • Property:P700 is a general property if a statement is not exactly clear. The qualifier allows you to hint at an aspect you are referring to.

Looking at what you intended with Property:P708 - this is your normal case: Item:Q393579 and here I would go for "Statement is referring to" (Property:P700) rather than the proposed "Kind of relationship (P708)" - look at both options on the item. --Olaf Simons (talk)

Hi Olaf, if we use Property:P700 Statement is referring to with regard to Property:P12 Research Literature, which works for PhiloBiblon, then we have an "orphan" Property:P708 Kind of relationship. It's not clear what that should be used for. In PhiloBiblon, relationships Uncle/nephew were automatic. By definition, if you specified the link Uncle/Nephew the reverse link Nephew/Uncle was created automatically. Can you point me to an exemplary Item that illustrates the use of Property:P277 and Property:P820 for this purpose?--Charles Faulhaber (talk) 01:17, 5 October 2022 (CEST)
Take a look at Item:Q452342 - I am actually inclined to get rid of several family properties which we created and to move towards far more distinct items (and I guess that Bruno has the same tendency...) --Olaf Simons (talk) 08:40, 5 October 2022 (CEST)
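
The automatic reverse link that Charles describes for PhiloBiblon could be sketched roughly as follows. This is only an illustration under stated assumptions, not PhiloBiblon or FactGrid code: the role labels, the tuple layout and the placeholder identifiers are all hypothetical, and the sketch says nothing about how Property:P277 and Property:P820 would carry the two directions.

```python
# Hypothetical table of role pairs; the labels are illustrative only
# and do not correspond to actual FactGrid items.
INVERSE_ROLE = {
    "uncle": "nephew",
    "nephew": "uncle",
    "teacher": "student",
    "student": "teacher",
}


def reciprocal(statement):
    """Return the reverse of a (subject, role_of_subject, object) statement."""
    subject, role, obj = statement
    if role not in INVERSE_ROLE:
        raise KeyError(f"no inverse defined for role {role!r}")
    return (obj, INVERSE_ROLE[role], subject)


def add_with_reciprocal(store, statement):
    """Add a statement and its automatically derived reverse to a set."""
    store.add(statement)
    store.add(reciprocal(statement))


if __name__ == "__main__":
    statements = set()
    # "PERSON_A" and "PERSON_B" are placeholders, not real Q-numbers.
    add_with_reciprocal(statements, ("PERSON_A", "uncle", "PERSON_B"))
    for entry in sorted(statements):
        print(entry)
```

A single lookup table only works when each role has exactly one inverse; gendered or otherwise ambiguous pairs would need more information than the role label alone.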

Editions and copies

Dear Charles, I took a look at Item:Q458438. "Present Holding" P329 was so far referring to Institutions. We added shelfmarks to refer to copies without creating specific items. You have now connected P329 to the copy - which looks nice, but adds a different category of objects on the property.

My first thought was that we needed an Item/Copy/Exemplar property P840 to list items. Maybe we don't - if we open P329 to copies...

I sometimes feel we should sit together over sample items to sort things out.

I also created Property:P843 to mark the FRBR status on each object. Was it necessary? Should we, instead, put the FRBR status on P2 instance of? The argument for a separate property was that here we could control the precision with greater ease. Just ask whether all the objects have such a statement and make sure that it has only answers in the four FRBR categories from item to work. --Olaf Simons (talk) 10:01, 15 November 2022 (CET)

Hi Olaf, I used your exemplary edition Item:Q272575 as a model. It uses Present holding Property:P329 to list the known copies of that edition. It corresponds to a FRBR Manifestation, so you could in fact add Property:P843 > "Manifestation" in the FRBR scheme Item:Q453705.

We have the problem of reciprocal properties. I think we need to indicate all the known copies of a given edition, and both P329 and Property:P840 work for that, qualified with the same information (City: Library, shelfmark, old shelfmark). At the same time, we also need to indicate, on the individual copies, that they are Items that correspond to a particular manifestation, i.e. an edition. I assume that you created Edition/ production/ model (Property:P839) for this purpose, and I have therefore added it to the two copies of the 1497 ed. of Boccaccio in Item:Q398978 and Item:Q418372. I also added "Item" (Property:P840) for the sake of comparison. I used P2 for Print publication (Q16) and also for Specific copy of an edition / production (Item:Q3).

Shall we go over this on Thursday 11/17?--Charles Faulhaber (talk) 01:42, 16 November 2022 (CET)

Good morning Charles, I tried to bring a German incunable, "Rudimentum novitiorum", nearer as an item to the standards you are establishing here. We also have an item for a single copy there: Rudimentum novitiorum [ehem. Exemplar aus der Schloßbibliothek der Grafen Chorinsky] (Q369917). It was generated from an auction catalogue, i.e. at the moment of trade. So P329 on the main page --> Rudimentum novitiorum (Lübeck [1475]). (Q369868) does not seem to fit. Please feel free to correct me.--Martin Gollasch (talk) 09:39, 16 November 2022 (CET)

Hi Martin, Olaf and I are struggling with this. You've already made a distinction between the edition (manifestation) of the Rudimentum Novitiorum Item:Q369868 and the item, a physical copy of the edition, in the example of the copy of the Rudimentum Novitiorum formerly belonging to the Schloßbibliothek der Grafen Chorinsky Item:Q369917. The problem is that the location of that copy is currently unknown. In PhiloBiblon we created a record precisely to capture such cases: "Unknown place, Unknown library". That way we can bring together all known copies (166 to date) whose current location is unknown. This is not particularly useful to know, but it's an artefact of our database design. In order to capture the existence of a previously known copy, it had to be linked to a library.

I see that FactGrid already has an example of a separate object to specify that a copy's current location is unknown: Private property, formerly Trento, Countess Klotz, whereabouts unknown (Item:Q256436).

You could do the same for the Chorinsky copy (Item:Q369917) and then add it to the edition using P329. In this case, that is probably the most efficient way to go, since you probably don't want to create Items for every single copy when you can simply list the holding location and shelfmark, especially since there are 105 known copies.

If you did want to create an item for every single copy, you could then use Property:P840 to list surviving copies of a given edition.

In the case of PhiloBiblon, we have already created items for every known copy of a given edition, so it makes sense for us to use Item (P840) to link them to the edition itself. In the case of some Spanish-language incunabula we do have more than 100 known copies.

Olaf, Item:Q3 and Item:Q369870 should probably be merged, and I have marked them as merger candidates--Charles Faulhaber (talk) 22:03, 16 November 2022 (CET)

I'll be online tomorrow 21:30 CET. "Unknown place, Unknown library" - the equivalent item, though not under the same wording, is Item:Q5 lost object. It has a number of qualifiers to bring precision into the status. --Olaf Simons (talk) 22:39, 16 November 2022 (CET)
Just by the way Item:Q395220 is a dubious object after the last clarifications of WEMI properties and items. --Olaf Simons (talk) 09:46, 17 November 2022 (CET)


What is a "lost object"? When a library or archive burns down, objects may be lost. From the Cologne archive, which collapsed, we know how difficult it is to decide whether something is damaged or lost... Objects in private collections are not a loss, but they are generally unavailable to the public. From common 18th-century Albums amicorum, which are usually traded at around 4-10 K€, we know that they pop up every 1-2 generations. Bombing raids on German cities in WW2 may be the cause of many losses, or of the assumption of a loss. So, if an object was last described in the literature before WW1 and never after, it may be a loss, but not for sure... If it was described as held in a private noble family archive in the western part of Germany, it's probably still there. In East Germany or Poland probably not, but perhaps it is in a museum or archive nearby, or in Russia...

For me a "loss" has to be positively documented. It's the physical destruction of an object. If the loss can only be assumed, the object is "probably lost". If an object has been reported as being in an unknown private hand, it's not "probably lost". Example: We do not know exactly how the Königsberg library burnt down. Maybe a firefighter grabbed an object... From the Anna-Amalia fire we know that not all objects burnt; some were only damaged and were restored. In Weimar and in Cologne it's probably still undecided how many objects are irretrievably lost.

I started this discussion with what I think is a standard case in the digital era. We do not need Items for every single copy of an incunable in the holdings of institutions, but the lesser-known ones circulating in trade and auctions between private owners/collectors need deeper documentation and therefore an extra item.--Martin Gollasch (talk) 09:58, 17 November 2022 (CET)
It's important to distinguish between a "lost object" in Martin's sense, i.e., one that is definitely known to be lost, as in items destroyed during WWII, and items that may be lost. The latter are items that were documented by a reliable source but can no longer be traced. PhiloBiblon has lots of these--MSS or copies of printed editions that were sold at auction or offered by booksellers but whose current location is unknown. They are presumed to exist, and they may turn up in the future. We list them in a place-holder record as being in an "Unknown place, Unknown library" and use Literature (P12) to show where they were documented. If they do re-appear, we just need to change the current location to whatever library or institution is the new owner, using Property:P329.--Charles Faulhaber (talk) 19:08, 17 November 2022 (CET)