FactGrid talk:PhiloBiblon: Difference between revisions

* [[FactGrid:Places|Geographical entities on FactGrid]]
* [[FactGrid:Wikibase Presenters|Wikibase Presenters]]
* [[FactGrid talk:PhiloBiblon/Archive]]
|}
|}
__TOC__
== Project Dimensions ==
Present stats:
<table cellpadding="1">
<tr bgcolor="#efefef">
<td align="left">TABLA</td>
<td width="100px" align="center">BETA</td>
<td width="100px" align="center">BITAGAP</td>
<td width="100px" align="center">BITECA</td>
<td width="100px" align="center">BIPA</td>
<td width="120px" align="right" bgcolor="white">Total</td>
</tr>
<tr>
<td>ANALYTIC (witnesses)</td>
<td align="right">14692</td>
<td align="right">52084</td>
<td align="right">12239</td>
<td align="right">89926</td>
<td align="right">168,941</td>
</tr>
<tr>
<td>REFERENCES</td>
<td align="right">7270</td>
<td align="right">21558</td>
<td align="right">5976</td>
<td align="right">472</td>
<td align="right">35,276</td>
</tr>
<tr>
<td>PERSONS</td>
<td align="right">7423</td>
<td align="right">32309</td>
<td align="right">3473</td>
<td align="right">3418</td>
<td align="right">46,623</td>
</tr>
<tr>
<td>GEOGRAPHY</td>
<td align="right">1814</td>
<td align="right">4759</td>
<td align="right">840</td>
<td align="right">211</td>
<td align="right">7,624*</td>
</tr>
<tr>
<td>INSTITUTIONS</td>
<td align="right">794</td>
<td align="right">3297</td>
<td align="right">585</td>
<td align="right">4</td>
<td align="right">4,680</td>
</tr>
<tr>
<td>LIBRARIES</td>
<td align="right">915</td>
<td align="right">455</td>
<td align="right">420</td>
<td align="right">119</td>
<td align="right">1,909</td>
</tr>
<tr>
<td>MANUSCRIPTS &amp; IMPRINTS</td>
<td align="right">5168</td>
<td align="right">5886</td>
<td align="right">1971</td>
<td align="right">1572</td>
<td align="right">14,597</td>
</tr>
<tr>
<td>COPIES OF PRINTED BOOKS</td>
<td align="right">4157</td>
<td align="right">1146</td>
<td align="right">1473</td>
<td align="right">137</td>
<td align="right">6,913</td>
</tr>
<tr>
<td>SUBJECT HEADINGS</td>
<td align="right">339</td>
<td align="right">34</td>
<td align="right">149</td>
<td align="right">126</td>
<td align="right">648</td>
</tr>
<tr>
<td>WORKS (texts)</td>
<td align="right">6034</td>
<td align="right">31962</td>
<td align="right">6173</td>
<td align="right">89913</td>
<td align="right">134,082</td>
</tr>
<tr bgcolor="#efefef">
<td>TOTAL</td>
<td align="right">48606</td>
<td align="right">153490</td>
<td align="right">33299</td>
<td align="right">185898</td>
<td align="right">421,293</td>
</tr>
</table>
<nowiki>*</nowiki> For places I would definitely recommend an input of all Spanish places (deserted villages to cities, some 13,000) from sources that come with external identifiers and with geographic coordinates. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 11:46, 21 May 2021 (CEST)
::The best way to start the input is probably 1: all places, 2: all libraries, 3: All institutions... - as these things will work without your texts and codices, while the texts and codices will not work without these. Another approach would be to run a first input simply on Labels, (provisional) Descriptions and P2 information in order to get the grid of FG Q-Numbers and their matches of PB Identifiers as fast as possible. Once you have the matches you can run all subsequent inputs with greater ease. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 16:15, 23 May 2021 (CEST)
In fact, this is what we had suggested doing as the first steps, starting with these three entities, primarily because they are the least complex in our data model. Although, as you point out, they have fewer dependencies. Toponyms have internal dependencies. Thus, cities belong to states or provinces ... --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 20:36, 23 May 2021 (CEST)
Running "first input simply on Labels, (provisional) Descriptions and P2 information in order to get the grid of FG Q-Numbers and their matches of PB Identifiers" is a wonderful idea. Needless to say I don't know what that implies. Presumably our programmer John May will have to generate a list of PB identifiers. For every record we have a "moniker," a constructed field that serves to disambiguate similar records. Thus the moniker for persons consists of First and Last Name, followed by the person's title and dates, or some other identifying information. --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 02:51, 26 May 2021 (CEST)
== Input efficiency: how to map controlled vocabulary to P#/Q# ==
This Friday 1800 CST / 0900 PDT / 1200 EDT works for me. I've also invited Josep Formentí, who is in Barcelona, who has been looking at the FactGrid api.
We are currently focusing on cleaning up and coordinating our data in the four PhiloBiblon bibliographies.
Database designer John May is experimenting with CSV exports of data from the Biography/Persons table to allow us to manipulate it more easily to find errors, inconsistencies, and duplicates.
Josep just mentioned a deduplication library to me. This may be useful once we're a little further into the process.
In addition to Adam's questions I want to follow up with Olaf on three issues:
# What's the most efficient way to compare our lists of controlled vocabulary with existing FactGrid P# and Q#?
# How would we "run a first input simply on Labels, (provisional) Descriptions and P2 information in order to get the grid of FG Q-Numbers and their matches of PB Identifiers as fast as possible"?
# How do you log in to FactGrid in Spanish?
Yashila, glad to meet you,
-- [[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 00:57, 10 June 2021 (CEST)
::Just briefly:
:* You can only use the database's wiki in Spanish if you have an account. Your account opens the menu top left with the "preferences" link that allows you to configure FactGrid. All people connected to this project should have accounts.
:* The entire property vocabulary is in the [[FactGrid:Directory of Properties]]. We should get a Spanish version of this structure in the course of this project. What we need here is English/Spanish speakers who go through all the 613 Properties top down. They all need Spanish Labels and descriptions to work. Google translate helps. Maybe this is a job for a student assistant if you have one; maybe it is interesting work for anyone, as it will make you familiar with the vocabulary.
:* The page for collective data modelling on Codices should be this one [[FactGrid:Data model for manuscripts]]. Do not hesitate to edit it, insert questions where you feel you need more clarity, restructure it as you need. All Wiki pages have a version history that allows us to compare the states this page has been in, nothing can get lost, nothing can be destroyed. The FactGrid player who created it is [[User:Marco Heiles]], and he should join you on data modelling debates as he is presently experimenting with some 200 codices of his PhD thesis. You are not bound to his decisions since we are using triples - so you can have any triple you want without affecting anyone on board. But we gain project cohesion if we find a language of triples that works for all. The names of the properties are irrelevant, the important thing is that we use them the same way.
:* Duplicates in biographies: We should avoid all duplicates by setting Wikidata reference numbers wherever a Wikidata-Item exists. The cool thing about the Wikidata reference is that you cannot create two items on FactGrid with the same Wikidata reference number. Question for John May is then: can you map your names against Wikidata?
:* How to run a first input that just creates all the objects, so that we have a list of FactGrid Q-Numbers for the corresponding PhiloBiblon IDs: we do the first input via CSV. This is a first input I prepared with a colleague - not yet in the machine: [https://docs.google.com/spreadsheets/d/1JmJUs5U5cNO3YBsbVStJpUyjKp4lqp3ZBF7_wyuvaxY/edit#gid=652013580 Google Spreadsheet]. It is a complex first input. If you go for a simpler input you should still have:
::* Labels in as many languages as you require (English/ French/ Spanish/ Portuguese (?)/ German)
::* Descriptions at least in English, FactGrid's default language.
::* A P2 (what is it?) statement on all items that flow in (a codex, a human being, a place...)
::* Maybe the PhiloBiblon ID if that helps you to organise the matches.
:* Question on my side: is there a Spanish register of all places from villages to cities? Could we match that with Wikidata and PhiloBiblon places in order to make sure that we create a unified geographical data environment that won't require us to create and to geo-reference ever new places at later stages?
::It might be good to create a Spanish Help Section on FG and I'd recommend that all new users take a brief tour through the English help section. Clarify things where I did not see the difficulties, demand clarification where I failed. New users are the best to do that as only they have the perspective to see the deficits in my/our explanations. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 09:03, 10 June 2021 (CEST)
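A minimal sketch of how such a simple first-input CSV could be prepared, assuming a PhiloBiblon export is available as a list of records. The column names (<code>pb_id</code>, <code>label_en</code>, <code>label_es</code>, <code>description_en</code>, <code>p2</code>), the record values and the placeholder <code>Q_LOCALITY</code> are hypothetical and do not reflect any FactGrid or PhiloBiblon format; the sketch only checks that every row carries the minimum listed above (English label, English description, P2 value) and sets aside rows that do not.

```python
import csv
import io

# Hypothetical export rows; IDs, labels and the P2 value are placeholders.
records = [
    {"pb_id": "PB-EXAMPLE-1", "label_en": "Example Place",
     "label_es": "Lugar de ejemplo", "description_en": "locality in Spain",
     "p2": "Q_LOCALITY"},
    {"pb_id": "PB-EXAMPLE-2", "label_en": "Another Place",
     "label_es": "Otro lugar", "description_en": "",
     "p2": "Q_LOCALITY"},
]

FIELDS = ["pb_id", "label_en", "label_es", "description_en", "p2"]
REQUIRED = ("label_en", "description_en", "p2")


def build_first_input(rows):
    """Return (csv_text, skipped_ids) for a labels/descriptions/P2 input."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    skipped = []
    for row in rows:
        # Keep only rows with an English label, an English description and P2.
        if all(row.get(key) for key in REQUIRED):
            writer.writerow({key: row.get(key, "") for key in FIELDS})
        else:
            skipped.append(row.get("pb_id", ""))
    return buffer.getvalue(), skipped


csv_text, skipped = build_first_input(records)
print(csv_text)
print("Rows needing attention:", skipped)
```

Keeping the PhiloBiblon ID in its own column means the Q-Numbers returned by the import can later be joined back to the source records.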
== Spanish and Portuguese places ==
I have created Spanish and Portuguese localities on the basis of the listings of the statistical offices of both nations as linked in Wikidata:
* [https://w.wiki/3aJo Wikidata search: all items that have a Spanish INE ID]
* [https://w.wiki/3aJq Wikidata search: all items that have a Portuguese INE ID]
The present FactGrid list is shorter on the Spanish side as I deselected double entries on INE IDs. My feeling was here that these (roughly 4000) double Wikidata matches are to some extent the product of competing Wikipedia language projects all based on the same INE ids. The shorter FactGrid list should be easier to match against PhiloBiblon place listings.
This is the Google Spreadsheet from which the FactGrid input was done:
* [https://docs.google.com/spreadsheets/d/1ZvpVata3JflaSKISQlXszm5_RDgjeAvqam1KxExdMAE/edit#gid=0 Google Sheet of all FactGrid entities for Spain and Portugal]
The following two searches give all FactGrid localities for Spain and Portugal just with the respective INE listings and geographic coordinates. One can add external identifiers in further columns - presently available are VIAF, GeoNames, BNE, GND, Getty Identifiers (as far as Wikidata had them):
* [https://tinyurl.com/ygustfry FactGrid Search: all Spanish localities, geo-coordinates, Spanish INE IDs]
* [https://tinyurl.com/yjumz9fy FactGrid Search: all Portuguese localities, geo-coordinates, Portuguese INE IDs]
Both data sets can be improved:
# The Spanish set is lacking some 4000 English descriptions.
# Both sets are lacking descriptions in the other languages that have been labelled: French, German, Spanish and Portuguese. The descriptions are needed to pick the right place in case of name identities; one would love to have them short and easy - like: "locality in Spain, province of XYZ".
# Getty thesaurus and BNE seem to be highly incomplete.
# Both sets still have localities without geographic coordinates.
# FactGrid should already have the various provinces as items but we will need a conclusive input to get the right provinces on all localities.
# We still need an easy to use data model for changing historical territorial affiliations (to be organised with qualifying begin and end statements on [[Property:P297]] statements).
# We might also see a need to create items for monasteries, castles, particular houses in cities (they should get [[Property:P2]]+[[Item:Q16200]] statements and [[Property:P47]] localisations).
So still some work to be done. It would be cool to have a partner in this, a group associated with the BNE or the Getty Thesaurus adopting FactGrid as their Wikibase.
The nice thing is, however, that we are now able to locate (almost) any object or person with a locality reference on the map as (almost) all place names have geographic coordinates on them which can be selected with the respective statements in SPARQL queries. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 09:48, 28 September 2021 (CEST)
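The deselection of double entries on INE IDs described above can be sketched roughly as follows. This is an illustration under assumptions, not the procedure actually used: the field names <code>qid</code> and <code>ine_id</code> and all sample values are hypothetical, and the sketch sets every doubled INE ID aside for manual review instead of deciding which Wikidata item to keep.

```python
from collections import defaultdict

# Hypothetical Wikidata query result: several items may share one INE ID.
rows = [
    {"qid": "Q_A", "ine_id": "INE-0001"},
    {"qid": "Q_B", "ine_id": "INE-0002"},
    {"qid": "Q_C", "ine_id": "INE-0002"},  # double entry on the same INE ID
    {"qid": "Q_D", "ine_id": "INE-0003"},
]


def split_by_ine_id(items):
    """Separate items with a unique INE ID from those sharing one."""
    groups = defaultdict(list)
    for item in items:
        groups[item["ine_id"]].append(item)
    unique = [group[0] for group in groups.values() if len(group) == 1]
    doubles = {ine: group for ine, group in groups.items() if len(group) > 1}
    return unique, doubles


unique, doubles = split_by_ine_id(rows)
print([item["qid"] for item in unique])
print({ine: [item["qid"] for item in group] for ine, group in doubles.items()})
```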
:* I have just added the CSV file of BETA Geography to Google Drive: [https://docs.google.com/spreadsheets/d/1VBSxbAuG6WYxQKQhavROXSqFa9V6ZreFtRbQzsTzdok/edit?usp=sharing BETA - GEOGRAPHY.CSV]
:*BETA geoid ## are in col. A. The contents of the following columns are in the same order as the fields in PhiloBiblon's Data Dictionary for Geography: [https://docs.google.com/spreadsheets/d/10K7_lJFs-x2MYpHUCPEY-EVPzo6xTCBo/edit?usp=sharing&ouid=115024046159273345002&rtpof=true&sd=true GEO Dict]
:* This spread sheet is color-coded: [https://docs.google.com/document/d/1ANUUxnwkcFj0Zk_6zBHYkQOkCvn6168J/edit?usp=sharing&ouid=115024046159273345002&rtpof=true&sd=true Windows PhiloBiblon Data Dictionaries color code] --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 07:04, 6 October 2021 (CEST)
::I have just run a first matching on the BETA - GEOGRAPHY.CSV with a look at all our Spanish places, which (of course) left a lot of blanks. K2-5 are, from our perspective, the same Paris just under different names. We could repeat the process with Portuguese and French places; Vienna is Austrian. We would not get a complete matching without manual work, as we will not match the variants in different languages. Coordinates are extremely rare in the CSV, while we have them on all our places; one would need rounded values to do the matching. The related GeoID is neither GeoNames nor an ID (too many 1002 record doubles).
::Can you establish how many individual localities you actually have in all your databases? How do you make sure that you refer to the same thing in your databases? One might like to do the matching on all of them in one process. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 23:08, 6 October 2021 (CEST)
:::PS. Maybe John May has an idea what he would do if we were going from FactGrid to PhiloBiblon. How would he match the two lists of our Spanish and Portuguese places with coordinates and Identifiers. What would he wish we had to make merging easier? --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 23:11, 6 October 2021 (CEST)
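As a starting point for that discussion, here is a rough sketch of matching on normalised names, narrowed by rounded coordinates where the PhiloBiblon record has them. The field names, IDs and coordinate values are illustrative assumptions. Rounding to one decimal place is an arbitrary choice, and two nearby points can still fall on different sides of a rounding boundary, so unmatched records need manual review in any case.

```python
import unicodedata


def normalise(name):
    """Lower-case a name and strip diacritics so 'Córdoba' equals 'Cordoba'."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


def coord_key(lat, lon, digits=1):
    """Round coordinates so slightly different values compare as equal."""
    return (round(lat, digits), round(lon, digits))


def match_places(pb_places, fg_places, digits=1):
    """Return ({geoid: qid}, [unmatched geoids]); only unambiguous hits match."""
    index = {}
    for fg in fg_places:
        index.setdefault(normalise(fg["label"]), []).append(fg)
    matches, unmatched = {}, []
    for pb in pb_places:
        candidates = index.get(normalise(pb["name"]), [])
        if pb.get("lat") is not None and pb.get("lon") is not None:
            key = coord_key(pb["lat"], pb["lon"], digits)
            candidates = [c for c in candidates
                          if coord_key(c["lat"], c["lon"], digits) == key]
        if len(candidates) == 1:
            matches[pb["geoid"]] = candidates[0]["qid"]
        else:
            unmatched.append(pb["geoid"])
    return matches, unmatched


# Illustrative data only; IDs are placeholders, coordinates approximate.
fg_places = [
    {"qid": "Q_FG_1", "label": "Córdoba", "lat": 37.8882, "lon": -4.7794},
    {"qid": "Q_FG_2", "label": "Córdoba", "lat": -31.42, "lon": -64.18},
]
pb_places = [
    {"geoid": "geo-1", "name": "Cordoba", "lat": 37.88, "lon": -4.78},
    {"geoid": "geo-2", "name": "Cordoba", "lat": None, "lon": None},
]

matches, unmatched = match_places(pb_places, fg_places)
print(matches, unmatched)
```

The second record has no coordinates and two candidates of the same name, so it is left for manual work instead of being guessed.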
== First steps ==
For each data type = entity = PhiloBiblon table (Analytic, Bibliography, Biography, Copies, Geography, Institutions, Libraries, Ms/Ed, Subjects, Uniform Title) the following steps need to be done:
1. Deduplication of records
2. Data clean-up. In the process identify Wikidata/FactGrid Q# for Geography, Biography, Institutions, Libraries, to the extent possible.
3. Definition of mapping between PhiloBiblon fields and "dataclips" (controlled vocabulary) and FactGrid P# and Q# (e.g., PhiloBiblon date of birth = FactGrid P77).
4. identification of PhiloBiblon entities with FactGrid Q entities (e.g., Fernando III, el santo, king of Castile and Leon = BETA bioid 1110 = BITAGAP bioid 2054 = BITECA bioid 6515 = FactGrid Q254531).
5. Import PhiloBiblon CSV records from Windows PhiloBiblon into FactGrid taking account of points 3 and 4.
6. Enrich records with data from external sources, e.g. VIAF.
Some of these steps can be carried out in parallel, i.e., Steps 1-4.
The order in which entities should be imported into FactGrid needs to be decided. It seems clear that Geography should go first, because its data are used in Biography, Institutions, Libraries, Ms/Ed, and Uniform Title.
Institutions and Libraries should come next because the number of records to be imported is relatively small, and many of them will already have Wikidata Q# if not FactGrid Q#.
Biography (Persons) would come next because its records are needed for Analytic, Copies, Ms/Ed, and Uniform Title. There are some 40,000 Biography records in total, the vast majority of which will have neither Wikidata nor FactGrid records.
Uniform Title, Ms/Ed, Copies, and Analytic should follow, in that order.
This is in fact the same order set forth in the original Work Plan in the NEH proposal. It is worth noting that we are already starting work on Geography, which in the Work Plan is scheduled for October.
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 23:56, 13 June 2021 (CEST) (for Josep Maria Formentí)
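Steps 3 and 4 amount to two lookup tables, which could be sketched as below. Only the date-of-birth mapping (P77) and the Fernando III identifiers come from the examples above; the field names, the sample record and the function are hypothetical. Fields without a mapping are reported back, so that the list of missing properties grows as a by-product.

```python
# Step 3: PhiloBiblon field -> FactGrid property (only P77 is from the text).
FIELD_TO_PROPERTY = {
    "date_of_birth": "P77",
}

# Step 4: (bibliography, id type, number) -> FactGrid item, from the example.
ENTITY_CROSSWALK = {
    ("BETA", "bioid", 1110): "Q254531",
    ("BITAGAP", "bioid", 2054): "Q254531",
    ("BITECA", "bioid", 6515): "Q254531",
}


def to_statements(bibliography, id_type, number, record):
    """Translate one record into (qid, property, value) triples.

    qid is None when the entity is not yet identified in FactGrid.
    """
    qid = ENTITY_CROSSWALK.get((bibliography, id_type, number))
    statements, unmapped = [], []
    for field, value in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            unmapped.append(field)  # candidate for the missing-property list
        else:
            statements.append((qid, prop, value))
    return statements, unmapped


# Hypothetical record; the values are placeholders, not PhiloBiblon data.
record = {"date_of_birth": "<date>", "hypothetical_field": "<value>"}
statements, unmapped = to_statements("BETA", "bioid", 1110, record)
print(statements, unmapped)
```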
== Questions for webex sessions ==
1. How do you keep together separate statements that form a data unit?
For example, BETA bioid 1007 shows the following information for the date and place of birth of king Alfonso X:
Toledo 1221-11-23 (DB~e)
Madrid 1221-11-26 (Alvar 2010)
These represent two different opinions.
The source for the first one is the <I>Diccionario Bibliográfico Español</I>.
The source for the second is a 2010 article by Carlos Alvar.
We do not want to have the two place of birth statements together followed by the two date of birth statements, since there is then no way to associate the 1221-11-23 date with Toledo and the 1221-11-26 date with Madrid.
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 06:58, 26 July 2021 (CEST)
::Wikibase allows any number of statements on the same question. Four dates of birth next to each other are no problem at all. We give them with their respective different sources and different notes why this should be the date. If we can rule out a particular date we still take it and downvote it with a note, so that there is no fall back into an error. If we know that all dates are right (e.g. different successive numbers of population counts) we can prefer the latest date with the wish that only they appear in searches. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:19, 26 July 2021 (CEST)
:: For discussion purposes a very short "Alfonso X" in FactGrid without downvotes nor preferences at your service: https://database.factgrid.de/wiki/Item:Q266835 You can have a look at the corresponding wikidata-sheet (interlink at the bottom of the page) how they solve the problem there (its the same software).--[[User:Martin Gollasch|Martin Gollasch]] ([[User talk:Martin Gollasch|talk]]) 09:41, 26 July 2021 (CEST)
:: The problem is not the multiple dates of birth but rather associating a specific date of birth with a specific place of birth rather than adding all of the date of birth statements followed by all the place of birth statements. What you show in the linked FactGrid Q266835 is precisely what we don't want. It breaks the association between date and place, leaving it up to the reader to guess that the first date is associated with the first place and the second with the second place. This kind of association is fundamental to the structure of PhiloBiblon.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 06:47, 30 July 2021 (CEST)
:::Well, you could link both statements, shifting either of the two linked statements into the qualifier (and you can have as many qualifiers as you want). But you will actually like the option to separate these statements since you will not always have the linked statement. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 07:17, 30 July 2021 (CEST)
:One thing is database content, the other thing is, what does a reader/user see in the browser later on. I made some changes (gave preference for Carlos Alvar 2010) and we will see tomorrow at the latest how the database content is shown in the beta version of the FactGrid Viewer as an example. Right now there are two lines with equal values. You find the available Browsers/Viewers on the left of your screen. I guess, this viewer does not show the preference... Actually it should be no problem, to find a viewer, which is suitable for you. The actual question is: Do you have a database problem or a viewer problem?--[[User:Martin Gollasch|Martin Gollasch]] ([[User talk:Martin Gollasch|talk]]) 10:11, 30 July 2021 (CEST)
:: I regret the delay in responding. Martin, you have solved the problem in my opinion. This is not simply a browser issue; it is in fact a database problem, how to maintain the relationship among the various elements of a data structure, in this case place of birth / date of birth/ source. Here there are only two elements in the data structure. We have cases in which there are six or eight related elements that need to be kept together, for example the former owner of a manuscript, the date and place he acquired it, and the price he paid, and the source of the information. --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 07:04, 29 August 2021 (CEST)
::: One simple solution is to add a position in sequence (P499) as a qualifier to keep track of the relationship between the elements that need to be kept together. This position in sequence can be used in a query to easily find the related elements. [[User:Bruno Belhoste|Bruno Belhoste]] ([[User talk:Bruno Belhoste|talk]]) 10:52, 31 August 2021 (CEST)
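A small sketch of how the position-in-sequence idea could work on the Alfonso X example, using plain Python structures instead of a live query. P499 (position in sequence) and P77 (date of birth) are taken from this page; the place-of-birth identifier is a hypothetical placeholder, and the dictionary layout is an assumption for illustration, not Wikibase's data format.

```python
from collections import defaultdict

PLACE_OF_BIRTH = "P_PLACE_OF_BIRTH"  # hypothetical placeholder, not a real P#
DATE_OF_BIRTH = "P77"
POSITION_IN_SEQUENCE = "P499"

# Four separate statements; the shared P499 value ties each pair together.
statements = [
    {"property": PLACE_OF_BIRTH, "value": "Toledo",
     "qualifiers": {POSITION_IN_SEQUENCE: 1}, "source": "DB~e"},
    {"property": DATE_OF_BIRTH, "value": "1221-11-23",
     "qualifiers": {POSITION_IN_SEQUENCE: 1}, "source": "DB~e"},
    {"property": PLACE_OF_BIRTH, "value": "Madrid",
     "qualifiers": {POSITION_IN_SEQUENCE: 2}, "source": "Alvar 2010"},
    {"property": DATE_OF_BIRTH, "value": "1221-11-26",
     "qualifiers": {POSITION_IN_SEQUENCE: 2}, "source": "Alvar 2010"},
]


def group_by_sequence(stmts):
    """Rebuild data units by collecting statements with the same P499 value."""
    units = defaultdict(dict)
    for stmt in stmts:
        position = stmt["qualifiers"].get(POSITION_IN_SEQUENCE)
        units[position][stmt["property"]] = stmt["value"]
        units[position]["source"] = stmt["source"]
    return dict(units)


units = group_by_sequence(statements)
for position in sorted(units):
    print(position, units[position])
```

The same grouping extends to larger units, such as former owner, date and place of acquisition, price and source, as long as every statement in a unit carries the same sequence number.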
:Charles: The nice thing about starting a project in open data is that many things can be decided while collaborative work is in progress... In a project such as yours you might have a look at how Germania Sacra is organizing their bishoprics, and if you find a better way, others might even learn from you. Same thing with manuscripts of all kinds. I personally deal with the source type Album Amicorum/Liber amicitiae to get mobility profiles and 18th-century social group profiles. They can be described as a bunch of manuscripts. There is a tendency among German libraries to describe them sheet by sheet. In doing so, you will lose all the information which results from their connection in the album. That certainly needs discussion later on, since electronic data can be reorganized more easily... So you have to keep your data reorganizable. In the field of alba amicorum we mostly have the first/original holder of the book, but it certainly could be of interest to see all the owners en route to the present archive... So I am curious to see how others solve common problems. I think you should consider creating model structures of typical data sets in FactGrid soon. This will make it easier for other users here to enter a discussion with you and your team members.--[[User:Martin Gollasch|Martin Gollasch]] ([[User talk:Martin Gollasch|talk]]) 11:15, 29 August 2021 (CEST)
That is precisely what we want to do. Olaf has suggested that we start with place names, because they are needed to establish relationships with people and with organizations. However, we are still struggling with the correlation between our ontology and FactGrid's properties. We do not want to create a Property that duplicates one that already exists in FactGrid. We've been looking at Wikidata to find models for the kinds of entities that we need and the Properties they use, e.g., WikiProject Books https://www.wikidata.org/wiki/Wikidata:WikiProject_Books. We understand that WikiData Properties and FactGrid Properties have different P#, but looking at the former will help us with the latter--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 08:01, 30 August 2021 (CEST)
:Sorry Charles, that's not what I meant. My suggestion is to take one of your old manuscripts, perhaps not the easiest one, and describe it in FactGrid with Items for all the persons, places and institutions you need for a proper and state-of-the-art description. Sort of a case study... The mass import of places first is certainly a good decision, but I would like to propose that you create one or two more or less complete models of your intended datasets in parallel with the planned mass imports of data. The mass import is totally independent of my suggestion. In the German Wikisource [https://de.wikisource.org/wiki/P%C3%A4pstin_Johanna Pope Joan] is often referred to as a dataset which includes nearly every problem you might have in modeling...--[[User:Martin Gollasch|Martin Gollasch]] ([[User talk:Martin Gollasch|talk]]) 09:25, 30 August 2021 (CEST)
== Model of Book ==
I have started to create manually a record for my own <I>Medieval Manuscripts in the Library of the Hispanic Society of America</I> ((Q385363)). The three parts of the header are Label: Title (Charles B. Faulhaber, Medieval Manuscripts in the Library of the Hispanic Society of America (New York, Hispanic Society of America, 1983)); Description (catalog of medieval manuscripts); Alias (BETA bibid 1862).
I assumed that, since "book" is such a well-known data type, I would be able to find all of the properties that I needed when I started to create statements.


I was able to enter (with Martin Gollasch's help): Instance of / Author / Type of Publication / Date of Publication / Place of Publication / OCLC number / Topic / Publisher
 
At that point I discovered that in order to enter the Publisher you must first create a Q# for the Publisher. I did that.
 
::Yes, the publisher had to be created first, so that you could link to it. This is one of the miseries of the database that does not fill string fields but wants to connect informed items. My usual recommendation is to create rich landscapes of data as a background to refer to. I love consistent mass inputs of background information to refer to later.
 
Then I tried to add the full title of the book: Medieval Manuscripts in the Library of the Hispanic Society of America. Religious, Legal, Scientific, Historical, and Literary Manuscripts. I found Title in the Property field "to state the title of a document," but I could not get it to link to the Property P11 by selecting it from the list. I had to go to the list of properties, find "Title," copy "P11" and paste it into the field.
 
::Properties are not in the regular search (top right), unless you switch to advanced search (which I put into the menu for that purpose). To start the title statement it would have been enough to just click "add statements" and type: "Tit..." the auto complete would have suggested "Title" (and P11) as the property you are looking for. Best thing to do: anticipate all the words people might use in order to make the specific statement and have these words in Aliases. Standard option, however: go to the directory of properties, where you will also see neighbouring options.
 
::That's precisely the problem. I typed "Tit..." and got "Title", but then FactGrid would not link to it.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 06:44, 8 November 2021 (CET)
 
::Working with student assistants I provided a list of statements (with the Q-Numbers) which they would make one by one.
 
::Yet it seems you found at least one way.
 
I added "listed in" (P124) "BETA / Bibliografía Española de Textos Antiguos (Q254471)". However, when I searched for that in the "string" field, I couldn't find it. Nor could I find it in the general FactGrid search box. I had to copy it from the FactGrid:PhiloBiblon page: https://database.factgrid.de/wiki/FactGrid:PhiloBiblon. Where does one indicate the number in the catalog?
 
::Again, it would have been enough to simply start typing "list...." and the machine would have suggested the property. (Or use the Advanced search and limit it to Properties.) It is good design that the search field top right does not give you properties. We have Items and Properties of identical names and it is the items that should accumulate knowledge, they are the lemmata in this system. Properties should only occur when you open a statement, and here the auto complete will do the job (teach it with aliases of your choice).
 
I next wanted to enter the number of volumes in the edition. I can find no property in the general property list. I see P105 "volume," as in "volume 1," but not volumes.
 
::I would go for the P613 property "number of integrated object" and state Volume as the unit, as we have so many variants here. Some publications want to consist of parts, others of numbers, issues and so on. We better create items for these than properties since the property would demand a user who knows that this publication consists of volumes and not of parts.
 
:::P105 Volume is defined as “qualifier to refer to a specific volume of a work you have referenced in the main statement” However, this is not the correct Property to indicate the number of volumes in a publication. I don't know what you mean that it would be better to create items for these rather than properties.
:: How does one create an item?
 
:::It seems to me that you either have to let the Qualifier field to P613 either be a string field or you have to vastly increase the number of Properties that it allows. E.g., it ought to allow for "Volume(s)" just as P107 allows Page(s). Others: fascicules, parts, issues, numbers etc. as you mention above.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 06:44, 8 November 2021 (CET)
 
I want to change "author" to "compiler," since I compiled the catalog, I didn't write it. I don't find a property for that either...
 
::And here you just gave me the better word for our P317 property.
 
:::I tried to edit the Author statement to change it to "P317 Compiled by". However, every time I put my cursor in the Property field the program sent me to "Author P11." I finally deleted the Author statement and added the "Compiled by" Statement. Is there a way to change the Property in a statement without changing the rest of it?
 
:::I tried to edit the Number of pages / leaves / sheets (P107) to add as a qualifier Page(s) (P54). I wanted to add the total number of pages in both volumes of the book: 1: [i]l + 664; 2: [i]-xiv + 245. However, it wouldn't let me save it. It didn't matter whether I put the number after the P107 Property or after the Qualifier P54. What is the constraint?
 
I tried to delete the ISBN and ISBN-13 Properties since this book has no ISBN number, but when I saved it I got an error message:
"Could not remove due to an error.
:::Edit conflict. Could not patch the current revision.
:::Edit conflict."
 
In fact I could save none of the changes I made in about an hour's work even though I was logged in.
 
Is it possible to duplicate a record? For example, two volumes of my catalog of MSS in the Hispanic Society were devoted to documents. They were published in 1993 with an ISBN number and a different subtitle. It would be useful to duplicate the record and then just change the statements that differ.
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 06:44, 8 November 2021 (CET)
 
In PhiloBiblon we have some 3000 "enumerated data types," some of which are entities rather than properties.
 
Should I make a list of properties that are missing (at least compared to PhiloBiblon) before we attempt mass input?
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 06:40, 5 November 2021 (CET)
 
::Yes, the list would be useful. I assume that most of these will become Items to refer to in qualifiers or as units but we should have them all. FactGrid is a fast learner. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:36, 5 November 2021 (CET)
 
=== Practical session of Wiki/Wikibase editing ===


== 1st Web-Meeting, 18 May 2021 ==
Dear Charles. We need a practical session of Wikibase editing.
Here are the notes from the meeting 17 May 2021:


Philobiblon Meeting Notes
Yes! As you can see from the comments below I am very far from understanding how FactGrid works.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 07:51, 9 November 2021 (CET)


Dear Colleagues,
There are two ways to get the last note: Edit conflict. The real one is that two people edit simultaneously on the same section. In this case the one who saves first gets the edit and the other gets a chance to see the new version with a look at the text he wanted to write. Another way to get the Error is by pressing "save" twice. In this case the machine is already processing your edit while you want to impose a new version. Looking at the version history I assume your edit conflict was of the second sort - you trying to intervene on an edit you have just sent off. You can ignore such error notes, or just let the machine save (your browser tells you that it is processing your request on the tab).


Here's what we have in the Workplan for this summer:
I do not see how exactly you deleted an hour's work. The [https://database.factgrid.de/w/index.php?title=Item:Q385363&action=history version history] shows that only you were editing on this page.
::I worked on the record ((Q385363)) for an hour and made numerous changes. I was not able to save any of them.
I am currently trying to change the value "909" in the statement with P107 (Number of pages/ leaves/ sheets) to "1: [i] l + 664; 2: [i]-xiv + 245" in order to get the breakdown of the pages in vol. 1 and vol. 2.
When I make that change in the Property field, the Qualifier "Pages" disappears and I have to add it back in, which brings up a blank field for the value. If I add "1: [i] l + 664; 2: [i]-xiv + 245" in that field I cannot save the record with the same value in both the Property and the Qualifier fields. If I delete the value from the Property field and keep it in the Qualifier field, I still cannot save the record.
The only way I can save the record is by retaining "909" as the value in the Property field and adding "1: [i] l + 664; 2: [i]-xiv + 245" in the "page(s)" Qualifier field.
This creates a puzzling statement for anyone who knows anything about descriptive bibliography.


Objectives:
* Review of Wikibase software: standards, protocols, data formats, and implementations; comparison with PhiloBiblon schema and data dictionaries and current and desired functionalities.
* Scenarios to describe how contributors and users will interact with PhiloBiblon.
* Functional specifications for the features required for those scenarios.
* Identify a small set of particularly rich and related records (number TBD) from each of PhiloBiblon’s four databases (BETA, BIPA, BITAGAP, BITECA) and ten tables to serve as test cases.
* Ongoing data clean-up in legacy PhiloBiblon
Tasks:
* T1. Review Wikibase software: standards, protocols, data formats, implementations (PI, Anderson, Formentí, Simons)
* T2. Develop user scenarios based on PhiloBiblon data and current and desired functionalities. (PI, academic staff, Adv. Board: Dagenais, Gullo)
* T3. Create functional specification for features needed to re-create these functionalities. (PI, Anderson, Formentí)
* T4. Identify set of test target records (PI, academic staff)
* T5. Ongoing clean-up of legacy data (PI, academic staff)
I see that you have now added this under the "Physical description" property. It would never have occurred to me to do so.
Again, I think you need to specify what those numbers ("1: [i] l + 664; 2: [i]-xiv + 245") refer to: pages, leaves, folios?--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 07:51, 9 November 2021 (CET)


5/18/21 Meeting Notes
I am now trying to add a record for the Berkeley Library. In order to do that I need to amplify the record for the parent institution, the University of California Berkeley (Q369705). I want to add the exact address. The record currently shows "Place of home address (P83)." However, P83 seems to be limited to persons, according to the Description and Alias:
"use for geographic places like towns and villages, specify address with P208
<I>Resident of</I> Place of residence"


Attendance:
* Adam Anderson (zoom host), data analyst for project, Berkeley Lecturer
* Charles Faulhaber, Project PI, and (new) director of Bancroft Library
* Josep Formenti (software engineer in Barcelona, also knows NLP and webdev)
* Olaf Simons, book historian at Uni Erfurt, Gotha. WikiMedia platform & FactGrid
* Randal Brandt, head of cataloging at UCB, rare books
* Xavier Agenjo (director of projects for Fundación Ignacio Larramendi in Madrid, creator of more than 40 digital libraries)
* Daniel Gullo (dgullo@csbsju.edu), director of collections, cataloging, creating databases of libraries, modern digital collections, controlled vocabulary for underrepresented religious traditions in the middle ages (vHMML online database, NEH project director)
* Óscar Perea Rodriguez, lecturer at USF, working with Charles on this since 2002
* Jason Kovari (cataloging rare books), Cornell
* Robert Sanderson, director of digital collections at Yale
* Cliff Lynch, director of a small nonprofit in DC, the Coalition for Networked Information; in the School of Information at UC Berkeley; worked on the predecessor of the CDL
* John May (phone): software dev., information management systems, designer of PhiloBiblon (software)
* John Dagenais: Professor of Spanish at UCLA, degree in library science, user of PhiloBiblon
Is it also to be used for organizations?


PhiloBiblon Project: Goes back to 1975, as a spin-off of the Dictionary of the Old Spanish Language project at U of Wisconsin-Madison, a Spanish version of the OED. Based on contemporary uses of the language. To do this they created an in-house data base (1975 onward): Bibliography of Old Spanish Texts (BOOST), lineal ancestor of PhiloBiblon. Constant technology change. Put the collection on CD-ROM discs by 1992, with digital images + text. 1994 with the internet (Netscape). Charles was teaching a course on DH computing (including gopher, OCR, etc). One of his students introduced him to the World Wide Web and he said that it wasn’t going to be important… Since then they’ve been working on keeping up with the different versions (1.0, 2.0, 3.0 = Linked Open Data with RDF). Currently exports data from Windows PhiloBiblon into XML files to upload to the server at Berkeley, where XTF (eXtensible Text Framework), run from CDL, takes large XML files from 9 PhiloBiblon tables (uniform-title, manuscripts/editions, persons, etc.) and parses each of these files into individual records for querying. Objective: to get us aligned with the Wikibase / Wiki-foundation, to piggyback on their data and technology moving forward.
When I use P208 for the exact address (in this case just the postal code), I make a new statement and try to insert "94720-6000" as the value, but FactGrid apparently expects a Q#, not a string. This seems to imply that I have to create a Q# for the postal code "94720-6000".
I note that this is "data type item" in FactGrid, but the corresponding data type in Wikidata is "monolingual text".
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 07:51, 9 November 2021 (CET)


Olaf Simons: FactGrid (works on Masonic institutions) Charles found Olaf through commentary in their blogpost. Wikibase is the software behind the Wikidata project. It’s developed between 2012-ongoing. Used by national libraries worldwide to control data: creating triple-base statements--2 entities linked by a relation--(annotate with statements) and further develop metadata. You can see who is editing it by the minute. Each entity has a Q-number, e.g., for each person (gender, address, field of research, etc.) you can add as many statements as you want, and you can qualify these statements. You can add references to the entry (as many as you want). Used as a source for anything imaginable, we collect statements that become entries. Using SPARQL you can query this: e.g. Illuminati: members list Q: date of birth? and it appears as a column for each member. SPARQL no longer connects text input fields. It normally connects items to items. E.g. a person is a member of a lodge, which is another item which contains information with statements. Each item is a database object. We will have to translate input field information into objects, and for that to work, each object needs a statement. First Point: We don’t deal with books, instead you have people, places and institutions connected to these books. All of these need statements to work. Produce a network of items of relations. Consider the types of objects (e.g. Geographic names, Proper Nouns, etc.). We’ll need to get an idea of the number of items you’ll be creating beforehand. Should create your own team and get accounts for those members, along with managers of the accounts to run the team independently. Charles: For Spanish texts (6K texts) 8K individuals PhiloBiblon “Data clips” correspond to P properties in Wikibase.
As to '''items versus properties''':
:: This is a question of terminology. Is an "item" always a Q#?
I note that in the P# records, the data type is given as "Item," e.g.:


Creating properties is not the issue. You create properties as needed and link them on a text-by-text basis.
Address (P208)
full street address where subject or organization is located. Include building number, city/locality, post code, but not country
In more languages
Data type
Item


The real problem is to understand the types of objects you have to create, and how they’re interconnected. They need to be created before you can link them.


E.g. often we start with places, and move from there. First is to create the objects, you then interconnect it.
We can have properties for anything, but it has its disadvantages. If you create three properties, one for "number of pages", one for "number of leaves (German Blatt)" and one for number of (printed) sheets, then you must - when running a SPARQL query - know which sort of statement was made in each particular case. If library x is giving page numbers and library y gives you sheets, then you had better ask for all three sorts of statements to find out.


Usually projects come to us with a spreadsheet…


Foreseeable Problems: The entire project is multi-lingual (using P-numbers, Q-numbers (Deutsche Nationalbibliothek GND)), which allow for different labels on these items and properties. You should use it in the language of your sources. The database currently accommodates French, English and German (the main users of the database). They put their labels in the database. You will need to make labels in Spanish and Portuguese. Programming issues: the software is not designed to create presentable projects at the moment. This is ongoing work: 1) the FactGrid viewer, which creates pages based on information from the database, created on the fly. You will want something similar, for your own data. Bruno Belhoste is the creator, you can contact him. This is where Wikibase is currently underdeveloped. A ‘Knowledge’ tool that is in development which can also show the metadata for each entity. Otherwise you get the SPARQL query functionality.
The better option in this case is a Property for the number, whatever it is - pages, leaves or sheets - with a unit behind it. You can then be sure that you ask the right question, as there is only one question to ask. The exact answer comes with the units or qualifiers.


Robert Sanderson to Everyone (12:32 PM) We used XTF at Getty for our archives: https://xtf.cdlib.org/ For what it’s worth, at Yale we’re doing this across the libraries, archives and museums. The current data is 45 million such entities, spanning 2.5 billion triples uses the same process at Yale. The difficulty is understanding the modeling and doing the transformation. The database side is essentially free. We have 15M MARC records, which get turned into 45M entities, people, places, concepts, objects, images, digital things. (8 classes) We’re not using Wikibase, but rather the main library standards by themselves The difference between the data modeling paradigms will necessitate the creation of new identifiers But wikibase is really good at managing external, typed identifiers :) We should write down the things you want to refer to.. e.g. ‘ownership’ (provenance in the museum and library world) if that’s something you want to refer independently to the object. From there you can work out the CSV tables and get them into the system Not to be a broken record, but the modeling also determines the possibilities for the queries. If you have (for example) place-written and date-written, on a text that can have multiple authors writing at different times and at different places, then you need some way to connect author / place and time *in the model*
I tried this. See above--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 07:51, 9 November 2021 (CET)


Daniel Gullo: Biblissima is using this model.: https://data.biblissima.fr/w/Accueil question about Q-numbers… Warning: scholars use the MS-ids (manid), text-ids (texid), and other IDs established in the PhiloBiblon system.
Creating items is easy: use the [[Special:NewItem|New Item]] link (on the menu) or, preferably, a batch that creates an item in all the languages you need with all the statements that you are likely to repeat on every new item in your present course of work. You find some batch fragments in the menu. I will help you to create special batches for your team - these things are useful as they speed up editing in more than one language. We should have a course on this. (I also created a help section - see: [[Help:How do I feed data into FactGrid?]]) --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 09:21, 8 November 2021 (CET)


Olaf: You will create external IDs (vital for places, e.g.) for any item in PhiloBiblon, and they will be linked to the IDs from PhiloBiblon. The Q-numbers are merged based on double entries (there’s movement, they are not deleted, but merged). There are ways of creating more key-value pairs without deleting IDs.
::Dear Charles, I am the one who shifted the "1: [i] l + 664; 2: [i]-xiv + 245" to physical description as that is where we have these remarks in all the other data sets. I regret I cannot see where the editing does not work, and where you get stuck for hours. We should have a practical session in which I just look over your shoulder to see what you are doing. Share screen is easy these days. Odd string inputs... I sometimes received error codes, but rarely. Usually I make a mistake about blanks in the beginning or end. But my experience with the site is that it works incredibly smoothly. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:36, 9 November 2021 (CET)


This is similar to the database we’re developing… When you’re migrating your data, you want to think of it hierarchically with a controlled vocabulary: 1) Continents, 2) Geographic names, 3) institutions, 4) other entities.
::It is, in any case, not possible to put "1: [i] l + 664; 2: [i]-xiv + 245" on the total number of pages property since the particular property is not a string property but a property calling for a number, a quantity, as defined in the "Property type" (which you have to pick when creating a property). The cool thing with quantity properties is that you can sort them by number, if ever you want to know the biggest of all your codices. They are designed for statistics and computations.


Jason Kovari Is the plan to assess what users really want? (or move forward with what you have?) Is part of the workplan assessing the work that you initially have? Jason Kovari (he/him) to Everyone (12:50 PM) The word I was forgetting earlier was Affinity. There is a Wikidata Affinity Group as part of the LD4 Community: https://www.wikidata.org/wiki/Wikidata:WikiProject_LD4_Wikidata_Affinity_Group
::Address P208 wants an item to link to. The good thing about the item is that it will have geo-coordinates, so that you can eventually ask the machine to show you all [https://database.factgrid.de/query/#%23defaultView%3AMap%0ASELECT%20%3FKaufmann%20%3FKaufmannLabel%20%3FAdresse%20%3FAdresseLabel%20%3FGeokoordinaten%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%3FKaufmann%20wdt%3AP165%20wd%3AQ36715.%0A%20%20%3FKaufmann%20wdt%3AP83%20wd%3AQ10279.%0A%20%20OPTIONAL%20%7B%0A%20%20%20%20%3FKaufmann%20wdt%3AP208%20%3FAdresse.%0A%20%20%20%20OPTIONAL%20%7B%20%3FAdresse%20wdt%3AP48%20%3FGeokoordinaten.%20%7D%0A%20%20%7D%0A%7D merchants of Gotha] on a map. The search will ask for all "Merchants" of Gotha, it will ask for their addresses and it will come with a command to go into the address items to get the coordinates from there. The advantage here is that you do not have to repeat coordinates all the time. They are on the items we are linking to. If you do not know how a property is working, go to it, then ask "what links here" and look at the Items that are used. This is Bruno Belhoste's present project "[https://tinyurl.com/yf4fvrf7 Paris to download]"; you can take a look at the SPARQL code to see how it is done. It's a bright thing to link to informed items. We do not want our users to put geo coordinates on every statement. We want the machine to know what is meant by "[[Item:Q10441|Paris]]" or by "[[Item:Q317740|Paris, rue de l'Université, no. 14]]".
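
The "merchants of Gotha" link above carries its SPARQL query percent-encoded in the URL fragment. As a small sketch (the variable names are only illustrative, and the encoded string is copied from that link), it can be decoded with Python's standard library to read the query in plain text:

```python
from urllib.parse import unquote

# Percent-encoded query copied from the "merchants of Gotha" link,
# split into pieces only for readability.
encoded_query = (
    "%23defaultView%3AMap%0A"
    "SELECT%20%3FKaufmann%20%3FKaufmannLabel%20%3FAdresse%20%3FAdresseLabel%20%3FGeokoordinaten%20WHERE%20%7B%0A"
    "%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A"
    "%20%20%3FKaufmann%20wdt%3AP165%20wd%3AQ36715.%0A"
    "%20%20%3FKaufmann%20wdt%3AP83%20wd%3AQ10279.%0A"
    "%20%20OPTIONAL%20%7B%0A"
    "%20%20%20%20%3FKaufmann%20wdt%3AP208%20%3FAdresse.%0A"
    "%20%20%20%20OPTIONAL%20%7B%20%3FAdresse%20wdt%3AP48%20%3FGeokoordinaten.%20%7D%0A"
    "%20%20%7D%0A"
    "%7D"
)

# unquote() turns %20 into spaces, %3F into "?", %0A into newlines, and so on.
query = unquote(encoded_query)
print(query)
```

In the decoded text the inner OPTIONAL block is the step described above: the query follows P208 from each person to the address item and reads P48 there, so the coordinates are stored once on the address item and not on every statement.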


Charles: There’s an orthogonal relationship between the ‘library world’ and what we’re doing. PhiloBiblon has evolved over 35 years in response to the needs of the user community as represented primarily by the members of the four PhiloBiblon teams (for Spanish, Portuguese/Galician-Portuguese, Catalan, Golden Age poetry). To what extent do we need to accommodate what the library world has been doing? We’re not reimagining PhiloBiblon, but rather trying to move it into the web-based wiki world. We want to incorporate external authority files, such as Virtual International Authority File (VIAF), Getty Thesauri, Denis Muzerelle, Vocabulaire codicologique
::We should really have that session in which I see your editing and in which I can tell you the advantages of doing it differently. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:51, 9 November 2021 (CET)
Rob and Olaf: Suggest starting with a small number of records (e.g. 2 or 3 codices based on your needs). This has been done. Choose complex items (10 of each sort) and go from there. E.g. items for each: Books (with Q-numbers for all the books) Toponyms Institutions etc.


What’s the process of getting PhiloBiblon into Wikibase? Use quick statements (or API) / spreadsheet into the machine. It takes a few hours to feed the data into the machine, but it’s simple. The benefit of a CSV import is you can use OpenRefine to augment the identifiers and reconcile data. How many items on the largest spreadsheet?
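
A minimal sketch of the spreadsheet-to-QuickStatements step mentioned in these notes, assuming the tab-separated QuickStatements V1 command format (CREATE, then LAST lines for labels and statements). The CSV column names and the sample row are hypothetical and only illustrate the shape of the data; they are not taken from PhiloBiblon:

```python
import csv
import io

# Hypothetical sample spreadsheet: one row per new item.
# P208 (Address) and Q10441 (Paris) are used purely as placeholders.
SAMPLE_CSV = (
    "label_en,label_es,property,value\n"
    "Example library,Biblioteca de ejemplo,P208,Q10441\n"
)


def rows_to_quickstatements(csv_text):
    """Turn each CSV row into QuickStatements V1 commands (tab-separated)."""
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        commands.append("CREATE")  # start a new item
        commands.append(f'LAST\tLen\t"{row["label_en"]}"')  # English label
        commands.append(f'LAST\tLes\t"{row["label_es"]}"')  # Spanish label
        commands.append(f'LAST\t{row["property"]}\t{row["value"]}')  # one item-valued statement
    return commands


for line in rows_to_quickstatements(SAMPLE_CSV):
    print(line)
```

The printed lines could then be pasted into a QuickStatements batch; reconciling the identifiers beforehand, for instance in OpenRefine, would remain a separate step.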
== New Properties ==


John May: (project janitor, coder) to perform rectifications and homogenization in the Windows database. It’s straightforward to export in CSV / spreadsheet. They are exceptionally large. The problem might be that the creation of Q and P numbers can be done programmatically, but that ontology will need to be made clear. Design desiderata: to accommodate ambiguity and uncertainty. So PhiloBiblon is made to be flexible and customizable. He’ll await the orders from Olaf as to how he wants the data. The PhiloBiblon DBMS is an n-dimensional dynamic data model (which can be normalized as needed) ca. 60 MB for each bibliography (spreadsheet); no fixed field lengths; any structure / sub-sub-structure can contain structure - it’s all based on arrays… 6K texts 14K witnesses John can write the code and tag the texts with P / Q numbers that would correspond to the Wikibase PhiloBiblon is not a standard data model… so it will take some time to get the CSV…
As we discussed, we still need some new P# in order to facilitate modeling of PhiloBiblon entities:
# New P# for external identifiers to match existing P156 (online presence) and P634 (DOI): URN = Uniform Resource Name (Q399023) and URI = Uniform Resource Identifier (Q399000).
# New P# for ISSN = International Standard Serial Number (Q394868) and ISSNe (International Standard Serial Number, electronic) = eISSN (Q46339674). Cfr. ISBN-10 (P605)
# New P# "Honorific" for form of address. Takes Q#, e.g., Fr. = Friar, Lord, Señor.  Cfr. Wikidata honorific (Q1326966), Subclass (P3) of Epithet. Cfr. [https://pb.lib.berkeley.edu/xtf/servlet/org.cdlib.xtf.dynaXML.DynaXML?source=BETA/Display/1259BETA.Person.xml&style=Person.xsl%0A%0A%20%0A%20%0A%20&gobk=http%3A%2F%2Fpb.lib.berkeley.edu%2Fxtf%2Fservlet%2Forg.cdlib.xtf.crossQuery.CrossQuery%3Frmode%3Dphilobeta%26everyone%3Dfr%26name%3D%26title%3D%26daterange%3D%26assocplace%3D%26affiliation%3D%26subject%3D%26text-join%3Dand%26browseout%3Dperson%26sort%3Dauthor BETA bioid 1259]
# New P# "Religious Order"; cfr. Wikidata P611 (Religious order)
# New P# to indicate relation between two physical copies of a book. Qualified by P708 Kind of relation. Indicates the relationship between a copy of a print publication and another copy not necessarily of the same publication. I.e. two unrelated books could be bound together as a “bound with”
# New P# "Script style" (cf. Wikidata P9302)
# New P# "Typeface/font" (cf. Wikidata P2739)
# New P# "Watermark" or "Watermark Type." Would be instance of FactGrid Properties for Core Bibliographical Data, Cfr. Wikidata Watermark (Q43065). See Watermarks in [https://pb.lib.berkeley.edu/xtf/servlet/org.cdlib.xtf.dynaXML.DynaXML?source=BETA/Display/2098BETA.MsEd.xml&style=MsEd.xsl%0A%0A%20%0A%20%0A%20&gobk=http%3A%2F%2Fpb.lib.berkeley.edu%2Fxtf%2Fservlet%2Forg.cdlib.xtf.crossQuery.CrossQuery%3Frmode%3Dphilobeta%26mstype%3DM%26everyone%3D%26city%3D%26library%3D%26shelfmark%3D7867%26daterange%3D%26placeofprod%3D%26scribe%3D%26publisher%3D%26prevowner%3D%26assocname%3D%26subject%3D%26text-join%3Dand%26browseout%3Dmsed%26sort%3Dtitle BETA manid 2098].--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 00:08, 25 May 2022 (CEST)


How is PhiloBiblon used currently? Charles: the primary use of PhiloBiblon currently:
<hr>


For editors of texts: How many Mss / printed editions are there of a given text? Descriptions of MSS There are lots of other potential queries based on data in PhiloBiblon, but it is currently difficult to extract it, e.g.:
Hesitated with number 5 and do not know whether No. 1 is already doing the job. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 10:37, 25 May 2022 (CEST)


Codicology: How many Mss have gatherings of 12 leaves, and where do these Mss come from? Where and when are particular catchword or signature types used in gatherings?
# I created the equivalent to Wikidata's [https://www.wikidata.org/wiki/Property:P7470 URN formatter] - hope that was the thing you need. [[Property:P744]]
# ISSN [[Property:P743]]
# Honorific prefix [[Property:P745]]
# Religious Order [[Property:P746]]
# ...is a tricky one. I saw volumes with 50 dissertations and tracts - we would end in a mess if we always stated the neighbours on all the neighbours. It would be better to have a property that just links upwards to the collective volume, something like "bound in collective volume" (German is nicer: "Beiband in") - then however we could just as well state the includes [[Property:P9]] on the collective volume. Consider with me what's best.
# Script style [[Property:P747]]
# Typeface [[Property:P748]]
# Watermark [[Property:P749]]


Prosopography: What individuals were active in a given location in a given time period
: New properties to serve as Head of Column both for manuscripts as well as print publications. These would be parallel to Page(s) (P54) and Folio (P100)
# [[Property:P755]] for Column(s). Cfr. Column(s) (Q394315) and Wikidata column (P3903)
# [[Property:P756]] for Fly leaf/leaves. Cfr. Fly leaf / leaves (Q395036).
# [[Property:P757]] for Plate(s), i.e., illustrations printed separately and inserted into a book. cfr. Plate (Q394145).
--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 02:00, 31 May 2022 (CEST)
: Height (P59) and Width (P60) take data type Quantity. However, it appears that Quantity only allows integers. We have cases where, e.g., the width is indicated "210 / 215", i.e., it varies between 210 and 215 mm. How do we handle that? The option of putting in two separate values, 210 and 215, and qualifying them in some way (e.g. "greater than," "less than") doesn't capture the reality.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 23:04, 7 June 2022 (CEST)
::The integer solution is generally preferable, because the database actually recognises these numbers as numbers - ready for computation and for correct listing. String input means that 1100 will come before 3 (you will need 003 in that case to get the correct listing). So better go for numbers rather than strings and state ranges with maximum and minimum qualifiers.
::Things are slightly different with the properties I have just created, they are references, often with input such as (page) 1-56. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 08:30, 8 June 2022 (CEST)
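
The sorting point can be illustrated with a few lines of Python (the sample values are made up; only 1100 and 3 come from the remark above). Strings are compared character by character, numbers by value:

```python
# Made-up sample values; stored once as strings, once as integers.
values_as_strings = ["3", "1100", "215", "210"]
values_as_numbers = [int(v) for v in values_as_strings]

# String sorting compares character by character, so "1100" sorts before "3".
sorted_strings = sorted(values_as_strings)

# Numeric sorting compares by value.
sorted_numbers = sorted(values_as_numbers)

# Zero-padding is the workaround needed if the values have to stay strings.
sorted_padded = sorted(v.zfill(4) for v in values_as_strings)

print(sorted_strings)   # ['1100', '210', '215', '3']
print(sorted_numbers)   # [3, 210, 215, 1100]
print(sorted_padded)    # ['0003', '0210', '0215', '1100']
```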
::Thank you for Column (P755), Fly leaf/leaves (P756) and Plate (P757)! "So better go for numbers rather than string and state ranges with maximum and minimum qualifiers." How would you do this for Width (P60) in [https://database.factgrid.de/wiki/Item:Q393504 RAH Cod 59]? The minimum for the width of the text page is 210 mm and the maximum 215.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 07:28, 9 June 2022 (CEST)


Oscar: there’s not a lot of interaction with PhiloBiblon, although some users do download forms from our Collaborate page, fill them out, and return them for incorporation into PhiloBiblon Xavier: demo of SparQL in a nice web editor from the Biblioteca Dixital Galiciana… (Olaf was impressed with the editor)
New property [label] GKW-ID [description] Gesamtkatalog der Wiegendrucke. Data type: External identifier. Parallel to ISTC-ID for Incunabula Short Title Catalog (P645): [https://www.gesamtkatalogderwiegendrucke.de/ Gesamtkatalog der Wiegendrucke]


Olaf: FactGrid is a limited user database, so only users with accounts can change things. Everyone in the project will get an account. Users like the ones Charles describes can be given an account. There’s no central form for making ‘issues’, you just change things when you see a mistake. People who make a change add a reference about the change. You can keep mistaken information on the database to make sure people don’t change it back. It remains on the database ‘downgraded’ as mistaken information. Qualifiers are created for hypothetical statements. The public can see everything and do any search but it’s not open to the public. Once you have your data in qualified triples, it will be used by different users depending on their own interests..
New property for CSV column head: [label] Condition / bibliography [description] "physical state of manuscript or edition, e.g. deteriorated". Data type string. Not to be confused with Physical description (P577). Typical phrases would be "in good condition," "wanting ff. 1," "water stains," "wormholed"


Charles: We want to facilitate crowd-sourcing in order to tap into Hispanists all over Europe who can describe MSS in their local libraries and add that information to PhiloBiblon directly.
New property for CSV column head [label] state / bibliography [description] "textual changes in an edition due to correction of errors either before or after issue" Data type: string--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 07:33, 16 June 2022 (CEST)


Olaf: “spread accounts” - We can do this by giving everyone interested in contributing an account. Wikidata doesn’t have working hypotheses, they want facts, not theories. Whereas FactGrid allows for the type of knowledge (hypothetical, guess, needs to be substantiated) You make editorial control / blocking items… You can put items on a watch list and check who edits it. Otherwise everyone is allowed to edit and we watch the items as they get edited. Olaf will give people accounts (based on email addresses, website / institutional titles), but he can also give admin accounts and show you how to create these accounts…
::Dear Charles I just created the new properties [[Property:P766]] Minimum and [[Property:P767]] Maximum to be used as qualifiers.
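
As a rough sketch of how a PhiloBiblon range such as "210 / 215" could be split into integer values for the Minimum (P766) and Maximum (P767) qualifiers: the function name and the dictionary layout below are hypothetical, and which number should serve as the main value of the Width (P60) statement is left open here.

```python
import re


def parse_range(text):
    """Return (minimum, maximum) for inputs such as '210 / 215' or '210'."""
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    if not numbers:
        raise ValueError(f"no number found in {text!r}")
    return min(numbers), max(numbers)


minimum, maximum = parse_range("210 / 215")

# Hypothetical layout: the two qualifiers that would accompany a Width (P60) statement.
width_qualifiers = {"P766": minimum, "P767": maximum}
print(width_qualifiers)  # {'P766': 210, 'P767': 215}
```

A single value such as "210" yields the same number for both minimum and maximum, so the same code path would cover records without a range.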


Charles: Would it be useful to add a field in PhiloBiblon for a WikiBase Q / P number? (as we correct our data),
::If a book is lacking something we have [[Property:P706]] ("without"). The Condition property is [[Property:P158]] - "Preservation" - I do not know whether condition would be the better label, as the English condition can have such a lot of meanings (like in logic). You are the native speaker and I'll accept your wording. (I added Condition, Conservation, State as aliases). The Properties are item-properties, so that we can get multilingual statements. You can use note in order to add details.


Yes, it’s advisable. FactGrid will show these for Wikidata numbers. This will be an interactive process, back and forth, until we get it right…
::I am not quite sure what to make of "state / bibliography [description]"


John May: In the production of these spreadsheets, what are we trying to do? E.g. there are 200 cells (with structure within, including P-numbers, and Q-numbers). In a normalized table, we would break these up, but they are not in that format yet. How are P / Q numbers assigned? —— Tasks: Get familiar with FactGrid & Wikibase Adam will be working on linking the entities between PhiloBiblon and Wikibase. Get admin account from Olaf (so you can create accounts for the project) Take the framework and CSV which John May will make and work with Olaf, Josep, and Jason Once we have the CSV, contact Jason Kovari: questioning whether we need entities or just metadata for the text? Josep will work on obtaining the entities from the texts (see interface). Josep will work on making an API for easy input options Rob: We should write down the primary things you want to refer to.. e.g. ‘ownership’ (provenance in the museum and library world) if that’s something you want to refer independently to the object. From there you can work out the CSV tables and get them into the system Everyone: Communicate all problems on the project page, so anyone can see the discussion that led to certain solutions (so our models get adopted). Transparency is key to that to follow your line of thinking.
::The GKW property is [[Property:P768]] best, --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 12:45, 16 June 2022 (CEST)


=== Olaf's thoughts ===
<hr>
My questions ''what kind of objects we are dealing with? in what quantities?'' did not aim at a datamodel which we would need before we start. We are not working on a conventional database where you have to know categories of objects before you start. (We do only have one sort of objects - they have Q-Numbers, and they attract statements without any need to create conceptual borders.)


My question was intended to detect hidden fields. Some databases have ''texts'' and ''people'' - and the organisers do not realise that they also have places - simply because they run these places as text input fields. Places will be Items as well and they are tricky. FactGrid has for example 11 Items under the name "Paris". 10 of these are places, one is a family name. If you connect manuscripts or people to "Paris" this will be easy to match as long as you already have your Paris with an external identifier such as the GEOnames ID. If places are, however, just referenced as strings so far, it will take work to identify the respective Q-Items on our site. We have only some 20 major Iberian places on FactGrid at this moment. You will create your own references - and I assume that you will prefer to feed 30.000 places into FactGrid rather than to create the 8,000 places you would need right now for the 8,000 people. The import of the whole package (from a good source with Wikidata-Numbers, Geonames ID and standard references of your choice plus coordinates) saves a lot of work later whenever you link information to a new place.
[DEPRECATED in favor of Name history [[Property:P34]] BETA*BIOGRAPHY*NAME_CLASS. Requires new P* “Name Class” or “Type of name.” with Datatype ITEM. The Items are:  [https://database.factgrid.de/wiki/Item:Q424844 Native name (Q424844)], [https://database.factgrid.de/wiki/Item:Q399910 Translated (Q399910) name], [https://database.factgrid.de/wiki/Item:Q14 Pseudonym (Q14)], [https://database.factgrid.de/wiki/Item:Q28006 Name variant (Q28006)].--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 20:56, 5 July 2022 (CEST)
<hr>
[[Property:P800]] BETA*MAN*BINDING. Requires a “Bookbinding” P#. Datatype: String. See Wikidata book cover (Q1125338).


My present sum is here:
[[Property:P778]] BETA*MAN*FEATURE_CLASS. Requires a “* Physical feature” P# that takes an Item qualified by a Note (P71). Corresponds to Physical description (P577), which takes a String.


* 6,000 Codices
Example:
* 14,000 Witnesses
Physical feature P# > Catchword (Q394122) qualified by Note (P71) “horizontal centered below col. B”
* 8,000 Individuals
* 30,000 Places (just a guess after checking Wikidata on villages and human settlements)
* ..... Organisations (Libraries etc.)


...all these are quantities we can easily prepare on spreadsheets with the advantage that we can take a look at these data together before we feed them into the machine. The places are a good starting point as they will be just the landscape to refer to, not the complex objects themselves. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 14:17, 20 May 2021 (CEST)
Physical feature [[Property:P778]], which takes an item, resolves this problem. --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 01:26, 19 October 2022 (CEST)


== Dan Gullo's suggestions (17 May 2021) for MSS to use as test objects ==
[[Property:P801]] BETA*MAN*GRAPHIC_CLASS. Requires a P# “Pictorial Element” that corresponds to Pictorial Element (Q405221). Datatype: Item, qualified by Note (P71).


When choosing records, I would suggest some basic criteria (I put this here before my suggestions).
[[Property:P790]] BETA*MAN*MUSIC_CLASS.  Requires a P# “Musical notation” Datatype:  Item, qualified by a Note (P71).


* Well-established authors that are found in multiple collections
* Well-established authors that are found in multiple languages
* Well-established authors that may be found outside of Spanish or Portuguese libraries
* Complex author-title relationships that may involve a known translator and a known author
* Complex title-title relationships, perhaps a commentary on a known work
* Complex titles with multiple variants, perhaps by language
* Works that are in print and manuscript
* Works with established bibliography
[DEPRECATED in favor of properties for each type of call number] BETA*MAN*RELATED_LIBCALLNOCLASS. Requires a P# “* Inventory Position”. Datatype: Item, qualified by Note (P71) for the shelfmark string. Parallel to Inventory position (P30) Datatype String.
This would allow a variety of different shelfmark types (Q#), all of which are distinguished by PhiloBiblon: Inventory #, Registry #, alternate shelfmark, electronic shelfmark.--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 23:34, 25 August 2022 (CEST)


I would suggest Gonzalo de Berceo and known works to deal with a major author


* BETA bioid 1211
BETA*MAN*RELATED_LIBEVENTCURRENCY. Requires a P# “ISO 4217 code” identifier for a currency per ISO 4217. Datatype: External identifier.
See Wikidata P498. Code for $ = USD; Swiss franc - CHF; French franc = FRF (before euro), Deutschmark = DEM (before euro). Updated --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 23:17, 28 July 2022 (CEST)


I would suggest Isaac of Ninevah and the translations and printed editions of his works (serious issues to wrestle with here because of the need to reconcile your data so that one author moves from 3 ids in Philobiblon to one Qid in Wikibase)
<hr>


* BETA bioid 1186 BETA bioid 1398 BITAGAP bioid 1079
[[Property:P781]] BETA*UNIFORM_TITLE*METRICS_LENGTH
Wikidata: quantitative metrical pattern (P2552) Datatype: String.


I would suggest Pseudo-Seneca for the same reasons as Isaac of Ninevah
In PhiloBiblon the formulas indicate the number of stanzas in a poem and the number of lines in a stanza.


* BETA bioid 1192 BITAGAP bioid 1107 BITECA bioid 6379
:2 x 4, 2 x 3 = a sonnet: two quatrains followed by two tercets


For particularly interesting records to think about the complexity of data migration, here are three.
:4 x 7 = 7 stanzas of 4 lines.


Very complex record
:1 x 4, 4 x 12 = 1 stanza of 4 lines and 4 stanzas of 12 lines


* BETA manid 1567
[[Property:P783]] BETA*UNIFORM_TITLE*METRICS_RHYME. Requires new P* “Rhyme scheme” with Datatype: String.


Record which is not a complete manuscript, so how do you want to represent a part of a manuscript, and not a complete manuscript.
In PhiloBiblon indicates the rhyme scheme(s) of a poetic genre,


* BITECA manid 2554
:Petrarchan sonnet: ABBA ABBA CDC DCD


For a printed book with complex data
:Elizabethan sonnet: ABAB CDCD EFEF GG


* BETA manid 1510
FactGrid “Rhyme scheme” (Q419569) < Wikidata “Rhyme scheme” (Q3264652)--[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 08:03, 7 July 2022 (CEST)
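
The rhyme-scheme strings and the metrics-length formulas are closely related, so one can be checked against the other. The following is only a minimal sketch in Python, not part of PhiloBiblon or FactGrid. It assumes that stanzas in a rhyme scheme are separated by spaces, that each letter stands for one line, and that the formula is read as "stanzas x lines" as in the sonnet example; the function name <code>stanza_formula</code> is hypothetical.

```python
from itertools import groupby


def stanza_formula(rhyme_scheme):
    """Derive a 'stanzas x lines' formula from a rhyme scheme string.

    Assumption: stanzas are separated by whitespace and every letter
    represents exactly one line of verse.
    """
    # Number of lines in each stanza, in order of appearance.
    lengths = [len(stanza) for stanza in rhyme_scheme.split()]
    parts = []
    # Collapse runs of consecutive stanzas that have the same length.
    for length, run in groupby(lengths):
        parts.append(f"{len(list(run))} x {length}")
    return ", ".join(parts)


print(stanza_formula("ABBA ABBA CDC DCD"))   # Petrarchan sonnet
print(stanza_formula("ABAB CDCD EFEF GG"))   # Elizabethan sonnet
```

A check of this kind could flag records in which the two fields disagree before they are imported.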
<hr>


For institutions, you can use mine for fun, because you have it three times in Philobiblon, all with three different names.
New property for Organisations: [label] telephone number [description] to state the telephone number(s) of an organization [alias] phone number.
: PhiloBiblon has over 500 libraries with phone numbers. This would correspond to Wikidata [https://www.wikidata.org/wiki/Property:P1329 Phone number P1329] --[[User:Charles Faulhaber|Charles Faulhaber]] ([[User talk:Charles Faulhaber|talk]]) 00:04, 15 November 2022 (CET)


BITECA libid 1128 BITAGAP libid 852 BETA libid 668
== Shelf-mark alternatives ==


== Test Items created in 2019 ==
* Inventory position (P10) = current shelfmark
* Old Inventory position (P30) = olim, former shelfmark
* Alternate shelfmark [[Property:P801]]
* Registry number [[Property:P802]]
* Inventory number [[Property:P803]]
* Barcode [[Property:P804]]
* Photograph number [[Property:P55]]
* Reference code [[Property:P805]]


* [[Item:Q164502|MS: Toledo: Biblioteca Capitular, 43-13 (1344-03-04)]]
== ID numbers for major libraries and union catalogs ==
:* [[Item:Q164746|Alfonso X, rey de Castilla y León. Siete partidas, 1 (MS: Toledo: Biblioteca Capitular, 43-13)]]
:* [[Item:Q164748|Desconocido. Los diez mandamientos (MS: Toledo: Biblioteca Capitular, 43-13)]]


== Do links to PhiloBiblon have to be dynamic? ==
Just as we have an ID property for the Biblioteca Nacional de España [[Property:P652]], we also need them for other national libraries. We likewise need them for other union catalogues, in addition to the English Short Title Catalog (ESTC ID [[Property:P585]]). The properties needed are the following:
I am asking this because I tried to create an External identifier from FG to PB.


* [[Item:Q164503|Biblioteca Capitular de Toledo]]
* Bibliothèque nationale de France. <B>Label</B>: BnF-ID: <B>Description:</B> identifier from the authority file of the Bibliothèque nationale de France. <B>Alias:</B> National library of France ID | Bnf ID
* British Library: <B>Label</B>: BL-ID. <B>Description:</B> identifier from the authority file of the British Library. <B>Alias:</B> National library of the United Kingdom ID | BL ID
* Library of Congress: <B>Label</B>: LC-ID. <B>Description:</B> identifier from the authority file of the Library of Congress. <B>Alias:</B> National library of the United States ID | LCCN | LC ID
* Biblioteca Nacional de Portugal: <B>Label</B>: BNP-ID. <B>Description:</B> identifier from the authority file of the Biblioteca Nacional de Portugal. <B>Alias:</B> National library of Portugal ID | BNP ID
* Österreichische Nationalbibliothek: <B>Label</B>: ONB-ID: <B>Description:</B> identifier from the authority file of the Österreichische Nationalbibliothek. <B>Alias:</B> National library of Austria ID | ONB ID | ÖNB ID
* Universal Short Title Catalog: <B>Label</B>: USTC-ID. <B>Description:</B> identifier from the authority file of the Universal Short Title union catalog. <B>Alias:</B> Universal Short Title Catalog ID | USTC ID
* Bibliographical Patrimony of Spain union catalog: <B>Label</B>: CCPBE-ID. <B>Description:</B> identifier from the authority file of the union catalog of the Bibliographical Patrimony of Spain. <B>Alias:</B> Union Catalog of Spain ID | CCPBE ID | CCPB ID


The external identifier could lead directly into the data set if I had a link with a stable environment and just the ID changing. I would replace the ID position in the URL with $1 and be able to link from the number into the PB item page... Seems that does not work, as I see you do not give any links into your own database...  
:: Most of the existing properties for these identifiers use a hyphen (e.g. GKW-id [[Property:P768]]). I propose that the new ones should follow the same pattern. However, since users will search for them with and without the hyphen, non-hyphenated forms should be used as aliases, as indicated.


Just by the way: sign all comments with <nowiki>~~~~</nowiki> - that creates automatic and dated signatures when you press save. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 22:55, 19 May 2021 (CEST)
:: BTW, [[Property:P235]] and [[Property:P595]] should be merged. They both refer to WorldCat - OCLC and both have the corresponding Wikidata property P243.


== PhiloBiblon Project Page in multiple languages ==
== PhiloBiblon people (and organisations) ==


I would like to add the Spanish, Catalan, and Portuguese versions of this basic description of PhiloBiblon
Here is an agenda
21:35, 20 May 2021 (CEST)~
# Create a list of all persons appearing in PhiloBiblon records (from authors to various associated persons) - on a Google Spreadsheet.
# Check the list with OpenRefine against Wikidata in order to get a broad range of external identifiers
# Check the list against FactGrid in order to avoid duplicates
# Create a column of descriptions if possible with the standard * Date and place of birth, + Date and place of death, short bio including a geographic reference
# Feed the list into FactGrid with P2=Q7, P154=Q17/18, P131 PhiloBiblon project statements and with Wikidata references wherever possible
# Create the respective name statements
# add information as available as time goes by --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 16:05, 26 November 2022 (CET)

Latest revision as of 18:59, 28 August 2023

Further Project Pages


Project Dimensions

Present stats:

TABLA BETA BITAGAP BITECA BIPA Total
ANALYTIC (witnesses) 14692 52084 12239 89926 168,941
REFERENCES 7270 21558 5976 472 35,276
PERSONS 7423 32309 3473 3418 46,623
GEOGRAPHY 1814 4759 840 211 7,624*
INSTITUTIONS 794 3297 585 4 4,680
LIBRARIES 915 455 420 119 1,909
MANUSCRIPTS & IMPRINTS 5168 5886 1971 1572 14,597
COPIES OF PRINTED BOOKS 4157 1146 1473 137 6,913
SUBJECT HEADINGS 339 34 149 126 648
WORKS (texts) 6034 31962 6173 89913 134,082
TOTAL 48606 153490 33299 185898 421293

* Places: I would definitely recommend an input of all Spanish places (deserted villages to cities, some 13,000) from sources that come with external identifiers and with geographic coordinates. --Olaf Simons (talk) 11:46, 21 May 2021 (CEST)

The best way to start the input is probably 1: all places, 2: all libraries, 3: All institutions... - as these things will work without your texts and codices, while the texts and codices will not work without these. Another approach would be to run a first input simply on Labels, (provisional) Descriptions and P2 information in order to get the grid of FG Q-Numbers and their matches of PB Identifiers as fast as possible. Once you have the matches you can run all subsequent inputs with greater ease. --Olaf Simons (talk) 16:15, 23 May 2021 (CEST)

In fact, this is what we had suggested doing as the first steps, starting with these three entities, primarily because they are the least complex in our data model. Although, as you point out, they have fewer dependencies, toponyms do have internal dependencies: cities belong to states or provinces ... --Charles Faulhaber (talk) 20:36, 23 May 2021 (CEST)

Running "first input simply on Labels, (provisional) Descriptions and P2 information in order to get the grid of FG Q-Numbers and their matches of PB Identifiers" is a wonderful idea. Needless to say I don't know what that implies. Presumably our programmer John May will have to generate a list of PB identifiers. For every record we have a "moniker," a constructed field that serves to disambiguate similar records. Thus the moniker for persons consists of First and Last Name, followed by the person's title and dates, or some other identifying information. --Charles Faulhaber (talk) 02:51, 26 May 2021 (CEST)

Input efficiency: how to map controlled vocabulary to P#/Q#

This Friday 1800 CST / 0900 PDT / 1200 EDT works for me. I've also invited Josep Formentí, who is in Barcelona, who has been looking at the FactGrid api.

We are currently focusing on cleaning up and coordinating our data in the four PhiloBiblon bibliographies.

Database designer John May is experimenting with CSV exports of data from the Biography/Persons table to allow us to manipulate it more easily to find errors, inconsistencies, and duplicates. Josep just mentioned to me a deduplication library. This is something that may be useful once we're a little farther in to the process.

In addition to Adam's questions I want to follow up with Olaf on three issues:

  1. What's the most efficient way to compare our lists of controlled vocabulary with existing FactGrid P# and Q#?
  2. How would we "run a first input simply on Labels, (provisional) Descriptions and P2 information in order to get the grid of FG Q-Numbers and their matches of PB Identifiers as fast as possible"?
  3. How do you log in to FactGrid in Spanish?

Yashila, glad to meet you,

-- Charles Faulhaber (talk) 00:57, 10 June 2021 (CEST)

Just briefly:
  • You can only use the database's wiki in Spanish if you have an account. Your account opens the menu top left with the "preferences" link that allows you to configure FactGrid. All people connected to this project should have accounts.
  • The entire property vocabulary is in the FactGrid:Directory of Properties. We should get a Spanish version of this structure in the course of this project. What we need here is English/Spanish speakers who go through all the 613 Properties top down. They all need Spanish Labels and descriptions to work. Google translate helps. Maybe this is a job for a student assistant if you have one; maybe it is interesting work for anyone, as it will make you familiar with the vocabulary.
  • The page for collective data modelling on Codices should be this one: FactGrid:Data model for manuscripts. Do not hesitate to edit it, insert questions where you feel you need more clarity, restructure it as you need. All Wiki pages have a version history that allows us to compare the states this page has been in; nothing can get lost, nothing can be destroyed. The FactGrid player who created it is User:Marco Heiles, and he should join you on data modelling debates as he is presently experimenting with some 200 codices of his PhD thesis. You are not bound to his decisions since we are using triples - so you can have any triple you want without affecting anyone on board. But we gain project cohesion if we find a language of triples that works for all. The names of the properties are irrelevant; the important thing is that we use them the same way.
  • Duplicates in biographies: We should avoid all duplicates by setting Wikidata reference numbers wherever a Wikidata-Item exists. The cool thing about the Wikidata reference is that you cannot create two items on FactGrid with the same Wikidata reference number. Question for John May is then: can you map your names against Wikidata?
  • How to run a first input that just creates all the objects, so that we have a list of FactGrid-Q-Numbers on corresponding PhiloBiblon IDs. We do the first input via CSV. This is a first input I prepared with a colleague - not yet in the machine: Google Spreadsheet a complex first input. If you go for a simpler input you should still have:
  • Labels in as many languages as you require (English/ French/ Spanish/ Portuguese (?)/ German)
  • Descriptions at least in English, FactGrid's default language.
  • A P2 (what is it?) statement on all items that flow in (a codex, a human being, a place...)
  • Maybe the PhiloBiblon ID if that helps you to organise the matches.
  • Question on my side: is there a Spanish register of all places from villages to cities? Could we match that with Wikidata and PhiloBiblon places in order to make sure that we create a unified geographical data environment that won't require us to create and to geo-reference ever new places at later stages?
It might be good to create a Spanish Help Section on FG and I'd recommend that all new users take a brief tour through the English help section. Clarify things where I did not see the difficulties, demand clarification where I failed. New users are the best to do that as only they have the perspective to see the deficits in my/our explanations. --Olaf Simons (talk) 09:03, 10 June 2021 (CEST)
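
A first-input file of the kind described above (labels, an English description, a P2 statement and the PhiloBiblon ID) could be prepared along the following lines. This is only a sketch in Python under stated assumptions: the column names, the function name `build_first_input` and the sample record (including the placeholder `Q0`) are hypothetical and do not reflect the format FactGrid's import actually expects.

```python
import csv
import io

# Label languages named in the list above.
LABEL_LANGS = ["en", "fr", "es", "pt", "de"]


def build_first_input(records):
    """Return CSV text with one row per item to be created.

    Each record needs at least one label, an English description
    and a P2 ("what is it?") value; otherwise a ValueError is raised.
    """
    columns = [f"label_{lang}" for lang in LABEL_LANGS]
    columns += ["description_en", "P2", "philobiblon_id"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        if not any(rec.get(f"label_{lang}") for lang in LABEL_LANGS):
            raise ValueError(f"no label for {rec.get('philobiblon_id')}")
        if not rec.get("description_en") or not rec.get("P2"):
            raise ValueError(f"description or P2 missing for {rec.get('philobiblon_id')}")
        # Missing optional columns are written as empty cells.
        writer.writerow({col: rec.get(col, "") for col in columns})
    return out.getvalue()


# Hypothetical sample record; "Q0" is a placeholder, not a real FactGrid item.
sample = [{
    "label_en": "Example Library",
    "label_es": "Biblioteca de ejemplo",
    "description_en": "library (provisional description)",
    "P2": "Q0",
    "philobiblon_id": "EXAMPLE libid 0",
}]
print(build_first_input(sample))
```

Once such a file has been imported, the returned Q-numbers can be written back next to the PhiloBiblon IDs to obtain the matching grid.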

Spanish and Portuguese places

I have created Spanish and Portuguese localities on the basis of the listings of the statistical offices of both nations as linked in Wikidata:

* Wikidata search: all items that have a Spanish INE ID
* Wikidata search: all items that have a Portuguese INE ID

The present FactGrid list is shorter on the Spanish side as I deselected double entries on INE IDs. My feeling was here that these (roughly 4000) double Wikidata matches are to some extent the product of competing Wikipedia language projects all based on the same INE ids. The shorter FactGrid list should be easier to match against PhiloBiblon place listings.

This is the Google Spreadsheet from which the FactGrid input was done:

The following two searches give all FactGrid localities for Spain and Portugal just with the respective INE listings and geographic coordinates. One can add external identifiers in further columns - presently available are VIAF, GeoNames, BNE, GND, Getty Identifiers (as far as Wikidata had them):

* FactGrid Search: all Spanish localities, geo-coordinates, Spanish INE IDs
* FactGrid Search: all Portuguese localities, geo-coordinates, Portuguese INE IDs

Both data sets can be improved:

  1. The Spanish set is lacking some 4000 English descriptions.
  2. Both sets are lacking descriptions in the other languages that have been labelled: French, German, Spanish and Portuguese. The descriptions are needed to pick the right place in case of name identities; one would love to have them short and easy - like: "locality in Spain, province of XYZ".
  3. Getty thesaurus and BNE seem to be highly incomplete.
  4. Both sets still have localities without geographic coordinates.
  5. FactGrid should already have the various provinces as items but we will need a conclusive input to get the right provinces on all localities.
  6. We still need an easy to use data model for changing historical territorial affiliations (to be organised with qualifying begin and end statements on Property:P297 statements).
  7. We might also see a need to create items for monasteries, castles, particular houses in cities (they should get Property:P2+Item:Q16200 statements and Property:P47 localisations).

So still some work to be done. It would be cool to have a partner in this, a group associated with the BNE or the Getty Thesaurus adopting FactGrid as their Wikibase.

The nice thing is, however, that we are now able to locate (almost) any object or person with a locality reference on the map as (almost) all place names have geographic coordinates on them which can be selected with the respective statements in SPARQL queries. --Olaf Simons (talk) 09:48, 28 September 2021 (CEST)

  • BETA geoid ## are in col. A. The contents of the following columns are in the same order as the fields in PhiloBiblon's Data Dictionary for Geography: GEO Dict
I have just run a first matching on the BETA - GEOGRAPHY.CSV with a look at all our Spanish places, which (of course) left a lot of blanks. K2-5 are, from our perspective, the same Paris just under different names. We could repeat the process with Portuguese and French places; Vienna is Austrian. We would not get a total matching without manual work as we will not match the variants in different languages. Coordinates are extremely rare in the file, while we have them on all our places; one would need rounded values to do the matching. The related GeoID is neither GeoNames nor an ID (too many 1002 record doubles).
Can you establish how many individual localities you actually have in all your databases? How do you make sure that you refer to the same thing in your databases? One might like to do the matching on all of them in one process. --Olaf Simons (talk) 23:08, 6 October 2021 (CEST)
PS. Maybe John May has an idea what he would do if we were going from FactGrid to PhiloBiblon. How would he match the two lists of our Spanish and Portuguese places with coordinates and Identifiers. What would he wish we had to make merging easier? --Olaf Simons (talk) 23:11, 6 October 2021 (CEST)
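
Matching on rounded coordinates, as mentioned above, might look roughly like this. It is a minimal Python sketch, not a tested procedure: the identifiers and coordinate values are invented for illustration, and the function names are hypothetical.

```python
def coord_key(lat, lon, decimals):
    """Reduce a coordinate pair to a rounded key used for matching."""
    return (round(lat, decimals), round(lon, decimals))


def match_by_coordinates(pb_places, fg_places, decimals=1):
    """Map each PhiloBiblon place to the FactGrid items with the same rounded key.

    Both arguments are dicts of the form {identifier: (latitude, longitude)}.
    """
    index = {}
    for qid, (lat, lon) in fg_places.items():
        index.setdefault(coord_key(lat, lon, decimals), []).append(qid)
    return {
        geoid: index.get(coord_key(lat, lon, decimals), [])
        for geoid, (lat, lon) in pb_places.items()
    }


# Invented sample data: two slightly different coordinates for the same town.
pb_places = {"geoid-example": (39.8628, -4.0273)}
fg_places = {"Q-example": (39.8567, -4.0244)}

print(match_by_coordinates(pb_places, fg_places, decimals=1))
print(match_by_coordinates(pb_places, fg_places, decimals=2))
```

With one decimal the two records fall into the same cell; with two decimals they do not, which shows that the rounding precision has to be chosen with care. Two points close to a cell boundary can also end up in different cells, so any matches found this way would still need a manual check.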

First steps

For each data type = entity = PhiloBiblon table (Analytic, Bibliography, Biography, Copies, Geography, Institutions, Libraries, Ms/Ed, Subjects, Uniform Title) the following steps need to be done:

1. Deduplication of records

2. Data clean-up. In the process identify Wikidata/FactGrid Q# for Geography, Biography, Institutions, Libraries, to the extent possible.

3. Definition of mapping between PhiloBiblon fields and "dataclips" (controlled vocabulary) and FactGrid P# and Q# (e.g., PhiloBiblon date of birth = FactGrid P77).

4. Identification of PhiloBiblon entities with FactGrid Q entities (e.g., Fernando III, el santo, king of Castile and Leon = BETA bioid 1110 = BITAGAP bioid 2054 = BITECA bioid 6515 = FactGrid Q254531).

5. Import PhiloBiblon CSV records from Windows PhiloBiblon into FactGrid taking account of points 3 and 4.

6. Enrich records with data from external sources, e.g. VIAF.

Some of these steps can be carried out in parallel, i.e., Steps 1-4
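
The identification in step 4 amounts to a crosswalk from PhiloBiblon IDs to FactGrid Q-numbers. A minimal Python sketch follows, using only the Fernando III example given above; the function name `build_crosswalk` and the data layout are assumptions made for illustration.

```python
def build_crosswalk(rows):
    """Build lookups in both directions from (PhiloBiblon ID, Q-number) pairs.

    Raises ValueError if one PhiloBiblon ID is assigned to two different items.
    """
    pb_to_q = {}
    q_to_pb = {}
    for pb_id, qid in rows:
        if pb_id in pb_to_q:
            if pb_to_q[pb_id] != qid:
                raise ValueError(f"{pb_id} maps to both {pb_to_q[pb_id]} and {qid}")
            continue  # exact repeat of a known pair, nothing to add
        pb_to_q[pb_id] = qid
        q_to_pb.setdefault(qid, []).append(pb_id)
    return pb_to_q, q_to_pb


# The example from step 4: three bibliographies, one FactGrid item.
rows = [
    ("BETA bioid 1110", "Q254531"),
    ("BITAGAP bioid 2054", "Q254531"),
    ("BITECA bioid 6515", "Q254531"),
]
pb_to_q, q_to_pb = build_crosswalk(rows)
print(pb_to_q["BITAGAP bioid 2054"])
print(q_to_pb["Q254531"])
```

During the import in step 5, every reference to a person, place or institution in a CSV record could be resolved through such a lookup.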

The order in which entities should be imported into FactGrid needs to be decided. It seems clear that Geography should go first, because its data are used in Biography, Institutions, Libraries, Ms/Ed, and Uniform Title.

Institutions and Libraries should come next because the number of records to be imported is relatively small, and many of them will already have Wikidata Q# if not FactGrid Q#.

Biography (Persons) would come next because its records are needed for Analytic, Copies, Ms/Ed, and Uniform Title. There are some 40,000 Biography records in total, the vast majority of which will have neither Wikidata nor FactGrid records.

Uniform Title, Ms/Ed, Copies, and Analytic should follow, in that order.
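
The import order can also be derived mechanically from the dependencies between tables. The sketch below is in Python (3.9 or later, for the standard `graphlib` module) and encodes only the dependencies named in the paragraphs above; any further dependencies between tables would have to be added.

```python
from graphlib import TopologicalSorter

# Each table is mapped to the tables whose records it refers to,
# as stated in the paragraphs above.
DEPENDS_ON = {
    "Geography": set(),
    "Institutions": {"Geography"},
    "Libraries": {"Geography"},
    "Biography": {"Geography"},
    "Uniform Title": {"Geography", "Biography"},
    "Ms/Ed": {"Geography", "Biography"},
    "Copies": {"Biography"},
    "Analytic": {"Biography"},
}

# static_order() yields every table after all the tables it depends on.
import_order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(import_order)
```

Several orders satisfy these constraints. Placing Institutions and Libraries before Biography, as proposed above, is a choice based on the size of the tables and is not forced by the dependencies.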

This is in fact the same order set forth in the original Work Plan in the NEH proposal. It is worth noting that we are already starting work on Geography that in the Work Plan is scheduled for October.

--Charles Faulhaber (talk) 23:56, 13 June 2021 (CEST) (for Josep Maria Formentí)

Questions for webex sessions

1. How do you keep together separate statements that form a data unit? For example, BETA bioid 1007 shows the following information for the date and place of birth of king Alfonso X:

Toledo 1221-11-23 (DB~e)

Madrid 1221-11-26 (Alvar 2010)

These represent two different opinions.

The source for the first one is the Diccionario Bibliográfico Español. The source for the second is a 2010 article by Carlos Alvar.

We do not want to have the two place of birth statements together followed by the two date of birth statements, since there is then no way to associate the 1221-11-23 date with Toledo and the 1221-11-26 date with Madrid.

--Charles Faulhaber (talk) 06:58, 26 July 2021 (CEST)

Wikibase allows any number of statements on the same question. Four dates of birth next to each other are no problem at all. We give them with their respective different sources and different notes why this should be the date. If we can rule out a particular date we still take it and downvote it with a note, so that there is no fall back into an error. If we know that all dates are right (e.g. different successive numbers of population counts) we can prefer the latest date with the wish that only they appear in searches. --Olaf Simons (talk) 08:19, 26 July 2021 (CEST)
For discussion purposes, here is a very short "Alfonso X" in FactGrid, without downvotes or preferences: https://database.factgrid.de/wiki/Item:Q266835 You can look at the corresponding Wikidata page (interlink at the bottom of the page) to see how they solve the problem there (it's the same software).--Martin Gollasch (talk) 09:41, 26 July 2021 (CEST)
The problem is not the multiple dates of birth but rather associating a specific date of birth with a specific place of birth rather than adding all of the date of birth statements followed by all the place of birth statements. What you show in the linked FactGrid Q266835 is precisely what we don't want. It breaks the association between date and place, leaving it up to the reader to guess that the first date is associated with the first place and the second with the second place. This kind of association is fundamental to the structure of PhiloBiblon--Charles Faulhaber (talk) 06:47, 30 July 2021 (CEST)
Well, you could link both statements, shifting either of the two linked statements into the qualifier (and you can have as many qualifiers as you want). But you will actually like the option to separate these statements since you will not always have the linked statement. --Olaf Simons (talk) 07:17, 30 July 2021 (CEST)
One thing is database content, the other thing is, what does a reader/user see in the browser later on. I made some changes (gave preference for Carlos Alvar 2010) and we will see tomorrow at the latest how the database content is shown in the beta version of the FactGrid Viewer as an example. Right now there are two lines with equal values. You find the available Browsers/Viewers on the left of your screen. I guess, this viewer does not show the preference... Actually it should be no problem, to find a viewer, which is suitable for you. The actual question is: Do you have a database problem or a viewer problem?--Martin Gollasch (talk) 10:11, 30 July 2021 (CEST)
I regret the delay in responding. Martin, you have solved the problem in my opinion. This is not simply a browser issue; it is in fact a database problem, how to maintain the relationship among the various elements of a data structure, in this case place of birth / date of birth/ source. Here there are only two elements in the data structure. We have cases in which there are six or eight related elements that need to be kept together, for example the former owner of a manuscript, the date and place he acquired it, and the price he paid, and the source of the information. --Charles Faulhaber (talk) 07:04, 29 August 2021 (CEST)
One simple solution is to add a position in sequence (P499) as a qualifier to keep track of the relationship between the elements that need to be kept together. This position in sequence can be used in a query to easily find the related elements. Bruno Belhoste (talk) 10:52, 31 August 2021 (CEST)
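
How a shared position in sequence keeps related statements together can be illustrated in a few lines of Python. The sketch uses plain dictionaries as stand-ins for statements with a sequence qualifier; it does not call any Wikibase API, and the field names are assumptions. The values are those of the Alfonso X example above.

```python
from collections import defaultdict

# Each statement carries the sequence number of the opinion it belongs to.
statements = [
    {"property": "place of birth", "value": "Toledo", "source": "DB~e", "sequence": 1},
    {"property": "place of birth", "value": "Madrid", "source": "Alvar 2010", "sequence": 2},
    {"property": "date of birth", "value": "1221-11-23", "source": "DB~e", "sequence": 1},
    {"property": "date of birth", "value": "1221-11-26", "source": "Alvar 2010", "sequence": 2},
]


def group_by_sequence(statements):
    """Reassemble the data units from statements stored in any order."""
    groups = defaultdict(dict)
    for st in statements:
        unit = groups[st["sequence"]]
        unit[st["property"]] = st["value"]
        unit["source"] = st["source"]
    return dict(groups)


for number, unit in sorted(group_by_sequence(statements).items()):
    print(number, unit)
```

Even though the two places are listed before the two dates, each date is reunited with its place and its source. The same approach would extend to larger units such as former owner, date and place of acquisition, price and source.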
Charles: The nice thing about starting a project in open data is that many things can be decided as collaborative work is in progress... In a project such as yours you might have a look at how Germania Sacra is organizing their bishoprics and if you find a better way, others might even learn from you. Same thing with manuscripts of all kinds. I personally deal with the source type Album Amicorum/Liber amicitiae to get mobility profiles and to get 18th-century social group profiles. They can be described as a bunch of manuscripts. There is a tendency among German libraries to describe them sheet by sheet. In doing so, you will lose all the information which results from their connection in the album. That certainly needs discussion later on, since electronic data can be reorganized more easily... So you have to keep your data reorganizable. In the field of alba amicorum we mostly have the first/original holder of the book, but it certainly could be of interest to see all the owners en route to the present archive... So I am curious to see how others solve common problems. I think you should consider creating model structures of typical data sets in FactGrid soon. This will make it easier for other users here to enter a discussion with you and your team members.--Martin Gollasch (talk) 11:15, 29 August 2021 (CEST)

That is precisely what we want to do. Olaf has suggested that we start with place names, because they are needed to establish relationships with people and with organizations. However, we are still struggling with the correlation between our ontology and FactGrid's properties. We do not want to create a Property that duplicates one that already exists in FactGrid. We've been looking at Wikidata to find models for the kinds of entities that we need and the Properties they use, e.g., WikiProject Books https://www.wikidata.org/wiki/Wikidata:WikiProject_Books. We understand that WikiData Properties and FactGrid Properties have different P#, but looking at the former will help us with the latter--Charles Faulhaber (talk) 08:01, 30 August 2021 (CEST)

Sorry Charles, that's not what I meant. My suggestion is to take one of your old manuscripts, perhaps not the easiest one, and describe it in FactGrid with Items for all the persons, places and institutions you need for a proper, state-of-the-art description. Sort of a case study... The mass import of places first is certainly a good decision, but I would like to propose that you create one or two more or less complete models of your intended datasets in parallel to the planned mass imports of data. The mass import is totally independent of my suggestion. In the German Wikisource, Pope Joan is often referred to as a dataset which includes nearly every problem you might have in modeling...--Martin Gollasch (talk) 09:25, 30 August 2021 (CEST)

Model of Book

I have started to create manually a record for my own Medieval Manuscripts in the Library of the Hispanic Society of America ((Q385363)). The three parts of the header are Label: Title (Charles B. Faulhaber, Medieval Manuscripts in the Library of the Hispanic Society of America (New York, Hispanic Society of America, 1983) ); Description (catalog of medieval manuscripts); Alias (BETA bibid 1862)

I assumed that, since "book" is such a well-known data type, I would be able to find all of the properties that I needed when I started to create statements.

I was able to enter (with Martin Gollasch's help): Instance of / Author / Type of Publication / Date of Publication / Place of Publication / OCLC number / Topic / Publisher

At that point I discovered that in order to enter the Publisher you must first create a Q# for the Publisher. I did that.

Yes, the publisher had to be created first, so that you could link to it. This is one of the miseries of the database that does not fill string fields but wants to connect informed items. My usual recommendation is to create rich landscapes of data as a background to refer to. I love consistent mass inputs of background information to refer to later.

Then I tried to add the full title of the book: Medieval Manuscripts in the Library of the Hispanic Society of America. Religious, Legal, Scientific, Historical, and Literary Manuscripts. I found Title in the Property field "to state the title of a document," but I could not get it to link to the Property P11 by selecting it from the list. I had to go to the list of properties, find "Title," copy "P11" and paste it into the field.

Properties are not in the regular search (top right), unless you switch to advanced search (which I put into the menu for that purpose). To start the title statement it would have been enough to just click "add statements" and type: "Tit..." the auto complete would have suggested "Title" (and P11) as the property you are looking for. Best thing to do: anticipate all the words people might use in order to make the specific statement and have these words in Aliases. Standard option, however: go to the directory of properties, where you will also see neighbouring options.
That's precisely the problem. I typed "Tit..." and got "Title", but then FactGrid would not link to it.--Charles Faulhaber (talk) 06:44, 8 November 2021 (CET)
Working with student assistants I provided a list of statements (with the Q-Numbers) which they would make one by one.
Yet it seems you found at least one way.

I added "listed in" (P124) "BETA / Bibliografía Española de Textos Antiguos (Q254471)". However, when I searched for that in the "string" field, I couldn't find it. Nor could I find it in the general FactGrid search box. I had to copy it from the FactGrid:PhiloBiblon page: https://database.factgrid.de/wiki/FactGrid:PhiloBiblon. Where does one indicate the number in the catalog?

Again, it would have been enough to simply start typing "list...." and the machine would have suggested the property. (Or use the Advanced search and limit it to Properties.) It is good design that the search field at the top right does not give you properties. We have Items and Properties of identical names, and it is the items that should accumulate knowledge; they are the lemmata in this system. Properties should only occur when you open a statement, and here the autocomplete will do the job (teach it with aliases of your choice).

I next wanted to enter the number of volumes in the edition. I can find no property in the general property list. I see P105 "volume," as in "volume 1," but not volumes.

I would go for the P613 property "number of integrated object" and state Volume as the unit, as we have so many variants here. Some publications want to consist of parts, others of numbers, issues and so on. We better create items for these than properties since the property would demand a user who knows that this publication consists of volumes and not of parts.
P105 Volume is defined as “qualifier to refer to a specific volume of a work you have referenced in the main statement” However, this is not the correct Property to indicate the number of volumes in a publication. I don't know what you mean that it would be better to create items for these rather than properties.
How does one create an item?
It seems to me that you either have to let the Qualifier field of P613 be a string field or you have to vastly increase the number of Properties that it allows. E.g., it ought to allow for "Volume(s)" just as P107 allows Page(s). Others: fascicules, parts, issues, numbers etc. as you mention above.--Charles Faulhaber (talk) 06:44, 8 November 2021 (CET)

I want to change "author" to "compiler," since I compiled the catalog, I didn't write it. I don't find a property for that either...

And here you just gave me the better word for our P317 property.
I tried to edit the Author statement to change it to "P317 Compiled by". However, every time I put my cursor in the Property field the program sent me to "Author P11." I finally deleted the Author statement and added the "Compiled by" Statement. Is there a way to change the Property in a statement without changing the rest of it?
I tried to edit the Number of pages / leaves / sheets (P107) to add as a qualifier Page(s) (P54). I wanted to add the total number of pages in both volumes of the book: 1: [i]l + 664; 2: [i]-xiv + 245. However, it wouldn't let me save it. It didn't matter whether I put the number after the P107 Property or after the Qualifier P54. What is the constraint?

I tried to delete the ISBN and ISBN-13 Properties since this book has no ISBN number, but when I saved it I got an error message: "Could not remove due to an error.

Edit conflict. Could not patch the current revision.
Edit conflict."

In fact I could save none of the changes I made in about an hour's work even though I was logged in.

Is it possible to duplicate a record? For example, two volumes of my catalog of MSS in the Hispanic Society were devoted to documents. They were published in 1993 with an ISBN number and a different subtitle. It would be useful to duplicate the record and then just change the statements that differ. --Charles Faulhaber (talk) 06:44, 8 November 2021 (CET)

In PhiloBiblon we have some 3000 "enumerated data types," some of which are entities rather than properties.

Should I make a list of properties that are missing (at least compared to PhiloBiblon) before we attempt mass input? --Charles Faulhaber (talk) 06:40, 5 November 2021 (CET)

Yes, the list would be useful. I assume that most of these will become Items to refer to in qualifiers or as units but we should have them all. FactGrid is a fast learner. --Olaf Simons (talk) 08:36, 5 November 2021 (CET)

Practical session of Wiki/Wikibase editing

Dear Charles. We need a practical session of Wikibase editing.

Yes! As you can see from the comments below I am very far from understanding how FactGrid works.--Charles Faulhaber (talk) 07:51, 9 November 2021 (CET)

There are two ways to get the last note: Edit conflict. The real one is that two people edit simultaneously on the same section. In this case the one who saves first gets the edit and the other gets a chance to see the new version with a look at the text he wanted to write. Another way to get the Error is by pressing "save" twice. In this case the machine is already processing your edit while you want to impose a new version. Looking at the version history I assume your edit conflict was of the second sort - you trying to intervene on an edit you have just sent off. You can ignore such error notes, or just let the machine save (your browser tells you that it is processing your request on the tab).

I do not see how exactly you deleted an hour's work. The version history shows that only you were editing on this page.

I worked on the record ((Q385363)) for an hour and made numerous changes. I was not able to save any of them.

I am currently trying to change the value "909" in the statement with P107 (Number of pages/ leaves/ sheets) to "1: [i] l + 664; 2: [i]-xiv + 245" in order to get the breakdown of the pages in vol. 1 and vol. 2. When I make that change in the Property field, the Qualifier "Pages" disappears and I have to add it back in, which brings up a blank field for the value. If I add "1: [i] l + 664; 2: [i]-xiv + 245" in that field I cannot save the record with the same value in both the Property and the Qualifier fields. If I delete the value from the Property field and keep it in the Qualifier field, I still cannot save the record. The only way I can save the record is by retaining "909" as the value in the Property field and adding "1: [i] l + 664; 2: [i]-xiv + 245" in the "page(s)" Qualifier field. This creates a puzzling statement for anyone who knows anything about descriptive bibliography.

I see that you have now added this under the "Physical description" property. It would never have occurred to me to do so. Again, I think you need to specify what those numbers ("1: [i] l + 664; 2: [i]-xiv + 245") refer to: pages, leaves, folios?--Charles Faulhaber (talk) 07:51, 9 November 2021 (CET)

I am now trying to add a record for the Berkeley Library. In order to do that I need to amplify the record for the parent institution, the University of California Berkeley (Q369705). I want to add the exact address. The record currently shows "Place of home address (P83)." However, P83 seems to be limited to persons, according to the Description and Alias: "use for geographic places like towns and villages, specify address with P208 Resident of Place of residence"

Is it also to be used for Organizations?

When I use P208 for the exact address (in this case just the postal code), I make a new statement and try to insert "94720-6000" as the value, but FactGrid apparently expects a Q#, not a string. This seems to imply that I have to create a Q# for the postal code "94720-6000". I note that this is "data type item" in FactGrid, but the corresponding data type in Wikidata is "monolingual text". --Charles Faulhaber (talk) 07:51, 9 November 2021 (CET)

As to items versus properties:

This is a question of terminology. Is an "item" always a Q#?

I note that in the P# records, the data type is given as "Item," e.g.:

Address (P208): full street address where subject or organization is located. Include building number, city/locality, post code, but not country. Data type: Item


We can have properties for anything, but it has its disadvantages. If you create three properties, one for "number of pages", one for "number of leaves (German Blatt)" and one for number of (printed) sheets, then you must - when running a SPARQL query - know how the sort of statement is made in all particular cases. If library x is giving page numbers and library y gives you sheets then you better ask for all three sorts of statements to find out.


The better option in this case is a single Property for the number, whatever it is (pages, leaves or sheets), with a unit behind it. You can then be sure that you ask the right question, as there is only one question for all cases. The exact answer comes with the units or qualifiers.
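To make the argument above concrete, here is a small Python sketch of the two modelling choices. It is only a toy model of the idea, not how FactGrid or SPARQL work internally. The value 909 pages for Q385363 comes from this discussion; the other two records and the helper name `extent` are invented for illustration.

```python
# Toy model: one "number" property (P107) whose unit says what is counted.
# Q385363 / 909 pages is taken from the discussion; the other rows are invented.
statements = [
    {"item": "Q385363", "property": "P107", "amount": 909, "unit": "pages"},
    {"item": "EXAMPLE_B", "property": "P107", "amount": 120, "unit": "leaves"},
    {"item": "EXAMPLE_C", "property": "P107", "amount": 30, "unit": "sheets"},
]


def extent(statements, item):
    """Return (amount, unit) for an item by asking for one property only."""
    for st in statements:
        if st["item"] == item and st["property"] == "P107":
            return st["amount"], st["unit"]
    return None


# With three separate properties the caller would have to try each one in turn,
# because it cannot know in advance which one a given library filled in.
for item in ("Q385363", "EXAMPLE_B", "EXAMPLE_C"):
    print(item, extent(statements, item))
```

The point of the sketch is that the question stays the same for every record; only the unit in the answer changes.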

I tried this. See above--Charles Faulhaber (talk) 07:51, 9 November 2021 (CET)

Creating items is easy: use the New Item link (on the menu) or, preferable, a batch that creates an item in all the languages you need with all the statements that you are likely to repeat on every new item in your present course of work. You find some batch fragments in the menu. I will help you to create special batches for your team - these things are useful as they speed up editing in more than one language. We should have a course on this. (I also created a help section - see: Help:How do I feed data into FactGrid?) --Olaf Simons (talk) 09:21, 8 November 2021 (CET)

Dear Charles, I am the one who shifted the "1: [i] l + 664; 2: [i]-xiv + 245" to physical description as that is where we have these remarks in all the other data sets. I regret I cannot see where the editing does not work, and where you get stuck for hours. We should have a practical session in which I just look over your shoulder to see what you are doing. Share screen is easy these days. Odd string inputs... I sometimes received error codes, but rarely. Usually I make a mistake about blanks in the beginning or end. But my experience with the site is that it works incredibly smoothly. --Olaf Simons (talk) 08:36, 9 November 2021 (CET)
It is, in any case, not possible to put "1: [i] l + 664; 2: [i]-xiv + 245" on the total number of pages property since the particular property is not a string property but a property calling for a number, a quantity, as defined in the "Property type" (which you have to pick when creating a property). The cool thing with quantity properties is that you can sort them by number, if ever you want to know the biggest of all your codices. They are designed for statistics and computations.
Address P208 wants an item to link to. The good thing about the item is that it will have geo-coordinates, so that you can eventually ask the machine to show you all merchants of Gotha on a map. The search will ask for all "Merchants" of Gotha, it will ask for their addresses and it will come with a command to go into the address items to get the coordinates from there. The advantage here is that you do not have to repeat coordinates all the time. They are on the items we are linking to. If you do not know how a property is working, go to it, then ask "what links here" and look at the Items that are used. This is Bruno Belhoste's present project "Paris to download"; you can take a look at the SPARQL code to see how it is done. It's a bright thing to link to informed items. We do not want our users to put geo coordinates on every statement. We want the machine to know what is meant by "Paris" or by "Paris, rue de l'Université, no. 14".
We should really have that session in which I see your editing and in which I can tell you the advantages of doing it differently. --Olaf Simons (talk) 08:51, 9 November 2021 (CET)
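As a rough illustration of the address lookup described above, the following Python sketch follows the link from a person to an address item and reads the coordinates there. It is a toy model only: every id, label and coordinate is an invented placeholder, and the function `map_points` is hypothetical, not part of FactGrid.

```python
# Toy model: address items carry the coordinates once; persons only link to them.
# All ids, labels and coordinates are invented placeholders, not FactGrid data.
address_items = {
    "Q_ADDR_1": {"label": "Example street 1", "coordinates": (1.0, 2.0)},
    "Q_ADDR_2": {"label": "Example street 2", "coordinates": (3.0, 4.0)},
}

persons = [
    {"label": "Merchant A", "occupation": "merchant", "address": "Q_ADDR_1"},
    {"label": "Merchant B", "occupation": "merchant", "address": "Q_ADDR_2"},
    {"label": "Printer C", "occupation": "printer", "address": "Q_ADDR_1"},
]


def map_points(persons, address_items, occupation):
    """Collect (person label, coordinates) by following each address link."""
    points = []
    for person in persons:
        if person["occupation"] != occupation:
            continue
        address = address_items[person["address"]]  # step into the linked item
        points.append((person["label"], address["coordinates"]))
    return points


print(map_points(persons, address_items, "merchant"))
```

Because the coordinates live on the address item, correcting them once updates every person who links to that address.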

New Properties

As we discussed, we still need some new P# in order to facilitate modeling of PhiloBiblon entities:

  1. New P# for external identifiers to match existing P156 (online presence) and P634 (DOI): URN = Uniform Resource Name (Q399023) and URI = Uniform Resource Identifier (Q399000).
  2. New P# for ISSN = International Standard Serial Number (Q394868) and ISSNe (International Standard Serial Number, electronic) = eISSN (Q46339674). Cfr. ISBN-10 (P605)
  3. New P# "Honorific" for form of address. Takes Q#, e.g., Fr. = Friar, Lord, Señor. Cfr. Wikidata honorific (Q1326966), Subclass (P3) of Epithet. Cfr. BETA bioid 1259
  4. New P# "Religious Order"; cfr. Wikidata P611 (Religious order)
  5. New P# to indicate relation between two physical copies of a book. Qualified by P708 Kind of relation. Indicates the relationship between a copy of a print publication and another copy not necessarily of the same publication. I.e. two unrelated books could be bound together as a “bound with”
  6. New P# "Script style" (cf. Wikidata P9302)
  7. New P# "Typeface/font" (cf. Wikidata P2739)
  8. New P# "Watermark" or "Watermark Type." Would be instance of FactGrid Properties for Core Bibliographical Data, Cfr. Wikidata Watermark (Q43065). See Watermarks in BETA manid 2098.--Charles Faulhaber (talk) 00:08, 25 May 2022 (CEST)

Hesitated with number 5 and do not know whether No. 1 is already doing the job. --Olaf Simons (talk) 10:37, 25 May 2022 (CEST)

  1. I created the equivalent to Wikidata's URN formatter - hope that was the thing you need. Property:P744
  2. ISSN Property:P743
  3. Honorific prefix Property:P745
  4. Religious Order Property:P746
  5. ...is a tricky one. I saw volumes with 50 dissertations and tracts - we would end in a mess if we always stated the neighbours on all the neighbours. It would be better to have a property that just links upwards to the collective volume, something like "bound in collective volume" (German is nicer: "Beiband in") - then however we could just as well state the includes Property:P9 on the collective volume. Consider with me what's best.
  6. Script style Property:P747
  7. Typeface Property:P748
  8. Watermark Property:P749
New properties to serve as Head of Column both for manuscripts as well as print publications. These would be parallel to Page(s) (P54) and Folio (P100)
  1. Property:P755 for Column(s). Cfr. Column(s) (Q394315) and Wikidata column (P3903)
  2. Property:P756 for Fly leaf/leaves. Cfr. Fly leaf / leaves (Q395036).
  3. Property:P757 for Plate(s), i.e., illustrations printed separately and inserted into a book. cfr. Plate (Q394145).

--Charles Faulhaber (talk) 02:00, 31 May 2022 (CEST)

Height (P59) and Width (P60) take data type Quantity. However, it appears that Quantity only allows integers. We have cases where, e.g., the width is indicated "210 / 215", i.e., it varies between 210 and 215 mm. How do we handle that? The option of putting in two separate values, 210 and 215, and qualifying them in some way (e.g. "greater than," "less than") doesn't capture the reality.--Charles Faulhaber (talk) 23:04, 7 June 2022 (CEST)
The integer solution is generally preferable, because the database actually recognises these numbers as numbers - ready for computation and for correct listing. String input means that 1100 will come before 3 (you will need 003 in that case to get the correct listing). So better go for numbers rather than string and state ranges with maximum and minimum qualifiers.
Things are slightly different with the properties I have just created, they are references, often with input such as (page) 1-56. --Olaf Simons (talk) 08:30, 8 June 2022 (CEST)
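The sorting point above can be checked quickly in Python. The sketch also shows one possible way to split a PhiloBiblon range such as "210 / 215" into a minimum and a maximum. The helper `parse_range` is hypothetical, and how the two numbers are finally attached to the Width statement is left open here.

```python
# Strings sort character by character, so "1100" comes before "3".
as_strings = ["1100", "3", "215", "210"]
as_numbers = [1100, 3, 215, 210]

print(sorted(as_strings))  # ['1100', '210', '215', '3']
print(sorted(as_numbers))  # [3, 210, 215, 1100]


def parse_range(value: str) -> tuple[int, int]:
    """Split a value like '210 / 215' into (minimum, maximum).

    A single number such as '210' is returned as (210, 210).
    """
    numbers = [int(part) for part in value.split("/")]
    return min(numbers), max(numbers)


print(parse_range("210 / 215"))  # (210, 215)
```

Keeping the two bounds as numbers means they can still be sorted and compared, which a string field holding "210 / 215" would not allow.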
Thank you for Column (P755), Fly leaf/leaves (P756) and Plate (P757)! "So better go for numbers rather than string and state ranges with maximum and minimum qualifiers." How would you do this for Width (P60) in [RAH Cod 59]? The minimum for the width of the text page is 210 mm and the maximum 215.--Charles Faulhaber (talk) 07:28, 9 June 2022 (CEST)

New property [label] GKW-ID [description] Gesamtkatalog der Wiegendrucke. Data type: External identifier. Parallel to ISTC-ID for Incunabula Short Title Catalog (P645): [Gesamtkatalog der Wiegendrucke ]

New property for CSV column head: [label] Condition / bibliography [description] "physical state of manuscript or edition, e.g. deteriorated". Data type string. Not to be confused with Physical description (P577). Typical phrases would be "in good condition," "wanting ff. 1," "water stains," "wormholed"

New property for CSV column head [label] state / bibliography [description] "textual changes in a edition due to correction of errors either before or after issue" Data type: string--Charles Faulhaber (talk) 07:33, 16 June 2022 (CEST)

Dear Charles, I just created the new properties Property:P766 Minimum and Property:P767 Maximum to be used as qualifiers.
If a book is lacking something we have Property:P706 ("without"). The Condition property is Property:P158 - "Preservation" - I do not know whether condition would be the better label, as the English condition can have such a lot of meanings (like in logic). You are the native speaker and I'll accept your wording. (I added Condition, Conservation, State as aliases). The Properties are item-properties, so that we can get multilingual statements. You can use note in order to add details.
I am not quite sure what to make of "state / bibliography [description]"
The GKW property is Property:P768. Best, --Olaf Simons (talk) 12:45, 16 June 2022 (CEST)

[DEPRECATED in favor of Name history Property:P34] BETA*BIOGRAPHY*NAME_CLASS. Requires new P* “Name Class” or “Type of name.” with Datatype ITEM. The Items are: Native name (Q424844), Translated (Q399910) name, Pseudonym (Q14), Name variant (Q28006).--Charles Faulhaber (talk) 20:56, 5 July 2022 (CEST)


Property:P800 BETA*MAN*BINDING. Requires a “Bookbinding” P#. Datatype: String. See Wikidata book cover (Q1125338).

Property:P778 BETA*MAN*FEATURE_CLASS. Requires a “* Physical feature” P# that takes an Item qualified by a Note (P71). Corresponds to Physical description (P577), which takes a String.

Example: Physical feature P# > Catchword (Q394122) qualified by Note (P71) “horizontal centered below col. B”

Physical feature Property:P778, which takes an item, resolves this problem. --Charles Faulhaber (talk) 01:26, 19 October 2022 (CEST)

Property:P801 BETA*MAN*GRAPHIC_CLASS. Requires a P# “Pictorial Element” that corresponds to Pictorial Element (Q405221). Datatype: Item, qualified by Note (P71).

Property:P790 BETA*MAN*MUSIC_CLASS. Requires a P# “Musical notation” Datatype: Item, qualified by a Note (P71).

[DEPRECATED in favor of properties for each type of call number] BETA*MAN*RELATED_LIBCALLNOCLASS. Requires a P# “* Inventory Position”. Datatype: Item, qualified by Note (P71) for the shelfmark string. Parallel to Inventory position (P30) Datatype String. This would allow a variety of different shelfmark types (Q#), all of which are distinguished by PhiloBiblon: Inventory #, Registry #, alternate shelfmark, electronic shelfmark.--Charles Faulhaber (talk) 23:34, 25 August 2022 (CEST)


BETA*MAN*RELATED_LIBEVENTCURRENCY. Requires a P# “ISO 4217 code” identifier for a currency per ISO 4217. Datatype: External identifier. See Wikidata P498. Code for $ = USD; Swiss franc - CHF; French franc = FRF (before euro), Deutschmark = DEM (before euro). Updated --Charles Faulhaber (talk) 23:17, 28 July 2022 (CEST)


Property:P781 BETA*UNIFORM_TITLE*METRICS_LENGTH Wikidata: quantitative metrical pattern (P2552) Datatype: String.

In PhiloBiblon the formulas indicate the number of stanzas in a poem and the number of lines in a stanza.

2 x 4, 2 x 3 for a sonnet = two quatrains followed by two tercets
4 x 7 = 7 stanzas of 4 lines.
1 x 4, 4 x 12 = 1 stanza of 4 lines and 4 stanzas of 12 lines
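As a quick check on such formulas, the following Python sketch computes the total number of lines a formula implies. It assumes only that each comma-separated part has the form "a x b"; the function name `total_lines` is made up for this note. The total does not depend on which of the two factors is read as the stanza count.

```python
import re


def total_lines(formula: str) -> int:
    """Sum a*b over every 'a x b' part of a metrics formula."""
    total = 0
    for part in formula.split(","):
        match = re.fullmatch(r"\s*(\d+)\s*x\s*(\d+)\s*", part)
        if match is None:
            raise ValueError(f"not of the form 'a x b': {part!r}")
        total += int(match.group(1)) * int(match.group(2))
    return total


print(total_lines("2 x 4, 2 x 3"))   # 14, the length of a sonnet
print(total_lines("4 x 7"))          # 28
print(total_lines("1 x 4, 4 x 12"))  # 52
```

A check like this could flag malformed strings before a mass upload, since the property itself takes a plain String and will accept anything.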

Property:P783 BETA*UNIFORM_TITLE*METRICS_RHYME. Requires new P* “Rhyme scheme” with Datatype: String.

In PhiloBiblon this indicates the rhyme scheme(s) of a poetic genre:

Petrarchan sonnet: ABBA ABBA CDC DCD
Elizabethan sonnet: ABAB CDCD EFEF GG
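Since the rhyme scheme is stored as a plain string, simple facts can be derived from it. The Python sketch below assumes that stanzas are separated by spaces and that each letter stands for one line, as in the two examples above; the function names are hypothetical.

```python
def stanza_lengths(scheme: str) -> list[int]:
    """Number of lines in each space-separated stanza of a rhyme scheme."""
    return [len(stanza) for stanza in scheme.split()]


def distinct_rhymes(scheme: str) -> int:
    """Number of different rhyme letters used in the whole scheme."""
    return len(set(scheme.replace(" ", "")))


petrarchan = "ABBA ABBA CDC DCD"
elizabethan = "ABAB CDCD EFEF GG"

print(stanza_lengths(petrarchan), sum(stanza_lengths(petrarchan)))    # [4, 4, 3, 3] 14
print(stanza_lengths(elizabethan), sum(stanza_lengths(elizabethan)))  # [4, 4, 4, 2] 14
print(distinct_rhymes(petrarchan), distinct_rhymes(elizabethan))      # 4 7
```

The stanza lengths obtained this way could be compared with the metrics formula of the same text as a consistency check.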

FactGrid “Rhyme scheme” (Q419569) < Wikidata “Rhyme scheme” (Q3264652)--Charles Faulhaber (talk) 08:03, 7 July 2022 (CEST)


New property for Organisations: [label] telephone number [description] to state the telephone number(s) of an organization [alias] phone number.

PhiloBiblon has over 500 libraries with phone numbers. This would correspond to Wikidata Phone number P1329 --Charles Faulhaber (talk) 00:04, 15 November 2022 (CET)

Shelf-mark alternatives

ID numbers for major libraries and union catalogs

Just as we have an ID property for the Biblioteca Nacional de España Property:P652, we also need them for other national libraries. Just as we have one for the English Short Title Catalog (ESTC ID Property:P585), we also need them for other union catalogues, such as the following:

  • Bibliothèque nationale de France. Label: BnF-ID. Description: identifier from the authority file of the Bibliothèque nationale de France. Alias: National library of France ID | Bnf ID
  • British Library: Label: BL-ID. Description: identifier from the authority file of the British Library. Alias: National library of the United Kingdom ID | BL ID
  • Library of Congress: Label: LC-ID. Description: identifier from the authority file of the Library of Congress. Alias: National library of the United States ID | LCCN | LC ID
  • Biblioteca Nacional de Portugal: Label: BNP-ID. Description: identifier from the authority file of the Biblioteca Nacional de Portugal. Alias: National library of Portugal ID | BNP ID
  • Österreichische Nationalbibliothek: Label: ONB-ID: Description: identifier from the authority file of the Österreichische Nationalbibliothek. Alias: National library of Austria ID | ONB ID | ÖNB ID
  • Universal Short Title Catalog: Label: USTC-ID. Description: identifier from the authority file of the Universal Short Title union catalog. Alias: Universal Short Title Catalog ID | USTC ID
  • Bibliographical Patrimony of Spain union catalog: Label: CCPBE-ID. Description: identifier from the authority file of the union catalog of the Bibliographical Patrimony of Spain. Alias: Union Catalog of Spain ID | CCPBE ID | CCPB ID
Most of the existing properties for these identifiers use a hyphen (e.g. GKW-id Property:P768). I propose that the new ones should follow the same pattern. However, since users will search for them with and without the hyphen, non-hyphenated forms should be used as aliases, as indicated.
BTW, Property:P235 and Property:P595 should be merged. They both refer to WorldCat - OCLC and both have the corresponding Wikidata property P243.

PhiloBiblon people (and organisations)

Here is an agenda

  1. Create a list of all persons appearing in PhiloBiblon records (from authors to various associated persons) - on a Google Spreadsheet.
  2. Check the list with OpenRefine against Wikidata in order to get a broad range of external identifiers
  3. Check the list against FactGrid in order to avoid duplicates
  4. Create a column of descriptions if possible with the standard * Date and place of birth, + Date and place of death, short bio including a geographic reference
  5. Feed the list into FactGrid with P2=Q7, P154=Q17/18, P131 PhiloBiblon project statements and with Wikidata references wherever possible
  6. Create the respective name statements
  7. Add information as available as time goes by --Olaf Simons (talk) 16:05, 26 November 2022 (CET)
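Steps 3 and 5 of the agenda above could be sketched in Python roughly as follows. This is only an outline under several assumptions: that the batch is written in QuickStatements V1 syntax (tab-separated CREATE / LAST lines), that duplicates are detected by comparing normalised labels, and that the PhiloBiblon project item for P131 is passed in by the caller. `Q_PROJECT` is a placeholder, not a real item number, and the person names are invented.

```python
def normalise(name: str) -> str:
    """Lower-case a name and collapse whitespace for a rough duplicate check."""
    return " ".join(name.casefold().split())


def person_commands(label: str, gender_qid: str, project_qid: str) -> list[str]:
    """QuickStatements V1 lines for one new person (assumed batch format)."""
    return [
        "CREATE",
        f'LAST\tLen\t"{label}"',      # English label
        "LAST\tP2\tQ7",               # P2=Q7, as given in step 5
        f"LAST\tP154\t{gender_qid}",  # P154=Q17 or Q18, as given in step 5
        f"LAST\tP131\t{project_qid}", # project statement; item id supplied by caller
    ]


def build_batch(candidates, existing_labels, project_qid):
    """Skip names already present (step 3), emit commands for the rest (step 5)."""
    known = {normalise(label) for label in existing_labels}
    lines = []
    for label, gender_qid in candidates:
        if normalise(label) in known:
            continue
        known.add(normalise(label))
        lines.extend(person_commands(label, gender_qid, project_qid))
    return lines


# Invented sample data; Q_PROJECT is a placeholder for the real project item.
batch = build_batch(
    [("Example Person A", "Q18"), ("example  person b", "Q17")],
    existing_labels={"Example Person B"},
    project_qid="Q_PROJECT",
)
print("\n".join(batch))
```

A label comparison like this only catches exact matches after normalisation; reconciliation with OpenRefine (step 2) would still be needed for spelling variants.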