Difference between revisions of "FactGrid:Data modeling"

From FactGrid

Revision as of 14:22, 15 May 2020

The problem of on-and-off existence

Gotha's Lodge "Ernst zum Compaß" (Q10575) is a typical example: it has one starting point, begin date (P49), but actually several new starts and more than one end, end date (P50). People found a lodge; the lodge is closed (in the events of the French Revolution); it re-opens in the national wave of the Napoleonic wars; it is closed again in 1935 and re-opened in 1949, or later, or never.

Is it always the same lodge? What do we do with changing names? The German National Library creates loads of items under ever new names, and you have to make sure to get the respective successions right. The alternative: if those involved wanted to state that they are actually the same people (even if that is only a fiction of continuity), let them.

The present solution: we have started to use history (P137) and active phase (Q94446):

See Lodge "Ernst zum Compaß", Gotha (Q10575).

We can now use qualifiers on each active phase in order to state its respective beginning and end. Is this a good option? How does it work in SPARQL searches if you want to produce a timeline? How does the option agree with the general P49/P50 use on items?

--Timeline is fine. Try for example Gotha's Lodge between 1760 and 1830. No problem either with the general P49/P50 (you can use OPTIONAL for the history) --Bruno Belhoste (talk) 18:46, 14 May 2020 (CEST)

Beautiful. You know how to script these things. (And you should write a blog post one of these days about practical tips, like how to get OpenRefine connected or how best to learn scripting these things...) --22:18, 14 May 2020 (CEST)
Just tested it with all the lodges that have these markers. Turns out that the search does not get the continuities. I added labels to show that.
The visualisation packs the histories without understanding the continuities. --Olaf Simons (talk) 07:24, 15 May 2020 (CEST)
Yes, there is a problem. I don't see any way to solve it: the Timeline module of the query service lays out the data so as to maximize compactness, and there is no way to constrain it to align data by any criterion. The only solution is to use another timeline module with more features. --Bruno Belhoste (talk) 12:12, 15 May 2020 (CEST)
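
As a rough sketch of how the active phases might be read out in SPARQL: the query below assumes that each phase is stored on the lodge as a history (P137) statement with the value active phase (Q94446), that begin date (P49) and end date (P50) sit on that statement as qualifiers, and that the FactGrid query service predefines the usual Wikibase prefixes p:, ps: and pq:. None of this is confirmed in the discussion above, so it is an illustration, not the query that was actually used there.

```sparql
# Active phases of the Gotha lodge (Q10575), one row per phase.
# Assumption: phases are P137 statements with value Q94446,
# qualified by P49 (begin date) and P50 (end date).
SELECT ?phaseStart ?phaseEnd WHERE {
  wd:Q10575 p:P137 ?statement .            # every "history" statement on the lodge
  ?statement ps:P137 wd:Q94446 .           # keep only those marking an active phase
  OPTIONAL { ?statement pq:P49 ?phaseStart . }  # begin of this phase, if recorded
  OPTIONAL { ?statement pq:P50 ?phaseEnd . }    # end of this phase, if recorded
}
ORDER BY ?phaseStart
```

Because both qualifiers are OPTIONAL, a phase with an unknown end still appears in the result. Each row describes one phase on its own; nothing in the result ties the phases of one lodge together, which is consistent with the continuity problem noted above.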

Changing names

An organisation can run through dozens of name changes over the years, e.g. an early modern publishing house with name changes whenever a father hands down the business to son, wife, son-in-law etc.

We are presently using naming (P57) for a history of naming, but that is not ideal since usual searches will not get to the right names at the right moment.

It is a major question. I am not convinced by using naming (P57), for the reason you give. In my view, as historians, we have to give information which is as close as possible to the sources and to the actors themselves. It means that we have to name organizations as they were named in the sources. It is possible that the same organization has two or three different names at the same time. We have to choose one of its names as the reference name and put the other ones as aliases (and maybe also as objects of the property P57, naming). But, in my view, if the organization changed its name at certain times, we have to create different items. Let me take the example of the French Academy of Sciences. It was an Old Regime institution called Académie royale des sciences (Q153559); during the Revolution it became the First Class of the National Institute (Q153578); and finally, in 1816, it became the Académie des sciences again (Q153579). Is it the same institution? Maybe yes, maybe no, but from the historian's point of view it is much better, I think, to consider them as different institutions, at least as a basic statement. The condition is that we link these different institutions into a chain, which will be the "atemporal institution". I propose to use continuation of (P6) and continued by (P7) to create this chain. Suppose you want to consider in a query not the Old Regime Academy (Q153559) but the Academy of Sciences in the longue durée. You can create the variable ?Academy in the WHERE clause with the triple: wd:Q153559 wdt:P7* ?Academy. If you use ?Academy in another clause you can retrieve information concerning the Academy of Sciences from its foundation in the 17th century to today; if you want to consider only the First Class of the Institute and the modern Academy of Sciences you can create the variable ?Academy in the WHERE clause with the triple: wd:Q153578 wdt:P7* ?Academy, and so on. Try https://database.factgrid.de/query/#%23Academy%20of%20Sciences%20%28Paris%29%0ASELECT%20%3FAcademy%20%3FAcademyLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0Awd%3AQ153559%20wdt%3AP7%2a%20%3FAcademy.%0A%7D and https://database.factgrid.de/query/#%23Academy%20of%20Sciences%20%28Paris%29%0ASELECT%20%3FAcademy%20%3FAcademyLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0Awd%3AQ153578%20wdt%3AP7%2a%20%3FAcademy.%0A%7D. --Bruno Belhoste (talk) 14:22, 15 May 2020 (CEST)
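
Written out in full, the chain query described above looks roughly as follows. The property path wdt:P7* follows continued by (P7) zero or more times, so the result contains the starting item itself plus every later link in the chain. The item and property numbers are the ones named in the comment; the label service call is the standard Wikibase one.

```sparql
# Academy of Sciences (Paris): the starting item and all its successors
SELECT ?Academy ?AcademyLabel WHERE {
  # Start at the Old Regime academy and follow "continued by" (P7)
  # zero or more steps forward.
  wd:Q153559 wdt:P7* ?Academy .
  # Fetch a readable label for each item found.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

Replacing wd:Q153559 with wd:Q153578 restricts the result to the First Class of the Institute and its successors. Walking the chain backwards would presumably work the same way with continuation of (P6), assuming P6 statements are maintained alongside P7.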


Stemmata

[Figure: Ernst Howald and Henry E. Sigerist. Antonii Musa De herba vettonica ... Leipzig 1927]

We created two properties for this. Preceding in stemma (P233) names the object that is genetically earlier: a book edition, a manuscript or any other thing. Connection in stemma (P234) comes as the qualifier and states on what basis the item can be seen as a descendant. You might, for instance, link a translation to the edition that gave the original text.

The organisation is top-down chronological (the guide lines in the picture above are not that beautiful, but dates on the y-axis would be cool).

Objects can have multiple connections to earlier items (a medieval scribe could use two books to create a new version of the text).

It would be cool if the P234 information became available on the lines that are connecting items.

One of the problems here, too, is: how do I select a family of items?
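
One possible answer, offered as a sketch only: start from any single member of the stemma and follow preceding in stemma (P233) in both directions, any number of steps. The placeholder wd:Q_START is hypothetical and has to be replaced by the Q-number of a real item; the query also assumes that every link in the stemma is recorded as a P233 statement on the later item.

```sparql
# All items connected to one starting item through P233, upstream or downstream
SELECT DISTINCT ?member ?memberLabel WHERE {
  VALUES ?start { wd:Q_START }   # hypothetical placeholder, not a real item
  # wdt:P233  leads from an item to its predecessor,
  # ^wdt:P233 leads from an item to its descendants;
  # the * repeats either step zero or more times.
  ?start (wdt:P233|^wdt:P233)* ?member .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

Using only wdt:P233* would return the ancestors of the starting item, and only ^wdt:P233* its descendants. The P234 qualifier is not visible through wdt: and would need the statement form (p:P233 with pq:P234) instead.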

The situation at a particular point in time

Think of a house: tenants are moving in and out. We can model that with resident (P239) and P49/P50 qualifiers. You are now interested in the situation at a point in history: who were the tenants on March 3, 1848?

If we can get this done for a house like Gotha, Hauptmarkt 10 (old house no. 1146) (Q14572) we might be able to show a city at a point in time.
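
A minimal sketch of such a point-in-time query, under several assumptions: the resident (P239) statements sit on the house item and point to the person, begin date (P49) and end date (P50) are qualifiers on those statements, and the query service predefines the prefixes p:, ps:, pq: and xsd: as Wikibase installations usually do.

```sparql
# Residents of Gotha, Hauptmarkt 10 (Q14572) on 3 March 1848
SELECT ?resident ?residentLabel ?movedIn ?movedOut WHERE {
  wd:Q14572 p:P239 ?statement .
  ?statement ps:P239 ?resident .
  OPTIONAL { ?statement pq:P49 ?movedIn . }    # begin of residence, if recorded
  OPTIONAL { ?statement pq:P50 ?movedOut . }   # end of residence, if recorded
  # Keep a statement unless its dates rule out the day in question.
  FILTER(!BOUND(?movedIn)  || ?movedIn  <= "1848-03-03T00:00:00Z"^^xsd:dateTime)
  FILTER(!BOUND(?movedOut) || ?movedOut >= "1848-03-03T00:00:00Z"^^xsd:dateTime)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

Treating a missing date as open-ended is a design choice: it keeps residents whose moving-out date is unknown, but it also lets through statements with no dates at all. Dates entered with year precision only are compared by their stored timestamp, so results near the boundaries of a residence should be checked by hand.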

Organisational ties

Originally we thought we should use the Wikibase advantage of being able to create any imaginable statement to do just that. A lodge has a "mother lodge", so do create a link to that (with the respective daughter-lodge property). It can choose an umbrella organisation, it will accept an obedience (adhering to a system) etc.

The problems with these specific properties are:

  • a neighbouring organisation might have its own nomenclature for pretty much the same dependencies, or it can have the same terms but mean something different by them.
  • you need to know the specific terminology in order to run a query.

The general solution could be a standard property like "organisational ties", with qualifiers specifying the particular tie.

A more specific solution can lie in between these poles: We create a pattern of general types of these ties, so that we can then ask broad questions in order to see different networks.

  1. is owned by / owns
  2. parent organisation / subsidiaries
  3. received the patent from / granted patents to
  4. represented by / representing
  5. member of / members
  6. partner organisations
  7. organisationally supported by / supporting with organisational help
  8. financed by / finances
  9. recognised by / recognises
  10. next hierarchical level above / next hierarchical level underneath

One would now use qualifiers to give the exact terminologies. Which of these are redundant? Which of them are missing?

Is there a better option?