FactGrid:Data modeling

Revision as of 12:12, 15 May 2020 by Bruno Belhoste (talk | contribs) (The Problem of on-and off existence)

The Problem of on-and off existence

Gotha's Lodge Item:Q10575 is a typical example: it has a starting point (Property:P49) but in fact several new starts and more than one end (Property:P50). People found a lodge; the lodge is closed (in the events of the French Revolution); it reopens in the national wave of the Napoleonic wars; it is closed again in 1935 and reopened in 1949, or later, or never.

Is it always the same lodge? What do we do with changing names? The German National Library creates loads of items under ever new names, and you then have to make sure you get the respective successions right. The alternative: if those involved wanted to state that they are actually the same people (even if that is only a fiction of continuity), let them.

The present solution: We have started to use Property:P137 "History" and Item:Q94446 "active phase":

See Item:Q10575 Lodge "Ernst zum Compaß", Gotha

We can now use qualifiers on each active phase to state its own beginning and its own end. Is this a good option? How does it work in SPARQL searches if you want to produce a timeline? How does the option fit with the general P49/P50 use on items?
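The idea of one statement per active phase, each with its own begin and end qualifier, can be sketched outside Wikibase in a few lines of Python. This is only an illustration under assumptions: the class `ActivePhase`, the function `phases_overlapping` and the years in `example_phases` are made up for the example and are not the documented history of Q10575.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActivePhase:
    """One 'active phase' statement with begin and end qualifiers (years)."""
    begin: int
    end: Optional[int] = None  # None: no end recorded for this phase


def phases_overlapping(phases, start, stop):
    """Return the phases that overlap the closed interval [start, stop]."""
    return [
        p for p in phases
        if p.begin <= stop and (p.end is None or p.end >= start)
    ]


# Placeholder years for illustration only, not real data.
example_phases = [
    ActivePhase(1700, 1720),
    ActivePhase(1750, 1800),
    ActivePhase(1850, None),
]

print(phases_overlapping(example_phases, 1760, 1830))
```

A phase without an end qualifier is treated here as open-ended; whether a missing end should mean "still active" or "unknown" is a modelling decision this sketch does not settle.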

--Timeline is fine. Try for example Gotha's Lodge between 1760 and 1830. No problem either with the general P49/P50 (you can use OPTIONAL for the history) --Bruno Belhoste (talk) 18:46, 14 May 2020 (CEST)

Beautiful. You know how to script these things. (And you should write a blog post one of these days about practical tips, like how to get OpenRefine connected or how best to learn scripting these things...) --22:18, 14 May 2020 (CEST)
Just tested it with all the lodges that have these markers. Turns out that the search does not get the continuities. I added labels to show that.
The visualisation packs the histories without understanding the continuities. --Olaf Simons (talk) 07:24, 15 May 2020 (CEST)
Yes, there is a problem. I don't see any way to solve it: the Timeline module of the query service lays out the data so as to maximise compactness, and there is no way to constrain it to align data by any criterion. The only solution is to use another timeline module with more features.--Bruno Belhoste (talk) 12:12, 15 May 2020 (CEST)
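This does not fix the query service's Timeline view, but as a rough sketch of what a timeline module that respects continuities would need as input, the phases can be grouped into one row per item before drawing. The function name `rows_by_item` and the sample lodges are hypothetical.

```python
from collections import defaultdict


def rows_by_item(phases):
    """Group (item, begin, end) tuples into one timeline row per item."""
    rows = defaultdict(list)
    for item, begin, end in phases:
        rows[item].append((begin, end))
    # Sort the phases inside each row chronologically.
    return {item: sorted(spans) for item, spans in rows.items()}


# Hypothetical query results: one tuple per active phase.
phases = [
    ("Lodge A", 1750, 1800),
    ("Lodge B", 1760, 1790),
    ("Lodge A", 1700, 1720),
]

for item, spans in rows_by_item(phases).items():
    print(item, spans)
```

Each row then carries all phases of one item, so a renderer can draw them on a single line instead of packing them wherever space is free.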

Changing names

An organisation can run through dozens of name changes over the years, e.g. an early modern publishing house whose name changes whenever a father hands down the business to a son, wife, son-in-law etc.

We are presently using Property:P57 for a history of naming, but that is not ideal since usual searches will not get to the right names at the right moment.


Ernst Howald and Henry E. Sigerist. Antonii Musa De herba vettonica ... Leipzig 1927
We created two properties for this: Property:P233 names the object (a book edition, a manuscript or any other thing) that is genetically earlier. Property:P234 comes as the qualifier and states on what basis the later object can be seen as following from it. You might, for instance, link a translation to the edition that gave the original text.

The organisation is top-down chronological (the guide lines in the picture above are not that beautiful, but dates on the y-axis would be cool).

Objects can have multiple connections to earlier Items (a medieval scribe could use two books to create a new version of the text).

It would be cool if the P234 information became available on the lines that are connecting items.

One of the problems here, too, is: how do I select a family of items?
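One possible reading of "family" is everything that is connected through earlier-version links, in either direction. Under that assumption, selecting a family is a connected-component search over the links. The sketch below uses a plain breadth-first traversal; `family_of` and the sample items are hypothetical names, and the link pairs stand in for P233 statements.

```python
from collections import defaultdict, deque


def family_of(start, earlier_version_links):
    """Collect every item connected to `start` via (later, earlier) links.

    Links are followed in both directions, so ancestors, descendants
    and siblings all end up in the same family.
    """
    neighbours = defaultdict(set)
    for later, earlier in earlier_version_links:
        neighbours[later].add(earlier)
        neighbours[earlier].add(later)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical (later, earlier) pairs; an item may have several earlier versions.
links = [
    ("translation", "edition"),
    ("edition", "manuscript 1"),
    ("edition", "manuscript 2"),
    ("unrelated copy", "unrelated source"),
]

print(sorted(family_of("manuscript 1", links)))
```

Because an item can have several earlier versions, the structure is a graph rather than a tree, which is why the search tracks visited items.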

The situation at a particular point in time

Think of a house: tenants are moving in and out. We can model that with Property:P239 "resident" and P49/P50 qualifiers. You are now interested in the situation at a point in history: who were the tenants on March 3, 1848?

If we can get this done for a house like Item:Q14572 we might be able to show a city at a point in time.
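The point-in-time question boils down to an interval test: a resident counts if the begin qualifier is on or before the date and the end qualifier is on or after it, or absent. A minimal sketch, assuming each "resident" statement is reduced to a (name, begin, end) tuple; the function `residents_on` and the tenants are invented for the example.

```python
from datetime import date


def residents_on(statements, day):
    """Return the names of residents whose stay covers `day`."""
    return sorted(
        name
        for name, begin, end in statements
        if begin <= day and (end is None or day <= end)
    )


# Hypothetical resident statements with begin/end qualifiers.
statements = [
    ("Tenant A", date(1840, 1, 1), date(1847, 12, 31)),
    ("Tenant B", date(1846, 5, 1), date(1850, 6, 30)),
    ("Tenant C", date(1848, 1, 15), None),  # no end date recorded
]

print(residents_on(statements, date(1848, 3, 3)))
```

Real statements will often have only year precision or a missing begin date, which this sketch does not handle.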

Organisational ties

Originally we thought we should use the Wikibase advantage of being able to create any imaginable statement to do just that. A lodge has a "Mother lodge", so create a link to that (with the respective daughter-lodge property). It can choose an umbrella organisation, it will accept an obedience (adhering to a system), etc.

The problems with these specific properties:

  • A neighbouring organisation might have its own nomenclature for pretty much the same dependencies, or it can use the same terms but mean something different by them.
  • You need to know the specific terminology in order to run a query.

The general solution could be a single standard property like "organisational ties", with qualifiers that specify the particular tie.

A more specific solution can lie in between these poles: We create a pattern of general types of these ties, so that we can then ask broad questions in order to see different networks.

  1. is owned by / owns
  2. parent organisation / subsidiaries
  3. received the patent from / granted patents to
  4. represented by / representing
  5. member of / members
  6. partner organisations
  7. organisationally supported by / supporting with organisational help
  8. financed by / finances
  9. recognised by / recognises
  10. next hierarchical level above / next hierarchical level underneath

One would now use qualifiers to give the exact terminologies. Which of these are redundant? Which of them are missing?
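To illustrate the in-between option, here is a small sketch in which each tie carries one general type from the list above plus the organisation's own term as a qualifier. The class `Tie`, the function `ties_of_type` and all sample organisations and term assignments are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tie:
    source: str
    general_type: str   # one of the broad tie types
    target: str
    specific_term: str  # the organisation's own wording, stored as a qualifier


def ties_of_type(ties, general_type):
    """Broad query: all ties of one general type, whatever they are called locally."""
    return [t for t in ties if t.general_type == general_type]


# Hypothetical statements; the mapping of terms to types is an assumption.
ties = [
    Tie("Lodge X", "parent organisation", "Lodge Y", "Mother lodge"),
    Tie("Publisher P", "parent organisation", "Company C", "holding company"),
    Tie("Lodge X", "member of", "Union U", "umbrella organisation"),
]

for tie in ties_of_type(ties, "parent organisation"):
    print(tie.source, "->", tie.target, f"({tie.specific_term})")
```

The broad query does not need to know any local terminology, while the qualifier keeps the exact wording available for anyone who does.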

Is there a better option?