FactGrid:Troubleshooting: Difference between revisions

(45 intermediate revisions by 2 users not shown)
== A knows B / B knows A? ==
* [[Talk:Main Page|Construction Sites]]


We would like to offer the greatest variety of contact information.
== Advice: best practice data structures ==
=== Repeated information - avoid it or go for it? ===
We have some 7000 documents in the database. Each document has an author, most documents have recipients and most documents mention other persons.


* We currently present some 9,000 documents with information about senders and their respective recipients
* [https://database.factgrid.de/query/#SELECT%20%3FDokument%20%3FDokumentLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%3FDokument%20wdt%3AP2%20wd%3AQ10671.%0A%20%20%3FDokument%20wdt%3AP21%20wd%3AQ133.%0A%7D 375 documents are written by Johann Joachim Christoph Bode].
* We are listing some 300 events with information about participants
* [https://database.factgrid.de/query/#SELECT%20%3FDokument%20%3FDokumentLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%3FDokument%20wdt%3AP2%20wd%3AQ10671.%0A%20%20%3FDokument%20wdt%3AP28%20wd%3AQ133.%0A%7D 1123 documents have Bode as first recipient]
* We could add specific contact information in the form of "A meets B" statements - we could just as well create items for any known meeting.
* [https://database.factgrid.de/query/#SELECT%20%3FDokument%20%3FDokumentLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%3FDokument%20wdt%3AP2%20wd%3AQ10671.%0A%20%20%3FDokument%20wdt%3AP33%20wd%3AQ133.%0A%7D 108 documents mention Bode]


How do we organise such information so that database queries can return contact information with the greatest ease? How do we avoid reciprocal statements ("Person A meets Person B" - do we have to repeat the statement as "Person B meets Person A", and if so, how do we keep the two statements consistent)?
Organising Bode's data set [[Item:Q133]]: should we list all these documents in three sections? It might be interesting to do this because this is the point at which we show that we have far more information on the man than any other database.


The same problem arises with membership: we have 1350 Illuminati. Is being an Illuminatus a personal matter? Is it a matter to be recorded on the Order's side - or shall we create 1350 new items, one for each membership, and refer to these - and under which property?
I am otherwise not too eager to begin this, because this was the reason why we left the Wiki and went for the database: the ideal situation is that we have every fact only once in the database. A data sheet on Bode should fill the "Bode as Author" section with a look into the database. Once we start creating lists we have to make sure that they are all constantly up to date and well organised - and a Bode data sheet with a list of 1123 documents he received will no longer be any fun.


== Changing names in business succession? Use the redirect? Create different items? ==
The ideal situation is a split of functions: a Bode page is created from database retrievals. You get the list, freshly generated from the database, as soon as you click on "more". (There is a tendency to use Wikibase much like a Wikipedia page: to have everything on Bode on "his page" and to assume that the dataset [[Item:Q133]] is all we know about Bode...)
 
=== Membership information: have the information on person? on the organisation or on an intermediate membership item? ===
We have about 1350 Illuminati [https://database.factgrid.de/query/#SELECT%20%3FIlluminatenorden%20%3FIlluminatenordenLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%0A%20%20%0A%20%20%3FIlluminatenorden%20wdt%3AP91%20wd%3AQ10677.%0A%7D search]. Is being a member a personal thing, to be stated in the person's data set? Is it a thing on the account of the Order to be stated here [[Item:Q10677]] - or shall we create items for each membership and refer to these items on both sides, person and organisation?
 
A particular problem is that mere membership information will not be the end of the story. Members progress in the order, they get degrees and reach positions. We will not be able to list these details as qualifiers of the respective membership (member of the Illuminati, begin date, place, proposed by, advanced to position x, date... [Wikibase will list all dates together])
 
Membership items could be complex but would they be operable? Would users know what to look for? --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 10:45, 20 November 2018 (CET)
 
=== Changing entities and cuts through time ===


See [https://de.wikipedia.org/wiki/M%C3%BCnchner_Buchhandel_1500%E2%80%931850 Münchner Buchhandel 1500–1850] at Wikipedia for the problem.


The Publishing House of Moritz Georg Weidmann becomes eventually the Olms Verlag - if we treat this as one company - how do we create different references for different times? Books have different imprints decade after decade - especially with the early modern practice to use the publishers' personal names in the imprints. Our system must be able to make sense of practically any imprint we find - and it should at the same moment understand how the changing imprint information is actually connected to existing companies. Munich does not have 57 companies from 1500 to 1850 but six and these are interconnected through family ties.
The Publishing House of Moritz Georg Weidmann becomes eventually the Olms Verlag - if we treat this as one company - how do we create different references for different times? Books have different imprints decade after decade - especially with the early modern practice to use the publishers' personal names in the imprints. Our system must be able to make sense of practically any imprint we find - and it should at the same moment understand how the changing imprint information is actually connected to existing companies. Munich does not have 57 companies from 1500 to 1850 but six under changing names.


* How do we give the proper picture in time segments?
* How do we give the proper picture in queries with time frames (tell me how many bookshops Munich had in 1700)?
* How do we best organise data on an item that goes from 1500 to the present?
* Should we use the Wikipedia redirect function (via merged items) to create various items that all lead to the one central item?
* Should we create a different item under each name and state how these items are related?
* Should we create different items for each imprint and state how these items are interconnected? (If so: how do we make sure that people using the QueryService know our complex data structures?)
* Should we create one item and state the changing names with begin/end information? (This sounds better because a lot of information, like the location, might be stable.)


== How should we organise items that are not yet fully consolidated? ==
The same problem occurs with addresses. A city quarter can be torn down, the new quarter gets old street names - but are these still the same addresses? (They might have new geographic coordinates).
We will face a particular need to organise unconsolidated personal information and a fuzzy search for them: Documents do often just state last names. We have a place, a year, perhaps an occupation but no further information. We will need a system that gives us hints of potential matches.


* Was it good to state names as strings?
=== Recipients 1, 2, 3 - a good idea? ===
* Would it be cleverer to create an item for each family name and to organise variants around that item (in order to understand how families belong together)?
Regular communication from members into the Illuminati Order was a complex thing. Members met once a month in their local Minerval Church and handed in their monthly Quibus Licet ("to whom it may concern"), letters into the Order. The "local superior" would be the first reader. He would pass the letter to the Provincial leader who would answer as Basilus, the "Unknown Superior". The local superior of Gotha would be Christian Georg Helmolt (the first recipient); he would send the package of Gotha's "QQLL" (the abbreviation for the plural) to Johann Joachim Christoph Bode in Weimar (recipient 2), who could in turn pass the letters to Gotha's duke Ernest II (recipient 3 in this case).


== Recipients 1, 2, 3 - a good idea? ==
In order to link each recipient to the place to which the letters were sent, I created separate recipients (1, 2, 3), each of which can now be located in a different place.
Regular communication from members into the Illuminati Order was a complex thing. Members met once a month in their local Minerval Church and here they handed in their monthly Quibus Licet ("to whom it may concern"), a letter to be first read by the local superior and to be passed from him to the "Unknown Superiors". The local superior of Gotha would be Christian Georg Helmolt (the first recipient); he would send the entire package of Gotha's "QQLL" (the abbreviation for the plural) to Johann Joachim Christoph Bode in Weimar (recipient 2), who could in turn pass the letters to Gotha's duke Ernest II (recipient 3 in this case).
 
In order to link the respective places to which the letters were sent to the recipients I created different respective recipients which could now be spotted in different places.


One could organise this differently. [[User:Daniel Mietchen]] proposed to create an item for each transportation process. Would this be clever? How should we organise things with the aim to facilitate searches? --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 21:42, 6 August 2018 (CEST)
:Alternative again proposed by [[User:Daniel Mietchen]]: take a look at Wikidata's P1545, series qualifier.


== GND Identifier ==
::It all boils down to the question of best search results and optimal linking. One would love to see the bundling of information on higher levels of personal power. Search results again are a question of expectations: users have to get the information even if they do not know in advance what is available. (That's why recipient 1-2-3 was probably bad - you do not know that these exist...)
 
=== What should we do with our 1500 transcripts of documents? ===
 
We are in need of a good place to offer transcripts (and eventually: scans).
 
Would [[Q22976]] be a good place for the transcript of [[Item:Q22976]]?
 
Would it be better to have a direct input field for the transcript of full pages?
 
Could the Reasonator, as a standard user interface, grab transcripts from such wiki pages?
 
Could we use the Wikisource extension to offer scans side by side?
 
=== Scholia ===
 
Daniel Mietchen hinted at Scholia https://tools.wmflabs.org/scholia/ and how that could change many of our problems. We could here use a software solution that embeds SPARQL queries. It could also embed the transcripts of texts which we have to present. We would basically decide how many sorts of database objects we have and compose templates of the searches they are supposed to embed. Question remains: who would compose this for us. (On the design front: it is not as cohesive as Magnus Manske's Reasonator.) --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 16:34, 27 November 2018 (CET)
 
== Specific problems of our Wikibase Installation ==
=== GND Identifiers do not link ===
See [[Item:Q133]]. Why does the GND number not link to the GND as in Wikidata?
 
=== Wikidata has a function that allows quoting items via {{}} template, can we do that as well? ===
Does not seem to work... Side aspect: Wikidata states Q-Numbers like Q123, we state "Item:Q123" (and I have begun to link transcripts to the Q-Pages) --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 12:20, 23 November 2018 (CET)
 
=== Description 250 character limit ===
We have extended descriptions wherever we have foreign data inputs. Daniel Mietchen said we can modify the character limits on our own Wikibase instance - would be good to allow perhaps 1500 characters. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 12:26, 23 November 2018 (CET)
 
=== Time: Hours and Minutes cannot be stated ===
We were trying to state events on single days, and did not manage to state hours or minutes, though the software seems to offer the option. --[[User:Olaf Simons|Olaf Simons]] ([[User talk:Olaf Simons|talk]]) 12:26, 23 November 2018 (CET)
 
=== pdf upload ===
That should be a switch in LocalSettings.php.
 
== QueryService related problems ==
=== QueryService gives January 1 as default answer for full year dates ===
If we give a full year (e.g. for "date of birth") the Query Service will state January 1 of that year as the date.
 
=== QueryService gives output with wd:Q123 numbers ===
though the links do actually go into the FactGrid resource
 
=== QueryService example searches are Wikidata searches and do not work on our database ===
though the links do actually go into the FactGrid resource
 
== Tools/features one would like to have ==
=== A tool to grab and import (selected) Wikidata information ===
If there are data on a FactGrid item in Wikidata - could one think of a tool that can be used to grab these data as a set and to convert selected statements into FactGrid statements?
 
=== A multilingual interface to just offer information ===
The ideal development would be an additional tab on all wikibase-items. If you search for Johann Sebastian Bach you get a person's page that extracts information from the J.S. Bach data set and through additional searches. Bach's works will be listed on request - not from the information the specific Q-item is providing but from a database search: list all items with Bach as composer. List all documents with J.S. Bach as author. List all documents with J.S. Bach as receiver... Users get these lists only on request. The J.S. Bach display tells us that we can run this search and feed the information under the present header.
 
=== An option to feed information into historical maps ===
One would love to offer links to Gotha addresses of different centuries and decades and one would have to use different maps for this purpose. Gotha today has housing areas for about 50,000 inhabitants. Gotha in 1800 ended basically with the city walls that had just been torn down. We do not yet have the expertise to present visual information on maps other than the default OpenStreetMap interface.
 
== Requests for search scripts ==
=== P88 with P101 ===  
A search that states given names (P88) with the place in sequence (P101) information - in order to replace the P88 string property with 248 item property.


Why does it not work like a link as in Wikidata? Why can't we click on the number and access the dataset in the GND database?
=== Search with descriptions output ===
to get all names that do not yet have a description (or that have an insufficient description)


==Query Service and full years ==
=== Organisations that share members ===
If we enter full years with the appropriate precision (1761-00-00 etc.) - why do we still get a January 1 date?
It would be interesting to get an idea of organisations and their vicinity or adversity - also: their potential to succeed each other.


== [[Item:Q22976]] or just [[Q22976]] ==
== Specific FactGrid wishes ==
FactGrid Items have an Item: prefix.
=== Logo ===
Important in joint ventures.


Question 1: Wikidata has a function that allows quoting items - via template, can we do that as well?
=== Skin ===
Important to gain a more independent position in the eyes of university, library and archive people.


Question 2: We are in need of a good place to offer scans. Would [[Q22976]] be a good match to [[Item:Q22976]] or could there be a better place for transcripts (and possibly for image files)?
=== An interface to move ''The Gotha Illuminati Research Base'' into the FactGrid ===
''The Gotha Illuminati Research Base'' https://projekte.uni-erfurt.de/illuminaten/Main_Page  is a conventional MediaWiki and we should gain superior functionality here on the FactGrid.


== tiffs and pdfs not supported ==
=== A clearer idea of how we would like to be quoted ===
That must be in local settings php...
FactGrid datasets have specific research statements - but how do we want them to be referred to in footnotes? See [[Item:Q11305]]


== How would we organise contemporary sets? ==
== Data we should like to have ==
* Matriculation registers (Matrikel) of German universities
* Lodging registers of Göttingen students (Logis-Verzeichnisse, half-yearly)

Revision as of 17:25, 14 December 2018

Advice: best practice data structures

Repeated information - avoid it or go for it?

We have some 7000 documents in the database. Each document has an author, most documents have recipients and most documents mention other persons.

Organising Bode's data set Item:Q133: should we list all these documents in three sections? It might be interesting to do this because this is the point at which we show that we have far more information on the man than any other database.

I am otherwise not too eager to begin this, because this was the reason why we left the Wiki and went for the database: the ideal situation is that we have every fact only once in the database. A data sheet on Bode should fill the "Bode as Author" section with a look into the database. Once we start creating lists we have to make sure that they are all constantly up to date and well organised - and a Bode data sheet with a list of 1123 documents he received will no longer be any fun.

The ideal situation is a split of functions: a Bode page is created from database retrievals. You get the list, freshly generated from the database, as soon as you click on "more". (There is a tendency to use Wikibase much like a Wikipedia page: to have everything on Bode on "his page" and to assume that the dataset Item:Q133 is all we know about Bode...)
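
A minimal sketch of this split of functions, written in Python only to keep it testable. The property and item numbers are taken from the query links earlier on this page (P21 author, P28 first recipient, P33 person mentioned, Q133 Bode; P2 with Q10671 presumably marks an item as a document). The function names are hypothetical and nothing here talks to the QueryService; it only shows that each section of a person page can be a stored query rather than a stored list.

```python
# Numbers copied from the query links on this page; their meaning is inferred
# from the link texts ("written by", "first recipient", "mention").
ROLE_PROPERTIES = {
    "author": "P21",
    "first recipient": "P28",
    "mentioned": "P33",
}


def documents_query(person_qid: str, role: str) -> str:
    """Build the SPARQL text for one on-demand list on a person page."""
    prop = ROLE_PROPERTIES[role]
    return (
        "SELECT ?Dokument ?DokumentLabel WHERE {\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }\n'
        "  ?Dokument wdt:P2 wd:Q10671.\n"
        f"  ?Dokument wdt:{prop} wd:{person_qid}.\n"
        "}"
    )


def page_sections(person_qid: str) -> dict[str, str]:
    """One query per section; no list is stored on the person item itself."""
    return {role: documents_query(person_qid, role) for role in ROLE_PROPERTIES}


if __name__ == "__main__":
    for role, query in page_sections("Q133").items():
        print(f"--- Bode as {role} ---")
        print(query)
```

The lists stay up to date by construction, because the only thing kept for the page is the query text.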

Membership information: have the information on person? on the organisation or on an intermediate membership item?

We have about 1350 Illuminati search. Is being a member a personal thing, to be stated in the person's data set? Is it a thing on the account of the Order to be stated here Item:Q10677 - or shall we create items for each membership and refer to these items on both sides, person and organisation?

A particular problem is that mere membership information will not be the end of the story. Members progress in the order, they get degrees and reach positions. We will not be able to list these details as qualifiers of the respective membership (member of the Illuminati, begin date, place, proposed by, advanced to position x, date... [Wikibase will list all dates together])

Membership items could be complex but would they be operable? Would users know what to look for? --Olaf Simons (talk) 10:45, 20 November 2018 (CET)
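
To make the intermediate-item option more concrete, here is a small sketch, assuming one membership item per person and organisation that carries its own dated steps. All class and field names are hypothetical, and the person number, degrees and years in the example are placeholders, not research data. The point is only that each step keeps its own date instead of all dates ending up side by side as qualifiers of one statement.

```python
from dataclasses import dataclass, field


@dataclass
class MembershipEvent:
    """One dated step inside a membership (degree, position ...)."""
    kind: str   # e.g. "degree" or "position"
    value: str  # the degree or position reached
    year: int


@dataclass
class Membership:
    """Intermediate item linking one person to one organisation."""
    person: str        # Q number of the person
    organisation: str  # Q number of the organisation
    events: list[MembershipEvent] = field(default_factory=list)

    def status_in(self, year: int) -> dict[str, str]:
        """Latest known value per kind of event up to the given year."""
        status: dict[str, str] = {}
        for event in sorted(self.events, key=lambda e: e.year):
            if event.year <= year:
                status[event.kind] = event.value
        return status


# Placeholder example: Q10677 is the Order's item named above,
# everything else is invented for illustration.
EXAMPLE = Membership(
    person="Q_PERSON_PLACEHOLDER",
    organisation="Q10677",
    events=[
        MembershipEvent("degree", "degree 1", 1783),
        MembershipEvent("degree", "degree 2", 1785),
        MembershipEvent("position", "office X", 1786),
    ],
)
```

Whether users would find such items is a separate question; a search would have to go through the membership item to reach either the person or the organisation.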

Changing entities and cuts through time

See Münchner Buchhandel 1500–1850 at Wikipedia for the problem.

The Publishing House of Moritz Georg Weidmann becomes eventually the Olms Verlag - if we treat this as one company - how do we create different references for different times? Books have different imprints decade after decade - especially with the early modern practice to use the publishers' personal names in the imprints. Our system must be able to make sense of practically any imprint we find - and it should at the same moment understand how the changing imprint information is actually connected to existing companies. Munich does not have 57 companies from 1500 to 1850 but six under changing names.

  • How do we give the proper picture in queries with time frames (tell me how many bookshops Munich had in 1700)?
  • How do we best organise data on an item that goes from 1500 to the present?
  • Should we create a different item under each name and state how these items are related?
  • Should we create one item and state the changing names with begin/end information? (This sounds better because a lot of information, like the location, might be stable.)

The same problem occurs with addresses. A city quarter can be torn down, the new quarter gets old street names - but are these still the same addresses? (They might have new geographic coordinates).
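
The "one item with dated names" option from the list above could be sketched as follows. This is an illustration under assumptions: the class names, the helper functions and all sample companies, imprints and years are invented placeholders. It shows how one item per company can still answer both kinds of question, the time-frame count and the resolution of an imprint name found on a title page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatedName:
    name: str
    start: int
    end: Optional[int]  # None = name still in use

    def covers(self, year: int) -> bool:
        return self.start <= year and (self.end is None or year <= self.end)


@dataclass(frozen=True)
class Company:
    qid: str
    place: str
    names: tuple[DatedName, ...]

    def name_in(self, year: int) -> Optional[str]:
        """Name the company traded under in the given year, if any."""
        for dated in self.names:
            if dated.covers(year):
                return dated.name
        return None


def active_in(companies: list[Company], place: str, year: int) -> list[Company]:
    """Companies in one place that carried any name in the given year."""
    return [c for c in companies if c.place == place and c.name_in(year) is not None]


def resolve_imprint(companies: list[Company], imprint: str, year: int) -> Optional[Company]:
    """Map an imprint name plus year back to the single company item."""
    for company in companies:
        if any(d.name == imprint and d.covers(year) for d in company.names):
            return company
    return None


# Invented placeholder data, not actual Munich book trade history.
COMPANIES = [
    Company("Q_A", "Munich", (DatedName("Imprint A1", 1650, 1699),
                              DatedName("Imprint A2", 1700, 1760))),
    Company("Q_B", "Munich", (DatedName("Imprint B1", 1720, None),)),
]
```

With this shape a question like "how many bookshops in 1700" counts items, not names, so a company that changed its imprint is not counted twice.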

Recipients 1, 2, 3 - a good idea?

Regular communication from members into the Illuminati Order was a complex thing. Members met once a month in their local Minerval Church and handed in their monthly Quibus Licet ("to whom it may concern"), letters into the Order. The "local superior" would be the first reader. He would pass the letter to the Provincial leader who would answer as Basilus, the "Unknown Superior". The local superior of Gotha would be Christian Georg Helmolt (the first recipient); he would send the package of Gotha's "QQLL" (the abbreviation for the plural) to Johann Joachim Christoph Bode in Weimar (recipient 2), who could in turn pass the letters to Gotha's duke Ernest II (recipient 3 in this case).

In order to link each recipient to the place to which the letters were sent, I created separate recipients (1, 2, 3), each of which can now be located in a different place.

One could organise this differently. User:Daniel Mietchen proposed to create an item for each transportation process. Would this be clever? How should we organise things with the aim to facilitate searches? --Olaf Simons (talk) 21:42, 6 August 2018 (CEST)

Alternative again proposed by User:Daniel Mietchen: take a look at Wikidata's P1545, series qualifier.
It all boils down to the question of best search results and optimal linking. One would love to see the bundling of information on higher levels of personal power. Search results again are a question of expectations: users have to get the information even if they do not know in advance what is available. (That's why recipient 1-2-3 was probably bad - you do not know that these exist...)
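
A sketch of the series-qualifier alternative, assuming a single recipient statement that carries its position as a qualifier in the manner of Wikidata's P1545. The class, the function names and the letter identifier are hypothetical; the three persons are those of the Gotha example above, and a place is only given where the text states one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recipient:
    person: str           # name or Q number of the recipient
    place: Optional[str]  # where the letter reached this person, if known
    ordinal: int          # position in the chain (series qualifier)


LETTERS = {
    "letter-1 (placeholder id)": [
        Recipient("Christian Georg Helmolt", "Gotha", 1),
        Recipient("Johann Joachim Christoph Bode", "Weimar", 2),
        Recipient("Ernest II", None, 3),
    ],
}


def received_by(letters: dict[str, list[Recipient]], person: str) -> list[str]:
    """All letters that reached a person, whatever the position in the chain."""
    return [
        letter_id
        for letter_id, recipients in letters.items()
        if any(r.person == person for r in recipients)
    ]


def chain(recipients: list[Recipient]) -> list[str]:
    """Recipients of one letter in the order in which they read it."""
    return [r.person for r in sorted(recipients, key=lambda r: r.ordinal)]
```

The search no longer needs to know that a "recipient 2" exists: one property finds every reader, and the ordinal is only consulted when the order matters.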

What should we do with our 1500 transcripts of documents?

We are in need of a good place to offer transcripts (and eventually: scans).

Would Q22976 be a good place for the transcript of Item:Q22976?

Would it be better to have a direct input field for the transcript of full pages?

Could the Reasonator, as a standard user interface, grab transcripts from such wiki pages?

Could we use the Wikisource extension to offer scans side by side?

Scholia

Daniel Mietchen hinted at Scholia https://tools.wmflabs.org/scholia/ and how that could change many of our problems. We could here use a software solution that embeds SPARQL queries. It could also embed the transcripts of texts which we have to present. We would basically decide how many sorts of database objects we have and compose templates of the searches they are supposed to embed. Question remains: who would compose this for us. (On the design front: it is not as cohesive as Magnus Manske's Reasonator.) --Olaf Simons (talk) 16:34, 27 November 2018 (CET)

Specific problems of our Wikibase Installation

GND Identifiers do not link

See Item:Q133. Why does the GND number not link to the GND as in Wikidata?

Wikidata has a function that allows quoting items via {{}} template, can we do that as well?

Does not seem to work... Side aspect: Wikidata states Q-Numbers like Q123, we state "Item:Q123" (and I have begun to link transcripts to the Q-Pages) --Olaf Simons (talk) 12:20, 23 November 2018 (CET)

Description 250 character limit

We have extended descriptions wherever we have foreign data inputs. Daniel Mietchen said we can modify the character limits on our own Wikibase instance - would be good to allow perhaps 1500 characters. --Olaf Simons (talk) 12:26, 23 November 2018 (CET)

Time: Hours and Minutes cannot be stated

We were trying to state events on single days, and did not manage to state hours or minutes, though the software seems to offer the option. --Olaf Simons (talk) 12:26, 23 November 2018 (CET)

pdf upload

That should be a switch in LocalSettings.php.

QueryService related problems

QueryService gives January 1 as default answer for full year dates

If we give a full year (e.g. for "date of birth") the Query Service will state January 1 of that year as the date.
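
A possible way around this on the display side: Wikibase stores every date together with a precision code (9 = year, 10 = month, 11 = day), while the plain date value in the Query Service has to be a complete timestamp, which is presumably why a bare year comes back as January 1. In the standard Wikibase RDF mapping the precision should be available on the full value node as wikibase:timePrecision. The sketch below assumes a query has returned both the timestamp and the precision; the function name is hypothetical.

```python
import re

# Wikibase time precision codes.
YEAR, MONTH, DAY = 9, 10, 11

_TIME = re.compile(r"^([+-]?)(\d+)-(\d{2})-(\d{2})T")


def display_date(timestamp: str, precision: int) -> str:
    """Cut a Wikibase timestamp down to what was actually entered.

    Precisions coarser than a year (decade, century ...) are simply
    reduced to the year in this sketch.
    """
    match = _TIME.match(timestamp)
    if match is None:
        raise ValueError(f"not a Wikibase timestamp: {timestamp!r}")
    sign, year, month, day = match.groups()
    year_text = ("-" if sign == "-" else "") + str(int(year))
    if precision <= YEAR:
        return year_text
    if precision == MONTH:
        return f"{year_text}-{month}"
    return f"{year_text}-{month}-{day}"
```

For example, `display_date("1761-01-01T00:00:00Z", YEAR)` gives `"1761"`, so the January 1 never reaches the reader.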

QueryService gives output with wd:Q123 numbers

though the links do actually go into the FactGrid resource

QueryService example searches are Wikidata searches and do not work on our database

though the links do actually go into the FactGrid resource

Tools/features one would like to have

A tool to grab and import (selected) Wikidata information

If there are data on a FactGrid item in Wikidata - could one think of a tool that can be used to grab these data as a set and to convert selected statements into FactGrid statements?

A multilingual interface to just offer information

The ideal development would be an additional tab on all wikibase-items. If you search for Johann Sebastian Bach you get a person's page that extracts information from the J.S. Bach data set and through additional searches. Bach's works will be listed on request - not from the information the specific Q-item is providing but from a database search: list all items with Bach as composer. List all documents with J.S. Bach as author. List all documents with J.S. Bach as receiver... Users get these lists only on request. The J.S. Bach display tells us that we can run this search and feed the information under the present header.

An option to feed information into historical maps

One would love to offer links to Gotha addresses of different centuries and decades and one would have to use different maps for this purpose. Gotha today has housing areas for about 50,000 inhabitants. Gotha in 1800 ended basically with the city walls that had just been torn down. We do not yet have the expertise to present visual information on maps other than the default OpenStreetMap interface.

Requests for search scripts

P88 with P101

A search that states given names (P88) with the place in sequence (P101) information - in order to replace the P88 string property with 248 item property.
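
A first attempt at such a search script, under the assumption that P101 is recorded as a qualifier on the P88 statement and that the usual Wikibase prefixes p:, ps: and pq: are configured on our QueryService. The Python wrapper and its function name are hypothetical; it only assembles the query text so that the same pattern can be reused for other property pairs.

```python
import re

_PID = re.compile(r"^P\d+$")


def statement_with_qualifier_query(main_property: str, qualifier: str) -> str:
    """SPARQL text listing every value of main_property with its qualifier."""
    for pid in (main_property, qualifier):
        if not _PID.match(pid):
            raise ValueError(f"not a property number: {pid!r}")
    return (
        "SELECT ?item ?itemLabel ?value ?position WHERE {\n"
        f"  ?item p:{main_property} ?statement.\n"
        f"  ?statement ps:{main_property} ?value.\n"
        # OPTIONAL keeps names that have no position stated yet.
        f"  OPTIONAL {{ ?statement pq:{qualifier} ?position. }}\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }\n'
        "}\n"
        "ORDER BY ?item ?position"
    )


if __name__ == "__main__":
    print(statement_with_qualifier_query("P88", "P101"))
```

The output rows (item, string value, position) would be the raw material for replacing the string statements by item statements.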

Search with descriptions output

to get all names that do not yet have a description (or that have an insufficient description)
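
One way this search could look, sketched under assumptions: items are selected by class via P2 (as in the query links on this page), descriptions are read through schema:description as in the standard Wikibase RDF mapping, and "insufficient" is approximated by a minimum length. The function name, the length threshold and the class number in the example call are placeholders; the class of name items is not stated here and has to be filled in.

```python
def weak_description_query(class_qid: str, language: str = "en", min_length: int = 1) -> str:
    """SPARQL text for items of a class with no or too short a description."""
    return (
        "SELECT ?item ?itemLabel ?description WHERE {\n"
        f"  ?item wdt:P2 wd:{class_qid}.\n"
        "  OPTIONAL {\n"
        "    ?item schema:description ?description.\n"
        f'    FILTER(LANG(?description) = "{language}")\n'
        "  }\n"
        # Unbound = no description at all; otherwise compare its length.
        f"  FILTER(!BOUND(?description) || STRLEN(?description) < {min_length})\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}". }}\n'
        "}"
    )


if __name__ == "__main__":
    # Q10671 is only a stand-in taken from the document queries above.
    print(weak_description_query("Q10671", language="en", min_length=20))
```

With the default `min_length=1` only items without any description in the chosen language are returned; a larger value also catches very short descriptions.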

Organisations that share members

It would be interesting to get an idea of organisations and their vicinity or adversity - also: their potential to succeed each other.
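
A sketch of how such vicinity could be measured once membership data have been retrieved (for instance through P91, which appears to be the membership property in the Illuminati search above). The function name is hypothetical and the organisations and persons in the example are placeholders. The overlap count is only a first indicator; it says nothing yet about adversity or succession.

```python
from itertools import combinations


def shared_members(members_by_org: dict[str, set[str]]) -> list[tuple[str, str, int]]:
    """Pairs of organisations with the number of members they share."""
    pairs = []
    for org_a, org_b in combinations(sorted(members_by_org), 2):
        overlap = len(members_by_org[org_a] & members_by_org[org_b])
        if overlap:
            pairs.append((org_a, org_b, overlap))
    # Largest overlap first, names as tie-breaker.
    return sorted(pairs, key=lambda p: (-p[2], p[0], p[1]))


# Placeholder data for illustration only.
EXAMPLE = {
    "Org A": {"person 1", "person 2", "person 3"},
    "Org B": {"person 2", "person 3"},
    "Org C": {"person 3", "person 9"},
    "Org D": {"person 7"},
}
```

Adding begin and end years of each membership to the input would be the next step for the question of succession.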

Specific FactGrid wishes

Logo

Important in joint ventures.

Skin

Important to gain a more independent position in the eyes of university, library and archive people.

An interface to move The Gotha Illuminati Research Base into the FactGrid

The Gotha Illuminati Research Base https://projekte.uni-erfurt.de/illuminaten/Main_Page is a conventional MediaWiki and we should gain superior functionality here on the FactGrid.

A clearer idea of how we would like to be quoted

FactGrid datasets have specific research statements - but how do we want them to be referred to in footnotes? See Item:Q11305

Data we should like to have

  • Matriculation registers (Matrikel) of German universities
  • Lodging registers of Göttingen students (Logis-Verzeichnisse, half-yearly)