Help:How do I feed data into FactGrid?

From FactGrid

Revision as of 23:39, 5 April 2020

<provisional Google translation>

Creating Items one by one

The link to the input form is in the menu on the left:

With this form you can create any imaginable database object, called an Item. Each Item receives a new Q-number and gains its qualities, weight and dimensions through the individual statements made about it.

At least a "label" and a "description" should be assigned to the object when creating it. Name variants can be added in order to handle the object under all its various known names in full text searches.

English serves as the fallback language: labels, descriptions and aliases entered in English are shown in every language that has not yet been given its own wording. It is therefore important that all Items and Properties get English labels and descriptions. If you mainly edit FactGrid in another language, enter the English wording first and add your own language in a second step.

How to make statements on individual Items

Most databases require you to decide beforehand what kinds of objects you will handle on the platform and how exactly they can be related to each other. Wikibase is structurally open. Items gain dimensions and meaning just as they do in ordinary discourse: through the statements they attract and in which they appear.

For the database, all objects are merely Q-numbers, linked to each other by P-numbers. Alternatively, a statement can connect an object to a date or to a quantity with a unit. Linking abstract Q-numbers by equally abstract P-numbers has the advantage that both can be named and identified in any language. An Item does not gain its qualities through categorisation but through the triple statements it attracts.

There is no restriction on the kinds of database objects that can be created. Whether a person, a document, a place, a house, an event or an abstract idea: FactGrid knows only database objects and statements about them, and anything that can be given a word, name or title can become the object of statements.

Triple-based statements

Statements in the FactGrid take the form of triples, three-part atomic statements such as

 Johann Christian Bach's (Item:Q147795) father is (Property:P141) Johann Sebastian Bach (Item:Q147798)

The Q-P-Q triple is the most interesting option, as it places database objects in relation to each other. Unlike the character string "Köln" entered as a person's place of birth, which is what most conventional databases require, a reference to the respective database object Item:Q10400 has three big advantages:

  • the reference introduces an Item that is itself the object of various statements. The string "Köln" does not come with geo-coordinates or external identifiers, but Item:Q10400 brings these along.
  • the reference introduces an Item that can be handled in numerous languages: Anglophone readers will read "Cologne" on Item:Q10400 instead of "Köln".
  • the reference to an Item instead of a string brings precision into the database. If there are several places with the same name, you select the Q-number of the object you mean, the one with the right geo-coordinates, the right country, etc.

In addition to Q-P-Q triples, triples can also refer to dates, quantities, strings of characters (for example to record a particular wording or spelling in a quotation), a URL or the identifier of another database. An alternative triple can thus be:

 Johann Christian Bach (Item:Q147795) was born on (Property:P77) September 5, 1735

or

 Johann Christian Bach (Item:Q147795) has the GND identification number (Property:P76) 118505521

When creating new properties, one has to consider which type of link they should create.
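The three example triples above can be sketched as a small data structure. This is only an illustration of the idea that the third position of a triple may be an Item, a date or a string; the class names `ItemRef` and `Triple` and the helper `value_kind` are hypothetical and are not part of Wikibase or FactGrid.

```python
from dataclasses import dataclass
from datetime import date
from typing import Union


@dataclass(frozen=True)
class ItemRef:
    """Reference to a database object by its Q-number (hypothetical helper)."""
    qid: str


# The value of a triple is either another Item, a date, or a plain string.
Value = Union[ItemRef, date, str]


@dataclass(frozen=True)
class Triple:
    subject: ItemRef
    prop: str  # P-number
    value: Value


bach = ItemRef("Q147795")  # Johann Christian Bach
triples = [
    Triple(bach, "P141", ItemRef("Q147798")),  # father: Johann Sebastian Bach
    Triple(bach, "P77", date(1735, 9, 5)),     # date of birth
    Triple(bach, "P76", "118505521"),          # GND identifier (a string)
]


def value_kind(triple: Triple) -> str:
    """Name the kind of value a triple points to."""
    if isinstance(triple.value, ItemRef):
        return "item"
    if isinstance(triple.value, date):
        return "date"
    return "string"


kinds = [value_kind(t) for t in triples]
```

The value kind is the decision that has to be made once per property: a property that links to Items cannot later take dates or strings.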

Statements can be qualified in any complexity

Triples on their own would make comparatively rough statements. Wikibase allows triples to be "qualified" with any number of additional details. Adam Weishaupt, for example, was married twice, so it makes sense to qualify each entry in more detail. The first marriage is the triple

 Johann Adam Weishaupt (Item:Q1308) was married to (Property:P84) Maria Afra Johanna Walburga Weishaupt (born Sausenhofer) (Item:Q23424)
- since (Property:P49) July 11, 1773
- Place of marriage (Property:P132) Eichstätt (Item:Q10340)
- Best man was (Property:P340) Anton Härtl (Item:Q97362)
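As a rough sketch, the qualified statement above can be thought of as a main triple that carries a list of property/value pairs. The `Statement` class and the `qualifier_values` helper are hypothetical names chosen for this illustration; the date is written here as a plain ISO string for simplicity.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Statement:
    """A main triple plus its qualifiers (illustrative structure only)."""
    subject: str
    prop: str
    value: str
    qualifiers: List[Tuple[str, str]] = field(default_factory=list)


marriage = Statement(
    subject="Q1308",   # Johann Adam Weishaupt
    prop="P84",        # was married to
    value="Q23424",    # Maria Afra Johanna Walburga Weishaupt
    qualifiers=[
        ("P49", "1773-07-11"),  # since
        ("P132", "Q10340"),     # place of marriage: Eichstätt
        ("P340", "Q97362"),     # best man: Anton Härtl
    ],
)


def qualifier_values(statement: Statement, prop: str) -> List[str]:
    """Return all qualifier values recorded for one qualifier property."""
    return [value for p, value in statement.qualifiers if p == prop]


place = qualifier_values(marriage, "P132")
```

A second marriage would simply be a second `Statement` with the same subject and property but its own qualifiers.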

When entering such statements manually, it is not necessary to know the Q or P numbers. As soon as you type into the input fields of the triple or of the qualifying statements, the database suggests the statements you probably want to make and the people you probably mean.

Statements can be complexly substantiated with references

Any number of source documents can be added to all statements. The most important properties are:

  • Property:P51 "primary source" - for the historical document from which research literature takes or should take the date,
  • Property:P51 "literature" - for reference to research literature on the object (qualifiers can be used to add pages),
  • Property:P51 "according to" - to build distance from the (doubtful) statement.

Contradicting statements are possible and valuable

Wikibase instances allow multiple contradicting statements to be made about objects. This is not a defect, but makes sense, for example, as soon as several conflicting documents make statements about a question such as the date of birth.

  • You can leave these answers side by side and support them with their different sources; they then appear one after the other in searches.
  • You can evaluate the conflicting statements individually by qualifying each statement with Property:P155 "How sure is this?". A wide range of pre-structured claims are available for this property.
  • However, you can also ensure that incorrect information no longer appears in searches. To do this, lower or raise the rank marker at the beginning of the input field before saving the statement. Statements that have been marked as obsolete are retained in the database, where they help to prevent a relapse into the earlier, unreliable state of information, but from then on they no longer appear in search queries.
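The effect of raising or lowering a statement can be sketched as a simple filter. The sketch assumes the three Wikibase ranks "preferred", "normal" and "deprecated" and the usual rule that preferred statements, where present, take precedence over normal ones; the dictionaries and placeholder values are purely illustrative and do not describe any real Item.

```python
def visible_statements(statements):
    """Return the statements a standard search would show.

    Preferred statements win if any exist; otherwise normal ones are shown.
    Deprecated statements stay stored but are never returned.
    """
    preferred = [s for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s["rank"] == "normal"]


# Three conflicting dates of birth for one hypothetical person.
birth_dates = [
    {"value": "date A", "rank": "normal"},
    {"value": "date B", "rank": "normal"},
    {"value": "date C", "rank": "deprecated"},
]

shown = [s["value"] for s in visible_statements(birth_dates)]

# Raising one statement to "preferred" hides the competing normal ones.
birth_dates[0]["rank"] = "preferred"
shown_after_raise = [s["value"] for s in visible_statements(birth_dates)]
```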

The "Directory of Properties" lists the various statements currently available

All statements that can currently be made in FactGrid are listed in the property directory, once in an overall list and once clearly arranged under individual questions. They are managed there and are individually documented with statements.

In the left column, the statements appear sorted according to where you want to make them - for example in a biography or in a database object for a document.

In the right column they are arranged according to the target of the reference: for example, you may want to state for a person (left column) in which organization (right column) he or she was a member.

Any new statement types can be set up

Wikibase is extremely open when it comes to database modeling. There is no conceivable Q about which no statements can be made: the moment a statement can be made about a thing, a Q can be set up for it. For the database, the statement types themselves are merely P-numbers, which are given labels depending on your interests and about which you can in turn make statements.

Setting up a new property should nevertheless not be done lightly:

  • The statement might already exist under a wording that did not cross your mind (in that case add your wording as an alias).
  • It might be better to use an existing, less precise statement in order to keep the results of search queries together.
  • The data model might not make much sense. For instance, an organization could have a "generalissimo" in its hierarchical structure, but a property for this is not particularly practical, since it presupposes that searchers already know which positions to ask for when looking for the leaders of an organization. For this purpose there is a property that asks about the people in a leading position in an organization, used with qualifiers that show the position of the people in question.

With the FactGrid project we want to give research projects the greatest possible freedom in their respective questions, so properties have so far been set up without any great coordination hurdles. However, in order to avoid technical mistakes and to keep the database transparent, we urge a short discussion process on the site.

Nevertheless, if you create properties yourself, under pressure from your project, make sure that you anticipate the type of statement you want to make, i.e. whether its values refer to a Q number (a database object), a P number (a property), a date, a URL or a database ID. Once a property has been created, its value type is fixed and it can no longer accept values of a different type. It is advisable to coordinate with people who have already set up properties in the FactGrid.

FactGrid is multilingual (as far as you teach it)

To add labels in a different language to database objects or statement types, you have to switch the interface to that language; this is done in the "Preferences", the user settings, at the top right next to the links to your own user pages.

By default, that is, without any further setting, the English-language labels are displayed. It is therefore in the interest of non-registered users that all Q numbers and all P numbers are also labelled in English.

However, we encourage all participants to enter data in their mother tongues and thus gradually make the database multilingual. It currently covers three languages comprehensively: English, German and French.
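The role of English as the default can be sketched as a lookup with a fallback. This is a simplified model: real Wikibase language fallback chains are more elaborate, and the `LABELS` dictionary and `label_for` function are hypothetical names used only for this illustration.

```python
# Labels per Item and language; Q10400 has no French label yet.
LABELS = {
    "Q10400": {"de": "Köln", "en": "Cologne"},
}


def label_for(qid, lang, labels=LABELS, fallback="en"):
    """Return the label in the requested language, else the English one.

    If neither exists, the bare Q-number is all that can be shown.
    """
    by_lang = labels.get(qid, {})
    return by_lang.get(lang) or by_lang.get(fallback) or qid


german = label_for("Q10400", "de")
french = label_for("Q10400", "fr")   # falls back to the English label
unknown = label_for("Q1", "fr")      # no labels at all in this sketch
```

An Item without an English label would appear to most visitors only as a number, which is why the English wording should always be entered.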

The mass input of data via QuickStatements

If data is already available in a spreadsheet, whether Excel or Google Sheets, or in a CSV (comma-separated values) table, it can also be entered in bulk. The tool used for this, QuickStatements, is linked in the menu on the left.

Data preparation

Unlike when entering or correcting individual information, the database no longer makes suggestions for input. The place of birth of Johann Sebastian Bach is Eisenach, the place of birth of Johann Christian Bach is Leipzig... and this information must now all be provided with the corresponding Q and P numbers. This is what the prepared entry from a table in Excel looks like:

Q147798 P82 Q10341
Q147795 P82 Q10408
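Lines like the two above can be produced from a table with a few lines of code. The sketch below assumes the tab-separated command format described in the Wikidata QuickStatements help and only handles Item-to-Item statements; the function name `statement_line` is hypothetical.

```python
import re

ITEM = re.compile(r"Q[1-9]\d*")
PROP = re.compile(r"P[1-9]\d*")


def statement_line(item, prop, value):
    """Build one tab-separated line for an Item-to-Item statement.

    Malformed identifiers are rejected before anything is sent
    to QuickStatements.
    """
    if not ITEM.fullmatch(item):
        raise ValueError(f"not a Q-number: {item!r}")
    if not PROP.fullmatch(prop):
        raise ValueError(f"not a P-number: {prop!r}")
    if not ITEM.fullmatch(value):
        raise ValueError(f"not a Q-number: {value!r}")
    return "\t".join((item, prop, value))


rows = [
    ("Q147798", "P82", "Q10341"),  # Johann Sebastian Bach, born in Eisenach
    ("Q147795", "P82", "Q10408"),  # Johann Christian Bach, born in Leipzig
]

batch = "\n".join(statement_line(*row) for row in rows)
```

The resulting `batch` string can be pasted into the QuickStatements input field.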

Researching the Q and P numbers is an effort that should not be underestimated. Before starting the entry, it must be determined whether the corresponding objects about which statements are to be made already exist - otherwise duplicates would be created and the information in the database would not be aggregated.

In a first step, the database must therefore be queried for all people already known, and the new entries must be compared against their Q numbers. In a second step, in the example, all places are to be pulled from the FactGrid and compared with the places to be entered. Where objects are missing, they must be created (this can also be done in bulk, but it also takes time).

There are practical options for this, such as matching columns against selected values with the VLOOKUP function (SVERWEIS in German versions) of a spreadsheet program. The more professional way is to use OpenRefine for reconciliation against the database. Support for this is currently underdeveloped for the FactGrid, as a resource outside Wikidata.

Preparation also requires adjusting calendar dates. The exact input conventions must be observed here.

Finally, the depth of input should not be underestimated: if you have to create a new place from a list of places of birth, you have to populate it with individual statements: labels, brief descriptions, the statement that it is a place, and the geo-coordinates. Otherwise the place does not appear on the map with which you want to visualize the entire input at the end. (This workload is easily overlooked when estimating the effort by comparison with databases that only work with text fields.)

The exact instructions for transforming information in preparation for mass input can be found on this continuously updated help page of the Wikidata project:
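The spreadsheet lookup can equally be done in a short script. The following is a minimal sketch of the matching step, assuming the known people and places have already been exported from FactGrid into two dictionaries; the names `known_people`, `known_places` and `match_rows` are hypothetical, and the third table row is an invented placeholder for an entry that is not yet in the database.

```python
# Name-to-Q-number tables, as they might be exported from FactGrid.
known_people = {
    "Johann Sebastian Bach": "Q147798",
    "Johann Christian Bach": "Q147795",
}
known_places = {"Eisenach": "Q10341", "Leipzig": "Q10408"}

# The table to be entered: person and place of birth, as plain text.
table = [
    ("Johann Sebastian Bach", "Eisenach"),
    ("Johann Christian Bach", "Leipzig"),
    ("Example Person", "Example Town"),  # placeholder, not in the database
]


def match_rows(rows, people, places, prop="P82"):
    """Split rows into ready statement lines and rows still to be resolved."""
    ready, unresolved = [], []
    for person, place in rows:
        person_q = people.get(person)
        place_q = places.get(place)
        if person_q and place_q:
            ready.append("\t".join((person_q, prop, place_q)))
        else:
            unresolved.append((person, place))
    return ready, unresolved


ready, unresolved = match_rows(table, known_people, known_places)
```

Exact string matching is the weak point of this approach: spelling variants end up in `unresolved` and must be checked by hand before new Items are created, otherwise duplicates arise.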
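As an illustration of such a convention, the Wikidata QuickStatements help describes time values in the form `+YYYY-MM-DDT00:00:00Z/precision`, where precision 11 stands for a day. The sketch below assumes that FactGrid's QuickStatements accepts the same form and covers day precision only; the function name is hypothetical.

```python
from datetime import date


def quickstatements_day(d: date) -> str:
    """Format a date with day precision (/11) for QuickStatements."""
    return f"+{d.year:04d}-{d.month:02d}-{d.day:02d}T00:00:00Z/11"


# Date of birth of Johann Christian Bach as given above.
birth = quickstatements_day(date(1735, 9, 5))
line = "\t".join(("Q147795", "P77", birth))
```

Dates known only to the year or month, and dates in the Julian calendar, need their own handling and should be checked against the help page before a batch is run.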

https://www.wikidata.org/wiki/Help:QuickStatements

Prepared batch fragments are used for the standardized creation of individual database objects

In day-to-day work with the database, when only individual new database objects are to be created, batch fragments that encapsulate work routines have proven practical. A very simple batch fragment of this type is the following one for creating a family name in three languages:

 qid, Lde, Len, Lfr, Dde, Den, Dfr, P2
, "#", "#", "#", "Familienname", "family name", "nom de famille", Q24499

The entire fragment is copied into QuickStatements using copy & paste. In the present case, all # characters are replaced by the family name for which a Q number has so far been missing, and the input is then submitted as CSV. Before the actual processing, QuickStatements shows a list of the work steps that the program will carry out. On submission, the new name is created in the desired languages much faster than by manual entry.
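The replacement of the # characters can also be scripted, which helps when several names are to be created in a row. This is a minimal sketch that treats the fragment as a text template; the function name `fill_fragment` is hypothetical, and the final parsing step only checks that the result is still well-formed CSV.

```python
import csv
import io

# The fragment from above, copied as text.
FRAGMENT = (
    "qid, Lde, Len, Lfr, Dde, Den, Dfr, P2\n"
    ', "#", "#", "#", "Familienname", "family name", "nom de famille", Q24499'
)


def fill_fragment(fragment: str, name: str) -> str:
    """Replace every # placeholder with the family name."""
    if '"' in name or "#" in name:
        # Such characters would break the quoting or the placeholder logic.
        raise ValueError(f"unsupported character in name: {name!r}")
    return fragment.replace("#", name)


filled = fill_fragment(FRAGMENT, "Weishaupt")

# Parse the result to confirm header and data row have the same width.
parsed = list(csv.reader(io.StringIO(filled), skipinitialspace=True))
header, data = parsed
```

The first field of the data row stays empty because the new Item has no Q number yet; QuickStatements assigns one on creation.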

A list of popular fragments can be found under the link "Batch fragments" in the menu on the left. More complex work routines can be tailored to your own project with any number of statements that you do not want to make individually.

Internet demonstrations on the use of QuickStatements

There are several videos on YouTube on how to use QuickStatements when entering prepared table data. Their screen demonstrations make the procedure clearer than cumbersome running-text descriptions. The Wikidata help also provides very detailed explanations on all questions of detail.

https://www.youtube.com/results?search_query=quickstatements+wikidata