Help:How do I feed data into FactGrid?

From FactGrid

<provisional Google translation>

Creating Items one by one

The link to the input form is given in the menu on the left.

With this form you can create any imaginable database object, called an Item. It will receive a new Q-number and should gain qualities, impact and dimensions through the individual statements one makes about it.

At least a "label" and a "description" should be assigned to the object when creating it. Name variants (aliases) can be added so that the object is found under all of its known names in full-text searches.

English wordings serve as the fallback: labels, descriptions and aliases entered in English appear in every language that has not yet been given its own wording. It is therefore important that all Items and properties get English labels and descriptions. If you edit FactGrid primarily in another language, start your input in English and add your own language in a second step.
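The fallback behaviour described above can be pictured with a minimal sketch. It is a simplification for illustration only: the function name `display_label` is hypothetical, and the real fallback chains of a Wikibase installation may be more involved than a single step to English.

```python
from typing import Dict, Optional


def display_label(labels: Dict[str, str], ui_language: str,
                  fallback: str = "en") -> Optional[str]:
    """Return the label in the interface language, else the fallback label."""
    return labels.get(ui_language) or labels.get(fallback)


# Labels of Item:Q10400 as mentioned on this page.
cologne = {"en": "Cologne", "de": "Köln"}

print(display_label(cologne, "de"))  # German wording exists
print(display_label(cologne, "fr"))  # no French wording, English is shown
```

An Item that has no English label has nothing to fall back on, which is why the English entry should come first.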

How to make statements on individual Items

Most databases require you to decide beforehand what kinds of objects you will handle on the platform and how exactly they can relate to each other. Wikibase is structurally open. Items gain dimensions and meaning just as they do in regular discourse: through the statements they attract and in which they appear.

For the database, all its objects are merely Q-numbers, linked to each other with the help of P-numbers. Alternatively, an object can appear in a statement with a date or a unit. Linking abstract Item numbers with the help of equally abstract P-numbers has the advantage that both can be named and identified in any language imaginable. An Item does not gain its qualities through categorisation but through the triple statements it attracts.

There is no restriction on the kinds of database objects that can be created. Whether a person, a document, a place, a house, an event or an abstract idea: FactGrid knows only database objects and statements about them, and anything that can be imagined under a word, name or title can become the object of statements.

Triple-based statements

Statements in the FactGrid take the form of triples, three-part atomic statements such as

 Johann Christian Bach's (Item:Q147795) father is (Property:P141) Johann Sebastian Bach (Item:Q147798)

The Q-P-Q triple is the most interesting option, as it places database objects in relation to each other. Compared with the "string" input that most conventional databases require (for example the character sequence "Köln" as a person's place of birth), the reference to the respective database object Item:Q10400 has three big advantages:

  • The reference introduces an Item that is itself the object of various statements. The string "Köln" does not come with geo-coordinates or external identifiers, but these are part of the package of Item:Q10400.
  • The reference introduces an Item that can be handled in numerous languages. Anglophone readers will read "Cologne" on Item:Q10400 instead of "Köln".
  • The reference to an Item instead of a string brings precision into the database. If there are several places with the same name, you select the Q-number of the object you mean: the one with the right geo-coordinates, the right country, etc.

In addition to Q-P-Q triples, triples can also refer to dates, quantities, strings of characters (for example to state a particular wording or spelling in a quotation), a URL or the identifier of another database. An alternative triple can thus be:

 Johann Christian Bach (Item:Q147795) was born on (Property:P77) September 5, 1735

or

 Johann Christian Bach (Item:Q147795) has the GND identification number (Property:P76) 118505521

When creating new properties, one has to consider which type of link they should create.
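The three example triples above can be written down as a small data structure. This is only an illustrative sketch, not the internal Wikibase data model: the class names `ItemRef`, `DateValue`, `ExternalId` and `Triple` are hypothetical, and only the Q- and P-numbers are taken from this page.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class ItemRef:
    qid: str  # reference to another database object


@dataclass(frozen=True)
class DateValue:
    iso_date: str  # a calendar date


@dataclass(frozen=True)
class ExternalId:
    value: str  # identifier in another database


Value = Union[ItemRef, DateValue, ExternalId]


@dataclass(frozen=True)
class Triple:
    subject: str  # Q-number the statement is about
    prop: str     # P-number of the statement type
    value: Value


triples = [
    Triple("Q147795", "P141", ItemRef("Q147798")),      # father
    Triple("Q147795", "P77", DateValue("1735-09-05")),  # date of birth
    Triple("Q147795", "P76", ExternalId("118505521")),  # GND identifier
]


def linked_items(ts: List[Triple]) -> List[str]:
    """Q-numbers reached through Q-P-Q triples only."""
    return [t.value.qid for t in ts if isinstance(t.value, ItemRef)]


print(linked_items(triples))
```

Only the first triple leads on to another Item; the date and the identifier end at a plain value.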

Statements can be qualified in any complexity

Triples by themselves would be comparatively rough statements. That is why Wikibase allows the "qualification" of triples with additional statements. Adam Weishaupt, for example, was married twice, so it makes sense to qualify the entries in more detail. The first marriage is the triple:

Johann Adam Weishaupt (Item:Q1308) was married to (Property:P84) Maria Afra Johanna Walburga Weishaupt (born Sausenhofer) (Item:Q23424)
— since (Property:P49) July 11, 1773
— Place of marriage (Property:P132) Eichstätt (Item:Q10340)
— Best man was (Property:P340) Anton Härtl (Item:Q97362)

When entering such statements manually, it is not necessary to know the respective Q or P numbers. The moment you type into the input fields the database will make suggestions for the statements you probably want to make and the names you probably intend to enter.
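As a rough sketch, a qualified statement can be thought of as a triple that carries a list of further property-value pairs. The class `Statement` and the helper `qualifier_values` are hypothetical names chosen for this illustration; the numbers are those of the Weishaupt example above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Statement:
    subject: str
    prop: str
    value: str
    # Each qualifier is a (P-number, value) pair attached to the main triple.
    qualifiers: List[Tuple[str, str]] = field(default_factory=list)


marriage = Statement(
    "Q1308", "P84", "Q23424",
    qualifiers=[
        ("P49", "1773-07-11"),  # since
        ("P132", "Q10340"),     # place of marriage
        ("P340", "Q97362"),     # best man
    ],
)


def qualifier_values(statement: Statement, prop: str) -> List[str]:
    """All values a statement carries for one qualifier property."""
    return [value for p, value in statement.qualifiers if p == prop]


print(qualifier_values(marriage, "P132"))
```

A second marriage would be a second `Statement` with the same subject and property but its own qualifiers, so the two entries stay distinguishable.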

Statements can be complexly substantiated with references

Any number of source documents can be added to any statement. The most important properties here are:

  • Property:P51 "primary source": for the historical document from which research literature takes or should take the information,
  • Property:P12 "literature": for references to research literature on the subject (qualifiers can be used to add exact page references),
  • Property:P129 "according to": to keep a distance from a (doubtful) statement.

Contradicting statements can be made and are valuable

Wikibase instances allow multiple contradicting statements to be made about objects. This is not a defect, but makes sense, for example, as soon as individual documents make contradicting statements about a question such as the date of birth.

  • You can leave these answers side by side and support each of them with the respective sources; they will now appear side by side in searches.
  • You can evaluate the conflicting statements individually by qualifying each statement with Property:P155 "How sure is this?". A wide range of pre-structured claims are available for this property.
  • You can also ensure that incorrect information no longer appears in searches. To do this, lower or raise the rank marker at the beginning of the input field before saving the statement (Wikibase distinguishes preferred, normal and deprecated rank). Statements that have been deprecated as obsolete are retained in the database, where they remain useful for preventing a relapse into an outdated state of knowledge.
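The effect of ranks on searches can be sketched as follows. The rule used here (deprecated values are hidden; if any value is preferred, only preferred values are shown) is an assumption modelled on the usual "best rank" behaviour of Wikibase query services, the function name `visible_values` is hypothetical, and the dates are invented for illustration.

```python
from typing import Dict, List


def visible_values(statements: List[Dict[str, str]]) -> List[str]:
    """Values a default search would show under the rank rule assumed here."""
    live = [s for s in statements if s["rank"] != "deprecated"]
    preferred = [s for s in live if s["rank"] == "preferred"]
    return [s["value"] for s in (preferred or live)]


# Invented, conflicting birth dates for one person.
birth_dates = [
    {"value": "1700-01-01", "rank": "normal"},
    {"value": "1700-02-01", "rank": "normal"},
    {"value": "1699-12-01", "rank": "deprecated"},
]

print(visible_values(birth_dates))  # both normal values, side by side

birth_dates[1]["rank"] = "preferred"
print(visible_values(birth_dates))  # only the preferred value
```

The deprecated value is never deleted from the list; it is merely left out of the default answer.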

The "Directory of Properties" lists all the statements that can currently be made with the FactGrid

All statements that can currently be made with the FactGrid are listed in the Directory of Properties both in an overall list and under individual questions on the contents page.

The left column gives the statements according to where you want to make them, for example in a biography or in a database object for a document.

In the right column they are arranged according to the object you want to target: you may want to make a statement on a person (left column) in order to state the organisations (right column) he or she was a member of.

We can define any imaginable statements

Wikibase is extremely open when it comes to database modeling. Any object that can attract statements is able to become a Q-Item on a Wikibase installation. For the database, the statements themselves are Q-Numbers connected by P-numbers. The labelling of these numbers is at the discretion of the user.

Setting up a new kind of statement is not without pitfalls:

  • The respective statement might already exist in a wording that did not cross your mind (you can repair this by defining the alias you would prefer to use).
  • One might actually prefer to stay content with a less precise statement in order to keep search results together; the desired precision can be gained with qualifiers.
  • The data model might not make much sense. For instance, an organization could have a "generalissimo" in its hierarchical structure, but a property for this position is not particularly practical, since it presupposes that searchers already know which positions to ask for when looking for the leaders of that organization. It is more practical to use a property that asks about the people in a leading position in an organization, together with qualifiers that state the position of the people in question.

With the FactGrid project, we want to give research projects the greatest possible freedom in their respective questions, so properties have so far been set up without any great hurdles of coordination. However, in order to avoid technical mistakes and to keep the use of the database transparent, we urge a short discussion process on the site.

Nevertheless, if you create properties yourself, under pressure from your project, make sure that you anticipate the type of statements you want to make, i.e. whether they refer to a Q number (a database object), a P number (a property), a date, a URL or a database ID. Once a property has been set up, its value type is fixed and it no longer accepts values of any other type. It is advisable to coordinate with people who have already set up properties in the FactGrid.
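Why the value type matters can be shown with a small check. This is a sketch under assumptions: the datatype names follow the usual Wikibase naming, the assignment of types to P141, P77 and P76 is inferred from the examples on this page rather than looked up, and the patterns are deliberately crude.

```python
import re

# Assumed value types, inferred from the example statements on this page.
PROPERTY_DATATYPES = {
    "P141": "wikibase-item",  # father
    "P77": "time",            # date of birth
    "P76": "external-id",     # GND identifier
}

# Very rough shape checks, for illustration only.
PATTERNS = {
    "wikibase-item": re.compile(r"Q\d+"),
    "time": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "external-id": re.compile(r"\S+"),
}


def fits_datatype(prop: str, value: str) -> bool:
    """True if the value has the shape the property's fixed type expects."""
    datatype = PROPERTY_DATATYPES[prop]
    return PATTERNS[datatype].fullmatch(value) is not None


print(fits_datatype("P141", "Q147798"))     # an Item where an Item is expected
print(fits_datatype("P141", "1735-09-05"))  # a date where an Item is expected
```

A property set up for Items will reject a date, so a wrongly typed property usually has to be replaced by a new one.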

FactGrid is multilingual (as far as you teach it)

To assign a label in a different language to database objects or statement types, you have to switch the interface to that language; this is done in the "Preferences" (the user settings), found at the top right next to the links to your own user pages.

By default, that is, without any further presetting, the English-language wordings appear on all labels. It is therefore in the interest of non-registered users that all Q numbers and all P numbers are also labelled in English.

However, we encourage all participants to enter data in their mother tongues and gradually make the database multilingual. It currently speaks three languages comprehensively: English, German and French.

The mass input of data via QuickStatements

If data is already available in a spreadsheet, whether Excel or Google, or in a CSV (comma-separated) table, it can also be entered in series. The tool used for this, QuickStatements, is linked in the menu on the left.

Data preparation

Unlike when entering or correcting individual information, the database no longer makes suggestions for input. The place of birth of Johann Sebastian Bach is Eisenach, the place of birth of Johann Christian Bach is Leipzig... and this information must now all be provided with the corresponding Q and P numbers. This is what the prepared entry from a table in Excel looks like:

Q147798 P82 Q10341
Q147795 P82 Q10408

Researching the Q and P numbers is an effort that should not be underestimated. Before starting the entry, it must be determined whether the corresponding objects about which statements are to be made already exist — otherwise duplicates would be created and the information in the database would not be aggregated.

In a first step, the database must therefore be queried for all known people, and the new entries are to be compared with their Q numbers. In a second step, in the example, all locations are to be drawn from the FactGrid and compared with the places to be entered, in order to find existing Q numbers. Where objects are missing, they must be created (this can also be done serially, but it also costs time).

There are practical options for this, such as connecting columns with selected values using the lookup command of a spreadsheet program (SVERWEIS in German-language Excel, VLOOKUP in English). The more professional way is to use OpenRefine for the database comparison. As a resource outside Wikidata, FactGrid currently still has a development deficit here.
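The lookup step can also be scripted. The following is a minimal sketch, assuming the Q-numbers have already been exported into two small lookup tables; the names `PEOPLE`, `PLACES` and `to_quickstatements` are hypothetical, and the third spreadsheet row is an added illustration of a name for which no Q-number has been found yet.

```python
import csv
import io

# Lookup tables, as they might result from a prior export of known Items.
PEOPLE = {"Johann Sebastian Bach": "Q147798", "Johann Christian Bach": "Q147795"}
PLACES = {"Eisenach": "Q10341", "Leipzig": "Q10408"}
BIRTHPLACE = "P82"

SHEET = """name,place_of_birth
Johann Sebastian Bach,Eisenach
Johann Christian Bach,Leipzig
Carl Philipp Emanuel Bach,Weimar
"""


def to_quickstatements(sheet_text: str):
    """Turn spreadsheet rows into tab-separated commands; report gaps."""
    commands, unresolved = [], []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        person = PEOPLE.get(row["name"])
        place = PLACES.get(row["place_of_birth"])
        if person and place:
            commands.append("\t".join([person, BIRTHPLACE, place]))
        else:
            # Missing Q-number: check for an existing Item before creating one.
            unresolved.append(row["name"])
    return commands, unresolved


commands, unresolved = to_quickstatements(SHEET)
print("\n".join(commands))
print("still to be matched:", unresolved)
```

Rows that cannot be resolved are reported instead of being guessed, since a wrong match is harder to repair than a missing one.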

Preparation also requires the adjustment of calendar dates. Exact input conventions must be observed here.

Finally, the depth of input is not to be underestimated. If you have to create a new place from a list of birthplaces, you have to populate it with individual statements: label, brief description, the statement that it is a place, and the geo-coordinates. Otherwise the location will not appear on the map with which you may want to visualize the entire input at the end. (This effort is easily overlooked when the workload is estimated by comparison with databases that only work with text fields.)
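For dates, QuickStatements expects a time string followed by a precision code, for example /11 for a day and /9 for a year. The helpers below are a minimal sketch of that conversion; the function names are hypothetical, and any additional FactGrid conventions (for instance for Julian dates) are not covered.

```python
from datetime import date


def qs_day(d: date) -> str:
    """Format a date with day precision (/11) for QuickStatements."""
    return f"+{d.isoformat()}T00:00:00Z/11"


def qs_year(year: int) -> str:
    """Format a year with year precision (/9); month and day are placeholders."""
    return f"+{year:04d}-01-01T00:00:00Z/9"


# Johann Christian Bach's date of birth as given on this page.
print(qs_day(date(1735, 9, 5)))
print(qs_year(1735))
```

With year precision the month and day in the string are ignored, so a date known only to the year should not be entered with day precision.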

The exact instructions for transforming information in the run-up to mass input can be found on this continuously updated help page of the Wikidata project:

https://www.wikidata.org/wiki/Help:QuickStatements

Prepared batch fragments are used for the standardized creation of individual database objects

In day-to-day work with the database, in which only individual new database objects are to be created, batch fragments that run through a work routine have proven to be practical. A very simple batch fragment of this type is this one for creating a last name in three languages:

 qid, Lde, Len, Lfr, Dde, Den, Dfr, P2
, "#", "#", "#", "Familienname", "family name", "nom de famille", Q24499

The entire fragment is copied into QuickStatements using copy & paste. In the present case all # characters are replaced by the last name for which a Q number has so far been missing, and the input is then submitted as CSV input. Before the actual processing, QuickStatements lists the work steps that the program will carry out. On submission, the new name is generated in the desired languages much faster than by hand.
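Replacing the # characters can also be done by a few lines of code before pasting. This is only a sketch: the function `fill_fragment` is hypothetical, and the fragment is the one shown above with the spaces after the commas removed.

```python
import csv
import io

FRAGMENT = (
    'qid,Lde,Len,Lfr,Dde,Den,Dfr,P2\n'
    ',"#","#","#","Familienname","family name","nom de famille",Q24499\n'
)


def fill_fragment(fragment: str, surname: str) -> str:
    """Replace every # placeholder in the fragment with the given surname."""
    if '"' in surname or "#" in surname:
        raise ValueError("surname must not contain quotes or # characters")
    return fragment.replace("#", surname)


batch = fill_fragment(FRAGMENT, "Weishaupt")
print(batch)

# Sanity check: header and data row have the same number of columns.
header, row = list(csv.reader(io.StringIO(batch)))
print(len(header) == len(row))
```

The result is pasted into QuickStatements as CSV input exactly like a fragment filled in by hand.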

A list of popular fragments can be found under the link "Batch fragments" in the menu on the left. More complex work routines can be tailored to your own project with any number of statements that you do not want to make individually.

Internet demonstrations on the use of QuickStatements

There are several videos on YouTube on how to use QuickStatements when entering prepared table information. With their screen demonstrations, these make the procedure clearer than cumbersome continuous text descriptions. The Wikidata help also provides extremely detailed explanations on all detailed questions.

https://www.youtube.com/results?search_query=quickstatements+wikidata