Help:How do I feed data into FactGrid?

From FactGrid

https://www.wikidata.org/wiki/Help:QuickStatements

Creating Items one by one

The link to the input form is given in the menu on the left.

With this form, you can create any imaginable database object (Item). It will receive a new Q-number and should gain qualities, impact and dimensions through the individual statements one can make about it.

At least a "label" and a "description" should be assigned to the object when creating it. Name variants can be added in order to handle the object under all its various known names in full text searches.

The information given in English labels, descriptions and aliases will appear in all languages that have not set their own wordings. It is therefore important that all Items and Properties get English labels and descriptions. If you are editing on FactGrid primarily in another language, start your input with English and add your own language in a second step.

How to make statements on individual Items

Most databases require you to decide beforehand what kinds of objects you will handle on the platform and how exactly they can be related to each other. Wikibase is structurally open. Items gain dimensions and meaning just as they do in regular discourse: through the statements they attract and in which they appear.

For the database, all its objects are merely Q-numbers, linked to each other with the help of P-numbers. Alternatively, an object can appear in a statement with a date or a unit. Linking abstract Item numbers with the help of equally abstract P-numbers has the advantage that both can be named and identified in any language imaginable. The Item will not gain its qualities through its categorisation but through the triples of statements it attracts.

There is no restriction on the sorts of database objects one can create. Whether a person, a document, a place, a house, an event or an abstract idea: FactGrid knows only database objects and statements about them, and there is nothing that can be given a word, name or title that cannot become the object of statements.

Triple-based statements

Statements in the FactGrid take the form of triples, three-part atomic statements such as

Johann Christian Bach's (Item:Q147795) father is (Property:P141) Johann Sebastian Bach (Item:Q147798)

The Q-P-Q triple is the most interesting option, as it brings database objects into relative positions towards each other. Unlike the character string "Köln" given as the place of birth of a person, the reference to the respective database object Item:Q10400 has three big advantages:

  • the reference introduces an Item that is itself the object of various statements. The string sequence "Köln" does not come with geo-coordinates or external identifiers, but these will be in the package of Item:Q10400.
  • the reference introduces an Item that can be handled in numerous languages: Anglophone readers will read "Cologne" on Item:Q10400 instead of "Köln".
  • the reference to an Item instead of a string brings precision into the database. If there are several places with the same name, you select the Q-number of the object you mean, the one with the right geo-coordinates, the right country, etc.

In addition to Q-P-Q triples, triples can also refer to dates, quantities, strings of characters (for example in order to state a particular wording or spelling in a quotation), a URL or the identifier of another database. An alternative triple can thus be:

Johann Christian Bach (Item:Q147795) was born on (Property:P77) September 5, 1735

or

Johann Christian Bach (Item:Q147795) has the GND identification number (Property:P76) 118505521
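
The three example triples above can be modelled in a few lines of Python. This is only a minimal sketch of the idea, not the way Wikibase stores data internally; the class names (ItemRef, DateValue, ExternalId, Triple) and the helper linked_items are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class ItemRef:
    qid: str  # reference to another database object, e.g. "Q147798"


@dataclass(frozen=True)
class DateValue:
    iso: str  # a calendar date, e.g. "1735-09-05"


@dataclass(frozen=True)
class ExternalId:
    value: str  # identifier in another database, e.g. a GND number


Value = Union[ItemRef, DateValue, ExternalId]


@dataclass(frozen=True)
class Triple:
    subject: str  # Q-number of the Item the statement is about
    prop: str     # P-number of the property
    value: Value  # what the statement points to


# The three statements about Johann Christian Bach quoted on this page.
triples = [
    Triple("Q147795", "P141", ItemRef("Q147798")),      # father
    Triple("Q147795", "P77", DateValue("1735-09-05")),  # date of birth
    Triple("Q147795", "P76", ExternalId("118505521")),  # GND identifier
]


def linked_items(statements: List[Triple]) -> List[str]:
    """Return the Q-numbers that Q-P-Q statements point to."""
    return [t.value.qid for t in statements if isinstance(t.value, ItemRef)]


print(linked_items(triples))
```

Only the first triple links two Items; the other two end in a plain value, which is why the value type of a property has to be decided when the property is created.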

When creating new properties, one has to consider which type of link they should create.

Statements can be qualified in any complexity

Triples in themselves would be comparatively rough statements. That is why Wikibase allows the "qualification" of triples with additional statements. Adam Weishaupt, for example, was married twice, so it makes sense to qualify the entries in more detail. The first marriage is the triple:

Johann Adam Weishaupt (Item:Q1308) was married to (Property:P84) Maria Afra Johanna Walburga Weishaupt (born Sausenhofer) (Item:Q23424)
— since (Property:P49) July 11, 1773
— Place of marriage (Property:P132) Eichstätt (Item:Q10340)
— Best man was (Property:P340) Anton Härtl (Item:Q97362)
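
For mass input, the same qualified statement can be written as a single QuickStatements command in which the qualifiers follow the main triple as further property/value pairs. The sketch below assumes the tab-separated V1 notation and the time notation (a leading plus sign and a trailing /11 for day precision) described on the Wikidata help page; the function name statement_line is invented for illustration.

```python
def statement_line(subject, prop, value, qualifiers=()):
    """Build one tab-separated command: main triple first, then qualifier pairs."""
    parts = [subject, prop, value]
    for qualifier_prop, qualifier_value in qualifiers:
        parts.extend([qualifier_prop, qualifier_value])
    return "\t".join(parts)


# Weishaupt's first marriage with its three qualifiers.
line = statement_line(
    "Q1308", "P84", "Q23424",
    qualifiers=[
        ("P49", "+1773-07-11T00:00:00Z/11"),  # since July 11, 1773
        ("P132", "Q10340"),                   # place of marriage: Eichstätt
        ("P340", "Q97362"),                   # best man: Anton Härtl
    ],
)
print(line)
```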

When entering such statements manually, it is not necessary to know the respective Q- or P-numbers. As soon as you type into the input fields, the database will suggest the statements you probably want to make and the names you probably intend to enter.

Statements can be complexly substantiated with references

Any number of source documents can be added to any statement. The most important properties here are:

  • Property:P51 "primary source" — for the historical document from which research literature takes or should take the information,
  • Property:P12 "literature" — for reference to research literature on the subject (qualifiers can be used to add exact page references),
  • Property:P129 "according to" — to build distance from the (doubtful) statement.
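
In a QuickStatements batch, a reference is appended to a command in the same way as a qualifier, except that the property is written with an S instead of a P (S51 for Property:P51), following the V1 notation on the Wikidata help page. The sketch below assumes that the source document has its own Item; the function name add_reference is invented, and "Q_SOURCE" is a placeholder, not a real FactGrid Item.

```python
def add_reference(command, ref_prop, ref_value):
    """Append one reference to a tab-separated V1 command.

    A reference property is written with S instead of P, e.g. P51 -> S51.
    """
    if not (ref_prop.startswith("P") and ref_prop[1:].isdigit()):
        raise ValueError("expected a P-number such as 'P51'")
    return "\t".join([command, "S" + ref_prop[1:], ref_value])


# Placeholder only: replace with the Q-number of the source document's Item.
SOURCE_ITEM = "Q_SOURCE"

command = "\t".join(["Q147795", "P82", "Q10408"])  # born in Leipzig
referenced = add_reference(command, "P51", SOURCE_ITEM)
print(referenced)
```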

Contradicting statements can be made and are valuable

Wikibase instances allow multiple contradicting statements to be made about objects. This is not a defect, but makes sense, for example, as soon as individual documents make contradicting statements about a question such as the date of birth.

  • You can leave these answers side by side and support each of them with the respective sources; they will now appear side by side in searches.
  • You can evaluate the conflicting statements individually by qualifying each statement with Property:P155 "How sure is this?". A wide range of pre-structured claims are available for this property.
  • You can also ensure that incorrect information will no longer appear in searches. To do this, lower or raise the rank marker at the beginning of the input field before saving the statement. Statements that have been marked as obsolete are retained in the database, where they remain useful in preventing a relapse into the earlier, uncertain state of information.
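
The effect of raising or lowering a statement can be sketched as follows. Wikibase knows three ranks (preferred, normal, deprecated), and searches for the "best" statements return the preferred ones if any exist, otherwise the normal ones, and never the deprecated ones. The function name best_statements and the two birth dates are made up for illustration.

```python
def best_statements(statements):
    """Return preferred statements if there are any, else the normal ones.

    Deprecated statements stay in the list they came from but are never returned.
    """
    preferred = [s for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s["rank"] == "normal"]


# Two contradicting dates of birth from two documents (illustrative values only).
side_by_side = [
    {"value": "1748-02-06", "rank": "normal"},
    {"value": "1748-02-07", "rank": "normal"},
]
evaluated = [
    {"value": "1748-02-06", "rank": "preferred"},
    {"value": "1748-02-07", "rank": "deprecated"},
]

print([s["value"] for s in best_statements(side_by_side)])
print([s["value"] for s in best_statements(evaluated)])
```

As long as both statements keep the normal rank, both appear; once one is raised and the other lowered, only the raised one is returned, while the lowered one remains stored.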

The "Directory of Properties" lists all the statements that can currently be made on FactGrid

All statements that can currently be made with the FactGrid are listed in the Directory of Properties both in an overall list and under individual questions on the contents page.

The left column gives the statements according to where you want to make them — for example in a biography or in a database object for a document.

In the right column they are arranged according to the object you want to target: you may want to make a statement on a person (left column) in order to state the organisations (right column) he or she was a member of.

We can define any imaginable statement

Wikibase is extremely open when it comes to database modelling. Any object that can attract statements can become a Q-Item on a Wikibase installation. For the database, the statements themselves are Q-numbers connected by P-numbers. The labelling of these numbers is at the discretion of the user.

Establishing a new type of statement (a new property) is not without pitfalls:

  • The respective statement might already exist in a wording that did not cross your mind (you can repair this by defining the alias you would prefer to use).
  • One might actually prefer to stay content with a less precise statement in order to keep search results together; the desired precision can be gained with qualifiers.
  • The data model might not make much sense. For instance, an organization could have a "generalissimo" in its hierarchical structure, but a property for this position is not particularly practical, since it presupposes that seekers already know which positions they have to ask for when searching for the leaders of the organization. Instead, there is a property that asks about the people in a leading position in an organization, used together with qualifiers that show the position of the people in question.

With the FactGrid project, we are interested in giving research projects the greatest possible freedom in their respective questions, so properties have so far been set up without any great hurdles of coordination. However, in order to avoid technical mistakes and to keep the use of the database transparent, we urge a short discussion process on the site.

Nevertheless, if you generate properties yourself under pressure from your project, make sure that you anticipate the type of statements you want to make, i.e. whether they refer to a Q-number (a database object), a P-number (a property), a date, a URL or a database ID. Once a property has been set, its value type is fixed and it will no longer accept values of any other type. It is advisable to coordinate with people who have already set up properties in the FactGrid.

FactGrid is multilingual (as far as you teach it)

Wikibase is designed to be used by a multilingual community. The interface language can be changed by anyone who has an account. But the option to speak practically any language goes much further: all Items and all Properties are ultimately only Q- and P-numbers. The labels and descriptions in different languages turn them into multilingual objects that can appear in as many languages as have contributed the necessary labels and descriptions.

To assign labels and descriptions in a new language, you have to log in and set this language in the "Preferences" section of your user settings.

English is the default language on FactGrid. In any other language, labels and descriptions will appear as stated in English until an alternative label or description has been set on the Item or Property. It is therefore in the interest of all non-registered users that all Q-numbers and all P-numbers are also labelled in English.

If you are comfortable with English, you should, when entering Items manually one by one, begin your input in English and add the label and description in your own language in a second step after saving. If you feel less at home in English, you will have to rely on other users to supply the English translations; until this has been done, your data will appear merely as Q-numbers and P-numbers in searches in other languages.
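
The fallback behaviour described in this section can be summarised in a short sketch. It is a simplified model of what a reader sees, not the actual Wikibase code; the function name display_label is invented for illustration.

```python
def display_label(qid, labels, language, fallback="en"):
    """Pick the label a reader sees for an Item.

    Order: the reader's own language, then English, then the bare Q-number.
    """
    if language in labels:
        return labels[language]
    if fallback in labels:
        return labels[fallback]
    return qid


cologne = {"de": "Köln", "en": "Cologne"}   # Item:Q10400 with two labels
german_only = {"de": "Köln"}                # same Item if English were missing

print(display_label("Q10400", cologne, "de"))      # own language is set
print(display_label("Q10400", cologne, "fr"))      # falls back to English
print(display_label("Q10400", german_only, "fr"))  # only the number is left
```

The third call shows why an English label matters: without it, readers in every other language see nothing but the Q-number.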



The mass input of data via QuickStatements

If data is already available in a spreadsheet, whether Excel or Google Sheets, or in a CSV (comma-separated values) table, it can also be entered in bulk. The tool for this, QuickStatements, is linked in the menu on the left.

Data preparation

Compared to entering individual statements, mass input requires quite some preparation: the database can no longer make suggestions during the input, so you have to find out beforehand whether the Items you need are already in the database.

A simple table might state the birth places of several people, among them Johann Sebastian Bach Item:Q147798 (Eisenach Item:Q10341) and his son Johann Christian Bach Item:Q147795 (Leipzig Item:Q10408). Property:P82 is designated to state the place of birth, so this is the input you have to prepare:

Q147798 P82 Q10341
Q147795 P82 Q10408
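
Once all Q-numbers are known, such command lines can be generated from the table rather than typed. The following sketch assumes the tab-separated V1 notation described on the Wikidata help page (the three parts of each command are separated by a tab character); the function name birthplace_commands is invented for illustration.

```python
def birthplace_commands(rows, prop="P82"):
    """Turn (person Q-number, place Q-number) pairs into V1 command lines."""
    lines = []
    for person_qid, place_qid in rows:
        lines.append("\t".join([person_qid, prop, place_qid]))
    return "\n".join(lines)


rows = [
    ("Q147798", "Q10341"),  # Johann Sebastian Bach, Eisenach
    ("Q147795", "Q10408"),  # Johann Christian Bach, Leipzig
]

batch = birthplace_commands(rows)
print(batch)
```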

To reach this point you will have to get the Q-numbers for the persons on your list and the Q-numbers for the places referred to. This will first require some FactGrid database mining in order to find all the persons and places already in the database, whose Q-numbers must replace the respective names. Things get even more difficult when you have to create Items in the preparation process. Any negligent work will create masses of duplicates that will cause new work in merging them.

The Wikidata-standard tool is OpenRefine. Most FactGrid users use Excel or Google spreadsheets and the vertical lookup (VLOOKUP) function to retrieve the Q-numbers from imports and to make the replacements in their lists.
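
The vertical lookup step can also be done in a short script. The sketch below replaces names with Q-numbers from a lookup table and, instead of guessing, reports every name it cannot find, so that missing Items can be checked by hand before anything is created. The function name resolve is invented, and "Köthen" merely stands for a name that is absent from this small example table.

```python
def resolve(names, lookup):
    """Split names into those with a known Q-number and those still missing."""
    resolved = {}
    missing = []
    for name in names:
        key = name.strip()  # stray spaces are a common cause of failed lookups
        if key in lookup:
            resolved[key] = lookup[key]
        else:
            missing.append(key)
    return resolved, missing


# Lookup table exported from the database: place name -> Q-number.
places = {"Eisenach": "Q10341", "Leipzig": "Q10408"}

resolved, missing = resolve(["Eisenach", " Leipzig", "Köthen"], places)
print(resolved)
print(missing)
```

Names in the missing list need a manual search in FactGrid first; creating them blindly is what produces duplicates.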

Calendar dates demand their own preparation. Exact input conventions have to be considered in any mass input.
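
As an illustration of such a convention, the QuickStatements notation described on the Wikidata help page writes a date with a leading plus sign, a fixed time part and a precision suffix (9 for a year, 10 for a month, 11 for a day). The sketch below builds that notation; the function name qs_time is invented, filling an unknown month or day with 01 is an assumption of this sketch, and questions such as Julian versus Gregorian dates are left out.

```python
def qs_time(year, month=None, day=None):
    """Format a date in QuickStatements time notation with its precision.

    Precision: 9 = year, 10 = month, 11 = day.
    """
    if year < 1:
        raise ValueError("this sketch only handles years from 1 CE onwards")
    if day is not None and month is None:
        raise ValueError("a day can only be given together with a month")
    if day is not None:
        precision = 11
    elif month is not None:
        precision = 10
    else:
        precision = 9
    # Assumption: unknown parts are filled with 01; the precision marks them as unknown.
    month_part = month if month is not None else 1
    day_part = day if day is not None else 1
    return f"+{year:04d}-{month_part:02d}-{day_part:02d}T00:00:00Z/{precision}"


print(qs_time(1735, 9, 5))  # birth date of Johann Christian Bach, day precision
print(qs_time(1735))        # the same year only
```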

Finally, the depth of the required input should not be underestimated. Linking a statement to a place name is simple in a database that is content with a string input. The Q-Item solution is more complex: it is easy to generate Q-Items for places, but they will not appear on maps unless they have received geographic coordinates, and they should be linked to external identifiers in order to gain uniqueness. All this is a workload that is easily overlooked when calculating the effort of a mass input in Wikibase. The software is powerful, but the depth it offers comes at a price.

The exact instructions for the transformation of information in the run-up to the mass input can be found on this help page of the Wikidata project, where it is constantly updated by the bigger community:

https://www.wikidata.org/wiki/Help:QuickStatements

Use prepared batch fragments for the standardised creation of database objects

In day-to-day work with the database, where only individual new database objects are to be created, prepared batch fragments that run through a work routine have proven practical. A very simple batch fragment of this type is the following, which creates a last name in three languages:

qid, Lde, Len, Lfr, Dde, Den, Dfr, P2
, "#", "#", "#", "Familienname", "family name", "nom de famille", Q24499

The entire fragment is copied into QuickStatements using copy and paste. In the present case, all # characters are replaced by the last name for which a Q-number has so far been missing, and the input is then submitted as CSV. Before the actual processing, QuickStatements lists the work steps that the program will carry out. Once submitted, the new name is generated in the desired languages much faster than by manual input.
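
Replacing the # characters can also be scripted when several names are to be created. The sketch below fills the fragment shown above with one name; the function name fill_fragment is invented, and "Weishaupt" only serves as an example of a family name, not as a claim that it is missing from FactGrid.

```python
FRAGMENT = (
    'qid, Lde, Len, Lfr, Dde, Den, Dfr, P2\n'
    ', "#", "#", "#", "Familienname", "family name", "nom de famille", Q24499'
)


def fill_fragment(fragment, name):
    """Replace every # placeholder in the batch fragment with the given name."""
    name = name.strip()
    if not name:
        raise ValueError("the name must not be empty")
    if '"' in name:
        # A double quote would break the quoted CSV fields of the fragment.
        raise ValueError("the name must not contain double quotes")
    return fragment.replace("#", name)


batch = fill_fragment(FRAGMENT, "Weishaupt")
print(batch)
```

The printed text can be pasted into QuickStatements as CSV input; the list of work steps shown before processing remains the place to check the result.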

A list of popular fragments can be found under the link "Batch fragments" in the menu on the left. More complex work routines can be tailored to your own project with any number of statements that you do not want to make individually.

Internet demonstrations on the use of QuickStatements

There are several videos on YouTube on how to use QuickStatements when entering prepared table information. Their screen demonstrations make the procedure clearer than cumbersome continuous text descriptions. The Wikidata help also provides extremely detailed explanations on all questions of detail.

https://www.youtube.com/results?search_query=quickstatements+wikidata