Talk:Main Page
Contact
- Olaf Simons, olaf.simons@pierre-marteau.com, office: +49-361-737-1722, mobile: +49-179-5196880
- Wikimedia.de 49-30-219 15 826-0
- University of Erfurt computing centre, database support: Tobias Weller, tobias.weller@uni-erfurt.de, +49 361 737-5468
Project: Getting a richer data environment
- We have realised that editing individual statements on a Wikibase installation is possible and still quite easy, but it is anything but the ideal way to work with the medium. The Wikidata community learns how to use the database and its query services because it is not working on an empty platform: there are already things there to search, to connect, and to add to.
- We need a plan for how we could, with strategic wisdom, integrate larger amounts of data from
- Wikidata https://www.wikidata.org/
- The German GND http://www.dnb.de
- The English ESTC http://estc.bl.uk (blog post on developments)
- The German VD16 [1], VD17 [2], VD18 [3]
- The Dutch STCN [4]
- The project should be of interest to Wikidata, as we would serve as a test case and entry point for these data sets.
- We would need an idea of how to mine these databases systematically. We are not interested in the entire GND or in running a Wikidata fork; we should think of selective inputs instead (such as all people with particular birth years).
- We will need a tool to automatically find and merge duplicate entries when we merge database information from these sources.
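The last two points (selective input and hunting for duplicates) could be prototyped roughly as follows. This is only a minimal sketch under stated assumptions: the record layout (`source`, `id`, `name`, `birth_year`), the sample rows and the matching rule (normalized name plus birth year) are all invented for illustration and do not reflect the actual export formats of the GND, ESTC or Wikidata. The groups it returns are candidates for human review, not confirmed duplicates.

```python
from collections import defaultdict
import unicodedata


def select_by_birth_year(records, first_year, last_year):
    """Keep only records whose birth year lies in [first_year, last_year]."""
    return [
        r for r in records
        if r.get("birth_year") is not None
        and first_year <= r["birth_year"] <= last_year
    ]


def normalize_name(name):
    """Lowercase, strip accents and collapse whitespace for a rough comparison."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    )
    return " ".join(without_accents.lower().split())


def duplicate_candidates(records):
    """Group records that share a normalized name and a birth year.

    Only groups with more than one record are returned.
    """
    groups = defaultdict(list)
    for r in records:
        key = (normalize_name(r["name"]), r["birth_year"])
        groups[key].append(r)
    return [group for group in groups.values() if len(group) > 1]


# Invented sample rows; the "id" values are placeholders, not real identifiers.
records = [
    {"source": "GND", "id": "gnd-0001", "name": "Johann Müller", "birth_year": 1701},
    {"source": "Wikidata", "id": "wd-0001", "name": "johann  muller", "birth_year": 1701},
    {"source": "ESTC", "id": "estc-0001", "name": "John Smith", "birth_year": 1650},
]

selected = select_by_birth_year(records, 1700, 1710)
candidates = duplicate_candidates(selected)
```

A real tool would need a more careful matching rule (name variants, uncertain dates, shared external identifiers), so the actual merging is best left to a person who reviews each candidate group.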
Project: The Wikibase Interface
See Needed thing #3: An attractive Wikibase-Interface for browsing and reading Wikibase information
Web Input Forms
Community building
We should get in touch with everyone who runs a Wikibase installation:
Getting started
- User's Guide
- Configuration settings list
- MediaWiki FAQ
- MediaWiki release mailing list
- Localise MediaWiki for your language
- Learn how to combat spam on your wiki
PhiloBiblon
I would like to introduce PhiloBiblon, a new FactGrid project, and seek your advice on first steps.
PhiloBiblon is a database of the Romance vernacular sources of medieval and early modern Iberian culture: the primary sources themselves, both MS and printed, the texts they contain, the individuals involved with the production and transmission of those sources and texts, the libraries holding them, along with relevant secondary references and authority files for persons, places, and institutions.
PhiloBiblon consists of four separate bibliographies, with a cumulative total of ~415,000 records. Each of these has its own Home Page on the current PhiloBiblon website at UC Berkeley: https://bancroft.berkeley.edu/philobiblon/
1. BETA / Bibliografía Española de Textos Antiguos: Medieval texts in Spanish. (https://bancroft.berkeley.edu/philobiblon/beta_en.html) FactGrid Item:Q254471
2. BIPA / Bibliografía de la Poesía Áurea: Golden Age Poetry (16th-17th c.) in Spanish: https://bancroft.berkeley.edu/philobiblon/bipa_en.html
3. BITAGAP / Bibliografia de Textos Antigos Galegos e Portugueses: Medieval texts in Galician, Galician-Portuguese, and Portuguese: https://bancroft.berkeley.edu/philobiblon/bitagap_en.html
4. BITECA / Bibliografia de Textos Antics Catalans, Valencians i Balears: Medieval texts in Catalan: https://bancroft.berkeley.edu/philobiblon/biteca_en.html
PhiloBiblon currently runs on a relational DBMS under Windows with tables for ten entities: (1) Persons, (2) Institutions, (3) Libraries, (4) Toponyms, (5) Primary sources [MSS and early prints], (6) Copies [additional copies of early prints], (7) Texts, (8) Witnesses [a copy of a text in a given primary source], (9) Subject headings, (10) Secondary sources.
We have a detailed data dictionary for each table.
My first question: What general advice would you give me about using FactGrid?
My second question: What’s the most efficient way to create FactGrid data models for these ten entities?
I know that FactGrid has existing data models for most if not all of these entities, but a preliminary look suggests that in some cases they will need to be extended to accommodate PhiloBiblon data.
Olaf has suggested that we start with toponyms, libraries, and institutions so that the FactGrid Q# will be available for the creation of other entities. --Charles Faulhaber (talk) 22:00, 24 May 2021 (CEST)
- On the German Wikisource, experienced users recommend that newcomers with questions like yours take a look at "Pope Joan". And indeed, the German Wikisource article https://de.wikisource.org/wiki/P%C3%A4pstin_Johanna has answers to nearly every question that might arise in their project. So simply get started: search for, or create, your own suitable Pope Joan. --Martin Gollasch (talk) 22:26, 24 May 2021 (CEST)
Basic FactGrid Questions
What is FactGrid? What is special about the service?
FactGrid is a Wikibase installation that brings the software developed for Wikimedia’s Wikidata project into the world of academic research. It serves as both a technological and cultural game changer. Technologically, it is innovative because we are no longer constrained by the limitations of relational databases. Anything can now become an object of knowledge, or an "item," in any arrangement of properties that we can define as needed.
Culturally, FactGrid is a game changer because the software is designed to be used by large communities of users from diverse linguistic backgrounds, all working on the same objects in their respective languages.
What is the main purpose of the PID service?
FactGrid provides Persistent Identifiers (PIDs) for all database objects by default. Users assign “labels” to these Q-numbered “items” to make the information available in their respective languages. The same is done for the “properties” that interconnect the “items”; these form the stock of potential input fields under ascending P-numbers. The platform presently holds 1,059,221 items and 1,200 properties with which to make statements.
The main goal of the project is to offer an alternative to the ongoing proliferation of data silos. On FactGrid all the platform’s projects will interconnect their information to the same items which all the others are using – we have some 12 different items called “Paris” with only one “Paris” (Q10441) being the capital of France. Researchers invest their time in improving the quality of information, where so far they have been creating and recreating information that is already available in various other silos. FactGrid identifiers are interconnected with identifiers in databases elsewhere. Other databases, in turn, link back to FactGrid, to facilitate the mutual identification of objects.
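To make the item, label and property vocabulary used above a little more concrete, here is a deliberately simplified sketch. It is not the Wikibase data model or its JSON format, only an illustration of the idea that one Q-numbered item carries labels in several languages and statements keyed by property. The class `Item`, its fields, and the property and target identifiers in the example are hypothetical; only Q10441 for Paris is taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Item:
    """A toy stand-in for a database item: one identifier, many labels."""

    qid: str
    labels: Dict[str, str] = field(default_factory=dict)
    # Each statement is a (property id, value) pair; the ids used in the
    # example below are placeholders, not real FactGrid properties.
    statements: List[Tuple[str, str]] = field(default_factory=list)

    def label(self, language, fallback="en"):
        """Return the label in the given language, else the fallback, else the QID."""
        return self.labels.get(language) or self.labels.get(fallback) or self.qid


paris = Item(
    qid="Q10441",
    labels={"en": "Paris", "de": "Paris", "hu": "Párizs"},
    statements=[("P-capital-of", "Q-France")],  # invented placeholder ids
)

hungarian = paris.label("hu")
french = paris.label("fr")  # no French label in this toy example, falls back to "en"
```

The point of the sketch is that every project refers to the same identifier, while each user reads and edits the item under a label in their own language.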
What are the requirements that projects must fulfill?
FactGrid data are free to download under the CC0 license. Adding new information is, however, restricted to registered users under real name accounts. Principal investigators are provided with administrative accounts which they use to generate accounts for their teams. All projects are asked to state their research interests on the platform. They are, in return, free to create any objects they are interested in. Unlike the GND or Wikidata, FactGrid has no notability criteria. We invite projects to create items about which they would love to know far more.
How does one get a FactGrid Wikibase QID? (Where can I find information about the first steps?)
Contact information is available on the front page (olaf.simons@pierre-marteau.com), and new accounts are usually granted within minutes. The QIDs for items are, on the other hand, automatically created whenever a new object is generated—whether “manually” via the “new item” menu or in bulk via batch inputs from a spreadsheet.
The challenge lies not in the creation of QIDs, but in preventing the creation of duplicate objects. OpenRefine helps by checking the database in advance, or alternatively, one can match data with the VLOOKUP function in any spreadsheet. New projects that have external identifiers on their items can easily check if FactGrid items are referring to the same identifiers.
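The VLOOKUP-style check described above can also be done in a few lines of code. The following is a minimal sketch, assuming that both the project data and an export of existing items are available as lists of rows, and that both carry the same external identifier in a column called `ext_id`. The column names, the identifiers and the pairing of the sample identifier with Q10441 are invented for illustration.

```python
def build_lookup(existing_rows, id_column):
    """Map each external identifier to the QID that already carries it."""
    return {
        row[id_column]: row["qid"]
        for row in existing_rows
        if row.get(id_column)
    }


def split_matched_and_new(project_rows, lookup, id_column):
    """Attach a known QID where the identifier matches; collect the rest as new."""
    matched, new = [], []
    for row in project_rows:
        qid = lookup.get(row.get(id_column))
        if qid:
            matched.append({**row, "qid": qid})
        else:
            new.append(row)
    return matched, new


# Invented sample data; "ext-paris" and "ext-unknown" are placeholder identifiers.
existing = [{"qid": "Q10441", "ext_id": "ext-paris"}]
project = [
    {"name": "Paris", "ext_id": "ext-paris"},
    {"name": "Some village", "ext_id": "ext-unknown"},
    {"name": "No identifier yet"},
]

lookup = build_lookup(existing, "ext_id")
matched, new = split_matched_and_new(project, lookup, "ext_id")
```

Rows in `matched` would be used to add statements to existing items, while rows in `new` still need a manual or OpenRefine check before any new items are created, since a missing identifier does not prove that the object is absent from the database.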
Information about the software and the platform is available, but projects usually prefer to discuss their data first in a video call.
What costs should I expect?
FactGrid services are free. The user community has formed a stakeholder association, with the goal of keeping the platform free of charge. Until 2023, Gotha’s Research Centre funded FactGrid with €3,000 per year. Since then, the platform has been supported by Germany’s National Research Data Initiative (NFDI) with a five-year grant of approximately €20,000 per year. These funds primarily support the creation of tools that all projects can use on the database. Several of our projects invest the savings into specific tools for their respective home pages. The database allows external services to access all its information without further barriers.
What are the most important/urgent questions users have?
Most questions are practical:
- How should we prepare spreadsheets for input?
- Which FactGrid properties should we use for the best integration?
- Which new properties do we need?
- How do we match place names in our data with the over 300,000 places already available on FactGrid?
- Where can we find specific datasets to feed into FactGrid, now that we realize we shouldn't duplicate existing data from other sources?
- How do we communicate on the platform?
- How long will it take us to create a set of 1,000 people? (Only a few minutes, once you have identified the overlaps with already existing information.)
Where can I find information to start working on the platform?
There is a popular intro page at the start of the blog (link); terms of service are listed on the main page; the help section can be accessed through the menu. All projects on FactGrid (listed in the menu for project spaces) can be contacted on site, via email or through the external home pages they advertise. The most common entry point is the coordinator’s email address on the front page, followed by an exchange on the project’s video channel.
Is there a FAQ page?
Yes, here: https://blog.factgrid.de/archives/1591 – available in English, German, French, and Hungarian.