User talk:László Kiss

From FactGrid
Revision as of 17:11, 21 September 2023 by Martin Gollasch (talk | contribs) (→‎Meta)

Some little changes that make things easier

  • Do not put begin and end dates into the reference section (which is there to state your sources); use them as qualifiers instead. --Olaf Simons (talk) 16:34, 1 August 2022 (CEST)

Places

...as I said you might want all Hungarian places in the machine as it is painful to create these places one by one. If you create a place, give the Wikidata-Identifier. The cool thing about these is that they allow searches from FactGrid into Wikidata and that you cannot create two places with the same Wikidata-identifier. (do not be disturbed by my messages, I am just trying to give you the wider view) --Olaf Simons (talk) 15:38, 1 August 2022 (CEST)

I am not quite sure how the Hungarian Wikidata community organised their places. This search gives 3164 places. It would be a fast thing to import them all - faster than to create 40 one by one. --Olaf Simons (talk) 15:38, 1 August 2022 (CEST)

First Searches

Almost a closed container

Places coming in

Dear László, I have just started an input of all Hungarian places (map view) - it's coming from south to north, so you can start matching (with Geonames IDs or with Wikidata IDs as you please). Hope it facilitates your work, --Olaf Simons (talk) 08:49, 2 August 2022 (CEST)

...and make sure that new places you create are connected to Wikidata - it is your chance to run queries via that link into the Wikidata-database and, in the future, into other Wikibase installations (quite a cool feature) --Olaf Simons (talk) 18:30, 2 August 2022 (CEST)

Research statement and batches?

Dear László, I have just put a "research statement" on all your work as far as it was easy for me to catch it. The P131 statements are, as you will notice, extremely useful to catch all "your" items on one search and to present work to your funding institution - or to a frontend that could run on your institute's website and that would just offer "your" items (if that should be interesting).

You can easily change the text on the item (as I created the title from your first mail), and you can add more information on your project, like funding institution, home institution, whatever you like.

Looking over your shoulder I was wondering whether you should not create a "batch" for your input. The batch gets all the repeating standard information like: is a human, is a man, is a Hungarian citizen, is part of my research project, is a member of parliament... so that you only have to make the particular statements on each person. If you have a spreadsheet of all the people you could also consider running them all into the machine with all basic information and then go for the particulars one by one. --Olaf Simons (talk) 09:02, 3 August 2022 (CEST)

A batch fragment

qid,Lhu,Lde,Lfr,Len,P2,P154,P165,P616,P131
,"#","#","#","#",Q7,Q18,Q145220,Q140539,Q429552

Maybe another statement they would all get is member of (P91) Communist Party of Hungary. If there is a woman among them, change Q18 to Q17...

You put this fragment into QuickStatements, change # for the names (first is Hungarian for your preferred option of family name first), then send it off as CSV input. --Olaf Simons (talk) 09:49, 3 August 2022 (CEST)
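A minimal Python sketch of how the batch fragment above could be filled from a list of names. The header and the shared values are copied from the fragment; the function name `batch_csv`, the `sex_item` parameter and the sample name are hypothetical, and Python's `csv` module only adds quotes where a label needs them (the fragment quotes every label).

```python
import csv
import io

# Header and shared values copied from the batch fragment in this thread.
HEADER = ["qid", "Lhu", "Lde", "Lfr", "Len", "P2", "P154", "P165", "P616", "P131"]


def batch_csv(people, sex_item="Q18"):
    """Build QuickStatements CSV text for (family_name, given_name) tuples.

    sex_item is Q18 in the fragment; the thread says to use Q17 for a woman.
    """
    shared = ["Q7", sex_item, "Q145220", "Q140539", "Q429552"]
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(HEADER)
    for family, given in people:
        hungarian = f"{family} {given}"   # family name first in Hungarian
        western = f"{given} {family}"     # given name first elsewhere
        # Empty qid cell: the row creates a new item.
        writer.writerow(["", hungarian, western, western, western] + shared)
    return buf.getvalue()


# "Minta János" is a made-up sample name, not a person from the data set.
print(batch_csv([("Minta", "János")]))
```

The printed text can then be pasted into QuickStatements as CSV input, as described above.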

First name/ last name

I have just looked through the list of members of the academy of sciences. We could match them against Wikidata and get all kinds of dates from there... the labels could be matched. It seems as if you presently have a mix of first/last name options in the entire set. (You can handle this differently in the languages, start with last names in Hungarian, start with given names in English.) Some double records might have crept into the set Item:Q429607, Item:Q429373 (not to mention the other Szabó Istváns...). If you get first name/ last name labels on the English labels I will take a look at Wikidata's Members of the Academy of Sciences. --Olaf Simons (talk) 18:35, 3 August 2022 (CEST)

While I am holding my breath

while I am holding my breath, looking at the granulation of data you are beginning to create... - that will be big, if you manage it! Let Labels begin with capitals, it gives coherent alphabetical listings in regular SPARQL queries... --Olaf Simons (talk) 23:41, 4 August 2022 (CEST)

Dear Olaf, my question is whether it is possible to import dates (birth and death) and career statement qualifiers (Begin date, End date) as a CSV command with QuickStatements. László Kiss (talk) 07:58, 5 August 2022 (CEST)
Absolutely possible:
The only thing that comes as an addition is linking to Wikidata ("Swikidata"). Sometimes you will wonder whether to do a version 1 input or CSV. Version 1 from Spreadsheet is often less tricky. CSV will often need three quotes on a """Text input""". It is also less easy with empty cells. I usually use CSV to create items and Version 1 when I continue from spreadsheets. Version 1 is also easier with qualifiers, but only slightly... --Olaf Simons (talk) 08:06, 5 August 2022 (CEST)
You know how to get year-month-day dates into the machine? (as I see you are creating year dates and full dates...) --Olaf Simons (talk) 09:48, 5 August 2022 (CEST)


Yes, this is the command:

qid,P77
Q426075,+1891-08-15T00:00:00Z/9
Q426075,+1891-08-15T00:00:00Z/10
Q426075,+1891-08-15T00:00:00Z/11

The first line will be the year, the second line the month and the third line the day. László Kiss (talk) 12:33, 5 August 2022 (CEST)

Is it a better command line? László Kiss (talk) 12:40, 5 August 2022 (CEST)

Oh, I see, I only need the third line. László Kiss (talk) 12:42, 5 August 2022 (CEST)

Exactly. The other options are only needed if you do not know it any better. --Olaf Simons (talk) 12:46, 5 August 2022 (CEST)
You can reverse an input if you just set a -Q and run the respective lines again. (also a reason why I prefer the Version 1 input. It is less difficult to reverse processes.) --Olaf Simons (talk) 13:39, 5 August 2022 (CEST)
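The date precision suffixes discussed above (/9 for year, /10 for month, /11 for day) and the reversal by a leading minus sign can be sketched in Python as follows. The helper names `qs_time` and `v1_line` are hypothetical; the tab-separated Version 1 layout is an assumption based on the usual QuickStatements syntax, not something spelled out in this thread.

```python
from datetime import date

# Precision codes as used in the thread: 9 = year, 10 = month, 11 = day.
PRECISION = {"year": 9, "month": 10, "day": 11}


def qs_time(d, precision="day"):
    """Format a date as a QuickStatements time value with a precision suffix."""
    return f"+{d.year:04d}-{d.month:02d}-{d.day:02d}T00:00:00Z/{PRECISION[precision]}"


def v1_line(qid, prop, value, remove=False):
    """Build one tab-separated Version 1 line; a leading '-' reverses the input."""
    line = "\t".join([qid, prop, value])
    return "-" + line if remove else line


birth = qs_time(date(1891, 8, 15))
print(v1_line("Q426075", "P77", birth))               # add the statement
print(v1_line("Q426075", "P77", birth, remove=True))  # take it back again
```

Only the day-precision line is needed when the full date is known; the year and month variants are for dates that are only known that coarsely.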


  • Item:Q434688 has two places of birth. I suppose you wanted to put the second on the man with the same name... --Olaf Simons (talk) 16:49, 5 August 2022 (CEST)

Properties

Dear Olaf, I would like to ask for help in how I could detail the data of the members of parliament in the database using appropriate properties. In order to be able to connect the data to our other institute databases, the parliamentary circle and the membership type should be recorded in addition to the parliamentary membership. In the case of the latter, it would also be necessary to be able to record entry from the party list or from a constituency (just like membership type - list or constituency, if constituency, then which district). Are there suitable properties for this purpose, or do you recommend creating new properties? László Kiss (talk) 11:46, 23 August 2022 (CEST)

Dear László, is "Status of affiliation" Property:P277 the thing you are looking for? It's to be used as a qualifier on P91 statements. We could do a video call tomorrow afternoon if that suits you. --Olaf Simons (talk) 12:56, 23 August 2022 (CEST)


Then perhaps it is best if I create subclasses for the currently unified "Member of the National Assembly of Hungary" variable, such as "Member of the National Assembly of Hungary 1945-47", "...1947-49", I assign these under P165 for the individual persons and link P277 to them. (It could have changed in each different cycle.) László Kiss (talk) 13:27, 23 August 2022 (CEST)

The better way should be to search with filters. You have given begin and end dates - so filters should enable you to ask for special time frames. We can ask User:Bruno Belhoste if you have an example of what you want to be able to show. He is good at understanding how to search for information. --Olaf Simons (talk) 13:30, 23 August 2022 (CEST)

The problem with the filters is that not all parliamentary members entered the parliament on the first day of the term - there were changes, resignations, deaths, etc. I should somehow understand the parliamentary cycle as a variable. László Kiss (talk) 13:34, 23 August 2022 (CEST)

Another solution could be if we could add time interval filters to the queries. Is there such an option in sparql? László Kiss (talk) 13:46, 23 August 2022 (CEST)

Sure, SPARQL should be the solution. See this section: FactGrid:Sample_queries#Filters --Olaf Simons (talk) 15:34, 23 August 2022 (CEST)
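The time-frame problem raised above (members who joined or left in the middle of a term) comes down to an interval overlap test rather than a match on the first day of the term. A minimal Python sketch of that logic, with a hypothetical function name and made-up dates that are not taken from the data set:

```python
from datetime import date


def overlaps(begin, end, term_start, term_end):
    """True if a membership [begin, end] overlaps a term [term_start, term_end].

    An end of None is treated as an open-ended membership.
    """
    end = end or date.max
    return begin <= term_end and end >= term_start


# Illustrative dates only (assumptions, not the real term boundaries).
term = (date(1945, 11, 29), date(1947, 7, 25))
print(overlaps(date(1946, 3, 1), date(1947, 7, 25), *term))  # late entrant
print(overlaps(date(1947, 9, 16), None, *term))              # joined after the term
```

The same two comparisons can be written as a FILTER on the begin and end date qualifiers in a SPARQL query.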

Wikidata has a property for the parliamentary term: https://www.wikidata.org/wiki/Property:P2937 Would it be possible to create a property with such content on FactGrid as well? László Kiss (talk) 16:45, 23 August 2022 (CEST)

and created: Parliamentary term Property:P795

Thank you! László Kiss (talk) 18:06, 23 August 2022 (CEST)


Dear Olaf, Can I have one last question for today? From among the parliamentary members, I would like to select the representatives of cycle Q447232 using the P795 property as a qualifier. What did I write wrong in this query? Run does not find a matching record.

SELECT ?item ?itemLabel ?Career_statement ?Career_statementLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?item wdt:P165 wd:Q426073;
        pq:P795 wd:Q447232.
}
ORDER BY (?itemLabel)

László Kiss (talk) 19:34, 23 August 2022 (CEST)

I found the error, thanks. :) László Kiss (talk) 21:44, 23 August 2022 (CEST)

I guess this was what you were looking for: https://tinyurl.com/2opskjqw --Olaf Simons (talk) 22:07, 23 August 2022 (CEST)
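For readers following along: a qualifier hangs on the statement, not on the item, so a query has to go through the statement node. The Python sketch below only assembles such a query as text. It is not the query behind the short link above; it assumes the standard Wikibase prefixes `p:`, `ps:` and `pq:` are available on the FactGrid query service, and the function name is hypothetical.

```python
def members_of_term(position_qid, term_qid):
    """Return a SPARQL query for items holding a P165 statement with a P795 qualifier."""
    return f"""SELECT ?item ?itemLabel WHERE {{
  ?item p:P165 ?statement .
  ?statement ps:P165 wd:{position_qid} ;
             pq:P795 wd:{term_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }}
}}
ORDER BY ?itemLabel"""


# Item numbers taken from the question above.
print(members_of_term("Q426073", "Q447232"))
```

The difference from the attempt above is that `pq:P795` is attached to `?statement` rather than to `?item`.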


Dear Olaf, If I would like to add the place of work to a career_statement, should I use P83 as a qualifier? László Kiss (talk) 10:31, 7 September 2022 (CEST)

  • We started with P47 to bring localisations into statements.
  • We then felt a need for a special statement (P83) for the place where a person is living (or where an organisation has a branch) - which is more a question of where you have a house with an address.
  • We eventually added P296 (stay in) because people can have a home address but move around on journeys every other year.

So I'd go for P47 or (if you want to refer to the place where the person has his address) P83. --Olaf Simons (talk) 20:50, 7 September 2022 (CEST)

Quarters of Budapest

Dear László, can you check the new city quarters against Wikidata? We should have references to all their Wikidata objects so that complex searches can run through federated queries. --Olaf Simons (talk) 12:43, 11 August 2022 (CEST)

With QuickStatements you have to add the Wikidata-Q-Numbers with Swikidata as command instead of a P number. So your FactGrid Q2345 + Swikidata + Q1223 (Wikidata-Q) --Olaf Simons (talk) 12:47, 11 August 2022 (CEST)
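A small Python sketch of the Swikidata line described above. The function name is hypothetical, and putting the Wikidata Q-number in double quotes is an assumption (sitelink values are normally quoted strings in Version 1 input); the two Q-numbers are the placeholders from the message above.

```python
def wikidata_link(factgrid_qid, wikidata_qid):
    """Build a tab-separated Version 1 line linking a FactGrid item to Wikidata."""
    # Assumption: the Wikidata Q-number is passed as a quoted string.
    return f'{factgrid_qid}\tSwikidata\t"{wikidata_qid}"'


print(wikidata_link("Q2345", "Q1223"))
```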


Yes, I'm just trying to plan the whole QS command line because I have to record more than 500 settlements. In addition to the districts of Budapest, there are many other former independent settlements, settlements annexed from Hungary, and various American and other small towns. László Kiss (talk) 13:13, 11 August 2022 (CEST)

By the way: we have two places allegedly in Hungary (according to German 18th-century history) which we could not identify: Weinburg and Schloßhof (see the search "Hungarian places without geocoordinates in the beginning"). Have you got an idea what they could have been in Hungarian? (Maybe they are mistakes; there are German places with such names.) --Olaf Simons (talk) 13:47, 11 August 2022 (CEST)

I don't think they are Hungarian settlements, there is an Austrian settlement called Schlosshof near Devin, on the other side of the Morava. Weinburg was a hunting castle of the Habsburg rulers near Braunsee in the 16th century. László Kiss (talk) 15:26, 11 August 2022 (CEST)

Dear Olaf, is it possible to import new items (in this case settlements) using the wikidata identifier? László Kiss (talk) 19:19, 11 August 2022 (CEST)

If you are looking for an automatic way to switch information from Wikidata to FactGrid, that should be programmed one of these days. So far I run a Wikidata search for a field I am interested in (like all Hungarian places), then I run a FactGrid search for all places we have with Wikidata identifiers, and then I subtract: I do not create new items where we already have these items (on the basis of the Wikidata identifiers). If you feel like creating all places of Slovenia or the US, feel free. Others will make use of them sooner or later. Create them with Wikidata identifiers so that we can always see what we already have. Both I and some Wikidata people try to do the reverse thing: make sure that Wikidata knows all our numbers as well. (I'll be more or less without internet from Saturday onwards till Monday 22. I guess User:Bruno Belhoste will help you with questions on SPARQL and mass inputs. User:Martin Gollasch will also be around with a look at properties we have and how we use them.) --Olaf Simons (talk) 22:41, 11 August 2022 (CEST)
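The subtraction step described above can be sketched in Python as a simple filter on Wikidata identifiers. The function name and all identifiers and labels in the example are made-up placeholders; in practice both inputs would come from the two searches.

```python
def missing_places(wikidata_places, factgrid_links):
    """Return the Wikidata places that have no FactGrid item yet.

    wikidata_places: dict mapping Wikidata Q-number -> label (result of the Wikidata search)
    factgrid_links:  set of Wikidata Q-numbers already linked from FactGrid items
    """
    return {qid: label for qid, label in wikidata_places.items()
            if qid not in factgrid_links}


# Placeholder data, not real identifiers.
wikidata_places = {"Q900001": "Place A", "Q900002": "Place B", "Q900003": "Place C"}
factgrid_links = {"Q900002"}
print(missing_places(wikidata_places, factgrid_links))
```

Only the remaining places are then created, each with its Wikidata identifier, so that the next run of the same subtraction sees them.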

P626

Dear László, I just saw your use of the P626 statement. It is designed to be made on "career statements". Someone is a farmer; the "farmer" item gets the P626 statement that this is an agricultural job. We can then ask for statistics. If 1000 people have 300 jobs, that gives a terrible statistical distribution. So we ask the machine: look at the jobs, then go to these jobs and see what category they are, and create statistics not on the job statements but on the P626 statements that place the 300 jobs in a dozen categories.

Your present P626 on these persons should better appear under P97 statements.
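The two-step statistic described above (from person to job, from job to its P626 category) can be sketched in Python like this. The function name and the sample data are hypothetical placeholders, not values from the data set.

```python
from collections import Counter


def category_counts(person_jobs, job_category):
    """Count people per job category instead of per individual job.

    person_jobs:  dict mapping person -> job (the career statement)
    job_category: dict mapping job -> category (the P626 statement on the job item)
    """
    return Counter(job_category[job]
                   for job in person_jobs.values()
                   if job in job_category)


# Placeholder data for illustration only.
person_jobs = {"person A": "farmer", "person B": "winegrower", "person C": "lawyer"}
job_category = {"farmer": "agriculture", "winegrower": "agriculture", "lawyer": "law"}
print(category_counts(person_jobs, job_category))
```

Because the category sits on the job item, it is stated once per job and not once per person.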

I'll be back to work next week. It might be good if you give me some of your ideas in a video meeting. You have created an extremely interesting data set. There should be lots of explorations possible. --Olaf Simons (talk) 22:28, 18 August 2022 (CEST)

Dear Olaf, I will fix it all. László Kiss (talk) 23:02, 18 August 2022 (CEST)

Legislative terms

Dear László, I did some data modelling on Item:Q447232 - mostly to get rid of the P2 statement that would have moved your items into the directory of properties --Olaf Simons (talk) 22:33, 23 August 2022 (CEST)

4 x Ferenc Nagy

I suppose they are all the same. It would be good to map all these people against Wikidata. That would help you to spot double records. You will be able to merge them, as you realised, but you will have additional work later when you stumble over links to merged items (they are redirected but not really merged...) --Olaf Simons (talk) 22:39, 23 August 2022 (CEST)

No, they are all different people (and the many Varga and Tóth and Kiss :) and Kovács and Szabó...) I will upload the demographic and career data in more detail and you will see the difference. Thanks for data modeling, I really like this solution. László Kiss (talk) 22:44, 23 August 2022 (CEST)

eventually we will need "Descriptions" on each of your people. Our favourite solution is something like "* 1934-10-15 Budapest, + 2004-08-16 Vienna, Hungarian politician", so that people who insert the name into the regular input field get an immediate idea whom to pick. --Olaf Simons (talk) 08:20, 24 August 2022 (CEST)
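A minimal Python sketch that assembles a description in the pattern quoted above. The function name and parameters are hypothetical; missing dates or places are simply left out.

```python
def describe(role, birth_date="", birth_place="", death_date="", death_place=""):
    """Build a description like '* date place, + date place, role'."""
    parts = []
    birth = " ".join(p for p in (birth_date, birth_place) if p)
    if birth:
        parts.append("* " + birth)
    death = " ".join(p for p in (death_date, death_place) if p)
    if death:
        parts.append("+ " + death)
    parts.append(role)
    return ", ".join(parts)


# Values taken from the example pattern in the message above.
print(describe("Hungarian politician", "1934-10-15", "Budapest", "2004-08-16", "Vienna"))
```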

OK, I'm on it. :)

Good evening László, would an external identifier be good for this database? https://www.abtl.hu/ords/archontologia/f?p=108:13:::NO:13:P13_OBJECT_ID,P13_OBJECT_TYPE:950138,ELETRAJZ --Olaf Simons (talk) 23:21, 25 August 2022 (CEST)

Thanks, I don't think it's necessary, the database is only slightly in contact with mine. Intelligence officers typically did not hold higher government positions. László Kiss (talk) 20:30, 26 August 2022 (CEST)

Politics P91 and P661

Dear László, you have noticed that we have two properties in the field of personal political tendencies. P91 is used if you actually are a member, but then we added P661 to state party leanings as well (useful in the early modern era, where you can be Whig or Tory without really being a member of anything). If your people Item:Q447319 are actually registered members you might prefer shifting them to the P91 property. --Olaf Simons (talk) 08:44, 30 August 2022 (CEST)

Oh, thank you, I will fix it. Can you tell me how I could get a qualifier into a query? I have all the "political_orientation" values per person, but I would like to get all the P661 values with a qualifier P49.

SELECT ?item ?itemLabel ?Political_orientation ?Political_orientationLabel ?Begin_dateLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?item wdt:P131 wd:Q429552.
  ?item wdt:P661 ?Political_orientation
}
ORDER BY (?item)

László Kiss (talk) 10:31, 30 August 2022 (CEST)

(With bad internet access:) see the sample searches, I left some searches there (scripted by User:Bruno Belhoste). --Olaf Simons (talk) 10:49, 30 August 2022 (CEST)
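One possible shape for the query asked about above, kept as a Python string so it can be pasted into the query service. This is a sketch, not one of the sample searches mentioned: it assumes that P49 is the begin date qualifier and that the `p:`, `ps:` and `pq:` prefixes are predefined. The OPTIONAL clause keeps people whose P661 statement has no P49 qualifier.

```python
# Assumption: P49 is the "Begin date" qualifier on P661 statements.
QUERY = """SELECT ?item ?itemLabel ?orientationLabel ?begin WHERE {
  ?item wdt:P131 wd:Q429552 ;
        p:P661 ?statement .
  ?statement ps:P661 ?orientation .
  OPTIONAL { ?statement pq:P49 ?begin . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?itemLabel"""

print(QUERY)
```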

Hungarian in the FactGrid viewer

I have added Hungarian to the languages of the FactGrid viewer. I used DeepL for the translations. Can you please tell me if there are any errors or omissions? --Bruno Belhoste (talk) 14:55, 30 August 2022 (CEST)

Dear Bruno, I see everything is fine in it. László Kiss (talk) 16:42, 30 August 2022 (CEST)

Return and new question

Dear Olaf! After a forced one-year break, I will resume the development of the database, hope to add more useful data fields and try to complete the existing data fields and descriptions. I have a question for you in a moment: what should I do if one of the persons in my sample is also included in the database as part of another research project here, with a different identifier? There is no difference in the basic data, but the two sheets do not contain the same data. Simple merge is not good I guess, because wrong references can be left afterwards. László Kiss

Dear László, it does not feel like a year (maybe as I remain overburdened with the growth of the project). Straight to your question: if it is the same person, it can only exist once on FactGrid. We usually merge from the higher to the lower number (assuming that the lower number might already have an external mapping on Wikidata, which should be checked on Wikidata). Otherwise we merge from the less connected to the more heavily connected item, to save work in the ensuing link corrections.
If two projects are working on the same person - well that's what we want to see, people working on the same objects and adding to the research of others. So why not have two P131-research statements on the same item? Which is the item in question? --Olaf Simons (talk) 11:39, 21 September 2023 (CEST)

Now I have suddenly found two: Q224295 and Q434878, and Q430496 and Q219592. László Kiss

Sándor Ék (Q224295) is a general contribution of this project. Just merge (either direction is possible).--Martin Gollasch (talk) 13:20, 21 September 2023 (CEST)
Georg von Bartok (Q219592) is from the Schopenhauer-Project, and a merge would be exactly what is wanted here. This way we create a surplus in information value--Martin Gollasch (talk) 13:24, 21 September 2023 (CEST)
Merging is on the menu (left). Clear the descriptions of one of the two candidates, as the software cannot handle two alternative descriptions. If the labels are different, one will automatically go into the alias section. It's a standard. --Olaf Simons (talk) 13:29, 21 September 2023 (CEST)
PS. Always link to Wikidata. It is impossible to create two items with the same Wikidata equivalent on FactGrid, which is a cool feature to minimise the number of double records. --Olaf Simons (talk) 13:30, 21 September 2023 (CEST)
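The usual merge direction mentioned in this thread (from the higher to the lower item number) can be sketched in Python as below. The function name is hypothetical, and the rule is only the default described above; the thread also allows other directions, for instance towards the more heavily connected item.

```python
def merge_direction(qid_a, qid_b):
    """Return (source, target): merge the higher Q-number into the lower one."""
    source, target = sorted((qid_a, qid_b), key=lambda q: int(q[1:]), reverse=True)
    return source, target


# Item numbers of the two pairs named in this thread.
print(merge_direction("Q224295", "Q434878"))
print(merge_direction("Q430496", "Q219592"))
```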

Meta

We created a GND number with the DNB for you. The VIAF ID will usually follow, bot-generated, within the next weeks on Wikidata.--Martin Gollasch (talk) 16:41, 21 September 2023 (CEST)

As I understand, there are several Hungarian National Namespace person IDs. Your database must be of interest to them. Do you have contacts with the meta-librarians at the NSZL?--Martin Gollasch (talk) 17:58, 21 September 2023 (CEST)

Yes, I am in contact with them, and we are thinking about the possibility of cooperation. Laszlo Kiss

Well, we could install the proper identifier here too, analogous to Wikidata. You just have to say which one you want and Olaf will probably fix it within minutes.--Martin Gollasch (talk) 18:11, 21 September 2023 (CEST)