Cultural Heritage Export Bot — CHEBOT

Code: https://gitlab.com/avsdov/chebot

Function details: The Russian Wikivoyage hosts the largest database of cultural heritage monuments in Russia. It contains over 180 thousand objects, but only some of them are represented in Wikidata. Wikivoyage volunteers continually work on the completeness, accuracy and consistency of the monument descriptions. This bot consistently uploads basic information about the specified cultural heritage sites from Wikivoyage to Wikidata. The bot runs during the daily update of the Wikivoyage heritage database and processes only pages whose revision has changed. A new Wikidata entity is created for a heritage monument that has no wdid if both of the following conditions are met: the monument has coordinates, and the monument is not marked as dismissed or destroyed. A new Wikidata entity is also created for the head element of a heritage ensemble that has a Commons category specified, regardless of coordinates or status.
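
The creation rule above can be sketched as a small predicate. This is only an illustration, not the bot's actual code: the `Monument` record and its field names (`dismissed`, `destroyed`, `is_ensemble_head`) are hypothetical, and the sketch assumes that the ensemble-head exception also applies only to objects without a wdid.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Monument:
    """Hypothetical record for one object of the Wikivoyage heritage database."""
    wdid: Optional[str] = None        # existing Wikidata ID, if any
    lat: Optional[float] = None
    lon: Optional[float] = None
    dismissed: bool = False           # assumed flag name
    destroyed: bool = False           # assumed flag name
    is_ensemble_head: bool = False    # assumed flag name
    commonscat: Optional[str] = None  # Commons category, if specified


def should_create_entity(m: Monument) -> bool:
    """Return True if a new Wikidata entity should be created for m."""
    if m.wdid:
        return False  # already linked to a Wikidata entity
    # Ensemble heads with a Commons category qualify
    # regardless of coordinates or status.
    if m.is_ensemble_head and m.commonscat:
        return True
    has_coords = m.lat is not None and m.lon is not None
    return has_coords and not m.dismissed and not m.destroyed
```

Under these assumptions, a destroyed monument with coordinates is skipped, while a destroyed ensemble head with a Commons category still gets an entity.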

The following information is exported for a heritage monument from the Wikivoyage database to Wikidata:

  • Entity label in Russian (name field) — not updated after first export.
  • Entity label in English may be generated from Commons category name (if specified) upon creation — not updated after first export.
  • P17 (country) — the statement must have the value Q159 (Russia); if the statement is missing, it will be created with a start time qualifier of 18.03.2014 (for Crimean sites with ids 82xxx and 92xxx) or 25.12.1991 (for other sites).
  • P18 (image) — if P18 statement is missing, it will be created; otherwise skipped.
  • P1483 (KN ID) — the statement must have a single value matching knid in the Wikivoyage database; if the value differs or is missing, it will be created/updated.
  • P2817 (appears in the heritage monument list) — the statement must have a single value matching the Wikidata entity of the source Wikivoyage page; if the value differs or is missing, it will be created/updated.
  • P2186 (WLM ID) — the statement must have a single value matching knid in the Wikivoyage database (sites in Crimea may have two statements, one for the knid value and one for the uid value); if the value differs or is missing, it will be created/updated (the corresponding statements for Crimean sites will be given the qualifier P642 (of) with the value Q159/Q212).
  • P361 (part of) — for parts of heritage ensembles — the statement must have a value matching the Wikidata entity that describes the whole ensemble; if such a statement is missing, it will be added.
  • P5381 (EGROKN ID), P8316 (sobory ID), P9343 (temples ID) — the corresponding statement must have a single value matching the respective field (knid-new/sobory/temples) in the Wikivoyage database; if the value differs or is missing, it will be created/updated.
  • P31 (instance of) — the statement must have a value matching the typology of the heritage site: for a monument of urban planning and architecture — Q2319498 (landmark), for a historical monument — Q1081138 (historic site), for a piece of monumental art — Q4989906 (monument), for an archaeological monument — Q839954 (archaeological site), for a historical settlement — Q3920245 (historical city in Russia); if such a value is missing, it will be added. The bot removes any P31 statement with the value Q8346700, because that value is intended for use in P1435 statements.
  • P1435 (heritage designation) — the statement must have a value matching the protection status of the heritage site: Q105835774, Q23668083, Q105835744, Q105835766, Q105835782, Q121055800. If the protection status is not specified in the Wikivoyage database, the value Q8346700 is accepted. Only one value from this list may be present. If that value differs or is missing, it will be created/updated. Other P1435 values are not affected.
  • P131 (located in ATE) — the statement must have a single value matching the Wikidata entity of the municipality (munid in the Wikivoyage database); if the value differs or is missing, it will be created/updated. The statement is ignored (not affected) for historic settlements (when wdid matches munid). Statements with qualifiers are not affected either.
  • P2795 (directions) — if neither a P6375 nor a P669 statement is present, the P2795 statement must have a value in Russian matching the address in the Wikivoyage database; if it is missing or differs, it will be created/updated.
  • P625 (coordinates) — the statement must have a single value within 30 meters of the Wikivoyage coordinates; otherwise it will be created/updated. If the entity has several P625 statements, they are neither checked nor affected.
  • P571 (inception) — if the statement is missing, but the Wikivoyage database specifies a year field for this object and its value can be recognized as a 4-digit number, a new P571 statement will be created. (If a P571 statement already exists, it is not affected.)
  • P576 (abolished...) — if the statement is missing, but the Wikivoyage database marks the monument as destroyed, a new P576 statement will be created with the "unknown" value. (If a P576 statement already exists, it is not affected.)
  • P373 (Commons category) — if the statement is missing, but the Wikivoyage database specifies a commonscat field for this object, a new P373 statement will be created. (If a P373 statement already exists, it is not affected.)
  • commonswiki site link — if the link is missing, but the Wikivoyage database specifies a commonscat field for this monument and the specified Commons page has no linked entity, a new link will be created. (If a commonswiki link already exists, it is not affected.)
  • ruwiki site link — if the link is missing, but the Wikivoyage database specifies a wiki field for this object and the specified Russian Wikipedia page has no linked entity, a new link will be created. (If a ruwiki link already exists, it is not affected.)
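
A few of the rules in the list above (the P17 start date, the P31 typology mapping, the 30-meter check for P625 and the year check for P571) can be sketched as follows. This is a minimal illustration rather than the bot's actual implementation: all function names and the typology keys are hypothetical, the distance is computed with the haversine formula as one reasonable choice, and the year check assumes that the whole field must consist of exactly four digits.

```python
import math
import re
from datetime import date
from typing import List, Optional, Tuple

# Typology keys are hypothetical; the Q-ids are the ones listed above.
P31_BY_TYPE = {
    "architecture": "Q2319498",  # landmark
    "history": "Q1081138",       # historic site
    "monument": "Q4989906",      # monument
    "archeology": "Q839954",     # archaeological site
    "settlement": "Q3920245",    # historical city in Russia
}


def p17_start_time(knid: str) -> date:
    """Start time qualifier for P17, chosen by the knid prefix."""
    if knid.startswith(("82", "92")):  # Crimean sites
        return date(2014, 3, 18)
    return date(1991, 12, 25)


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters (haversine, mean Earth radius)."""
    r = 6371008.8
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def p625_needs_update(wd_coords: List[Tuple[float, float]],
                      wv_coord: Tuple[float, float],
                      tolerance_m: float = 30.0) -> bool:
    """Decide whether P625 should be created/updated.

    wd_coords holds the (lat, lon) pairs of the existing P625 statements.
    """
    if len(wd_coords) > 1:
        return False  # several statements: not checked, not affected
    if not wd_coords:
        return True   # missing: create
    return distance_m(*wd_coords[0], *wv_coord) >= tolerance_m


def p571_year(year_field: str) -> Optional[int]:
    """Return the year if the field is exactly a 4-digit number, else None."""
    m = re.fullmatch(r"\d{4}", year_field.strip())
    return int(m.group()) if m else None
```

For example, `p17_start_time` returns 18.03.2014 for a knid beginning with 82 or 92 and 25.12.1991 for any other knid, and `p625_needs_update` leaves an entity alone when it carries more than one P625 statement.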

This tool is hosted on Wikimedia Toolforge.

During the test period, only the following pages are updated: white list of pages.

Last log:

Blacklist: Specified knids and entities will be neither checked nor updated.