User:LennardHofmann/GSoC 2022/Report 5

Sample infobox: Google Summer of Code
annual program that offers open-source software projects to post-secondary student developers
Organizer: Google Open Source
Start time: 2005
Inception: 2005

Over the last three months, working with my mentor Mike Peel, I have been rewriting the Wikidata Infobox in Lua. The Infobox is shown on over 4 million category pages on Wikimedia Commons, a free and multilingual media repository, to inform readers about the topic of a category and help them browse the category system. A sample infobox, which displays information from Wikidata item Q1324301 and adapts to the language you have selected, can be seen on this page.

Now that Google Summer of Code 2022 is coming to an end, I would like to recap my work.

Where to find the code

Most of my time was spent removing code from Template:Wikidata Infobox/core, rewriting it in Lua, and adding it to Module:Wikidata Infobox/sandbox. Before the rewrite, that module was just a collection of helper functions; now it produces the entire infobox. My changes to WikidataIB, the Infobox's biggest dependency, are listed here.

Recap

I documented my journey in four earlier blog posts, Reports 1 through 4.

My work has not always been as exciting as the reports may suggest: I spent hours trying to make sense of complicated Wikitext expressions like this one:

{{#if:{{#property:P1950 | from={{{qid|}}}}} | {{#if:{{#property:P735 | from={{{qid|}}}}} | {{#invoke:WikidataIB |getValue |rank=best |P1950 |qid={{{qid|}}} | name=P1950 | |fwd={{{fwd|ALL}}} |osd={{{osd|no}}} |noicon=yes | linked=n | spf={{{spf|}}} | prefix="[""[Category:" |postfix=" (surname){{!}}{{#invoke:Wikidata Infobox|stripDiacrits|{{#invoke:WikidataIB |getValue |rank=best |P735 |qid={{{qid|}}} | name=P735 | |fwd={{{fwd|ALL}}} |osd={{{osd|no}}} |noicon=yes | linked=n | spf={{{spf|}}} | lang=en | sep=" "}}}}]]" | lang=en | sep=" "}} | {{#invoke:WikidataIB |getValue |rank=best |P1950 |qid={{{qid|}}} | name=P1950 | |fwd={{{fwd|ALL}}} |osd={{{osd|no}}} |noicon=yes | linked=n | spf={{{spf|}}} | prefix="[""[Category:" |postfix=" (surname)]]" | lang=en | sep=" "}} }} | }}
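To make the expression above easier to follow, here is a minimal Lua sketch of the logic as I read it: if the item has a second family name (P1950), emit a link to the matching "(surname)" category, and if a given name (P735) is also present, use it, stripped of diacritics, as the sort key. This is an illustration, not the code in Module:Wikidata Infobox. The names `getValues`, `stripDiacritics`, `surnameCategories`, and the sample item `Q_EXAMPLE` are hypothetical, and the Wikidata lookup is stubbed with a plain table so the snippet runs outside MediaWiki.

```lua
-- Hypothetical stand-in for the module's diacritics helper. It covers only a
-- handful of characters to keep the example short.
local DIACRITICS = {
  ["á"] = "a", ["é"] = "e", ["í"] = "i",
  ["ó"] = "o", ["ú"] = "u", ["ñ"] = "n",
}

local function stripDiacritics(text)
  for accented, plain in pairs(DIACRITICS) do
    -- gsub returns two values; only the replaced string is kept.
    text = text:gsub(accented, plain)
  end
  return text
end

-- Stubbed data source (assumption): in the real module the values would come
-- from Wikidata, here they come from a local table.
local sampleData = {
  Q_EXAMPLE = { P1950 = { "Márquez" }, P735 = { "José", "María" } },
}

local function getValues(qid, property)
  local item = sampleData[qid] or {}
  return item[property] or {}
end

-- Builds the category links that the nested #if expression produces.
local function surnameCategories(lookup, qid)
  local surnames = lookup(qid, "P1950")
  if #surnames == 0 then
    return "" -- outer #if: no second family name, no output
  end

  -- Inner #if: add a sort key only when a given name exists.
  local givenNames = lookup(qid, "P735")
  local sortKey = ""
  if #givenNames > 0 then
    sortKey = "|" .. stripDiacritics(table.concat(givenNames, " "))
  end

  local links = {}
  for _, surname in ipairs(surnames) do
    links[#links + 1] = "[[Category:" .. surname .. " (surname)" .. sortKey .. "]]"
  end
  return table.concat(links, " ")
end

print(surnameCategories(getValues, "Q_EXAMPLE"))
-- prints [[Category:Márquez (surname)|Jose Maria]]
```

Written this way, each property is looked up once and the two branches share the link-building code, whereas the Wikitext version repeats the whole `getValue` call in both branches.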

Recently, I added documentation on how to copy the Infobox to other wikis.

I have learned a lot about writing Scribunto modules, optimizing Lua code, and the edge cases of fetching data from Wikidata. Some of these edge cases are tracked at Wikidata infobox maintenance to help users improve Wikidata; see, for example, items with no claims.

This was my biggest coding project so far in terms of userbase and size (the Lua module has over 1800 carefully crafted lines), and I'm really happy with how it turned out.

Results

The main purpose of the rewrite was to improve the Infobox's performance: it used to take around 3.5 seconds to render a small category page like South Pole Telescope. Thanks to the rewrite, it now takes only half a second, and there are just a handful of category pages left that take longer than 3 seconds to load. Also, categories with very big Wikidata items like COVID-19 pandemic in Colombia can finally be rendered within MediaWiki's Lua memory limit.

What remains to be done

As Wikidata and Commons evolve, work on the Infobox will never be finished. Open tasks and feature requests can be found on the template's talk page. However, most of these tasks are either difficult to implement (such as date formatting) or need more discussion.

I will stick around and continue to fix bugs reported on the talk page.

Previous post: Report 4