<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://simia.net/index.php?action=history&amp;feed=atom&amp;title=Progress_in_lexicographic_data_in_Wikidata_2024</id>
	<title>Progress in lexicographic data in Wikidata 2024 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="http://simia.net/index.php?action=history&amp;feed=atom&amp;title=Progress_in_lexicographic_data_in_Wikidata_2024"/>
	<link rel="alternate" type="text/html" href="http://simia.net/index.php?title=Progress_in_lexicographic_data_in_Wikidata_2024&amp;action=history"/>
	<updated>2026-05-05T18:56:52Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.32.0</generator>
	<entry>
		<id>http://simia.net/index.php?title=Progress_in_lexicographic_data_in_Wikidata_2024&amp;diff=2660&amp;oldid=prev</id>
		<title>Denny at 08:00, 22 January 2025</title>
		<link rel="alternate" type="text/html" href="http://simia.net/index.php?title=Progress_in_lexicographic_data_in_Wikidata_2024&amp;diff=2660&amp;oldid=prev"/>
		<updated>2025-01-22T08:00:05Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 08:00, 22 January 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l15&quot; &gt;Line 15:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 15:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;What does the coverage mean? Given a text (usually Wikipedia in that language, but in some cases a corpus from the Leipzig Corpora Collection), how many of the occurrences in that text are already represented as forms in Wikidata's lexicographic data. Note that every percent more gets much more difficult than the previous one: an increase from 1% to 2% usually needs much much less work than from 91% to 92%.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;What does the coverage mean? Given a text (usually Wikipedia in that language, but in some cases a corpus from the Leipzig Corpora Collection), how many of the occurrences in that text are already represented as forms in Wikidata's lexicographic data. Note that every percent more gets much more difficult than the previous one: an increase from 1% to 2% usually needs much much less work than from 91% to 92%.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;See also [[Progress in lexicographic data in Wikidata 2023|last year's progress]]&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;See also [[Progress in lexicographic data in Wikidata 2023|last year's progress]]&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;{{tag|Simia}}&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;{{tag|Simia}}&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;noinclude&amp;gt;{{simiapost|english}}&amp;lt;/noinclude&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;noinclude&amp;gt;{{simiapost|english}}&amp;lt;/noinclude&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Denny</name></author>
		
	</entry>
	<entry>
		<id>http://simia.net/index.php?title=Progress_in_lexicographic_data_in_Wikidata_2024&amp;diff=2659&amp;oldid=prev</id>
		<title>Denny: Created page with &quot;{{pubdate|{{subst:CURRENTDAY}}|{{subst:CURRENTMONTHNAME}}|{{subst:CURRENTYEAR}}}}  Here are some highlights of the progress in lexicographic data in Wikidata in 2024  * [https...&quot;</title>
		<link rel="alternate" type="text/html" href="http://simia.net/index.php?title=Progress_in_lexicographic_data_in_Wikidata_2024&amp;diff=2659&amp;oldid=prev"/>
		<updated>2025-01-22T07:59:31Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;{{pubdate|{{subst:CURRENTDAY}}|{{subst:CURRENTMONTHNAME}}|{{subst:CURRENTYEAR}}}}  Here are some highlights of the progress in lexicographic data in Wikidata in 2024  * [https...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{pubdate|22|January|2025}}&lt;br /&gt;
&lt;br /&gt;
Here are some highlights of the progress in lexicographic data in Wikidata in 2024:&lt;br /&gt;
&lt;br /&gt;
* [https://www.wikidata.org/w/index.php?title=Wikidata%3ALexicographical_coverage%2Fha%2FStatistics&amp;amp;diff=2296112898&amp;amp;oldid=2028939807 Hausa]: jumped from 1.5% coverage right to 40%&lt;br /&gt;
* [https://www.wikidata.org/w/index.php?title=Wikidata%3ALexicographical_coverage%2Fda%2FStatistics&amp;amp;diff=2296112264&amp;amp;oldid=2038031859 Danish]: made another huge jump forward, increasing the number of forms from 170k to 570k, form coverage from 33% to 52%, and token coverage from 87% to 92%&lt;br /&gt;
* [https://www.wikidata.org/w/index.php?title=Wikidata%3ALexicographical_coverage%2Fit%2FStatistics&amp;amp;diff=2296112477&amp;amp;oldid=2028938353 Italian]: made another huge push, increasing the number of forms from 290k to 410k and the coverage from 83% to 93%&lt;br /&gt;
* [https://www.wikidata.org/w/index.php?title=Wikidata%3ALexicographical_coverage%2Fes%2FStatistics&amp;amp;diff=2296112316&amp;amp;oldid=2038032305 Spanish]: kept pushing forward, increasing the number of forms from 440k to 560k and the coverage from 91.3% to 91.8%&lt;br /&gt;
* [https://www.wikidata.org/w/index.php?title=Wikidata%3ALexicographical_coverage%2Fnn%2FStatistics&amp;amp;diff=2296113002&amp;amp;oldid=2038035697 Norwegian] (Nynorsk): increased the number of forms from 67k to 88k, and coverage from 80% to 82%&lt;br /&gt;
* [https://www.wikidata.org/w/index.php?title=Wikidata%3ALexicographical_coverage%2Fcs%2FStatistics&amp;amp;diff=2296112252&amp;amp;oldid=2038031815 Czech]: increased the coverage from 63% to 69% and the number of forms from 190k to 210k&lt;br /&gt;
* [https://www.wikidata.org/w/index.php?title=Wikidata%3ALexicographical_coverage%2Fta%2FStatistics&amp;amp;diff=2296113062&amp;amp;oldid=2038036123 Tamil]: almost doubled the number of forms from 3800 to 6600, increasing coverage from 8% to 11%&lt;br /&gt;
* [https://www.wikidata.org/w/index.php?title=Wikidata%3ALexicographical_coverage%2Fbr%2FStatistics&amp;amp;diff=2296112836&amp;amp;oldid=2038034728 Breton]: added 1000 new forms, increasing the coverage from 67% to 71%&lt;br /&gt;
* [https://www.wikidata.org/w/index.php?title=Wikidata%3ALexicographical_coverage%2Fhr%2FStatistics&amp;amp;diff=2296112426&amp;amp;oldid=2038032898 Croatian]: increased from 4k to 5.5k forms, improving coverage from 45% to 48%&lt;br /&gt;
&lt;br /&gt;
What does coverage mean? Given a text (usually Wikipedia in that language, but in some cases a corpus from the Leipzig Corpora Collection), it is the share of word occurrences in that text that are already represented as forms in Wikidata's lexicographic data. Note that each additional percentage point is much harder to gain than the previous one: an increase from 1% to 2% usually needs far less work than an increase from 91% to 92%.&lt;br /&gt;
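&lt;br /&gt;
As a rough illustration of the two numbers reported for Danish above, here is a minimal Python sketch. It assumes a naive whitespace tokenizer, and it assumes that form coverage means the share of distinct word forms in the text that are known, while token coverage counts every occurrence. The function name, the toy corpus, and the toy form list are hypothetical and are not taken from the actual Wikidata tooling.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter


def coverage(tokens, known_forms):
    """Return (token_coverage, form_coverage) as fractions between 0 and 1.

    token_coverage: share of all word occurrences whose form is known.
    form_coverage: share of distinct word forms that are known (assumed definition).
    """
    counts = Counter(tokens)
    total_tokens = sum(counts.values())
    if total_tokens == 0:
        return 0.0, 0.0
    covered_tokens = sum(n for form, n in counts.items() if form in known_forms)
    covered_forms = sum(1 for form in counts if form in known_forms)
    return covered_tokens / total_tokens, covered_forms / len(counts)


# Hypothetical toy corpus and form list, for illustration only.
corpus = "the cat sat on the mat and the dog sat too".split()
forms = {"the", "cat", "sat", "on"}
token_cov, form_cov = coverage(corpus, forms)
```
&lt;br /&gt;
In this toy example, token coverage (7 of 11 occurrences) is higher than form coverage (4 of 8 distinct forms), because frequent words account for many occurrences. The same effect may help explain why later percentage points are harder to gain: the forms still missing tend to be rarer, so each one adds fewer covered occurrences.&lt;br /&gt;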
&lt;br /&gt;
See also [[Progress in lexicographic data in Wikidata 2023|last year's progress]]&lt;br /&gt;
&lt;br /&gt;
{{tag|Simia}}&lt;br /&gt;
&amp;lt;noinclude&amp;gt;{{simiapost|english}}&amp;lt;/noinclude&amp;gt;&lt;/div&gt;</summary>
		<author><name>Denny</name></author>
		
	</entry>
</feed>