Main Page

From Simia

Latest revision as of 15:42, 19 March 2019


The new thing about the Semantic Web is not the semantics, it's the Web

In my 2025 ISWC keynote (publication of the recording is pending), I used a line which I considered a quote: “The new thing about the Semantic Web is not the semantics, it's the Web.”

Now, I thought the quote was by Chris Welty, and I thought it was in his 2007 ISWC keynote (what a nice recall!), but I didn't have time to actually check, so I could have been wrong.

On arrival in Nara I met Natasha Noy, and asked her if she remembered the quote. She did not, but said it didn't sound like something that Chris would say, and it sounded more like something Jim Hendler would say.

So in the talk itself, I had the quote, but said that I was unsure to whom to attribute it.

Later, Juan Sequeda and I were trying to figure out where the quote is from. Juan just asked Chris via messenger (he wasn't in Nara), but Chris said it's nothing he remembers saying. I think he also thought it sounds more like something Jim would say.

I asked Gemini, and Gemini thought it was Tim Berners-Lee. Gemini Deep Research, on the other hand, guessed Jim Hendler, with input from Dean Allemang and maybe Tim Berners-Lee. Juan asked a paid version of ChatGPT, and that actually found the quote (“In the Semantic Web, it is not the Semantic which is new, it is the Web which is new”) in the Semantic Web FAQ by the W3C, attributed to Chris Welty. Finally, I asked ChatGPT (free version), and it attributed the quote to -- well, Denny Vrandečić. Myself. Very funny.

So, I finally watched the 2007 keynote, and indeed: he gets very close to the quote by combining two things he said: “The emphasis is on Web, not on Semantic.” (24:20-25:00) and “It wasn't the KR in Semantic Web that was novel, it was the Web that was novel.” (39:40-40:00).

Now I can confidently attribute the quote to Chris Welty.

Update: there has been a twist to the story! Kingsley Idehen found the quote in slides by Tim Berners-Lee wrapping up ISWC 2005, Slide 3, two years before Chris' keynote. But -- attributed to Chris Welty! So the conclusion stands, just with an even older provenance.

Simia

Kicking off the naming contest for Abstract Wikipedia

Before we started Abstract Wikipedia, it was always clear that "Abstract Wikipedia" was just a working name for the project, not the name the website would eventually get. It was intentionally chosen to be minimally descriptive -- it is abstract in the sense that it captures encyclopaedic content abstracted from a concrete natural language -- but it is also not a very good name, because it is confusing: people confuse it with a Wikipedia of abstracts (short summaries of articles), or think of math or art.

Thus we are kicking off a naming contest and are inviting all Wikimedians to join! More than a dozen names have already been suggested, and you can suggest more and vote for your favorites.

Let's find the best possible name!


Keynote at ISWC 2025

It is an honor to be invited as a keynote speaker to the ISWC - International Semantic Web Conference this year in Nara, Japan.

This is particularly exciting for me because, during my PhD research, ISWC was my "home conference" - the prime conference in my research area, where the research community I felt affiliated with was meeting. It will be 20 years since I attended my first ISWC, in Galway, and every year I attended, I enjoyed both the academic program and the opportunity to meet friends. I am very much looking forward to meeting friends again, some of whom I haven't seen for many years.

The title of the talk will be:

"Wikipedia and the Semantic Web - Celebrating 20 years of co-development, and the future"

If there's anything I should mention, let me know. The full abstract and how to register for the conference can be found here: ISWC 2025 Keynotes abstracts and speakers

I would be very excited to see you there -- either to see you again, or to meet you for the first time!


Powerball jackpot at $1.7bn

The Powerball jackpot is around $1.7 billion. Given that a ticket costs $2 and the chance to win the jackpot is about 1 in 300 million, it seems almost rational to buy a ticket. I mean, I'd buy a ticket or two if I still lived in the US.

Or, as a friend suggested, take $600 million and buy 300 million tickets and win for sure.

Would that work?

Yes, it would work in the sense that you would have a guaranteed jackpot win. But does it pay off?

Let's assume you don't have $600 million in cash lying around and instead get a loan (ha, sure!) for that amount (if you're reading this and you *do* have $600 million in cash lying around, the analysis would be quite different, interestingly, but in that case send me a DM and I'll sell you the analysis).

Now, there are a few considerations:

1. You will be paying taxes on the prize money. At least 37% in federal taxes, and then between 0% and 11% in state taxes.

2. The $1.7 billion is the sum that you get over 30 annual payments, where each payment is 5% more than the previous one. That starts at about $13-$16 million after taxes, depending on your state. That will likely not be enough to cover the interest on the $600 million, but that depends on your loan terms.

3. You can opt for a one-time payment instead of the 30 years of annual payments. But in that case it's "only" $770 million you get. Or between $400-$500 million after taxes.

4. The biggest risk is that someone else might win too. In that case you split evenly with every other winner. That means a single other winner reduces your payout by 50%. If there are more, you get even less.

5. On the other hand, you will actually get more than just the jackpot. Because you bought 300 million tickets, several million of those tickets will win something. Altogether, that's roughly another $90 million. Again, before taxes.

Seems like a close call so far, but...

6. Here comes the kicker -- you can actually deduct the cost of the lottery tickets against your lottery winnings, if you itemize them. So better keep the receipts!

That would reduce the taxes considerably, if you take the one-time payment (with the annual payments, it probably won't last for 30 years). So the total calculation for the one-time payment would be:

Total win: $770 million + $90 million = $860 million. $600 million of that is tax-free (offset by the ticket deduction); on the other $260 million you'll pay 37-48% tax. You'll still end up with more than $130 million in profit, with which you'll easily be able to pay off the interest on the short-term loan.

But because of consideration no. 4, and given how many other players such a huge jackpot attracts, you'll likely have to split. And if that happens, your losses will be catastrophic:

Assuming a single other winner, your total prize money drops to $385 million + $90 million (that part is fixed) = $475 million. Fortunately, it's all tax-free (because, well, you didn't actually make any money); unfortunately, you're now at least $125 million worse off, plus interest. If more people win, it gets worse.
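The considerations above can be put into a small back-of-the-envelope calculation. This is a sketch in millions of dollars, using the post's rounded figures; the flat tax rate, the full deductibility of the ticket cost, and ignoring loan interest are the post's own simplifying assumptions, and the function names are mine:

```python
def first_annuity_payment(total=1700.0, years=30, growth=1.05):
    """Gross first payment of the 30-year annuity: payments grow 5% a
    year and sum to the advertised jackpot (in millions of dollars)."""
    # Geometric series: total = first * (growth**years - 1) / (growth - 1)
    return total * (growth - 1) / (growth ** years - 1)

def scheme_profit(jackpot_cash, minor_prizes, tickets_cost, winners, tax_rate):
    """Net result of the buy-every-ticket scheme, before loan interest
    (in millions of dollars). Tickets are deductible when itemizing."""
    gross = jackpot_cash / winners + minor_prizes  # only the jackpot is split
    taxable = max(0.0, gross - tickets_cost)       # deduct the ticket cost
    return gross - taxable * tax_rate - tickets_cost

first = first_annuity_payment()  # ~$25.6M gross, i.e. $13-16M after 37-48% tax
solo = scheme_profit(770, 90, 600, winners=1, tax_rate=0.48)   # ~+$135M even at 48%
split = scheme_profit(770, 90, 600, winners=2, tax_rate=0.48)  # -$125M with one co-winner
```

Even at the highest combined tax rate, the solo win clears more than $130 million; a single co-winner flips it to a $125 million loss, matching the numbers above.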

My recommendation: if you're into this kind of thing, buy one ticket (the first ticket increases your chances by a lot; the second merely doubles them), dream about what you'd do with the money for a day or two, but don't expect to win. Chances are you won't (there's a reason it's called a tax on stupidity).

And definitely don't do the "buy every single ticket" scheme with loaned money. That will end badly.

P.S. (a few days later): the Powerball was won by two tickets, and thus had to be split.


Tech companies and the size of the market

If you think there might be an AI bubble but aren't worried about it: even if it pops, how bad can it be?

Here's one number: seven tech companies (NVIDIA, Microsoft, Google, Apple, Meta, Tesla and Amazon) are together worth about $20 trillion, representing a third of the value of the total US stock market.

For context: total subprime mortgages in 2008 were $1.3 trillion, and total US mortgage debt was $10 trillion.

Do you have a 401(k), or have you invested in index funds? Your money is very likely significantly tied to the AI market. If that's what you want, that's fine. If you're worried it might be a bubble...


Additions with unique digits: a tale of puzzling and AI

The other day, Markus Krötzsch and I were catching up when one of his kids came in. She told us about a puzzle her math teacher gave her:

How many integer sums are there where the equation uses each digit at most once?

The simplest example is 1+2=3, but 12+47=59 also works. On the other hand, 1+9=10 isn’t a solution because the digit ‘1’ appears twice (in the first number and the sum). Markus’s wife Anja was the first of us to find a solution using three-digit numbers.

I recommend trying to solve it yourself at this point, and then come back to this post. It is much more fun like that.

Thinking for a moment, we realized there seemed to be quite a few solutions. But how many? As computer scientists, our immediate thought was to write a program. First, though, we estimated the complexity of a brute force approach: if we tried all possible permutations of the digits 0 to 9, that’s 10! permutations, or about 3.6 million. A computer should work through that pretty quickly.

I started writing code, and did so even less efficiently than that. I included ‘+’ and ‘=’ along with the 10 digits, then iterated through all permutations (and shorter versions) to check for solutions. This ballooned the candidate pool to an upper bound of 12!×10, or about 5 billion. I knew there was plenty of room for improvement – the code wasn’t smart – but it would find all the results. I started running it. At this point, I was pretty sure this was the kind of problem that can be solved faster with raw computational power than by taking the time to come up with 'smart' code and better heuristics. My full code is here.
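The two candidate-pool estimates above are easy to double-check; this is just the arithmetic from the post, not the search code itself:

```python
from math import factorial

# Brute force over all orderings of the digits 0-9:
digit_permutations = factorial(10)   # 3,628,800 -- "about 3.6 million"

# My version added '+' and '=' to the pool, permuted all 12 symbols,
# and also tried shorter versions, roughly a further factor of 10:
my_upper_bound = factorial(12) * 10  # 4,790,016,000 -- "about 5 billion"
```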

As my code ran, Markus took it as a fun challenge and set out to create a smarter solution — which, admittedly, shouldn’t be too hard. Instead of iterating through all possible (and many impossible) equation strings, he only permuted the numbers, applied a maximum length, and checked those. Markus’s code was impressively fast – just 4 seconds! It seemed like we had found the 'right' way to solve it, using better heuristics. Or had we?

Soon after, my code finished, taking 20 minutes. We both had the same answer! (You can find the answer in the linked material, but I am keeping it out here in case you want to try it out yourself). Having two different approaches with the same result gave us high confidence in the answer.

Now, that’s where it could have ended. But the kids were teasing us: it took us forever to write the code (less than half an hour, but still), why didn’t we just ask ChatGPT for the answer?

We were pretty convinced there was no chance ChatGPT would be able to answer that. So we asked. Markus formulated the prompt (in German, at my prodding, so it would be more fun for his kid to watch), and ChatGPT started thinking.

I have never seen it thinking for so long. Almost seven minutes.

We used that time to laugh and joke about it. No way it would figure it out! We were looking at the steps it was describing. They didn’t seem too bad. What is a cryptarithm? Never heard of that. We were getting ready to amuse ourselves by watching and dissecting ChatGPT’s answer.

Boy, we were wrong!

It returned the same number we had found. The answer was great! Not only that, we looked into the thinking process and found the code that ChatGPT wrote. That code ran in fractions of a second. It wasn’t just faster; the way it built the equations, digit by digit, felt like a fundamentally different strategy than our permutations. Here is a link to the full conversation (local archive). Where my code ran in 20 minutes and Markus’s in 4 seconds, ChatGPT’s code ran in 60 milliseconds.

We were completely baffled. I searched the Web for this riddle and solutions, hoping to find indications it had been solved before, that it was part of the corpus. It might be the case, but we didn’t find it anywhere. ChatGPT came up with more efficient code, and was faster to write it. We were deeply impressed, and were left with two questions, without immediate answers:

  1. Given that this is a language model, how did it actually manage to reach this solution? Why did this work?
  2. Given that ChatGPT can do this, what does this mean? What are the consequences of having such an ability available? How does this change things?

I kicked off my local Ollama with OpenAI’s recent open-weight model (120 billion parameters), using Markus’s prompt. I let it run in the background while Markus and I kept chatting. It took a long time – tens of minutes? I didn’t keep track – and we were again confident it wouldn’t be able to answer the question. This was mostly because my local Ollama doesn’t have access to a Python interpreter. And indeed, it answered that there were 45 solutions, far off from the 1991 solutions we were looking for.

We were a bit relieved. It was getting late, so we called it a night. Seeing Ollama struggle initially, giving me numbers far off the mark, actually reinforced my earlier skepticism. It felt like my intuition about what these models couldn't do was being confirmed – at least for a moment. I asked the system to list all 45 solutions, and it did. I noticed that 1+2=3 was missing. I asked if that would be a correct answer and if it wanted to reconsider its number of solutions, then left it to run while I went to sleep.

The next day I checked the answer. The model had increased its answer to 124 solutions and wrote new code. It told me I should run that Python code myself, as a next step. So, again, the wrong answer.

And then I did run the code. From both the first answer and the second, improved answer. If you actually run the code, both of them produce the correct answer! The first version took about 20 seconds. The second version then took almost half an hour to run, but I think that’s only because at this point the LLM was becoming a bit extra cautious; it even checked for five-digit numbers, although four-digit numbers were sufficient. I took the LLM’s last code, made some minor clean-ups, and reduced it to four digits again, and the code gives the correct answer in 20 seconds again. My complete conversation with gpt-oss is available here.

Not only that: I would even claim that this cleaned-up code is the easiest to read and understand of all the solutions we discussed so far. It iterates over all possible numbers up to four digits, making sure the second number is always bigger than the first, computes their sum, checks the equation for digit uniqueness (quite elegantly using a set!), and counts the solutions. The code is short and, I’d say, elegant.
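The approach described above can be sketched like this. This is my reconstruction from the description, not the actual code from the conversation; the four-digit bound and the set-based uniqueness check come straight from the text, while restricting the first addend to three digits is an extra observation (a four-digit first addend would force at least twelve digits in total, but only ten distinct digits exist):

```python
def count_unique_digit_sums():
    """Count sums a + b = c (with a < b) in which every digit of the
    equation appears at most once."""
    count = 0
    # A four-digit first addend cannot work: a, b and their sum would
    # together need at least twelve digits, but only ten exist.
    for a in range(1, 1000):
        for b in range(a + 1, 10000):  # four digits suffice for the rest
            digits = f"{a}{b}{a + b}"
            # A set keeps one copy of each digit, so comparing lengths
            # detects any repeated digit in the whole equation:
            if len(digits) == len(set(digits)):
                count += 1
    return count
```

Under this reading of the puzzle (unordered pairs, no leading zeros), the function reproduces the 1991 solutions mentioned above, with 1+2=3 and 12+47=59 among them.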

Markus has also written a second version, which beats ChatGPT’s in efficiency, running in only 20 milliseconds, probably so fast that the Unix time command doesn't measure it reliably.

I am impressed by the answers by the models (and by Markus as well). I think this is a great example of my biggest problem with all of these models, something I’ve been saying for years: I cannot explain to someone else what these models are good at, and what they are not good at.

But I thought I had a good intuition myself. That I roughly knew what kind of questions they would answer well, and which ones I shouldn’t trust them with. What my experience from this experiment showed me is that I do not. I don’t know what the capabilities of these models are.

Given the recent news around the math olympiad, should I have expected such a result? Was this just a toss-up? Do these systems now reliably solve problems like these? Would we have trusted ChatGPT’s result, or would we have been skeptical, if Markus and I hadn't calculated the result for the riddle ourselves?

I am tempted to rationalize it away, to think that the problem was simpler than we made it, that my complex code involving permutations was a red herring, and that’s why the systems could solve it. Maybe that’s just me dealing with the AI effect, where we downplay AI achievements once they happen.

Thanks for reading so far. And big thanks to Markus’s kids for bringing up this riddle, and their math teacher for asking it.



Wikipedia is an encyclopedia

Just happy to post that Wikifunctions can now create some simple sentences in multiple languages. More information is in the Abstract Wikipedia newsletter.


Oh! Dalmatien

oh! Dalmatien by Doris Akrap, a so-called insider guide, was published a few weeks ago by Folio-Verlag. It contains three dozen short articles, each about three pages long, with stories from and about Dalmatia: about Dalmatian food and drink, the Dalmatian way of life, the local language, the winds of the region and their influence on the mood, and interesting anecdotes about culture and history.

The writing is entertaining and brisk throughout; the stories are not connected and can be read in any order or skipped, and they are short enough to read in short breaks -- the perfect travel reading for Dalmatia.

I particularly liked how the author weaves in her experiences as a child of immigrants, whose echo I could feel in my own family history. How much the house in Dalmatia meant to her father. How attached she herself was to it, and how the house accompanied her through her life, even beyond her father's death. The dramatic family history in the Second World War. But none of this overshadows the lightness of the book, and it remains an entertaining yet instructive summer read that offers a glimpse into this beautiful region.

Even the article about Brač unexpectedly opened up something completely unknown to me: the band Valentino Bošković and their quite absurd music and mythology.

The book was worth it for me, and I can only recommend it!


Fresh bread

31 May 2025

"A loaf of bread please"

"I'm sorry, I can't cut it because it's still hot."

"GIVE. IT. TO. ME."


News about Abstract Wikipedia

Some material about Abstract Wikipedia from the last few days:


