Why Johnny LLM can’t read web page source code

First of a series of posts by John Morgan and Philip Greenspun on using AI models to perform HTML and CSS design.

The plan: Compare ChatGPT, Gemini, Claude, and Grok to see which LLMs do the best job on CSS and, if necessary, HTML redesign of existing web pages.

All of these models failed at the simplest imaginable HTML/CSS question (the URLs were not hyperlinked in the queries, but are included here for convenience):

Do all four of these HTML pages reference the same .css file?

  • https://philip.greenspun.com/samantha/
  • https://philip.greenspun.com/samantha/samantha-X
  • https://philip.greenspun.com/photography/
  • https://philip.greenspun.com/photography/learn/interior

All four pages actually do reference the same .css file, /margins-and-ads.css, and one page also references /philg.css.

Actual <head> content from two of the pages:

<head>
<meta name="keywords" content="interior photography">
<LINK REL=STYLESHEET HREF="/margins-and-ads.css" TYPE="text/css">

<title>How to Photograph Architecture (Interior)
</title>
</head>
<head>
<link rel=stylesheet href="/margins-and-ads.css" type="text/css">
<LINK REL=STYLESHEET HREF="/philg.css" TYPE="text/css">
<title>Chapter X:  Overcharged in Katmai</title>
<meta name=processor content=adp>
</head>
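Once the raw HTML is in hand, the check the models couldn’t perform takes only a few lines of code. Here is a minimal sketch (the function name and regular expressions are ours, not output from any of the models) that pulls stylesheet references out of markup case-insensitively, applied to link tags like the ones above:

```python
import re

def stylesheet_hrefs(html: str) -> list[str]:
    """Return the href of every <link rel=stylesheet> tag, matching
    case-insensitively and with or without quotes around attribute values."""
    hrefs = []
    for tag in re.findall(r"<link\b[^>]*>", html, flags=re.IGNORECASE):
        if re.search(r"rel\s*=\s*[\"']?stylesheet", tag, re.IGNORECASE):
            m = re.search(r"href\s*=\s*[\"']?([^\s\"'>]+)", tag, re.IGNORECASE)
            if m:
                hrefs.append(m.group(1))
    return hrefs

interior = '<LINK REL=STYLESHEET HREF="/margins-and-ads.css" TYPE="text/css">'
chapter = ('<link rel=stylesheet href="/margins-and-ads.css" type="text/css">\n'
           '<LINK REL=STYLESHEET HREF="/philg.css" TYPE="text/css">')
print(stylesheet_hrefs(interior))  # ['/margins-and-ads.css']
print(stylesheet_hrefs(chapter))   # ['/margins-and-ads.css', '/philg.css']
```

Fetch each of the four URLs, run the extracted source through a function like this, and comparing the resulting lists answers the original question directly.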

from ChatGPT

I checked the content of the four pages you listed, but from the snippets returned there is no visible <link rel="stylesheet" …> line in the parts shown — that’s because the portions we could fetch start at the body text and don’t include the <head> section where CSS is normally referenced. … If you want precise verification, I can fetch and show the full <head> source (including the exact <link …> line) for each of the four URLs — just let me know.

[after “Yes please fetch the full HEAD source for each one and verify.”]

I tried to fetch the full HTML <head> source of each of the four pages you listed — specifically to look for the <link rel="stylesheet" …> tag that shows which CSS file (if any) each page references. However, the tool returns only the rendered body content and not the complete <head> section, so I couldn’t directly inspect the <link> tags for CSS in the fetched results.

from Claude

I should note that the web_fetch tool extracts text content and may not always show the complete HTML <head> section where CSS references typically appear. The pages I did fetch appear to be very minimalist in design, consistent with early-to-mid 1990s web design when these pages were originally created.

Without being able to see the complete HTML source including the <head> sections of all four pages, I cannot definitively confirm whether they all reference the same CSS file.

[prompted further]

Now I have all four pages. Let me examine them for CSS file references. Looking at the extracted HTML content from all four pages, I notice that none of them contain explicit <link> tags to external CSS files in the content that was extracted.

from Gemini

[Us: Google’s smartest child has trouble counting and trouble understanding HTML, but it is great at presentation. It says that three pages reference no external CSS and then shows a table featuring two files that reference CSS. It is dead wrong about the interior page, which references margins-and-ads.css. It either cross-contaminated its own thinking by looking at the samantha-X file, which does reference philg.css or found a reference to the file in the BODY of https://philip.greenspun.com/panda/html.]

No, they do not all reference the same CSS file. In fact, three of the pages appear to use no external CSS at all, while one of them uses a specific stylesheet for that section of the site.

Based on the source code for each page:

  1. https://philip.greenspun.com/samantha/
    • CSS: None.
    • Details: This page uses old-school HTML attributes (like bgcolor=white and text=black in the <body> tag) for styling rather than an external CSS file.
  2. https://philip.greenspun.com/samantha/samantha-X
    • CSS: None.
    • Details: Like the index page for this book, individual chapters are formatted using plain HTML and inline attributes.
  3. https://philip.greenspun.com/photography/
    • CSS: References /philg.css
    • Details: This page contains a <link rel="stylesheet" href="/philg.css" type="text/css"> tag in the <head> .
  4. https://philip.greenspun.com/photography/learn/interior
    • CSS: References /philg.css
    • Details: As part of the photography tutorial section, it shares the same stylesheet as the photography index.

Limitation Rationale

Why are these LLMs so hobbled? Apparently the limitation is intentional: the LLM is supposed to see only what a user would see. This keeps dynamic sites from being misread and may also reduce some security risks.

When prompted, ChatGPT replied that this limitation is due to its web content fetching tool not returning the full payload and instead querying “search indexes, reader-mode extracts, preprocessed HTML snapshots, [and] content extraction pipelines” that are “optimized for reading and summarizing, not inspection.” It explained that “in those pipelines, the <head> is usually stripped entirely or reduced to a title and a few meta fields [because] CSS links, JS, analytics, preload hints, etc. are irrelevant to text extraction [and] keeping them increases size and complexity with no benefit for search/reading.” If true, this suggests that the tool was not built with full HTML analysis in mind but rather optimized for the admittedly more common text extraction use case.
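A toy illustration of the pipeline behavior ChatGPT described (our sketch, assuming the extractor does something like text-only parsing): Python’s HTMLParser delivers character data and discards tags, and since a <link> element has no text content, the stylesheet reference vanishes entirely:

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Toy reader-mode extractor: keep character data, drop every tag."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

page = """<head>
<LINK REL=STYLESHEET HREF="/margins-and-ads.css" TYPE="text/css">
<title>How to Photograph Architecture (Interior)</title>
</head>
<body>Interior photography rewards patience.</body>"""

parser = TextOnly()
parser.feed(page)
print(parser.chunks)
# ['How to Photograph Architecture (Interior)', 'Interior photography rewards patience.']
```

The title and body text survive; the CSS reference leaves no trace, which is consistent with what all four models reported.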

Next Steps

Since our future AI overlords won’t download a web page from the Internet and click “view source”, the next step was to see what these LLMs could do if we downloaded HTML and CSS files to our desktop computers and then uploaded those files to the LLMs along with our questions.

Full post, including comments

Gemini and Antigravity fixed the Bill Gates Personal Wealth Clock

Bill Gates has been in the news lately.

“Melinda French Gates says latest allegations about Bill Gates’ antics with Epstein dredge up ‘very painful’ memories” (New York Post):

Melinda French Gates said that new details of ex-husband Bill Gates’ alleged antics with Jeffrey Epstein dredge up “very painful times” from their 27-year marriage — and have left her “so happy to be away from all the muck that was there.” … “It’s personally hard whenever those details come up, right? Because it brings back memories of some very, very painful times in my marriage,” she told NPR’s “Wild Card” podcast on Tuesday.

“I’m able to take my own sadness and look at those young girls and say, ‘My God, how did that happen to those girls?’” she said.

There’s a video with Melinda Gates where she smiles as she talks about the “sadness”. Keep in mind that, via a personal relationship with the boss, she made more money than the entire team of software engineers who built Microsoft Windows. Perhaps this has occasioned some “sadness” among those who worked 100 hours per week?

(Also, she implies that Bill Gates was having sex with “young girls” (a “young girl” would be 10? 12?). Is there any evidence of that in the Epstein files? Epstein pleaded guilty to partying with paid women as young as 16 back in 2008, but is there anything definitive in these files or elsewhere to suggest that, post-2008, “young girls” were having sex in exchange for cash or other inducements with Bill Gates or any other Epstein friend? A lot of Americans seem to be energetic when it comes to condemning Jeffrey Epstein and his circle. They say that they’re passionate about “the victims”, but Epstein died in 2019 and any “victims” are adults today. There are teenage prostitutes working right now in various states and countries. The folks who energetically condemn Jeffrey Epstein don’t try to do anything about current teenage prostitutes. If they live in Maskachusetts, for example, they don’t lobby to raise the age of consent from 16 to 18 or 21.)

Thirty-one years ago, as an envious impoverished graduate student, I developed the Bill Gates Personal Wealth Clock:

The clock had broken because the U.S. Census Bureau put in additional barriers to scraping their popclock and the sites for getting stock quotes kept changing.

I fed Antigravity on a copy of the entire tree behind my web site and, incredibly, it/Gemini was able to answer questions about AOLserver configuration, Tcl API code, relational database structure, etc. It suggested fixes to the software that actually worked, e.g.,

# Fetch the current U.S. population from the Census popclock endpoint,
# presenting browser-style headers so that the request isn't rejected:
set population_html [exec curl -s -L \
-H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" \
-H "Accept: application/json, text/javascript, */*; q=0.01" \
-H "Accept-Language: en-US,en;q=0.9" \
-H "Referer: https://www.census.gov/popclock/" \
-H "X-Requested-With: XMLHttpRequest" \
"https://www.census.gov/popclock/data/population.php/us"]

for scraping the Census web site.

I wonder if the page is conceptually broken. For a couple of decades, the page seemed to track reasonably well with (1) reports of Bill Gates’s personal stash, and (2) reports of total assets in the Gates Foundation (I considered them both to be forms of wealth for him since he controlled the foundation).

Right now, though, the grand total of $960 billion seems to be off the reservation (and we don’t need Elizabeth Warren to tell us how bad that is). I’m wondering if the explanation is Microsoft issuing shares like crazy to employees, thus diluting Bill Gates’s ownership percentage. The Google says that Gates is worth about $100 billion personally and the Gates Foundation’s total assets are about the same. That would put his total post-divorce wealth at $200 billion, not $500 billion. The Gates Foundation has supposedly paid out (“squandered”?) roughly $83 billion (mostly money extracted from American computer users and handed over to Africans without the U.S. Treasury ever collecting a dime of capital gains tax). That still gets us to only about $300 billion. How did the discrepancy arise? Could it be that Gates was diversifying and paying capital gains taxes over the years, thus getting the mediocre returns of the S&P 500 instead of the Mag 7 returns of Microsoft? Or did he do what all of our other noble billionaires do and borrow against his appreciated stock to fund lifestyle? Google AI says that “he has borrowed hundreds of millions against his assets … [and] Despite using tax-efficient strategies, Gates has publicly advocated for higher capital gains taxes for the wealthy.”
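The back-of-the-envelope arithmetic above, in round numbers:

```python
# Round numbers from the paragraph above, in billions of dollars
personal = 100      # Gates's personal net worth, per Google
foundation = 100    # Gates Foundation total assets, roughly the same
paid_out = 83       # cumulative foundation payouts

accounted_for = personal + foundation + paid_out
clock_total = 960   # what the repaired wealth clock shows

print(accounted_for)                 # 283, i.e., "only about $300 billion"
print(clock_total - accounted_for)   # 677 billion still unexplained
```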

Related (simple non-RDBMS web apps from the same era):

Full post, including comments

Professor Dr. ChatGPT, Ph.D. in Dance Theory

In order to be cruel to MIT students, tortured daily with math and physics problems that make them feel stupid, the flight planning lecture of our FAA Ground School class covers a hypothetical trip from MIT to Bennington College in Vermont. Every year I try to pick what seems like the easiest class at Bennington and remind them that, had they chosen a school more wisely, they could be getting credit for it. This year I picked “Ghostly Body- The Art of Absence”:

I stressed that they’d need to “develop new understanding of the presence of absence” (the class requires no previous dance experience so it is unclear what the students’ old understanding of the presence of absence might be).

Before I highlighted this as an example of the presence of absence of academic and intellectual rigor, I questioned ChatGPT 5.2: “Please explain to me your understanding of the presence of absence as it relates to dance.” What came back was, I think, worthy of a Ph.D. in Dance Theory:

In dance, the “presence of absence” names a paradox: what is not physically there can be as perceptible, meaningful, and active as what is. Absence is not emptiness; it is a charged field that shapes attention, sensation, and interpretation.

Here are the main ways this idea operates in dance.


1. The Body as a Trace, Not Just a Fact

Dance exists only while it is happening. Once a movement ends, it survives as afterimage, memory, or expectation.

  • A lifted arm leaves a trace in the viewer’s perception even after it lowers.
  • A dancer who exits the stage continues to “occupy” space through memory.
  • Stillness is not neutral—it holds the residue of previous motion and the promise of what might come.

In this sense, dance is always haunted by what has just happened and what is about to happen.


2. Negative Space as Active Partner

Absence appears physically as space:

  • The gap between dancers
  • Empty areas of the stage
  • The air carved by movement

Choreography often makes absence legible by framing it:

  • A duet defined by distance rather than touch
  • A solo surrounded by emptiness
  • Movement that gestures toward, but never reaches, another body

The audience reads these voids as intentional, emotionally loaded presences.


3. Stillness and Non-Movement

Not moving is one of the clearest embodiments of the presence of absence.

Stillness can signal:

  • Withheld action
  • Resistance
  • Grief or loss
  • Listening or readiness

Because dance trains us to expect motion, the absence of motion becomes loud. The body remains present, but movement is absent—creating tension and meaning.


4. The Absent Body

Dance frequently invokes bodies that are not there:

  • A partner imagined but never visible
  • A former dancer referenced by repetition of their choreography
  • Cultural or ancestral bodies remembered through gesture

This is especially powerful in works dealing with loss, exile, trauma, or memory. The dancer becomes a conduit for someone else’s absence.


5. Temporality and Ephemerality

Unlike visual art objects, dance disappears.

  • You cannot fully “return” to a dance—only to documentation or memory.
  • Each performance replaces the last.
  • The dance’s absence after the performance is part of its meaning.

The presence of absence here is existential: dance is disappearance made visible.


6. Audience Perception: Filling the Void

Absence activates the viewer.

When something is withheld:

  • The audience projects meaning
  • Memory and imagination complete the work
  • The viewer becomes a participant

What is absent invites interpretation more strongly than what is fully shown.


7. Philosophical Undercurrent

Philosophically, this idea resonates with phenomenology:

  • We perceive bodies not just as objects, but as intentional beings
  • Absence is felt through expectation and anticipation
  • Meaning emerges in the gap between perception and completion

In dance, this gap is continuously exposed.


In Summary

The presence of absence in dance means that:

  • What is missing can be felt
  • What has disappeared still acts
  • What does not move still speaks
  • What is not shown still shapes meaning

Dance is not only about bodies in motion, but about what motion leaves behind.

If you’d like, I can also explore this through:

  • A specific choreographer (e.g., Cunningham, Pina Bausch, Forsythe)
  • Somatic practice or improvisation
  • Performance philosophy
  • Or how absence operates differently in ballet vs. contemporary dance

Full post, including comments

You’re never too old for Ozempic

While in Cambridge to teach an MIT Aero/Astro class, I got together with some old friends who are, in some cases, also fairly old. One was born in 1940 and, therefore, is 85 years old. He’s on Mounjaro (Ozempic, basically, I think). “I’ve lost 40 pounds and went off two blood pressure meds. My A1c is back to normal and my sleep apnea went away.”

What else isn’t he too old for? Coding with AI. He likes to shoot air rifles and, despite minimal tech background, used vibe coding to generate 20,000 lines of code to process data from a wind/weather sensor. “I uploaded a photo of the circuit board and the AI said ‘I understand what this board is doing’ and told me where to solder.” (not sure why he had to modify the hardware, but I guess he did)

He’s reasonably rich and wants to move to Florida, at least on the 183 days/year plan, but his wife refuses to leave Maskachusetts.

Full post, including comments

ChatGPT vs. Grok for programming Microsoft Excel

At the end of last year, I wanted to clean up some mailing addresses in an Excel spreadsheet for use with our family New Year’s cards. I resent having to spend time learning any computer language other than Lisp and SQL unless that language is more powerful than Lisp and/or SQL. Thus, I’ve never become a proficient Excel programmer. Could an LLM come to my rescue?

Prompt to ChatGPT 5.2:

write a Microsoft Excel 365 formula that will take a US address and remove a trailing “US” or “US” preceded by a newline and also turn any 9-digit ZIP code into a 5-digit zip code by removing “-NNNN” where N is a digit.

Answer:

=REGEXREPLACE(
REGEXREPLACE(A1,"(\d{5})-\d{4}","$1"),
"(\r?\n)?US$",""
)

Grok’s answer:

=LET(
cleaned, TRIM(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,CHAR(10)&"US",""),CHAR(10)," ")," US","")),
lastLine, TRIM(RIGHT(cleaned, LEN(cleaned)-FIND("@",SUBSTITUTE(cleaned," ","@",LEN(cleaned)-LEN(SUBSTITUTE(cleaned," ","")))))),
zipLike, TRIM(TEXTAFTER(lastLine," ",-1,"",1)),
fixedZip, IF(REGEXTEST(zipLike,"^\d{5}(-\d{4})?$"),
LEFT(zipLike,5),
zipLike),
SUBSTITUTE(cleaned, zipLike, fixedZip)
)

What’s the difference between these, other than Grok’s epic verbosity? The Grok version results in an unexplained error from Excel while the ChatGPT code works.
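For anyone who wants to sanity-check the regex logic outside of Excel, here is our translation of the working approach into Python (the sample address is invented):

```python
import re

def clean_address(addr: str) -> str:
    # ZIP+4 -> 5-digit ZIP: drop the "-NNNN" suffix
    addr = re.sub(r"(\d{5})-\d{4}", r"\1", addr)
    # strip a trailing "US", optionally preceded by a newline
    return re.sub(r"(\r?\n)?US$", "", addr)

print(clean_address("5 Main St\nJupiter, FL 33458-1234\nUS"))
# 5 Main St
# Jupiter, FL 33458
```

The ChatGPT formula applies the same two substitutions in the same order.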

Advantage: ChatGPT

Full post, including comments

AI will write such verbose code that only another AI will have the patience to maintain it

Department of AI job security: AI writes 5X as many lines of code to solve the same problem as a human. In other words, the LLMs are smart enough to write code that only their future selves will have the patience to read. See this comparison by Peter Norvig of Google (you’d think that in an entirely unbiased comparison by a Google employee Gemini would be the clear winner, but Norvig says “The three LLMS [Gemini, Claude, and ChatGPT] seemed to be roughly equal in quality.”).

Speaking of job security, here is a white man who purports to be an expert on Swahili and Kwanzaology and somehow still has a job:

Full post, including comments

NYU: Forced to learn about gender non-conformity among the indigenous people of French Canada

This is the week that eager schoolwork nerds will get their Early Decision answers from the nation’s elite universities.

Our mole at NYU (over $100,000 per year including a few required extras, such as airfare and going out in Manhattan) was required to choose from a short list of core courses, only one of which had availability… French in the Americas:

Here’s a slide from the 12th week of the course:

The teacher explained to our mole that the indigenous were natural followers of Rainbow Flagism and that this native religion was suppressed by European colonizers who were also passionate gender binarists. My email to the mole:

They’re making you learn about an economically irrelevant subgroup within an economically irrelevant subgroup within an economically irrelevant country. (Natives within Quebec, which is on track to lose its language, religion, and culture to recent immigrants, within Canada, whose manufacturing output is perhaps 1/50th that of China?) It feels to me as though they’re teaching this because they have some professor who is an expert on the subject, not because any American needs to know this information. How could this possibly be justified compared to learning about the history of China, for example? Or if you want to talk about ethnic minorities, why not talk about the ethnic minorities of China or the noble Muslims who’ve settled in Europe, Canada, and the U.S. despite rampant Islamophobia?

(I later checked with Grok and learned that China does not have 50X the manufacturing output of Canada, measured in dollars, but rather only 36X.)

Here’s another slide from the same PowerPoint and I would love to know how it could relate to European migrants conquering the noble natives 200 years prior to the invention of the tank.

On the other hand, it is tough to come up with a scenario in which understanding the above images and being able to answer the “What do they have in common?” question posed by the professor would have a $100,000 value. On the third hand, maybe the ability to answer the question is worth $trillions? Let’s see how our future AI overlords do with it.

Grok:

Gemini disagrees almost completely!

ChatGPT also disagrees with Grok:

It seems as though NYU could replace all of its students with these three LLMs and still have a lively in-class discussion!

Full post, including comments

Artificial stupidity meets the bathtub faucet

Loyal readers may remember Moen Flo Artificial Intelligence Water Overlord, in which the intelligent water overlord was dumb as a rock. This post is about what happened after I got the cartridge out of the faucet.

ChatGPT confidently identified the Roman tub faucet based on a photo of the top of the cartridge. The identification came with “100% certainty”:

The Pfister cartridges it told me to buy are plastic and don’t have a splined top:

When I pointed this out, ChatGPT told me to buy some other Pfister cartridge that was obviously wrong and intended for a sink faucet, not a tub faucet. I managed to get the cartridge out and sent ChatGPT a photo:

(Of course, this was obviously false as well.)

I went to Broedell Plumbing Supply here in Jupiter. The guy at the counter quickly found a Phylrich web page with dimensions that matched the faucet. I asked ChatGPT “You sure it isn’t a Phylrich 10240?”

The Phylrich web page says that its cartridge has 16 points, not 12 as ChatGPT confidently says. When I sent ChatGPT close-up photos of the top of the old cartridge and the bottom of the handle, it found 20 splines (I counted 16). ChatGPT still wanted to replace the failed cartridge with a Pfister. It came up with a dog-ate-my-homework story:

Grok was a little better. Shown a picture of the top of the cartridge (not the entire cartridge):

The dates have to be wrong since our house was built in 2003 and I don’t think that they used vintage materials. A Google search for the suggested “Phylrich Regency” and “Phylrich Versailles” doesn’t bring up anything with dual spouts. When I pushed back on Grok it changed its mind to Newport Brass or Jaclo. When I sent a photo of the complete cartridge, Grok said that it was American Standard or Pfister. Grok seems worse in terms of hallucinating the existence of similar-looking dual-spout roman tub faucets.

The plot thickened a little further. I ordered two replacement cartridges (one hot, one cold) from Phylrich ($155 including shipping, i.e., about the same price as a Glacier Bay deck-mount tub faucet from Home Depot (bizarrely rated at 2.4 gpm, which I don’t think can be right because that’s roughly the Federal shower flow limit and a standard Delta tub filler is about 20 gpm at 60 psi)). The cartridges fit and work perfectly. So the faucet is definitely Phylrich, right? I emailed a photo to the company’s customer service department and they say that they never made a faucet like that. ChatGPT, to its credit, did have a plausible explanation:

Many manufacturers bought cartridges from the same OEM suppliers. … Boutique brands (including Phylrich) often used “generic” brass compression stems early on. … So Phylrich’s cartridge fits simply because the valve body was designed around a widespread industry-standard stem pattern. … Your faucet is almost certainly a “private label” or discontinued OEM roman-tub set

(It still erroneously believes that the stem pattern is 20 splines and referred to that.)

Maybe I could order two of these swan sets and use two of the spouts on the existing rough-in kit? That would cost only about $10,800. That’s a mere trifle for some of our Palm Beach County neighbors.

I think the above tale at least demonstrates that (1) AI is not always ready for the real world, and (2) one should never install anything in one’s house that didn’t come from Home Depot.

Speaking of Home Depot, nearly the complete range of South Florida vehicles in the parking lot: airboat, Tesla, Rolls-Royce (I have seen Ferraris in that lot before, but not on the same day as the below photos were taken):

Full post, including comments

Image creation in Gemini vs. ChatGPT and Grok

Advice from a guy who will be paying taxes to keep the Mamdani Caliphate running: “I would ask: ‘May I meet you?’ before engaging further in a conversation. I almost never got a No.”

Gemini applying this for Mindy the Crippler:

ChatGPT does a comparatively crummy job:

Grok is off in its own world (I could get it to use this puppyhood picture of Mindy the Crippler, but it simply ignored my request to use the one of the golden retriever standing on the windowsill of the minivan):

Full post, including comments

Who’s in love with Gemini?

Gemini 3 has been out for a couple of weeks now. Who is finding it more useful than ChatGPT, Grok, et al.?

I gave a simple tree identification task to Gemini 3, ChatGPT, and Grok. All three failed the task with supreme confidence. A plants-only image classifier, given the following images of a neighbor’s tree, handled the task nicely and without any boasting:

Here’s Gemini getting it wrong:

(It’s important to have cold-hardy plants here in Palm Beach County in case it gets slightly below freezing, as it did in 1989, or briefly snows, as it did in 1977.)

ChatGPT, “almost certainly” and with a convincing explanation:

Grok, asked “What tree is this?” answers that it isn’t a tree at all:

Here’s a Yucca filamentosa photo from a nursery:

What is the neighbor’s tree? Almost surely Coccothrinax crinita (Old Man Palm), an immigrant from Cuba:

I can’t figure out why all three of these AI overlords did so badly. Yes, the plant classification web site has a smaller database of images to deal with, but given my prompt with the word “tree” in it why weren’t the general purpose AI services able to narrow down their search and do as well as a plants-only image database system?

Full post, including comments