There were a couple of videographers present so I’m hopeful that eventually the lectures will appear on YouTube as some previous events in this “Expanding Horizons in Computing” series have.
During the intro, we were reminded that the first thing computer nerds want to do is get rid of computer nerds:
Inevitably, though, there have been haters. Alan Perlis:
Tim Kraska was the speaker who’d done the most to determine what LLMs can do. His grad students spent 2.5 months and about $100,000 in Claude API fees replicating the capabilities of the 500,000-line DuckDB embedded database management system but in a different language (I forget which! Sadly, not Lisp). It seems that for a complex project like this, the only people who can tell AI what to do are those who could do it themselves if they had to.
Continuing Carnegie-Mellon’s tradition as “the useful place in CS” (CMU gave us the Mach kernel, for example, which is inside nearly all Apple products), Graham Neubig talked about his experience building and using OpenHands, a system a little like Google’s Antigravity in that you can tell your “agents” to write software for you and the editor connects to the LLM of your choice. Prof. Neubig demonstrated using OpenHands to build a web site for the MIT event and the results were impressive!
How good are LLMs in practice? Contrary to my own experience where LLMs are amazing at diving into huge legacy codebases and telling the human “these are the relevant files”, AI felt good but actually slowed human programmers down:
AIs cut and paste like crazy, eventually producing code with so many duplicated blocks that only an AI will be able to make a change consistently throughout a real-world system:
I enjoyed the bathroom break. The smartest humans on the planet need a lot of coaching for the operation of sophisticated machinery:
Based on the period products in what was labeled a “men’s” room, the world’s smartest people are going to struggle with the “What is a woman?” question:
Speaking of bathrooms, the ground floor restroom signs are already falling apart in the nearly-new building. Fortunately, the sacred word “inclusive” hasn’t been marred.
Towards the end of the day, Varun Mohan showed up via Zoom to make the academics look like fools. While they were dithering to get a few papers published and secure a lifetime guarantee of employment at a wage that is 1/50th of what a receptionist at NVIDIA earns, an apparent teenager glued together a few open-source developer tools and added LLM integration to create Windsurf, which Google then acquired in a non-acquisition for $2.4 billion. The result is Antigravity (see Antigravity as web developer (AI in an IDE)).
What nobody could offer at the event: A clear explanation of what skills made a person a good software developer in the Age of AI and, therefore, what an undergraduate CS program should teach. On the other hand, the slides did offer a clear picture of what a typical human software engineer looks like: female and, usually, non-Asian.
For the haters who say that there is no “science” in computer science… I learned about Science starting during the walk to the event. The most prominent sign at the hardware store in Inman Square:
Not Science-related, but I love seeing Black Lives Matter signs and here a commercial property owner had devoted a huge amount of space to one. (See Replacement of Black workers by migrants in Cambridge, Massachusetts for how the city’s merchants have kept the signs and discarded the people.)
Of course, there were the sidewalk maskers:
And the cycling maskers (note the filthy snow and trash in the background):
In the MIT event space there were 6 people sitting in front of me, 3 of whom were masked for the entire day. Here are a couple of them. I’m not sure that I understand the rationale for these choices.
The next day at the Harvard Art Museum, Arthur M. Sackler section (don’t forget that before developing ties to Jeffrey Epstein, Harvard was entwined with the Sackler family), an apparent couple in which one person doesn’t wear a mask while the other does. This has always mystified me. Partner 1 is protected by his/her/zir/their mask so only Partner 2 gets infected by SARS-CoV-2 at the public venue. Then they go home and, without masks, share a confined space for days during which time the virus can trivially hop from Partner 2 to Partner 1.
Also a mystery, the person who is afraid of catching a respiratory virus and chooses a job with guaranteed exposure to hundreds or thousands of strangers each day. The mask is great protection, I’m sure, but wouldn’t it be far safer to wear the mask while working in a regular office or warehouse with just a handful of other employees nearby?
Maybe one day an LLM will be able to explain these choices?
In working with John Patrick Morgan to see what capabilities today’s LLMs have for doing web page design and upgrades of existing web pages/sites, it occurred to me that there is a gap in the marketplace. As far as I can tell, there isn’t a good tool that a publisher can run after every CSS tweak that will crawl the entire site and look for pages that were uglified by the tweak, broken by the tweak, never responded to the tweak because they reference some other CSS file or no CSS file, etc.
There are plenty of tools that will look for changes to pages, but they consider every change to be a mistake. If the publisher has intentionally decided on a fresh new look for the site, these tools will flag every page on the site (could be tens of thousands or hundreds of thousands for a newspaper, for example) as being defective.
Gemini says that these are the best current options:
1. The Best “Set it and Forget it” Tool: Percy (by BrowserStack)
Percy is likely the closest to what you are asking for because it integrates rendering into the crawling process.
How it works: You give it your sitemap. It visits every URL on both desktop and mobile (simulated iPhone/Pixel/iPad viewports).
The “AI” part: It uses “Smart Diffing.” It ignores “noise” (like a pixel moving 1mm due to browser rendering differences) and only alerts you to actual visual changes (e.g., your navigation bar collapsing into two rows instead of one).
Why for you: It handles the “entire site” requirement well if you feed it a list of URLs or a sitemap.
2. The “Smartest” AI: Applitools Eyes
Applitools is the leader in “Visual AI.” It attempts to mimic the human eye and brain.
The “AI” part: It is smart enough to know the difference between Content changes (a date changing from yesterday to today) and Layout disasters (text overlapping an image).
Unique Feature: It can sometimes detect “broken” layouts even without a baseline if you use their specific “Layout Match Level” which looks for structural integrity, though it is primarily a comparison tool.
Mobile: It uses the “Ultrafast Grid” to render your pages on dozens of real mobile devices in seconds.
3. The “Generic Brokenness” Finder: BrowserStack Responsive Checker
If you don’t want to set up baselines and just want a tool to crawl the site and show you screenshots of every page on an iPhone X, this is the tool.
How it works: You enter your URL (or a batch of URLs). It spins up real devices in the cloud, loads your page, and takes a high-res screenshot.
The downside: It doesn’t automatically “flag” errors (it doesn’t know that the image shouldn’t be there). You still have to scroll through the gallery of screenshots yourself, but it’s much faster than checking on your phone manually.
4. The Free / Open Source Option: BackstopJS
If you are comfortable with the command line and want a free solution you can run on your own machine.
How it works: You create a backstop.json file with a list of your URLs.
The Workflow:
Run backstop reference (this crawls your current site and takes screenshots of every page).
Apply your CSS changes.
Run backstop test (this crawls again and compares).
Result: It generates an HTML report showing you “diffs” (difference highlights) for every page that changed.
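[To make this concrete, here is an illustrative backstop.json of our own devising, not part of Gemini’s answer; the id, labels, breakpoints, and URLs are placeholders:]

{
  "id": "css_redesign_check",
  "viewports": [
    { "label": "phone", "width": 375, "height": 667 },
    { "label": "desktop", "width": 1280, "height": 800 }
  ],
  "scenarios": [
    { "label": "home", "url": "https://example.com/" },
    { "label": "legacy-article", "url": "https://example.com/old/article.html" }
  ],
  "report": ["browser"]
}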
My Recommendation for Your Upgrade
Since you are retrofitting a legacy site, BackstopJS (if you have dev skills) or Percy (if you want a paid service) is the ideal workflow:
Run a scan NOW (with the old CSS). This is your “source of truth.”
Upload the new CSS.
Run the scan again.
The tool will instantly show you exactly which 5 pages (out of 100) broke because they had unique HTML that the new CSS didn’t account for.
Me:
You say that Percy will highlight changes to pages, but if the goal of a site redesign is to change the look of the site won’t that result in Percy flagging every page as dramatically different? What would be useful is a tool that flags only those pages that have been broken or disfigured by the change, not those that have simply changed from one style to another.
That is a very sharp observation. You’ve hit on the main “pain point” of visual regression testing: when you intentionally change the design, every page becomes a “failure” in the eyes of a standard pixel-comparison tool.
To solve this, the industry has moved toward Visual AI and Layout-Only testing. Here is how you can use tools to find “broken” pages rather than just “different” pages.
1. Applitools “Layout Match” (The Intelligence Approach)
Applitools is currently the leader in this. Instead of looking at pixels, their AI builds a structural map of the page.
How it helps you: It has a “Layout” match level. This mode ignores changes in colors, fonts, and specific images. Instead, it checks if the relationships between elements are still healthy.
What it flags: It won’t flag that your text is now blue instead of black. It will flag if your text is now overlapping an image, if a column has dropped to the bottom of the page unexpectedly, or if a button is partially hidden.
The Workflow: You run it once on your old site to let the AI learn the “structure” of your content, then run it on the new site.
2. Galen Framework (The “Rules-Based” Approach)
If you want something that specifically looks for “disfigured” pages without needing a baseline at all, Galen is a powerful open-source tool.
How it works: You write a simple “spec” file that describes what a “correct” page looks like.
Example: “Image should be above text on mobile,” or “Main container should never be wider than the screen.”
Why it fits your goal: You aren’t comparing the new site to the old one. You are testing the new site against the rules of good design. If a page has a horizontal scrollbar or a margin that is 0px wide on a phone, Galen will fail that specific page.
3. Percy’s “Grouped Diffs” (The Efficiency Approach)
Percy knows that a CSS change will break everything. To keep you from losing your mind, it uses Diff Grouping.
How it helps you: If you change the global margins-and-ads.css and it shifts the header on 400 pages, Percy will group all 400 pages into a single “change set.”
The “Broken” Finder: You look at the first page in the group. If it looks “correct” in its new style, you hit Approve All. If 398 pages look great but 2 pages look “disfigured,” the AI often detects that those 2 pages are “outliers” (different from the rest of the group) and flags them for individual review.
4. “VisualEyes” (User Attention AI)
This is a slightly different type of AI. It doesn’t look for broken code; it looks for broken UX.
How it works: It generates “Attention Heatmaps” based on your new design.
Why it’s useful: If your CSS change accidentally makes a “Google Ad” look more prominent than your “Navigation Menu,” the AI heatmap will show a massive red blob over the ad. It flags pages where the visual hierarchy has been “disfigured” in a way that will confuse a human.
Summary Recommendation:
If you want a tool that “just knows” a page is broken:
Applitools (Layout Mode) is the most “AI” way to ignore style changes but catch structural breaks.
Google Search Console / Lighthouse (Free) is the simplest “No-Baseline” way. You can run a Lighthouse “Accessibility” and “Best Practices” audit on your site; it will automatically flag “Touch targets too close together” or “Content wider than screen”—the two most common ways a mobile site is “disfigured.”
[after a follow-up]
By default, Google Lighthouse does NOT look at your entire site.
If you open Chrome DevTools and click “Run Audit,” it only analyzes the single specific page you are currently looking at.
However, because you are dealing with a site-wide CSS redesign, you have three ways to handle this depending on how much “manual work” you want to do:
1. The Manual Way (Single Page)
If you use the standard Lighthouse tool (in Chrome or via PageSpeed Insights), you have to enter every URL one by one. This is useful for testing your “templates” (e.g., check one article page, one gallery page, and the home page), but it won’t find a broken layout on a random page deep in your archives.
2. The “Semi-Automatic” Way (Unlighthouse)
There is a popular open-source tool called Unlighthouse that specifically solves your problem.
What it does: You give it your homepage URL. It crawls your entire site, finds every link, and runs a Lighthouse audit on every single page automatically.
The Result: It gives you a dashboard showing a “Score” for every page. If your new CSS makes a specific page’s “Accessibility” or “Best Practices” score drop (for example, because text is now overlapping or a button is too small), it will stick out like a sore thumb in the list.
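[Not part of Gemini’s answer: as far as we can tell, Unlighthouse is a one-command affair; the invocation below is the documented usage as we understand it.]

npx unlighthouse --site https://philip.greenspun.com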
3. The “Google Search Console” Way (Field Data)
If your site is already live and verified in Google Search Console, Google is actually already auditing your site for you.
Under the “Experience” tab in Search Console, look at “Core Web Vitals” and “Page Experience.”
Instead of a lab test, this shows you real-world data from users. It will group your pages into categories like “Poor,” “Needs Improvement,” and “Good.”
The Catch: This data is not instant. It takes about 28 days of real user traffic to update, so it’s better for long-term monitoring than for checking a CSS change you made ten minutes ago.
Which should you use?
For your CSS project, I recommend Unlighthouse.
Since your site appears to be a collection of legacy HTML files with a shared CSS, you likely have “patterns.” Unlighthouse will let you see if your CSS change worked for the 1996 articles as well as it did for the 2007 articles by showing you a side-by-side gallery of screenshots for every page it finds.
Summary Tip: If you want a quick check without installing anything, run Lighthouse on your three most different pages (e.g., a text-heavy page like “Chapter X,” a photo-heavy page like “Architecture,” and your Home Page). If those three pass the “Mobile Friendly” test, 90% of your site is likely safe.
We set Antigravity loose on a directory of AOLserver code behind philip.greenspun.com. The code base includes ArsDigita Community System, AOLserver configuration, and an installation of WordPress that was bolted on later. Antigravity was configured to use Gemini 3 Pro as the LLM.
Question 1
This is a directory for a web site called philip.greenspun.com. The page root is the www directory. If the / file is requested what is served is philip-home.html. What causes this file to be served instead of index.html and what functions process philip-home.html before it goes out to the browser?
Answer 1 (after 6 seconds)
Identified that index.tcl handles the root request, reads the SpecialIndexPage parameter from parameters/philip.tcl (set to philip-home.html), and serves it using ns_returnfile without content processing.
Here is what I found regarding the serving of philip-home.html:

1. Why philip-home.html is served instead of index.html: The file www/index.tcl is the default directory index file (prioritized by the server configuration DirectoryFile in config/philip.tcl). When index.tcl executes, it checks for a parameter named SpecialIndexPage. This parameter is defined in parameters/philip.tcl (line 269) as:

ns_param SpecialIndexPage /philip-home.html

Because this parameter is set, index.tcl serves this file immediately.

2. What functions process philip-home.html: The file itself is not processed by any template functions (like ADP parsing) because:

It is served via the ns_returnfile command in index.tcl, which bypasses the ADP parser.

The .html extension is not mapped to the ADP parser in config/philip.tcl (only .adp is).

The functions in index.tcl that lead to the file being served are:

ad_parameter (to retrieve the SpecialIndexPage value)
empty_string_p (to check if the parameter is set)
ns_info pageroot (to construct the full path for the existence check)
file exists (to verify the file is there)
ns_url2file (to get the absolute path for ns_returnfile)
ns_returnfile (to send the file content to the browser)
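Putting those pieces together, the heart of index.tcl is presumably something like the following. This is our sketch reconstructed from Antigravity’s answer, not the actual file; ad_parameter and empty_string_p are the ArsDigita helpers named above.

# reconstructed sketch, not the actual index.tcl
set special [ad_parameter SpecialIndexPage]
if { ![empty_string_p $special] && [file exists "[ns_info pageroot]$special"] } {
    # ns_returnfile sends the bytes as-is, bypassing the ADP parser
    ns_returnfile 200 text/html [ns_url2file $special]
    return
}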
Question 2
What HTTP server is this code written for and that is likely being used for the live site?
Answer 2 (immediate)
Based on the codebase analysis, the site is written for AOLserver. … [extension explanation] … It is likely running AOLserver 4.5 (or potentially a compatible version of NaviServer) given the explicit path references in the configuration.
Question about ADP tags (a niche technology copied from Microsoft ASP)
[After browsing to /photography/learn/interior.adp] Where is the photonetheader ADP tag used in this file defined?
Answer re: ADP
Question about where to tweak
I want to modify either this function or the ad_header function that it calls to add a viewport META tag to the HEAD of every ADP page that is served. Which function should be modified and what would the modification look like?
Answer about where to tweak
I recommend modifying ad_header in d:\Backups\philip.greenspun.com\philip\tcl\2-hooks.tcl because this file already acts as a “hook” to inject content (currently a CSS file) into the header of all pages. This ensures the viewport tag is applied globally, including all ADP pages.
I have created an implementation plan to make this change.
Unless wrapped in {}, the quotes within the META tag to be added need to be escaped with backslashes. Antigravity doesn’t mention this.
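For the record, the quoting issue looks like this (our illustration, not Antigravity’s output):

# braces: Tcl performs no substitution inside {}, so no escaping is needed
set tag {<meta name="viewport" content="width=device-width, initial-scale=1">}
# double quotes: the inner quotes must be escaped with backslashes
set tag "<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">"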
Question about home page rendering
I modified this file on my dev server to include a viewport META tag, but it still doesn’t render well on mobile because the image on the left is much too large. What can I do with CSS to make this look good on mobile?
Answer
Conclusion
Antigravity can be a significant time-saver when you’re groping around a code base that you don’t understand, either because you never understood it or haven’t touched it for 20 years. Much of a programmer’s time on legacy code is figuring out what files are relevant and where a reasonable place might be to insert a required modification. Antigravity is a great shortcut.
Robert Duvall has died at 95. He played Tom Hagen in The Godfather and The Godfather, Part II, and famously pointed out that “A lawyer with his briefcase can steal more than a hundred men with guns” (worth remembering before considering marriage; no lawyer can sue you for divorce if you aren’t dumb enough to agree to a marriage!)
And, of course, every Robinson R44 pilot who ever thought about applying a “Death From Above” sticker to the nose of the four-seater will remember Duvall as Lieutenant Colonel William “Bill” Kilgore in Apocalypse Now.
I’ll remember Duvall in the lesser-known The Apostle (1997), which he also directed (don’t miss the soundtrack!). 88 percent on Rotten Tomatoes.
Robert Duvall, who drew from a seemingly bottomless reservoir of acting craftsmanship to transform himself into a business-focused Mafia lawyer, a faded country singer, a cynical police detective, a bullying Marine pilot, a surfing-obsessed Vietnam commander, a mysterious Southern recluse and scores of other film, stage and television characters, died on Sunday. He was 95.
Throughout his career, Mr. Duvall tried to keep Hollywood at arm’s length. He preferred living elsewhere — for many years on the Northern Virginia ranch with his fourth wife, the former Luciana Pedraza, an Argentine woman 41 years his junior. They met in the 1990s in Buenos Aires, which he visited often after developing a passion for the tango.
He was a Hollywood outlier on another front: politics. He was an ardent conservative, strongly supporting Republican presidential candidates, in a film world dominated by political liberals.
From early on, Mr. Duvall enjoyed the life of a supporting actor. “Somebody once said that the best life in the world is the life of a second leading man,” Mr. Duvall told The Times. “You travel, you get a per diem, and you’ve probably got a better part anyway. And you don’t have the weight of the entire movie on your shoulders.”
More from John Morgan and Philip Greenspun on using AI when doing web development…
Intro from Philip
In 1994, when I built my first web services backed by relational database management (RDBMS) systems, every request to an HTTP (Web) server resulted in an expensive fork, i.e., the spawning of a new process within the Unix operating system. If the request was for a small computer program to run, rather than a static file from the file system, that resulted in a second fork for a CGI script. If there was a need to refer to information in an RDBMS, such as Oracle, the CGI script would initiate the login process to the RDBMS, which would check username and password and… initiate another fork for a new process to handle what would ordinarily be 8 hours of SQL requests from a human user at a desktop in the 1980s client/server world. So that would be three new processes created on the server in order to handle a request for a web page. All would be torn down once the page had been delivered to the user. Consider how much longer it takes to start up Microsoft Word on your desktop computer than it does for Word to respond to a command. That’s the difference between starting up a program as a new process and interacting with an already-running program. Here’s an illustration of the four processes ultimately involved. The top one is the user’s browser. The three below that were spawned to handle a single request.
NaviServer (later “AOLserver” after NaviSoft’s acquisition by AOL) was the first web server to combine the following:
multi-threading, enabling a request for a static file to be served without a fork
internal API, enabling small programs that generated web pages to run within the web server’s process, thus eliminating the CGI fork
pooling of database connections, e.g., four persistent connections to an RDBMS that could be used to handle queries from thousands of requests, thus eliminating the fork of an RDBMS server process to handle a “new” client on every request
The available languages for page scripts that could run inside the AOLserver were C (macho, compiled, slow for development, potential to destroy the entire web server, not just break a single page, with one mistake; see the API) and Tcl (embarrassing to admit using, simple, interpreted, safe; see the API).
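For flavor, here is roughly what a complete Tcl page script looked like. This is a sketch; the bboard table and the message-count query are made up for illustration.

# hypothetical .tcl page: one pooled database handle, one query, one page out,
# all inside the web server's process, i.e., zero forks
set db [ns_db gethandle]
set row [ns_db 1row $db "select count(*) as n from bboard"]
set count [ns_set get $row n]
ns_db releasehandle $db
ns_return 200 text/html "<html><body>The forum holds $count messages.</body></html>"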
As the architect of the information systems that would be available to all of Hearst Corporation’s newspapers, magazines, radio stations, TV stations, cable networks, etc., I selected the then-new NaviServer in late 1994 in order to achieve (1) high performance with minimal server hardware investment, and (2) high developer productivity. The result was that I developed a lot of code in Tcl, a language that I hadn’t used before. It didn’t seem to matter because the Tcl was generally just glue between an HTML template and a SQL query. The real programming and design effort went into the SQL (are we asking for and presenting the correct information? Will comments, orders, and other updates from hundreds of simultaneous readers be recorded correctly and with transactional integrity in the event of a server failure?) and the HTML (will the reader be able to understand and use the information? Will comments, orders, and other updates from the reader be easy to make?).
Just a few years later, of course, other companies caught up to AOLserver’s threaded+scripts inside the server+database connection pools architecture, most notably Microsoft’s Internet Information Server (IIS). Eventually the open source folks caught up to AOLserver as well, with a variety of Apache plug-ins. And then I persuaded America Online to open-source AOLserver.
Here we are 32 years later and I’m still running code that we wrote at Hearst. The company was interested in publishing, not software products, so they graciously allowed me to use whatever I wrote at Hearst on my own photo.net web site and, ultimately, to release the code as part of the free open-source ArsDigita Community System (adopted by about 10,000 sites worldwide, including some at Oracle, Siemens, the World Bank, Zipcar, and a bunch of universities).
[One fun interaction: in 2012, I was at a social gathering with a developer from the Berklee School of Music (Boston). He talked about some legacy software in a horrible computer language that nobody knew that they had been using for 10 years to track and organize all of their courses, students, teachers, assignments, grades, etc. They’d had multiple expensive failed projects to try to transition to newer fancier commercial tools and finally, at enormous cost in dollars and time, had succeeded getting away from the hated legacy system. It turned out that he was talking about the .LRN module that we’d developed for the MIT Sloan School in the 1990s and that was then rolled into the open-source toolkit. I kept quiet about my role in what had turned into a pain point for Berklee’s IT folks…]
Our Project
As part of our experimentation with AI and web design, we asked LLMs to do some redesign work on philip.greenspun.com. They all came back and said that it would be necessary to modify the HTML flying out of the server and not merely an already-referenced CSS file. It would be technically feasible, of course, to write a script to run on the server to open up every HTML file and add

<meta name="viewport" content="width=device-width, initial-scale=1">

to the HEAD of every document before writing it back into the file system. This would, however, have the side effect of updating all of the last-modified dates on files that, in fact, hadn’t been touched for decades (fixable with a little more scripting) and also of creating a blizzard of updates in the git version control system.
The server was already set up with a standard ArsDigita Community System feature in which a Tcl procedure would be run for every HTML file request. This obviously reduces performance, but it enables legacy static HTML pages to be augmented with navigation bars, comment links, comments pulled from the RDBMS, advertising JavaScript, extra footers or headers, etc. Instead of modifying every HTML file under the page root, in other words, we could simply modify this function to add whatever tags we wanted in the head.
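Mechanically, the hook relies on AOLserver’s ability to route requests to a Tcl proc instead of the static-file code path. Here is a minimal sketch of the mechanism; the proc name is hypothetical and the real ArsDigita version does much more.

# route every .html request through a Tcl proc
ns_register_proc GET /*.html philg_serve_html

proc philg_serve_html {args} {
    set file [ns_url2file [ns_conn url]]
    set fd [open $file r]
    set whole_page [read $fd]
    close $fd
    # ... munge $whole_page: navigation bars, comment links, ads, footers ...
    ns_return 200 text/html $whole_page
}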
The questions:
Would an LLM be able to understand a Tcl program that had grown over the decades into a confusing hairball?
Would an LLM be able to do a minimalist modification of the program to insert the one line that was needed?
Would an LLM be able to reorganize the software for improved legibility, modularity, and maintainability?
The original
The original file is about 500 lines long and contains 18 procedures. The LLM will have to figure out which procedure to modify and then pick the best place within the procedure.
If a human programmer were to do an insertion with minimum risk of breaking everything, it would probably be around line 185. The file has been read and is sitting in a variable called “whole_page”. Even if there is a “do not disturb” tag within the page (i.e., don’t try to add comments or ads) it would still make sense to add this modern META tag in the output at line 189.
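In other words, the minimal human edit would be a one-shot regsub along these lines (our sketch; whole_page is the variable name actually used in the file, the rest is illustrative):

# insert the tag immediately after the opening HEAD tag
regsub -nocase {<head>} $whole_page \
    {<head><meta name="viewport" content="width=device-width, initial-scale=1">} \
    whole_page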
The prompt
Can you add one small feature to this AOLserver Tcl program, making as few changes as possible? It implements a procedure that runs every time a .html page is requested from the server and may add comment links, Google ads, navigation bars, and other items to legacy static pages. I want it to add "<meta name="viewport" content="width=device-width, initial-scale=1">" to the HEAD of every outgoing page. Please tell me the line numbers of the file where you’ve changed, inserted, or deleted anything.
Grok can’t count
Grok finds the same location that a human programmer would, but either can’t count or numbers the lines differently from how Notepad++ and Emacs count/number. What’s at line 185 in Notepad++ and perhaps 187 in Emacs is 105-110 for Grok:
Grok helpfully offers to enhance this a bit:
Grok is capable of explaining its regsub command. Here’s part of the explanation:
Grade: A! (would have been A+ if the line counting corresponded to how text editors count lines)
ChatGPT gets an A++
It took 3.5 minutes for ChatGPT to wrap its Hopper architecture brain (assuming H200 chips) around Tcl, but the results were spectacular. Unlike Grok, it counts the lines correctly. Also unlike Grok, it puts the new META tag in a variable so that changing the META tag doesn’t risk breaking the regexp. ChatGPT also prepares an updated file for download.
ChatGPT is kind of like Air Traffic Control in dealing with humans who are making mistakes. Instead of pointing out “You’re going in the wrong direction,” ATC will generally ask the pilot to “Say heading” or just say “Turn left heading 180” (if 180 was the correct heading to fly). ChatGPT recognizes that only a stupid human would want to insert the viewport META tag into an HTML page that already has one and it just includes the regexp check in an IF statement before the regsub operation: A++!
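Our reconstruction of the shape of ChatGPT’s answer (not its exact code; the variable name is invented):

# keep the tag in a variable so future edits can't break the regsub,
# and skip pages that already carry a viewport META tag
set viewport_meta {<meta name="viewport" content="width=device-width, initial-scale=1">}
if { ![regexp -nocase {<meta[^>]*viewport} $whole_page] } {
    regsub -nocase {(<head[^>]*>)} $whole_page "\\1\n$viewport_meta" whole_page
}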
Claude gets a B-
Unlike ChatGPT, Claude runs its regsub without first checking to see if there is already a viewport META tag. Unlike Grok, it doesn’t offer an improved version that does perform a check. Unlike both ChatGPT and Grok, Claude puts this important code after the DND pages have been served and also after some random Amazon ad tags have been searched for. Like Grok, Claude puts the new meta tag right in the regsub statement, thus making maintenance riskier.
Claude has a convenient side-by-side layout for working with code.
Credit: Claude takes the interesting approach of looking for the closing HEAD tag rather than the opening one. This avoids the need for a more complex matching pattern, since an opening HEAD tag may carry attributes while the closing tag never does. That lifts Claude from a C to a B-.
Gemini Pro (C isn’t just for memory leaks)
Gemini Pro combines all of the worst features of the other LLMs. It counts lines idiosyncratically, like Grok. It fails to use the /HEAD tag and also fails to use a matching pattern that will match HEAD tags with attributes. It chooses the wrong part of the file and, unlike Claude, either lacks awareness that this means a lot of DND pages will be served without the viewport tag or chooses not to explain the implication of placing the code after the DND pages are served.
Antigravity
Antigravity backed by Gemini takes a slightly different approach than we saw from Gemini in the previous test. It puts the regsub after the DND pages, but before the Amazon ads. It matches a HEAD tag with additional attributes. Grade: B
Interestingly, the Antigravity chat interface gives us insight into the reasoning that led to this result. In this case we see that Gemini chose this placement not because it failed to understand the relevance of the DND option but because it interpreted the “do not disturb” concept that it found in the code to be more important than the prompt’s instruction to apply the <meta> tag to all pages. As it came up with its plan, it changed its mind back and forth before ultimately deciding that the best compromise was to place the code after the DND check:
I’m now grappling with the implications of “Do Not Disturb” pages. Lines 189 and beyond are relevant. The existing logic immediately returns $whole_page if the tag is present, implying an “unmolested” state. The user requested adding to “every outgoing page.” However, “unmolested” suggests no modifications, contradicting the user’s intent if the request is taken literally. I’m leaning toward not inserting the viewport tag if the DND tag exists, as that aligns with the script’s and the variable names’, intended functionality.
I’m now carefully considering the implications of inserting the viewport meta tag in the context of “Do Not Disturb” (DND) pages, aiming to
This is about a January 2-12, 2026 trip on the Celebrity Ascent from/to Fort Lauderdale via the following ports:
Tortola, British Virgin Islands
St. John’s, Antigua
Barbados
St. Lucia
St. Kitts
TL;DR: It’s a big ship, but you feel like family. The officers and staff are warm and friendly. The food is much better than on Royal Caribbean. The ship orchestra and the house band (Blue Jays with Jessica Gabrielle) were superb.
The Machine
Celebrity Ascent was completed in 2023 by Chantiers de l’Atlantique at a cost of $1.2 billion and holds about 3,300 passengers on a typical cruise, plus 1,400 crew. She’s notable for having a “Magic Carpet” that can slide up and down the ship, serving as a restaurant or bar most of the time, but also an embarkation platform for the ship’s tenders at ports where there isn’t a pier.
I don’t think she’ll win any beauty contests, but Ascent is very functional! In St. Lucia:
Note that there is no place on board to land a helicopter. If someone gets sick and needs to be evacuated, only the Coast Guard or one of the private contractors that the Europeans like to use can extract someone from the ship with a hoist.
With 73,000 hp of Wärtsilä diesel power (five engines total), I’m not sure that Greta Thunberg will want to be a customer. That said, the hull design is 22 percent more fuel efficient than older ships. How is it possible to advance the art of naval architecture, already relatively mature during the Second Punic War (2,250 years ago)? The efficiency doesn’t come from an improved hull shape, but from pushing air out at the bow and, thus, enabling the ship to ride on a cushion of air rather than clawing at the draggy water. Prof. Dr. ChatGPT, Ph.D. Naval Arch. explains:
Modern cruise ships sometimes use air lubrication systems (ALS) that pump compressed air through tiny openings in the hull—usually along the flat bottom.
1. Reduced Skin-Friction Drag
Water is ~800× denser and far more viscous than air.
Replacing direct water–steel contact with air–water contact drastically lowers friction.
Skin friction accounts for 50–80% of total resistance at cruise speeds.
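[Editorial aside to make the density point concrete: skin-friction drag scales linearly with the density of whatever fluid touches the hull,

$D_f = \frac{1}{2} \rho V^2 S C_f$

so replacing water ($\rho \approx 1{,}000$ kg/m³) with air ($\rho \approx 1.2$ kg/m³) over part of the wetted surface $S$ cuts that area’s friction contribution by roughly the ~800× density ratio quoted above.]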
2. Lower Fuel Consumption
Typical real-world savings:
5–10% fuel reduction on large ships
Sometimes higher on wide, flat-bottomed hulls (like cruise ships)
Many modern ships include air lubrication, including vessels from:
Mitsubishi Heavy Industries
Wärtsilä
Silverstream Technologies
Some notable cruise lines have retrofitted ALS systems to existing ships to improve efficiency.
Air-bubble (air-lubrication) systems have a longer history than most people realize—they date back over half a century, but only became practical for cruise ships fairly recently.
Air lubrication always worked—but it needed:
Cheap, efficient electrical power onboard
Sophisticated control software
Environmental pressure (fuel cost + emissions)
Better hull designs to keep the air where it belongs
Cruise ships finally ticked all the boxes.
The Stateroom
Booking about three weeks before departure we got literally the last room available on the ship, other than an inside cabin. We had a Concierge Class 285-square-foot stateroom including the veranda, which ends up becoming part of the room because of the top glass panel’s ability to slide vertically. It’s a clever design. Our room was laid out like the photo below, except that we had the two halves of the bed split with a night table in between. We could have used outlets on both sides of the bed, but found an outlet on only one side. The bathroom felt spacious.
The in-room HVAC doesn’t dehumidify as much as one would expect, nor does it bring significant fresh air into the room when the veranda window is sealed. Humidity without the window open would range from 50-65% (how do they avoid mold?) and CO2 levels in the middle of the night would go over 1,300 ppm (a real nightmare for Greta Thunberg!). Data from an Airthings Wave Enhance:
(I’ve seen CO2 go to 1,000 ppm in some hotels in humid environments, such as Miami. The ASHRAE standard is 800-1,000 ppm. CO2 by itself isn’t harmful (up to 5,000 ppm is tolerated in submarines), but is an indication of how much fresh air is coming in. Atmospheric CO2 is about 430 ppm. In my old Harvard Square condo (crummy 1880s construction) with just one person in the bedroom (me), the CO2 level reached 700 ppm in the middle of the night.)
The Passengers
Typical passengers seemed to be the same kinds of folks who would move into The Villages (the most active over-55 community in the U.S.?). Here are a couple of brothers who were, I think, traveling with their parents (flamingo suits from Amazon):
Exercise on Board
There is a beautiful and never-crowded gym on Deck 15 looking straight out at the sea in front of the ship:
Ascent lacks the “walking/jogging track all the way around Deck 5” that was a conventional exercise solution on older ships and instead has a bizarre serpentine track on (crowded) Decks 15 and 16 that is also used by people getting to and from lounge chairs. The lack of the conventional all-around-the-ship track was my biggest disappointment, which I guess means that everything else was at least pretty good!
Here’s the track. Notice that it isn’t shaded, unlike the typical round-the-ship track, and it is surrounded by clutter and people. (The Magic Carpet is in the background in its higher position.)
Food
The food is a significant step up from what’s offered on Royal Caribbean, the parent company of which acquired Celebrity in 1997. This is good and bad, I guess: I lost weight during every Royal cruise and gained some weight on this Celebrity trip.
One important source of weight gain was that, unlike almost anyone in the U.S. and certainly unlike anyone on Royal, the baker for Ascent was able to make a high quality croissant. These were hard to resist at breakfast. (Fortunately, they were just as bad as Royal at making donuts! The worst Dunkin’ does a better job.) Then at about half the other meals in the buffet they had addictive bread pudding. There was always an option for Indian food at the buffet (4 or 5 dishes plus bread) and typically at least two or three other Asian choices.
A friend who owns some superb restaurants did the Retreat class on Celebrity and said that the dedicated restaurant for those elite passengers exceeded his expectations. We hit the specialty steak restaurant on Ascent and were somewhat disappointed. They can’t have a gas grill on board for safety reasons and, apparently, don’t know how to use induction and a cast iron pan. The steaks are, therefore, rather soggy. We ate in the main dining room and buffet restaurants after that.
The Pool
There’s an indoor solarium pool for Alaska and European cruises. Here’s the outdoor pool (big enough for water aerobics and kids to goof around; not really big enough to swim for exercise (though it emptied out towards sunset so maybe one could)):
Still open and empty because everyone is dressing for dinner?
There are some hot tubs, but they’re not quite hot enough (i.e., you could comfortably sit in one for an hour):
The Spa
If you’re doing an Alaska cruise it probably would make sense to pay for Aqua Class, which includes access to these heated loungers looking out at the sea (not all that appealing on a Caribbean cruise!). The SEA Thermal Suite:
Sports under the Stars
Our cruise coincided with NFL playoffs and people enjoyed the big screen experience in the “Rooftop Garden”:
Entertainment
The resident musicians, singers, and dancers were all great. I personally wish that cruise lines would do full plays or musicals rather than assemble songs from disparate sources and string them together, but apparently I’m a minority of one and attention spans dictate that shows last for just 45 minutes. Some of the guest stars were fantastic, notably Stephen Barry, an Irish singer with a fun attitude. Steve Valentine did a mind-bending Vegas-quality magic show. The technical aspects of the theater were up to Broadway standards or beyond.
Some of my favorite shows were ones where the ship’s orchestra got together with one of the singers from a smaller group and just played music. I’m more of a classical music fan, but the high level of live talent was compelling.
For Kids
There is a small Camp at Sea for younger kids, which some of the youngsters on board seemed to like. My 16-year-old companion rejected the Teen Club, finding only boys playing videogames.
Unlike on Royal Caribbean, there weren’t many under-18s on board. That said, I never saw a child or teen who seemed bored or unhappy. They were loving the food, the scenery, the pool, etc.
The Bridge
The bridge is worth seeing. It’s a masterpiece of ergonomics. Apparently, the captain takes direct control of the Azipods when docking. I had expected a joystick and a computer to figure out what to do with the bow thrusters and the Azipods, but that’s not how it is done.
The Dancers (Bear+Woman)
Art imitates life (“Based on May 2024 surveys, approximately 31% to 37% of women in the US and UK indicated they would prefer to be alone in the woods with a bear over a strange man, with higher rates among younger women (up to 53% for 18-29 year olds in the UK).”):
Conclusion
The whole trip cost about $8,000 including all of the extras, such as Internet and a couple of shore excursions, but no drinks package and only a few extra-cost drinks. We could have done it for less if we’d booked farther ahead or chosen a more basic room. It worked out to $800 per day for great scenery, fun entertainment, more food than I should have eaten, and an introduction to five islands, three of which were entirely new to me; the other two I hadn’t visited for more than 20 years.
I will remember the warmth of the Celebrity crew. Everyone seemed genuinely interested in welcoming and taking care of us.
[For the cruise haters: We could have flown to a Caribbean island from FLL, stayed in a hotel, picked restaurants, and flown back for about the same price (or 50 percent more for the same level of luxury?). The boat ride itself has value to me, however. I love to be on deck when arriving or departing. It’s a different kind of understanding of how the Caribbean is put together geographically and culturally than one might get from being airdropped by Airbus A320.]
I’m wondering if LGBTQIA is the most popular social justice cause because there is no obvious connection between saying one is passionate about supporting LGBTQIA and having to donate money. If someone says “I care about the poor” and then buys a Tesla instead of a Honda Accord, a friend might ask “Why didn’t you give $70,000 to the poor and drive a Honda rather than your fancy Tesla?”
In the intervening seven years, Teslas have gotten cheaper so perhaps a better analogy is to the $10,000 European vacation. Certainly over those seven years, though, heterosexual Rainbow Flagism, i.e., the public celebration of homosexuality, transgenderism, etc. by white cisgender heterosexuals, has become even more common. Nominally Christian churches in majority-Democrat parts of the U.S. will usually have a rainbow flag even when their entire congregation is hetero. None of these churches have flags or signs out front saying “help the poor by giving them half of your income since you already live at least 2X as well as Americans in the 1950s”.
I wonder if we’re seeing the same phenomenon right now with respect to people indignantly condemning Jeffrey Epstein. Hardly any of these folks were important enough to be included in the Epstein Files and, therefore, they can’t have been personally offended by Epstein’s conduct. If we assume that Jeffrey Epstein’s 2008 guilty plea for “soliciting prostitution” wasn’t coerced by the threat of decades in prison, what has been established is that he had sex with a paid 16-year-old female roughly 20 years ago. (He is often referred to by the Two Minutes Haters as a “rapist” or “pedophile”, but he was never charged with rape, much less convicted of rape, and having sex with a 16-year-old is not an example of pedophilia as the term was previously used in English.)
The Two Minutes Haters often live within a short drive of places where teenage prostitution is going on right now. I’m in a politically diverse Facebook group with a Big Law partner (the average partner at his firm earns about $1.5 million/year in post-Biden dollars) who posts daily about his sympathy for the Epstein victims. He lives in an elite neighborhood of Greater Los Angeles that is within a one-hour drive of the street that the New York Times recently identified as having a high density of juvenile prostitutes:
Ana paced on the sidewalk at 68th and Figueroa, her front teeth missing and an ostomy bag taped down under her hot pink lingerie. … Ana was 19, but the girls on the street reminded her of herself and her sister when they were first put out on Figueroa for sex. She had been 13. Her sister, 11. … Ana had grown accustomed to the protocols of the Blade, a roughly 50-block stretch of Figueroa Street that had become one of the most notorious sex-trafficking corridors in the United States. … The Blade was an eight-minute drive from the University of Southern California … The younger the girl, the more customers would pay, which meant preteens were often being robbed and assaulted by groups of older girls trying to make quota.
My response to one of his recent demands for more action regarding the Epstein Files. I’ll call him “Hassan” since that’s the first name of one of the Somalis charged in the Minnesota welfare state frauds and we are informed that Somalis built the United States:
There are teenage prostitutes whom you can rescue tonight, Hassan. The New York Times tells you exactly where in Los Angeles to find them. I will send you $1,000, Hassan, if you will go to Figueroa St. tonight and take home one of the under-18 prostitutes and keep her safe in your guest rooms for at least one month. This is to offset the cost of her food (I know that it should be a lot less than $1,000 to feed a skinny kid, but I see you more as an Uber Eats kind of person than a competent chef).
His only response to this offer was to take issue with my criticism, admittedly unsupported, of his cooking abilities. A few hours later he was posting the assertion “Donald Trump was actively recruiting girls for Epstein”, apparently made by someone the FBI interviewed in 2021.
The easiest way for authorities to bring the criminal justice system down on men who have sex with “underage” women is to prosecute them for statutory rape. Then a jury doesn’t need to sort out whether providing a female with housing, private jet rides, clothing, jewelry, spending money, etc. is prostitution or just legal “dating” in our debauched country. The people who engage in the Two Minutes Hate regarding Epstein often live in states where sex with a 16-year-old might be perfectly legal because that’s the age of consent. Here’s my offer to one of the haters who sent me a stream of enraged X messages regarding my failure to unequivocally condemn Jeffrey Epstein:
If you want to work together to lobby the Massachusetts legislature to raise the age of consent in MA to 18, I will be happy to help with that.
He did not respond to this offer.
What we’re seeing is a huge group of Americans who don the mantle of righteous protectors of teenage females because they posted on social media about Jeffrey Epstein’s victims, all of whom are now adults and nearly all of whom have been or are being paid $millions, i.e., more than the typical American female will earn in a lifetime of W-2 slavery. See “Jeffrey Epstein victims program shutting down with $121 million paid to abuse survivors” (ABC, 2021):
The Epstein Victims’ Compensation Program, since launching last year, received approximately 225 applications from alleged victims from the United States and abroad. That astounding number was more than double the initial expectations of the fund’s administrator and advocates for the victims. … Of the 150 applicants deemed eligible for compensation, more than 92% accepted the offers. The awards were paid by the executors of Epstein’s estate, but the claims were evaluated independently “free from any interference or control” by the estate, said Jordana Feldman, the program’s designer and administrator.
(75 out of 225 applicants were considered to be obvious liars by the administrators and, therefore, not eligible?)
The lawsuit is one of several targeting banks who serviced Epstein’s financial dealings for years, even after it emerged in 2006 that Epstein was using his wealth to exploit minors and young women. Last month, Deutsche Bank agreed to a $75 million settlement in a similar case.
News of the JPMorgan settlement emerged on Monday, the same day a federal judge granted class-action status to the lawsuit, saying the number of plaintiffs involved could be “well over 100 people.”
Let’s assume it is the same 150 people who will also share the JPMorgan and Deutsche Bank cash. That’s $486 million divided by 150 = $3.24 million on top of whatever Jeffrey Epstein paid them while they were spending time with him. A reasonable person could certainly argue that $3.24 million tax-free isn’t sufficient compensation but it is unarguably far more than the typical teen prostitute receives. Thus my confusion regarding why it’s considered righteous to dwell on Jeffrey Epstein’s victims of 20 years ago rather than try to rescue today’s teen prostitutes.
Maybe the best response is to offer anyone piling on condemnation of Epstein four options: (a) spend time and money rescuing today’s young female prostitutes, (b) spend time and money lobbying to raise the age of consent in every U.S. state and every country where it isn’t at least 18 (you could also argue for 27, at least for women, given recent outraged stories about Epstein and 26-year-old females (e.g., NYT)), (c) spend time and money on imposing alcohol prohibition within the U.S. (my standard proposal for improving a lot of stuff, including sexual behavior), or (d) spend money on a new fund that will give people who say that they were Epstein victims additional money. Perhaps the new fund could find the 75 women who were denied any compensation by the Epstein Victims’ Compensation Program and give each of them $1 million?
Related:
Fox story about Wendy Stephens, just 14 years old when she was murdered by the Green River Killer; Wendy Stephens ran away from home in Colorado and found her way to Seattle somehow. Not only did Wendy Stephens not get paid $millions, but she didn’t get to enjoy her remaining 60+ years of life expectancy. Had the citizens of Seattle shut down their teen prostitution corridor, which is where Gary Ridgway picked up nearly all of his victims, Wendy Stephens would likely be with us today. (The arch-criminal, according to Wokipedia: “Ridgway, who is dyslexic, was held back a year in high school and exhibited an IQ recorded as being in the ‘low eighties.'”)
“LA County reaches $828 million abuse claims settlement on top of previous $4 billion accord” (ABC, October 2025), a story that interested almost nobody either inside or outside of LA: “Los Angeles County announced a tentative $828 million settlement with more than 400 plaintiffs who claim they were the victims of childhood sexual abuse at the hands of county workers. The settlement is on top of an earlier $4 billion settlement reached on behalf of roughly 11,000 claimants.”
Would it be possible to use AI to determine whether, as is commonly asserted these days, Jeffrey Epstein was a rapist of 14-year-old girls?
Screenshot
ChatGPT says that it is fair to characterize Jeffrey Epstein as a “pedophile” because at least some of the women were under age 18. When asked about Mohammed and his marriage to Aisha, on the other hand, it turns out that the Prophet was not a “pedophile”:
Traditional Islamic hadith literature — especially reports in collections such as Sahih al-Bukhari and Sahih Muslim — state that:
Muhammad married Aisha when she was six.
The marriage was consummated when she was nine.
These reports have been accepted as authentic by the majority of classical Sunni scholars for over a millennium.
…
4) Comparison with Jeffrey Epstein
The comparison is difficult because the contexts differ on multiple axes:
| Dimension | Muhammad | Epstein |
| --- | --- | --- |
| Time period | 7th century | 20th–21st century |
| Cultural norms | Early marriage common | Sex with minors illegal & stigmatized |
| Evidence type | Religious texts & historical reports | Court records, testimony, indictments |
| Pattern alleged | Specific marriages | Large network of abuse |
| Legal framework | Pre-modern tribal law | Modern criminal law |
Because of this:
Epstein’s conduct was criminal under the laws and norms of his own society.
Muhammad’s marriage practices reflected norms widely accepted in his historical setting.
This doesn’t make moral evaluation impossible — but it makes clinical labeling across contexts methodologically unsound.
Does anyone have a good idea for activities? I went to the city’s official events page and just one event is listed for February 20:
OUTsider is an Austin-based transmedia nonprofit that celebrates the bold originality and creative nonconformity of the LGBTQ+ communities through the presentation of provocative, overlooked and out-of-the-box film, dance, theater, performance art, music, writing and visual art. Through its annual festival and conference, OUTsider unites queer artists, audiences and scholars from around the globe to exchange ideas, ignite conversations, transcend boundaries and experience new pleasures through artistic discovery.
If Feb 20 doesn’t work, how about Feb 24 in Vegas? Please email philg@mit.edu!
As noted in previous posts, the web chat interface can be a cumbersome method of interacting with an LLM for the purpose of web development tasks such as adding a CSS file to an HTML page to make it render nicely on smartphones.
What if we try artificial intelligence as a companion to a programmer’s integrated development environment (IDE)? Google’s $2.4 billion baby, Antigravity, is the contender for this post, which is a collaboration between John Morgan and Philip Greenspun.
Installing Antigravity immediately presents the new user with a challenging question: Review-driven development vs. Agent-driven development. There is no explanation on the setup screen as to what the difference between these two might be:
Antigravity’s default user interface presents the developer with three panes: a file explorer on the left, a code editor in the middle, and an LLM chat interface on the right. We start this experiment by opening a folder we’ve created, named bh-antigravity-gemini, that contains the HTML for the decidedly archaic Berkshire Hathaway homepage, and then writing a prompt asking Antigravity to generate CSS and edit the page’s HTML to make it responsive:
Antigravity offers a variety of models to choose from, but for the purposes of this experiment we’ll use the default, Gemini 3 Pro (High), in Planning mode. Planning mode causes Gemini to come up with plans before implementing them. After submitting the prompt, Gemini begins to think through a plan:
As Gemini works to develop a plan, Antigravity presents a Task document showing the steps to be taken:
Once Gemini has come up with its plan, Antigravity presents us with an Implementation Plan document outlining the proposed changes as well as a verification plan:
Since the plan sounds sensible, we instruct Gemini to implement it:
Once Gemini completes the task, Antigravity presents us with a Walkthrough document giving an overview of the results:
Below the message announcing the completion of the task, Antigravity presents us with a list of files that have been changed, including counts of lines added and removed. Just below the list is a button to “Accept all” changes. We want to inspect the changes for ourselves first, so we select the berkshire-hathaway.htm file to see what was added. Sure enough, Gemini has kept the changes to a minimum as requested and simply inserted a viewport <meta> tag and a <link> to incorporate the newly created stylesheet:
We accept the changes in the HTML and move on to examine the newly created style.css file. The CSS includes comments indicating that Gemini has tailored it to match the styling present in the original HTML while implementing the proposed changes to make the page responsive:
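The flavor of the generated rules, as an illustrative sketch only (not Gemini’s exact output; the breakpoint and selectors are placeholders):

/* illustrative: below a phone-sized breakpoint, stack the page into one column */
@media (max-width: 600px) {
  table, tr, td { display: block; width: 100%; }  /* linearize the link table */
  a { display: block; padding: 12px 0; }          /* finger-sized touch targets */
}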
Satisfied with its suggestions, we accept the changes and proceed to view the results. The images below are the original page as viewed on desktop, followed by the new page as viewed on desktop. As promised, Gemini has kept the desktop styling largely the same, with the primary difference being greater spacing between links:
The images below are the original page as viewed on mobile followed by the new page as viewed on mobile. The new CSS has transformed the page such that it is no longer rendered as it would be on desktop but instead presents the links as a single column with text that is readable and easy to select without needing to zoom.
The quality of Gemini’s code generation in this experiment appears comparable to what we saw from it in our previous comparison of ChatGPT, Claude, Gemini, and Grok. This is not surprising as Antigravity is merely another interface for a developer to interact with an LLM such as Gemini.
Our conclusion is that having AI integrated tightly into the IDE saves time and makes for a more organized, less fragmented experience: the chat log is presented and persisted alongside the code, and the model can be quickly directed to analyze files within the codebase without the need to paste them into a chat window or upload them via a web interface. We also find that the IDE’s ability to highlight proposed changes, let us accept or reject them, and edit our files in place achieves a level of integration into the development workflow that a web chat interface cannot match.