
Atoms

🔗

This article is an engaging read. I tend to bounce back and forth between natural-language and GUI- or CLI-based modalities, though in the last few years I have leaned towards natural language because of LLM usage. 1

The article argues (somewhat reasonably) that more structured, inline GUI elements are preferable and reduce interaction latency. I do get frustrated when I have to copy-paste text in an LLM web interface, or am only offered “pick one out of a numbered list” as an option. An additional modality I would like to see in LLMs is a way to refer back to chunks of text or output, something like pilcrows.


  1. I remember constructing natural-language-like interfaces of my own the hard way, and giving up after about 10 or so scripts. I don’t think I could have imagined then that I could trivially use natural language commands in under a decade! ↩︎


🔗

When S3 Tables came out in December 2024, I was a little confused about what exactly the value proposition was over hosting your lake-style files in S3. Andy Warfield answers that question here:

What compaction is doing is, like I said, you’ve got one giant Parquet file, possibly as an initial table. Over time, you’re adding additional Parquet files. Each one of those adds a bunch of metadata files, fragmenting your data like crazy. The simple task of compaction is to take all of those changes, throw away the stuff that was deleted, keep the stuff that’s alive, and fold it into a single or a small number of very large files. This allows you to get back to doing large reads of just the columns of the database that you care about, maximizing the utilization of your request path. That gets you huge performance and the most usable bytes read per bytes used.

The challenge is that the way the customer workload updates the data in the table completely changes the complexity of compaction from workload to workload. A read-only database, like a table that has never changed, obviously doesn’t need compaction—it just sits as it is. The one exception is that you might decide to restructure that table in the background over time if you notice that queries are accessing it in a way that the table is not well laid out for.
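
As a rough illustration of the compaction step described in the quote, here is a toy in-memory sketch in Python. It is not how S3 Tables or any real table format implements compaction: `DataFile`, `compact`, and the keyed-row model are hypothetical names and simplifications chosen only to show the "drop what was deleted, keep what is live, fold into one file" idea.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """Hypothetical stand-in for one immutable data file in a table."""
    rows: dict                                  # primary key -> row written by this commit
    deleted: set = field(default_factory=set)   # keys this commit deleted from older files


def compact(files):
    """Fold a list of files (oldest first) into one file holding only live rows."""
    live = {}
    for f in files:
        for key in f.deleted:   # throw away rows deleted by this commit
            live.pop(key, None)
        live.update(f.rows)     # newer versions of a row replace older ones
    # One large file, sorted by key, with no outstanding deletes.
    return DataFile(rows=dict(sorted(live.items())))


history = [
    DataFile(rows={1: "a", 2: "b", 3: "c"}),   # initial large file
    DataFile(rows={4: "d"}, deleted={2}),      # small append plus a delete
    DataFile(rows={1: "a2"}),                  # small update
]
compacted = compact(history)
print(compacted.rows)  # {1: 'a2', 3: 'c', 4: 'd'}
```

Even in this toy version, the cost depends on the write pattern: a table that never changes produces a single file and nothing to fold, while a stream of small appends, updates, and deletes leaves many small files that all have to be read and rewritten.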