What Was Actually Achieved By LLMs Building A C Compiler

Ars Technica has put into perspective what it means that LLMs “created” a C compiler by themselves. My favourite quotes:

It’s worth noting that a C compiler is a near-ideal task for semi-autonomous AI model coding: The specification is decades old and well-defined, comprehensive test suites already exist, and there’s a known good reference compiler to check against. Most real-world software projects have none of these advantages. The hard part of most development isn’t writing code that passes tests; it’s figuring out what the tests should be in the first place.

[…] Even with all optimizations enabled, it produces less-efficient code than GCC running with all optimizations disabled. […]

Anthropic describes the compiler as a “clean-room implementation” because the agents had no Internet access during development. But that framing is somewhat misleading. The underlying model was trained on enormous quantities of publicly available source code, almost certainly including GCC, Clang, and numerous smaller C compilers. In traditional software development, “clean room” specifically means the implementers have never seen the original code. By that standard, this isn’t one. […]

“It was rather a brute force attempt to decompress fuzzily stored knowledge contained within the network.”

None of this should obscure what the project actually demonstrates. A year ago, no language model could have produced anything close to a functional multi-architecture compiler, even with this kind of babysitting and an unlimited budget. The methodology of parallel agents coordinating through Git with minimal human supervision is novel, and the engineering tricks Carlini developed to keep the agents productive (context-aware test output, time-boxing, the GCC oracle for parallelization) could potentially represent useful contributions to the wider use of agentic software development tools.

Those ones were the expensive headcount anyway

Ars Technica reports on a study that measured the productivity of software developers across different open source projects performing a variety of tasks (including non-coding ones).

In the comments, there’s a snarky summary of the article’s main point:

“These factors lead the researchers to conclude that current AI coding tools may be particularly ill-suited to “settings with very high quality standards, or with many implicit requirements (e.g., relating to documentation, testing coverage, or linting/formatting) that take humans substantial time to learn.” While those factors may not apply in “many realistic, economically relevant settings” involving simpler code bases, they could limit the impact of AI tools in this study and similar real-world situations.”

So as long as I cull the experienced people and commit to lousy software the glorious Age of AI will deliver productivity gains? Awesome, those ones were the expensive headcount!

Moral parents, moral babies

Ars again covers interesting research on the psychology of toddlers. This time: toddlers whose parents have a lower tolerance for injustice show stronger differences in EEG readings when watching prosocial vs. antisocial behavior.

The article also discusses how difficult it is to do a “psychological” assessment of toddlers’ behavior and to derive concrete explanations or conclusions from it.