2026-04-03
I was curious about my overall time investment in Hadron so in February I installed some time tracking software on my laptop.
I acknowledge there's some grind culture here for what is an unpaid and obscure personal project. The tracking software is quite invasive because it has to track what you're doing in each application in order to automatically distinguish work from play. I didn't want to punch a clock when working on Hadron, but that means I have consented to let a company collect data about every activity I perform on my computer.
According to the tracking software, in March I worked on Hadron for a bit over 42 hours. Most of that work happened earlier in the month; I've been struggling with motivation more recently. In studying the tracking data, I find that I could probably use some support on focus and attention. Thankfully, I'm going through a diagnostic process for ADHD and Autism right now, so there might be something to learn there that could support me.
I've started working on the LSP features in Hadron, and although I haven't made significant progress in terms of committed code, I've learned a lot about asynchronous Rust, and written and discarded a number of prototypes.
LSP support, specifically the low-latency interactive language features, caused me to revisit a lot of design decisions in Hadron. Most of my thinking was around designs that might help to elegantly reduce the amount of recomputation that must occur when analyzing incremental changes to a file.
For example, if the user is typing a few characters to extend the name of a variable, say from var a to var abc, somewhere in the middle of a few-thousand-line file, how can we efficiently determine what needs to be recomputed? Clearly we shouldn't need to lex and parse the entire file, but are there simple heuristics that could limit the scope of the repeated work?
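One simple heuristic can be sketched in a few lines of standalone Rust (this is illustrative only, not Hadron's actual implementation): compare the old and new text to find the longest common prefix and suffix, and treat only the bytes in between as "dirty." To a first approximation, only tokens overlapping that range need re-lexing.

```rust
/// Given the old and new contents of a file, find the smallest byte range
/// of the new text that differs from the old. Everything outside this range
/// is unchanged, so only tokens overlapping it need re-lexing.
/// A sketch of the heuristic, not Hadron's implementation.
fn dirty_range(old: &str, new: &str) -> (usize, usize) {
    let old_b = old.as_bytes();
    let new_b = new.as_bytes();
    // Longest common prefix.
    let prefix = old_b.iter().zip(new_b).take_while(|(a, b)| a == b).count();
    // Longest common suffix, capped so it can't overlap the prefix.
    let max_suffix = old_b.len().min(new_b.len()) - prefix;
    let suffix = old_b
        .iter()
        .rev()
        .zip(new_b.iter().rev())
        .take(max_suffix)
        .take_while(|(a, b)| a == b)
        .count();
    (prefix, new_b.len() - suffix)
}

fn main() {
    // Extending `var a` to `var abc` mid-file dirties only two bytes.
    let old = "x = 1;\nvar a = 2;\ny = 3;\n";
    let new = "x = 1;\nvar abc = 2;\ny = 3;\n";
    let (start, end) = dirty_range(old, new);
    assert_eq!(&new[start..end], "bc");
    println!("dirty bytes {start}..{end}: {:?}", &new[start..end]);
}
```

This is only a bound on the changed text, of course; an edit can still have non-local semantic effects (renaming a variable changes every use site), which is where the harder design questions come in.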
I looked at rust-analyzer, which uses an interesting compute-cache library called salsa. Caching is interesting to consider from an architectural perspective, because salsa requires a bit of formalism about the inputs and outputs of functions in order to leverage the library effectively. I didn't want such a heavyweight dependency, and it feels like overkill for my needs, but it was fun to read about.
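To illustrate the core idea rather than salsa's actual API, here's a toy sketch in plain Rust: derived queries are pure functions of declared inputs, so their results can be cached and reused until an input they depend on changes. All the names here (Db, token_count, and so on) are hypothetical.

```rust
use std::collections::HashMap;

/// A toy illustration of the salsa idea, not salsa's API: queries are pure
/// functions of declared inputs, so results can be cached and reused until
/// an input changes.
struct Db {
    /// Input: file contents, bumped to a new revision on every edit.
    source: String,
    revision: u64,
    /// Cache of query name -> (revision computed at, result).
    cache: HashMap<&'static str, (u64, usize)>,
}

impl Db {
    fn set_source(&mut self, text: &str) {
        self.source = text.to_string();
        self.revision += 1; // invalidates everything derived from `source`
    }

    /// Derived query: token count. Recomputed only when `source` changed.
    fn token_count(&mut self) -> usize {
        if let Some(&(rev, n)) = self.cache.get("token_count") {
            if rev == self.revision {
                return n; // cache hit: no recomputation
            }
        }
        let n = self.source.split_whitespace().count(); // stand-in for lexing
        self.cache.insert("token_count", (self.revision, n));
        n
    }
}

fn main() {
    let mut db = Db { source: String::new(), revision: 0, cache: HashMap::new() };
    db.set_source("var a = 2;");
    assert_eq!(db.token_count(), 4);
    assert_eq!(db.token_count(), 4); // second call served from cache
    db.set_source("var abc = 2;");
    assert_eq!(db.token_count(), 4); // recomputed after the edit
}
```

The formalism salsa asks for is essentially this: every derived computation must declare what inputs it reads, so the framework can track revisions and invalidate for you instead of requiring a hand-written cache like the one above.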
I had an instructive conversation with Scott Carver, the author of the current vscode-supercollider LSP plugin. I was surprised to learn that even with the LSP running in sclang Scott hasn't had any performance issues, even when working on larger-scale projects.
SuperCollider projects aren't nearly as complex as a large Rust project, so hearing Scott's experience with the current LSP's performance made me feel that I've been doing some premature optimization. It's time to get some baselines and benchmarks in place, to establish norms for functionality and performance.
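As a starting point, a crude baseline needs nothing more than std::time::Instant; a minimal sketch, where lex is a hypothetical stand-in for Hadron's actual lexer, might look like this:

```rust
use std::time::Instant;

// Hypothetical stand-in for Hadron's lexer; a real baseline would
// call the actual lexer over representative corpus files.
fn lex(input: &str) -> usize {
    input.split_whitespace().count()
}

fn main() {
    // Time repeated runs over a synthetic few-thousand-line file to get a
    // rough per-run figure before reaching for a real benchmark harness.
    let file: String = "var abc = 2;\n".repeat(5000);
    let runs: u32 = 100;
    let start = Instant::now();
    let mut tokens = 0;
    for _ in 0..runs {
        tokens = lex(&file);
    }
    let elapsed = start.elapsed();
    println!("{} tokens, {:?} average per run", tokens, elapsed / runs);
}
```

A proper harness (criterion, for example) would handle warm-up and statistical noise, but even a loop like this is enough to establish whether latency is anywhere near interactive thresholds.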
And then I just kinda... ran out of gas. I'm recovering from major surgery, still dealing with some pain and fatigue, and it finally caught up with me in the latter half of March. So I'm taking the time I need to work on getting better. Hopefully towards the latter half of April I'll feel well enough to start tackling some of the Hadron work again.
2026-03-08
With love to Squarepusher.
Please send me links to public repositories containing SuperCollider code. Get in touch if you'd like to share more private source code with me, if you are concerned I might have your source code and you'd like me to remove it, or if you'd like to volunteer to help curate a large collection of SuperCollider source code.
I've been working in a vacuum on Hadron for a few years now while building many of the fundamentals
required for an interpreter, such as a lexer, parser, and "semantic analyzer," a concept borrowed
from LLVM's C++ compiler clang.
However, as I'm starting to work on a code formatter, and building up some support code for the language server, it's become readily apparent that I need to build a database of as much extant SuperCollider source code as possible. The more diverse styles and idioms of SuperCollider usage I can capture, the better.
Building a sample source code corpus affords a number of advantages for Hadron, and possibly some benefit for the SuperCollider project in general.
With that in mind, I've started a collection of publicly available SuperCollider source code. I made a corpus repository in the Hadron organization on Codeberg, but I've kept it private to members of the Hadron organization. This avoids making it an obvious target for AI scrapers, and it also allows for the possibility that some folks might be willing to share private code with the corpus, so I wanted those access controls in place from the start.
I've already gathered around a million and a half lines of code from three sources. One is the Quarks directory, whose directory.txt file contains repository links for each quark; I wrote a Python script to add each one as a submodule to the corpus. I still need to comb through the Awesome SuperCollider lists, and I think some folks are moving to Codeberg, so it's worth a search there too. GitHub could also use another pass.
I had a queasy moment when trawling through GitHub for SuperCollider code. I was getting a little close to the line for my comfort, in terms of starting to resemble Large Language Model scraper behaviours. So far, I have taken the opt-out approach to publicly available SuperCollider source code. Meaning, I have taken the fact that the author has posted their code on GitHub as implied consent for inclusion in the corpus. This seems to me to be much less of a stretch than implied consent for inclusion in the training set of a for-profit generative language model. But it still feels like a stretch.
I'm going to raise awareness about the corpus by circulating this blog post in a few spots known to be popular with SuperCollider users, in the hopes of both generating additional submissions of source code and gathering feedback about the project.
These are, in some ways, "complicated" ethical times. "Complicated" is a polite way of saying that as AI eats my industry I've seen a bunch of folks I respected seem to lose sight of some of the basics in terms of how harmful this technology is, and how dangerous. It makes me wonder about my own ethical sensibilities, and I find myself worrying and questioning when my plans start to resemble those of the robots.
As always, I'd love to hear your thoughts.