Thinking: The model makers

Some of the most interesting responses I saw to Ted Chiang's piece:

Mat Dryhurst tweet:

The easy unlock re the cyclical AI art culture war that will seem obvious in retrospect is that the people creating models are artists

The same rules apply as in Chiang's piece. Some intervene in a deliberate manner in every stage of the process, some make rote and inelegant choices

And this thread from Tim Hwang, again emphasizing model making:

for chiang, good art is about choices, lots of them

if choice is art, pretraining a model is absolutely an art, and fine-tuning nearly always is too

model architectures, dataset selection, evaluation structures are all deeply matters of curation, taste, and aesthetics

like many people, chiang gets lost in the “push button, get art” implementations we see in the market

he blunders here

chiang mistakes these apps as reflecting something inherent in language models as a technology, rather than a design choice on the part of the designer

this is why the analogy connecting LLMs as an art form to gaming as an art form is so important

game design is an artistic act, but whether or not the interactivity that game allows is artistically worthwhile is separate question, defined by choice on the part of the designer

These were fun. I like to imagine a world of weird models put forward as art objects to explore.

Does it seem like this is where things are headed though?

I haven't been thinking this way, I think because I've mainly been focused on using models in a programming context, where I'm looking for capabilities to fit into apps.

And the narrative so far has just been scaling things up further and further, so that all kinds of responses are available within a model and you just need to prompt your way into them.

I'm trying to think of things where the model's 'personality' or at least 'angle' is interesting. I've seen people speak about Claude that way, but even if there are edges where you can see the personality there, it still seems to me that it's in such a capability race that I can't see real aesthetic-based curation happening. Like, I can't imagine them choosing to leave out certain data because it doesn't fit the vibe if the data would contribute to performance.

Or, like, Midjourney has/had an aesthetic, but it feels like their goal is/will be 'anything you want to create, in any style'. That just seems to be the direction these things push.

Maybe the variety shakes out more after the big capability race settles a bit. Maybe synthetic data gets all capabilities to a point where more curation is possible and wanted.

And the idea of curating a dataset to define a model does have some nice analogies - it feels like breeding grape varieties for wine or something, where you're intelligent about adding just a hint of this edgy data to balance out a mostly sophisticated set - and then a tasting with notes like 'a hint of this subreddit'.

But in that example, what is it you actually do with the model? Maaaybe it's a chatbot? No, I think if it's language, then better that it's some sort of interactive storytelling - in which case the big thing it would need to be is distilled, so that you could see its value immediately, because the potential for slop is so high...

So it's fun to think about but gets very fuzzy.

To pin something for another day: part of the question is how to distinguish a subtle, artist-trained model/work from just a huge model prompted into that space. Part of it may be that you couldn't jailbreak out of the artist model? In that way maybe it is like game design...