@chaos

chaos@beehaw.org · 21 days ago

This, and see also “minmaxing,” the process of optimizing something (usually your character in a game) to get minimum penalty and/or maximum benefit, usually ignoring anything like realism or storytelling and focusing entirely on the stats and numbers.

chaos@beehaw.org · 22 days ago

The .bin and .cue file are the parts of the actual game disc that you want. The .bin file contains almost all of the data and the .cue file contains some extra information about the structure of the CD. All the rest is Internet Archive stuff (and an image of the game cover of course).

To open it, you can convert it to a .iso disk image instead, which any Linux distribution can open as if it were a real CD. This blog post talks about how to do that. The last paragraph about mount you can probably replace with double-clicking the .iso file in the GUI I would guess.

chaos@beehaw.org · 1 month ago

I know TiddlyWiki quite well but have only poked at Logseq, so maybe it’s more similar to this than I think, but TiddlyWiki is almost entirely implemented in itself. There’s a very small core that’s JavaScript but most of it is implemented as wiki objects (they call them “tiddlers,” yes, really) and almost everything you interact with can be tweaked, overridden, or imitated. There’s almost nothing that “the system” can do but you can’t. It’s idiosyncratic, kind of its own little universe to be learned and concepts to be understood, but if you do it’s insanely flexible.

Dig deep enough, and you’ll discover that it’s not a weird little wiki — it’s a tiny, self-contained object database and web frontend framework that they have used to make a weird little wiki, but you can use it for pretty much anything else you want, either on top of the wiki or tearing it down to build your own thing. I’ve used it to make a prediction tracker for a podcast I follow, I’ve made my own todo list app in it, and I made a Super Bowl prop bet game for friends to play that used to be spreadsheet-based. For me, it’s the perfect “I just want to knock something together as a simple web app” tool.

And it has the fun party trick (this used to be the whole point of it but I’d argue it has moved beyond this now) that your entire wiki can be exported to a single HTML file that contains the entire fully functional app, even allowing people to make their own edits and save a new copy of the HTML file with new contents. If running a small web server isn’t an issue, that’s the easiest way to do it because saving is automatic and everything is centralized, otherwise you need to jump through some hoops to get your web browser to allow writing to the HTML file on disk or just save new copies every time.

chaos@beehaw.org · edit-2 1 month ago

If you run the Node.js version, that’s all handled for you. It’s only if you want to do the party trick of keeping it all in a single HTML file that you need to worry about a plugin or anything like that. And even then, the server version exports to a standalone HTML file with one or two clicks.

Edit to add: it’s the only substantial Node package I’ve ever seen with zero dependencies. Very lightweight and simple to run.

chaos@beehaw.org · 1 month ago

No. The headsets are disabled when the play starts or when the play clock goes below 15 seconds.

chaos@beehaw.org · 4 months ago

OPML files really aren’t much more than a list of the feeds you’re subscribed to. Individual posts or articles aren’t in there. I would expect that importing a second OPML file would just add more subscriptions, but it’d be up to the reader app to decide what it does.

chaos@beehaw.org · 8 months ago

If you ask an LLM to help you with a legal brief, it’ll come up with a bunch of stuff for you, and some of it might even be right. But it’ll very likely do things like make up a case that doesn’t exist, or misrepresent a real case, and as has happened multiple times now, if you submit that work to a judge without a real lawyer checking it first, you’re going to have a bad time.

There’s a reason LLMs make stuff up like that, and it’s because they have been very, very narrowly trained when compared to a human. The training process is almost entirely getting good at predicting what words follow what other words, but humans get that and so much more. Babies aren’t just associating the sounds they hear, they’re also associating the things they see, the things they feel, and the signals their body is sending them. Babies are highly motivated to learn and predict the behavior of the humans around them, and as they get older and more advanced, they get rewarded for creating accurate models of the mental state of others, mastering abstract concepts, and doing things like make art or sing songs. Their brains are many times bigger than even the biggest LLM, their initial state has been primed for success by millions of years of evolution, and the training set is every moment of human life.

LLMs aren’t nearly at that level. That’s not to say what they do isn’t impressive, because it really is. They can also synthesize unrelated concepts together in a stunningly human way, even things that they’ve never been trained on specifically. They’ve picked up a lot of surprising nuance just from the text they’ve been fed, and it’s convincing enough to think that something magical is going on. But ultimately, they’ve been optimized to predict words, and that’s what they’re good at, and although they’ve clearly developed some impressive skills to accomplish that task, it’s not even close to human level. They spit out a bunch of nonsense when what they should be saying is “I have no idea how to write a legal document, you need a lawyer for that”, but that would require them to have a sense of their own capabilities, a sense of what they know and why they know it and where it all came from, knowledge of the consequences of their actions and a desire to avoid causing harm, and they don’t have that. And how could they? Their training didn’t include any of that, it was mostly about words.

One of the reasons LLMs seem so impressive is that human words are a reflection of the rich inner life of the person you’re talking to. You say something to a person, and your ideas are broken down and manipulated in an abstract manner in their head, then turned back into words forming a response which they say back to you. LLMs are piggybacking off of that a bit, by getting good at mimicking language they are able to hide that their heads are relatively empty. Spitting out a statistically likely answer to the question “as an AI, do you want to take over the world?” is very different from considering the ideas, forming an opinion about them, and responding with that opinion. LLMs aren’t just doing statistics, but you don’t have to go too far down that spectrum before the answers start seeming thoughtful.

chaos@beehaw.org · 8 months ago

In its complaint, The New York Times alleges that because the AI tools have been trained on its content, they sometimes provide verbatim copies of sections of Times reports.

OpenAI said in its response Monday that so-called “regurgitation” is a “rare bug,” the occurrence of which it is working to reduce.

“We also expect our users to act responsibly; intentionally manipulating our models to regurgitate is not an appropriate use of our technology and is against our terms of use,” OpenAI said.

The tech company also accused The Times of “intentionally” manipulating ChatGPT or cherry-picking the copycat examples it detailed in its complaint.

https://www.cnn.com/2024/01/08/tech/openai-responds-new-york-times-copyright-lawsuit/index.html

The thing is, it doesn’t really matter if you have to “manipulate” ChatGPT into spitting out training material word-for-word, the fact that it’s possible at all is proof that, intentionally or not, that material has been encoded into the model itself. That might still be fair use, but it’s a lot weaker than the original argument, which was that nothing of the original material really remains after training, it’s all synthesized and blended with everything else to create something entirely new that doesn’t replicate the original.

chaos@beehaw.org · 1 year ago

These models aren’t great at tasks that require precision and analytical thinking. They’re trained on a fairly simple task, “if I give you some text, guess what the next bit of text is.” Sounds simple, but it’s incredibly powerful. Imagine if you could correctly guess the next bit of text for the sentence “The answer to the ultimate question of life, the universe, and everything is” or “The solution to the problems in the Middle East is”.

Recently, we’ve been seeing shockingly good results from models that do this task. They can synthesize unrelated subjects, and hold coherent conversations that sound very human. However, despite doing some things that up until recently only humans could do, they still aren’t at human-level intelligence. Humans read and write by taking in words, converting them into rich mental concepts, applying thoughts, feelings, and reasoning to them, and then converting the resulting concepts back into words to communicate with others. LLMs arguably might be doing some of this too, but they’re evaluated solely on words and therefore much more of their “thought process” is based on “what words are likely to come next” and not “is this concept being applied correctly” or “is this factual information”. Humans have much, much greater capacity than these models, and we live complex lives that act as an incredibly comprehensive training process. These models are small and trained very narrowly in comparison. Their excellent mimicry gives the illusion of a similarly rich inner life, but it’s mostly imitation.

All that comes down to the fact that these models aren’t great at complex reasoning and precise details. They’re just not trained for it. They got through “life” by picking plausible words and that’s mostly what they’ll continue to do. For writing a novel or poem, that’s good enough, but math and physics are more rigorous than that. They do seem to be able to handle code snippets now, mostly, which is progress, but in general this isn’t something that you can be completely confident in them doing correctly. They make silly mistakes because they aren’t really thinking it through. To them, there isn’t really much difference between answers like “that date is 7 days after Christmas” and “that date is 12 days after Christmas.” Which one it thinks is more correct is based on things it has seen, not necessarily an explicit counting process. You can also see this in things like that case where someone tried to use it to write a legal brief, where it came up with citations that seemed plausible but were in fact completely made up. It wasn’t trained on accurate citations, it was trained on words.

They also have a bad habit of sounding confident no matter what they’re saying, which makes it hard to use them for things you can’t check yourself. Anything they say could be right/accurate/good/not plagiarized, but the model won’t have a good sense of that, and if you don’t know either, you’re opening yourself up to risk of being misled.

chaos@beehaw.org · 1 year ago

Protect me from knowing what I don’t need to know. Protect me from even knowing that there are things to know that I don’t know. Protect me from knowing that I decided not to know about the things that I decided not to know about. Amen.

Lord, lord, lord. Protect me from the consequences of the above prayer. Amen.

chaos@beehaw.org · 1 year ago

That’s part of the point, you aren’t necessarily supposed to have an empty mind the whole time. I mean, if you can do that, great, but you aren’t failing if that’s not the case.

Imagine that your thoughts are buses, and your job is to sit at the bus stop and not get on any of them. Just notice them and let them go by. Like a bus stop, you don’t really control what comes by, but you do control which ones you get on board and follow. If you notice that you’ve gotten on a bus, that’s fine, just get off of it and go back to watching. Interesting things can happen if you just watch and notice which thoughts go by, and it’s good practice for noticing what you’re thinking and where you’re going and taking control of it yourself when it’s somewhere you don’t want to go.

chaos@beehaw.org · 1 year ago

I use TiddlyWiki for, well, a bunch of my projects, but primarily for my task management. You can use it as a single HTML file, which contains the entire wiki, your data, its own code, all of it, and of course use it in any browser you like. Saving changes is a bit of a pain until you find a browser extension or some other way of enabling more seamless editing than re-saving the edited wiki as another single HTML file, but there are many solutions to that as described on their site above.

The way I use it, which is more technical but also logistically simpler, is by running their very minimal Node.JS server which you can just visit and use in any browser which takes care of saving and syncing entirely.

The thing I like about TiddlyWiki is that although on its surface it’s a quirky little wiki with a fun party trick of fitting into an HTML file, what it actually is is a self-contained lightweight object database with a simple yet powerful query language and miniature front-end web development environment which they have used to implement a quirky little wiki. Each “article” is an object that is taggable and has key/value data, and “widgets” can be used in the text to edit and display that data, pulling from the “database” using filters. You can use it to make simple web apps for yourself and they come together very quickly once you know what you’re doing, and the entire thing is a demonstration of a complex web app that is also possible. The wiki’s implemented entirely using those same tools, and everything is open for you to tweak and edit to your liking.

I moved a Super Bowl guessing/fake gambling game that I run from a form and spreadsheet to a TiddlyWiki and now I can share an online dashboard that live updates for everyone and it was decently easy to make and works really well. With my task manager, I recently decided to add a feature where I can set an “agenda” value on any task, and they all show up in one place, so I could set it as “Boss” and then quickly see everything I wanted to bring up in our next 1 on 1 meeting. It took just a few minutes to add the text box to anything that gets tagged “Task” and then make another page that collected them all and displayed them in sections.

chaos@beehaw.org · 1 year ago

This is the key with all the machine learning stuff going on right now. The robot will create something, but none of them have a firm understanding of right, wrong, truth, lies, reality, or fiction. You have to be able to evaluate its output because you have no idea if the robot’s telling the truth or not at that moment. Images are pretty immune to this because everyone can evaluate a picture for correctness or realism, and even if it’s a misleading photorealistic image, well, we’ve already had Photoshops for a long time. With text, you always have to keep in mind that the robot might be low quality or outright wrong, and if you aren’t equipped to evaluate its answers for that, you shouldn’t be using it.

chaos@beehaw.org · 1 year ago

The doom and gloom predictions have always been about slow but inexorable changes in the climate. Not that suddenly a mega hurricane is going to rip Florida out of the ground and toss it into the ocean, but that weather is going to get worse and more extreme, that sea levels will rise, and more and more places will gradually become uninhabitable as conditions get worse. There won’t be single things that you can point to and say “that one was global warming”, it’s about trends that are harmful for us in the long term. If you eat a chocolate bar’s worth more calories than you burn every day, it sounds like doom and gloom to say you’ll gain 200 pounds if you don’t change anything, and you won’t be able to point to any one meal as something to be concerned about because that’s not really out of the ordinary for a day… but slowly and steadily, you’ll gain weight, and if nothing changes you will get there eventually.

And even though you aren’t owed dramatic destruction, and shouldn’t require it to believe the thousands of people who study this as their life’s work and all agree that things are dire and not getting better fast enough… you’ve literally just lived through the hottest twenty or so days in recorded history. Is that a coincidence, do you think?

chaos@beehaw.org · 1 year ago

I hope I don’t come across as too cynical about it :) It’s pretty amazing, and the things these things can do in, what, a few gigabytes of weights and a beefy GPU are many, many times better than I would’ve expected if you had outlined the approach for me 2 years ago. But there’s also a long history of GAI being just around the corner, and we do keep turning corners and making useful progress, but it’s always still a ways off after each leap. I remember some people thinking that chess was the pinnacle of human intelligence, requiring creativity and logic to succeed, and when computers blew past humans at chess, it became clear that no, that’s still impressive but you can get good at chess without really getting good at anything else.

It might be possible for an ML model to assemble itself into general intelligence based solely on being fed words like we’re doing, it does seem like the data going in contains enough to do that, but getting that last 10% is going to be hard, each percentage point much harder than the last, and it’s going to require more rigorous training to stop them from skating by with responses that merely come close when things get technical or precise. I’d expect that we need more breakthroughs in tools or techniques to close that gap.

It’s also important to remember that as humans, we’re inclined to read consciousness and intent into everything, which is why pretty much every pantheon of gods includes one for thunder and lightning. Chatbots sound human enough that they cross the threshold for peoples’ brains to start gliding over inaccuracies or strange thinking or phrasing, and we also unconsciously help our conversation partner by clarifying or rephrasing things if the other side doesn’t seem to be understanding. I suppose this is less true now that they’re giving longer responses and remaining coherent, but especially early on, the human was doing more work than they realized keeping the conversation on the rails, and once you started seeing that it removed a bit of the magic. Chatbots are holding their own better now but I think they still get more benefit of the doubt than we realize we’re giving them.

chaos@beehaw.org · 1 year ago

Thanks for that article, it was a very interesting read! I think we’re mostly agreeing about things :) This stood out to me from there as an encapsulation of the conversation:

I don’t think LLMs will approach consciousness until they have a complex cognitive system that requires an interface to be used from within – which in turn requires top-down feedback loops and a great deal more complexity than anything in GPT4. But I agree with Will’s general point: language prediction is sufficiently challenging that complex solutions are called for, and these involve complex cognitive stratagems that go far beyond anything well described as statistics.

“Statistics” is probably an insufficient term for what these things are doing, but it’s helpful to pull the conversation in that direction when a lay person using one of those things is likely to assume quite the opposite, that this really is a person in a computer with hopes and dreams. But I agree that it takes more than simply consulting a table to find the most likely next word to, to take an earlier example, write a haiku about Danny DeVito. That’s synthesizing two ideas together that (I would guess) the model was trained on individually. That’s very cool and deserving of admiration, and could lead to pretty incredible things. I’d expect that the task of predicting words, on its own, wouldn’t be stringent enough to force a model to develop “true” intelligence, whatever that means, to succeed during training, but I suppose we’ll find out, and probably sooner than we expect.

chaos@beehaw.org · 1 year ago

I would agree that we are also very complicated statistical models, there’s nothing magical going on in the human brain either, just physics which as far as we know is math that we could figure out eventually. It’s a massively huge order of magnitude leap in complexity from current machine learning models to human brains, but that’s not to say that the only way we’ll get true artificial intelligence is by accurately simulating a human brain, I’d guess that we’ll have something that’s unambiguously intelligent by any definition well before we’re capable of that. It’ll be a different approach from the human brain and may think and act in alien or unusual ways, but that can still count.

Where we are now, though, there’s really no reason to expect true intelligence to emerge from what we’re currently doing. It’s a bit like training a mouse to navigate a maze and then wondering whether maybe the mouse is now also capable of helping you navigate your cross-country road trip. “Well, you don’t know how it’s doing it, maybe it has acquired general navigation intelligence!” It can’t be disproven, I guess, but there’s no reason to think that it picked up any of those skills because it wasn’t trained to do any of that, and although it’s maybe a superintelligent mouse packing a ton of brainpower into a tiny little brain, all our experience with mice would indicate that their brains aren’t big enough or capable of that regardless of how much you trained them. Once we’ve bred, uh, mice with brains the size of a football, maybe, but not these tiny little mice.

chaos@beehaw.org · 1 year ago

Admittedly this isn’t my main area of expertise, but I have done some machine learning/training stuff myself, and the thing you quickly learn is that machine learning models are lazy, cheating bastards who will take any shortcut they can regardless of what you are trying to get them to do. They are forced to get good at what you train them on but that is all the “effort” they’ll put in, and if there’s something easy they can do to accomplish that task they’ll find it and use it. (Or, to be more precise and less anthropomorphizing, simpler and easier approaches will tend to be more successful than complex and fragile ones, so those are the ones that will shake out as the winners as long as they’re sufficient to get top scores at the task.)

There’s a probably apocryphal (but stuff exactly like this definitely happens) story of early machine learning where the military was trying to train a model to recognize friendly tanks versus enemy tanks, and they were getting fantastic results. They’d train on pictures of the tanks, get really good numbers on the training set, and they were also getting great numbers on the images that they had kept out of the training set, pictures that the model had never seen before. When they went to deploy it, however, the results were crap, worse than garbage. It turns out, the images for all the friendly tanks were taken on an overcast day, and all the images of enemy tanks were in bright sunlight. The model hadn’t learned anything about tanks at all, it had learned to identify the weather. That’s way easier and it was enough to get high scores in the training, so that’s what it settled on.

When humans approach the task of finishing a sentence, they read the words, turn them into abstract concepts in their minds, manipulate and react to those concepts, then put the resulting thoughts back into words that make sense after the previous words. There’s no reason to think a computer is incapable of the same thing, but we aren’t training them to do that. We’re training them on “what’s the next word going to be?” and that’s it. You can do that by developing intelligence and learning to turn thoughts into words, but if you’re just being graded on predicting one word at a time, you can get results that are nearly as good by just developing a mostly statistical model of likely words without any understanding of the underlying concepts. Training for true intelligence would almost certainly require a training process that the model can only succeed at by developing real thoughts and feelings and analytical skills, and we don’t have anything like that yet.

It is going to be hard to know when that line gets crossed, but we’re definitely not there yet. Text models, when put to the test with questions that require synthesizing abstract ideas together precisely, quickly fall short. They’ve got the gist of what’s going on, in the same way a programmer can get some stuff done by just searching for everything and copy-pasting what they find, but that approach doesn’t scale and if they never learn what they’re doing, they’ll get found out when confronted with something that requires actual understanding. Or, for these models, they’ll make something up that sounds right but definitely isn’t, because even the basic understanding of “is this a real thing or is it fake” is beyond them, they just “know” that those words are likely and that’s what got them through training.

chaos@beehaw.org · 1 year ago

It’s overhyped but there are real things happening that are legitimately impressive and cool. The image generation stuff is pretty incredible, and anyone can judge it for themselves because it makes pictures and to judge it, you can just look at and see if it looks real or if it has freaky hands or whatever. A lot of the hype is around the text stuff, and that’s where people are making some real leaps beyond what it actually is.

The thing to keep in mind is that these things, which are called “large language models”, are not magic and they aren’t intelligent, even if they appear to be. What they’re able to do is actually very similar to the autocorrect on your phone, where you type “I want to go to the” and the suggestions are 3 places you talk about going to a lot.

Broadly, they’re trained by feeding them a bit of text, seeing which word the model suggests as the next word, seeing what the next word actually was from the text you fed it, then tweaking the model a bit to make it more likely to give the right answer. This is an automated process, just dump in text and a program does the training, and it gets better and better at predicting words when you a) get better at the tweaking process, b) make the model bigger and more complicated and therefore able to adjust to more scenarios, and c) feed it more text. The model itself is big but not terribly complicated mathematically, it’s mostly lots and lots and lots of arithmetic in layers: the input text will be turned into numbers, layer 1 will be a series of “nodes” that each take those numbers and do multiplications and additions on them, layer 2 will do the same to whatever numbers come out of layer 1, and so on and so on until you get the final output which is the words the model is predicting to come next. The tweaks happen to the nodes and what values they’re using to transform the previous layer.

Nothing magical at all, and also nothing in there that would make you think “ah, yes, this will produce a conscious being if we do it enough”. It is designed to be sort of like how the brain works, with massively parallel connections between relatively simple neurons, but it’s only being trained on “what word should come next”, not anything about intelligence. If anything, it’ll get punished for being too original with its “thoughts” because those won’t match with the right answers. And while we don’t really know what consciousness is or where the lines are or how it works, we do know enough to be pretty skeptical that models of the size we are able to make now are capable of it.

But the thing is, we use text to communicate, and we imbue that text with our intelligence and ideas that reflect the rich inner world of our brains. By getting really, really, shockingly good at mimicking that, AIs also appear to have a rich inner world and get some people very excited that they’re talking to a computer with thoughts and feelings… but really, it’s just mimicry, and if you talk to an AI and interrogate it a bit, it’ll become clear that that’s the case. If you ask it “as an AI, do you want to take over the world?” it’s not pondering the question and giving a response, it’s spitting out the results of a bunch of arithmetic that was specifically shaped to produce words that are likely to come after that question. If it’s good, that should be a sensible answer to the question, but it’s not the result of an abstract thought process. It’s why if you keep asking an AI to generate more and more words, it goes completely off the rails and starts producing nonsense, because every unusual word it chooses knocks it further away from sensible words, and eventually it’s being asked to autocomplete gibberish and can only give back more gibberish.

You can also expose its lack of rational thinking skills by asking it mathematical questions. It’s trained on words, so it’ll produce answers that sound right, but even if it can correctly define a concept, you’ll discover that it can’t actually apply it correctly because it’s operating on the word level, not the concept level. It’ll make silly basic errors and contradict itself because it lacks an internal abstract understanding of the things it’s talking about.

That being said, it’s still pretty incredible that now you can ask a program to write a haiku about Danny DeVito and it’ll actually do it. Just don’t get carried away with the hype.

chaos@beehaw.org · 1 year ago

Like a lot of trouble, the worst doesn’t come from strangers, it comes from people you know. By all accounts, what happens is that anyone who knows you comes out of the woodwork and suddenly wants a favor, or a small loan, or an investment in their brilliant business idea, and of course you’re the asshole if you deny any of them. If you can’t keep it a secret entirely, it’s best to at least lock up the money somewhere so that you can blame the big mean lawyer who is “making” you be responsible and not blow it all on your second cousin’s crypto app.