• 0 Posts
  • 158 Comments
Joined 1 year ago
Cake day: June 20th, 2023


  • I think if your photos are on any kind of public website, AI idiots will scrape them regardless of the provider. So at minimum you have to password protect them. That said, I’d feel ok using this:

    https://www.hetzner.com/storage/storage-share/

    It basically runs NextCloud. You’d configure it so that only logged-in users can view the pictures, and give accounts to your friends and family. I don’t think Hetzner is likely to train AI with it, though you could check through their privacy policy. Part of the issue with e.g. Google Drive is that everyone wants stuff for free, so Google recovers some of its costs through advertising, AI training, etc. Hetzner charges enough to actually make a profit, while still being IMHO affordable at the level we’re discussing. That means they don’t have to do crap with advertising etc. I have 5TB in their Storage Box product and am happy with it.

    If you want to be more hardcore, you could set up a dedicated server with an encrypted HDD, but now you have to deal with the hassles of self hosting, including backups. It still wouldn’t be end to end encryption, which would require your users to run some kind of special client, or maybe use some awful javascript client.


  • It would help if you gave some numbers. How much data, within a factor of 1000 say? A few megabytes? A few gigabytes? A few terabytes? A few petabytes? The approach you need will change depending on the level. What is your budget?

    What bothers you about cloud storage? Are any of the photos edgy?

    Anyway it sounds to me like you would be fine with a decent web hosting plan and a basic photo gallery app.




  • If whatever they are doing has been working for stuff written in languages other than Rust, we have to ask what makes Rust special. Rust is a low level language, so its dependencies if anything should be simpler than most, with just a minimal shim between its runtime and the C world. Why does any production software have a version <= X constraint in any of its dependencies anyway? I can understand version >= X, but the other way implies that the APIs are unstable and you’re going to get tons of copies of stuff around. I remember seeing that in Ruby at a time when Python was relatively free of it, but now Python has it too. Microsoft at least understood in the 1990s that you can’t go around breaking stuff like that.

    No it’s not all C99. I’m using Calibre (written in Python), Pandoc (written in Haskell), GCC (written in C, C++, and Ada), and who knows what else. All of these are complex applications with many dependencies. Eclipse (written in Java) is also in Debian though I don’t use it. Bcachefs though is apparently just special.

    Joe Armstrong (inventor of Erlang) said of OOP, “you wanted a banana but what you got was a gorilla holding the banana, and the entire jungle”. Rust begins to sound like that too. It might not be inherent in the language, but it looks like the way the community thinks.

    I also still don’t understand why the Bcachefs userspace stuff is written in Rust. I can understand about the kernel part, but the concept of a low level language is manual resource management that a HLL handles for you automatically. Writing the userspace in a LLL seems like more pain for unclear gain. Are there intense performance or memory constraints or what?

    Actually I see now that the kernel part of Bcachefs is also considered unstable, so maybe the whole thing is not yet ready for production.


  • Talks about different developer styles, slightly interesting and not too long winded I guess, but not much about the actual situation.

    I think this is still not such a great look for Rust. I had expected interfacing Rust to C to present fewer problems than it seems to. I had hoped the Rust compiler could produce object code with almost no runtime dependencies, the way C compilers can. So integrating Rust code into the kernel should be fairly painless from the C side, if things were as one would hope.

    It does sound to me in the earlier post that there was some toxicity going on. Maybe it had something to do with the context being a DRM driver.

    I looked at a few Rust tutorials but they seemed to take forever to get to any interesting parts. I will keep looking.



  • I will take a look at the bootstrapping project page, but “bootstrappability” is a philosophical notion whose extent depends on what you are trying to get from it. Certainly someone who pursues it should give that some thought and reach a conclusion, rather than just following a recipe on some web site. So that’s the deeper reasoning I felt was missing.

    As for C being terrible, well, why would I want to take that up with anyone? It’s simply that we know from 50 years of experience with C that writing bug-free C programs, or noticing the existence of bugs in them, is extremely difficult. If someone decides to use it for bootstrappability anyway, xkcd.com/386 would seem to apply.

    collapseos.org (which uses Forth) might also be of some interest, though I think that was another questionable decision. Real transparency and bootstrappability require that the reasoning process be written out and matched up with the code. C does a pretty poor job of that compared to some alternatives.


  • The deeper reasoning is still not explained. C is just terrible for this. Rust is very complicated and writing a new implementation is a big project even in good languages. So using C seems tragic.

    I’m going to go further and say that TinyCC isn’t bootstrapped either, since the compiler writer’s thought processes aren’t bootstrapped. You would have to use something like CompCert (i.e. all the reasoning that the program works is embedded in the program and machine checked) and bootstrap that. It is probably doable, but not as a 1 person hobby project.


  • In principle you could start from hand assembly. Look up “sectorlisp” as a lowest level option. Or you could start from Forth, which is traditionally implemented using very simple methods. The blog post really doesn’t make clear what problem the author is trying to solve. It gives some general description but leaves a lot to be guessed at.

    Then there is the question of where the CPU is supposed to come from. Any modern one was designed using lots of mysterious CAD tools. Maybe scrounge a vintage Z80 out of an old Timex-Sinclair or something?


  • This project sounds kind of masochistic but the idea is to bootstrap Rust from tinycc, and have traceability down to the lowest level assembly code. There is a step missing though? Tinycc is written in C after all.

    I think it would make more sense to bootstrap from a small Lisp written in assembly language, if the traceability goal is worthwhile at all. There is nothing special about C.




  • How many ebooks are you talking about (millions)? Is it just a question of finding duplicated files? That’s easy with a shell script. For metadata, see if the books already have it, since a lot do. After that, you can use fairly crude hacks as an initial pass at matching library records. There’s code like that around already; try some web searches, maybe code4lib (library related programming) if that is still around. I saw your earlier comment before you deleted it and it was perfectly fine.
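
    For what it's worth, here is a rough sketch of the duplicate-finding part, in Python rather than shell. It only catches byte-for-byte identical files. The function names and the choice of SHA-256 are my own assumptions, and any content hash would do. It groups files by size first so that only files with a size collision get hashed.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return {digest: sorted list of paths} for groups of 2+ identical files under root."""
    # First pass: bucket regular files by size (cheap, no file reads).
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only the files whose size matches another file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have an exact duplicate
        for path in paths:
            by_digest[file_digest(path)].append(path)

    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

    Something like `find_duplicates("/path/to/ebooks")` (hypothetical path) gives you the groups. I would review them by hand before deleting anything or replacing copies with links.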


  • If the files are literally duplicated (exact same bytes in the files, so matching md5sums) then maybe you could just delete the duplicates and maybe replace them with links.

    Automatically sorting books by category isn’t so easy. Is the metadata any good? Are there categories already? ISBNs? Even titles and authors? It starts to be kind of a project but you could possibly import MARC records (library metadata) which have some of that info in them, if you can match up the books to library records. I expect that the openlibrary.org API still works but I haven’t used it in ages.
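
    If the books do have ISBNs, a first step toward matching them to library records could look like the sketch below. It validates an ISBN-10 or ISBN-13 with the standard check-digit rules and normalizes it to ISBN-13, so the same book always gets the same key. The function names are made up for illustration. The `openlibrary.org/isbn/<isbn>.json` URL pattern is how I remember the Open Library lookup working, so treat it as an assumption and check their current docs first.

```python
import re


def _isbn13_check_digit(first12):
    """Compute the ISBN-13 check digit from the first 12 digits (weights 1,3,1,3,...)."""
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(first12))
    return (10 - total % 10) % 10


def normalize_isbn(raw):
    """Return a valid ISBN-13 string for raw, or None if raw is not a valid ISBN."""
    s = re.sub(r"[\s-]", "", raw).upper()
    if re.fullmatch(r"\d{9}[\dX]", s):
        # ISBN-10: weighted sum with weights 10..1 must be divisible by 11.
        total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
        if total % 11 != 0:
            return None
        body = "978" + s[:9]
        return body + str(_isbn13_check_digit(body))
    if re.fullmatch(r"\d{13}", s):
        return s if _isbn13_check_digit(s[:12]) == int(s[12]) else None
    return None


def openlibrary_isbn_url(isbn13):
    """Build a lookup URL (assumed pattern, verify against Open Library's docs)."""
    return f"https://openlibrary.org/isbn/{isbn13}.json"
```

    Books with no usable ISBN would still need the cruder title/author matching.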