A tiny mouse, a hacker.

  • 0 Posts
  • 23 Comments
Joined 6 months ago
Cake day: December 24th, 2023


  • That would result in those fediverse servers theoretically requesting 333333 * 114 MB = ~38 GB/s.

    On the other hand, if the linked site did not serve garbage and fit in about 1 MB, like a normal site, then this would be only ~325 MB/s, and while that’s still high, it’s not the end of the world. If it’s a site that actually puts effort into being optimized, and a request fits in ~300 KB (still a lot, in my book, for what is essentially a preview, with only tiny parts of the actual content loaded), then we’re looking at 95 MB/s.

    If said site puts effort into making its previews reasonable, and serves ~30 KB, then that’s 9 MB/s. It’s 3190 in the Year of Our Lady Discord. A potato can serve that.
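
The point of the comparison is that traffic scales linearly with payload size for a fixed number of requests. A minimal sketch of that scaling follows; the function name and the decimal unit constants are assumptions for illustration, while the payload sizes are the ones quoted in the comment.

```python
# Decimal units are an assumption; the comment does not say which convention it uses.
KB = 1_000
MB = 1_000_000


def reduction_factor(baseline_bytes: int, candidate_bytes: int) -> float:
    """Return how many times less traffic the candidate payload causes
    than the baseline, for the same number of requests."""
    return baseline_bytes / candidate_bytes


if __name__ == "__main__":
    baseline = 114 * MB  # the payload size quoted in the comment
    for label, size in [("1 MB", 1 * MB), ("300 KB", 300 * KB), ("30 KB", 30 * KB)]:
        factor = reduction_factor(baseline, size)
        print(f"{label}: {factor:.0f}x less traffic than a 114 MB payload")
```

Only the ratios are computed here, since they hold regardless of the request rate.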


  • I only serve bloat to AI crawlers.

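    # http context: flag requests whose User-Agent matches a listed crawler
    map $http_user_agent $badagent {
      default     0;
      # list of AI crawler user agents in "~crawler 1" format
    }

    # server context: send flagged agents to /gpt, whatever they asked for
    if ($badagent) {
       rewrite ^ /gpt;
    }

    # /gpt proxies the bee-movie.txt file from the upstream host
    location /gpt {
      proxy_pass https://courses.cs.washington.edu/courses/cse163/20wi/files/lectures/L04/bee-movie.txt;
    }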
    

    …is a wonderful thing to put in my nginx config. (you can try curl -Is -H "User-Agent: GPTBot" https://chronicles.mad-scientist.club/robots.txt | grep content-length: to see it in action ;))
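
For readers less familiar with nginx, the decision logic of that config can be sketched in Python. This is only an illustration of the matching behaviour, not part of the actual setup: the function names are hypothetical, and the pattern list is an assumption that contains just `GPTBot`, the one agent named in the curl example.

```python
import re

# Hypothetical pattern list; only "GPTBot" appears in the comment above.
BAD_AGENT_PATTERNS = [r"GPTBot"]


def is_bad_agent(user_agent: str, patterns=BAD_AGENT_PATTERNS) -> int:
    """Mimic the nginx map: 1 on the first regex match, else the default 0."""
    for pattern in patterns:
        # A "~" prefix in an nginx map means a case-sensitive regex
        # that may match anywhere in the value, like re.search.
        if re.search(pattern, user_agent):
            return 1
    return 0


def target_path(user_agent: str, requested_path: str) -> str:
    """Mimic `rewrite ^ /gpt;`: flagged agents get /gpt for any request."""
    return "/gpt" if is_bad_agent(user_agent) else requested_path


if __name__ == "__main__":
    print(target_path("GPTBot", "/robots.txt"))
    print(target_path("curl/8.0", "/robots.txt"))
```

Because the rewrite applies to every path, even a request for `/robots.txt` ends up at `/gpt`, which is why the curl command above checks the content length of that URL.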



  • algernon@lemmy.ml to Linux@lemmy.ml, on "NixOS forked", 2 months ago

    There’s plenty, but I do not wish to hijack this thread, so… have a look at the Forgejo 7.0 release notes and the PRs it links to for the notable features (and a boatload of bugfixes, many of which aren’t in Gitea). Then compare when (and if) similar features or fixes were implemented in Gitea.

    The major difference (apart from governance, and on a technical level) between Gitea and Forgejo is that Forgejo cherry picks from Gitea weekly (being a hard fork doesn’t mean all ties are severed, it means that development happens independently). Gitea does not cherry pick from Forgejo. They could, the license permits it, and it even permits sublicensing, so it’s not an obstacle for Gitea Cloud or Gitea EE, either. They just don’t.






  • There’s a very easy solution that lets you rest easy that your instance is how you want it to be: don’t do open registration. Vet the people you invite, and job done. If you want to be even safer, don’t post publicly - followers only. If you require follower approval, you can do some basic checks to see that whoever sends a follow request is someone you’re okay interacting with. This works on the microblogging side of the Fediverse quite well, today.

    What I’m trying to say is that requiring admin approval for registrations gets you 99% of the way there, without needing anything more complex than that.





  • Fair bias notice: I am a Forgejo contributor.

    I switched from Gitea to Forgejo when Forgejo was announced, and it was as simple as changing the binary/docker image. It remains that simple today, and will remain that simple for the foreseeable future, because Forgejo cherry picks most of the changes in Gitea on a weekly basis. Until the codebases diverge, that will remain the case, and Forgejo will remain a drop-in replacement until such time comes that we decide not to pick a feature or change. If you’re not reliant on said feature, it’s still a drop-in replacement. (So far, we have a few things that are implemented differently in Forgejo, but still in a compatible way).

    Let me offer a few reasons to switch:

    • Forgejo - as of today, and for the foreseeable future - includes everything in Gitea, but with more tests, and more features on top. A few features Forgejo has that Gitea does not:
      • Forgejo makes it possible to have any signed-in user edit Wikis (like GitHub), while Gitea restricts it to collaborators only. (Forgejo defaults to that too, but the default can be changed.) Mind you, this is not in a Forgejo release yet; it will be coming in the next release, probably in April.
      • Gitea has support for showing an Action status badge. Forgejo has badges for action statuses, stars, forks, issues, pull requests.
      • …there are numerous other features being developed for Forgejo that will not make it into Gitea unless they cherry pick it (they don’t do that), or reimplement it (wasting a lot of time, and potentially introducing bugs).
    • Forgejo puts a lot of effort into testing. Every feature developed for Forgejo needs to have a reasonable amount of tests. For most of the things we cherry pick from Gitea, we write tests if they don’t have any (we write plenty of tests for stuff originating from Gitea).
    • Forgejo is developed in the open, using free tools: we use Forgejo to host the code, issues and releases, Forgejo Actions for CI, and Weblate for translations. Gitea uses GitHub to host the code, issues and releases, uses GitHub CI, and CrowdIn for translations (all of them proprietary platforms).
    • Forgejo accepts contributions without requiring copyright assignment, Gitea does not.
    • Forgejo routinely cherry picks from Gitea, Gitea does not cherry pick from Forgejo (they do tend to reimplement things we’ve done, though, a huge waste of time if you ask me).
    • Forgejo isn’t going anywhere anytime soon, see the sustainability repo. There are people committed to working on it, there are people paid to work on it, and there’s a fairly healthy community around it already.

  • The single best thing I like about Zed is how they unironically put up a video on their homepage where they take a perfectly fine function, and butcher it with irrelevant features using Copilot, and in the process:

    • Make the function’s name not match what it is actually doing.
    • Hardcode three special cases for no good reason.
    • Write no tests at all.
    • Update the documentation, but make the short version of it misleading, suggesting it accepts all named colors, rather than just three. (The long description clarifies that, so it’s not completely bad.)
    • Show how engineering the prompt to do what they want takes more time than just writing the code in the first place.

    And that’s supposed to be a feature. I wonder how they’d feel if someone sent them a pull request done in a similar manner, resulting in similarly bad code.

    I think I’ll remain firmly in the “if FPS is an important metric in your editor, you’re doing something wrong” camp, and will also steer clear of anything that hypes up the plagiarism parrots as something that’d be a net win.



  • Nevertheless, as Bluesky grows, there are likely to be multiple professionally-run indexers for various purposes. For example, a company that performs sentiment analysis on social media activity about brands could easily create a whole-network index that provides insights to their clients.

    (source)

    Is that supposed to be a selling point? Because I’d like to stay far, far away from that, thank you very much.




  • I found that no general purpose search engine will ever serve my needs. Their goal is to index the entire internet (or a very large subset of it), and sadly, a very large part of the internet is garbage I have no desire to see. So I simply stopped using search engines. I have a carefully curated, topical list of links where I can look up information, plus RSS feeds, and those cover pretty much everything I used search for.

    Lately, I have been experimenting with YaCy, and fed it my list of links to index. Effectively, I now have a personal search engine. If I come across anything interesting via my RSS feeds, or via the Fediverse, I plug it into YaCy, and now it’s part of my search library. There’s no junk, no ads, no AI, no spam, and the search result quality is stellar. The downside is, of course, that I have to self-host YaCy, and maintain a good quality index. It takes a lot of effort to start, but once there’s a good index, it works great. So far, I found the effort/benefit ratio to be very much worth it.

    I still have a SearXNG instance (which also searches my YaCy instance, with higher weight than other sources) to fall back to if I need to, but I didn’t need to do that in the past two months, and only two times in the past six.