@Bluefold

Bluefold@sh.itjust.works · 9 months ago

That’s most tech corporate jobs tbf. Lots of middle managers with nothing much better to do than play musical chairs once a quarter. It’s like that XKCD comic about creating one new standard to clean up the mess of having so many standards. Surely my way of working will solve all our problems of underinvestment and losing key talent…

Bluefold@sh.itjust.works · 9 months ago

The largest owners are Advance Publications and Tencent. Advance also own Condé Nast (Reddit even used to be under the Condé Nast banner). Weirdly they also own everyone’s favorite plagiarism detection service Turnitin.

Bluefold@sh.itjust.works · 11 months ago

I have. It’s pretty short and to the point. They’re based out of Germany so their requirements for clarity are pretty high by law. They go into quite a lot of detail about what is sent.

In this case they send the date, time, language, processing time, and the number and type of errors, but not the text itself.

However, they do have an optional feature that uses OpenAI to rephrase sentences so that might be training through the back door.

I’ve been using it for years and have been very happy with the service.

Bluefold@sh.itjust.works · 11 months ago

If one of their goals is to sell premium access to train LLMs, this type of gibberish would hurt that. When you can’t guarantee that the data source is coherent, that has an impact on the final model that is created.

I think a better approach is to transfer comments to a new platform or create new higher quality content. Could the solution to this problem become a guide that goes into more detail?