• cron@feddit.org
    5 upvotes · 3 days ago

    I wouldn’t really call these watermarks. If these count as watermarks, then someone might call the longer-than-usual dash a watermark, too:

    That long dash is called an em dash — like this one.

    • General_Effort@lemmy.world
      2 upvotes · 3 days ago

      Using characters that are displayed identically but encoded differently is one way to watermark texts. It was used in a lawsuit a few years ago (SZ report). The suing company eventually lost because they didn’t actually own the rights to the texts they had watermarked.
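
      To illustrate the idea, here is a minimal Python sketch of such an encoding-based watermark. It is not the scheme from the lawsuit. The helper names (`embed_bits`, `extract_bits`, `find_lookalikes`) and the choice of U+202F as the marker are assumptions made up for this example.

```python
import unicodedata

# Assumed set of characters that usually render like an ordinary space.
LOOKALIKE_SPACES = {"\u00a0", "\u2009", "\u202f"}


def embed_bits(text, bits, marker="\u202f"):
    """Swap the i-th ordinary space for `marker` wherever bits[i] is 1."""
    out = []
    i = 0
    for ch in text:
        if ch == " " and i < len(bits):
            out.append(marker if bits[i] else " ")
            i += 1
        else:
            out.append(ch)
    return "".join(out)


def extract_bits(text, marker="\u202f"):
    """Read the pattern back: 1 for each marker, 0 for each ordinary space."""
    return [1 if ch == marker else 0 for ch in text if ch in (" ", marker)]


def find_lookalikes(text):
    """List (index, code point, Unicode name) for every lookalike space."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in LOOKALIKE_SPACES
    ]


marked = embed_bits("a b c d", [1, 0, 1])
print(extract_bits(marked))     # the embedded bit pattern
print(find_lookalikes(marked))  # where the unusual spaces sit
```

      The marked string looks the same on screen, but a copy of it carries the bit pattern along until someone normalizes the whitespace.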

      As @luckystarr@feddit.org points out, these whitespace characters may make quite a difference, so they are unlikely to be a watermark. Methods for watermarking LLM-generated text are more subtle anyway; they involve altering word frequencies.
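
      A toy sketch of how a frequency-based watermark could be detected, assuming a secret key that splits words into a "green" and a "red" half. Real schemes operate on tokens while the model is sampling; this only shows the counting side, and `is_green`, `green_z_score`, and the key `"demo-key"` are hypothetical names invented for the example.

```python
import hashlib
import math


def is_green(word, key="demo-key"):
    """Assign a word to the 'green' half using a keyed hash (assumed rule)."""
    digest = hashlib.sha256((key + ":" + word.lower()).encode("utf-8")).digest()
    return digest[0] % 2 == 0


def green_z_score(text, key="demo-key"):
    """Measure how far the green-word count deviates from the expected 50%."""
    words = text.split()
    n = len(words)
    if n == 0:
        return 0.0
    green = sum(is_green(w, key) for w in words)
    # Without a watermark, each word is green with probability 0.5,
    # so the count has mean 0.5 * n and variance 0.25 * n.
    return (green - 0.5 * n) / math.sqrt(0.25 * n)


sample = "these whitespace characters may make quite a difference"
print(green_z_score(sample))
```

      A generator that quietly prefers green words pushes this score up over a long text, while ordinary text should stay near zero. Nothing about the characters themselves changes, which is why this kind of mark is harder to spot than odd whitespace.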