• Septimaeus@infosec.pub
    11 months ago

    I usually wear the tin foil hat in these debates, but I must concede in this case: the eavesdropping phone theory in particular is difficult to substantiate, from a technical standpoint.

    For one, a user can check this themselves today with basic local network traffic monitors or packet sniffing tools. Even heavily compressed audio data will stand out in the log, no matter how it’s encrypted, streamed, batched or what have you.

    To get a sense of what I mean, run wireshark and give a wake phrase command to see what that looks like. Now imagine trying to obfuscate that type of transmission for audio longer than 2 seconds, and repeatedly throughout a day.
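
For a rough idea of what "standing out in the log" could look like, here is a minimal sketch that tallies uploaded bytes per destination from a capture. The record format, the hostnames, and the byte counts are all invented for illustration; a real sniffer export would need its own parsing step.

```python
from collections import defaultdict

# Hypothetical capture records: (unix_timestamp, destination_host, bytes_uploaded).
# Hostnames and sizes are invented for illustration only.
records = [
    (1700000000, "assistant.example.com", 48_000),  # e.g. one short voice query
    (1700000030, "telemetry.example.net", 900),
    (1700003600, "assistant.example.com", 52_000),
    (1700007200, "telemetry.example.net", 1_100),
]


def upload_totals(records):
    """Sum uploaded bytes per destination host."""
    totals = defaultdict(int)
    for _timestamp, host, size in records:
        totals[host] += size
    return dict(totals)


def heavy_uploaders(records, threshold_bytes):
    """Return (host, total) pairs at or above the threshold, largest first."""
    totals = upload_totals(records)
    flagged = [(host, total) for host, total in totals.items() if total >= threshold_bytes]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for host, total in heavy_uploaders(records, threshold_bytes=50_000):
        print(f"{host}: {total} bytes uploaded")
```

The threshold is arbitrary; the point is only that sustained audio uploads would push some destination's total far above the background chatter.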

    Even assuming local audio inference and processing on a completely compromised device (rooted/jailbroken, disabled sandboxing/SIP, unrestricted platform access, the works), most phones would struggle to record and process audio indefinitely without a noticeable impact on energy and data use.

    I’m sure advertising companies would love to collect that much raw candid data. It would seem quite a challenge to do so quietly, however, and given the apparent lack of evidence, is thus unlikely to have been implemented at any kind of scale.

    • admiralteal@kbin.social
      11 months ago

      There’s also a totally plausible and far more insidious answer to what’s going on with the experiences people have of the ads matching their conversations.

      That explanation is that advertising works. And worse, it works subconsciously. You see the ads without noticing them, they worm their way into your conversations, and at that point you become more aware of them and start noticing the ads.

      Which does comport with the billions of dollars spent on advertising every year. It would be very weird if an entire ad industry that’s at least a century old was all a complete nonsense waste of money this whole time.

      To me, this whole narrative is just another parable about why we need to do everything possible to limit our own exposure to ads to avoid being manipulated.

      • Septimaeus@infosec.pub
        11 months ago

        Damn, I hadn’t thought of that. The chicken egg question of spooky ad relevance. Insidious indeed.

        I feel like the idea of some person or group having enough info to psychologically manipulate or predict should be way scarier than the black helicopter stuff, especially given that it’s one of the few conspiracy theories we actually have a bunch of high quality evidence for, just in marketing and statistics textbooks alone.

        But here we are. Government surveillance is the hot button, not the fact that marketers would happily sock puppet you given the chance.

    • WetBeardHairs@lemmy.ml
      11 months ago

      That is glossing over how they process the data and transmit it to the cloud. The assistant wake word for “Hey Google” invokes an audio stream to an off site audio processor in order to handle the query. So that is easy to identify via traffic because it is immediate and large.

      The advertising-wake words do not get processed that way. They are limited in scope and are handled by the low power hardware audio processor used for listening for the assistant wake word. The wake word processor is an FPGA or ASIC - specifically because it allows the integration of customizable words to listen for in an extremely low power raw form. When an advertising wake word is identified, it sends an interrupt to the CPU along with an enumerated value of which word was heard. The OS then stores that value and transmits a batch of them to a server at a later time. An entire day’s worth of advertising wake word data may be less than 1 kb in size and it is sent along with other information.

      Good luck finding that on wireshark.
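
To put a number on the "less than 1 kb" claim, here is a small sketch of how a day of keyword hits could be packed. The record layout (a 2-byte keyword ID plus a 2-byte minute-of-day) is purely an assumption for illustration, not taken from any real device or firmware.

```python
import struct

# Hypothetical layout: each detection is a 2-byte keyword ID plus a
# 2-byte minute-of-day timestamp (0..1439), little-endian.
RECORD = struct.Struct("<HH")


def pack_detections(detections):
    """Pack (keyword_id, minute_of_day) pairs into a compact byte string."""
    return b"".join(RECORD.pack(keyword_id, minute) for keyword_id, minute in detections)


def unpack_detections(payload):
    """Inverse of pack_detections: recover the list of (keyword_id, minute) pairs."""
    return list(RECORD.iter_unpack(payload))


if __name__ == "__main__":
    # 200 made-up detections spread across a day.
    day = [(i % 50, (i * 7) % 1440) for i in range(200)]
    payload = pack_detections(day)
    print(len(payload), "bytes")  # 200 records * 4 bytes each = 800 bytes
```

Under these assumptions even 200 hits a day fit in 800 bytes, which is the scale the comment is describing.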

      • Septimaeus@infosec.pub
        11 months ago

        Hmm, that’s outside my wheelhouse. So you’re saying phone hardware is designed to listen for not just one wake word but a bank of multiple predefined or reprogrammable wake words? I hadn’t read about that yet but it sounds more feasible than the constant livestream idea.

        The Echo had the capacity for multiple wake words IIRC, but I hadn’t heard of that for mobile devices. I’m curious how many of these keywords they can fit.

    • Zerush@lemmy.mlOP
      11 months ago

      Smartphones are spyware by definition, at least if you use the OS as is, because every aspect of them is controlled and logged, either by Google on Android or by Apple on iOS. Add to that the default apps that cannot be uninstalled on an unrooted phone. As Cox alleges, they also use third-party logs and can therefore track and profile the user very well, even without the technology they claim to have.

      Although they feel authorized by the user’s consent to the TOS and PP, the legality depends directly on each country’s legislation. To be a legally binding contract, the TOS and PP must comply with local law on every point. For this reason, I think these practices differ a lot between the EU and the US, where privacy legislation is conspicuous by its absence. US users should take these Cox statements about their devices very seriously. EU users should also be clear that Google and Apple know exactly what they do and where they live, even though the companies are restricted from selling this data to third parties.

      Basics:

      • ALWAYS READ THE TOS AND PP
      • Review the permissions of each app, leaving only the most essential ones
      • Deactivate GPS when not in use
      • On Android, review every app with Exodus Privacy; on iOS, maybe Lookout or MyCyberHome (freemium apps!)
      • Use as few apps from the store as possible
      • Be wary of discount apps from supermarkets or malls
      • Don’t store important data on the phone (banking, medical…)
      • Septimaeus@infosec.pub
        11 months ago

        Agreed, though I think it’s possible to use smart devices safely. For Android it can be difficult outside custom roms. The OEM flavors tend to have spyware baked in that takes time and root to fully undo, and even then I’m never sure I got it all. These are the most common phones, however, especially in economy price brackets, which is why I’d agree that for the average user most phones are spyware.

        Flashing is not useful advice to most. “Just root it bro” doesn’t help your nontechnical relatives who can’t stop downloading toolbars and VPN installers. But with OEM variants undermining privacy at the system level, it feels like a losing battle.

        I’d give credit to Apple for their privacy enablement, especially with E2EE, device lockdown, granular access permission control and audits. Unfortunately their devices are not as affordable and I’m not sure how to advise the average Android user beyond general opt-out vigilance.

          • Septimaeus@infosec.pub
            11 months ago

            Yeah those push token systems need an overhaul. IIRC tokens are specific to app-device combinations, so invalidation that isn’t automatic should be push-button revocation. Users should have control of it like any other API on their device, if only to get apps to stop spamming coupons or whatever.

            It’s funny though: when I first saw those headlines, my first reaction was that it was a positive sign, since this was apparently news worthy even though the magnitude of impact for this sort of systemic breach is demonstrably low. (In particular, it pertains to (1) incidental high-noise data (2) associated with devices and (3) available only by request to (4) governments, who are weak compared to even the smallest data brokers WRT capacity for data mining inference and redistribution, to put it mildly.)

            Regardless, those systems need attention.

    • Fungah@lemmy.world
      11 months ago

      My own theory is that they tokenize key words and phrases with an AI so that they’re not sending the actual audio data. Then it’s stored in a form some AI can parse but isn’t technically user data so they can skirt legislation around that.

      A tokenized collection of key phrases, in text format with the delimiters omitted, is going to be much, much smaller than audio or a transcript.

      • ben_dover@lemmy.world
        11 months ago

        as someone who has played around with offline speech recognition before - there is a reason why ai assistants only use it for the wake word, and the rest is processed in the cloud: it sucks. it’s quite unreliable, you’d have to pronounce things exactly as expected. so you need to “train” it for different accents and ways to pronounce something if you want to capture it properly, so the info they could siphon this way is imho limited to a couple thousand words. which is considerable already, and would allow for proper profiling, but couldn’t capture your interest in something more specific like a mazda 323f.

        but offline speech recognition also requires a fair amount of compute power. at least on our phones, it would inevitably drain the battery

      • Septimaeus@infosec.pub
        11 months ago

        That certainly would make the data smuggling easier. What about battery though? I assume that requires inference and at least rudimentary processing.

        How would a background process do this in real time on a mobile device without leaving traceable evidence like cpu time?

        • BigPotato@lemmy.world
          11 months ago

          Cox also sells home automation bundles which advertise “smart” features like voice recognition which are always plugged into the wall.

        • BrownTree33@lemmy.ml
          11 months ago

          Can it be implemented on a PC? They are often turned on, and people speak around them too. CPU activity is much harder to trace when a lot of different processes are running. Someone could blame their phone while it’s actually a nearby PC that is listening.

          • Septimaeus@infosec.pub
            11 months ago

            Yeah outside mobile devices I imagine there’s a lot more leeway technically speaking. I’d be far more inclined to suspect a smart TV or a home assistant appliance like Amazon Echo, for example. And certainly there are plenty of PCs out there that are 100% compromised.

            But it’s the phone that people often think of as eavesdropping on their conversations. The idea is stickier perhaps because it’s a more personal violation. And I wouldn’t put it past data brokers by any means. They would if they could. I’ve just yet to hear a feasible explanation of how they can without being caught. Hence my doubt.

    • andrew_bidlaw@sh.itjust.works
      11 months ago

      most phones will just struggle to record and process audio indefinitely without a noticeable impact on energy and data use.

      I mean, it’s still a valid concern for a commoner. Why does my phone have twice the RAM and twice the cores, yet run as slow as my previous one? I’d love to fuel this conspiracy to push OS and app makers to do their fucking job.

      There’s no reason an app should weigh more than 50 MB on a clean install*, and many social apps and messengers fail to fit in that. The client I use to write this is only 30-some MB, and that’s one person doing it for donations.

      If there were a raging theory that apps are selling your data to, like, China, there would be a push to deny it and optimize apps to fit that image.

      * I obviously exclude games, synths, editors of any kind with their textures and templates.

      • WetBeardHairs@lemmy.ml
        11 months ago

        The filesize of most binaries is dominated by text strings and images. Modern applications are loaded with them. Lemmy is atypical in that it doesn’t need tons of built in images or text.

        • andrew_bidlaw@sh.itjust.works
          11 months ago

          I get it. It’s just that I don’t see any developer-supplied images in many big apps besides a logo and a welcome screen. Updates of dozens of megabytes don’t feel okay. It seems like there’s some bloat, or a vault management problem. It’s like some seasonally updated games that duplicate data to speed up map loading, or that layer new content on top instead of redownloading a brand new database. Some that I followed shaved off tens of gigabytes by rearranging stuff.

          Take messengers. I don’t get how Viber wants 40+ MB per update when it has nothing but stickers and emoji, which are already installed and probably don’t change much. Cheap wireless data might let them ignore the size and even get heavier in order to offload some work from their servers, since many images are localized. Is that what their updates are? Or do they ship beta patches one after another once approved, so you download a couple of them in close succession after they go public?

    • Cheradenine@sh.itjust.works
      11 months ago

      Fucking thank you. As I said in another reply, if this was true my firewall logs would be full, or my data cap blown in a week.

    • library_napper@monyet.cc
      11 months ago

      What if the processing is done locally and the only thing they send back home is keywords for marketable products?

      • Septimaeus@infosec.pub
        11 months ago

        Yeah they’d have to it seems, but real time transcription isn’t free. Even late model devices with better inference hardware have limited battery and energy monitoring. I imagine it’d be hard to conceal that behavior especially for an app recording in the background.

        WetBeardHairs@lemmy.ml mentioned that mobile devices use the same hardware coprocessing used for wake word behavior to target specific key phrases. I don’t know anything about that, but it’s one way they could work around the technical limitations.

        Of course, that’s a relatively bespoke hardware solution that might also be difficult to fully conceal, and it would come with its own limitations. Like in that case, there’s a preset list of high value key words that you can tally, in order to send company servers a small “score card” rather than a heavy audio clip. But the data would be far less rich than what people usually think of with these flashy headlines (your private conversations, your bowel movements, your penchant for musical theater, whatever).

    • Goun@lemmy.ml
      11 months ago

      I agree.

      What might be possible is sending tiny bits. For example, a device could categorize some places or times, detect out-of-pattern behaviours and just record a couple of seconds here and there, then send it to the server when requesting something else to avoid looking suspicious. Or just pretend it’s a “false positive” or whatever and say “sorry, I didn’t get that.”

      I don’t think they’re listening to everything, but they could technically get something if they wanted to target you.

      • Septimaeus@infosec.pub
        11 months ago

        Right, I suppose cybersecurity isn’t so different than physical security in that way. Someone who really wants to get to you always can (read: why there are so many burner phones at def con).

        But for the average person, who uses consumer grade deadbolts in their home and doesn’t hire a private detail when they travel, does an iPhone fit within their acceptable risk threshold? Probably.

  • LemmyIsFantastic@lemmy.world
    11 months ago

    And yet thousands of security researchers can’t find a shred of evidence. This shit is tiresome and counterproductive. The general public is weary of hearing this made up bullshit.

    The technical practice isn’t hard. That’s the claim. The reality is nobody is buying shit doing this and this is just another repost from the same 404 article months ago.

      • Dr_Toofing@programming.dev
        11 months ago

        I still wouldn’t believe it. Even the 404 article does not confirm anything and the ad company does not provide any details.

        This whole thing feels like marketing, claiming something outrageous to get people talking about your company.

        • Saik0@lemmy.saik0.com
          11 months ago

          That’s entirely possible. But they did say it themselves on their own site. Look at the link I’ve posted in response to the other guy.

          Even if they’re just joking about it they deserve all the negative press they’ll get.

        • Saik0@lemmy.saik0.com
          11 months ago

          Source: https://web.archive.org/web/20231214235444/https://www.cmglocalsolutions.com/blog/active-listening-an-overview

          Is Active Listening Legal?

          We know what you’re thinking. Is this even legal? The short answer is: yes. It is legal for phones and devices to listen to you. When a new app download or update prompts consumers with a multi-page terms of use agreement somewhere in the fine print, Active Listening is often included.

          So what were you saying?

          • Cheradenine@sh.itjust.works
            11 months ago

            Did you read the article? No, you did not.

            According to the company this is all from regular 3rd party stuff. Being legal or not is beside the point when you are not actually doing something.

            Your argument is based on what a marketing company put in their marketing.

            Read the article, with clarifications from the company

            ETA : if this were true I would either see it in my firewall logs, or it would blow through my data cap in a week. Surveillance capitalism is bullshit, this is just a grift.

            • library_napper@monyet.cc
              11 months ago

              Seems funny how you keep saying “from the company”, as if asking a murderer with bloody red hands whether they did it somehow makes for a credible source.

            • Saik0@lemmy.saik0.com
              11 months ago

              Your argument is based on what a marketing company put in their marketing.

              But your response is

              with clarifications from the company

              So what the company says isn’t good enough… Except when it’s in your favor? You realize that both statements are “from the company”.

              • Cheradenine@sh.itjust.works
                11 months ago

                Fight as long as you want, when they were called out on it they backed off. The technical aspects of this are not trivial, nor is the amount of data needed as anyone who has had an Alexa or similar spyware in their house will tell you.

                Like I said

                if this were true I would either see it in my firewall logs, or it would blow through my data cap in a week.

                • Saik0@lemmy.saik0.com
                  11 months ago

                  Like I said

                  if this were true I would either see it in my firewall logs, or it would blow through my data cap in a week.
                  

                  Audio is literally trivial amounts of bandwidth. You wouldn’t notice it at all. Using something like Opus, you could stream audio 24/7 and reach only about 300 MB uploaded per day. Now do some basic trimming/word processing… That number can easily be less than 10 MB a day.
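
The arithmetic behind those figures can be checked with a quick sketch. The 24 kbps bitrate and the 3% "speech actually present" fraction are assumptions picked for illustration, not measurements; megabytes here means 10**6 bytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def megabytes_per_day(bitrate_kbps, active_fraction=1.0):
    """Upload volume in MB (10**6 bytes) for a stream at bitrate_kbps
    that is transmitted for active_fraction of the day."""
    bits = bitrate_kbps * 1000 * SECONDS_PER_DAY * active_fraction
    return bits / 8 / 1_000_000


def bitrate_for_daily_budget(megabytes):
    """Constant bitrate in kbps that would use exactly `megabytes` per day."""
    return megabytes * 1_000_000 * 8 / SECONDS_PER_DAY / 1000


if __name__ == "__main__":
    # Assumed speech-grade bitrate of 24 kbps, streamed around the clock.
    print(f"{megabytes_per_day(24):.1f} MB/day at 24 kbps")
    # Same bitrate, but only sending the (assumed) 3% of the day with speech.
    print(f"{megabytes_per_day(24, active_fraction=0.03):.1f} MB/day when trimmed")
    # What constant bitrate a 300 MB/day budget corresponds to.
    print(f"{bitrate_for_daily_budget(300):.1f} kbps for 300 MB/day")
```

So roughly 300 MB a day corresponds to a constant stream of about 28 kbps, and trimming to the small share of the day that contains speech is what would bring it under 10 MB.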

    • JSens1998@lemmy.ml
      11 months ago

      Bro, I’ll literally be having a conversation with someone about a topic, and all of a sudden Google starts recommending me products related to the discussion afterwards. Smartphones and smart speakers without a doubt listen in on our conversations. There’s the evidence.

      • library_napper@monyet.cc
        11 months ago

        Eh, surprised that’s happening to someone in this community. Strip Google off your phone and throw out any hardware with a microphone that doesn’t run open source software and this will stop happening.

    • Saik0@lemmy.saik0.com
      11 months ago

      It’s just they’re no longer afraid of telling us they are

      They’re also lying to themselves…

      https://web.archive.org/web/20231214235444/https://www.cmglocalsolutions.com/blog/active-listening-an-overview

      Is Active Listening Legal?

      We know what you’re thinking. Is this even legal? The short answer is: yes. It is legal for phones and devices to listen to you. When a new app download or update prompts consumers with a multi-page terms of use agreement somewhere in the fine print, Active Listening is often included.

      They believe it’s legal just because the phone’s owner agrees. If my wife accepts a ToS that allows them to monitor her, and her phone is in my room listening to me… That’s definitely NOT legal. This really needs to hit court sooner rather than later. This is wiretapping, and it is illegal REGARDLESS of the ToS/EULA nonsense they want to claim covers them.

      Edit: Even in one-party consent states this is illegal.

      • ddh@lemmy.sdf.org
        11 months ago

        Let’s also remember that these phones are sold worldwide, and it’s foolish to declare something globally legal.

    • Vinegar@kbin.social
      11 months ago

      Companies DO analyze what you say to smart speakers, but only after you have said “ok google, siri, alexa, etc.” (or if they mistake something like “ok to go” as “ok google”). I am not aware of a single reputable source claiming smart speakers are always listening.

      The reality is that analyzing a constant stream of audio is way less efficient and accurate than simply profiling users based on information such as internet usage, purchase history, political leanings, etc. If you’re interested in online privacy device fingerprinting is a fascinating topic to start understanding how companies can determine exactly who you are based solely on information about your device. Then they use web tracking to determine what your interests are, who you associate with, how you spend your time, what your beliefs are, how you can be influenced, etc.

      Your smart speaker isn’t constantly listening because it doesn’t need to. There are far easier ways to build a more accurate profile on you.
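
For anyone curious what device fingerprinting boils down to, here is a toy sketch: hash a set of device attributes into a stable ID. The attribute names and values are made up, and real trackers use far more signals plus fuzzy matching, so treat this as the bare idea only.

```python
import hashlib
import json


def device_fingerprint(attributes):
    """Hash a dict of device attributes into a stable hex identifier.

    Keys are sorted first so the same attributes always give the same hash,
    regardless of the order they were collected in.
    """
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical attributes a web page could read without any permission prompt.
    device = {
        "user_agent": "ExampleBrowser/1.0",
        "screen": "1080x2400",
        "timezone": "Europe/Berlin",
        "fonts": ["Roboto", "Noto Sans"],
    }
    print(device_fingerprint(device))
```

No microphone is involved at any point, which is the comment's argument: the combination of mundane attributes is often distinctive enough on its own.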

      • ristoril_zip@lemmy.zip
        11 months ago

        It’s literally impossible for them to not be “analyzing” all the sounds they (perhaps briefly) record.

        [Sound] --> [Record] --> [Analyze for keyword] --> [Perform keyword action] OR [Delete recording]

        Literally all sounds, literally all the time. And we just trust that they delete them and don’t send them “anonymized” to be used for training the audio recognition algorithms or LLMs.

        • bdonvr@thelemmy.club
          11 months ago

          It is possible to analyze the traffic leaving these devices, and AFAIK it hasn’t been shown that they are doing this.

        • Solemn@lemmy.dbzer0.com
          11 months ago

          The way that “Hey Alexa” or “Hey Google” works is by, like you said, constantly analysing the sounds they hear. However, the audio is only analyzed locally for the specific phrase, and it is stored in a circular buffer of a few seconds so the device can keep your whole request in memory. If the phrase is not detected, the buffer is constantly overwritten, and nothing is sent to the server. If the phrase is detected, then the whole request is sent to the server where more advanced voice recognition can be done.
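
The circular-buffer behaviour can be sketched in a few lines. This is a simplification under stated assumptions: frames are plain strings, the detector is a placeholder function, and a real device would keep recording the request after the trigger rather than only flushing what came before it.

```python
from collections import deque


class WakeWordBuffer:
    """Keep only the most recent frames; hand them over only after a wake word."""

    def __init__(self, max_frames):
        # A deque with maxlen drops the oldest frame automatically when full.
        self.frames = deque(maxlen=max_frames)

    def feed(self, frame, is_wake_word):
        """Add a frame. Return the buffered frames if the detector fires, else None."""
        self.frames.append(frame)
        if is_wake_word(frame):
            captured = list(self.frames)
            self.frames.clear()
            return captured  # this is what would be sent to the server
        return None  # nothing leaves the device


if __name__ == "__main__":
    buffer = WakeWordBuffer(max_frames=3)
    detector = lambda frame: frame == "hey"  # placeholder for the on-device model
    for frame in ["a", "b", "c", "d", "hey"]:
        result = buffer.feed(frame, detector)
        print(frame, "->", result)
```

Frames "a" and "b" are gone by the time the trigger arrives, which is the "constantly overwritten" part of the description.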

          You can very easily monitor the traffic from your smart speaker to see if this is true. So far I’ve seen no evidence that this is no longer the common practice, though I’ll admit to not reading the article, so maybe this has changed recently.

          • uzay@infosec.pub
            11 months ago

            If they were to listen for a set of predefined product-related keywords as well, they could take note of that and send that info inconspicuously to their servers as well without sending any audio recordings. Doesn’t have to be as precise as voice command recognition either, it’s just ad targeting.

            Not saying they do that, but I believe they could.

        • Zeroc00l@sh.itjust.works
          11 months ago

          So, you and your friend were talking about a subject you obviously are interested in, likely spend heaps of time online searching about, commenting and following on social media and you’re surprised you got an ad for it? Bonkers.

        • Solemn@lemmy.dbzer0.com
          11 months ago

          It’s been published by multiple sources at this point that this happens because of detected proximity. Basically, they know who you hang out with based on where your phones are, and they know the entire search history of everyone you interact with. Based on this, they can build models to detect how likely you are to be interested in something your friend has looked at before.
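
As a toy illustration of the proximity idea, the sketch below links people who keep showing up at the same place at the same time, then suggests topics from one person's searches to the other. The data, names, and the "at least two shared sightings" rule are all invented; it is not how any particular ad network is known to do it.

```python
from collections import defaultdict


def colocated_pairs(pings, min_shared=2):
    """pings: iterable of (user, place, hour).

    Return the set of user pairs seen at the same place in the same hour
    at least min_shared times.
    """
    seen = defaultdict(set)
    for user, place, hour in pings:
        seen[(place, hour)].add(user)
    counts = defaultdict(int)
    for users in seen.values():
        ordered = sorted(users)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                counts[(first, second)] += 1
    return {pair for pair, n in counts.items() if n >= min_shared}


def suggested_interests(user, pairs, searches):
    """Topics searched by the user's inferred contacts but not by the user."""
    own = searches.get(user, set())
    suggestions = set()
    for first, second in pairs:
        if user == first:
            suggestions |= searches.get(second, set()) - own
        elif user == second:
            suggestions |= searches.get(first, set()) - own
    return suggestions


if __name__ == "__main__":
    pings = [
        ("ann", "cafe", 9), ("bob", "cafe", 9), ("cy", "cafe", 9),
        ("ann", "gym", 18), ("bob", "gym", 18),
    ]
    searches = {"ann": {"tents"}, "bob": {"kayaks", "tents"}}
    pairs = colocated_pairs(pings)
    print(pairs)
    print(suggested_interests("ann", pairs, searches))
```

Here ann gets kayak ads after hanging out with bob, without either phone hearing a word.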

          • NekuSoul@lemmy.nekusoul.de
            11 months ago

            Yup. For companies it’s much safer to connect the dots with the giant amount of available metadata in the background than risk facing a huge backlash when people analyze what data you’re actively collecting.

            Which is why people need to call out the tracking that’s actually happening in the real world a lot more, because I don’t really want my search-history leaked by proxy to people in my proximity either.

        • Chozo@kbin.social
          11 months ago

          Following an investigation by Bloomberg, the company admitted that it had been employing third-party contractors to transcribe the audio messages that users exchanged on its Messenger app.

          So not your IRL conversations.

          There is no indication that Facebook has used the information it collected to sell ads.

          So not for ads.

          It says the opposite of the things you claimed.

        • iAmTheTot@kbin.social
          11 months ago

          I generally don’t go out of my way to validate every crazy thing I read on the internet without any backing evidence supplied.

        • null@slrpnk.net
          11 months ago

          There is. And the parent commenter can use it to find and share evidence for their claim.

  • Snot Flickerman@lemmy.blahaj.zone
    11 months ago

    Services that “listen” for commands like Siri and Alexa have to be, by default, always listening, because otherwise they would not be able to hear the activation command. They are supposed to dump the excess data like anything that came before the activation command, but that’s just a promise. There are very few laws protecting you if that promise turns out to be a lie. The best you can get is likely small restitution through a class action lawsuit (if you didn’t waive your right to that by agreeing to the Terms of Service, which is more often than not the case now).

    Of fucking course they’re listening.

    • Serinus@lemmy.world
      11 months ago

      They’re not. Not yet. People are on edge and looking for this exact thing, which hasn’t happened yet. Meanwhile, they’ve already built a pretty damn good profile of you based on your search queries and mistyped URLs.

    • null@slrpnk.net
      11 months ago

      They are supposed to dump the excess data like anything that came before the activation command, but that’s just a promise.

      Where are they hiding that data locally, and how are they making it invisible in transit?