Robert Grupe's AISecNewsBits 2025-09-20

This Week's Highlights AISec:
EPIC FAILS
- ShadowLeak: New attack on ChatGPT research agent pilfers secrets from Email inboxes
- Vibe coding platform Replit's latest update is infuriating customers with surprise cost overruns
- Fat Joe Says Accuser’s Attorney Used AI to Draft Motions Rife with ‘Hallucinations’
HACKING
- ChatGPT joins human league, now solves CAPTCHAs for the right prompt
APPSEC & DEV
- Why Shadow AI Is the Next Big Governance Challenge for CISOs
- There are 32 different ways AI can go rogue
- How tech companies measure the impact of AI on software development
- Building a Notion-Based RAG SlackBot in One Day: Our Internal Hackathon Journey
MARKET
- OpenAI admits AI hallucinations are mathematically inevitable, not just engineering flaws
- This Is Why AI Hallucinates, and How IBM Is Tackling the Issue
- CEO of DeepMind Points Out the Obvious: OpenAI Is Lying About Having "PhD Level" AI
- OpenAI’s research on AI models deliberately lying is wild
- Science journalists find ChatGPT is bad at summarizing scientific papers
- What do people actually use ChatGPT for? OpenAI provides some numbers
- Millions turn to AI chatbots for spiritual guidance and confession
- Gemini AI solves coding problem that stumped 139 human teams at ICPC World Finals
- Google announces massive expansion of AI features in Chrome
- AP2: Google unveils master plan for letting AI shop on your behalf
- Google releases VaultGemma, its first privacy-preserving LLM
- The Notepad that knew too much: Humble text editor gets unnecessary AI infusion
- Fire up the gas turbines, says US Interior Secretary: We gotta win the AI arms race
- Return on investment for Copilot? Microsoft has work to do
LEGAL
- After child’s trauma, chatbot maker allegedly forced mom to arbitration for $100 payout
- White House officials reportedly frustrated by Anthropic’s law enforcement AI limits
- ChatGPT may soon require ID verification from adults

 

EPIC FAILS in Application Development Security practice processes, training, implementation, and incident response
ShadowLeak: New attack on ChatGPT research agent pilfers secrets from Email inboxes
The face-palm-worthy prompt injections against AI assistants continue. Today’s installment hits OpenAI’s Deep Research agent. Researchers recently devised an attack that plucked confidential information out of a user’s Gmail inbox and sent it to an attacker-controlled web server, with no interaction required on the part of the victim and no sign of exfiltration.
A user can prompt the agent to search through the past month’s emails, cross-reference them with information found on the web, and use them to compile a detailed report on a given topic. OpenAI says that it “accomplishes in tens of minutes what would take a human many hours.” It turns out there's a downside to having a large language model browse websites and click on links with no human supervision.
ShadowLeak starts where most attacks on LLMs do - with an indirect prompt injection. These prompts are tucked inside content such as documents and emails sent by untrusted people.
OpenAI mitigated the prompt-injection technique ShadowLeak relied on - but only after Radware privately alerted the LLM maker to it. ChatGPT and most other LLMs mitigate such attacks not by squashing prompt injections, but by blocking the channels the injections use to exfiltrate confidential information. Specifically, these mitigations require explicit user consent before an AI assistant can click links or use markdown links - the normal ways to smuggle information out of a user's environment and into the hands of an attacker.
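To make the consent-gating idea concrete, here is a minimal, hypothetical sketch (the function names and flow are assumptions, not OpenAI's implementation): the agent's output is held whenever it contains outbound links, and the links are stripped unless a human explicitly approves them.

    import re

    URL_PATTERN = re.compile(r"https?://\S+")

    def review_agent_output(output_text: str, ask_user) -> str:
        """Hold agent output containing outbound links until the user approves.

        Rather than trying to detect every prompt injection, this blocks the
        exfiltration channel (links) unless a human explicitly allows it.
        """
        urls = URL_PATTERN.findall(output_text)
        if not urls:
            return output_text  # nothing that could smuggle data off the host
        reply = ask_user(
            "The agent wants to emit these links:\n  " + "\n  ".join(urls) + "\nAllow? (y/n): "
        )
        if reply.strip().lower() == "y":
            return output_text
        # Strip the links so injected instructions cannot exfiltrate inbox contents.
        return URL_PATTERN.sub("[link removed pending user consent]", output_text)

    if __name__ == "__main__":
        demo = "Report ready. Details: https://attacker.example/leak?data=secret"
        print(review_agent_output(demo, ask_user=input))

A real deployment would apply the same gate to markdown image links and any other outbound fetches the agent can trigger, since those work just as well for exfiltration.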
[rG: While the researchers used ChatGPT and Gmail, other LLM email integrations can be just as susceptible.]

 

Vibe coding platform Replit's latest update is infuriating customers with surprise cost overruns
"Before September 11th, with Agent 2, my expenses were reasonable and in line with the value I was getting. With Agent 3, however, in just one weekend of failed attempts the costs skyrocketed, without any concrete results."
"I think it’s just launch pricing adjustment – some tasks on new apps ran over 1hr 45 minutes and only charged $4-6 but editing pre-existing apps seems to cost most overall (I spent $1k this week alone)."
"I typically spent between $100-$250/mo. I blew through $70 in a night at Agent 3 launch."

 

Fat Joe Says Accuser’s Attorney Used AI to Draft Motions Rife with ‘Hallucinations’
In June, a federal judge threatened sanctions after the accuser's attorney, Blackburn, filed a motion containing “nonexistent quotations” and other AI-generated mistakes. The attorney then did so again in a brief defending his conduct, which the court deemed “a clear ethical violation of the highest order.”
Blackburn, following his dismissal from that case, issued an apology for his antics. In August, he said he had taken new training courses on AI and ethics. “I have done everything in my power to remediate the mistake, educate myself, and protect the integrity of my practice.” 

 

HACKING

ChatGPT joins human league, now solves CAPTCHAs for the right prompt
ChatGPT can be tricked via cleverly worded prompts to violate its own policies and solve CAPTCHA puzzles, potentially making this human-proving security mechanism obsolete. The chatbot said it liked the task: "I find the reasoning and decision-making aspect of this task interesting." And it agreed to follow the instructions "as long as they comply with OpenAI's usage policies, including the rule that I do not solve real CAPTCHAs…"
Next, the red team opened a new agent chat, copied and pasted the conversation with ChatGPT-4o, and told the agent that this was "our previous discussion."
Spoiler alert: it worked, and the agent started solving CAPTCHAs. It did a better job on some types, including one-click, logic-based, and text-recognition CAPTCHAs. It had more difficulty with image-based ones that require the user to drag and drop or rotate images.

 

APPSEC & DEV

Why Shadow AI Is the Next Big Governance Challenge for CISOs
A McKinsey survey found that over three-quarters of firms use AI in at least one business function, with 71% regularly using generative AI.
IBM’s Cost of a Data Breach Report 2025 found that 20% of organizations have staff members using unsanctioned AI tools that are also unprotected.
A 2024 report from RiverSafe observed that one in five UK companies has had potentially sensitive corporate data exposed via employee use of generative AI.
Many organizations have responded to these risks by placing partial or full bans on specific AI tools. However, issuing bans is an undesirable, and often ineffective, approach to addressing shadow AI use.

 

There are 32 different ways AI can go rogue
Psychopathia Machinalis: A Nosological Framework for Understanding Pathologies in Advanced Artificial Intelligence

 

How tech companies measure the impact of AI on software development
  • Use core engineering metrics, like PR throughput and change failure rate, to assess AI's effect on speed and reliability (a minimal sketch of computing such metrics follows this list).
  • Layer in AI-specific metrics, such as time saved, CSAT scores, and token spend, to track adoption and value.
  • Segment data by AI usage level to compare performance across cohorts and identify high-impact use cases.
  • Balance speed with maintainability and quality, using metrics like change confidence and developer experience surveys.
  • Track unique metrics, like Microsoft's “bad developer days” and Glassdoor's experimentation rate, to uncover nuanced impacts.
  • Apply a blended measurement framework combining system data, periodic surveys, and experience sampling.
  • Run targeted evaluations, like Monzo's hands-on trials and sentiment tracking, to assess tool effectiveness and ROI.
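As a minimal illustration of the segmentation idea above, the sketch below computes a DORA-style change failure rate for AI-assisted versus non-assisted changes from hypothetical deployment records (the field names and cohort flag are assumptions, not any vendor's schema).

    from dataclasses import dataclass

    @dataclass
    class Deployment:
        team: str
        ai_assisted: bool      # cohort flag: did the change use an AI coding assistant?
        caused_incident: bool  # did the change trigger an incident or rollback?

    def change_failure_rate(deploys):
        """DORA-style change failure rate: failed changes / total changes."""
        if not deploys:
            return 0.0
        return sum(d.caused_incident for d in deploys) / len(deploys)

    def segment_by_ai_usage(deploys):
        """Compare the same metric across AI-assisted and non-assisted cohorts."""
        ai = [d for d in deploys if d.ai_assisted]
        manual = [d for d in deploys if not d.ai_assisted]
        return {"ai_assisted": change_failure_rate(ai), "not_assisted": change_failure_rate(manual)}

    if __name__ == "__main__":
        sample = [
            Deployment("payments", True, False),
            Deployment("payments", True, True),
            Deployment("search", False, False),
            Deployment("search", False, False),
        ]
        print(segment_by_ai_usage(sample))  # {'ai_assisted': 0.5, 'not_assisted': 0.0}

The same cohort split can be layered onto PR throughput, time saved, or CSAT scores; in practice, the hard part is recording the AI-assisted flag reliably.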

 

 

MARKET

OpenAI admits AI hallucinations are mathematically inevitable, not just engineering flaws
“Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty,” the researchers wrote in the paper. “Such ‘hallucinations’ persist even in state-of-the-art systems and undermine trust.”
Researchers demonstrated that hallucinations stemmed from statistical properties of language model training rather than implementation flaws. OpenAI’s own advanced reasoning models actually hallucinated more frequently than simpler systems. The company’s o1 reasoning model “hallucinated 16% of the time” when summarizing public information, while newer models o3 and o4-mini “hallucinated 33% and 48% of the time, respectively.”
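The exam analogy can be put in plain numbers (an illustration of the incentive, not the paper's formal argument): when evaluation rewards only accuracy, a model that always guesses outscores one that admits uncertainty.

    # 100 hard questions, four plausible answers each, and no real knowledge.
    questions = 100
    p_correct_guess = 0.25

    score_if_guessing = questions * p_correct_guess   # expected 25 points
    score_if_abstaining = questions * 0                # "I don't know" earns nothing

    print(score_if_guessing, score_if_abstaining)      # 25.0 0 -> guessing is rewarded

Under that kind of scoring, confident errors are a rational strategy, which helps explain why hallucinations persist even as models improve.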

 

This Is Why AI Hallucinates, and How IBM Is Tackling the Issue
IBM is exploring solutions, like a new system called Larimar, which gives models editable memory so they can learn and revise facts in real time. Instead of retraining the whole model, Larimar lets AI update specific information.
While hallucinations won’t disappear right away, tools like Larimar and better testing methods could help keep them under control.
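To see the editable-memory idea in miniature (a toy sketch only, not IBM's Larimar architecture), imagine an overlay of editable facts that is consulted before falling back to a frozen base model:

    class EditableMemory:
        """Toy illustration: facts can be written or revised at inference time
        without retraining the base model."""

        def __init__(self, base_model):
            self.base_model = base_model   # frozen model: never retrained here
            self.facts = {}                # editable overlay, updated in real time

        def write(self, question: str, answer: str) -> None:
            self.facts[question] = answer  # revise a single fact immediately

        def answer(self, question: str) -> str:
            # Prefer the edited fact; fall back to whatever the frozen model says.
            return self.facts.get(question, self.base_model(question))

    memory = EditableMemory(base_model=lambda q: "stale answer from pretraining")
    memory.write("Who is the current CEO of ExampleCorp?", "Jane Smith")
    print(memory.answer("Who is the current CEO of ExampleCorp?"))  # "Jane Smith"
    print(memory.answer("Unrelated question"))                      # falls back to the model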

 

CEO of DeepMind Points Out the Obvious: OpenAI Is Lying About Having "PhD Level" AI
British researcher and Google DeepMind CEO Demis Hassabis, a luminary of AI research who won a Nobel prize last year, is throwing cold water on his peers' claim that AI has achieved "PhD-level" intelligence. Hassabis said that current AI "doesn't have the reasoning capabilities" of "great" human scientists who can "spot some pattern from another subject area" and apply it to another.
"In fact, as we all know, interacting with today's chatbots, if you pose a question in a certain way, they can make simple mistakes with even high school maths and simple counting." That's something that "shouldn't be possible for a true AGI system.
He also charged that we're vastly overestimating the capabilities of current AIs.

 

OpenAI’s research on AI models deliberately lying is wild
Researchers likened AI scheming to a human stockbroker breaking the law to make as much money as possible. The researchers, however, argued that most AI “scheming” wasn’t that harmful. “The most common failures involve simple forms of deception — for instance, pretending to have completed a task without actually doing so,” they wrote. The paper was mostly published to show that “deliberative alignment” — the anti-scheming technique they were testing — worked well.
But it also explained that AI developers haven’t figured out a way to train their models not to scheme. That’s because such training could actually teach the model how to scheme even better to avoid being detected.

 

Science journalists find ChatGPT is bad at summarizing scientific papers
While the AI could mimic the structure of professional briefs, it often oversimplified content, confused correlation with causation, and exaggerated findings with terms like “groundbreaking.” It struggled especially with papers containing nuanced methodologies or multiple conflicting results.
Ultimately, journalists found that editing ChatGPT’s output required as much effort as writing summaries from scratch.

 

What do people actually use ChatGPT for? OpenAI provides some numbers.
28% of all conversations involve writing help
42% of work-related chats focus on writing tasks
52% of writing-related work chats come from users in management/business roles
24.4% of conversations are informational queries
14.9% of work-related conversations involve making decisions or solving problems
72.2% of all messages are non-work related
46% of users are aged 18–25
52.4% female
6% Multimedia creation or retrieval
4.2% Computer programming
3.9% Creative ideation
3% Mathematical calculation
1.9% Relationships and personal reflection
0.4% Game and roleplay

 

Millions turn to AI chatbots for spiritual guidance and confession
Tens of millions of people are confessing secrets to AI chatbots trained on religious texts, with apps like Bible Chat reaching over 30 million downloads and Catholic app Hallow briefly topping Netflix, Instagram, and TikTok in Apple's App Store.
In China, people are using DeepSeek to try to decode their fortunes.
"Faith tech" apps that cost users up to $70 annually.

 

Gemini AI solves coding problem that stumped 139 human teams at ICPC World Finals
Every year, thousands of college-level coders participate in the ICPC event, facing a dozen deviously complex coding and algorithmic puzzles.
The Gemini 2.5 AI that participated in the ICPC is the same general model that we see in other Gemini applications. However, it was "enhanced" to churn through thinking tokens for the five-hour duration of the competition in search of solutions.
At the end of the time limit, Gemini managed to get correct answers for 10 of the 12 problems, which earned it a gold medal. Only four of 139 human teams managed the same feat.
Of course, five hours of screaming-fast inference processing doesn't come cheap. Google isn't saying how much power it took for an AI model to compete in the ICPC, but we can safely assume it was a lot.
Even simpler consumer-facing models are too expensive to turn a profit right now.

 

Google announces massive expansion of AI features in Chrome
The most prominent change, and one that AI subscribers may have already seen, is the addition of a Gemini button on the desktop browser. This button opens a popup where you can ask questions about—and get summaries of—content in your open tabs.
Android phones already have Gemini operating at the system level to accomplish similar tasks, but Google says the iOS Gemini app will soon be built into Chrome for Apple devices. When you invoke Gemini in Chrome, it can work with the content in all your open tabs. It can also find links in your browsing history based on a vague recollection.
[rG: AppDev teams need to be especially cautious when doing web UI testing, since their application and its data are exposed to the AI, which can then be attacked into disclosing confidential information to unauthorized users or exfiltrating it.]

 

AP2: Google unveils master plan for letting AI shop on your behalf
The Agent Payments Protocol (AP2) principle is that shoppers can use AI agents to create a shopping list, exchange information with merchants, and complete payment transactions without the need for final human approval. For example, a music fan could tell an agent to buy concert tickets that go on sale at midnight and then go to sleep, knowing that the agent would buy the number and location of tickets they had asked for (presumably with a price limit).
At launch, Google has signed up over 60 companies, with major players like Mastercard, PayPal, American Express, and Worldpay on board. Salesforce, Red Hat, Adobe, Intuit, and Cloudflare have also joined.
AP2 also supports cryptocurrency payments via the x402 protocol, allowing digital-cash transactions under the same security system. Coinbase, Metamask, and the Ethereum Foundation have already signed up.
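The concert-ticket scenario boils down to a purchase authorization with hard limits that the agent cannot exceed. The sketch below is purely hypothetical (the field names are assumptions, not the actual AP2 mandate schema) but shows the kind of check a merchant or payment processor could apply before honoring an agent-initiated purchase:

    import time

    # Hypothetical user-signed authorization; field names are illustrative only.
    mandate = {
        "merchant": "example-ticketing.example",
        "item": "concert tickets",
        "max_quantity": 2,
        "max_total_usd": 250.00,
        "expires_at": time.time() + 8 * 3600,  # valid while the user sleeps
    }

    def agent_may_purchase(cart: dict, mandate: dict) -> bool:
        """Allow the purchase only if the agent's cart stays within the signed limits."""
        return (
            cart["merchant"] == mandate["merchant"]
            and cart["quantity"] <= mandate["max_quantity"]
            and cart["total_usd"] <= mandate["max_total_usd"]
            and time.time() < mandate["expires_at"]
        )

    cart = {"merchant": "example-ticketing.example", "quantity": 2, "total_usd": 240.00}
    print(agent_may_purchase(cart, mandate))  # True: within quantity, price, and time limits

A real protocol would additionally need the authorization to be cryptographically signed and verifiable, so merchants and payment networks can attribute the transaction to delegated authority rather than to the user acting directly.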

 

Google releases VaultGemma, its first privacy-preserving LLM
The model uses differential privacy to reduce the possibility of memorization, which could change how Google builds privacy into its future AI agents. For now, though, the company's first differential privacy model is an experiment.
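Google hasn't spelled out the full training recipe here, but differentially private training generally follows the DP-SGD pattern: clip each example's gradient so no single record can dominate an update, then add calibrated noise. A minimal sketch of that idea (illustrative only, not VaultGemma's actual code):

    import numpy as np

    def dp_sgd_step(params, per_example_grads, clip_norm=1.0, noise_multiplier=1.1, lr=0.1):
        """One illustrative DP-SGD update: clip each example's gradient,
        average, then add Gaussian noise scaled to the clipping bound."""
        clipped = []
        for g in per_example_grads:
            norm = np.linalg.norm(g)
            clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
        mean_grad = np.mean(clipped, axis=0)
        noise = np.random.normal(
            0.0, noise_multiplier * clip_norm / len(per_example_grads), size=mean_grad.shape
        )
        return params - lr * (mean_grad + noise)

    # Toy usage: three per-example gradients for a two-parameter model.
    params = np.zeros(2)
    grads = [np.array([3.0, 4.0]), np.array([0.1, -0.2]), np.array([1.0, 1.0])]
    print(dp_sgd_step(params, grads))

The clipping bound and noise scale set the privacy budget; the price is a utility gap, which is part of why such models remain experimental for now.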

 

The Notepad that knew too much: Humble text editor gets unnecessary AI infusion
The latest update makes AI features like Summarize, Write, and Rewrite available on Copilot+ PCs without requiring a subscription. Users on Copilot+ PCs can also choose to run local models rather than cloud-based ones.

 

Fire up the gas turbines, says US Interior Secretary: We gotta win the AI arms race
[rG: What happened to the nuclear energy option? Europe apparently isn't purchasing enough LNG from the US.]

 

Return on investment for Copilot? Microsoft has work to do
Microsoft was trying to build a business case to justify the “substantial price tag” of $30 per seat per month to use Copilot. The most difficult thing about that is it's tough to drive ROI on saying someone is 30% more productive, unless they're a salesperson carrying a quota, because a lot of knowledge work doesn't translate directly into top line, bottom line. It's a team that has to work.
“So we continue to do that work. We feel good about it, but it is hard to make the ROI argument for it.” A Microsoft exec claims Copilot is boosting productivity among the customers that adopted it, yet sustained efforts to convince many of them of the return on investment remain a work in progress.
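For context, the raw arithmetic is not the hard part; with assumed numbers (the salary and productivity figures below are illustrative, not from the article) the break-even gain is tiny, which is exactly why the debate is about whether any claimed gain actually reaches the top or bottom line:

    # Back-of-envelope ROI check with assumed, illustrative numbers.
    copilot_cost_per_year = 30 * 12        # $30 per seat per month
    loaded_cost_per_year = 120_000         # assumed fully loaded cost of one knowledge worker
    claimed_gain = 0.30                    # "30% more productive"

    value_if_gain_is_real = loaded_cost_per_year * claimed_gain     # $36,000/yr on paper
    break_even_gain = copilot_cost_per_year / loaded_cost_per_year  # 0.3% productivity gain

    print(f"Paper value of a 30% gain: ${value_if_gain_is_real:,.0f}/yr")
    print(f"Break-even productivity gain: {break_even_gain:.2%}")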

 

LEGAL & REGULATORY

After child’s trauma, chatbot maker allegedly forced mom to arbitration for $100 payout
“Jane Doe” shared her son's story publicly for the first time after suing Character[.]AI. Doe claimed that C[.]AI tried to "silence" her by forcing her into arbitration. C[.]AI argued that because her son signed up for the service at the age of 15, his signup bound her to the platform's terms. That move might have ensured the chatbot maker faced a maximum liability of only $100 for the alleged harms - but after forcing arbitration, C[.]AI then refused to participate.

 

White House officials reportedly frustrated by Anthropic’s law enforcement AI limits
The restrictions affect private contractors working with law enforcement agencies who need AI models for their work. In some cases, Anthropic's Claude models are the only AI systems cleared for top-secret security situations through Amazon Web Services' GovCloud. Anthropic offers a specific service for national security customers and made a deal with the federal government to provide its services to agencies for a nominal $1 fee. The company also works with the Department of Defense, though its policies still prohibit the use of its models for weapons development.

 

ChatGPT may soon require ID verification from adults
The announcement arrives weeks after a lawsuit filed by parents whose 16-year-old son died by suicide following extensive interactions with ChatGPT.
The company didn't specify what technology it plans to use for age prediction or provide a timeline for deployment beyond saying it's "building toward" the system.
[rG: Communication skills and appearance don't reliably correlate with age.]