What does Google Gemini do with your data? It’s complicated • The Register

Google, after facing accusations about its AI model ingesting private files, says Gemini can read and summarize this type of sensitive data in real time – but only with Workspace users’ express permission. 

A Google comms manager has assured The Register that “neither that summary nor the doc itself is stored” by Gemini for future training or processing.

This dust-up started when Kevin Bankston, a senior advisor on AI governance at the Center for Democracy & Technology, xeeted that he opened his tax return in Google Docs, and Gemini, without his permission, immediately summarized it.

“So… Gemini is automatically ingesting even the private docs I open in Google Docs? WTF, guys. I didn’t ask for this,” Bankston said. “Now I have to go find new settings I was never told about to turn this crap off.”

This, as Bankston pointed out, is especially worrisome given Google’s track record of leaking private conversations with its generative AI tools in public search results.

In a very long X missive, Bankston detailed his difficulties and frustrations with finding the right settings to turn off this so-called “feature.” He even asked Gemini itself where to find its privacy settings, but that didn’t work either.

Meanwhile, it appeared the AI had access to more than just Bankston’s tax documents. “Doing a little more testing, it appears this is happening with any PDF of mine that I open from Drive,” he said. “Thankfully not (yet?) automatically happening with Google Docs.”

It took nearly a week, and conversations with real people at Google, before Bankston, who teaches AI law at Georgetown University, was able to solve the issue – and that doesn’t bode well for normal users, who likely have neither Bankston’s technical knowledge nor the visibility to get Google’s attention when things go wrong.

A year ago, Bankston signed up to participate in Google’s Workspace Labs. This is an early-access program that lets users test certain generative AI features and provide feedback. One of these Gemini features in Workspace is a side panel that summarizes documents.

Essentially what happened here, according to Google and Bankston, is that Bankston was already using Gemini via the side panel in Google Drive for PDFs. Once a user activates this side panel, it summarizes, in real time, any document the user opens. Plus, it automatically summarizes every PDF thereafter until the user closes the side panel.

“That happens because once you start using Gemini, we have settings that respect user preferences and carry over to subsequent user sessions within our products,” a Google spokesperson told The Register.

The spokesperson confirmed that the Chocolate Factory had some good conversations with Bankston over the past week, and denied that its AI model ingests users’ Workspace data.

“One big misperception in the original thread we talked through is the notion that data ingestion is occurring. It’s not,” the spokesperson added. “When the feature is enabled, the content from an open doc can be used to generate a summary in real time, but neither that summary nor the doc itself is stored.”

Bankston declined to comment beyond his series of xeets, which began July 10 and ended July 16. In some of the later ones, he also noted that Google isn’t keeping the data that Gemini summarizes, which, he said, “is both good and bad from a privacy perspective.”

The confusion also has to do with the Gemini chatbot and Gemini integrations being two separate things. As Bankston described it: “Gemini app extensions only control whether Gemini chat can access data *from* other apps like Workspace or YouTube or whatever to personalize your chat experience. Confusingly it doesn’t control Gemini features *in* those apps.”

The good news, he explained, is that this means all of those summaries created by users who never closed out the side panel aren’t preserved in logs.

But on the other hand, Bankston added, “that also makes the confusion between Gemini in-other-apps and Gemini chat app even greater, since they are now behaving and treating data very differently.” ®