Continuing our discussion of Google Takeout data (available to download for your personal Google account or obtainable via warrant), we’re going to take a look at voice searches done on a smartphone.

If a user speaks into the microphone in the Google widget on their phone or in the search bar, or uses the Assistant to create a reminder or search for nearby locations, the data is kept in a text format that is available in Takeout data.

The artifact parsed by my script is found within the Takeout>My Activity>Search>MyActivity.html file. This file should not be confused with what appears to be an older artifact found in Takeout>My Activity>Voice and Audio>MyActivity.html. (Interestingly enough, there are some voice recordings in this folder.)

The search activity housed in the HTML file Takeout>My Activity>Search>MyActivity.html is a collection of all Google Assistant activity, including incoming calls, tips, pop-up notifications and voice searches. Each activity has a correlated timestamp.

The download from Google Takeout includes an HTML file which can be parsed with Python. Unfortunately, the HTML does not have tags that make separating the voice search data easy. However, searching the data for entries beginning with the word “Said” and then reporting out the following three lines will print most entries cleanly, along with their associated timestamps.
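To make the “Said” approach concrete, here is a minimal sketch using only the Python standard library. It is not the script linked in this post, just an illustration of the idea: flatten the HTML into text lines, then report each line that begins with “Said” plus the three lines after it. The class and function names and the sample markup are assumptions made up for the example; real Takeout HTML is laid out differently and may need adjustments.

```python
from html.parser import HTMLParser


class TextLines(HTMLParser):
    """Collect every non-empty text node of an HTML document as one line."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.lines.append(text)


def voice_searches(html, following=3):
    """Return each line starting with "Said" plus the next `following` lines."""
    parser = TextLines()
    parser.feed(html)
    parser.close()
    lines = parser.lines
    return [
        lines[i:i + 1 + following]
        for i, line in enumerate(lines)
        if line.startswith("Said")
    ]


def voice_searches_from_file(path):
    """Convenience wrapper: read the HTML file at `path` and parse it."""
    with open(path, encoding="utf-8") as handle:
        return voice_searches(handle.read())


# Hypothetical markup for illustration only, not copied from a real Takeout file.
SAMPLE = """
<div><div>Assistant</div>
<div>Said <a href="#">coffee shops near me</a><br>Feb 20, 2022, 9:15:02 AM EST</div>
<div>Products:</div><div>Assistant</div></div>
"""

for entry in voice_searches(SAMPLE):
    print(" | ".join(entry))
```

With a real export, you would call `voice_searches_from_file()` on the MyActivity.html file from the Search folder. Because the match is a plain prefix check, an unrelated line that happens to start with “Said” would also be reported, which is one reason the output is clean for most entries rather than all of them.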

The Python script will print all voice searches and their timestamps. Code is found here: https://github.com/DFIRLore/GTakeoutsVoice
