pdgmail: new tool for gmail memory forensics

I saw John McCash’s article on GMail forensics … I was hooked and created pdgmail.

I’ve been messing around with the volatile toolkit for memory forensics and thought I’d try my hand at GMail memory forensics. As John says, the GMail data isn’t supposed to end up on disk anyway, so maybe it’s in the browser memory?

Boy is it!

I used the pd dump tool from www.trapkit.de and tested against my meager GMail account on Windows XP and 2000 with IE 6, IE 7 and Firefox 3. In all cases I was able to retrieve contact data, last login times and IP addresses, basic email headers and email bodies. Even if the browser was ‘logged out’ of GMail, it still retained this data, including messages that were never opened and contacts that were never used. Simply loading the GMail UI puts all of this data into the browser’s memory.

How to use?

The first step is to gather the browser memory. Here’s a sample pd session, where 6352 is the PID of a running IE instance:

E:\Program Files\tools>pd -p 6352 > 6352.dump
pd, version 1.1 tk 2006, www.trapkit.de

Dump finished.

E:\Program Files\tools>dir
Directory of E:\Program Files\tools

09/27/2008 06:57 PM 117,908,254 6352.dump

Whoa, big file! But this is forensics; we don’t scare at large data sets. To use the pdgmail tool, run this memory dump through strings -el (the -el option selects 16-bit little-endian characters) to create a strings file, then either cat that file through pdgmail or run pdgmail with the -f flag specifying your strings filename. For example:

strings -el 6352.dump | pdgmail | less
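
If the strings utility isn’t handy, the extraction step can be approximated in Python. This is only a rough Python 3 sketch of what strings -el does, not part of pdgmail: the function name utf16le_strings and the minimum length of 4 are assumptions for illustration, and it only catches printable ASCII characters stored as 16-bit little-endian values.

```python
import re


def utf16le_strings(data, min_len=4):
    """Yield runs of printable ASCII stored as 16-bit little-endian characters.

    Hypothetical helper: approximates `strings -el`, it is not pdgmail code.
    """
    # One character = one printable ASCII byte followed by a zero byte.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("utf-16-le")


# Made-up bytes standing in for a slice of a process dump.
sample = (
    b"\x01\x02"
    + "contact: jeff".encode("utf-16-le")
    + b"\xff\x00\x00"
    + "ab".encode("utf-16-le")  # too short, skipped
)
print(list(utf16le_strings(sample)))
```

For a real dump you would read the file in binary mode and pass the bytes in. A 100 MB+ dump fits in memory on most machines, but a chunked reader would be kinder to larger images.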

Best mileage will be with Python 2.4.4 or 2.5 on Linux. I haven’t tested it below those versions or on Windows.

It looks for these things:

  • contacts
  • last access records
  • GMail account names
  • message headers
  • message bodies

Contacts show up as:
contact: name: "jeff bryner" email: "myemailaddress@gmail.com"

Last access records show the two most recent logins and appear as:
last access: "14 hours ago" from IP "10.15.26.8", most recent access Tue Oct 14 10:57:53 2008 from IP "12.9.4.238"
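
Because the output is plain text on stdout, it is easy to post-process. The following Python 3 sketch turns the contact and last access lines shown above into dictionaries. The patterns are inferred only from those two sample lines, and the names parse_line, when, ip, recent_when and recent_ip are made up for illustration, so treat it as a starting point rather than a description of pdgmail’s output format.

```python
import re

# Patterns inferred from the sample lines in this post (an assumption).
# The closing quote after the email address is treated as optional.
CONTACT_RE = re.compile(
    r'^contact: name: "(?P<name>[^"]*)" email: "(?P<email>[^"]*)"?'
)
LAST_ACCESS_RE = re.compile(
    r'^last access: "(?P<when>[^"]*)" from IP "(?P<ip>[^"]*)", '
    r'most recent access (?P<recent_when>.+?) from IP "(?P<recent_ip>[^"]*)"'
)


def parse_line(line):
    """Return (kind, fields) for a recognised line, or None otherwise."""
    line = line.strip()
    for kind, regex in (("contact", CONTACT_RE), ("last_access", LAST_ACCESS_RE)):
        match = regex.match(line)
        if match:
            return kind, match.groupdict()
    return None


samples = [
    'contact: name: "jeff bryner" email: "myemailaddress@gmail.com"',
    'last access: "14 hours ago" from IP "10.15.26.8", most recent access '
    'Tue Oct 14 10:57:53 2008 from IP "12.9.4.238"',
]
for sample_line in samples:
    print(parse_line(sample_line))
```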

Email messages are the messiest, mostly because memory artifacts don’t always conform to API standards, so picking them out is a best guess.

Using the most familiar email of all, headers show up as:
message header: ["ms","113b0d734737dec4","",4,"Gmail Team ","Gmail Team","mail-noreply@google.com",1184082900000,"Did you know that GMail was voted #2 in PC World's Top 100 products of 2005, ...",["^all","^i"]
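
A header line like this one is close to a JSON array, so it can be picked apart with a few lines of Python 3. This is a sketch under assumptions, not pdgmail’s own logic: the field positions are guessed from the single sample above, the number is assumed to be milliseconds since the Unix epoch, and parse_header and its dictionary keys are hypothetical names.

```python
import json
from datetime import datetime, timezone

PREFIX = "message header: "


def parse_header(line):
    """Parse a 'message header:' line into a dict, or return None."""
    if not line.startswith(PREFIX):
        return None
    raw = line[len(PREFIX):].strip()
    # Memory artifacts may be cut short; the sample lacks the closing
    # outer bracket, so try the text as-is and with one "]" appended.
    for candidate in (raw, raw + "]"):
        try:
            fields = json.loads(candidate)
            break
        except ValueError:
            continue
    else:
        return None
    if not isinstance(fields, list) or len(fields) < 10:
        return None
    return {
        "message_id": fields[1],
        "sender_name": fields[5],
        "sender_email": fields[6],
        # Assumption: milliseconds since the Unix epoch.
        "sent_utc": datetime.fromtimestamp(fields[7] / 1000, tz=timezone.utc),
        "snippet": fields[8],
        "labels": fields[9],
    }


sample = (
    'message header: ["ms","113b0d734737dec4","",4,"Gmail Team ","Gmail Team",'
    '"mail-noreply@google.com",1184082900000,'
    '"Did you know that GMail was voted #2 in PC World\'s Top 100 products of 2005, ...",'
    '["^all","^i"]'
)
header = parse_header(sample)
print(header["sender_email"], header["sent_utc"].isoformat())
```

Lines that don’t parse return None, which mirrors the “parse if possible” spirit: anything too mangled is best left for manual review.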

Message bodies are parsed to turn the unicode into proper HTML:

Did you know that GMail was voted #2 in PC World’s Top
100 products of 2005
, right after Firefox? Why wouldn’t you want to
switch? Well, because it can be a pain to switch to a new email
address. We know.

etc…
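
As an illustration of the unicode handling, here is a minimal Python 3 sketch. It assumes the body text sits in memory with JavaScript-style \uXXXX escapes (for example \u003c for “<”); that encoding, the helper name unescape_body and the sample fragment are assumptions for the example, not taken from pdgmail’s source.

```python
import re

# Matches a literal backslash, "u", and four hex digits.
_ESCAPE_RE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unescape_body(text):
    """Replace literal \\uXXXX sequences with the characters they encode.

    Hypothetical helper; surrogate pairs are not combined.
    """
    return _ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Made-up fragment in the assumed escaped form.
raw = r"Did you know\u003cbr\u003ethat GMail \u003cb\u003ewas voted\u003c/b\u003e"
print(unescape_body(raw))
```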

Nothing fancy, just some glorified regex and unicode handling dumped to stdout. It parses if possible; otherwise it just spits out a familiar line. Feel free to send me patches, tweak, rewrite, etc. Hope it helps someone!

Jeff Bryner, GCFA Gold #137, also holds the CISSP and GCIH certifications, occasionally teaches for SANS and performs forensics, intrusion analysis, and security architecture work on a daily basis.

6 Comments

  1. robtlee
    Posted October 20, 2008 at 12:28 pm

    Would this also work on a full memory dump from windows created by mdd or win32dd?

  2. jeffbryner
    Posted October 20, 2008 at 1:24 pm

    Yup, just tested with mdd_1.3.exe and it shows the same info.

    It’s a bit slower, repeats the findings quite a bit, and hits false positives for the message bodies, but it works.

    I’ll see if I can refine the message body regex to weed out the false positives.

  3. johnhsawyer
    Posted November 6, 2008 at 6:48 pm

    Jeff, excellent job! I’ve successfully tested it with memory dumps from mdd, win32dd, Memoryze, a raw export of memory from a winen-acquired dump and memory acquisitions using F-Response 2.03 beta.

    Rob, when working with a full memory dump in any of the formats that Volatility or Memoryze supports, you could skip the headache of repeats and false positives by first extracting the process’ full memory space and then searching just that.

    One cool thing about using the full memory dump, though, is that if a web browser isn’t running when you do the memory dump, you’ll still find artifacts floating in memory even though you can’t extract the process. Volatility’s psscan/2 can show that a browser was running and has exited to help clue you into that.

  4. jeffbryner
    Posted November 8, 2008 at 2:19 pm

    OK, updated the tool to work better in large memory dumps where there is more stuff that smells like a message body. Same link, but now it’s version 2.0.

    I also added a command line option to skip searching for message bodies (-b or --bodies). Use it if you get too many false positives on message bodies.

  5. roayers
    Posted January 22, 2009 at 4:14 pm

    Has anyone used Encase Enterprise to capture the running executables and RAM? I tried this today, then ran an enscript to convert the executable and the memory capture to separate dd format files. I then tried the above process and here are the results. I used Yahoo email prior to the capture of the RAM and the iexplore.exe executable, so I know Yahoo content existed. I also confirmed this from within Encase during the examination of those files.

    Any thoughts?

    python pdymail -f ramstrings.txt

  6. roayers
    Posted January 22, 2009 at 4:17 pm

    Here is the output:

    python pdymail -f ramstrings.txt


