Volatility memory analysis notebook by Eric Hutchins

The Volatility Framework is a powerful and flexible library to analyze volatile memory (e.g., memory dumps). The primary way analysts use this framework is to run the vol.py script from the terminal with various plugins and parameters, printing the results of the command to stdout.

$ python vol.py -f ds_fuzz_hidden_proc.img --profile=WinXPSP2x86 psscan

Offset(P)  Name                PID   PPID PDB        Time created         Time exited         
---------- ---------------- ------ ------ ---------- -------------------- --------------------
0x0181b748 alg.exe             992    660 0x08140260 2008-11-15 23:43:25                      
0x01843b28 wuauclt.exe        1372   1064 0x08140180 2008-11-26 07:39:38

This is a quintessential use case for IPython Notebook: a place to document commands, the results of those commands, and marked-up descriptions that explain the methodology and significance. In other words: your full analysis! Furthermore, as we show at the end, rich inline images also make IPython Notebook a fantastic way to guide and document memory analysis.

The downside is that, as the developers admit, using the tools as a library is not perfect. Be prepared to read some code.

Although it's possible to use Volatility as a library, we hope to support it better in the future

It is, however, very exciting to see this on the roadmap:

Interactive IPython shell

Notebook Prerequisites

Modules

Data

ds_fuzz_hidden_proc.img -- Sample memory image from the Sample Memory Images directory. (Note: you'll have to decompress the file with bunzip2 before using it.)
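
The decompression step can also be done from Python. This is a minimal sketch, assuming the download is named ds_fuzz_hidden_proc.img.bz2 (a hypothetical filename; use whatever the archive is actually called). It is written for Python 3, unlike the Python 2 cells in this notebook, and it streams in chunks so a large image never has to fit in memory.

```python
import bz2
import shutil


def decompress_bz2(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress a .bz2 file into dst_path and return dst_path."""
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj reads and writes chunk_size bytes at a time
        shutil.copyfileobj(src, dst, chunk_size)
    return dst_path


# Hypothetical filenames, for illustration only:
# decompress_bz2("ds_fuzz_hidden_proc.img.bz2", "ds_fuzz_hidden_proc.img")
```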

In [1]:

from cStringIO import StringIO

In [2]:

# Imports following example from
# https://code.google.com/p/volatility/wiki/BasicUsage21#Using_Volatility_as_a_Library
import volatility.conf as conf
import volatility.registry as registry
import volatility.commands as commands
import volatility.addrspace as addrspace

import volatility.utils as utils
import volatility.win32.network as network

import volatility.plugins.taskmods as taskmods
import volatility.plugins.vadinfo as vadinfo

registry.PluginImporter()
config = conf.ConfObject()

registry.register_global_options(config, commands.Command)
registry.register_global_options(config, addrspace.BaseAddressSpace)

# You can print the cmds dictionary to see list of available plugins
# These are the same commands you would specify to the command line vol.py script
cmds = registry.get_plugin_classes(commands.Command, lower = True)

In [3]:

# These parameters simulate the command line settings "--profile" and "-f" respectively
config.PROFILE = "WinXPSP2x86"
config.LOCATION = "file:///c:/ds_fuzz_hidden_proc.img"
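
The LOCATION value above is a file URL rather than a bare path. If you would rather derive it from a path than type it by hand, the standard library can build one. This is a sketch in Python 3 (the notebook cells are Python 2); image_location is a hypothetical helper name, and the POSIX path is made up for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath


def image_location(path, windows=False):
    """Return a file:// URI for an absolute path to a memory image."""
    cls = PureWindowsPath if windows else PurePosixPath
    # as_uri() raises ValueError if the path is not absolute
    return cls(path).as_uri()


print(image_location("c:/ds_fuzz_hidden_proc.img", windows=True))
print(image_location("/cases/ds_fuzz_hidden_proc.img"))
```

The first call reproduces the string assigned to config.LOCATION above.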

Iterate processes

The PSScan module in the Volatility Framework scans physical memory for EPROCESS allocations. This method can discover processes that are hidden or otherwise excluded from the normal process tree.

In [4]:

from volatility.plugins.filescan import PSScan
import pandas as pd

Here is the common way to invoke plugins. First, instantiate the plugin by passing it the config. Each plugin should provide a calculate method and a render_[something] method; the most common renderer is render_text. Since Volatility is primarily intended to be run stand-alone from a terminal, the renderer expects to write its output to a file-like object (or stdout). In our use case, we direct this output into an in-memory buffer using the StringIO library.

This approach should work for most Volatility plugins. You will have to check the code for each module to see how to populate additional config parameters as needed.
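
The capture pattern itself does not depend on Volatility. The sketch below shows it in isolation, in Python 3 with io.StringIO standing in for cStringIO. The render_text function here is a hypothetical stand-in with the same (outfd, data) calling shape, not a Volatility method.

```python
import io


def render_text(outfd, data):
    """Hypothetical stand-in for a plugin renderer: writes rows to outfd."""
    outfd.write("{:<16} {:>6}\n".format("Name", "PID"))
    for name, pid in data:
        outfd.write("{:<16} {:>6}\n".format(name, pid))


# Anything with a write() method works as the output target,
# so an in-memory buffer can take the place of stdout.
buf = io.StringIO()
render_text(buf, [("alg.exe", 992), ("wuauclt.exe", 1372)])
text = buf.getvalue()
print(text)
```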

In [5]:

ps = PSScan(config)

In [6]:

pstable = StringIO()
psdata = ps.calculate()
ps.render_text(pstable, psdata)
print pstable.getvalue()

Offset(P)  Name                PID   PPID PDB        Time created         Time exited         
---------- ---------------- ------ ------ ---------- -------------------- --------------------
0x0181b748 alg.exe             992    660 0x08140260 2008-11-15 23:43:25                      
0x01843b28 wuauclt.exe        1372   1064 0x08140180 2008-11-26 07:39:38                      
0x0184e3a8 wscntfy.exe         560   1064 0x081402a0 2008-11-26 07:44:57                      
0x018557e0 alg.exe             512    672 0x08140260 2008-11-26 07:38:53                      
0x0185dda0 cmd.exe             940   1516 0x081401a0 2008-11-26 07:43:39  2008-11-26 07:45:49 
0x018a13c0 VMwareService.e    1756    672 0x08140220 2008-11-26 07:38:45                      
0x018af448 VMwareUser.exe     1904   1516 0x08140100 2008-11-26 07:38:31                      
0x018af860 VMwareTray.exe     1896   1516 0x08140200 2008-11-26 07:38:31                      
0x018e75e8 spoolsv.exe        1648    672 0x081401e0 2008-11-26 07:38:28                      
0x019456e8 csrss.exe           592    360 0x08140040 2008-11-15 23:42:56                      
0x01946020 svchost.exe         828    660 0x081400c0 2008-11-15 23:42:57                      
0x019467e0 services.exe        660    616 0x08140080 2008-11-15 23:42:56                      
0x0194f658 svchost.exe        1016    660 0x08140100 2008-11-15 23:42:57                      
0x019533c8 svchost.exe         924    660 0x081400e0 2008-11-15 23:42:57                      
0x019ca478 explorer.exe       1516   1452 0x081401c0 2008-11-26 07:38:27                      
0x019dbc30 lsass.exe           684    620 0x081400a0 2008-11-26 07:38:15                      
0x019e4670 smss.exe            360      4 0x08140020 2008-11-26 07:38:11                      
0x019f7da0 svchost.exe        1164    672 0x08140140 2008-11-26 07:38:23                      
0x01a0e6f0 svchost.exe        1264    672 0x08140160 2008-11-26 07:38:25                      
0x01a1bd78 csrss.exe           596    360 0x08140040 2008-11-26 07:38:13                      
0x01a2b100 winlogon.exe        620    360 0x08140060 2008-11-26 07:38:14                      
0x01a3ba78 services.exe        672    620 0x08140080 2008-11-26 07:38:15                      
0x01a3d360 svchost.exe         932    672 0x081400e0 2008-11-26 07:38:18                      
0x01a59d70 svchost.exe         844    672 0x081400c0 2008-11-26 07:38:18                      
0x01aa2300 svchost.exe        1064    672 0x08140120 2008-11-26 07:38:20                      
0x01bcc830 System                4      0 0x00319000

It would be nice to load the PSScan output into a data structure for sorting/filtering/etc. We can loop through the task list ourselves and extract and normalize the key parameters into a dictionary. Pandas can convert a list of dicts into a DataFrame trivially. And since our PSScan object is already in memory and Volatility has its own cache, traversing these objects again is very fast.

In [7]:

taskinfo = []

for task in ps.calculate():
    info = {}
    info['Name'] = '%s' % task.ImageFileName
    info['PID'] = '%i' % task.UniqueProcessId
    info['PPID'] = '%i' % task.InheritedFromUniqueProcessId
    info['Threads'] = '%s' % task.ActiveThreads
    info['HandleCount'] = '%s' % task.ObjectTable.HandleCount
    info['SessionID'] = '%s' % task.SessionId
    info['Wow64'] = '%s' % task.IsWow64
    info['Start'] = str(task.CreateTime or '')
    info['Exit'] = str(task.ExitTime or '')
    
    taskinfo.append(info)

No handlers could be found for logger "volatility.obj"

Convert the list of dicts into a DataFrame. First, we specify the column ordering (otherwise it defaults to alphabetical). Second, we set the index of the table to each row's PID. Then, for fun, we parse the Start and Exit timestamps into actual datetime objects so we can sort the output DataFrame by Start time.

In [8]:

psscandf = pd.DataFrame(taskinfo, columns=['Name', 'PID', 'PPID', 'Threads', 'HandleCount', 'SessionID', 'Wow64', 'Start', 'Exit'])
psscandf.index = psscandf.PID
psscandf['Start'] = pd.to_datetime(psscandf['Start'])
psscandf['Exit'] = pd.to_datetime(psscandf['Exit'])

psscandf.sort(['Start'])

Out[8]:

	Name	PID	PPID	Threads	HandleCount	Wow64	Start	Exit
PID
4	System	4	0	51	254	False	NaT	NaT
660	services.exe	660	616	15	-2121378248	False	2008-11-15 23:42:56	NaT
592	csrss.exe	592	360	10	131072	False	2008-11-15 23:42:56	NaT
828	svchost.exe	828	660	14	0	False	2008-11-15 23:42:57	NaT
1016	svchost.exe	1016	660	51	0	False	2008-11-15 23:42:57	NaT
924	svchost.exe	924	660	7	0	False	2008-11-15 23:42:57	NaT
992	alg.exe	992	660	5	4784160	False	2008-11-15 23:43:25	NaT
360	smss.exe	360	4	3	19	False	2008-11-26 07:38:11	NaT
596	csrss.exe	596	360	10	322	False	2008-11-26 07:38:13	NaT
620	winlogon.exe	620	360	16	503	False	2008-11-26 07:38:14	NaT
672	services.exe	672	620	15	245	False	2008-11-26 07:38:15	NaT
684	lsass.exe	684	620	21	347	False	2008-11-26 07:38:15	NaT
932	svchost.exe	932	672	10	229	False	2008-11-26 07:38:18	NaT
844	svchost.exe	844	672	19	198	False	2008-11-26 07:38:18	NaT
1064	svchost.exe	1064	672	63	1308	False	2008-11-26 07:38:20	NaT
1164	svchost.exe	1164	672	5	77	False	2008-11-26 07:38:23	NaT
1264	svchost.exe	1264	672	14	209	False	2008-11-26 07:38:25	NaT
1516	explorer.exe	1516	1452	12	362	False	2008-11-26 07:38:27	NaT
1648	spoolsv.exe	1648	672	12	112	False	2008-11-26 07:38:28	NaT
1904	VMwareUser.exe	1904	1516	1	28	False	2008-11-26 07:38:31	NaT
1896	VMwareTray.exe	1896	1516	1	26	False	2008-11-26 07:38:31	NaT
1756	VMwareService.e	1756	672	3	45	False	2008-11-26 07:38:45	NaT
512	alg.exe	512	672	6	105	False	2008-11-26 07:38:53	NaT
1372	wuauclt.exe	1372	1064	8	225	False	2008-11-26 07:39:38	NaT
940	cmd.exe	940	1516	0		False	2008-11-26 07:43:39	2008-11-26 07:45:49
560	wscntfy.exe	560	1064	1	31	False	2008-11-26 07:44:57	NaT

With the data in Pandas, we can filter with conditions like: find all child processes of processes called svchost.exe. First we filter the psscandf by Name and extract the unique PID values. Then we can go back to the dataframe and filter for any PPIDs that exist in that list.

In [9]:

svchostpids = psscandf.ix[psscandf['Name'] == 'svchost.exe']['PID'].unique()
svchostpids

Out[9]:

array(['828', '1016', '924', '1164', '1264', '932', '844', '1064'], dtype=object)

In [10]:

psscandf.ix[psscandf['PPID'].isin(svchostpids)]

Out[10]:

	Name	PID	PPID	Threads	HandleCount	Wow64	Start	Exit
PID
1372	wuauclt.exe	1372	1064	8	225	False	2008-11-26 07:39:38	NaT
560	wscntfy.exe	560	1064	1	31	False	2008-11-26 07:44:57	NaT
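
The two steps above can be wrapped into one reusable function. This is a sketch in Python 3 on a small hand-built table (a few rows copied from the output above, with PIDs as integers rather than strings). children_of is a hypothetical helper name, and it uses .loc, which newer pandas releases require in place of .ix.

```python
import pandas as pd

# A few rows taken from the PSScan output above, for illustration.
procs = pd.DataFrame({
    "Name": ["services.exe", "svchost.exe", "wuauclt.exe", "wscntfy.exe", "alg.exe"],
    "PID": [672, 1064, 1372, 560, 512],
    "PPID": [620, 672, 1064, 1064, 672],
})


def children_of(df, parent_name):
    """Return rows whose PPID is the PID of any process named parent_name."""
    parent_pids = df.loc[df["Name"] == parent_name, "PID"].unique()
    return df[df["PPID"].isin(parent_pids)]


kids = children_of(procs, "svchost.exe")
print(kids["Name"].tolist())
```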

Or accomplish the same thing by joining the table to itself SQL-style.

In [11]:

psscandf.ix[psscandf['Name'] == 'svchost.exe'].merge(psscandf, 
                                                     left_on=['PID'], 
                                                     right_on=['PPID'], 
                                                     suffixes=('_parent', '_child'))

Out[11]:

	Name_parent	PID_parent	PPID_parent	Threads_parent	HandleCount_parent	SessionID_parent	Wow64_parent	Start_parent	Exit_parent	Name_child	PID_child	PPID_child	Threads_child	HandleCount_child	SessionID_child	Wow64_child	Start_child	Exit_child
0	svchost.exe	1064	672	63	1308		False	2008-11-26 07:38:20	NaT	wuauclt.exe	1372	1064	8	225		False	2008-11-26 07:39:38	NaT
1	svchost.exe	1064	672	63	1308		False	2008-11-26 07:38:20	NaT	wscntfy.exe	560	1064	1	31		False	2008-11-26 07:44:57	NaT

The whole purpose of this sample image file ds_fuzz_hidden_proc.img is to illustrate hidden processes. The PSScan module we ran above searches through memory and finds all processes. There is another module, PSList, that walks the operating system's list of active processes and shows every process you would see in Task Manager. Anything in PSScan that isn't in PSList is an example of a hidden process.

There are native tools in the Volatility Framework to highlight these discrepancies, but it's also easy enough for us to do it with Pandas for the sake of example. We already have the PSScan output in the psscandf dataframe; now we build the same data structure based on PSList.

In [12]:

from volatility.plugins.taskmods import PSList

psl = PSList(config)

taskinfo = []

for task in psl.calculate():
    info = {}
    info['Name'] = '%s' % task.ImageFileName
    info['PID'] = '%i' % task.UniqueProcessId
    info['PPID'] = '%i' % task.InheritedFromUniqueProcessId
    info['Threads'] = '%s' % task.ActiveThreads
    info['HandleCount'] = '%s' % task.ObjectTable.HandleCount
    info['SessionID'] = '%s' % task.SessionId
    info['Wow64'] = '%s' % task.IsWow64
    info['Start'] = str(task.CreateTime or '')
    info['Exit'] = str(task.ExitTime or '')
    
    taskinfo.append(info)

pslistdf = pd.DataFrame(taskinfo, columns=['Name', 'PID', 'PPID', 'Threads', 'HandleCount', 'SessionID', 'Wow64', 'Start', 'Exit'])
pslistdf.index = pslistdf.PID
pslistdf['Start'] = pd.to_datetime(pslistdf['Start'])
pslistdf['Exit'] = pd.to_datetime(pslistdf['Exit'])
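
The extraction loop here is the same one used for PSScan earlier, so it could be factored into a helper. The sketch below is Python 3 and uses a namedtuple as a hypothetical stand-in for a Volatility task object, mirroring only some of the attribute names used above. It stores PIDs as integers, which assumes int() works on the real objects the way '%i' formatting does in the cells above.

```python
from collections import namedtuple

# Hypothetical stand-in for a Volatility task object (illustration only).
FakeTask = namedtuple(
    "FakeTask",
    "ImageFileName UniqueProcessId InheritedFromUniqueProcessId CreateTime ExitTime",
)


def task_to_dict(task):
    """Normalize one task object into a flat dict of plain Python values."""
    return {
        "Name": str(task.ImageFileName),
        "PID": int(task.UniqueProcessId),
        "PPID": int(task.InheritedFromUniqueProcessId),
        # Missing timestamps become empty strings, as in the loops above
        "Start": str(task.CreateTime or ""),
        "Exit": str(task.ExitTime or ""),
    }


tasks = [
    FakeTask("cmd.exe", 940, 1516, "2008-11-26 07:43:39", "2008-11-26 07:45:49"),
    FakeTask("System", 4, 0, None, None),
]
taskinfo = [task_to_dict(t) for t in tasks]
print(taskinfo[1])
```

With a helper like this, each plugin's loop reduces to one list comprehension over its calculate() results.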

In [13]:

pslistdf

Out[13]:

	Name	PID	PPID	Threads	HandleCount	SessionID	Wow64	Start	Exit
PID
4	System	4	0	51	254		False	NaT	NaT
360	smss.exe	360	4	3	19		False	2008-11-26 07:38:11	NaT
596	csrss.exe	596	360	10	322	0	False	2008-11-26 07:38:13	NaT
620	winlogon.exe	620	360	16	503	0	False	2008-11-26 07:38:14	NaT
672	services.exe	672	620	15	245	0	False	2008-11-26 07:38:15	NaT
684	lsass.exe	684	620	21	347	0	False	2008-11-26 07:38:15	NaT
844	svchost.exe	844	672	19	198	0	False	2008-11-26 07:38:18	NaT
932	svchost.exe	932	672	10	229	0	False	2008-11-26 07:38:18	NaT
1064	svchost.exe	1064	672	63	1308	0	False	2008-11-26 07:38:20	NaT
1164	svchost.exe	1164	672	5	77	0	False	2008-11-26 07:38:23	NaT
1264	svchost.exe	1264	672	14	209	0	False	2008-11-26 07:38:25	NaT
1516	explorer.exe	1516	1452	12	362	0	False	2008-11-26 07:38:27	NaT
1648	spoolsv.exe	1648	672	12	112	0	False	2008-11-26 07:38:28	NaT
1896	VMwareTray.exe	1896	1516	1	26	0	False	2008-11-26 07:38:31	NaT
1904	VMwareUser.exe	1904	1516	1	28	0	False	2008-11-26 07:38:31	NaT
1756	VMwareService.e	1756	672	3	45	0	False	2008-11-26 07:38:45	NaT
512	alg.exe	512	672	6	105	0	False	2008-11-26 07:38:53	NaT
1372	wuauclt.exe	1372	1064	8	225	0	False	2008-11-26 07:39:38	NaT
560	wscntfy.exe	560	1064	1	31	0	False	2008-11-26 07:44:57	NaT

Next, we take the list of PIDs from the PSList dataframe and filter the PSScan dataframe for any row where the PID is not in the PSList (via the ~ negation operator). Thus, we've discovered seven hidden processes!

This particular memory sample was created to demonstrate a very clever technique to hide processes. In fact, there is another hidden process not shown in the list below. I'll leave that as an exercise for the reader. For more, see Jesse Kornblum's blog post.

In [14]:

psscandf.ix[~psscandf.PID.isin(pslistdf.PID.tolist())].sort(['Start'])

Out[14]:

	Name	PID	PPID	Threads	HandleCount	Wow64	Start	Exit
PID
592	csrss.exe	592	360	10	131072	False	2008-11-15 23:42:56	NaT
660	services.exe	660	616	15	-2121378248	False	2008-11-15 23:42:56	NaT
828	svchost.exe	828	660	14	0	False	2008-11-15 23:42:57	NaT
924	svchost.exe	924	660	7	0	False	2008-11-15 23:42:57	NaT
1016	svchost.exe	1016	660	51	0	False	2008-11-15 23:42:57	NaT
992	alg.exe	992	660	5	4784160	False	2008-11-15 23:43:25	NaT
940	cmd.exe	940	1516	0		False	2008-11-26 07:43:39	2008-11-26 07:45:49
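
The same "in one table but not the other" comparison can also be written as a left join with a merge indicator, which some readers find easier to extend than the negated isin. This is a sketch in Python 3 on two small hand-built tables that reuse a few PIDs from the outputs above; it needs a pandas version that supports the indicator argument of merge.

```python
import pandas as pd

# Small illustrative tables using a few PIDs from the outputs above.
scan = pd.DataFrame({
    "PID": [4, 592, 660, 672, 1064],
    "Name": ["System", "csrss.exe", "services.exe", "services.exe", "svchost.exe"],
})
plist = pd.DataFrame({"PID": [4, 672, 1064]})

# indicator=True adds a _merge column saying where each row was found.
merged = scan.merge(plist, on="PID", how="left", indicator=True)
only_in_scan = merged.loc[merged["_merge"] == "left_only", ["PID", "Name"]]
print(only_in_scan["PID"].tolist())
```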

Inline Graphing

The most exciting application of Volatility analysis in IPython Notebook, to me at least, is inline graphing. In addition to the primary render_text output option for the PSScan module, there is also a render_dot for the Graphviz dot format. The typical Volatility use-case for Graphviz generation would go like this:

Run vol.py with parameters --output=dot --output-file=out.dot
Open the out.dot file in Graphviz
Save the graph as an image

In IPython, we can do that in just one step and keep the analysis, output, and documentation all in one place!

To render the dot files, I'm using the IPython magic hierarchymagic. This plugin adds a new IPython cell magic %%dot so you can write a cell like:

%%dot
digraph processtree { 
    graph [rankdir = "TB"];
    pid672 -> pid844 [];
    pid672 -> pid932 [];
    //more cool dot stuff
}

and get the SVG output right in your notebook. We won't actually use the %%dot command, though. Instead, we call an underlying worker function inside hierarchymagic named run_dot. We use the plugin's render_dot method to generate the graph in dot syntax. By importing the hierarchymagic library, we can pass the dot text directly to run_dot, which returns SVG image data (which is just XML). Finally, IPython has a handy SVG class to render the graphic right in the notebook. No need to keep track of temporary files!
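
For a sense of what generated dot text looks like, here is a sketch in Python 3 that builds a graph like the hand-written cell above from (parent PID, child PID) pairs. It is independent of Volatility's render_dot, whose exact output is not reproduced here, and process_tree_dot is a hypothetical helper name.

```python
def process_tree_dot(edges, name="processtree"):
    """Build Graphviz dot text from (parent_pid, child_pid) pairs."""
    lines = ["digraph %s {" % name, '    graph [rankdir = "TB"];']
    for ppid, pid in edges:
        # One directed edge per parent/child relationship
        lines.append("    pid%d -> pid%d;" % (ppid, pid))
    lines.append("}")
    return "\n".join(lines)


dot_text = process_tree_dot([(672, 844), (672, 932), (1064, 1372)])
print(dot_text)
```

Any string in this form could be passed to a dot renderer in place of the plugin-generated text.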

In [15]:

%load_ext hierarchymagic
import hierarchymagic #to put the library explicitly in the namespace
from IPython.display import SVG

In [16]:

psdot = StringIO()
psdata = ps.calculate()
ps.render_dot(psdot, psdata)

In [17]:

SVG(hierarchymagic.run_dot(psdot.getvalue(), format='svg'))

Out[17]: