Python Nmap.xml Parser

February 24, 2010

For a school project, I had to go through an Nmap XML output file and copy and paste the results into an Excel file. The task seemed a little daunting, since the Nmap file contained hundreds of results, so I set out to parse the file with Python instead. I was having a hard time until I looked through the code of another Python script called Slayer, which parses the Nmap XML file and puts the data into a SQLite database. I copied the code from Slayer and, instead of using a SQLite database, wrote the data to a .csv file. I will be the first to admit that this code is not very good, but it works for what I am doing. Here is the code:


#import datetime for the timestamp and an easy to use xml parser called minidom:
import datetime
from xml.dom.minidom import parse

# start output file
output = open('output.csv', 'a')
output.write('\n<----started at: ')

#get current time and put in output file
now = datetime.datetime.now()
print ""
output.write(now.strftime("%Y-%m-%d %H:%M"))
output.write(' ----->\n')
output.write(',,,,,,,,,\n')
output.write('IP Address,Host Name,All Ports Filtered,Open Ports,')
output.write('State (O/C),Service,Version,Device Type,Running,OS Details\n')

#setup variables
dom = parse('scanlog.xml')
nmapvars = {}
hostname = ''
os = ''
difficulty = ''
args = ''
date = ''
port = []
name = []
protocol = []
product = []
version = []
extrainfo = []
portstate = []
goodXML = []

scaninfo = dom.getElementsByTagName('nmaprun')[0]
date = scaninfo.getAttribute("startstr")
args = scaninfo.getAttribute('args')

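#define translateXml: write the attributes of a single xml node to the csv output
def translateXml(node):
    #declare these as global so the assignments below update the module-level variables
    global hostname, os, difficulty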

    if node.nodeName == 'hostname':

        hostname = node.getAttribute('name')
        output.write(node.getAttribute('name'))
        output.write(',')

    elif node.nodeName == 'address':

        if 'ip' in node.getAttribute('addrtype'):

            output.write('\n')
            #output.write(',')
            ipaddr = node.getAttribute('addr')
            output.write(node.getAttribute('addr'))
            output.write(',')

    elif node.nodeName == "port":

        #protocol.append(node.getAttribute("protocol"))
        #output.write(node.getAttribute("protocol"))
        #output.write(',')

        output.write('\n')
        output.write(',')
        output.write(',')
        output.write(',')
        port.append(node.getAttribute("portid"))
        #output.write(addr)
        output.write(node.getAttribute("portid"))
        output.write(',')

    elif node.nodeName == "state":

        portstate.append(node.getAttribute('state'))
        output.write(node.getAttribute('state'))
        output.write(',')

    elif node.nodeName == "service":

        name.append(node.getAttribute("name"))
        output.write(node.getAttribute('name'))
        output.write(',')
        product.append(node.getAttribute("product"))
        output.write(node.getAttribute('product'))
        output.write(',')
        version.append(node.getAttribute("version"))
        output.write(node.getAttribute('version'))
        output.write(',')
        extrainfo.append(node.getAttribute("extrainfo"))
        output.write(node.getAttribute('extrainfo'))
        output.write(',')

    elif node.nodeName == 'osmatch':

        os = node.getAttribute('name')
        output.write(node.getAttribute('name'))
        output.write(',')

    elif node.nodeName == 'tcpsequence':

        difficulty = node.getAttribute('difficulty')

#break down xml to get details
for node in dom.getElementsByTagName('host'):

    #second level within host tag
    for subnode in node.childNodes: #go through each subnode of the host

        if subnode.attributes is not None: #if the subnode has attributes, parse them

            translateXml(subnode) #send the subnode to translateXml
            if len(subnode.childNodes) > 0: #if there are childnodes then dig deeper

                #third level
                for subsubnode in subnode.childNodes: #loop through childnodes

                    if subsubnode.attributes is not None: #if the subsubnode has attributes, parse them

                        translateXml(subsubnode) #send the subsubnode to translateXml

                        if len(subsubnode.childNodes) > 0:

                            #fourth level
                            for subsubsubnode in subsubnode.childNodes:

                                if subsubsubnode.attributes is not None:

                                    translateXml(subsubsubnode) #translate the xml

print hostname
dom.unlink()
output.close()
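
For anyone adapting this today, here is a minimal Python 3 sketch of the same idea. It is not the script above: it swaps minidom for xml.etree.ElementTree and the hand-written commas for the csv module, and it writes one flat row per port instead of the staggered layout. The sample XML is a made-up, heavily trimmed stand-in for real Nmap output, and the names SAMPLE_XML, HEADER and nmap_xml_to_rows are hypothetical, chosen only for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for Nmap XML output (not a real scan).
SAMPLE_XML = """<nmaprun args="nmap -sV 192.0.2.10" startstr="Wed Feb 24 10:00:00 2010">
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <address addr="00:11:22:33:44:55" addrtype="mac"/>
    <hostnames><hostname name="web.example.test"/></hostnames>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh" product="OpenSSH" version="5.1"/>
      </port>
      <port protocol="tcp" portid="80">
        <state state="open"/>
        <service name="http"/>
      </port>
    </ports>
    <os><osmatch name="Linux 2.6.X"/></os>
  </host>
</nmaprun>"""

# Column names are an assumption for this sketch, not the header used above.
HEADER = ["IP Address", "Host Name", "Port", "State",
          "Service", "Product", "Version", "OS"]


def attr(element, name):
    """Return an attribute value, or '' if the element or attribute is missing."""
    return element.get(name, "") if element is not None else ""


def nmap_xml_to_rows(root):
    """Build one row per <port> element found under each <host>."""
    rows = []
    for host in root.iter("host"):
        # Keep only IP addresses, like the 'ip' in addrtype check in the script.
        ip = next((a.get("addr", "") for a in host.iter("address")
                   if "ip" in a.get("addrtype", "")), "")
        host_name = attr(host.find("hostnames/hostname"), "name")
        os_name = attr(host.find("os/osmatch"), "name")
        for port in host.iter("port"):
            state = port.find("state")
            service = port.find("service")
            rows.append([ip, host_name, port.get("portid", ""),
                         attr(state, "state"), attr(service, "name"),
                         attr(service, "product"), attr(service, "version"),
                         os_name])
    return rows


root = ET.fromstring(SAMPLE_XML)
rows = nmap_xml_to_rows(root)

# csv.writer quotes any value that itself contains a comma.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(HEADER)
writer.writerows(rows)
print(buffer.getvalue())
```

To read a real scan, replace ET.fromstring(SAMPLE_XML) with ET.parse('scanlog.xml').getroot(), and to write a real file, pass open('output.csv', 'w', newline='') to csv.writer in place of the in-memory buffer. Note that in this sketch a host with no port elements produces no rows at all.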

7 Responses to “Python Nmap.xml Parser”

  1. Thank you so much! This has just made my life so much easier! I thought I was going to have to write my own Nmap parser from scratch. Thank you!

  2. No problem! That is what the internet is for. 🙂

  3. Thanks for sharing man. Your script worked well…

  4. […] NMAP Python XML-Parser it translates xml results into .csv file. […]

  5. Was looking around the web for this solution, it was perfect.
    The script is written for Python 2.7, but all that was needed to run it on Python 3 was changing the print syntax. I also added 2 lines

    import tkinter.filedialog

    and

    dom = parse(tkinter.filedialog.askopenfilename())

    I’ll have a number of scans to sort through, so I need a dynamic way to choose them, and this is pretty easy.

  6. Hi,

    Thanks for your work. But could you post the XML file you parsed, please?

    Thanks in advance

  7. Unfortunately, I wrote this blog post many years ago and I no longer have the original Nmap XML files. The code is still posted as a guide for creating your own Python-based Nmap XML parser for your own needs.
