Implementing a C Code Line Counter in Python (Part 2)

Tags: Python, code statistics


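Preface

This article refactors the C code line counter from "Implementing a C Code Line Counter in Python (Part 1)" so that it can handle a wider range of usage scenarios.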

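1. Problem Statement

The C code line counter implemented earlier was fairly crude: it could only traverse and analyze the code files in the current directory and its subdirectories, then print a statistics report.

In practice, one may want to count several directories and/or files in a single run, and to specify the traversal depth. When there are few files but their paths are long, it should be possible to show the base file name (basename) instead of the full path; when there are many files, it should be possible to show only the totals. To make the report easier to interpret, it should also show the comment ratio, that is, comment lines / (comment lines + effective code lines).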
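As a quick check of the comment ratio formula, the sketch below recomputes the ratio for two files using the counts that appear in the sample report later in this article. The helper name comment_ratio is hypothetical and is not part of the tool itself.

```python
def comment_ratio(comment_lines, code_lines):
    """Return comment lines / (comment lines + code lines), or 0.0 if both are zero."""
    total = comment_lines + code_lines
    if total == 0:
        return 0.0
    return float(comment_lines) / total

# Counts taken from the sample report: line.c has 15 comment and 19 code lines,
# test.c has 3 comment and 34 code lines.
print('%.2f' % comment_ratio(15, 19))   # 0.44
print('%.2f' % comment_ratio(3, 34))    # 0.08
```

A line that holds both code and a trailing comment is counted in both categories, so the two counts can add up to more than the number of non-empty lines in the file.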

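2. Implementation

The rawCountInfo and detailCountInfo structures that store the results are unchanged. detailCountInfo could of course also be defined as a dictionary keyed by file name.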
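A minimal sketch of the dictionary alternative is shown below. The names detailCountDict and add_detail are hypothetical; the value layout (file lines, code lines, comment lines, empty lines) is assumed from the column order of the report.

```python
# Hypothetical dictionary variant of detailCountInfo:
# key = file name, value = [file lines, code lines, comment lines, empty lines]
detailCountDict = {}

def add_detail(fileName, counts):
    """Store the four counters for one file, overwriting any earlier entry."""
    detailCountDict[fileName] = list(counts)

add_detail('lctest/test.c', [44, 34, 3, 7])
add_detail('lctest/line.c', [33, 19, 15, 4])

# Iterating over the sorted keys gives the same order as sorting the list by name.
for name in sorted(detailCountDict):
    print('%s %s' % (name, detailCountDict[name]))
```

One consequence of this design is that a file reached twice under the same key is stored only once, whereas the list version would report it twice.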

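The CalcLines() function is also unchanged. Note that replacing the C comment markers in that function with # and ''' makes it usable for counting lines in Python scripts.

CountFileLines() is modified slightly: a new isShortName parameter selects the base file name when true and the full path when false: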

import os

def CountFileLines(filePath, isRawReport=True, isShortName=False):
    # Only .c and .h files are counted
    fileExt = os.path.splitext(filePath)
    if fileExt[1] != '.c' and fileExt[1] != '.h':
        return

    try:
        fileObj = open(filePath, 'r')
    except IOError:
        print 'Cannot open file (%s) for reading!' %filePath
        return  # Nothing to count; lineList would be undefined below
    else:
        lineList = fileObj.readlines()
        fileObj.close()

    if isRawReport:
        # Accumulate the four line counters, then increment the file counter
        global rawCountInfo
        rawCountInfo[:-1] = [x+y for x,y in zip(rawCountInfo[:-1], CalcLines(lineList))]
        rawCountInfo[-1] += 1
    elif isShortName:
        detailCountInfo.append([os.path.basename(filePath), CalcLines(lineList)])
    else:
        detailCountInfo.append([filePath, CalcLines(lineList)])
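The extension filter and the short-name option rest on os.path.splitext() and os.path.basename(). The sketch below isolates the two decisions; the helper names is_c_source and display_name are hypothetical and only serve to illustrate them.

```python
import os

def is_c_source(filePath):
    """True if the path ends in .c or .h (case-sensitive, like the check above)."""
    return os.path.splitext(filePath)[1] in ('.c', '.h')

def display_name(filePath, isShortName=False):
    """Return the base file name if isShortName is true, else the path unchanged."""
    if isShortName:
        return os.path.basename(filePath)
    return filePath

path = 'lctest/subdir/test.c'
print(is_c_source(path))                # True
print(is_c_source('lctest/typec.JPG'))  # False
print(display_name(path, True))         # test.c
```

Because the comparison is case-sensitive, a file named TEST.C would be skipped; whether that matters depends on the code base being counted.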

ReportCounterInfo() now also reports the comment ratio:

def ReportCounterInfo(isRawReport=True):
    def SafeDiv(dividend, divisor):
        # Avoid ZeroDivisionError: -1 flags x/0 with x != 0, 0 stands for 0/0
        if divisor:    return float(dividend)/divisor
        elif dividend: return -1
        else:          return 0

    print 'FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  %s' \
          %(not isRawReport and 'FileName' or '')

    if isRawReport:
        print '%-11d%-11d%-14d%-12d%-16.2f<Total:%d Code Files>' %(rawCountInfo[0],
              rawCountInfo[1], rawCountInfo[2], rawCountInfo[3],
              SafeDiv(rawCountInfo[2], rawCountInfo[2]+rawCountInfo[1]), rawCountInfo[4])
        return

    total = [0, 0, 0, 0]
    # Sort detailCountInfo by its first column (file name) to make the output easier to read
    detailCountInfo.sort(key=lambda x:x[0])
    for item in detailCountInfo:
        print '%-11d%-11d%-14d%-12d%-16.2f%s' %(item[1][0], item[1][1], item[1][2],
              item[1][3], SafeDiv(item[1][2], item[1][2]+item[1][1]), item[0])
        total[0] += item[1][0]; total[1] += item[1][1]
        total[2] += item[1][2]; total[3] += item[1][3]
    print '-' * 90  # Print 90 hyphen-minus characters as a separator
    print '%-11d%-11d%-14d%-12d%-16.2f<Total:%d Code Files>' \
          %(total[0], total[1], total[2], total[3],
          SafeDiv(total[2], total[2]+total[1]), len(detailCountInfo))
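The report lines up because each left-aligned field width in the format string equals the width of the matching header label plus its two trailing spaces (11, 11, 14, 12 and 16 characters). The sketch below formats one row with the same format string, using the line.c figures from the sample report; it is written so that it runs on both Python 2 and Python 3.

```python
header = 'FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  FileName'

# Same format string as the detail rows in ReportCounterInfo()
row = '%-11d%-11d%-14d%-12d%-16.2f%s' % (33, 19, 15, 4, 15 / 34.0, 'line.c')

print(header)
print(row)

# The file name starts in the same column as the 'FileName' label.
print(row.index('line.c') == header.index('FileName'))
```

If a header label is ever renamed, the corresponding width in the format string has to change with it, otherwise the columns drift apart.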

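To count several directories and/or files in a single run, ParseTargetList() parses the mixed list of directories and files and stores its elements in separate file and directory lists: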

def ParseTargetList(targetList):
    fileList, dirList = [], []
    if not targetList:
        # Default to the current directory without modifying the caller's list
        targetList = [os.getcwd()]
    for item in targetList:
        if os.path.isfile(item):
            fileList.append(os.path.abspath(item))
        elif os.path.isdir(item):
            dirList.append(os.path.abspath(item))
        else:
            print "'%s' is neither a file nor a directory!" %item
    return [fileList, dirList]

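Note that a file or directory given by its basename only is looked up in the current directory. In addition, relative paths are expanded to full paths.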
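Both behaviours come from os.path.abspath(), which joins a relative path onto the current working directory and normalizes the result. The following sketch checks this without touching the file system; the file and directory names are placeholders.

```python
import os

cwd = os.getcwd()

# A bare name is resolved against the current working directory.
print(os.path.abspath('test.c') == os.path.join(cwd, 'test.c'))

# '..' components are collapsed while the path is made absolute.
relative = os.path.join('subdir', os.pardir, 'test.c')
print(os.path.abspath(relative) == os.path.join(cwd, 'test.c'))
```

Note that os.path.abspath() does not check whether the path exists; in ParseTargetList() that check is done beforehand by os.path.isfile() and os.path.isdir().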

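CLineCounter() counts on the basis of the file and directory lists, and a new isStay parameter specifies the traversal depth (true: the given directory only; false: recurse into its subdirectories):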

def CLineCounter(isStay=False, isRawReport=True, isShortName=False, targetList=None):
    fileList, dirList = ParseTargetList(targetList)
    # Files and directories may be given together, so both lists are processed
    if fileList != []:
        CountFile(fileList, isRawReport, isShortName)
    if dirList != []:
        CountDir(dirList, isStay, isRawReport, isShortName)

def CountDir(dirList, isStay=False, isRawReport=True, isShortName=False):
    for dirPath in dirList:
        if isStay:
            # Stay in the given directory
            for fileName in os.listdir(dirPath):
                CountFileLines(os.path.join(dirPath, fileName), isRawReport, isShortName)
        else:
            # Walk down all subdirectories
            for root, dirs, files in os.walk(dirPath):
                for fileName in files:
                    CountFileLines(os.path.join(root, fileName), isRawReport, isShortName)

def CountFile(fileList, isRawReport=True, isShortName=False):
    for filePath in fileList:
        CountFileLines(filePath, isRawReport, isShortName)
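The two branches of CountDir() differ only in how entries are enumerated. As a small illustration, the sketch below builds a temporary tree that mirrors the lctest layout used in the verification section and compares os.listdir() with os.walk(). The temporary directory is an assumption of this example and is removed at the end.

```python
import os
import shutil
import tempfile

# Build a throwaway tree: line.c, test.c and subdir/test.c
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, 'subdir'))
for rel in ('line.c', 'test.c', os.path.join('subdir', 'test.c')):
    open(os.path.join(top, rel), 'w').close()

# os.listdir() returns only the immediate entries, sub-directory names included.
shallow = sorted(os.listdir(top))

# os.walk() descends into every sub-directory and yields the files of each one.
deep = sorted(os.path.relpath(os.path.join(root, name), top)
              for root, dirs, files in os.walk(top)
              for name in files)

print(shallow)      # ['line.c', 'subdir', 'test.c']
print(len(deep))    # 3

shutil.rmtree(top)
```

Since os.listdir() also returns directory names, the non-recursive branch relies on the extension check in CountFileLines() to skip them.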

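Next, add command-line parsing. The argparse module is the preferred way to parse the command line (the optparse module has been deprecated since Python 2.7). The implementation is as follows: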

import sys
import argparse

def ParseCmdArgs(argv=sys.argv):
    parser = argparse.ArgumentParser(usage='%(prog)s [options] target',
                      description='Count lines in C code files(c&h).')
    parser.add_argument('target', nargs='*',
           help='space-separated list of directories AND/OR files')
    parser.add_argument('-s', '--stay', action='store_true',
           help='do not walk down subdirectories')
    parser.add_argument('-d', '--detail', action='store_true',
           help='report counting result in detail')
    parser.add_argument('-b', '--basename', action='store_true',
           help='do not show file\'s full path')
    parser.add_argument('-v', '--version', action='version',
           version='%(prog)s 2.0 by xywang')

    # Parse the argument list that was passed in, skipping the program name
    args = parser.parse_args(argv[1:])
    return (args.stay, args.detail, args.basename, args.target)

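By default the argparse module parses the command-line arguments in sys.argv, but it can also parse an explicit list of strings, for example args = parser.parse_args('foo 1 -x 2'.split()). This is very useful when debugging.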
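Applied to this tool, the same debugging technique might look like the sketch below. The helper build_parser is hypothetical; it rebuilds a reduced parser with the target, -s, -d and -b arguments so that the example is self-contained, and the argument string is one of the command lines used in the verification section.

```python
import argparse

def build_parser():
    """Rebuild a parser with the same target and flag arguments as the tool."""
    parser = argparse.ArgumentParser(description='Count lines in C code files(c&h).')
    parser.add_argument('target', nargs='*')
    parser.add_argument('-s', '--stay', action='store_true')
    parser.add_argument('-d', '--detail', action='store_true')
    parser.add_argument('-b', '--basename', action='store_true')
    return parser

# Parse a prepared argument list instead of sys.argv
args = build_parser().parse_args('-s -d lctest -b'.split())
print('%s %s %s %s' % (args.stay, args.detail, args.basename, args.target))
```

Splitting on whitespace is enough for simple cases; a path containing spaces would have to be passed as a ready-made list element instead.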

Note that the argparse module was added in Python 2.7. In earlier versions, the getopt module can be used to parse the command line, as shown below:

import getopt

def Usage():
    '''
usage: CLineCounter.py [options] target

Count lines in C code files(c&h).

positional arguments:
  target          space-separated list of directories AND/OR files

optional arguments:
  -h, --help      show this help message and exit
  -s, --stay      do not walk down subdirectories
  -d, --detail    report counting result in detail
  -b, --basename  do not show file's full path
  -v, --version   show program's version number and exit'''
    print Usage.__doc__

def ParseCmdArgs1(argv=sys.argv):
    try:
        opts, args = getopt.gnu_getopt(argv[1:], 'hsdbv',
                     ['help', 'stay', 'detail', 'basename', 'version'])
    except getopt.GetoptError, e:
        # Invalid option: report it, show the usage text and exit with an error status
        print str(e); Usage(); sys.exit(2)

    stay, detail, basename = False, False, False
    for o, a in opts:
        if o in ("-h", "--help"):
            Usage(); sys.exit()
        elif o in ("-s", "--stay"):
            stay = True
        elif o in ("-d", "--detail"):
            detail = True
        elif o in ("-b", "--basename"):
            basename = True
        elif o in ("-v", "--version"):
            print '%s 2.0 by xywang' %os.path.basename(argv[0]); sys.exit()
        else:
            assert False, "unhandled option"

    # The remaining positional arguments are the targets
    return (stay, detail, basename, args)

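Here, the help text printed by Usage() is the same as the output of the -h option implemented by the argparse module.

It is advisable to put the command-line handling in a separate file, so that the line counter itself remains usable when the Python environment does not support argparse.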
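The getopt variant calls getopt.gnu_getopt() rather than getopt.getopt(). The sketch below shows the difference on a command line that places an option after the target, as in the "-s -d lctest -b" run in the verification section. It assumes the POSIXLY_CORRECT environment variable is not set, since gnu_getopt() falls back to the classic behaviour when it is.

```python
import getopt

argv = ['-s', '-d', 'lctest', '-b']
shortOpts = 'hsdbv'
longOpts = ['help', 'stay', 'detail', 'basename', 'version']

# GNU-style scanning: options may follow positional arguments.
gnuOpts, gnuArgs = getopt.gnu_getopt(argv, shortOpts, longOpts)

# Classic scanning stops at the first non-option argument.
stdOpts, stdArgs = getopt.getopt(argv, shortOpts, longOpts)

print('%s %s' % (gnuOpts, gnuArgs))
print('%s %s' % (stdOpts, stdArgs))
```

With getopt.getopt() the trailing -b would be returned as a second target and then rejected as neither a file nor a directory, which is presumably why the GNU-style function is used here.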

Finally, call the functions above to count the code and print the report:

if __name__ == '__main__':
    (stay, detail, basename, target) = ParseCmdArgs()
    CLineCounter(stay, not detail, basename, target)
    ReportCounterInfo(not detail)

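3. Verification

To verify the implementation from the previous section, a subdir subdirectory containing a test.c file is created under the lctest test directory.

On the author's Windows XP host, running CLineCounter.py with different command-line arguments produces the following output: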

E:\PyTest>tree /F lctest
E:\PYTEST\LCTEST
│  line.c
│  test.c
│  typec.JPG
│
└─subdir
        test.c

E:\PyTest>CLineCounter.py -v
CLineCounter.py 2.0 by xywang

E:\PyTest>CLineCounter.py -d lctest\line.c lctest\test.c
FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  FileName
33         19         15            4           0.44            lctest\line.c
44         34         3             7           0.08            lctest\test.c
------------------------------------------------------------------------------------------
77         53         18            11          0.25            <Total:2 Code Files>

E:\PyTest>CLineCounter.py -s -d lctest\subdir\test.c lctest
FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  FileName
33         19         15            4           0.44            E:\PyTest\lctest\line.c
44         34         3             7           0.08            E:\PyTest\lctest\test.c
44         34         3             7           0.08            lctest\subdir\test.c
------------------------------------------------------------------------------------------
121        87         21            18          0.19            <Total:3 Code Files>

E:\PyTest>CLineCounter.py -s -d lctest -b
FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  FileName
33         19         15            4           0.44            line.c
44         34         3             7           0.08            test.c
------------------------------------------------------------------------------------------
77         53         18            11          0.25            <Total:2 Code Files>

E:\PyTest>CLineCounter.py -d lctest
FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  FileName
33         19         15            4           0.44            E:\PyTest\lctest\line.c
44         34         3             7           0.08            E:\PyTest\lctest\subdir\test.c
44         34         3             7           0.08            E:\PyTest\lctest\test.c
------------------------------------------------------------------------------------------
121        87         21            18          0.19            <Total:3 Code Files>

E:\PyTest>CLineCounter.py lctest
FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent
121        87         21            18          0.19            <Total:3 Code Files>

E:\PyTest\lctest\subdir>CLineCounter.py -d ..\test.c
FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  FileName
44         34         3             7           0.08            E:\PyTest\lctest\test.c
------------------------------------------------------------------------------------------
44         34         3             7           0.08            <Total:1 Code Files>

On the author's Red Hat Linux host (Python 2.4.3), running CLineCounter.py produces the following output:

[wangxiaoyuan_@localhost]$ python CLineCounter.py -d lctest
FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  FileName
33         19         15            4           0.44            /home/wangxiaoyuan_/lctest/line.c
44         34         3             7           0.08            /home/wangxiaoyuan_/lctest/subdir/test.c
44         34         3             7           0.08            /home/wangxiaoyuan_/lctest/test.c
------------------------------------------------------------------------------------------
121        87         21            18          0.19            <Total:3 Code Files>

[wangxiaoyuan_@localhost subdir]$ python CLineCounter.py -d ../test.c
FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  FileName
44         34         3             7           0.08            /home/wangxiaoyuan_/lctest/test.c
------------------------------------------------------------------------------------------
44         34         3             7           0.08            <Total:1 Code Files>

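Manual inspection confirms that the statistics are correct.

The output is also very regular. If it is saved to a Windows text file (txt), it can easily be converted to an Excel file. Taking Excel 2007 as an example: open the text file through the Office button, select "Delimited" in the Text Import Wizard, and click "Finish" to load the text file into Excel. The sheet can then be edited as needed in Excel and saved as an Excel file.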
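Because the columns are padded with spaces, a path that itself contains spaces could be split across cells during import. One possible workaround, sketched below under that assumption, is to convert the report to tab-separated text first. The helper report_to_rows is hypothetical, and the sample text is copied from the "-s -d lctest -b" run above.

```python
report = """\
FileLines  CodeLines  CommentLines  EmptyLines  CommentPercent  FileName
33         19         15            4           0.44            line.c
44         34         3             7           0.08            test.c
------------------------------------------------------------------------------------------
77         53         18            11          0.25            <Total:2 Code Files>
"""

def report_to_rows(text):
    """Split each report line into at most six fields, skipping the separator line."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith('-'):
            continue
        # maxsplit=5 keeps everything after the fifth column together
        rows.append(line.split(None, 5))
    return rows

rows = report_to_rows(report)
tsv = '\n'.join('\t'.join(row) for row in rows)
print(tsv)
```

Limiting the split to five separators keeps the last column intact, so "<Total:2 Code Files>" and any file name with spaces stay in a single field.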