Go Pentester - HTTP CLIENTS(5)
Parsing Document Metadata with Bing Scraping
Set up the environment: install the goquery package.
https://github.com/PuerkitoBio/goquery
go get github.com/PuerkitoBio/goquery
If you are in China, you may need to change the Go module proxy setting (the GOPROXY environment variable) before the download succeeds. Refer to: https://sum.golang.org/
Unzip an Office file and analyze the Open XML file structure. The "creator" and "lastModifiedBy" elements in docProps/core.xml and the "Application", "Company", and "AppVersion" elements in docProps/app.xml are of primary interest.
Define the metadata package and map the data to structs in Go, so that Office Open XML documents can be opened and parsed and their metadata extracted.
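An Office Open XML file is an ordinary ZIP archive, so its layout can also be inspected from Go instead of unzipping by hand. The following is a minimal sketch using only the standard library. The helper names buildSample and docPropsEntries are hypothetical, and the archive is built in memory with placeholder content so that the example runs without a real Office file.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
	"strings"
)

// buildSample creates an in-memory ZIP archive with the given entry names.
// It is a hypothetical stand-in for a real .docx, .xlsx, or .pptx file.
func buildSample(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range names {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte("<placeholder/>")); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// docPropsEntries returns the names of all archive entries under docProps/.
func docPropsEntries(data []byte) ([]string, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range r.File {
		if strings.HasPrefix(f.Name, "docProps/") {
			names = append(names, f.Name)
		}
	}
	return names, nil
}

func main() {
	data, err := buildSample("[Content_Types].xml", "docProps/core.xml",
		"docProps/app.xml", "word/document.xml")
	if err != nil {
		log.Fatal(err)
	}
	names, err := docPropsEntries(data)
	if err != nil {
		log.Fatal(err)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

For a real document, the bytes read from disk would be passed to docPropsEntries in place of the sample archive.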
package metadata

import (
	"archive/zip"
	"encoding/xml"
	"strings"
)

// Open XML type definition and version mapping
type OfficeCoreProperty struct {
	XMLName        xml.Name `xml:"coreProperties"`
	Creator        string   `xml:"creator"`
	LastModifiedBy string   `xml:"lastModifiedBy"`
}

type OfficeAppProperty struct {
	XMLName     xml.Name `xml:"Properties"`
	Application string   `xml:"Application"`
	Company     string   `xml:"Company"`
	Version     string   `xml:"AppVersion"`
}

var OfficeVersion = map[string]string{
	"16": "2016",
	"15": "2013",
	"14": "2010",
	"12": "2007",
	"11": "2003",
}

func (a *OfficeAppProperty) GetMajorVersion() string {
	tokens := strings.Split(a.Version, ".")
	if len(tokens) < 2 {
		return "Unknown"
	}
	v, ok := OfficeVersion[tokens[0]]
	if !ok {
		return "Unknown"
	}
	return v
}

// Processing Open XML archives and embedded XML documents
func NewProperties(r *zip.Reader) (*OfficeCoreProperty, *OfficeAppProperty, error) {
	var coreProps OfficeCoreProperty
	var appProps OfficeAppProperty

	// Only the two property files are decoded; every other entry is skipped.
	for _, f := range r.File {
		switch f.Name {
		case "docProps/core.xml":
			if err := process(f, &coreProps); err != nil {
				return nil, nil, err
			}
		case "docProps/app.xml":
			if err := process(f, &appProps); err != nil {
				return nil, nil, err
			}
		default:
			continue
		}
	}
	return &coreProps, &appProps, nil
}

// process opens one archive entry and decodes its XML into prop.
func process(f *zip.File, prop interface{}) error {
	rc, err := f.Open()
	if err != nil {
		return err
	}
	defer rc.Close()

	// prop already holds a pointer to the target struct, so it is passed as-is.
	if err := xml.NewDecoder(rc).Decode(prop); err != nil {
		return err
	}
	return nil
}
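To see how the struct tags line up with a document, the sketch below decodes a made-up core.xml fragment and runs the same version lookup as GetMajorVersion. It is a standalone illustration, not part of the metadata package: coreProperty, officeVersion, majorVersion, and parseCore are lowercase copies written for this example, and the sample XML values are invented. Note that a tag such as `xml:"creator"` carries no namespace, so encoding/xml matches it against the element's local name even though the element is written as dc:creator.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
	"strings"
)

// coreProperty mirrors OfficeCoreProperty from the metadata package.
type coreProperty struct {
	XMLName        xml.Name `xml:"coreProperties"`
	Creator        string   `xml:"creator"`
	LastModifiedBy string   `xml:"lastModifiedBy"`
}

// sampleCore is invented content shaped like docProps/core.xml.
const sampleCore = `<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator>Alice Example</dc:creator>
  <cp:lastModifiedBy>Bob Example</cp:lastModifiedBy>
</cp:coreProperties>`

// officeVersion copies the OfficeVersion map from the metadata package.
var officeVersion = map[string]string{
	"16": "2016", "15": "2013", "14": "2010", "12": "2007", "11": "2003",
}

// majorVersion applies the same rule as GetMajorVersion: the value must
// contain a dot, and the part before the first dot must be a known key.
func majorVersion(appVersion string) string {
	tokens := strings.Split(appVersion, ".")
	if len(tokens) < 2 {
		return "Unknown"
	}
	if v, ok := officeVersion[tokens[0]]; ok {
		return v
	}
	return "Unknown"
}

// parseCore decodes a core.xml document held in a string.
func parseCore(doc string) (coreProperty, error) {
	var p coreProperty
	err := xml.NewDecoder(strings.NewReader(doc)).Decode(&p)
	return p, err
}

func main() {
	p, err := parseCore(sampleCore)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(p.Creator, "/", p.LastModifiedBy)
	fmt.Println(majorVersion("16.0000")) // sample AppVersion value
}
```

One consequence of the length check is that an AppVersion without a dot, such as "16", is reported as "Unknown" even though "16" is a key in the map.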
Next, work out how to search for and retrieve files using Bing:
1. Submit a search request to Bing with proper filters to retrieve targeted results.
2. Scrape the HTML response, extracting the HREF (link) data to obtain direct URLs for documents.
3. Submit an HTTP request for each direct document URL.
4. Parse the response body to create a zip.Reader.
5. Pass the zip.Reader into the code you already developed to extract metadata.
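Step 1 can be sketched as follows. This is a minimal example under stated assumptions: the function name buildSearchURL is hypothetical, and the query combines the site: and filetype: search operators as one plausible set of filters, which may differ from the filters used in the original code.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildSearchURL returns a Bing search URL restricted to one domain and
// one file type. The choice of operators is an assumption for this sketch.
func buildSearchURL(domain, filetype string) string {
	q := fmt.Sprintf("site:%s filetype:%s", domain, filetype)
	u := url.URL{
		Scheme: "https",
		Host:   "www.bing.com",
		Path:   "/search",
		// url.Values.Encode percent-encodes the query value.
		RawQuery: url.Values{"q": {q}}.Encode(),
	}
	return u.String()
}

func main() {
	fmt.Println(buildSearchURL("example.com", "docx"))
}
```

Building the query string with url.Values avoids hand-escaping the colons and spaces in the search expression.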
Analyze the search result elements in Bing.
Now scrape Bing results and parse the document metadata.
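The download half of this step (steps 3 and 4 in the list above) could look roughly like the sketch below. It uses only the standard library, so link extraction with goquery is left out. The names fetchZip, cannedTransport, and demo are hypothetical, and the stub transport returns an in-memory archive instead of contacting a real server, which keeps the example runnable offline. In the real tool, the resulting zip.Reader would be passed to NewProperties from the metadata package.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
)

// fetchZip downloads docURL and wraps the response body in a zip.Reader.
// zip.NewReader needs random access, so the body is buffered in memory.
func fetchZip(client *http.Client, docURL string) (*zip.Reader, error) {
	res, err := client.Get(docURL)
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()
	if res.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: %s", docURL, res.Status)
	}
	buf, err := io.ReadAll(res.Body)
	if err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf), int64(len(buf)))
}

// cannedTransport is a hypothetical stub that answers every request
// with the same body, standing in for a remote document server.
type cannedTransport struct{ body []byte }

func (t cannedTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	return &http.Response{
		Status:     "200 OK",
		StatusCode: http.StatusOK,
		Header:     make(http.Header),
		Body:       io.NopCloser(bytes.NewReader(t.body)),
		Request:    req,
	}, nil
}

// demo builds a tiny archive, fetches it through the stub transport,
// and returns the entry names found by the zip.Reader.
func demo() ([]string, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range []string{"docProps/core.xml", "docProps/app.xml"} {
		if _, err := zw.Create(name); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	client := &http.Client{Transport: cannedTransport{body: buf.Bytes()}}
	r, err := fetchZip(client, "http://example.com/report.docx")
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range r.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	names, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(names)
}
```

Against live URLs, http.DefaultClient or a client with a timeout would replace the stubbed one.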