Go Pentester - HTTP CLIENTS(5)
Parsing Document Metadata with Bing Scraping
Set up the environment: install the goquery package.
https://github.com/PuerkitoBio/goquery
go get github.com/PuerkitoBio/goquery

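If you are in China, go get may be unable to reach the default module proxy and checksum database, so change the Go proxy setting (GOPROXY) first. Refer to: https://sum.golang.org/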

Unzip an Office file and analyze the Open XML file structure. The "creator" and "lastModifiedBy" elements in docProps/core.xml and the "Application", "Company", and "AppVersion" elements in docProps/app.xml are of primary interest.

Define the metadata package and map the data to Go structs in order to open, parse, and extract metadata from Office Open XML documents.
package metadata

import (
	"archive/zip"
	"encoding/xml"
	"strings"
)

// Open XML type definition and version mapping
type OfficeCoreProperty struct {
	XMLName        xml.Name `xml:"coreProperties"`
	Creator        string   `xml:"creator"`
	LastModifiedBy string   `xml:"lastModifiedBy"`
}

type OfficeAppProperty struct {
	XMLName     xml.Name `xml:"Properties"`
	Application string   `xml:"Application"`
	Company     string   `xml:"Company"`
	Version     string   `xml:"AppVersion"`
}

var OfficeVersion = map[string]string{
	"16": "2016",
	"15": "2013",
	"14": "2010",
	"12": "2007",
	"11": "2003",
}

func (a *OfficeAppProperty) GetMajorVersion() string {
	tokens := strings.Split(a.Version, ".")
	if len(tokens) < 2 {
		return "Unknown"
	}
	v, ok := OfficeVersion[tokens[0]]
	if !ok {
		return "Unknown"
	}
	return v
}

// Processing Open XML archives and embedded XML documents
func NewProperties(r *zip.Reader) (*OfficeCoreProperty, *OfficeAppProperty, error) {
	var coreProps OfficeCoreProperty
	var appProps OfficeAppProperty

	for _, f := range r.File {
		switch f.Name {
		case "docProps/core.xml":
			if err := process(f, &coreProps); err != nil {
				return nil, nil, err
			}
		case "docProps/app.xml":
			if err := process(f, &appProps); err != nil {
				return nil, nil, err
			}
		default:
			continue
		}
	}
	return &coreProps, &appProps, nil
}

func process(f *zip.File, prop interface{}) error {
	rc, err := f.Open()
	if err != nil {
		return err
	}
	defer rc.Close()

	if err := xml.NewDecoder(rc).Decode(&prop); err != nil {
		return err
	}
	return nil
}
Next, work out how to search for and retrieve files by using Bing:
1. Submit a search request to Bing with proper filters to retrieve targeted results.
2. Scrape the HTML response, extracting the href (link) data to obtain direct URLs for documents.
3. Submit an HTTP request for each direct document URL.
4. Parse the response body to create a zip.Reader.
5. Pass the zip.Reader into the code you already developed to extract metadata.
Inspect the HTML of a Bing results page to identify the elements that contain the result links.

Now scrape the Bing results and parse the document metadata.
