Go Pentester - HTTP CLIENTS(5)
Parsing Document Metadata with Bing Scraping
Set up the environment - install goquery package.
https://github.com/PuerkitoBio/goquery
go get github.com/PuerkitoBio/goquery

If you are in China, modify the Go proxy settings before running go get. Refer to: https://sum.golang.org/

Unzip an Office file and analyze the Open XML file structure. The "creator" and "lastModifiedBy" elements in core.xml and the "Application", "Company", and "AppVersion" elements in app.xml are of primary interest.
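Because an Office Open XML file is an ordinary zip archive, its layout can also be inspected from Go instead of unzipping it by hand. The following is a minimal sketch that uses only the standard library. The helper names (buildSample, listEntries) and the placeholder archive are made up for illustration; a real .docx contains more entries than shown here.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// buildSample returns a tiny in-memory zip whose entry names mimic an
// Office Open XML document. The contents are placeholders, not a valid .docx.
func buildSample() ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	names := []string{
		"[Content_Types].xml",
		"docProps/core.xml",
		"docProps/app.xml",
		"word/document.xml",
	}
	for _, name := range names {
		f, err := w.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := f.Write([]byte("<placeholder/>")); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listEntries returns the names of all files stored in a zip archive.
func listEntries(data []byte) ([]string, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(r.File))
	for _, f := range r.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	data, err := buildSample()
	if err != nil {
		log.Fatalln(err)
	}
	names, err := listEntries(data)
	if err != nil {
		log.Fatalln(err)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

For a file on disk, zip.OpenReader can be used in place of the in-memory reader.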

Define the metadata package and map the data to Go structs in order to open, parse, and extract metadata from Office Open XML documents.
package metadata

import (
	"archive/zip"
	"encoding/xml"
	"strings"
)

// Open XML type definition and version mapping

// OfficeCoreProperty holds the fields of interest from docProps/core.xml.
type OfficeCoreProperty struct {
	XMLName        xml.Name `xml:"coreProperties"`
	Creator        string   `xml:"creator"`
	LastModifiedBy string   `xml:"lastModifiedBy"`
}

// OfficeAppProperty holds the fields of interest from docProps/app.xml.
type OfficeAppProperty struct {
	XMLName     xml.Name `xml:"Properties"`
	Application string   `xml:"Application"`
	Company     string   `xml:"Company"`
	Version     string   `xml:"AppVersion"`
}

// OfficeVersion maps the major AppVersion number to the Office release name.
var OfficeVersion = map[string]string{
	"16": "2016",
	"15": "2013",
	"14": "2010",
	"12": "2007",
	"11": "2003",
}

// GetMajorVersion returns the Office release name for the AppVersion value,
// or "Unknown" if the value cannot be mapped.
func (a *OfficeAppProperty) GetMajorVersion() string {
	tokens := strings.Split(a.Version, ".")
	if len(tokens) < 2 {
		return "Unknown"
	}
	v, ok := OfficeVersion[tokens[0]]
	if !ok {
		return "Unknown"
	}
	return v
}

// Processing Open XML archives and embedded XML documents

// NewProperties walks the archive and decodes core.xml and app.xml.
func NewProperties(r *zip.Reader) (*OfficeCoreProperty, *OfficeAppProperty, error) {
	var coreProps OfficeCoreProperty
	var appProps OfficeAppProperty

	for _, f := range r.File {
		switch f.Name {
		case "docProps/core.xml":
			if err := process(f, &coreProps); err != nil {
				return nil, nil, err
			}
		case "docProps/app.xml":
			if err := process(f, &appProps); err != nil {
				return nil, nil, err
			}
		default:
			continue
		}
	}
	return &coreProps, &appProps, nil
}

// process opens one archive entry and decodes its XML into prop,
// which must be a pointer to a struct.
func process(f *zip.File, prop interface{}) error {
	rc, err := f.Open()
	if err != nil {
		return err
	}
	defer rc.Close()

	// prop is already a pointer, so it is passed to Decode directly.
	if err := xml.NewDecoder(rc).Decode(prop); err != nil {
		return err
	}
	return nil
}
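To see how the struct tags above line up with the XML, the sketch below decodes two small hand-written documents. The sample values ("Alice", "Bob", "Example Corp") are invented, and the types and the majorVersion helper are lowercase copies of the ones in the metadata package, repeated only so that the block compiles on its own. Note that encoding/xml matches a tag such as `xml:"creator"` on the local element name, so the namespace prefixes used in core.xml do not need to appear in the tags.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
	"strings"
)

type coreProps struct {
	XMLName        xml.Name `xml:"coreProperties"`
	Creator        string   `xml:"creator"`
	LastModifiedBy string   `xml:"lastModifiedBy"`
}

type appProps struct {
	XMLName     xml.Name `xml:"Properties"`
	Application string   `xml:"Application"`
	Company     string   `xml:"Company"`
	Version     string   `xml:"AppVersion"`
}

// Hand-written samples shaped like docProps/core.xml and docProps/app.xml.
const sampleCore = `<cp:coreProperties
  xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator>Alice</dc:creator>
  <cp:lastModifiedBy>Bob</cp:lastModifiedBy>
</cp:coreProperties>`

const sampleApp = `<Properties
  xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
  <Application>Microsoft Office Word</Application>
  <Company>Example Corp</Company>
  <AppVersion>16.0000</AppVersion>
</Properties>`

var officeVersion = map[string]string{
	"16": "2016", "15": "2013", "14": "2010", "12": "2007", "11": "2003",
}

// majorVersion mirrors GetMajorVersion: it maps "16.0000" to "2016".
func majorVersion(v string) string {
	tokens := strings.Split(v, ".")
	if len(tokens) < 2 {
		return "Unknown"
	}
	if name, ok := officeVersion[tokens[0]]; ok {
		return name
	}
	return "Unknown"
}

func decodeCore(s string) (coreProps, error) {
	var c coreProps
	err := xml.Unmarshal([]byte(s), &c)
	return c, err
}

func decodeApp(s string) (appProps, error) {
	var a appProps
	err := xml.Unmarshal([]byte(s), &a)
	return a, err
}

func main() {
	c, err := decodeCore(sampleCore)
	if err != nil {
		log.Fatalln(err)
	}
	a, err := decodeApp(sampleApp)
	if err != nil {
		log.Fatalln(err)
	}
	fmt.Printf("creator=%s lastModifiedBy=%s\n", c.Creator, c.LastModifiedBy)
	fmt.Printf("application=%s company=%s office=%s\n",
		a.Application, a.Company, majorVersion(a.Version))
}
```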
Figure out how to search for and retrieve files by using Bing.
1. Submit a search request to Bing with proper filters to retrieve targeted results.
2. Scrape the HTML response, extracting the HREF (link) data to obtain direct URLs for documents.
3. Submit an HTTP request for each direct document URL.
4. Parse the response body to create a zip.Reader.
5. Pass the zip.Reader into the code you already developed to extract metadata.
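Step 1 can be sketched as follows. This is an illustration under assumptions, not a fixed recipe: it assumes the search filters are Bing's site: and filetype: operators, that the query goes in the q parameter of https://www.bing.com/search, and buildSearchURL is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildSearchURL returns a Bing search URL restricted to one domain and one
// file type. The choice of operators is an assumption made for this sketch.
func buildSearchURL(domain, filetype string) string {
	q := fmt.Sprintf("site:%s filetype:%s", domain, filetype)
	// QueryEscape makes the query safe to embed in the URL.
	return "https://www.bing.com/search?q=" + url.QueryEscape(q)
}

func main() {
	fmt.Println(buildSearchURL("example.com", "docx"))
}
```

Escaping the query matters because it contains spaces and colons, which would otherwise be sent unencoded in the URL.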
Analyze the search result elements in Bing.

Now scrape Bing results and parse the document metadata.
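The download side (requesting a document URL and turning the response body into a zip.Reader) can be sketched with the standard library alone. In this sketch a local httptest server stands in for the remote site, the archive is a placeholder rather than a real document, and fetchZip, sampleArchive, and demo are hypothetical names. The resulting *zip.Reader is what the metadata package's NewProperties expects.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
)

// fetchZip downloads url and wraps the body in a zip.Reader. zip.NewReader
// needs an io.ReaderAt and the total size, so the body is buffered first.
func fetchZip(client *http.Client, url string) (*zip.Reader, error) {
	res, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()
	if res.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", res.Status)
	}
	buf, err := io.ReadAll(res.Body)
	if err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf), int64(len(buf)))
}

// sampleArchive builds a placeholder archive holding the two property files.
func sampleArchive() ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, name := range []string{"docProps/core.xml", "docProps/app.xml"} {
		if _, err := w.Create(name); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// demo serves the sample archive from a local test server, fetches it back,
// and returns the entry names found in the downloaded archive.
func demo() ([]string, error) {
	data, err := sampleArchive()
	if err != nil {
		return nil, err
	}
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Write(data)
	}))
	defer srv.Close()

	r, err := fetchZip(srv.Client(), srv.URL+"/report.docx")
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range r.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	names, err := demo()
	if err != nil {
		log.Fatalln(err)
	}
	fmt.Println(names)
}
```

Buffering the whole body is simple but keeps each document in memory; for very large files, writing to a temporary file and using zip.OpenReader is an alternative.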
