http://www.hashbangcode.com/blog/netscape-http-cooke-file-parser-php

I recently needed to create a function that would read and extract cookies from a Netscape HTTP cookie file. This file is generated by PHP when it runs CURL (with the appropriate options enabled) and can be used in subsequent CURL calls. It can also be read to see which cookies were created after CURL has finished running. As an example, this is the sort of file that might be created during a typical CURL call.

# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.

www.example.com        FALSE        /        FALSE                cookiename        value

The first few lines are comments and can therefore be ignored. Each cookie line consists of the following tab-separated items (in the order they appear in the file).

  • domain - The domain that created and that can read the variable.
  • flag - A TRUE/FALSE value indicating if all machines within a given domain can access the variable. This value is set automatically by the browser, depending on the value you set for domain.
  • path - The path within the domain that the variable is valid for.
  • secure - A TRUE/FALSE value indicating if a secure connection with the domain is needed to access the variable.
  • expiration - The UNIX time that the variable will expire on.
  • name - The name of the variable.
  • value - The value of the variable.

The function used to extract this information is shown below. It works in a pretty straightforward way and returns an array of the cookies found, if any. I originally tried to use a hash character to detect the start of a comment line and then extract anything else that had content. It turns out, however, that some sites will add cookies that have a hash character at the start of the line (yes, even in the domain field). So it is safer to detect a cookie line by checking whether it contains 6 tab characters. The line is then exploded on the tab character and converted into an array of data items.

/**
 * Extract any cookies found from the cookie file. This function expects to get
 * a string containing the contents of the cookie file which it will then
 * attempt to extract and return any cookies found within.
 *
 * @param string $string The contents of the cookie file.
 *
 * @return array The array of cookies as extracted from the string.
 *
 */
function extractCookies($string) {
    $cookies = array();

    $lines = explode("\n", $string);

    // iterate over lines
    foreach ($lines as $line) {

        // we only care for valid cookie def lines
        if (isset($line[0]) && substr_count($line, "\t") == 6) {

            // get tokens in an array
            $tokens = explode("\t", $line);

            // trim the tokens
            $tokens = array_map('trim', $tokens);

            $cookie = array();

            // Extract the data
            $cookie['domain'] = $tokens[0];
            $cookie['flag'] = $tokens[1];
            $cookie['path'] = $tokens[2];
            $cookie['secure'] = $tokens[3];

            // Convert the UNIX timestamp to a readable date (24-hour clock)
            $cookie['expiration'] = date('Y-m-d H:i:s', (int) $tokens[4]);

            $cookie['name'] = $tokens[5];
            $cookie['value'] = $tokens[6];

            // Record the cookie.
            $cookies[] = $cookie;
        }
    }

    return $cookies;
}
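
To see why counting tabs is more robust than skipping every line that begins with a hash, here is a small sketch. The helper name isCookieLine and the sample lines are made up for illustration. The #HttpOnly_ prefix is what curl writes in front of the domain for HttpOnly cookies, so such a line starts with a hash but is still a cookie.

```php
<?php
// Hypothetical helper: the same test extractCookies() applies to each line.
function isCookieLine($line) {
    return isset($line[0]) && substr_count($line, "\t") == 6;
}

// Illustrative sample lines, not taken from a real cookie file.
$comment  = "# Netscape HTTP Cookie File";
$plain    = "www.example.com\tFALSE\t/\tFALSE\t0\tcookiename\tvalue";
$httpOnly = "#HttpOnly_www.example.com\tFALSE\t/\tFALSE\t0\tsession\tabc123";

var_dump(isCookieLine($comment));  // bool(false): no tabs, so it is skipped
var_dump(isCookieLine($plain));    // bool(true)
var_dump(isCookieLine($httpOnly)); // bool(true): starts with '#', still a cookie
var_dump(isCookieLine(""));        // bool(false): empty line
```

A check based on the leading hash alone would have thrown away the third line.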

To test extractCookies() I used the following code. It takes a URL (google.com in this case) and sets up the options for CURL so that a cookie file is created when the page is downloaded. This file is then passed to the above function to see what cookies are present.

// Url to extract cookies from
$url = 'http://www.google.com/';

// Create a cookie jar file
$cookiefile = tempnam("/tmp", "CURLCOOKIE");

// create a new cURL resource
$curl = curl_init();

curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

// Set user agent
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.3) Gecko/20090910 Ubuntu/9.04 (jaunty) Shiretoko/3.5.3");

// set URL and other appropriate options
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_HEADER, true);

curl_setopt($curl, CURLOPT_COOKIEJAR, $cookiefile);

$data = curl_exec($curl);

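// close cURL resource, and free up system resources; the cookie jar is
// written when the handle is destroyed (curl_close() is a no-op as of
// PHP 8.0, so also drop the reference)
curl_close($curl);
unset($curl);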

// Extract and store any cookies found
print_r(extractCookies(file_get_contents($cookiefile)));

When run, this code produces output like the following.

Array
(
    [0] => Array
        (
            [domain] => .google.com
            [flag] => TRUE
            [path] => /
            [secure] => FALSE
            [expiration] => 2013-06-29 10:00:01
            [name] => PREF
            [value] => ID=051f529ee8937fc5:FF=0:TM=1309424401:LM=1309424401:S=4rhYyPL_bW9KxVHI
        )

)
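
Two details are not handled by the function above: curl marks HttpOnly cookies by prefixing the domain with #HttpOnly_, and it writes an expiration of 0 for session cookies, which date() would render as a date in 1970. The following is only a sketch of how the per-line step could account for both. The function name parseCookieLine and the 'httponly' and 'session' keys are my own assumptions, not part of the original function.

```php
<?php
// Hypothetical variant of the per-line parsing step; the 'httponly' and
// 'session' keys are additions for illustration, not part of extractCookies().
function parseCookieLine($line) {
    if (substr_count($line, "\t") != 6) {
        return null; // not a cookie line
    }

    $tokens = array_map('trim', explode("\t", $line));
    $prefix = '#HttpOnly_';

    // curl marks HttpOnly cookies by prefixing the domain field.
    $httpOnly = strncmp($tokens[0], $prefix, strlen($prefix)) === 0;
    if ($httpOnly) {
        $tokens[0] = substr($tokens[0], strlen($prefix));
    }

    $expires = (int) $tokens[4];

    return array(
        'domain'     => $tokens[0],
        'flag'       => $tokens[1] === 'TRUE',
        'path'       => $tokens[2],
        'secure'     => $tokens[3] === 'TRUE',
        'expiration' => $expires,        // raw UNIX timestamp, left unformatted
        'session'    => $expires === 0,  // 0 means no expiry time was stored
        'httponly'   => $httpOnly,
        'name'       => $tokens[5],
        'value'      => $tokens[6],
    );
}

// Illustrative input line, not from a real cookie file.
$cookie = parseCookieLine("#HttpOnly_.example.com\tTRUE\t/\tTRUE\t0\tsid\tabc123");
print_r($cookie);
```

Keeping the expiration as an integer leaves the formatting (and the choice of time zone) to the caller.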
