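Preface:

Hey folks, here we go. Someone recently asked how to simulate a Sina Weibo login in order to scrape data. I quietly took a drag on my cigarette and told myself: old man, it's your time to shine. So I found some time today to write up the basics.

First:

To log in to Sina Weibo you need a pre-login step: Base64-encode the account name (after URL-encoding it), RSA-encrypt the password, and request http://login.sina.com.cn/sso/prelogin.php to obtain the parameters the login needs. The returned JSON string looks like this: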

{"retcode":,"servertime":,"pcid":"gz-9e1f24c9acdefb111e1c8078558c7d9c0bf2","nonce":"VHRDG1","pubkey":"EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245A87AC253062882729293E5506350508E7F9AA3BB77F4333231490F915F6D63C55FE2F08A49B353F444AD3993CACC02DB784ABBB8E42A9B1BBFFFB38BE18D78E87A0E41B9B8F73A928EE0CCEE1F6739884B9777E4FE9E88A1BBE495927AC4A799B3181D6442443","rsakv":"","is_openlock":,"lm":,"smsurl":"https:\/\/login.sina.com.cn\/sso\/msglogin?entry=weibo&mobile=18360903574&s=ea7a2e91c5f1d6da7f42aa87fe6963d0","showpin":,"exectime":}

The pre-login handling code is as follows:

/**
 * @author LongJin
 * @description Initializes the login information.<br> Returns false if initialization fails.
 * @return true if pre-login and password encryption both succeeded
 */
public boolean preLogin() {
    boolean flag = false;
    try {
        // su is the URL-encoded username, Base64-encoded
        su = new String(Base64.encodeBase64(URLEncoder.encode(username, "UTF-8").getBytes()));
        String url = "http://login.sina.com.cn/sso/prelogin.php?entry=weibo&rsakt=mod&checkpin=1&" +
                "client=ssologin.js(v1.4.5)&_=" + getTimestamp();
        url += "&su=" + su;
        String content;
        content = HttpUtils.getRequest(client, url);
        System.out.println("content------------" + content);
        JSONObject json = JSONObject.fromObject(content);
        System.out.println(json);
        // Values needed later for password encryption and the login POST
        servertime = json.getLong("servertime");
        nonce = json.getString("nonce");
        rsakv = json.getString("rsakv");
        pubkey = json.getString("pubkey");
        flag = encodePwd();
    } catch (UnsupportedEncodingException e) {
        System.out.println("UnsupportedEncodingException thrown");
    } catch (ClientProtocolException e) {
        System.out.println("ClientProtocolException thrown");
    } catch (IOException e) {
        System.out.println("IOException thrown");
    }
    return flag;
}
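The `su` value built in preLogin() can also be produced with the JDK alone. The following is only a minimal sketch: the class name `SuEncoderSketch` and the method name `encodeUsername` are made up for illustration, and it uses `java.util.Base64` (Java 8+) in place of the commons-codec `Base64` class the code above appears to use.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the original SinaWeibo class.
final class SuEncoderSketch {

    // URL-encodes the username, then Base64-encodes the result,
    // mirroring the "su" computation in preLogin().
    static String encodeUsername(String username) {
        try {
            String escaped = URLEncoder.encode(username, "UTF-8");
            // The URL-encoded form is plain ASCII, so the byte conversion is unambiguous.
            return Base64.getEncoder().encodeToString(escaped.getBytes(StandardCharsets.US_ASCII));
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `SuEncoderSketch.encodeUsername("user@example.com")` first turns `@` into `%40` and then Base64-encodes the escaped string.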

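Next:

With the login parameters in hand, send a POST request to http://login.sina.com.cn/sso/login.php, passing the values prepared during pre-login as parameters. The response looks like this: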

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=GBK" />
<title>新浪通行证</title> <script charset="utf-8" src="http://i.sso.sina.com.cn/js/ssologin.js"></script>
</head>
<body>
正在登录 ...
<script>
try{sinaSSOController.setCrossDomainUrlList({"retcode":,"arrURL":["http:\/\/passport.97973.com\/sso\/crossdomain?action=login&savestate=1518828005","http:\/\/passport.weibo.cn\/sso\/crossdomain?action=login&savestate=1"]});}
catch(e){
var msg = e.message;
var img = new Image();
var type = ;
img.src = 'http://login.sina.com.cn/sso/debuglog?msg=' + msg +'&type=' + type;
}try{sinaSSOController.crossDomainAction('login',function(){location.replace('http://passport.weibo.com/wbsso/login?ssosavestate=1518828005&url=http%3A%2F%2Fweibo.com%2Fajaxlogin.php%3Fframelogin%3D1%26callback%3Dparent.sinaSSOController.feedBackUrlCallBack&ticket=ST-NTUwODg3MjkxMQ==-1487292005-gz-FF56C545999F864FC6C7AB86FCA9FA4A-1&retcode=0');});}
catch(e){
var msg = e.message;
var img = new Image();
var type = ;
img.src = 'http://login.sina.com.cn/sso/debuglog?msg=' + msg +'&type=' + type;
}
</script>
</body>
</html>

Then use a regular expression to extract the part we want, namely the text between the quotes in location.replace(''). The regular expression is:

String regex = "location.replace\\('([\\s\\S]*?)'\\);";
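As a rough illustration of how this pattern behaves, here is a small self-contained sketch. The class and method names are hypothetical, and the pattern escapes the dot in `location.replace`, a small tightening of the expression above rather than what the original code uses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original SinaWeibo class.
final class RedirectExtractorSketch {

    // Group 1 captures, non-greedily, everything between location.replace(' and ');
    private static final Pattern REDIRECT =
            Pattern.compile("location\\.replace\\('([\\s\\S]*?)'\\);");

    // Returns the redirect URL, or null when the page contains no location.replace call.
    static String extractRedirect(String html) {
        Matcher m = REDIRECT.matcher(html);
        return m.find() ? m.group(1) : null;
    }
}
```

Returning null instead of throwing lets the caller treat "no redirect found" as a failed login.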

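Process the regex match: if the login succeeded, send a GET request to the extracted link, then take the part of the response inside the parentheses, which is a JSON string: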

{"result":true,"userinfo":{"uniqueid":"","userid":null,"displayname":null,"userdomain":"?wvr=5&lf=reg"}}

Take uniqueid and userdomain from it; they are used to visit the personal homepage. The login code is as follows:

/**
 * @author LongJin
 * @description Logs in.
 * @return true: login succeeded
 */
public boolean login() {
    if (preLogin()) {
        String url = "http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.15)";
        List<NameValuePair> parms = new ArrayList<NameValuePair>();
        parms.add(new BasicNameValuePair("entry", "weibo"));
        parms.add(new BasicNameValuePair("gateway", "1"));
        parms.add(new BasicNameValuePair("from", ""));
        parms.add(new BasicNameValuePair("savestate", "7"));
        parms.add(new BasicNameValuePair("useticket", "1"));
        parms.add(new BasicNameValuePair("pagerefer", "http://login.sina.com.cn/sso/logout.php?entry=miniblog&r=http%3A%2F%2Fweibo.com%2Flogout.php%3Fbackurl%3D%2F"));
        parms.add(new BasicNameValuePair("vsnf", "1"));
        parms.add(new BasicNameValuePair("su", su));
        parms.add(new BasicNameValuePair("service", "miniblog"));
        parms.add(new BasicNameValuePair("servertime", servertime + ""));
        parms.add(new BasicNameValuePair("nonce", nonce));
        parms.add(new BasicNameValuePair("pwencode", "rsa2"));
        parms.add(new BasicNameValuePair("rsakv", rsakv));
        parms.add(new BasicNameValuePair("sp", sp));
        parms.add(new BasicNameValuePair("encoding", "UTF-8"));
        parms.add(new BasicNameValuePair("prelt", "182"));
        parms.add(new BasicNameValuePair("url", "http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack"));
        parms.add(new BasicNameValuePair("domain", "sina.com.cn"));
        parms.add(new BasicNameValuePair("returntype", "META"));
        try {
            String content = HttpUtils.postRequest(client, url, parms);
            System.out.println("content----------" + content);
            // \\(' and '\\) escape the special characters; group 1 matches the text inside ('')
            String regex = "location.replace\\('([\\s\\S]*?)'\\);";
            Pattern p = Pattern.compile(regex);
            Matcher m = p.matcher(content);
            if (m.find()) {
                System.out.println("ss = " + m.group());
                location = m.group(1);
                if (location.contains("reason=")) { // if you end up here, congratulations, the login failed
                    errInfo = location.substring(location.indexOf("reason=") + 7);
                    errInfo = URLDecoder.decode(errInfo, "GBK");
                } else {
                    System.out.println("location = " + location);
                    String result = HttpUtils.getRequest(client, location);
                    int beginIndex = result.indexOf("(");
                    int endIndex = result.lastIndexOf(")");
                    result = result.substring(beginIndex + 1, endIndex); // take the JSON string inside the parentheses
                    JSONObject jsonObject = JSONObject.fromObject(result); // parse as JSON
                    // uniqueid + userdomain are the parameters used when visiting the homepage
                    uniqueid = jsonObject.getJSONObject("userinfo").getString("uniqueid");
                    userdomain = jsonObject.getJSONObject("userinfo").getString("userdomain");
                    System.out.println("result--------------" + result);
                    return true;
                }
            }
        } catch (ClientProtocolException e) {
            System.out.println("ClientProtocolException thrown");
        } catch (IOException e) {
            System.out.println("IOException thrown");
        }
    }
    return false;
}
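The step in login() that cuts the JSON out of the callback wrapper assumes both parentheses are present; if the response has a different shape, `substring` would throw. A minimal, defensive sketch of that step might look like the following (the class and method names are hypothetical):

```java
// Hypothetical helper, not part of the original SinaWeibo class.
final class JsonpSketch {

    // Returns the text between the first '(' and the last ')',
    // or null when the body is not wrapped in a callback call.
    static String unwrapJsonp(String body) {
        int begin = body.indexOf('(');
        int end = body.lastIndexOf(')');
        if (begin < 0 || end <= begin) {
            return null;
        }
        return body.substring(begin + 1, end);
    }
}
```

The returned string can then be handed to whichever JSON library is in use, for example the `JSONObject.fromObject` call shown above.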

A supplement: the password encryption code:

// Contents of the JS file that encrypts the login password
private static String sina_js = "var sinaSSOEncoder=sinaSSOEncoder||{};(function(){var hexcase=0;var chrsz=8;this.hex_sha1=function(s){return binb2hex(core_sha1(str2binb(s),s.length*chrsz));};var core_sha1=function(x,len){x[len>>5]|=0x80<<(24-len%32);x[((len+64>>9)<<4)+15]=len;var w=Array(80);var a=1732584193;var b=-271733879;var c=-1732584194;var d=271733878;var e=-1009589776;for(var i=0;i<x.length;i+=16){var olda=a;var oldb=b;var oldc=c;var oldd=d;var olde=e;for(var j=0;j<80;j++){if(j<16)w[j]=x[i+j];else w[j]=rol(w[j-3]^w[j-8]^w[j-14]^w[j-16],1);var t=safe_add(safe_add(rol(a,5),sha1_ft(j,b,c,d)),safe_add(safe_add(e,w[j]),sha1_kt(j)));e=d;d=c;c=rol(b,30);b=a;a=t;}a=safe_add(a,olda);b=safe_add(b,oldb);c=safe_add(c,oldc);d=safe_add(d,oldd);e=safe_add(e,olde);}return Array(a,b,c,d,e);};var sha1_ft=function(t,b,c,d){if(t<20)return(b&c)|((~b)&d);if(t<40)return b^c^d;if(t<60)return(b&c)|(b&d)|(c&d);return b^c^d;};var sha1_kt=function(t){return(t<20)?1518500249:(t<40)?1859775393:(t<60)?-1894007588:-899497514;};var safe_add=function(x,y){var lsw=(x&0xFFFF)+(y&0xFFFF);var msw=(x>>16)+(y>>16)+(lsw>>16);return(msw<<16)|(lsw&0xFFFF);};var rol=function(num,cnt){return(num<<cnt)|(num>>>(32-cnt));};var str2binb=function(str){var bin=Array();var mask=(1<<chrsz)-1;for(var i=0;i<str.length*chrsz;i+=chrsz)bin[i>>5]|=(str.charCodeAt(i/chrsz)&mask)<<(24-i%32);return bin;};var binb2hex=function(binarray){var hex_tab=hexcase?'0123456789ABCDEF':'0123456789abcdef';var str='';for(var i=0;i<binarray.length*4;i++){str+=hex_tab.charAt((binarray[i>>2]>>((3-i%4)*8+4))&0xF)+hex_tab.charAt((binarray[i>>2]>>((3-i%4)*8))&0xF);}return str;};this.base64={encode:function(input){input=''+input;if(input=='')return '';var output='';var chr1,chr2,chr3='';var enc1,enc2,enc3,enc4='';var i=0;do{chr1=input.charCodeAt(i++);chr2=input.charCodeAt(i++);chr3=input.charCodeAt(i++);enc1=chr1>>2;enc2=((chr1&3)<<4)|(chr2>>4);enc3=((chr2&15)<<2)|(chr3>>6);enc4=chr3&63;if(isNaN(chr2)){enc3=enc4=64;}else 
if(isNaN(chr3)){enc4=64;}output=output+this._keys.charAt(enc1)+this._keys.charAt(enc2)+this._keys.charAt(enc3)+this._keys.charAt(enc4);chr1=chr2=chr3='';enc1=enc2=enc3=enc4='';}while(i<input.length);return output;},_keys:'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='};}).call(sinaSSOEncoder);;(function(){var dbits;var canary=0xdeadbeefcafe;var j_lm=((canary&0xffffff)==0xefcafe);function BigInteger(a,b,c){if(a!=null)if('number'==typeof a)this.fromNumber(a,b,c);else if(b==null && 'string' !=typeof a)this.fromString(a,256);else this.fromString(a,b);}function nbi(){return new BigInteger(null);}function am1(i,x,w,j,c,n){while(--n>=0){var v=x*this[i++]+w[j]+c;c=Math.floor(v/0x4000000);w[j++]=v&0x3ffffff;}return c;}function am2(i,x,w,j,c,n){var xl=x&0x7fff,xh=x>>15;while(--n>=0){var l=this[i]&0x7fff;var h=this[i++]>>15;var m=xh*l+h*xl;l=xl*l+((m&0x7fff)<<15)+w[j]+(c&0x3fffffff);c=(l>>>30)+(m>>>15)+xh*h+(c>>>30);w[j++]=l&0x3fffffff;}return c;}function am3(i,x,w,j,c,n){var xl=x&0x3fff,xh=x>>14;while(--n>=0){var l=this[i]&0x3fff;var h=this[i++]>>14;var m=xh*l+h*xl;l=xl*l+((m&0x3fff)<<14)+w[j]+c;c=(l>>28)+(m>>14)+xh*h;w[j++]=l&0xfffffff;}return c;}BigInteger.prototype.am=am3;dbits=28;BigInteger.prototype.DB=dbits;BigInteger.prototype.DM=((1<<dbits)-1);BigInteger.prototype.DV=(1<<dbits);var BI_FP=52;BigInteger.prototype.FV=Math.pow(2,BI_FP);BigInteger.prototype.F1=BI_FP-dbits;BigInteger.prototype.F2=2*dbits-BI_FP;var BI_RM='0123456789abcdefghijklmnopqrstuvwxyz';var BI_RC=new Array();var rr,vv;rr='0'.charCodeAt(0);for(vv=0;vv<=9;++vv)BI_RC[rr++]=vv;rr='a'.charCodeAt(0);for(vv=10;vv<36;++vv)BI_RC[rr++]=vv;rr='A'.charCodeAt(0);for(vv=10;vv<36;++vv)BI_RC[rr++]=vv;function int2char(n){return BI_RM.charAt(n);}function intAt(s,i){var c=BI_RC[s.charCodeAt(i)];return(c==null)?-1:c;}function bnpCopyTo(r){for(var i=this.t-1;i>=0;--i)r[i]=this[i];r.t=this.t;r.s=this.s;}function bnpFromInt(x){this.t=1;this.s=(x<0)?-1:0;if(x>0)this[0]=x;else if(x<-1)this[0]=x+DV;else 
this.t=0;}function nbv(i){var r=nbi();r.fromInt(i);return r;}function bnpFromString(s,b){var k;if(b==16)k=4;else if(b==8)k=3;else if(b==256)k=8;else if(b==2)k=1;else if(b==32)k=5;else if(b==4)k=2;else{this.fromRadix(s,b);return;}this.t=0;this.s=0;var i=s.length,mi=false,sh=0;while(--i>=0){var x=(k==8)?s[i]&0xff:intAt(s,i);if(x<0){if(s.charAt(i)=='-')mi=true;continue;}mi=false;if(sh==0)this[this.t++]=x;else if(sh+k>this.DB){this[this.t-1]|=(x&((1<<(this.DB-sh))-1))<<sh;this[this.t++]=(x>>(this.DB-sh));}else this[this.t-1]|=x<<sh;sh+=k;if(sh>=this.DB)sh-=this.DB;}if(k==8&&(s[0]&0x80)!=0){this.s=-1;if(sh>0)this[this.t-1]|=((1<<(this.DB-sh))-1)<<sh;}this.clamp();if(mi)BigInteger.ZERO.subTo(this,this);}function bnpClamp(){var c=this.s&this.DM;while(this.t>0&&this[this.t-1]==c)--this.t;}function bnToString(b){if(this.s<0)return '-'+this.negate().toString(b);var k;if(b==16)k=4;else if(b==8)k=3;else if(b==2)k=1;else if(b==32)k=5;else if(b==4)k=2;else return this.toRadix(b);var km=(1<<k)-1,d,m=false,r='',i=this.t;var p=this.DB-(i*this.DB)%k;if(i-->0){if(p<this.DB&&(d=this[i]>>p)>0){m=true;r=int2char(d);}while(i>=0){if(p<k){d=(this[i]&((1<<p)-1))<<(k-p);d|=this[--i]>>(p+=this.DB-k);}else{d=(this[i]>>(p-=k))&km;if(p<=0){p+=this.DB;--i;}}if(d>0)m=true;if(m)r+=int2char(d);}}return m?r:'0';}function bnNegate(){var r=nbi();BigInteger.ZERO.subTo(this,r);return r;}function bnAbs(){return(this.s<0)?this.negate():this;}function bnCompareTo(a){var r=this.s-a.s;if(r!=0)return r;var i=this.t;r=i-a.t;if(r!=0)return r;while(--i>=0)if((r=this[i]-a[i])!=0)return r;return 0;}function nbits(x){var r=1,t;if((t=x>>>16)!=0){x=t;r+=16;}if((t=x>>8)!=0){x=t;r+=8;}if((t=x>>4)!=0){x=t;r+=4;}if((t=x>>2)!=0){x=t;r+=2;}if((t=x>>1)!=0){x=t;r+=1;}return r;}function bnBitLength(){if(this.t<=0)return 0;return this.DB*(this.t-1)+nbits(this[this.t-1]^(this.s&this.DM));}function bnpDLShiftTo(n,r){var i;for(i=this.t-1;i>=0;--i)r[i+n]=this[i];for(i=n-1;i>=0;--i)r[i]=0;r.t=this.t+n;r.s=this.s;}function 
bnpDRShiftTo(n,r){for(var i=n;i<this.t;++i)r[i-n]=this[i];r.t=Math.max(this.t-n,0);r.s=this.s;}function bnpLShiftTo(n,r){var bs=n%this.DB;var cbs=this.DB-bs;var bm=(1<<cbs)-1;var ds=Math.floor(n/this.DB),c=(this.s<<bs)&this.DM,i;for(i=this.t-1;i>=0;--i){r[i+ds+1]=(this[i]>>cbs)|c;c=(this[i]&bm)<<bs;}for(i=ds-1;i>=0;--i)r[i]=0;r[ds]=c;r.t=this.t+ds+1;r.s=this.s;r.clamp();}function bnpRShiftTo(n,r){r.s=this.s;var ds=Math.floor(n/this.DB);if(ds>=this.t){r.t=0;return;}var bs=n%this.DB;var cbs=this.DB-bs;var bm=(1<<bs)-1;r[0]=this[ds]>>bs;for(var i=ds+1;i<this.t;++i){r[i-ds-1]|=(this[i]&bm)<<cbs;r[i-ds]=this[i]>>bs;}if(bs>0)r[this.t-ds-1]|=(this.s&bm)<<cbs;r.t=this.t-ds;r.clamp();}function bnpSubTo(a,r){var i=0,c=0,m=Math.min(a.t,this.t);while(i<m){c+=this[i]-a[i];r[i++]=c&this.DM;c>>=this.DB;}if(a.t<this.t){c-=a.s;while(i<this.t){c+=this[i];r[i++]=c&this.DM;c>>=this.DB;}c+=this.s;}else{c+=this.s;while(i<a.t){c-=a[i];r[i++]=c&this.DM;c>>=this.DB;}c-=a.s;}r.s=(c<0)?-1:0;if(c<-1)r[i++]=this.DV+c;else if(c>0)r[i++]=c;r.t=i;r.clamp();}function bnpMultiplyTo(a,r){var x=this.abs(),y=a.abs();var i=x.t;r.t=i+y.t;while(--i>=0)r[i]=0;for(i=0;i<y.t;++i)r[i+x.t]=x.am(0,y[i],r,i,0,x.t);r.s=0;r.clamp();if(this.s!=a.s)BigInteger.ZERO.subTo(r,r);}function bnpSquareTo(r){var x=this.abs();var i=r.t=2*x.t;while(--i>=0)r[i]=0;for(i=0;i<x.t-1;++i){var c=x.am(i,x[i],r,2*i,0,1);if((r[i+x.t]+=x.am(i+1,2*x[i],r,2*i+1,c,x.t-i-1))>=x.DV){r[i+x.t]-=x.DV;r[i+x.t+1]=1;}}if(r.t>0)r[r.t-1]+=x.am(i,x[i],r,2*i,0,1);r.s=0;r.clamp();}function bnpDivRemTo(m,q,r){var pm=m.abs();if(pm.t<=0)return;var pt=this.abs();if(pt.t<pm.t){if(q!=null)q.fromInt(0);if(r!=null)this.copyTo(r);return;}if(r==null)r=nbi();var y=nbi(),ts=this.s,ms=m.s;var nsh=this.DB-nbits(pm[pm.t-1]);if(nsh>0){pm.lShiftTo(nsh,y);pt.lShiftTo(nsh,r);}else{pm.copyTo(y);pt.copyTo(r);}var ys=y.t;var y0=y[ys-1];if(y0==0)return;var yt=y0*(1<<this.F1)+((ys>1)?y[ys-2]>>this.F2:0);var d1=this.FV/yt,d2=(1<<this.F1)/yt,e=1<<this.F2;var 
i=r.t,j=i-ys,t=(q==null)?nbi():q;y.dlShiftTo(j,t);if(r.compareTo(t)>=0){r[r.t++]=1;r.subTo(t,r);}BigInteger.ONE.dlShiftTo(ys,t);t.subTo(y,y);while(y.t<ys)y[y.t++]=0;while(--j>=0){var qd=(r[--i]==y0)?this.DM:Math.floor(r[i]*d1+(r[i-1]+e)*d2);if((r[i]+=y.am(0,qd,r,j,0,ys))<qd){y.dlShiftTo(j,t);r.subTo(t,r);while(r[i]<--qd)r.subTo(t,r);}}if(q!=null){r.drShiftTo(ys,q);if(ts!=ms)BigInteger.ZERO.subTo(q,q);}r.t=ys;r.clamp();if(nsh>0)r.rShiftTo(nsh,r);if(ts<0)BigInteger.ZERO.subTo(r,r);}function bnMod(a){var r=nbi();this.abs().divRemTo(a,null,r);if(this.s<0&&r.compareTo(BigInteger.ZERO)>0)a.subTo(r,r);return r;}function Classic(m){this.m=m;}function cConvert(x){if(x.s<0||x.compareTo(this.m)>=0)return x.mod(this.m);else return x;}function cRevert(x){return x;}function cReduce(x){x.divRemTo(this.m,null,x);}function cMulTo(x,y,r){x.multiplyTo(y,r);this.reduce(r);}function cSqrTo(x,r){x.squareTo(r);this.reduce(r);}Classic.prototype.convert=cConvert;Classic.prototype.revert=cRevert;Classic.prototype.reduce=cReduce;Classic.prototype.mulTo=cMulTo;Classic.prototype.sqrTo=cSqrTo;function bnpInvDigit(){if(this.t<1)return 0;var x=this[0];if((x&1)==0)return 0;var y=x&3;y=(y*(2-(x&0xf)*y))&0xf;y=(y*(2-(x&0xff)*y))&0xff;y=(y*(2-(((x&0xffff)*y)&0xffff)))&0xffff;y=(y*(2-x*y%this.DV))%this.DV;return(y>0)?this.DV-y:-y;}function Montgomery(m){this.m=m;this.mp=m.invDigit();this.mpl=this.mp&0x7fff;this.mph=this.mp>>15;this.um=(1<<(m.DB-15))-1;this.mt2=2*m.t;}function montConvert(x){var r=nbi();x.abs().dlShiftTo(this.m.t,r);r.divRemTo(this.m,null,r);if(x.s<0&&r.compareTo(BigInteger.ZERO)>0)this.m.subTo(r,r);return r;}function montRevert(x){var r=nbi();x.copyTo(r);this.reduce(r);return r;}function montReduce(x){while(x.t<=this.mt2)x[x.t++]=0;for(var i=0;i<this.m.t;++i){var j=x[i]&0x7fff;var 
u0=(j*this.mpl+(((j*this.mph+(x[i]>>15)*this.mpl)&this.um)<<15))&x.DM;j=i+this.m.t;x[j]+=this.m.am(0,u0,x,i,0,this.m.t);while(x[j]>=x.DV){x[j]-=x.DV;x[++j]++;}}x.clamp();x.drShiftTo(this.m.t,x);if(x.compareTo(this.m)>=0)x.subTo(this.m,x);}function montSqrTo(x,r){x.squareTo(r);this.reduce(r);}function montMulTo(x,y,r){x.multiplyTo(y,r);this.reduce(r);}Montgomery.prototype.convert=montConvert;Montgomery.prototype.revert=montRevert;Montgomery.prototype.reduce=montReduce;Montgomery.prototype.mulTo=montMulTo;Montgomery.prototype.sqrTo=montSqrTo;function bnpIsEven(){return((this.t>0)?(this[0]&1):this.s)==0;}function bnpExp(e,z){if(e>0xffffffff||e<1)return BigInteger.ONE;var r=nbi(),r2=nbi(),g=z.convert(this),i=nbits(e)-1;g.copyTo(r);while(--i>=0){z.sqrTo(r,r2);if((e&(1<<i))>0)z.mulTo(r2,g,r);else{var t=r;r=r2;r2=t;}}return z.revert(r);}function bnModPowInt(e,m){var z;if(e<256||m.isEven())z=new Classic(m);else z=new Montgomery(m);return this.exp(e,z);}BigInteger.prototype.copyTo=bnpCopyTo;BigInteger.prototype.fromInt=bnpFromInt;BigInteger.prototype.fromString=bnpFromString;BigInteger.prototype.clamp=bnpClamp;BigInteger.prototype.dlShiftTo=bnpDLShiftTo;BigInteger.prototype.drShiftTo=bnpDRShiftTo;BigInteger.prototype.lShiftTo=bnpLShiftTo;BigInteger.prototype.rShiftTo=bnpRShiftTo;BigInteger.prototype.subTo=bnpSubTo;BigInteger.prototype.multiplyTo=bnpMultiplyTo;BigInteger.prototype.squareTo=bnpSquareTo;BigInteger.prototype.divRemTo=bnpDivRemTo;BigInteger.prototype.invDigit=bnpInvDigit;BigInteger.prototype.isEven=bnpIsEven;BigInteger.prototype.exp=bnpExp;BigInteger.prototype.toString=bnToString;BigInteger.prototype.negate=bnNegate;BigInteger.prototype.abs=bnAbs;BigInteger.prototype.compareTo=bnCompareTo;BigInteger.prototype.bitLength=bnBitLength;BigInteger.prototype.mod=bnMod;BigInteger.prototype.modPowInt=bnModPowInt;BigInteger.ZERO=nbv(0);BigInteger.ONE=nbv(1);function Arcfour(){this.i=0;this.j=0;this.S=new Array();}function ARC4init(key){var 
i,j,t;for(i=0;i<256;++i)this.S[i]=i;j=0;for(i=0;i<256;++i){j=(j+this.S[i]+key[i%key.length])&255;t=this.S[i];this.S[i]=this.S[j];this.S[j]=t;}this.i=0;this.j=0;}function ARC4next(){var t;this.i=(this.i+1)&255;this.j=(this.j+this.S[this.i])&255;t=this.S[this.i];this.S[this.i]=this.S[this.j];this.S[this.j]=t;return this.S[(t+this.S[this.i])&255];}Arcfour.prototype.init=ARC4init;Arcfour.prototype.next=ARC4next;function prng_newstate(){return new Arcfour();}var rng_psize=256;var rng_state;var rng_pool;var rng_pptr;function rng_seed_int(x){rng_pool[rng_pptr++]^=x&255;rng_pool[rng_pptr++]^=(x>>8)&255;rng_pool[rng_pptr++]^=(x>>16)&255;rng_pool[rng_pptr++]^=(x>>24)&255;if(rng_pptr>=rng_psize)rng_pptr-=rng_psize;}function rng_seed_time(){rng_seed_int(new Date().getTime());}if(rng_pool==null){rng_pool=new Array();rng_pptr=0;var t;while(rng_pptr<rng_psize){t=Math.floor(65536*Math.random());rng_pool[rng_pptr++]=t>>>8;rng_pool[rng_pptr++]=t&255;}rng_pptr=0;rng_seed_time();}function rng_get_byte(){if(rng_state==null){rng_seed_time();rng_state=prng_newstate();rng_state.init(rng_pool);for(rng_pptr=0;rng_pptr<rng_pool.length;++rng_pptr)rng_pool[rng_pptr]=0;rng_pptr=0;}return rng_state.next();}function rng_get_bytes(ba){var i;for(i=0;i<ba.length;++i)ba[i]=rng_get_byte();}function SecureRandom(){}SecureRandom.prototype.nextBytes=rng_get_bytes;function parseBigInt(str,r){return new BigInteger(str,r);}function linebrk(s,n){var ret='';var i=0;while(i+n<s.length){ret+=s.substring(i,i+n)+'\\n';i+=n;}return ret+s.substring(i,s.length);}function byte2Hex(b){if(b<0x10)return '0'+b.toString(16);else return b.toString(16);}function pkcs1pad2(s,n){if(n<s.length+11){return null;}var ba=new Array();var i=s.length-1;while(i>=0&&n>0){var c=s.charCodeAt(i--);if(c<128){ba[--n]=c;}else if((c>127)&&(c<2048)){ba[--n]=(c&63)|128;ba[--n]=(c>>6)|192;}else{ba[--n]=(c&63)|128;ba[--n]=((c>>6)&63)|128;ba[--n]=(c>>12)|224;}}ba[--n]=0;var rng=new SecureRandom();var x=new 
Array();while(n>2){x[0]=0;while(x[0]==0)rng.nextBytes(x);ba[--n]=x[0];}ba[--n]=2;ba[--n]=0;return new BigInteger(ba);}function RSAKey(){this.n=null;this.e=0;this.d=null;this.p=null;this.q=null;this.dmp1=null;this.dmq1=null;this.coeff=null;}function RSASetPublic(N,E){if(N!=null&&E!=null&&N.length>0&&E.length>0){this.n=parseBigInt(N,16);this.e=parseInt(E,16);}else alert('Invalid RSA public key');}function RSADoPublic(x){return x.modPowInt(this.e,this.n);}function RSAEncrypt(text){var m=pkcs1pad2(text,(this.n.bitLength()+7)>>3);if(m==null)return null;var c=this.doPublic(m);if(c==null)return null;var h=c.toString(16);if((h.length&1)==0)return h;else return '0'+h;}RSAKey.prototype.doPublic=RSADoPublic;RSAKey.prototype.setPublic=RSASetPublic;RSAKey.prototype.encrypt=RSAEncrypt;this.RSAKey=RSAKey;}).call(sinaSSOEncoder);function getpass(pwd,servicetime,nonce,rsaPubkey){var RSAKey=new sinaSSOEncoder.RSAKey();RSAKey.setPublic(rsaPubkey,'10001');var password=RSAKey.encrypt([servicetime,nonce].join('\\t')+'\\n'+pwd);return password;}";

  

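/**
 * RSA-encrypts the password.<br>
 * Returns false if encryption fails.
 * @return true if the encrypted password was stored in sp
 */
private boolean encodePwd() {
    ScriptEngineManager sem = new ScriptEngineManager();
    // Note: the JDK's bundled JavaScript engine (Nashorn) was removed in JDK 15,
    // so on newer JDKs this lookup returns null unless an engine is added to the classpath.
    ScriptEngine se = sem.getEngineByName("javascript");
    if (se == null) {
        errInfo = "Password encryption failed!";
        return false;
    }
    try {
        // Encrypt the password (RSA) by calling the function defined in the JS.
        // Here the JS is held in a string; it could also be read from a file,
        // as in the commented-out section below.
        se.eval(sina_js);
        // Call the JS function that performs the encryption
        if (se instanceof Invocable) {
            Invocable iv = (Invocable) se;
            sp = (String) iv.invokeFunction("getpass", this.password, this.servertime, this.nonce,
                    this.pubkey);
        }
        /* FileReader fr = new FileReader("E:\\encoder.js");
        se.eval(fr);
        Invocable invocableEngine = (Invocable) se;
        String callbackvalue = (String) invocableEngine.invokeFunction("encodePwd", pubkey, servertime, nonce, password);
        sp = callbackvalue;*/
        return true;
    } catch (ScriptException e) {
        // fall through to the error handling below
    } catch (NoSuchMethodException e) {
        // fall through to the error handling below
    }
    errInfo = "Password encryption failed!";
    return false;
}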
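Reading the JS above, getpass joins servertime and nonce with a tab, appends a newline and the password, applies PKCS#1 type-2 padding, and encrypts with the public exponent 0x10001, returning hex. The same steps can be sketched in pure Java with the JCA, which avoids the script engine entirely. This is an assumption-laden sketch, not the author's method: the class and method names are hypothetical, and it has not been checked against the live server.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.spec.RSAPublicKeySpec;
import javax.crypto.Cipher;

// Hypothetical helper, not part of the original SinaWeibo class.
final class WeiboPasswordSketch {

    // pubkeyHex is the hex modulus returned as "pubkey" by prelogin.php.
    static String encryptPassword(String password, long servertime, String nonce, String pubkeyHex) {
        try {
            BigInteger modulus = new BigInteger(pubkeyHex, 16);
            BigInteger exponent = new BigInteger("10001", 16); // same exponent as in getpass()
            PublicKey key = KeyFactory.getInstance("RSA")
                    .generatePublic(new RSAPublicKeySpec(modulus, exponent));

            // PKCS#1 v1.5 encryption padding, matching pkcs1pad2 in the JS
            Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);

            // servertime + TAB + nonce + NEWLINE + password
            String message = servertime + "\t" + nonce + "\n" + password;
            byte[] encrypted = cipher.doFinal(message.getBytes(StandardCharsets.UTF_8));

            // Hex without leading zero bytes, padded to an even length like the JS
            String hex = new BigInteger(1, encrypted).toString(16);
            return (hex.length() % 2 == 0) ? hex : "0" + hex;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA encryption failed", e);
        }
    }
}
```

Because the padding is random, two calls with the same input produce different ciphertexts; that is expected and matches the behaviour of the JS version.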

/**
 * @author LongJin
 * @description Returns the error message.
 * @return the message recorded by the last failed step
 */
public String getErrInfo() {
    return errInfo;
}

That basically completes the login part.

Finally, let's test the login and scrape some data:

public static void main(String[] args) throws ClientProtocolException, IOException {
    SinaWeibo weibo = new SinaWeibo("**", "***"); // account and password are not disclosed here
    if (weibo.login()) {
        System.out.println("Login succeeded!");
        // Request the personal homepage and get the response as an input stream
        InputStream con = HttpUtils.getRequests(client, "http://weibo.com/u/" + uniqueid + userdomain);
        String cont = readStreamByEncoding(con, "UTF-8"); // convert the returned stream to a string
        String sb = HttpUtils.getText(cont); // extract the text content with jsoup
        // readStreamOutFileByEncoding(sb); the content can also be written to a file
    } else {
        System.out.println("Login failed!");
    }
}
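The post does not show readStreamByEncoding. As a guess at what such a helper could look like, here is a minimal sketch; only the method name comes from the call above, while the class name and the choice to wrap IOException in an unchecked exception are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper; the original implementation is not shown in the post.
final class StreamReaderSketch {

    // Reads the whole stream and decodes it with the given charset name.
    static String readStreamByEncoding(InputStream in, String charsetName) {
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, Charset.forName(charsetName)))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The try-with-resources block closes the stream once reading finishes, so the caller does not need to close it again.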

The result is:

text--------------我投给了"易建联" 这个选项。 #本土MVP# 本赛季常规赛最有价值球员(MVP)评选小组由中国篮协新闻委员会成员单位代表、俱乐部推荐的地方媒体代表组成,新浪拥有一票,我们将把粉丝们的意见发给篮协。 R本赛季本土MVP是? ????

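That completes the whole login. As the Buddha said, selfless giving is a virtue, so I am sharing this post with everyone so that we can learn and improve together. If anything here falls short, please point it out.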
