Date: 2020.02.04

Blog post No.: 143

Tuesday

   [If you want to use the code in this post, please leave a comment below first (just let me know).]

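  All related posts in this series:

  a. [Basic preparation]

  b. [Word cloud + data import]

  c. [Topology data]

  d. [Data repair]

  e. [Explanation repair + hot-word references]

  f. [JSP demo + page navigation]

  g. [Hot-word classification + catalog generation] (this post)

  h. [Hot-word relationship graph + report generation]

  i. [App development]

  j. [Security hardening]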


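  In the requirements figure, the parts I have already finished are highlighted in yellow. Only four parts remain: hot-word classification, catalog generation, hot-word relationship graph display, and data report export. These are the most urgent ones, so it is time to roll up my sleeves and get to work!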

    

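   1. Hot-word classification

    My teacher said to follow the categories used by the major platforms, so I will simply use the cnblogs categories (I really cannot follow how the machine-learning approaches work; I am nowhere near even the entry level). The cnblogs news section sorts its news into these categories: 互联网类 (Internet), IT业界类 (IT industry), 软件开发类 (software development), 开源类 (open source), 电脑硬件类 (computer hardware), 游戏类 (games), 创业类 (startups), 手机相关类 (mobile), 科学类 (science), and 其他类 (other). I will label the data crawled from each category's news with that category. (So the data has to be crawled all over again.)

    A note before the crawl starts: this should be the last change to the crawler. I also noticed that every category has exactly 100 pages, so each category seems to be capped at the same depth, and I cannot guarantee the data is free of error. To cut the data volume I plan to raise the condition "frequency of 15" to "frequency of 20"; otherwise the crawl would never finish. My rough plan is to write this post over today and tomorrow, write a summary post tomorrow as well, and call this small goal done. I may add a WeChat mini program or an app at the end; we will see.

    Given these 10 categories, what exactly do we need to crawl?

    First, request the start URL https://news.cnblogs.com/ with request headers and collect the links of the 10 categories above. Then work out how the links inside each category are built and start a new crawl from them. Every hot-word record crawled for a category gets a "hot-word type" tag appended. This requires changing the KeyWords class: add a kind attribute, rewrite the __toString() member function, and then update every place that uses KeyWords. (News does not need changes.)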
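A minimal sketch of what the KeyWords change might look like. Only the kind attribute is named in this post; the other field names (word, num, exp, link) and the tab-separated layout are assumptions for illustration, and the sketch uses Python's standard __str__ hook in place of the post's __toString().

```python
class KeyWords:
    """Hypothetical hot-word record; only `kind` is named in the post."""

    def __init__(self, word, num, exp, link, kind):
        self.word = word    # the hot word itself
        self.num = num      # how often it occurred
        self.exp = exp      # explanation text (assumed field)
        self.link = link    # source page (assumed field)
        self.kind = kind    # new: the news category the word came from

    def __str__(self):
        # One tab-separated line per record, with the category appended last
        fields = [self.word, str(self.num), self.exp, self.link, self.kind]
        return "\t".join(fields)


record = KeyWords("Python", 42, "a programming language",
                  "https://news.cnblogs.com/", "软件开发类")
line = str(record)
```

Appending kind as the last field keeps the existing columns in place, so code that reads the earlier fields by position would not need to change.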

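    How the category page URLs are built:

      The original news URL: https://news.cnblogs.com/

      Taking "互联网" (Internet) as an example: https://news.cnblogs.com/n/c1101

      And page 100 of that category: https://news.cnblogs.com/n/c1101?page=100

      It is easy to see that a category URL is the original URL plus the href of that category's a tag, and that a page is selected with the page query parameter. These pieces have to be combined to form the crawl URL.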
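The URL construction described above can be sketched as follows. The function name and the assumption that the href has the form "/n/c1101" are mine; the post only shows the finished URLs.

```python
from urllib.parse import urljoin

BASE_URL = "https://news.cnblogs.com/"


def category_page_url(href, page=1):
    """Join the category href onto the base URL and select a page."""
    url = urljoin(BASE_URL, href)
    # Page 1 is left as the bare category URL, as in the example above
    if page > 1:
        url += "?page=" + str(page)
    return url


first = category_page_url("/n/c1101")
last = category_page_url("/n/c1101", page=100)
```

Here first is the bare category URL and last carries ?page=100, matching the two example addresses.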

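    But while crawling I hit a problem: I could not scrape the category links themselves. So I had to collect them by hand. With only 10 entries it hardly matters; I was being lazy and wanted the page to do it for me, but apparently cnblogs wants me to be diligent. Ha!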

    The links:

      互联网类 (Internet): https://news.cnblogs.com/n/c1101

      IT业界类 (IT industry): https://news.cnblogs.com/n/c1102

      软件开发类 (software development): https://news.cnblogs.com/n/c1103

      开源类 (open source): https://news.cnblogs.com/n/c1109

      电脑硬件类 (computer hardware): https://news.cnblogs.com/n/c1111

      游戏类 (games): https://news.cnblogs.com/n/c1110

      创业类 (startups): https://news.cnblogs.com/n/c1112

      手机相关类 (mobile): https://news.cnblogs.com/n/c1113

      科学类 (science): https://news.cnblogs.com/n/c1114

      其他类 (other): https://news.cnblogs.com/n/c1199

    In the Surapity class, build a dictionary that stores each category name and its link.
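A sketch of such a dictionary and of how the crawl jobs could be generated from it. The names KIND_LINKS and all_page_urls are hypothetical, and the sketch assumes the site also accepts ?page=1 for the first page.

```python
# Category name -> category link, as collected by hand above
KIND_LINKS = {
    "互联网类": "https://news.cnblogs.com/n/c1101",
    "IT业界类": "https://news.cnblogs.com/n/c1102",
    "软件开发类": "https://news.cnblogs.com/n/c1103",
    "开源类": "https://news.cnblogs.com/n/c1109",
    "电脑硬件类": "https://news.cnblogs.com/n/c1111",
    "游戏类": "https://news.cnblogs.com/n/c1110",
    "创业类": "https://news.cnblogs.com/n/c1112",
    "手机相关类": "https://news.cnblogs.com/n/c1113",
    "科学类": "https://news.cnblogs.com/n/c1114",
    "其他类": "https://news.cnblogs.com/n/c1199",
}

PAGES_PER_KIND = 100  # every category showed 100 pages


def all_page_urls():
    """Yield (kind, url) for every page of every category."""
    for kind, link in KIND_LINKS.items():
        for page in range(1, PAGES_PER_KIND + 1):
            yield kind, link + "?page=" + str(page)


jobs = list(all_page_urls())
```

Carrying the category name alongside each URL is what lets the crawler stamp every hot word with its kind.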

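    The crawl took a long time, from 4:51 in the afternoon until 1:44 the next morning as I write this, and the process was too bumpy to summarize briefly.

    Along the way several pages made the crawler terminate, for example "Apple Watch UI动效解析" in the 其他类 category. Every attempt got stuck there. A programmer's pain knows no equal!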
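One way to keep a single bad page from stopping the whole crawl is to catch the failure, record the URL, and move on. This is a sketch of that idea, not the crawler used in this post; crawl_all and fake_fetch are hypothetical names, and fake_fetch stands in for the real downloader.

```python
def crawl_all(urls, fetch):
    """Fetch every URL, recording failures instead of stopping."""
    pages, failed = {}, []
    for url in urls:
        try:
            pages[url] = fetch(url)
        except Exception as error:  # broad on purpose: any page may misbehave
            failed.append((url, repr(error)))
    return pages, failed


def fake_fetch(url):
    # Stand-in for the real downloader; fails on one URL to show the skip
    if url.endswith("bad"):
        raise TimeoutError("page hung")
    return "<html>" + url + "</html>"


pages, failed = crawl_all(["a", "bad", "b"], fake_fetch)
```

With a real downloader, pages that hang rather than fail would also need a timeout, for example the timeout argument of urllib.request.urlopen, so that the hang turns into an exception this loop can skip.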

    The base data comes to 17469 records in total, in a file of about 1.96 MB.

    Now to build the data table (first modify fileR.py):

import codecs


def makeSql():
    # Output file that will hold one INSERT statement per input record
    file_path = "../../testFile/frc/words_sql.txt"
    # Truncate any previous output
    f = codecs.open(file_path, "w+", 'utf-8')
    f.write("")
    f.close()

    # Each line of word.txt holds tab-separated fields
    with open("../../testFile/frc/word.txt", mode='r', encoding='utf-8') as fw:
        tmp = fw.readlines()

    with codecs.open(file_path, "a+", 'utf-8') as f:
        for line in tmp:
            group = line.split("\t")
            group[0] = "'" + group[0] + "'"
            # Drop the last character of the fourth field, then quote it
            group[3] = "'" + group[3][:-1] + "'"
            f.write("Insert into words values (" + group[0] + "," + group[1] + ",'" + group[2] + "'," + group[3] + ",'" + group[4] + "');" + "\n")


makeSql()

fileR.py
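Because the INSERT statements are built by string concatenation, a hot word or explanation that contains a single quote would break the generated SQL. A hedged sketch of one fix is to double the quotes before wrapping the value; sql_quote and insert_statement are hypothetical helpers, not part of fileR.py.

```python
def sql_quote(value):
    """Wrap a value in single quotes, doubling any quotes inside it."""
    return "'" + value.replace("'", "''") + "'"


def insert_statement(fields):
    """Build one INSERT for the words table; field 1 is numeric, the rest text."""
    values = [f if i == 1 else sql_quote(f) for i, f in enumerate(fields)]
    return "Insert into words values (" + ",".join(values) + ");"


stmt = insert_statement(["it's", "3", "exp", "link", "其他类"])
```

Doubling the quote is the standard SQL escape. MySQL in its default mode also treats the backslash as an escape character, so data containing backslashes would need separate handling.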

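    Run it and import the data the same way as before. I had used 电脑管家 (PC Manager) to clean up my C drive, and after that Navicat broke, really broke (it can no longer create queries; if I find a fix I will write another post about it). So, no fuss: I imported straight from the text file.

    Build the keywords table (or view) the same way as in the post before last, to get the total count of each hot word: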

CREATE TABLE keywords
AS
(
    SELECT
        word AS word,
        SUM(num) AS num
    FROM
        words
    GROUP BY word
    ORDER BY num DESC
)

CreateKeywordsTable.sql
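To see what this statement computes, here is a small sketch that runs the same aggregation on a few made-up rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the three-column words table is a simplification of the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the real words table (assumed columns)
conn.execute("CREATE TABLE words (word TEXT, num INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [("AI", 30, "互联网类"), ("AI", 25, "科学类"), ("5G", 40, "手机相关类")],
)
# Same aggregation as CreateKeywordsTable.sql: one row per word, counts summed
conn.execute(
    "CREATE TABLE keywords AS "
    "SELECT word AS word, SUM(num) AS num FROM words "
    "GROUP BY word ORDER BY num DESC"
)
rows = conn.execute("SELECT word, num FROM keywords ORDER BY num DESC").fetchall()
```

The word "AI" appears in two categories, so its counts are merged into a single keywords row.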

    

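    Ha! The hot-word frequencies are over ten thousand now! I hope my computer holds up; on with the crawl. (But it is already 2 a.m., so I am setting an alarm for two hours from now and letting the topology data crawl by itself.)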

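    One thing about the WebConnector class deserves a mention: for this crawl I commented out the following code:

# This call removed every sentence containing "年", "月" or "日", together with everything after it. It was meant to strip unnecessary parts of the explanation, but now it seems unnecessary. The more text the better!
tpl = StrSpecialDealer.ut_date(tpl)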
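The implementation of StrSpecialDealer.ut_date is not shown in this post. Going only by the comment above, its effect might be sketched like this; cut_from_date is a hypothetical name, and splitting sentences on "。" is an assumption.

```python
def cut_from_date(text):
    """Keep only the sentences before the first one mentioning 年, 月 or 日."""
    kept = []
    for sentence in text.split("。"):
        if any(ch in sentence for ch in "年月日"):
            break  # drop this sentence and everything after it
        kept.append(sentence)
    return "。".join(kept)


trimmed = cut_from_date("热词是常用词。它出现于2020年。后文。")
untouched = cut_from_date("甲。乙。")
```

This shows why the step was dropped: a single date anywhere in the explanation discards all of the text that follows it.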

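    I woke up in the morning to a big problem: the computer had put itself to sleep. Sigh. Lesson learned, I hope.

    When the computer has to crawl overnight, make sure sleep is turned off in the system settings.

    The topology data is done too, after roughly another 5 hours. The catch is that I cannot use the computer for anything else while it crawls (especially screenshot tools: run one and the crawler is sure to crash).

    At last the data is complete, so now we start processing it!

    The data has been aggregated and processed by category (which means nothing Python-related remains), and with that the hot-word classification is finished.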

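  2. Hot-word catalog generation

    We need to show the top 10 entries of each category and build the first page from them.

    You can create new views or write one long SQL statement directly. I am lazy, so I went with the long statement.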

package com.servlet;

import java.io.IOException;
import java.sql.SQLException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.json.JSONArray;
import org.json.JSONObject;

import com.dblink.basic.utils.SqlUtils;
import com.dblink.basic.utils.sqlKind.MySql_s;
import com.dblink.basic.utils.user.UserInfo;
import com.dblink.bean.BeanGroup;
import com.dblink.sql.DBLink;

public class ServletForMoreInfo extends HttpServlet {
    private static final long serialVersionUID = 1L;

    public void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
    {
        request.setCharacterEncoding("utf-8");
        response.setCharacterEncoding("utf-8");
        response.setContentType("application/json");
        response.setHeader("Cache-Control", "no-cache");
        response.setHeader("Pragma", "no-cache");

        String kind = request.getParameter("kind");

        JSONArray jsonArray = new JSONArray();
        // The first array element is a header object (row count and category name)
        JSONObject jsonObj = new JSONObject();

        DBLink dbLink = new DBLink(new SqlUtils(new MySql_s("rc"), new UserInfo("root", "123456")));
        BeanGroup bg = null;
        try {
            // Top 10 hot words of the requested category by total frequency.
            // Filtering directly on words avoids an unaliased derived table, which MySQL rejects.
            bg = dbLink.getSelect("Select word As word , SUM(num) As num From words Where kind = '" + kind + "' Group By word Order By num DESC Limit 0,10 ").beans;

            int leng = bg.size();
            jsonObj.put("Length", leng);
            // wordkind.js reads KindNess to decide which <div> to fill
            jsonObj.put("KindNess", kind);
            jsonArray.put(jsonObj);

            for (int i = 0; i < leng; ++i)
            {
                JSONObject jsonObject = new JSONObject();
                jsonObject.put("word", bg.get(i).get(0));
                jsonObject.put("num", bg.get(i).get(1));
                jsonArray.put(jsonObject);
            }
        } catch (SQLException e) {
            // Do Nothing ...
        }
        dbLink.free();

        ServletOutputStream os = response.getOutputStream();
        // Encode explicitly so the Chinese text does not depend on the platform charset
        os.write(jsonArray.toString().getBytes("utf-8"));
        os.flush();
        os.close();
    }
}

ServletForMoreInfo.java
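The servlet builds its query by concatenating the kind parameter into the SQL text. As a point of comparison, here is a sketch of the same top-10 query with bound parameters. It is written in Python with the built-in sqlite3 module as a stand-in; it is not the DBLink API, and the table layout is simplified.

```python
import sqlite3


def top_words(conn, kind, limit=10):
    """Return the `limit` most frequent words of one category."""
    sql = (
        "SELECT word, SUM(num) AS num FROM words "
        "WHERE kind = ? GROUP BY word ORDER BY num DESC LIMIT ?"
    )
    # The driver binds kind as data, so quotes in it cannot alter the query
    return conn.execute(sql, (kind, limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT, num INTEGER, kind TEXT)")
conn.executemany("INSERT INTO words VALUES (?, ?, ?)", [
    ("AI", 30, "科学类"), ("AI", 5, "科学类"),
    ("芯片", 20, "科学类"), ("手游", 50, "游戏类"),
])
top = top_words(conn, "科学类", limit=1)
odd = top_words(conn, "x' OR '1'='1")
```

The second call passes a value containing quotes; with binding it is simply treated as a category name that matches nothing.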

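    If you have created a view for each of the 10 categories, you can add the following Servlet (otherwise, replace the view name with the SELECT statement that defines the view):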

package com.servlet;

import java.io.IOException;
import java.sql.SQLException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.json.JSONArray;
import org.json.JSONObject;

import com.dblink.basic.utils.SqlUtils;
import com.dblink.basic.utils.sqlKind.MySql_s;
import com.dblink.basic.utils.user.UserInfo;
import com.dblink.bean.BeanGroup;
import com.dblink.sql.DBLink;

public class ServletForKindKeyWords extends HttpServlet {
    private static final long serialVersionUID = 1L;

    public void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
    {
        request.setCharacterEncoding("utf-8");
        response.setCharacterEncoding("utf-8");
        response.setContentType("application/json");
        response.setHeader("Cache-Control", "no-cache");
        response.setHeader("Pragma", "no-cache");

        // View name of the category, plus the ORDER BY / LIMIT part sent by the page
        String table = request.getParameter("table");
        String sql_rest = request.getParameter("sql");

        JSONArray jsonArray = new JSONArray();
        JSONObject jsonObj = new JSONObject();

        DBLink dbLink = new DBLink(new SqlUtils(new MySql_s("rc"), new UserInfo("root", "123456")));
        BeanGroup bg = null;
        try {
            // Rows of the requested page
            bg = dbLink.getSelect("Select * From " + table + " " + sql_rest).beans;
            int leng = bg.size();
            // Total number of rows in the view, used for the page count
            int maxSize = dbLink.getSelect("Select * From " + table + " ").beans.size();
            // 30 rows per page, rounded up (also safe when the page is empty)
            int page = (maxSize + 29) / 30;

            jsonObj.put("Length", leng);
            jsonObj.put("MaxSize", maxSize);
            jsonObj.put("Page", page);
            // wordkind.js reads KindNess when it rebuilds the paging buttons
            jsonObj.put("KindNess", table);
            jsonArray.put(jsonObj);

            for (int i = 0; i < leng; ++i)
            {
                JSONObject jsonObject = new JSONObject();
                jsonObject.put("word", bg.get(i).get(0));
                jsonObject.put("num", bg.get(i).get(1));
                jsonObject.put("exp", bg.get(i).get(2));
                jsonArray.put(jsonObject);
            }
        } catch (SQLException e) {
            // Do Nothing ...
        }
        dbLink.free();

        ServletOutputStream os = response.getOutputStream();
        // Encode explicitly so the Chinese text does not depend on the platform charset
        os.write(jsonArray.toString().getBytes("utf-8"));
        os.flush();
        os.close();
    }
}

ServletForKindKeyWords.java
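The paging arithmetic behind the Page value and the Limit clause can be sketched separately. The page size of 30 comes from the js code in this post; the function names are hypothetical.

```python
PAGE_SIZE = 30  # rows per page, as requested by the js code


def page_count(total_rows, page_size=PAGE_SIZE):
    """Number of pages needed, rounding up; 0 rows means 0 pages."""
    return (total_rows + page_size - 1) // page_size


def limit_clause(page, page_size=PAGE_SIZE):
    """MySQL-style LIMIT clause for a 1-based page number."""
    return " Limit " + str((page - 1) * page_size) + "," + str(page_size) + " "


pages_for_all = page_count(17469)
third_page = limit_clause(3)
```

Adding page_size - 1 before the integer division rounds up without a separate remainder test, so the calculation cannot divide by zero when a page comes back empty.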

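    Next, the js part:

      First display the categories, then load the data into each of them with the same template:

  Clicking 获取本类更多热词 (get more hot words of this category) jumps to that category's page.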

  Like this:

  The additional new js code:

 function makePageToKind()
{
var Area = '';
Area += '<div class="row">';
Area += ' <div class="col-md-12">';
Area += ' <h2>热词目录</h2>';
Area += ' </div>';
Area += '</div>';
Area += '<hr />';
Area += '<br>';
Area += '<br>';
Area += '<div id="MessageArea">';
Area += '</div>';
document.getElementById("page-inner").innerHTML = Area;
madeAllKindP();
}
function madeAllKindP()
{
var Area = '';
Area += '<div>';
Area += ' <ul>';
Area += ' <li>';
Area += ' <b>互联网类</b>';
Area += ' <div id="hlw"></div>';
Area += ' </li>';
Area += ' <li>';
Area += ' <b>IT业界类</b>';
Area += ' <div id="ityj"></div>';
Area += ' </li>';
Area += ' <li>';
Area += ' <b>软件开发类</b>';
Area += ' <div id="rjkf"></div>';
Area += ' </li>';
Area += ' <li>';
Area += ' <b>开源类</b>';
Area += ' <div id="ky"></div>';
Area += ' </li>';
Area += ' <li>';
Area += ' <b>电脑硬件类</b>';
Area += ' <div id="dnyj"></div>';
Area += ' </li>';
Area += ' <li>';
Area += ' <b>游戏类</b>';
Area += ' <div id="yx"></div>';
Area += ' </li>';
Area += ' <li>';
Area += ' <b>创业类</b>';
Area += ' <div id="cy"></div>';
Area += ' </li>';
Area += ' <li>';
Area += ' <b>手机相关类</b>';
Area += ' <div id="sjxg"></div>';
Area += ' </li>';
Area += ' <li>';
Area += ' <b>科学类</b>';
Area += ' <div id="kx"></div>';
Area += ' </li>';
Area += ' <li>';
Area += ' <b>其他类</b>';
Area += ' <div id="qt"></div>';
Area += ' </li>';
Area += ' </ul>';
Area += '</div>';
document.getElementById("MessageArea").innerHTML = Area;
makeNextStepOfGroupK("互联网类");
makeNextStepOfGroupK("IT业界类");
makeNextStepOfGroupK("软件开发类");
makeNextStepOfGroupK("开源类");
makeNextStepOfGroupK("电脑硬件类");
makeNextStepOfGroupK("游戏类");
makeNextStepOfGroupK("创业类");
makeNextStepOfGroupK("手机相关类");
makeNextStepOfGroupK("科学类");
makeNextStepOfGroupK("其他类");
}
function getKindWordsByKindName(word)
{
var id_t = "";
if(word=="互联网类")
id_t = "hlw";
else if(word=="IT业界类")
id_t = "ityj";
else if(word=="软件开发类")
id_t = "rjkf";
else if(word=="开源类")
id_t = "ky";
else if(word=="电脑硬件类")
id_t = "dnyj";
else if(word=="游戏类")
id_t = "yx";
else if(word=="创业类")
id_t = "cy";
else if(word=="手机相关类")
id_t = "sjxg";
else if(word=="科学类")
id_t = "kx";
else if(word=="其他类")
id_t = "qt";
return id_t;
}
function makeNextStepOfGroupK(word_t)
{
var xmlHttp = null;
try{
xmlHttp = new XMLHttpRequest();
} catch (e1) {
try {
xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");
} catch (e2) {
alert("Your browser does not support XMLHTTP!");
return;
}
}
xmlHttp.onreadystatechange = function() {
if (xmlHttp.readyState == 4) {
if (xmlHttp.status == 200)
{
var Area = "&nbsp;&nbsp;";
var s = xmlHttp.responseText;
var InformationSet = JSON.parse(s);
var leng = InformationSet[0].Length;
var kindness = InformationSet[0].KindNess;
for(var i=1;i<=leng;++i)
{
var word_s = InformationSet[i].word;
var num = InformationSet[i].num;
Area += "&nbsp;&nbsp;";
Area += "<a href='#' title='在本类型中引用次数:"+num+"' onclick='toSomeWhere(\""+word_s+"\")'>"+word_s+"</a>";
Area += "&nbsp;&nbsp;";
}
Area += "&nbsp;&nbsp;";
Area += "&nbsp;&nbsp;";
Area += "<a href='#' onclick='makePageToOneKind(\""+kindness+"\")'>获取本类更多热词...</a>";
Area += "&nbsp;&nbsp;";
Area += "&nbsp;&nbsp;";
var id_t = getKindWordsByKindName(kindness);
document.getElementById(id_t).innerHTML = Area;
}
}
};
var url ="../com/servlet/ServletForMoreInfo";
var server = "kind="+encodeURIComponent(word_t);
xmlHttp.open("POST", url, true);
xmlHttp.setRequestHeader("Content-Type","application/x-www-form-urlencoded");
xmlHttp.send(server);
}
function makePageToOneKind(kind)
{
var Area = '';
Area += '<div class="row">';
Area += ' <div class="col-md-12">';
Area += ' <h2>'+kind+'</h2>';
Area += ' </div>';
Area += '</div>';
Area += '<hr />';
Area += '<br>';
Area += '<div style="background:rgb(0,153,255);margin-left:20px;margin-right:20px;height:25px;">';
Area += ' <div style="margin-left:10px;margin-right:10px;margin-top:5px;margin-bottom:5px;">';
Area += ' <b style="float:left;">热词表</b>';
Area += ' <div style="float:right;">';
Area += ' <select id="sty" onchange="simpleReset_Kind(\''+kind+'\')">';
Area += ' <option value="0" selected>按照词频顺序</option>';
Area += ' <option value="1">按照字母表顺序</option>';
Area += ' </select>';
Area += '&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;';
Area += ' <select id="order" onchange="simpleReset_Kind(\''+kind+'\')">';
Area += ' <option value="0" selected>降序</option>';
Area += ' <option value="1">增序</option>';
Area += ' </select>';
Area += '&nbsp;&nbsp;';
Area += ' </div>';
Area += ' </div>';
Area += '</div>';
Area += '<br>';
Area += '<br>';
Area += '<div id="MessageArea">';
Area += '</div>';
document.getElementById("page-inner").innerHTML = Area;
simpleReset_Kind(kind);
}
function simpleReset_Kind(kind)
{
wordPage = 1;
resetAndFresh_Kind(kind);
}
function XReset_Kind(p,kind)
{
wordPage = p;
wordPage = parseInt(""+wordPage);
resetAndFresh_Kind(kind);
}
function makeSurePage_Kind(kind)
{
wordPage = document.getElementById("selPage").value;
wordPage = parseInt(""+wordPage);
resetAndFresh_Kind(kind);
}
function resetAndFresh_Kind(kind)
{
var sty = document.getElementById("sty").value;
var order = document.getElementById("order").value;
var xmlHttp = null;
try{
xmlHttp = new XMLHttpRequest();
} catch (e1) {
try {
xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");
} catch (e2) {
alert("Your browser does not support XMLHTTP!");
return;
}
}
xmlHttp.onreadystatechange = function() {
if (xmlHttp.readyState == 4) {
if (xmlHttp.status == 200)
{
var Area = "";
var s = xmlHttp.responseText;
var InformationSet = JSON.parse(s);
var leng = InformationSet[0].Length;
var max = InformationSet[0].MaxSize;
var pageNum = InformationSet[0].Page;
var kind = InformationSet[0].KindNess;
Area += "<table class='WhatATable' style='margin-left:200px;float:left;'>";
Area += "<tr>";
Area += "<th style='width:100px;'>热词</th>";
Area += "<th style='width:100px;'>词频</th>";
Area += "<th style='width:100px;'>详细信息链接</th>";
Area += "</tr>";
if(leng<10)
{
for (var i=1;i<=leng;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
}
else
{
for (var i=1;i<=10;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
}
Area += "</table>";
if(leng>10)
{
Area += "<table class='WhatATable' style='margin-left:10px;float:left;'>";
Area += "<tr>";
Area += "<th style='width:100px;'>热词</th>";
Area += "<th style='width:100px;'>词频</th>";
Area += "<th style='width:100px;'>详细信息链接</th>";
Area += "</tr>";
if(leng<=20)
{
for (var i=11;i<=leng;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
}
else
{
for (var i=11;i<=20;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
}
Area += "</table>";
}
if(leng>20)
{
Area += "<table class='WhatATable' style='margin-left:10px;float:left;'>";
Area += "<tr>";
Area += "<th style='width:100px;'>热词</th>";
Area += "<th style='width:100px;'>词频</th>";
Area += "<th style='width:100px;'>详细信息链接</th>";
Area += "</tr>";
for (var i=21;i<=leng;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
Area += "</table>";
}
Area += "<div style='clear:both;'></div>";
Area += "<br>";
Area += "<br>";
Area += "<br>";
Area += "<br>";
Area += "<p style='margin-left:30px;margin-right:30px;'>";
Area += "&nbsp;<button onclick='simpleReset_Kind(\""+kind+"\")'>起始页</button>&nbsp;";
var start = ((wordPage-4)>=1)?wordPage-4:1;
var end = ((wordPage+4)<=pageNum)?(wordPage+4):pageNum;
//alert(parseInt(wordPage+4+""));
if(start!=1)
{
Area += "&nbsp;...&nbsp;";
}
for(var i=start;i<=end;++i)
{
Area += "&nbsp;<button onclick='XReset_Kind(\""+i+"\",\""+kind+"\")'>"+i+"</button>&nbsp;";
}
if(end!=pageNum)
{
Area += "&nbsp;...&nbsp;";
}
Area += "&nbsp;<button onclick='XReset_Kind("+pageNum+",\""+kind+"\")'>结束页</button>&nbsp;";
Area += "&nbsp;&nbsp;<b>选择页数跳转</b>&nbsp;&nbsp;";
Area += "<select id='selPage' onchange='makeSurePage_Kind(\""+kind+"\")'>";
for(var i=1;i<=pageNum;++i)
{
Area += "<option value='"+i+"'>"+i+"</option>";
}
Area += "</select>";
Area += "</p>";
document.getElementById("MessageArea").innerHTML = Area;
surePage_Kind();
}
}
};
var url ="../com/servlet/ServletForKindKeyWords";
var server = "sql=";
// Sort by word frequency
if(sty==0)
{
server += " order by num ";
}
// Sort alphabetically
else if(sty==1)
{
server += " order by word ";
}
// Descending order
if(order==0)
{
server += " DESC ";
}
server += (" Limit "+((wordPage-1)*30)+",30 ");
server += "&table="+encodeURIComponent(kind);
xmlHttp.open("POST", url, true);
xmlHttp.setRequestHeader("Content-Type","application/x-www-form-urlencoded");
xmlHttp.send(server);
}
function surePage_Kind(kind)
{
document.getElementById("selPage").selectedIndex = wordPage-1;
}

wordkind.js
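The paging bar in wordkind.js shows at most four page buttons on either side of the current page and an ellipsis wherever pages are cut off. The same window calculation, sketched in Python with a hypothetical function name:

```python
def page_window(current, page_count, radius=4):
    """Return (start, end, leading_dots, trailing_dots) for the paging bar."""
    start = max(current - radius, 1)          # never before page 1
    end = min(current + radius, page_count)   # never past the last page
    return start, end, start != 1, end != page_count


at_first = page_window(1, 20)
in_middle = page_window(10, 20)
at_last = page_window(20, 20)
```

The two boolean flags correspond to the start!=1 and end!=pageNum checks that decide whether an ellipsis is drawn.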

 var wordPage = 1;
function makePageToWord()
{
var Area = '';
Area += '<div class="row">';
Area += '<div class="col-md-12">';
Area += '<h2>全部热词</h2>';
Area += '</div>';
Area += '</div>';
Area += '<hr />';
Area += '<br>';
Area += '<div style="background:rgb(0,153,255);margin-left:20px;margin-right:20px;height:25px;">';
Area += ' <div style="margin-left:10px;margin-right:10px;margin-top:5px;margin-bottom:5px;">';
Area += ' <b style="float:left;">热词表</b>';
Area += ' <div style="float:right;">';
Area += ' <select id="sty" onchange="simpleReset()">';
Area += ' <option value="0" selected>按照词频顺序</option>';
Area += ' <option value="1">按照字母表顺序</option>';
Area += ' </select>';
Area += '&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;';
Area += ' <select id="order" onchange="simpleReset()">';
Area += ' <option value="0" selected>降序</option>';
Area += ' <option value="1">增序</option>';
Area += ' </select>';
Area += '&nbsp;&nbsp;';
Area += ' </div>';
Area += ' </div>';
Area += '</div>';
Area += '<br>';
Area += '<br>';
Area += '<div id="MessageArea">';
Area += '</div>';
document.getElementById("page-inner").innerHTML = Area;
simpleReset();
}
function simpleReset()
{
wordPage = 1;
resetAndFresh();
}
function XReset(p)
{
wordPage = p;
wordPage = parseInt(""+wordPage);
resetAndFresh();
}
function resetAndFresh()
{
var sty = document.getElementById("sty").value;
var order = document.getElementById("order").value;
var xmlHttp = null;
try{
xmlHttp = new XMLHttpRequest();
} catch (e1) {
try {
xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");
} catch (e2) {
alert("Your browser does not support XMLHTTP!");
return;
}
}
xmlHttp.onreadystatechange = function() {
if (xmlHttp.readyState == 4) {
if (xmlHttp.status == 200)
{
var Area = "";
var s = xmlHttp.responseText;
var InformationSet = JSON.parse(s);
var leng = InformationSet[0].Length;
var max = InformationSet[0].MaxSize;
var pageNum = InformationSet[0].Page;
Area += "<table class='WhatATable' style='margin-left:200px;float:left;'>";
Area += "<tr>";
Area += "<th style='width:100px;'>热词</th>";
Area += "<th style='width:100px;'>词频</th>";
Area += "<th style='width:100px;'>详细信息链接</th>";
Area += "</tr>";
if(leng<10)
{
for (var i=1;i<=leng;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
}
else
{
for (var i=1;i<=10;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
}
Area += "</table>";
if(leng>10)
{
Area += "<table class='WhatATable' style='margin-left:10px;float:left;'>";
Area += "<tr>";
Area += "<th style='width:100px;'>热词</th>";
Area += "<th style='width:100px;'>词频</th>";
Area += "<th style='width:100px;'>详细信息链接</th>";
Area += "</tr>";
if(leng<=20)
{
for (var i=11;i<=leng;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
}
else
{
for (var i=11;i<=20;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
}
Area += "</table>";
}
if(leng>20)
{
Area += "<table class='WhatATable' style='margin-left:10px;float:left;'>";
Area += "<tr>";
Area += "<th style='width:100px;'>热词</th>";
Area += "<th style='width:100px;'>词频</th>";
Area += "<th style='width:100px;'>详细信息链接</th>";
Area += "</tr>";
for (var i=21;i<=leng;++i)
{
Area += "<tr>";
Area += " <td>";
Area += InformationSet[i].word;
Area += " </td>";
Area += " <td>";
Area += InformationSet[i].num;
Area += " </td>";
Area += " <td>";
Area += " <a href='#' onclick='toSomeWhere(\""+InformationSet[i].word+"\")'>详细信息</a>";
Area += " </td>";
Area += "</tr>";
}
Area += "</table>";
}
Area += "<div style='clear:both;'></div>";
Area += "<br>";
Area += "<br>";
Area += "<br>";
Area += "<br>";
Area += "<p style='margin-left:30px;margin-right:30px;'>";
Area += "&nbsp;<button onclick='simpleReset()'>起始页</button>&nbsp;";
var start = ((wordPage-4)>=1)?wordPage-4:1;
var end = ((wordPage+4)<=pageNum)?(wordPage+4):pageNum;
//alert(parseInt(wordPage+4+""));
if(start!=1)
{
Area += "&nbsp;...&nbsp;";
}
for(var i=start;i<=end;++i)
{
Area += "&nbsp;<button onclick='XReset("+i+")'>"+i+"</button>&nbsp;";
}
if(end!=pageNum)
{
Area += "&nbsp;...&nbsp;";
}
Area += "&nbsp;<button onclick='XReset("+pageNum+")'>结束页</button>&nbsp;";
Area += "&nbsp;&nbsp;<b>选择页数跳转</b>&nbsp;&nbsp;";
Area += "<select id='selPage' onchange='makeSurePage()'>";
for(var i=1;i<=pageNum;++i)
{
Area += "<option value='"+i+"'>"+i+"</option>";
}
Area += "</select>";
Area += "</p>";
document.getElementById("MessageArea").innerHTML = Area;
surePage();
}
}
};
var url ="../com/servlet/ServletForAllKeyWords";
var server = "sql=";
// Sort by word frequency
if(sty==0)
{
server += " order by num ";
}
// Sort alphabetically
else if(sty==1)
{
server += " order by word ";
}
// Descending order
if(order==0)
{
server += " DESC ";
}
server += (" Limit "+((wordPage-1)*30)+",30 ");
xmlHttp.open("POST", url, true);
xmlHttp.setRequestHeader("Content-Type","application/x-www-form-urlencoded");
xmlHttp.send(server);
}
function toSomeWhere(word)
{
var Area = '';
Area += '<div class="row">';
Area += ' <div class="col-md-12">';
Area += ' <h2>'+word+'</h2>';
Area += ' </div>';
Area += '</div>';
Area += '<hr />';
Area += '<br>';
Area += '<div id="MessageArea">';
Area += '</div>';
document.getElementById("page-inner").innerHTML = Area;
var xmlHttp = null;
try{
xmlHttp = new XMLHttpRequest();
} catch (e1) {
try {
xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");
} catch (e2) {
alert("Your browser does not support XMLHTTP!");
return;
}
}
xmlHttp.onreadystatechange = function() {
if (xmlHttp.readyState == 4) {
if (xmlHttp.status == 200)
{
var Area = "";
var s = xmlHttp.responseText;
var InformationSet = JSON.parse(s);
var word = InformationSet[1].word;
var num = InformationSet[1].num;
var exp = InformationSet[1].exp;
Area += "<p><b id='word' style='font-size:120%;'>"+word+"</b></p>";
Area += "<p style='color:rgb(200,200,200);'>&nbsp;&nbsp;&nbsp;引用次数:"+num+"</p>";
Area += "<p style='font:\"楷体\";font-size:90%;'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;";
if(exp=="")
{
Area += "目前百度百科上并没有相关解释信息...";
}
else
{
Area += exp;
}
Area += "</p>";
Area += "<br>";
Area += "<div id='finalDIV'></div>";
document.getElementById("MessageArea").innerHTML = Area;
getLinksForKey(word);
}
}
};
var url ="../com/servlet/ServletForAllKeyWords";
var server = "sql="+encodeURIComponent(" where word='"+word+"'");
xmlHttp.open("POST", url, true);
xmlHttp.setRequestHeader("Content-Type","application/x-www-form-urlencoded");
xmlHttp.send(server);
}
function getLinksForKey(word)
{
var xmlHttp = null;
try{
xmlHttp = new XMLHttpRequest();
} catch (e1) {
try {
xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");
} catch (e2) {
alert("Your browser does not support XMLHTTP!");
return;
}
}
xmlHttp.onreadystatechange = function() {
if (xmlHttp.readyState == 4) {
if (xmlHttp.status == 200)
{
var Area = "";
Area += "<br>";
Area += "<br>";
Area += "<b style='font-size:120%;'>引用网页:</b>";
Area += "<br>";
Area += "<br>";
Area += "<ul>";
var s = xmlHttp.responseText;
var InformationSet = JSON.parse(s);
var leng = InformationSet[0].Length;
for(var i=1;i<=leng;++i)
{
var word = InformationSet[i].word;
var num = InformationSet[i].num;
var title = InformationSet[i].title;
var link = InformationSet[i].link;
Area += "<li>";
Area += "<a href='"+link+"' title='引用次数:"+num+"'>"+title+"</a>"
Area += "</li>";
}
Area += "</ul>";
document.getElementById("finalDIV").innerHTML = Area;
}
}
};
var url ="../com/servlet/ServletForLinkData";
var server = "word="+encodeURIComponent(word);
xmlHttp.open("POST", url, true);
xmlHttp.setRequestHeader("Content-Type","application/x-www-form-urlencoded");
xmlHttp.send(server);
}
function surePage()
{
document.getElementById("selPage").selectedIndex = wordPage-1;
}
function makeSurePage()
{
wordPage = document.getElementById("selPage").value;
wordPage = parseInt(""+wordPage);
resetAndFresh();
}

word.js

  Update the references in web.xml:

 <?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://xmlns.jcp.org/xml/ns/javaee" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd" id="WebApp_ID" version="4.0">
<display-name>HotWord</display-name>
<servlet>
<description>This is the description of my J2EE component</description>
<display-name>This is the display name of my J2EE component</display-name>
<servlet-name>ServletForWords</servlet-name>
<servlet-class>com.servlet.ServletForWords</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>ServletForWords</servlet-name>
<url-pattern>/com/servlet/ServletForWords</url-pattern>
</servlet-mapping>
<servlet>
<description>This is the description of my J2EE component</description>
<display-name>This is the display name of my J2EE component</display-name>
<servlet-name>ServletForAllKeyWords</servlet-name>
<servlet-class>com.servlet.ServletForAllKeyWords</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>ServletForAllKeyWords</servlet-name>
<url-pattern>/com/servlet/ServletForAllKeyWords</url-pattern>
</servlet-mapping>
<servlet>
<description>This is the description of my J2EE component</description>
<display-name>This is the display name of my J2EE component</display-name>
<servlet-name>ServletForLinkData</servlet-name>
<servlet-class>com.servlet.ServletForLinkData</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>ServletForLinkData</servlet-name>
<url-pattern>/com/servlet/ServletForLinkData</url-pattern>
</servlet-mapping>
<servlet>
<description>This is the description of my J2EE component</description>
<display-name>This is the display name of my J2EE component</display-name>
<servlet-name>ServletForMoreInfo</servlet-name>
<servlet-class>com.servlet.ServletForMoreInfo</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>ServletForMoreInfo</servlet-name>
<url-pattern>/com/servlet/ServletForMoreInfo</url-pattern>
</servlet-mapping>
<servlet>
<description>This is the description of my J2EE component</description>
<display-name>This is the display name of my J2EE component</display-name>
<servlet-name>ServletForKindKeyWords</servlet-name>
<servlet-class>com.servlet.ServletForKindKeyWords</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>ServletForKindKeyWords</servlet-name>
<url-pattern>/com/servlet/ServletForKindKeyWords</url-pattern>
</servlet-mapping>
<welcome-file-list>
<welcome-file>index.html</welcome-file>
<welcome-file>index.htm</welcome-file>
<welcome-file>index.jsp</welcome-file>
<welcome-file>default.html</welcome-file>
<welcome-file>default.htm</welcome-file>
<welcome-file>default.jsp</welcome-file>
</welcome-file-list>
</web-app>

web.xml

  Update the jsp page code:

 <%@ page language="java" contentType="text/html; charset=utf-8"
pageEncoding="utf-8"%>
<!DOCTYPE html>
<html><!-- xmlns="http://www.w3.org/1999/xhtml" -->
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>热词分析</title>
<!-- BOOTSTRAP STYLES-->
<link href="../assets/css/bootstrap.css" rel="stylesheet" />
<!-- FONTAWESOME STYLES-->
<link href="../assets/css/font-awesome.css" rel="stylesheet" />
<!-- CUSTOM STYLES-->
<link href="../assets/css/custom.css" rel="stylesheet" />
<!-- PERSONAL FONTS-->
<link href='../cssFiles/basic.css' rel='stylesheet' type='text/css' />
<!-- GOOGLE FONTS-->
<link href='http://fonts.googleapis.com/css?family=Open+Sans' rel='stylesheet' type='text/css' />
<script src="../jsFiles/jquery/jquery-3.4.1.min.js" charset="utf-8"></script>
<script src="../jsFiles/echarts/echarts.min.js" charset="utf-8"></script>
<script src="../jsFiles/echarts/echarts-wordcloud-master/dist/echarts-wordcloud.min.js" charset="utf-8"></script>
<!-- <script src="../jsFiles/echarts/echarts-wordcloud-master/dist/echarts-wordcloud.min.js" charset="utf-8"></script> -->
<script src="../jsFiles/basic.js" charset="utf-8"></script>
<script src='../jsFiles/echarts/echarts.simple.js'></script>
<script src="../jsFiles/word.js" charset="utf-8"></script>
<script src="../jsFiles/wordkind.js" charset="utf-8"></script>
<script src="../jsFiles/cloud.js" charset="utf-8"></script>
</head>
<body>
<div id="wrapper">
<div class="navbar navbar-inverse navbar-fixed-top">
<div class="adjust-nav">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".sidebar-collapse">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand"><i class="fa fa-square-o "></i>&nbsp;欢迎您使用本热词分析系统</a>
</div>
</div>
</div>
<!-- /. NAV TOP -->
<div class="navbar-default navbar-side"> <!-- nav role="navigation" -->
<div class="sidebar-collapse">
<ul class="nav" id="main-menu">
<li class="text-center user-image-back">
<img src="../assets/img/find_user.png" class="img-responsive" />
</li>
<li>
<a href="#" onclick="makePageToMain()"><i class="fa fa-table "></i>主页</a>
</li>
<li>
<a href="#" onclick="makePageToWord()"><i class="fa fa-key "></i>全部热词</a>
</li>
<li>
<a href="#" onclick="makePageToKind()"><i class="fa fa-key "></i>热词目录</a>
</li>
<li>
<a href="#"><i class="fa fa-edit "></i>热词需求<span class="fa arrow"></span></a>
<ul class="nav nav-second-level">
<li>
<a href="#" onclick="makePageToCl()">热词云图</a>
</li>
<li>
<a href="#" onclick="makePageToRe()">热词关系图</a>
</li>
</ul>
</li>
</ul>
</div>
</div>
<!-- /. NAV SIDE -->
<div id="page-wrapper" >
<div id="page-inner">
<div class="row">
<div class="col-md-12">
<h2>主页</h2>
</div>
</div>
<!-- /. ROW -->
<hr />
<!-- /. ROW -->
<br>
<br>
<div id="MessageArea">
<br>
<h3>欢迎您使用本热词分析系统</h3>
</div>
</div>
<!-- /. PAGE INNER -->
</div>
<!-- /. PAGE WRAPPER -->
</div>
<!-- /. WRAPPER -->
<!-- SCRIPTS -AT THE BOTOM TO REDUCE THE LOAD TIME-->
<!-- JQUERY SCRIPTS -->
<script src="../assets/js/jquery-1.10.2.js"></script>
<!-- BOOTSTRAP SCRIPTS -->
<script src="../assets/js/bootstrap.min.js"></script>
<!-- METISMENU SCRIPTS -->
<script src="../assets/js/jquery.metisMenu.js"></script>
<!-- CUSTOM SCRIPTS -->
<script src="../assets/js/custom.js"></script>
</body>
</html>

index.jsp

  As for the remaining parts, I have decided to write them up in separate posts.
