Python: Lexical Diversity Analysis of Twitter Tweet Entities (Word, Screen Name, Hash Tag)
CODE:
#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
Created on 2014-7-3
@author: guaguastd
@name: tweet_lexical_diversity.py
'''

if __name__ == '__main__':
    # import login, see http://blog.csdn.net/guaguastd/article/details/31706155
    from login import twitter_login
    # get the twitter access api
    twitter_api = twitter_login()
    # import tweet
    from tweet import extract_tweet_entities
    # import search
    from search import search_for_tweet
    # import lexical_diversity
    from lexical_diversity import lexical_diversity,average_words

    while 1:
        query = raw_input('\nInput the query (eg. #MentionSomeoneImportantForYou, exit to quit): ')
        if query == 'exit':
            print 'Successfully exit!'
            break

        statuses = search_for_tweet(twitter_api, query)
        status_texts,screen_names,hashtags,words = extract_tweet_entities(statuses)

        for token in (words, screen_names, hashtags):
            print '\rLexical diversity of %s: ' % token
            print lexical_diversity(token)

        for status in (status_texts,):
            print '\rAverage words of %s: ' % status
            print average_words(status)
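The `tweet` module imported above is not listed in this post. As a rough sketch only, `extract_tweet_entities` could look like the following, assuming each status is a dict in the Twitter REST API v1.1 format with a `text` field and an `entities` field holding `user_mentions` and `hashtags`. The `sample_statuses` data is made up for illustration, and the author's real implementation may differ.

```python
def extract_tweet_entities(statuses):
    """Return (status_texts, screen_names, hashtags, words) for a list of statuses.

    Assumption: each status is a dict shaped like a Twitter API v1.1 tweet.
    """
    status_texts = [status['text'] for status in statuses]

    # Screen names come from the parsed entities, not from the raw text.
    screen_names = [mention['screen_name']
                    for status in statuses
                    for mention in status.get('entities', {}).get('user_mentions', [])]

    # Hashtag text is stored without the leading '#'.
    hashtags = [tag['text']
                for status in statuses
                for tag in status.get('entities', {}).get('hashtags', [])]

    # Words are whitespace-separated tokens of every status text.
    words = [word for text in status_texts for word in text.split()]

    return status_texts, screen_names, hashtags, words


# Hypothetical input, only to show the expected shape of the data.
sample_statuses = [{
    'text': 'RT @KillahPimpp: #MentionSomeoneImportantForYou @MissRosaa_',
    'entities': {
        'user_mentions': [{'screen_name': 'KillahPimpp'},
                          {'screen_name': 'MissRosaa_'}],
        'hashtags': [{'text': 'MentionSomeoneImportantForYou'}],
    },
}]

status_texts, screen_names, hashtags, words = extract_tweet_entities(sample_statuses)
print(screen_names)
print(hashtags)
print(words)
```

Reading mentions and hashtags from `entities` rather than parsing the text would explain why the word list in the result keeps punctuation such as `@KillahPimpp:` while the screen-name list does not.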
RESULT:
Input the query (eg. #MentionSomeoneImportantForYou, exit to quit): #MentionSomeoneImportantForYou
Length of statuses 30
Lexical diversity of [u'RT', u'@xmlovex:', u'#MentionSomeoneImportantForYou', u'@purpledrauhl_23', u'RT', u'@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@MissRosaa_', u'#MentionSomeoneImportantForYou', u'@justinbieber', u'"@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@_K_L_O_"', u'RT', u'@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@_K_L_O_', u'\u201c@0hDearPriscii:', u'"@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@0hDearPriscii"', u'aww', u'ily\U0001f618\U0001f46f\u201dily2\u2764\ufe0f', u'RT', u'@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@0hDearPriscii', u'"@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@0hDearPriscii"', u'aww', u'ily\U0001f618\U0001f46f', u'#MentionSomeoneImportantForYou', u'@', u'my', u'brotherrrr', u'http://t.co/LprqvaLvyu', u'RT', u'@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@BeyonceTapia', u'\U0001f498', u'RT', u'@thuggie_salma:', u'"@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@thuggie_salma"', u'baeee', u'\U0001f618\U0001f60f\U0001f62d', u'#MentionSomeoneImportantForYou', u'@BeyonceTapia', u'\U0001f498', u'"@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@thuggie_salma"', u'baeee', u'\U0001f618\U0001f60f\U0001f62d', u'RT', u'@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@thuggie_salma', u'RT', u'@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@NotNormal_Javi', u'#MentionSomeoneImportantForYou', u'@NotNormal_Javi', u'#MentionSomeoneImportantForYou', u'@thuggie_salma', u'RT', u'@KillahPimpp:', u'#MentionSomeoneImportantForYou', u'@EbbsContreras', u'RT', u'@sashaalexxa_:', u'#MentionSomeoneImportantForYou', u'@', u'#MentionSomeoneImportantForYou', u'@EbbsContreras', u'RT', u'@NotNormal_Javi:', u'#MentionSomeoneImportantForYou', u'cheeseburgers', u'\U0001f354\U0001f354', u'#MentionSomeoneImportantForYou', u'@TaeTae2Beast', u'#MentionSomeoneImportantForYou', u'@', u'#MentionSomeoneImportantForYou', u'@Brendaaa23', u'#MentionSomeoneImportantForYou', u'cheeseburgers', u'\U0001f354\U0001f354', u'#MentionSomeoneImportantForYou', u'@_K_L_O_', u'#MentionSomeoneImportantForYou', u'@MissRosaa_', u'#MentionSomeoneImportantForYou', u'@0hDearPriscii', u'@LoveASharie', u'@DJZeeti', u'Speechless', u'beauty', u'and', u'Pretty', u'smile', u'.#WomanCrushWednesday', u'#MentionSomeoneImportantForYou', u'#TeamSharie', u'@louiswonderwall', u'my', u'babeeeee\U0001f60d\U0001f60d\U0001f60d\U0001f60d\U0001f60d', u'#MentionSomeoneImportantForYou']:
0.407079646018
Lexical diversity of [u'xmlovex', u'KillahPimpp', u'MissRosaa_', u'justinbieber', u'KillahPimpp', u'_K_L_O_', u'KillahPimpp', u'_K_L_O_', u'0hDearPriscii', u'KillahPimpp', u'0hDearPriscii', u'KillahPimpp', u'0hDearPriscii', u'KillahPimpp', u'0hDearPriscii', u'KillahPimpp', u'BeyonceTapia', u'thuggie_salma', u'KillahPimpp', u'thuggie_salma', u'BeyonceTapia', u'KillahPimpp', u'thuggie_salma', u'KillahPimpp', u'thuggie_salma', u'KillahPimpp', u'NotNormal_Javi', u'NotNormal_Javi', u'thuggie_salma', u'KillahPimpp', u'EbbsContreras', u'sashaalexxa_', u'EbbsContreras', u'NotNormal_Javi', u'TaeTae2Beast', u'Brendaaa23', u'_K_L_O_', u'MissRosaa_', u'0hDearPriscii', u'LoveASharie', u'DJZeeti', u'louiswonderwall']:
0.380952380952
Lexical diversity of [u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'MentionSomeoneImportantForYou', u'WomanCrushWednesday', u'MentionSomeoneImportantForYou', u'TeamSharie', u'MentionSomeoneImportantForYou']:
0.09375
Average words of [u'RT @xmlovex: #MentionSomeoneImportantForYou @purpledrauhl_23', u'RT @KillahPimpp: #MentionSomeoneImportantForYou @MissRosaa_', u'#MentionSomeoneImportantForYou @justinbieber', u'"@KillahPimpp: #MentionSomeoneImportantForYou @_K_L_O_"', u'RT @KillahPimpp: #MentionSomeoneImportantForYou @_K_L_O_', u'\u201c@0hDearPriscii: "@KillahPimpp: #MentionSomeoneImportantForYou @0hDearPriscii" aww ily\U0001f618\U0001f46f\u201dily2\u2764\ufe0f', u'RT @KillahPimpp: #MentionSomeoneImportantForYou @0hDearPriscii', u'"@KillahPimpp: #MentionSomeoneImportantForYou @0hDearPriscii" aww ily\U0001f618\U0001f46f', u'#MentionSomeoneImportantForYou @ my brotherrrr http://t.co/LprqvaLvyu', u'RT @KillahPimpp: #MentionSomeoneImportantForYou @BeyonceTapia \U0001f498', u'RT @thuggie_salma: "@KillahPimpp: #MentionSomeoneImportantForYou @thuggie_salma" baeee \U0001f618\U0001f60f\U0001f62d', u'#MentionSomeoneImportantForYou @BeyonceTapia \U0001f498', u'"@KillahPimpp: #MentionSomeoneImportantForYou @thuggie_salma" baeee \U0001f618\U0001f60f\U0001f62d', u'RT @KillahPimpp: #MentionSomeoneImportantForYou @thuggie_salma', u'RT @KillahPimpp: #MentionSomeoneImportantForYou @NotNormal_Javi', u'#MentionSomeoneImportantForYou @NotNormal_Javi', u'#MentionSomeoneImportantForYou @thuggie_salma', u'RT @KillahPimpp: #MentionSomeoneImportantForYou @EbbsContreras', u'RT @sashaalexxa_: #MentionSomeoneImportantForYou @', u'#MentionSomeoneImportantForYou @EbbsContreras', u'RT @NotNormal_Javi: #MentionSomeoneImportantForYou cheeseburgers \U0001f354\U0001f354', u'#MentionSomeoneImportantForYou @TaeTae2Beast', u'#MentionSomeoneImportantForYou @', u'#MentionSomeoneImportantForYou @Brendaaa23', u'#MentionSomeoneImportantForYou cheeseburgers \U0001f354\U0001f354', u'#MentionSomeoneImportantForYou @_K_L_O_', u'#MentionSomeoneImportantForYou @MissRosaa_', u'#MentionSomeoneImportantForYou @0hDearPriscii', u'@LoveASharie @DJZeeti Speechless beauty and Pretty smile .#WomanCrushWednesday #MentionSomeoneImportantForYou #TeamSharie', u'@louiswonderwall my babeeeee\U0001f60d\U0001f60d\U0001f60d\U0001f60d\U0001f60d #MentionSomeoneImportantForYou']:
3.76666666667
Input the query (eg. #MentionSomeoneImportantForYou, exit to quit):
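The `lexical_diversity` module is not listed in this post either. The numbers in the result are consistent with the usual definitions: lexical diversity as the number of distinct tokens divided by the total number of tokens, and average words as the total word count divided by the number of statuses. For example, the hashtag list has 32 entries with 3 distinct values (3/32 = 0.09375), and the screen-name list has 42 entries with 16 distinct values (16/42 ≈ 0.380952). A minimal sketch under those assumptions follows; the empty-input handling is an addition for illustration, not something the post confirms.

```python
def lexical_diversity(tokens):
    """Ratio of distinct tokens to total tokens; 0.0 for an empty list (assumed)."""
    if not tokens:
        return 0.0
    # 1.0 * ... forces float division on both Python 2 and Python 3.
    return 1.0 * len(set(tokens)) / len(tokens)


def average_words(statuses):
    """Mean number of whitespace-separated words per status text."""
    if not statuses:
        return 0.0
    total_words = sum(len(text.split()) for text in statuses)
    return 1.0 * total_words / len(statuses)


# The hashtag list from the result above: 30 + 2 entries, 3 distinct values.
hashtags = (['MentionSomeoneImportantForYou'] * 30
            + ['WomanCrushWednesday', 'TeamSharie'])
print(lexical_diversity(hashtags))   # 0.09375

# Two statuses taken from the result above: 4 words and 2 words.
print(average_words(['RT @KillahPimpp: #MentionSomeoneImportantForYou @MissRosaa_',
                     '#MentionSomeoneImportantForYou @justinbieber']))   # 3.0
```

A value close to 0 means the same few tokens repeat constantly, as with the hashtags here, while a value close to 1 means almost every token is different.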