A Complete English Word-Frequency Counting Example Using a File
Published: 2019-06-18


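fo = open('test.txt', 'r')
new = fo.read()
exc = {'the', 'is', 'are', 'on', 'to', 'can'}  # stop words to exclude
new = new.lower()
for i in ',."':
    new = new.replace(i, ' ')  # replace punctuation with spaces
# Tokenize into a word list. Note: split(' ') splits on spaces only, so words
# separated by newlines stay joined (e.g. 'this\n\nafter' in the output).
new = new.split(' ')
print(new)
d = {}
keys = set(new)  # set of distinct words, used as the dictionary keys
keys = keys - exc  # drop the stop words
print(keys)
for i in keys:
    d[i] = new.count(i)  # dictionary mapping each word to its count
print(d)
w = list(d.items())  # convert the dictionary's key-value pairs to a list
w.sort(key=lambda x: x[1], reverse=True)  # sort by count, descending
print(w)
for i in range(20):
    print(w[i])  # the 20 most frequent words
# fo.read() above already consumed the file, so this loop prints nothing.
for line in fo:
    print(line)
fo.close()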

['after', 'the', 'winter', '-', 'lenka\n\nwhen', 'the', 'rain', 'is', "pourin'", 'down\n\n', 'and', 'there', 'are', 'snowflakes', 'on', 'your', 'cheeks', '\n\nwhen', 'your', 'heart', 'is', 'frozen', 'over\n\nand', 'there', 'is', 'a', 'sea', 'lost', 'sun', 'in', 'weeks\n\njust', 'remember', '\n\njust', 'remember', '\n\nafter', 'the', 'winter', 'comes', 'the', 'spring\n\nthat', 'is', 'when', 'the', 'blue', 'birds\n\nstarts', 'to', 'sing\n\nand', 'you', 'can', 'always', 'count', 'on', 'this\n\nafter', 'the', 'winter', 'comes', 'the', 'spring\n\nwhen', 'the', 'trees', 'have', 'lost', 'the', 'color\n\nand', 'the', 'sky', 'is', 'full', 'of', 'fears\n\nwhen', 'you', 'feel', 'you', 'are', 'going', 'under\n\nand', 'your', 'eyes', 'are', 'full', 'of', 'tears\n\nwhen', 'the', 'bells', 'are', 'all', 'hiding\n\nand', 'you', 'are', 'hiding', 'too\n\noh', '', 'darling', 'just', 'remember\n\nthat', 'everything', 'will', 'soon', 'be', 'new\n\nafter', 'the', 'winter', 'comes', 'the', 'spring\n\nthat', 'is', 'when', 'the', 'blue', 'birds\n\nstart', 'to', 'use', 'their', 'wings\n\nand', 'you', 'can', 'always', 'count', 'on', 'this\n\nafter', 'the', 'winter', 'comes', 'the', 'spring\n\njust', 'remember', '\n\njust', 'remember', '\n\njust', 'remember', '\n\njust', 'remember', '\n\nafter', 'the', 'winter', 'comes', 'the', 'spring', '\n\nthat', 'is', 'when', 'the', 'blue', 'birds', '\n\nstarts', 'to', 'sing', '\n\nand', 'you', 'can', 'always', 'count', 'on', 'this\n\nafter', 'the', 'winter', 'comes', 'the', 'spring\n\nafter', 'the', 'winter', 'comes', 'the', 'spring']

{'', 'just', 'always', 'color\n\nand', 'bells', 'new\n\nafter', 'lenka\n\nwhen', 'everything', 'heart', '\n\nand', 'birds\n\nstart', 'trees', 'cheeks', 'your', 'full', 'frozen', 'spring', 'when', "pourin'", 'over\n\nand', 'you', 'soon', '-', 'will', 'fears\n\nwhen', 'count', 'hiding\n\nand', 'too\n\noh', 'all', 'sing', 'spring\n\nafter', 'feel', 'snowflakes', 'sing\n\nand', 'use', 'remember', '\n\nafter', 'sea', '\n\nstarts', 'be', 'comes', 'tears\n\nwhen', 'birds', 'a', 'blue', 'spring\n\njust', 'have', 'going', 'lost', 'there', 'down\n\n', 'sky', 'weeks\n\njust', 'spring\n\nwhen', 'spring\n\nthat', 'this\n\nafter', '\n\njust', 'darling', 'sun', 'after', 'in', 'hiding', 'eyes', '\n\nwhen', 'wings\n\nand', 'birds\n\nstarts', 'their', 'rain', '\n\nthat', 'and', 'remember\n\nthat', 'of', 'winter', 'under\n\nand'}
{'': 1, 'just': 1, 'always': 3, 'color\n\nand': 1, 'bells': 1, 'new\n\nafter': 1, 'lenka\n\nwhen': 1, 'everything': 1, 'heart': 1, '\n\nand': 1, 'birds\n\nstart': 1, 'trees': 1, 'cheeks': 1, 'your': 3, 'full': 2, 'frozen': 1, 'spring': 2, 'when': 3, "pourin'": 1, 'over\n\nand': 1, 'you': 6, 'soon': 1, '-': 1, 'will': 1, 'fears\n\nwhen': 1, 'count': 3, 'hiding\n\nand': 1, 'too\n\noh': 1, 'all': 1, 'sing': 1, 'spring\n\nafter': 1, 'feel': 1, 'snowflakes': 1, 'sing\n\nand': 1, 'use': 1, 'remember': 6, '\n\nafter': 2, 'sea': 1, '\n\nstarts': 1, 'be': 1, 'comes': 7, 'tears\n\nwhen': 1, 'birds': 1, 'a': 1, 'blue': 3, 'spring\n\njust': 1, 'have': 1, 'going': 1, 'lost': 2, 'there': 2, 'down\n\n': 1, 'sky': 1, 'weeks\n\njust': 1, 'spring\n\nwhen': 1, 'spring\n\nthat': 2, 'this\n\nafter': 3, '\n\njust': 4, 'darling': 1, 'sun': 1, 'after': 1, 'in': 1, 'hiding': 1, 'eyes': 1, '\n\nwhen': 1, 'wings\n\nand': 1, 'birds\n\nstarts': 1, 'their': 1, 'rain': 1, '\n\nthat': 1, 'and': 1, 'remember\n\nthat': 1, 'of': 2, 'winter': 8, 'under\n\nand': 1}
[('winter', 8), ('comes', 7), ('you', 6), ('remember', 6), ('\n\njust', 4), ('always', 3), ('your', 3), ('when', 3), ('count', 3), ('blue', 3), ('this\n\nafter', 3), ('full', 2), ('spring', 2), ('\n\nafter', 2), ('lost', 2), ('there', 2), ('spring\n\nthat', 2), ('of', 2), ('', 1), ('just', 1), ('color\n\nand', 1), ('bells', 1), ('new\n\nafter', 1), ('lenka\n\nwhen', 1), ('everything', 1), ('heart', 1), ('\n\nand', 1), ('birds\n\nstart', 1), ('trees', 1), ('cheeks', 1), ('frozen', 1), ("pourin'", 1), ('over\n\nand', 1), ('soon', 1), ('-', 1), ('will', 1), ('fears\n\nwhen', 1), ('hiding\n\nand', 1), ('too\n\noh', 1), ('all', 1), ('sing', 1), ('spring\n\nafter', 1), ('feel', 1), ('snowflakes', 1), ('sing\n\nand', 1), ('use', 1), ('sea', 1), ('\n\nstarts', 1), ('be', 1), ('tears\n\nwhen', 1), ('birds', 1), ('a', 1), ('spring\n\njust', 1), ('have', 1), ('going', 1), ('down\n\n', 1), ('sky', 1), ('weeks\n\njust', 1), ('spring\n\nwhen', 1), ('darling', 1), ('sun', 1), ('after', 1), ('in', 1), ('hiding', 1), ('eyes', 1), ('\n\nwhen', 1), ('wings\n\nand', 1), ('birds\n\nstarts', 1), ('their', 1), ('rain', 1), ('\n\nthat', 1), ('and', 1), ('remember\n\nthat', 1), ('under\n\nand', 1)]
('winter', 8)
('comes', 7)
('you', 6)
('remember', 6)
('\n\njust', 4)
('always', 3)
('your', 3)
('when', 3)
('count', 3)
('blue', 3)
('this\n\nafter', 3)
('full', 2)
('spring', 2)
('\n\nafter', 2)
('lost', 2)
('there', 2)
('spring\n\nthat', 2)
('of', 2)
('', 1)
('just', 1)
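
The output above still contains tokens such as 'this\n\nafter' and '\n\njust', because the text is split on spaces only, so 'just' and 'spring' end up spread across several keys. One possible variant, not the original author's code, is sketched below. It assumes a regular expression for tokenizing and `collections.Counter` for counting; the function names `count_words` and `count_words_in_file` and the sample string are made up for illustration.

```python
import re
from collections import Counter

# Same stop words as the original script.
STOP_WORDS = {'the', 'is', 'are', 'on', 'to', 'can'}


def count_words(text, stop_words=STOP_WORDS):
    """Return a Counter of the lowercase words in text, minus stop_words."""
    # [a-z']+ keeps apostrophes so "pourin'" stays one token. Everything
    # else (spaces, newlines, punctuation, '-') acts as a separator.
    # A stray quote character on its own would still count as a token.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


def count_words_in_file(path, encoding='utf-8'):
    """Read a whole text file and count its words (hypothetical helper)."""
    # The with-statement closes the file even if reading fails.
    with open(path, 'r', encoding=encoding) as f:
        return count_words(f.read())


# Small inline sample so the sketch runs without test.txt.
sample = "After the winter comes the spring\n\nJust remember,\n\njust remember"
for word, n in count_words(sample).most_common(3):
    print(word, n)
```

For the file from the post, `count_words_in_file('test.txt').most_common(20)` would take the place of the manual sort and the `range(20)` loop, and it does not raise an error when there are fewer than 20 distinct words.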

Reposted from: https://www.cnblogs.com/1257-/p/7599065.html
