Update the instructions
@@ -2,7 +2,7 @@
 <img src="./news_search_engine3.png" width = "650" align=center />
 
 # Usage
-1. Install a Python 3.4+ environment
+1. Install a Python 3.4+ environment ([Anaconda](https://www.anaconda.com/distribution/) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html) recommended)
 2. Install the lxml HTML parser: `pip install lxml`
 3. Install the jieba word-segmentation package: `pip install jieba`
 4. Install the Flask web framework: `pip install Flask`
@@ -11,6 +11,8 @@
 
 To crawl the latest news data and build the index, run `./code/setup.py` in one step, then test as described above.
 
+2020.4.5: Added a crawler for [China News Service (chinanews.com)](http://www.chinanews.com/scroll-news/news1.html). First run `./code/spider.chinanews.com.py` to crawl the news from the last 5 days (about 2,500 articles); then comment out [line 38](https://github.com/01joy/news-search-engine/blob/master/code/setup.py#L38) of `./code/setup.py` and run it to build the index automatically.
+
 # Project overview
 1. [Build a Search Engine with Me (1): Introduction](http://bitjoy.net/2016/01/04/introduction-to-building-a-search-engine-1/)
 2. [Build a Search Engine with Me (2): Web Crawler](http://bitjoy.net/2016/01/04/introduction-to-building-a-search-engine-2/)
@@ -19,6 +21,7 @@
 5. [Build a Search Engine with Me (5): Recommended Reading](http://bitjoy.net/2016/01/09/introduction-to-building-a-search-engine-5/)
 6. [Build a Search Engine with Me (6): System Demo](http://bitjoy.net/2016/01/09/introduction-to-building-a-search-engine-6/)
 7. [Build a Search Engine with Me (7): Summary and Outlook](http://bitjoy.net/2016/01/09/introduction-to-building-a-search-engine-7/)
+8. [Build a Search Engine with Me (8): Updating the Crawler, Revising the Scoring, and Online Deployment](https://bitjoy.net/2020/04/05/introduction-to-building-a-search-engine-8//)
 
 # Acknowledgements
 * [jieba](https://github.com/fxsjy/jieba)
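
The installation steps listed under the usage heading (Python 3.4+, lxml, jieba, Flask) can be verified before running anything else. The following is a minimal sketch and not part of the repository: the names `REQUIRED`, `missing_packages`, and `python_ok` are hypothetical, and the check only confirms that each package can be located, not that it works with this project.

```python
import importlib.util
import sys

# Hypothetical mapping (not from the repository): import name -> pip package name.
REQUIRED = {"lxml": "lxml", "jieba": "jieba", "flask": "Flask"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be located."""
    return [
        pip_name
        for import_name, pip_name in required.items()
        if importlib.util.find_spec(import_name) is None
    ]


def python_ok(min_version=(3, 4)):
    """Return True if the running interpreter is at least min_version."""
    return sys.version_info >= min_version


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.4+ is required")
    for pip_name in missing_packages():
        # Suggest the same install command the usage steps give.
        print("Missing package, try: pip install " + pip_name)
```

If the script prints nothing, the interpreter version is sufficient and all three packages were found.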