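z6045670 wrote:
Question for anyone here who knows packet capture: how do I find the pagination rule for Zhihu's infinite-scroll results? (LocoySpider)
These are the Zhihu Q&A search results I want to scrape: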
https://www.zhihu.com/search?type=content&q=mjj
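Following a packet-capture tutorial, I captured the URLs below. The content does relate to the keyword, but the data is wrong: it doesn't match what the page shows.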
https://api.zhihu.com/search_v3?advert_count=0&correction=1&lc_idx=0&imit=20&offset=20&q=mjj&search_hash_id=3a30b7f93413a7ba8e9d7a4a886f83ed&show_all_topics=0&t=general&vertical_info=0%2C1%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1
https://api.zhihu.com/search_v3?advert_count=0&correction=1&lc_idx=24&limit=20&offset=40&q=mjj&search_hash_id=3a30b7f93413a7ba8e9d7a4a886f83ed&show_all_topics=0&t=general&vertical_info=0%2C1%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1
https://api.zhihu.com/search_v3?advert_count=0&correction=1&lc_idx=64&limit=20&offset=80&q=mjj&search_hash_id=3a30b7f93413a7ba8e9d7a4a886f83ed&show_all_topics=0&t=general&vertical_info=0%2C1%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1
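I'm not familiar with other scrapers and only know LocoySpider, so anything that needs packet capture to get the URL leaves me lost.
Any pointers would be appreciated, or just give me the paginated URLs for pages 1-5.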
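
One way to see which parameters actually change between the captured pages is to diff their query strings. The following is a minimal sketch that uses only the three URLs from the post above; the helper names `query_params` and `varying_params` are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The three URLs exactly as captured in the post above.
captured = [
    "https://api.zhihu.com/search_v3?advert_count=0&correction=1&lc_idx=0&imit=20&offset=20&q=mjj&search_hash_id=3a30b7f93413a7ba8e9d7a4a886f83ed&show_all_topics=0&t=general&vertical_info=0%2C1%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1",
    "https://api.zhihu.com/search_v3?advert_count=0&correction=1&lc_idx=24&limit=20&offset=40&q=mjj&search_hash_id=3a30b7f93413a7ba8e9d7a4a886f83ed&show_all_topics=0&t=general&vertical_info=0%2C1%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1",
    "https://api.zhihu.com/search_v3?advert_count=0&correction=1&lc_idx=64&limit=20&offset=80&q=mjj&search_hash_id=3a30b7f93413a7ba8e9d7a4a886f83ed&show_all_topics=0&t=general&vertical_info=0%2C1%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1",
]


def query_params(url):
    # Parse the query string into a dict holding the first value of each key.
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


def varying_params(urls):
    # Keep only the parameters that are missing or different in at least one URL.
    parsed = [query_params(url) for url in urls]
    result = {}
    for key in sorted(set().union(*parsed)):
        values = [params.get(key) for params in parsed]
        if len(set(values)) > 1:
            result[key] = values
    return result


diff = varying_params(captured)
for name, values in diff.items():
    print(name, values)
```

On these three URLs the only differences are `lc_idx`, `offset`, and the fact that the first URL carries `imit=20` where the other two carry `limit=20`, which may be a copy-paste slip worth checking.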
16qf wrote:
:lol Getting ready to scrape Zhihu?
baiyangz1 wrote:
http://board.locoy.com/?post=369
peng123 wrote:
https://i.w3tt.com/2020/08/28/aGo0b.png
Looks fine to me.
Alanku wrote:
I've heard Zhihu's anti-scraping restrictions are pretty strict. LocoySpider is really powerful.
z6045670 wrote:
baiyangz1 wrote on 2020-8-28 22:45:
http://board.locoy.com/?post=369
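Uh, that's exactly the tutorial I followed, and I couldn't get it to work.
z6045670 wrote:
peng123 wrote on 2020-8-28 22:46:
Looks fine to me.
The URL I captured at first was this one too: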
https://www.zhihu.com/api/v4/search_v3?t=general&q=mjj&correction=1&offset=20&limit=20&lc_idx=22&show_all_topics=0&search_hash_id=557fc484503ab169f4ca6f694c5540f7&vertical_info=0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1
But when I open it in a browser, this is what I get.
Every paginated URL returns the same thing:
{
"error": {
"code": 10002,
"message": "10002:\u8bf7\u6c42\u53c2\u6570\u5f02\u5e38\uff0c\u8bf7\u5347\u7ea7\u5ba2\u6237\u7aef\u540e\u91cd\u8bd5"
}
}
Do I need to do something extra? Please enlighten me. I feel like I'm getting closer to the answer. (I'm scraping with LocoySpider.)
518 wrote:
Try offset=0.
z6045670 wrote:
518 wrote on 2020-8-28 23:55:
Try offset=0.
https://www.zhihu.com/api/v4/search_v3?t=general&q=mjj&correction=1&offset=0&limit=20&lc_idx=22&show_all_topics=0&search_hash_id=557fc484503ab169f4ca6f694c5540f7&vertical_info=0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1
Same result.
518 wrote:
z6045670 wrote on 2020-8-29 00:01:
https://www.zhihu.com/api/v4/search_v3?t=general&q=mjj&correction=1&offset=0&limit=20&lc_idx=22&sh …
https://api.zhihu.com/search_v3?advert_count=0&correction=1&lc_idx=0&imit=20&offset=0&q=mjj&search_hash_id=3a30b7f93413a7ba8e9d7a4a886f83ed&show_all_topics=0&t=general&vertical_info=0%2C1%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1
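
Following the offset=0 suggestion, candidate URLs for pages 1-5 could be generated by stepping `offset` in multiples of `limit`. This is only a sketch under assumptions the thread does not confirm: that `offset` advances by `limit` per page (the captures show 20, 40, 80), that `lc_idx` can stay at a captured value (the captures show 0, 24, 64 with no obvious rule), and that `search_hash_id` remains valid across requests. The template is the second captured URL from the first post, and `page_urls` is a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Template taken from the thread (the capture that spells "limit" correctly).
template = (
    "https://api.zhihu.com/search_v3?advert_count=0&correction=1&lc_idx=24"
    "&limit=20&offset=40&q=mjj&search_hash_id=3a30b7f93413a7ba8e9d7a4a886f83ed"
    "&show_all_topics=0&t=general"
    "&vertical_info=0%2C1%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C1"
)


def page_urls(template_url, pages, page_size=20):
    # Assumption: page N starts at offset N * page_size, counting from 0.
    parts = urlsplit(template_url)
    params = dict(parse_qsl(parts.query))
    urls = []
    for page in range(pages):
        params["limit"] = str(page_size)
        params["offset"] = str(page * page_size)
        # lc_idx and search_hash_id are left at their captured values.
        urls.append(urlunsplit(parts._replace(query=urlencode(params))))
    return urls


urls = page_urls(template, 5)
for url in urls:
    print(url)
```

Whether the server accepts these URLs is a separate question: the 10002 error earlier in the thread suggests that a bare URL opened in a browser can be rejected even when its parameters look right.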