Tumblr server error: the official server-side API returns malformed responses #100
Comments
@qiuker521 you can submit a PR to fix it.
@jessefeinman you misunderstood my issue. I mean that fetching https://caddy-smashing.tumblr.com/api/read?num=50&start=14200 returns an error, so the whole 50-post page comes back as malformed XML and the spider raises an exception. Yes, https://caddy-smashing.tumblr.com/api/read?num=37&start=14200 is OK. Yes, https://caddy-smashing.tumblr.com/api/read?num=38&start=14200 is WRONG. I saw your commit. What I want is not to scan one post at a time from the start, but to catch the exception and then rescan only that 50-post page one post at a time.
I actually use Go myself, inspired by this spider, so here is my fix in Go, as a token of respect:
For the post at a link such as
https://xxx.tumblr.com/api/read?num=1&start=14201
(that is, post No. 14201 in this blog), the Tumblr server returns malformed XML.
Worse, a request with start=14200 and num=50 fails as a whole, so none of the 50 posts in that range are returned.
Posts No. 14200 and No. 14202 are fine: start=14200 with num=1 works, and so does start=14202 with num=1.
Exceptions caused by this kind of error are not rare.
My suggestion: when this kind of parse error occurs, fetch and parse the posts one at a time, 50 requests in total.
When I hit the error myself, I crawl No. 14200 through No. 14249 in a 50-iteration loop.
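The fallback described above could look roughly like the following Python sketch. It is not the project's actual code and not the Go fix mentioned in the comments. The names `read_posts`, `fetch`, and `http_fetch` are hypothetical, and the sketch assumes the response wraps each post in a `<post>` element and that a malformed response surfaces as `xml.etree.ElementTree.ParseError`.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, List


def read_posts(fetch: Callable[[int, int], str], start: int, num: int = 50) -> List[ET.Element]:
    """Return the <post> elements for posts start .. start+num-1.

    `fetch(start, num)` is a hypothetical callable that returns the raw XML
    body of /api/read?num=<num>&start=<start>.
    """
    try:
        # Fast path: one request for the whole batch.
        return ET.fromstring(fetch(start, num)).findall(".//post")
    except ET.ParseError:
        pass  # the batch is malformed; fall back to single-post requests

    posts: List[ET.Element] = []
    for i in range(start, start + num):
        try:
            posts.extend(ET.fromstring(fetch(i, 1)).findall(".//post"))
        except ET.ParseError:
            continue  # only this post is unreadable; skip it and keep going
    return posts


def http_fetch(blog: str) -> Callable[[int, int], str]:
    """Build a fetch callable for one blog (the blog name is an assumption)."""
    def fetch(start: int, num: int) -> str:
        url = f"https://{blog}.tumblr.com/api/read?num={num}&start={start}"
        with urllib.request.urlopen(url) as resp:
            return resp.read().decode("utf-8", errors="replace")
    return fetch
```

With this shape, a call such as `read_posts(http_fetch("xxx"), 14200)` costs one request when the batch is healthy and at most 51 requests when it is not, and only the unreadable post is lost rather than the whole page of 50.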