Diagnosis of crawl versus replay problems #14

Open
anjackson opened this issue Apr 3, 2019 · 1 comment


@anjackson
Member

A common issue is that it is not clear whether a problem with a site is due to gaps in the crawl or to replay-time rewriting limitations. It should be possible to use proxy playback mode to evaluate the crawl without the additional complications of replay-time rewrites. There are two strands to this:

  • Making it easier for users to check things manually in proxy mode (probably best done via a browser extension?)
  • Automated diagnosis of rewriting issues, by comparing proxy-mode playback with rewrite-mode playback.
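
To make the second strand concrete, here is a minimal sketch of what such a comparison could look like. It assumes we already have, for the same page, a log of `(URL, HTTP status)` pairs captured once in proxy mode and once in rewrite mode. It also assumes rewritten URLs follow a Wayback-style `<prefix>/<timestamp><modifier>/<original URL>` pattern. The function names and category labels are hypothetical, not part of any existing tool.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Assumption: rewritten URLs look like
#   <prefix>/<timestamp><optional two-letter modifier + "_">/<original URL>
_REWRITTEN = re.compile(r"^.*?/\d{4,14}(?:[a-z]{2}_)?/(https?://.*)$")

Log = Iterable[Tuple[str, int]]  # (requested URL, HTTP status)


def original_url(url: str) -> str:
    """Strip the assumed replay prefix; return the URL unchanged if absent."""
    match = _REWRITTEN.match(url)
    return match.group(1) if match else url


def _ok(status: Optional[int]) -> bool:
    """Treat 2xx and 3xx responses as successfully replayed."""
    return status is not None and 200 <= status < 400


def diagnose(proxy_log: Log, rewrite_log: Log) -> Dict[str, str]:
    """Label each original URL by comparing the two playback modes."""
    proxy = {original_url(u): s for u, s in proxy_log}
    rewrite = {original_url(u): s for u, s in rewrite_log}
    report: Dict[str, str] = {}
    for url in sorted(set(proxy) | set(rewrite)):
        p, r = proxy.get(url), rewrite.get(url)
        if _ok(p) and _ok(r):
            report[url] = "ok"
        elif _ok(p):
            # Replays in proxy mode but failed or was never requested
            # in rewrite mode: the capture exists, so suspect rewriting.
            report[url] = "rewrite issue"
        elif url in proxy:
            # Requested in proxy mode but not replayable: suspect the crawl.
            report[url] = "crawl gap"
        else:
            # Only ever requested in rewrite mode, e.g. a mangled URL.
            report[url] = "rewrite-only request"
    return report
```

Anything labelled "crawl gap" would point at the crawler, while "rewrite issue" and "rewrite-only request" would point at replay-time rewriting. How the two logs are captured (browser extension, headless browser, and so on) is left open here.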
@jkafader
Contributor

jkafader commented Apr 3, 2019

Another question in this area is "can we trace a missing DOM node back to a particular missing URL?", which would be useful if what we're eventually trying to do is get a crawler to go and fix the page in an automated way.
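
One possible way to frame that tracing, purely as a sketch: suppose a live-site run gives us two hypothetical maps, `node_source` (which resource created each DOM node) and `initiators` (which resource caused each resource to load), plus the set of URLs `missing` from the archive. Walking up the initiator chain from the node's source then yields a candidate URL for the crawler to fetch. All names and the data model here are assumptions for illustration.

```python
from typing import Dict, Optional, Set


def root_missing_url(
    selector: str,
    node_source: Dict[str, str],   # DOM selector -> URL that created the node
    initiators: Dict[str, str],    # URL -> URL that caused it to be loaded
    missing: Set[str],             # URLs absent from the archive
) -> Optional[str]:
    """Return the missing URL closest to the page root, or None."""
    url = node_source.get(selector)
    culprit = None
    seen = set()
    while url is not None and url not in seen:
        seen.add(url)  # guard against cycles in the initiator data
        if url in missing:
            culprit = url  # keep climbing: an ancestor may also be missing
        url = initiators.get(url)
    return culprit
```

Returning the missing URL nearest the root is a deliberate choice in this sketch: if a loader script was never captured, everything it would have loaded will be missing too, so fetching the loader first is the most useful fix.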
