Entity Localization Bug: Sentence. Sentences with colored citations have fragmented bounding boxes, in many papers. #186

Open
andrewhead opened this issue Jan 4, 2021 · 0 comments
Labels
bad-entity-detection: An issue or task related to an entity that was detected in the wrong place
bug: Something isn't working
entity-localization: An issue or task related to entity localization
sentences: An issue or task related to sentences

andrewhead commented Jan 4, 2021

Description: In papers that contain colored citations, sentence bounding boxes can be incorrect. In particular, they're fragmented, either missing part of the sentence that contains the citation, or only including long horizontal bars through the citations.

This problem has been observed in the following papers, out of those reviewed so far from the list of 24 papers that appears in issue #188:

  • 1701.07481v3
  • 1702.01287v1
  • 1701.02810v2
  • 1906.00414v2
  • 1906.01502v1
  • 1908.00300v1
  • 1905.05475v2
  • 1705.06566v2
  • 1802.07740v2
  • 1806.02371v1
  • 1811.12359v4

I have also seen this problem for sentences that contain a URL that wraps across more than one line:

  • 1701.02810v2 (First sentence of Background)

Here are a couple of examples. To reproduce, open up the paper by going to https://scholarphi.semanticscholar.org/?file=https://arxiv.org/pdf/[PAPER_ID].pdf. Then open up the web console and enter the following style:

.sentence-annotation {
  background-color: rgba(0, 0, 255, 0.2);
}
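
To make it quicker to open each affected paper, the reader URLs can be generated from the paper IDs listed above. This is only a convenience sketch: the URL pattern and the IDs are copied from this issue, while the names `AFFECTED_PAPER_IDS`, `READER_URL`, and `reader_url` are made up for illustration and are not part of the project.

```python
# Paper IDs copied from the list of affected papers in this issue.
AFFECTED_PAPER_IDS = [
    "1701.07481v3",
    "1702.01287v1",
    "1701.02810v2",
    "1906.00414v2",
    "1906.01502v1",
    "1908.00300v1",
    "1905.05475v2",
    "1705.06566v2",
    "1802.07740v2",
    "1806.02371v1",
    "1811.12359v4",
]

# Reader URL from this issue, with [PAPER_ID] turned into a format field.
READER_URL = (
    "https://scholarphi.semanticscholar.org/"
    "?file=https://arxiv.org/pdf/{paper_id}.pdf"
)


def reader_url(paper_id: str) -> str:
    """Return the reader URL for one arXiv paper ID (hypothetical helper)."""
    return READER_URL.format(paper_id=paper_id)


if __name__ == "__main__":
    # Print one URL per affected paper so they can be opened in turn.
    for paper_id in AFFECTED_PAPER_IDS:
        print(reader_url(paper_id))
```

After opening a URL, apply the `.sentence-annotation` style above to make the sentence bounding boxes visible.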

Paper 1701.07481v3
image

How to fix (optional): I believe this issue occurs because citations are assigned a color by the hyperref package when it is enabled for a paper. That color is preserved even when the surrounding sentence is assigned a different color. As a result, citations are not detected as belonging to the sentence, because they do not have the color expected for that sentence.
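
The suspected mechanism can be illustrated with a toy model. This is not the pipeline's actual localization code. It is a minimal sketch that assumes a sentence is located by collecting consecutive tokens that carry the color assigned to that sentence. The function `matching_runs` and the color labels are hypothetical.

```python
from typing import List, Tuple


def matching_runs(colors: List[str], expected: str) -> List[Tuple[int, int]]:
    """Return (start, end) ranges, end exclusive, of consecutive entries
    whose color equals `expected`."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, color in enumerate(colors):
        if color == expected:
            if start is None:
                start = i  # a new run of sentence-colored tokens begins
        elif start is not None:
            runs.append((start, i))  # a differently colored token ends the run
            start = None
    if start is not None:
        runs.append((start, len(colors)))
    return runs


# Hypothetical color labels, not values used by the real pipeline.
SENTENCE_COLOR = "sentence-color"
CITATION_COLOR = "hyperref-citation-color"

# One line of a sentence: four words, a two-token citation, three more words.
with_colored_citation = (
    [SENTENCE_COLOR] * 4 + [CITATION_COLOR] * 2 + [SENTENCE_COLOR] * 3
)
# The same line if the citation had taken on the sentence's color.
with_recolored_citation = [SENTENCE_COLOR] * 9

print(matching_runs(with_colored_citation, SENTENCE_COLOR))    # [(0, 4), (6, 9)]
print(matching_runs(with_recolored_citation, SENTENCE_COLOR))  # [(0, 9)]
```

In this model the citation that keeps its own color splits the sentence into two fragments, which matches the fragmented boxes described above. If the diagnosis is right, a fix would need to make citations take on the sentence's color, or make detection tolerate the citation color.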

andrewhead added the bug, entity-localization, bad-entity-detection, and sentences labels Jan 4, 2021
andrewhead added this to the LaTeX Updates for Alpha milestone Jan 4, 2021