Entity Localization Bug: Sentence. Sentences with colored citations have fragmented bounding boxes, in many papers. #186
Labels
bad-entity-detection
An issue or task related to an entity that was detected in the wrong place
bug
Something isn't working
entity-localization
An issue or task related to entity localization
sentences
An issue or task related to sentences
Description: In papers that contain colored citations, sentence bounding boxes can be incorrect. In particular, they're fragmented, either missing part of the sentence that contains the citation, or only including long horizontal bars through the citations.
This problem has been observed in the following papers, from among those reviewed in the list of 24 papers that appears in issue #188:
I have also seen this problem for sentences that contain a URL that wraps across more than one line:
Here are a couple of examples. To reproduce, open up the paper by going to https://scholarphi.semanticscholar.org/?file=https://arxiv.org/pdf/[PAPER_ID].pdf. Then open up the web console and enter the following style:
Paper 1701.07481v3
How to fix (optional): I believe this issue occurs because citations are assigned a color by the `hyperref` package when it's enabled for papers. This color is preserved even when the surrounding sentence is assigned a different color. As a result, citations are not detected as belonging to the sentence because they do not have the expected color of that sentence.
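
To illustrate the suspected mechanism, here is a minimal sketch of color-based detection. It is not the actual pipeline code: the `Token` class, the `boxes_for_color` helper, the color names, and the coordinates are all hypothetical, and I'm assuming detection works by collecting runs of text that match the color assigned to the sentence.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """A hypothetical rendered token: its text, horizontal extent, and color."""
    text: str
    x0: float
    x1: float
    color: str


def boxes_for_color(tokens, expected_color):
    """Merge runs of consecutive tokens with the expected color into (x0, x1) spans."""
    boxes = []
    current = None
    for tok in tokens:
        if tok.color == expected_color:
            if current is None:
                current = [tok.x0, tok.x1]  # start a new run
            else:
                current[1] = tok.x1  # extend the current run
        elif current is not None:
            # A token with a different color ends the current run.
            boxes.append(tuple(current))
            current = None
    if current is not None:
        boxes.append(tuple(current))
    return boxes


SENTENCE_COLOR = "red"    # assumption: color assigned to the sentence
CITATION_COLOR = "green"  # assumption: color the citation link keeps

tokens = [
    Token("Prior", 0, 30, SENTENCE_COLOR),
    Token("work", 35, 60, SENTENCE_COLOR),
    Token("[12]", 65, 85, CITATION_COLOR),
    Token("shows", 90, 120, SENTENCE_COLOR),
    Token("this.", 125, 150, SENTENCE_COLOR),
]

fragments = boxes_for_color(tokens, SENTENCE_COLOR)
print(fragments)  # [(0, 60), (90, 150)]
```

In this sketch the citation token keeps its own color, so the sentence comes back as two boxes with a gap where the citation sits, which matches the fragmentation described above. If citations took on the sentence color, the same input would yield a single box.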