This repository has been archived by the owner on Dec 9, 2018. It is now read-only.

Replacing HTML with unknown characters from the original PDF makes them Times New Roman per default #765

Open
mortenmoulder opened this issue Apr 11, 2018 · 0 comments


Problem

I think I understand why this is happening. When I convert one of my PDFs to HTML and then change characters in the resulting document, every changed character that was not used in the original document is automatically rendered in Times New Roman.

Example

In this example, the W is rendered in Times New Roman, whereas the rest of the text is Verdana. This happens because I haven't used a W anywhere in my document, so the font that pdf2htmlEX generates for the page doesn't contain that character and the browser falls back to its default font.
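
To check in advance which edited characters would hit this fallback, a small script can compare the edited text against the original. This is only a rough sketch: it assumes the generated font covers exactly the characters that appear in the original text, which is a simplification (a PDF can use several fonts, each with its own set of glyphs). The function name `missing_characters` is made up for illustration and is not part of pdf2htmlEX.

```python
def missing_characters(original_text: str, edited_text: str) -> list[str]:
    """Return the characters of edited_text that never occur in original_text.

    Assumption: a character absent from the original text has no glyph in
    the converted font, so the browser would render it in a fallback font.
    """
    available = set(original_text)
    # Whitespace is skipped because it has no visible glyph to compare.
    return sorted(
        {ch for ch in edited_text if ch not in available and not ch.isspace()}
    )


if __name__ == "__main__":
    # The original text has no "W", so the edit introduces an uncovered character.
    print(missing_characters("Hello orld", "Hello World"))
```

An empty result would mean that, under the assumption above, the edit only uses characters that already exist in the document.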

Possible fix

If I simply put something like ABCDEFGHIJKLMNOPQRSTUVWXYZ into my document, colored so that it blends into the white background, it actually works pretty well. The only issue is that other people can still see this text if they select it.
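
To keep that hidden filler as short as possible, it could be limited to the characters the document doesn't already use. The following is a minimal sketch under the same assumption as the workaround itself (any character present in the PDF ends up in the converted font). The helper name `filler_for` and the default alphabet are hypothetical choices, not anything pdf2htmlEX provides.

```python
import string


def filler_for(original_text: str, wanted: str = string.ascii_uppercase) -> str:
    """Return the characters from `wanted` that original_text does not use.

    The result is the minimal filler string to add to the PDF so that every
    character in `wanted` appears at least once in the document.
    """
    used = set(original_text)
    return "".join(ch for ch in wanted if ch not in used)


if __name__ == "__main__":
    # This document uses every uppercase letter except "W".
    print(filler_for("ABCDEFGHIJKLMNOPQRSTUVXYZ"))
```

This does not solve the selection problem described above; it only reduces how much invisible text ends up in the document.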

Is there a way to fix this?

mortenmoulder changed the title from "Replacing HTML with unknown glyphs makes them Times New Roman per default" to "Replacing HTML with characters unknown in the original PDF makes them Times New Roman per default" on Apr 11, 2018
mortenmoulder changed the title from "Replacing HTML with characters unknown in the original PDF makes them Times New Roman per default" to "Replacing HTML with unknown characters from the original PDF makes them Times New Roman per default" on Apr 11, 2018