-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sync with HarfBuzz 2.7.4 #75
Conversation
Never mind; I was silly and didn’t notice that I needed to update the Unicode version in the URLs. |
This matches the output format used in HarfBuzz 2.7.4.
Note to self: is this change to |
@bluebear94 Good progress. Here is my failed attempt. Maybe it would help a bit: Patch.patch I haven't been able to grok Either way, feel free to ask me anything. Just remember that I wrote this code 3 years ago and I do not remember much myself. |
Any changes to font tables parsing are irrelevant. The BASE table is unused, at least of 2.7 |
The way I was doing the back-porting is by checking each commit, not the whole diff at once. harfbuzz commits are usually pretty small and self-contains, so it's easier that way. If a commit has changes related to build system, fonts parsing, subsetting, C++ stuff - it can be safely ignored. If a commit contains changes to tests, like this one, then you have to checkout this commit. Build harfbuzz from sources: meson builddir --reconfigure
ninja -Cbuilddir And then run |
HarfBuzz 2.7.4’s buffer serialization was changed to surround the contents in square brackets. Should this be reflected in |
Sure. |
Trying to fix the remaining test failure. For the font |
I've checked that font with ttf-explorer and it does use |
It’s an existing test case that was manually changed after generation. This test was supposed to have an expected output of |
Ugh... good find. I would assume this change was caused by this commit. In which case it's not a bug a we should prefer rustybuzz output. Some floats rounding differences are expected. Lets leave the manual correction in place and I will think what could be done here. |
Now |
This applies the upstream 2.7.4 unviersal machine changes to the Ragel files. Unfortunately, this means that we have to allocate a vector in find_syllables, as we can’t do unholy things with operator overloading unlike in C++.
Merged the changes from #77 into this branch. Unfortunately, due to limitations in Ragel and in Rust, I had to switch back to allocating a vector in |
@bluebear94 Sure, no problem. Thanks for looking into it. Hopefully it was relatively easy for you to figure out how to build master ragel. I will update the readme soon. |
PS: I wasn't notified that this pull request is no longer a draft. I would suggest pinging me directly. |
Yes. I’ve almost figured out how to get rid of the vec allocation again, but unfortunately, it requires manually applying a few edits to the generated file. The problem is that Ragel outputs assignment statements such as |
This involves a `MachineCursor` type, which is equivalent to HarfBuzz’s `machine_index`. Unfortunately, it is not possible to overload the assignment operator in Rust, forcing us to manually edit the generated source.
Hm... I'm not familiar with Ragel myself, but it would be great to be able to avoid manual edits. Once again, there is no rush. |
By the way, why there are so many changes to |
These are due to changes in the |
Hm... github shows:
I can't even see what was changed. |
Are you talking about Python scripts? I think that should stay untouched. They are ported from harfbuzz as well. Just reformatted and outputting Rust instead of C++. rustybuzz should have as little changes as possible. |
Newer HarfBuzz versions use PackTab for a bunch of stuff. It's a small one-file Python library. Should be easy to add Rust output to it: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look at the first couple commits. I'll work through the rest over the next week or so.
ot.remove_language_ot('MONT') | ||
ot.add_language('mnw', 'MONT') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't appear in the original set of Harfbuzz changes, what is this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current version of Microsoft’s page of language tags has a note for Thailand Mon that chokes the parser. I had to add this for the script to finish successfully; otherwise, I get this error:
Traceback (most recent call last):
File "/home/felirovas/moddev/rustybuzz/scripts/gen-tag-table.py", line 1036, in <module>
print(' // %s' % bcp_47.get_name(lt))
^^^^^^^^^^^^^^^^^^^
File "/home/felirovas/moddev/rustybuzz/scripts/gen-tag-table.py", line 633, in get_name
name = self._get_name_piece(lt.language)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/felirovas/moddev/rustybuzz/scripts/gen-tag-table.py", line 622, in _get_name_piece
return self.names[subtag].split('\n')[0] + self.scopes.get(subtag, '')
~~~~~~~~~~^^^^^^^^
KeyError: 'mnwseehttps://www.unicode.org/l2/l2020/20163'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry that it took me so long. From my remaining read-through it looks reasonably accurate to Harfbuzz 2.7.4.
In the future, for ease of review I think we should have a better system for porting over commits:
- Port over commits from Harfbuzz one by one.
- Link the commit being ported in the
rustybuzz
commit.
@bluebear94 Any updates on why @notgull Porting commits one by one is fine. Adding the exact commit hash is probably unnecessary. We simply don't need this information. I plan to try to refactor the code base a bit to make it more harfbuzz-like. File structure, code structure, etc. Ideally, rustybuzz should look like harfbuzz transpiled to Rust. The less changes we have, the either is should be to port. |
See my previous comment on this. It might be possible to make the offending tests shorter using |
@bluebear94 Oh, I think I got it now. Those tests produce an enormous output? And harfbuzz tests simply use In this case I would add those tests to the ignore list in |
It’s not that all glyphs from these fonts are being tested, but rather than these fonts contain glyphs that expand indefinitely. Anyway, I pushed a commit to ignore these tests. |
I see now. Those tests were already absurd before. I guess I would have to handle them separately. Anyway, thanks for your work! I honestly wasn't expecting anyone to be brave enough to do it. |
@bluebear94 If you're still interested in updating rustybuzz, I can give you write access to this repo, to simplify your workflow. This would allow more gradual changes. |
See #37.
gen-tag-table.py
gen-universal-table.py
universal_machine::find_syllables
to avoid theVec
allocation?