I have created a tool called UniGrammar that transpiles grammars written in its own DSL, based on JSON-like objects, into grammars in other DSLs, compiles them into usable artifacts where needed, and generates wrappers so that the parse trees can be used uniformly. It unifies other aspects too: storage of and access to compiled grammars, testing, and visualization (though visualization is backend-specific). For storage there is a gen-bundle command that compiles a grammar with all the tools and stores the artifacts. The bundle can then be used without much attention to the tools: UniGrammarRuntime itself detects the fastest backend available on the user's system (it also stores benchmark results within the bundle).
parglare has 2 formats: the source one and a precompiled table, which is built from Python lists and dicts and serialized into JSON. But the precompiled table is generated with a CLI app, not via an API, and is fetched automatically based on the path of the source file. My runtime is designed so that everything is always loaded from memory: when testing I prefer not to create unneeded files, since some users have SSDs, and floating-gate transistors survive only a limited number of erase cycles before they degrade and become useless. Also, JSON may not be the best format for storing the table, since it carries the overhead of text-based formats. I may want to replace it with, for example, CBOR.
So I wonder if it makes sense to:
parse grammars not only from files or strings, but also from the plain data that is usually serialized into JSON, so that I can deserialize it myself;
compile them not to files but to that same plain data, via an API, so that I can serialize it myself, and again without side effects, since other tools will run in the same process afterwards;
provide a convenient interface to trace and visualize them. I have not yet decided how I am actually going to visualize them. Currently I rely on the compilers' own functionality, but since most of them have none, I feel I will have to accept dot source and wrap xdot (or maybe just use networkx) to show them in these cases. So by an API for visualization I most likely mean an API that outputs dot source without any side effects.
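To make the three requests above concrete, here is a minimal sketch of what side-effect-free functions could look like. Everything in it is hypothetical: the `Table` layout, `dump_table`, `load_table` and `table_to_dot` are illustration names, not parglare's actual API or table format. The codec is passed in by the caller, so JSON could be swapped for something like CBOR without touching the rest.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical plain-data table: state -> {symbol: next state}.
# This is NOT parglare's real table layout, only an illustration.
Table = Dict[str, Dict[str, str]]


def dump_table(table: Table, serializer: Callable[[Any], bytes]) -> bytes:
    """Serialize the table with a caller-chosen codec; nothing touches disk."""
    return serializer(table)


def load_table(blob: bytes, deserializer: Callable[[bytes], Any]) -> Table:
    """Rebuild the table from bytes that are already held in memory."""
    return deserializer(blob)


def table_to_dot(table: Table) -> str:
    """Return Graphviz DOT source as a string; no printing, no files."""
    lines = ["digraph table {"]
    for state, transitions in sorted(table.items()):
        for symbol, target in sorted(transitions.items()):
            lines.append(f'    "{state}" -> "{target}" [label="{symbol}"];')
    lines.append("}")
    return "\n".join(lines)


def json_dumps(obj: Any) -> bytes:
    """JSON codec used here; any bytes-producing codec would fit."""
    return json.dumps(obj).encode("utf-8")


def json_loads(blob: bytes) -> Any:
    return json.loads(blob.decode("utf-8"))


table = {"0": {"a": "1"}, "1": {"b": "0"}}
blob = dump_table(table, json_dumps)      # bytes in memory, no file created
restored = load_table(blob, json_loads)   # round-trip without a path lookup
dot_source = table_to_dot(restored)       # caller decides how to render it
```

The caller keeps full control over where `blob` and `dot_source` end up, which is the property asked for: the library only transforms data and never writes files or prints.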
The API for tracing is a tricky one. Most tools have different kinds of tracing, and none of them visualizes the trace automatically. For example, ANTLR prints tokens, actions and errors to stdout. I guess for tracing we can use the following very generic interface: just a collection of in-memory buffers, each of which has some metadata describing its purpose and format. I.e. just an object with the fields tokens: typing.Optional[str] for tokens, meant to be printed to stdout; actions: typing.Optional[str], a text representation of actions; and actions_graph: typing.Optional[str], a GraphViz graph source meant to be rendered on screen or into a file. Or we may want to get the actions and tokens in an object-oriented format. I have not yet decided. Anyway, there shouldn't be any side effects, such as direct output to stdout or closing the app.
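As a rough sketch of the buffer object described above: the field names `tokens`, `actions` and `actions_graph` come from the proposal, while the class name `TraceBuffers`, the `available` helper and the `fake_trace` function are assumptions made up for illustration. No real backend is called here.

```python
import typing
from dataclasses import dataclass


@dataclass
class TraceBuffers:
    """In-memory trace output; a field left as None means 'not produced'."""

    # Plain text, intended to be printed by the caller if desired.
    tokens: typing.Optional[str] = None
    # Text representation of the parser actions.
    actions: typing.Optional[str] = None
    # Graphviz DOT source, intended to be rendered by the caller.
    actions_graph: typing.Optional[str] = None

    def available(self) -> typing.List[str]:
        """Names of the buffers that a backend actually filled in."""
        return [name for name, value in vars(self).items() if value is not None]


def fake_trace(text: str) -> TraceBuffers:
    """Hypothetical stand-in for a backend: it only splits on whitespace."""
    tokens = "\n".join(text.split())
    # Nothing is printed and the process is never terminated here.
    return TraceBuffers(tokens=tokens)


trace = fake_trace("a b c")
```

A caller would inspect `trace.available()` and decide per buffer whether to print it, render it, or ignore it, so backends with different tracing capabilities can share one return type.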
Do you consider refactoring parglare this way acceptable?
Also, it would be nice to have the reverse mapping, i.e. transpilation from parglare grammars into UniGrammar ones. Is it better to have it within parglare or within UniGrammar?
Maybe you would be interested in the discussion in #78, which covers some of the ideas you presented here (if I understood you correctly). There is a branch with the implementation, although it will need an update/rework to incorporate the newest parglare changes. Basically, with that approach the textual parglare grammar language is just one of many possible syntaxes.