Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unified operators #7057

Merged
merged 16 commits into from
Nov 6, 2024
Merged

Unified operators #7057

merged 16 commits into from
Nov 6, 2024

Conversation

cometkim
Copy link
Member

@cometkim cometkim commented Sep 28, 2024

Trying to solve #6477, but without implicit open

Rationale

As explained in #6477, we have the "infix explosion" problem. To relax it, I added some specialization to both type inference and primitive inspired by F#

https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/symbol-and-operator-reference/arithmetic-operators#operators-and-type-inference

The idea is, to extend the type inference rules for primitive operations defined in Pervasives to specialize them for predefined types but fall back to int if they cannot be inferred.

let t1 = 1 + 2     // => external \"+": (int, int) => int = "%addint"
let t2 = 1. + 2.   // => external \"+": (float, float) => float = "%addfloat" 
let t3 = "1" + "2" // => external \"+": (string, string) => string = "%string_concat"
let t4 = 1n + 2n   // => external \"+": (bigint, bigint) => bigint = "%addbigint"

let fn1 = (a, b) => a + b        // => external \"+": (int, int) => int = "%addint"
let fn2 = (a: float, b) => a + b // => external \"+": (float, float) => float = "%addfloat" 
let fn3 = (a, b: float) => a + b // => external \"+": (float, float) => float = "%addfloat" 

let inv1 = (a: int, b: float) => a + b  // => external \"+": (int, int) => int = "%addint"
//                                   ^ error: cannot apply float here, expected int

This doesn't make a breaking change since the existing operations are already defined on int.

However, we can introduce them as unified operators for the new codebase. They are still safe, as they are always specialized and not polymorphic.

Specialized operators:

Op Primitive supported types ready?
~+ %plus int, float, bigint ✔️
~- %neg int, float, bigint ✔️
+ %add int, float , bigint, string ✔️
- %sub int, float, bigint ✔️
* %mul int, float, bigint ✔️
/ %div int, float, bigint ✔️
mod %mod int, float, bigint ✔️
% %mod int, float, bigint #7152
** %exp int, float, bigint #7153
& %bitand (formerly %land) int, bigint
| %bitor (formerly %lor) int, bigint
^ %bitxor (formerly %lxor) int, bigint
~ %bitnot (formerly %lnot) int, bigint
<< %lsl int, bigint
>> %asr int, bigint
>>> %lsr int
abs %abs int, float, bigint
min %min int, float, bigint, string
max %max int, float, bigint, string

Side note

This PR doesn't attempt to add custom ops or polymorphic ops, but that's the specialized custom ops already possible by opening a module.

Polymorphic ops should be deprecated. They are unpredictable and inefficient, as explained in #7031. So, in future versions, comparisons will behave consistently with this change. It will be a breaking change, but we don't have a better alternative yet.

@cometkim

This comment was marked as outdated.

@cometkim cometkim changed the title PoC: generic infix operators PoC: Unified infix operators Sep 28, 2024
@cometkim

This comment was marked as resolved.

@zth
Copy link
Collaborator

zth commented Oct 1, 2024

@cometkim you waiting on anything from any of us on this one?

@cometkim
Copy link
Member Author

cometkim commented Oct 4, 2024

@cometkim you waiting on anything from any of us on this one?

No, I just took a break from coding over the Korean holidays.

@zth
Copy link
Collaborator

zth commented Oct 4, 2024

@cometkim you waiting on anything from any of us on this one?

No, I just took a break from coding over the Korean holidays.

Enjoy your holidays!

@cometkim cometkim changed the title PoC: Unified infix operators PoC: Unified operators Nov 4, 2024
@cometkim
Copy link
Member Author

cometkim commented Nov 4, 2024

It now works for basic arithmetic operators

@cometkim cometkim requested review from cknitt, zth and cristianoc November 4, 2024 17:55
@cometkim cometkim marked this pull request as ready for review November 4, 2024 18:01
@cometkim cometkim changed the title PoC: Unified operators Unified operators Nov 4, 2024
Copy link
Collaborator

@cristianoc cristianoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach looks good to me, both the design and the implementation.
Left some minor comments.

@zth @cknitt any comments on the design?

match entry with
| Some {specialization} -> (
match specialization with
| {int}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if the when clause in this case is complete. But I guess this case is just unnecessary and can be removes as it is already expressed as the last default case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This always takes precedence int over other types. (Rule 1-2) The last default case is a fallback strategy. (Rule 3)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of any change after removing this first case?
All the other cases seem to have incompatible when clauses, so it would fall back to the last case no matter what.
Unless I'm missing something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. There is no logical difference in the behavior after removing that case. I was just thinking of the computational difference. In the existing codebase, I assume the first case is hit the most frequently (since it was originally int-only).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a detail anyway. Not much difference. Whatever you think is best.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it probably doesn't matter, but I'll leave it as is because it seems easier to understand the intent.

if String.length prim_name > 0 && prim_name.[0] = '%' then
raise (Error (loc, Unknown_builtin_primitive prim_name));
Pccall prim
| [arg1] | [arg1; _] -> translate_unified_ops prim env arg1.exp_type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why also [arg1] with only 1 argument?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a couple of unary primitive to support, %pos and %neg

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I noticed all the other functions don't have unaries, so if they're planned for this PR great.

This comment was marked as resolved.

let rhs = type_exp env rhs_expr in
let rhs_type = expand_head env rhs.exp_type in
let lhs, rhs, result_type =
(* rule 1. *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some overall comment above should summarise what rule1, rule2, rule3 are, before we get this far into the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can leave a link here to read the unified_ops.ml module first.

Or would you prefer to duplicate the same explanation here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link sounds perfect.
Or should the code move too? Not sure.
Just wondering why the description is in one file and the code in an other. Perhaps dependencies or similarity to other code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering why the description is in one file and the code in an other. Perhaps dependencies or similarity to other code.

I wanted to keep everything in a single file if possible. Unified_ops should be the only module that manages the list of supported operators.

But I used a utility from typecore.ml like type_exp. It isn't easy to replicate as it is defined recursively.

@cristianoc
Copy link
Collaborator

  1. * (%mul): While unusual for most data types, * is often supported for strings in some languages (like Python) for repetition. For example, "abc" * 3 results in "abcabcabc" in Python.

seems a pushing it a bit far -- but just reporting here what a quick search returned

@cknitt
Copy link
Member

cknitt commented Nov 5, 2024

This approach looks good to me, both the design and the implementation. Left some minor comments.

@zth @cknitt any comments on the design?

Looks good to me, too! Awesome work @cometkim!

@zth
Copy link
Collaborator

zth commented Nov 5, 2024

Agreed, looks great! 👏

@cometkim
Copy link
Member Author

cometkim commented Nov 5, 2024

  1. * (%mul): While unusual for most data types, * is often supported for strings in some languages (like Python) for repetition. For example, "abc" * 3 results in "abcabcabc" in Python.

seems a pushing it a bit far -- but just reporting here what a quick search returned

Sounds like an abbreviation of String.prototype.repeat

I'm not sure if that will be useful or cleaner.

@cometkim
Copy link
Member Author

cometkim commented Nov 5, 2024

I haven't added char here (yet?), but it would be confusing.

'a' * 3 looks like be "aaa", but that doesn't match the mechanism (char <> string) and it doesn't match the semantics (char = int) either.

@cometkim
Copy link
Member Author

cometkim commented Nov 5, 2024

For other operators, I need to implement/rework additional primitives. It seems better to merge this first and then do it later on.

Copy link
Collaborator

@cristianoc cristianoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds great!
Would you add a change log before merging?
We’ll also need to set documentation tasks for this.

@cometkim

This comment was marked as resolved.

@cometkim

This comment was marked as resolved.

specialization: specialization;
}

let builtin x = Primitive_modules.pervasives ^ "." ^ x
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhhh right it's pervasives_mini, not pervasive. tricky 😅

@cometkim
Copy link
Member Author

cometkim commented Nov 6, 2024

changelog added

@cometkim
Copy link
Member Author

cometkim commented Nov 6, 2024

documentation issue opened rescript-lang/rescript-lang.org#934

@cometkim cometkim merged commit b0211f3 into rescript-lang:master Nov 6, 2024
20 checks passed
@cometkim cometkim deleted the generic-infix branch November 6, 2024 18:25
@cometkim cometkim mentioned this pull request Nov 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants