Make qcheck-lin and qcheck-stm available on OCaml 4.x #329

edwintorok · 2023-04-17T17:26:50Z

I know this is the 'multicoretests' repo, but the Thread mode of qcheck-lin appears to be useful for testing OCaml 4.x code too :)

Unfortunately 'enabled_if' in 'test' stanza in dune doesn't work: it still attempts to build it, which would fail on OCaml 4, so have to use separate executable/runtest alias where 'enabled_if' works properly.

The following now compiles:

dune build -p qcheck-multicoretests-util,qcheck-stm,qcheck-lin

The tests run, but the output is currently different from OCaml 5:

dune runtest -p qcheck-multicoretests-util,qcheck-stm,qcheck-lin

I haven't added enabled_if to all the other tests that are part of the multicoretests package, too much churn, and most tests wouldn't run because they link both the thread and domain versions.

What do you think?

jmid · 2023-04-18T07:01:42Z

This is great, many thanks! 😃 👍

I expect the output difference is due to using a different underlying RNG (it switched with OCaml5). Using OCaml5's RNG from https://github.com/ocaml/stdlib-random could help make the outputs agree across major versions.
As a first crude approximation, could one simply require OCaml5 for the multicoretests package? I'm hoping that would allow the other 3 packages to be available on 4.x and avoid us having to rewrite all existing (test ...) rules of the test suite...
Note: The Thread modes of both qcheck-lin and qcheck-stm don't work as well as the corresponding Domain modes. This would nevertheless allow for people to try it out on 4.x and also use STM.sequential...

edwintorok · 2023-04-18T08:17:11Z

I moved the test that depended on OCaml 5's RNG to the multicoretests package. That package already depends on base-domains which implies OCaml 5, so should be fine now. This passes:

dune runtest -p qcheck-multicoretests-util,qcheck-stm,qcheck-lin --no-buffer

(it also doesn't seem to be actually running any tests, but thats fine I guess for now, this is well tested on OCaml 5...)

jmid · 2023-04-18T10:54:41Z

Sorry, for the confusion - I was too brief and unclear: 😬

The test/ directory contains "internal tests" of qcheck-multicoretests-util, qcheck-stm, and qcheck-lin
The src/ directory contains "the OCaml5 torture test suite" aka the multicoretests package

In our CI we have a separate step to first run and check the former and subsequently the latter.
In essense, test/ contains the tests we would include if we decide to relocate qcheck-multicoretests-util, qcheck-stm, and qcheck-lin to a different repo.

With the above in mind, marking a test in test/ as belonging to multicoretests disturbs the above separation. As such, I would rather prefer to keep the mutable_set_v5 belonging to the qcheck-stm package. A quick-and-dirty way to fix it on 4.x would be to include .expected-files for the 4.x RNG on 64/32-bit too. Another way would be to add https://github.com/ocaml/stdlib-random as an overall dependency. However I'm open to other suggestions too.

This following comment was targeted at the src/ tests, i.e., suggesting that the torture test suite was limited to just OCaml5+ for a start:

As a first crude approximation, could one simply require OCaml5 for the multicoretests package? I'm hoping that would allow the other 3 packages to be available on 4.x and avoid us having to rewrite all existing (test ...) rules of the test suite...

edwintorok · 2023-04-18T10:59:37Z

A quick-and-dirty way to fix it on 4.x would be to include .expected-files for the 4.x RNG on 64/32-bit too.

I'll take a look whether it is possible to use dune's select feature to choose a 4 vs 5 .expected files (I cannot do it based on compiler version, but I can use availability of 'base-domains' as a proxy for >= 5 which should suffice).
32/64 might be more complicated, I'll setup a 32-bit toolchain to check.

jmid · 2023-04-18T11:16:49Z

Sounds good! There's yet another approach (using dune file generation) in https://github.com/c-cube/qcheck/blob/main/test/core/dune - mostly as an example of what I hope to avoid 😅

Their Thread mode is useful for testing OCaml 4.x code too. Unfortunately 'enabled_if' in 'test' stanza in dune doesn't work: it still attempts to build it, which would fail on OCaml 4, so have to use separate executable/runtest alias where 'enabled_if' works properly. The following now compiles: ``` dune build -p qcheck-multicoretests-util,qcheck-stm,qcheck-lin ``` The tests run, but the output is currently different from OCaml 5: ``` dune runtest -p qcheck-multicoretests-util,qcheck-stm,qcheck-lin ``` Signed-off-by: Edwin Török <[email protected]>

It uses a different RNG. Signed-off-by: Edwin Török <[email protected]>

It uses some new functions from Seq. Signed-off-by: Edwin Török <[email protected]>

edwintorok · 2023-04-18T12:43:08Z

I've created 32 and 64-bit versions of the .expected files now (using opam switch create 4.14.0+32-bit ocaml-variants.4.14.0+options ocaml-option-32bit to get a 32-bit toolchain), the rules there already used enabled_if so I went with that instead of select.

jmid · 2023-04-18T15:32:58Z

I've tried this out locally and it works nicely as intended 👍
Both the 4.x and 5.x Thread modes don't work as nicely as the 5.x Domain ones unfortunately.
There are some previous attempts to improve them in #99

Bonus points for realizing that base-domains are not required for qcheck-multicoretests-util 🥇😃

Q: I see this requires 4.14 now. What prevents us from lowering the required version?
Note: I'm not asking you to lower it! I'm just trying to understand if we are using some Stdlib bindings added in 4.14 (I've had my share of lower bound battles during QCheck releases... 😅 )

edwintorok · 2023-04-18T17:06:57Z

On 18 April 2023 16:33:10 BST, Jan Midtgaard ***@***.***> wrote: I've tried this out locally and it works nicely as intended :+1: Both the 4.x and 5.x `Thread` modes don't work as nicely as the 5.x `Domain` ones unfortunately. There are some previous attempts to improve them in #99 Bonus points for realizing that `base-domains` are not required for `qcheck-multicoretests-util` 🥇😃 Q: I see this requires 4.14 now. What prevents us from lowering the required version? Note: I'm not asking you to lower it! I'm just trying to understand if we are using some `Stdlib` bindings added in 4.14 (I've had my share of lower bound battles during QCheck releases... 😅 )

Seq.iteri and Seq.equal. I do actually have a ocaml4.13 branch where I replaced these 2 functions (haven't tried further back). If there is interest in lowering the min version I can open a PR with that.

…

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

jmid · 2023-04-18T20:41:16Z

The CI is green modulo unrelated failures:

Two Cygwin part1's were cancelled
A Cygwin part2 trunk run timed out due to a long-running (9595.9s!) Lin ref int64 test with Thread
3 Windows runs (5.1, 5.1 bytecode, trunk) triggered the threadomain timeout

A Changes entry would be nice to help document the change in the forthcoming release notes.

jmid · 2023-04-25T13:50:47Z

I've now added a CHANGES entry in 04df34f.
The PR is still marked draft, but I think it is good to go.
Let me know if you want to include further edits.

edwintorok · 2023-04-26T08:24:00Z

Sorry, I got distracted by trying out the code and writing some tests using it. It does work, although writing custom shrinking and printing functions is highly recommended: makes shrinking results a lot more useful and finish in a reasonable time.
Here is a snippet:

let is_small s = String.length s < 6

  let shrink_cmd =
    let open QCheck in
    let open KV in
    function
    | Get k ->
        Iter.map
          (fun k -> Get (Key.of_string_exn k))
          Shrink.(filter is_small string @@ Key.to_string k)

let truncated_str s =
  if String.length s < 6 then
    String.escaped s
  else
    Printf.sprintf "%s…(len:%d)"
      (String.sub s 0 6 |> String.escaped)
      (String.length s)

This relies on the assumption (that at least in my code) bugs have 2 origins: due to the byte values contained in the string (in which case small string should be able to fully exercise that), or due to the length of the string (in which case trying to generate different bytes, or different short strings won't reproduce the bug and make shrinking take a huge amount of time, just leave long strings as they were for now if short ones don't repro the problem).
Other improvements might be possible (e.g. I haven't checked whether shrinking would already do a binary search on length or not), although the non-determinism due to threads/domains does sometimes make shrinking or testing slower (for some reason STM seems ~10x slower than Lin). A custom shrinker makes it work wonderfully though.

I might submit a separate PR to add some examples to the docs about writing a shrink_cmd, IIUC that is needed to attempt to shrink the values, otherwise just the number of commands is being shrunk, and custom printers are needed to keep the printed results readable (especially when printing the 2 parallel branches side-by-side)

jmid · 2023-04-26T14:05:28Z

This relies on the assumption (that at least in my code) bugs have 2 origins: due to the byte values contained in the string (in which case small string should be able to fully exercise that), or due to the length of the string (in which case trying to generate different bytes, or different short strings won't reproduce the bug and make shrinking take a huge amount of time, just leave long strings as they were for now if short ones don't repro the problem).

Conditional string shrinking based on its length seems like a good idea!
I agree that issues tend to be either caused by the length or a particular char content.
Potentially, such improvements should go upstream into QCheck.
We have also experienced needlessly long runs due to long strings, and tend to use small_string generators as a quick (but unsatisfying) workaround.

I haven't checked whether shrinking would already do a binary search on length or not

A nice and tempting rabbit hole starts here: c-cube/qcheck#242 and continues here: c-cube/qcheck#268

Needlessly long strings being printed also sounds familiar. I created #308 to keep track of it.
Again, improved string printers may also be useful to push upstream to QCheck.

Other improvements might be possible (e.g. I haven't checked whether shrinking would already do a binary search on length or not), although the non-determinism due to threads/domains does sometimes make shrinking or testing slower (for some reason STM seems ~10x slower than Lin). A custom shrinker makes it work wonderfully though.

In contrast to Lin, STM has to perform precond checks for each shrinking candidate, as each of them could potentially contain a simplification that breaks a precondition. Following the original QuickCheck race-condition paper we therefore only accept 'shrink candidates' that satisfy all preconds under all possible interleavings... This is exponential in the length of the 'parallel tails', and so may likely explain how a more economical shrinker triggers fewer such checks and thus exhibits better performance...

I might submit a separate PR to add some examples to the docs about writing a shrink_cmd, IIUC that is needed to attempt to shrink the values, otherwise just the number of commands is being shrunk,

Indeed, without supplying cmd shrinkers only the "spine" of the cmd list triples are reduced.

and custom printers are needed to keep the printed results readable (especially when printing the 2 parallel branches side-by-side)

Yep, the horror cabinet starts in #308 😅

edwintorok · 2023-04-26T14:18:26Z

Hmm,interruptible shrinking might be useful, e.g. if I happen to notice that shrinking is taking a long time (how long, will it take hours or days to finish, and meanwhile seeing even the potentially tiny bit longer error might already be useful...), dumping some sort of "state" that can be printed and further shrunk might be useful although that would require 'cmd' to be both serializable and deserializable.
Some form of 'dump current state' to a file upon a signal (^C or ^I/^T) might be useful so I can inspect it and potentially improve the shrinker. Anyway as you say all these improvements can go directly to qcheck upstream, so I'll start there.

edwintorok · 2023-04-26T14:23:29Z

all preconds under all possible interleavings... This is exponential in the length of the 'parallel tails

Limiting the length of the generated commands might help, and is what Jepsen recommends in their tutorials

I'll try out some ideas and report back (hopefully in the form of some PRs...)

edwintorok added 3 commits April 18, 2023 13:19

test: add OCaml 4 versions of expected output

5854e7d

It uses a different RNG. Signed-off-by: Edwin Török <[email protected]>

opam: Require OCaml 4.14 for qcheck-lin

7ba5aaa

It uses some new functions from Seq. Signed-off-by: Edwin Török <[email protected]>

edwintorok force-pushed the ocaml4 branch from e45b3a7 to 7ba5aaa Compare April 18, 2023 12:42

jmid mentioned this pull request Apr 18, 2023

Chore: src/neg_tests cleanups #330

Merged

jmid approved these changes Apr 19, 2023

View reviewed changes

Add a CHANGES entry

04df34f

edwintorok marked this pull request as ready for review April 26, 2023 08:15

jmid merged commit 761fbf3 into ocaml-multicore:main Apr 26, 2023

This was referenced May 1, 2023

A few 4.14 updates #339

Merged

Less exhaustive string shrinking c-cube/qcheck#278

Open

psafont mentioned this pull request May 2, 2023

OCaml 4.13 compatibility #340

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make qcheck-lin and qcheck-stm available on OCaml 4.x #329

Make qcheck-lin and qcheck-stm available on OCaml 4.x #329

edwintorok commented Apr 17, 2023

jmid commented Apr 18, 2023

edwintorok commented Apr 18, 2023

jmid commented Apr 18, 2023

edwintorok commented Apr 18, 2023

jmid commented Apr 18, 2023

edwintorok commented Apr 18, 2023

jmid commented Apr 18, 2023

edwintorok commented Apr 18, 2023 via email

jmid commented Apr 18, 2023

jmid commented Apr 25, 2023

edwintorok commented Apr 26, 2023

jmid commented Apr 26, 2023

edwintorok commented Apr 26, 2023

edwintorok commented Apr 26, 2023

Make qcheck-lin and qcheck-stm available on OCaml 4.x #329

Make qcheck-lin and qcheck-stm available on OCaml 4.x #329

Conversation

edwintorok commented Apr 17, 2023

jmid commented Apr 18, 2023

edwintorok commented Apr 18, 2023

jmid commented Apr 18, 2023

edwintorok commented Apr 18, 2023

jmid commented Apr 18, 2023

edwintorok commented Apr 18, 2023

jmid commented Apr 18, 2023

edwintorok commented Apr 18, 2023 via email

jmid commented Apr 18, 2023

jmid commented Apr 25, 2023

edwintorok commented Apr 26, 2023

jmid commented Apr 26, 2023

edwintorok commented Apr 26, 2023

edwintorok commented Apr 26, 2023