Systematic framework tests for all machines #4712

karlnapf · 2019-07-09T09:46:34Z

testing

use sgobject iterator to instantiate and test all machines from a given list
individual ignore lists for each test (runtime)
automate header and machine name extraction from trained_model_serialization.cc.py
separate data generating code into environment
clean up

tests

consistency of views/subset with subsampled (no-view, same data)
training/apply leaves data unchanged
training thread consistency
test cross-validation thread consistency
import trained_model_serialization tests
delete old trained_model_serialization tests
training without initializing throws an exception (and doesn't crash)
document non-working machines in ignore lists

karlnapf · 2019-07-09T09:47:37Z

tests/unit/machine/all_machines_unittest.cc

+	}
+}
+
+TEST(AllMachines, cv_thread_consistency)


@theartful these are the kind of tests I was talking about.

these are very nice!

do you think moving the data-generating code into the public interface would be a good idea? examples would benefit from it, for instance

yes, I want to move all that into an environment similar to those used by trained_model_serialization, so the data can be re-used for other instances. Will add that to the list, thanks

karlnapf · 2019-07-09T15:11:26Z

tests/unit/evaluation/CrossValidation_unittest.cc

@@ -1,206 +0,0 @@
-/*


these are subsumed by the new ones

vigsterkr

redundant and inconsistent

vigsterkr · 2019-07-10T10:03:03Z

tests/unit/machine/all_machines_unittest.cc

+
+TEST(AllMachines, train_thread_consistency)
+{
+	std::set<std::string> ignores = {


copy-paste all around the source... plz avoid redundancy

the ignores are test-specific.

yeah, 3 times the very same ignore

this is because the number of tested machines is small and the list is not yet automated. There are already differences here though, see random forest. With more machines tested, there will be more diversity. If not, I can of course drop this.

since you copy-paste the value 3 times... you could create a variable for it instead of copy-pasting the same string around. 1 copy-paste is one too many imo

sure, I will look into this once the coverage is bigger
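One way to reconcile "the ignores are test-specific" with "avoid copy-paste" is a shared base set that each test extends. The following is a minimal sketch only: `common_ignores`, `with_extra`, and the placeholder machine names are hypothetical and not taken from the PR.

```cpp
#include <initializer_list>
#include <set>
#include <string>

// Hypothetical: machines that every test currently skips.
// "MachineA" and "MachineB" are placeholders, not real entries from the PR.
inline const std::set<std::string>& common_ignores()
{
	static const std::set<std::string> ignores = {"MachineA", "MachineB"};
	return ignores;
}

// Returns the shared ignores plus any test-specific additions.
inline std::set<std::string>
with_extra(std::initializer_list<std::string> extra)
{
	std::set<std::string> result = common_ignores();
	result.insert(extra.begin(), extra.end());
	return result;
}
```

A test that needs one more exclusion could then write `auto ignores = with_extra({"RandomForest"});` (name used for illustration), while tests with no special cases use `common_ignores()` directly.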

vigsterkr · 2019-07-10T10:04:30Z

tests/unit/machine/all_machines_unittest.cc

+		cv->put("seed", 1);
+		auto result_single = cv->evaluate();
+
+		get_global_parallel()->set_num_threads(4);


this is a by-chance checking of the consistency... depends on many aspects. for example on the machine that the code is being run on etc.

true, but this already revealed bugs. open for ways to improve

it's one thing to have a test locally for this... it's another thing to have this formalised in a test that is part of the codebase and the CI, which is going to be run on many different archs/distros etc...

for example:

do not use global_parallel - that is going to be dropped eventually - the sooner the better

why 4, why not 8, or 3 or 11 or 42? you could actually use a function to get the max number of threads supported by the runtime env and use that number, and if it's 1 then say that foobar.
....

the number is of course arbitrary. good idea to use the available number of threads, I will do that
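The suggestion to query the runtime for the thread count could be sketched with the standard library alone. The helper names `available_threads` and `can_compare_thread_counts` are hypothetical, and the sketch does not address the separate point about dropping `get_global_parallel`.

```cpp
#include <algorithm>
#include <thread>

// Thread count to use for the multi-threaded run.
// std::thread::hardware_concurrency() may return 0 when the value cannot be
// determined, so fall back to 1 in that case.
inline unsigned available_threads()
{
	return std::max(1u, std::thread::hardware_concurrency());
}

// True if comparing a single-threaded run against a multi-threaded run is
// meaningful on this host.
inline bool can_compare_thread_counts()
{
	return available_threads() > 1;
}
```

In a GoogleTest test, one option would be to skip the comparison when `can_compare_thread_counts()` is false (recent GoogleTest releases provide `GTEST_SKIP()` for this), rather than silently passing.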

vigsterkr · 2019-07-10T10:05:40Z

tests/unit/machine/all_machines_unittest.cc

+		init_machine(machine);
+		auto data = generate_data(machine);
+
+		machine->set_labels(data.second);


the triplet set_labels, train, apply is sprayed around this source again... while it could be a function that returns a tuple that you nicely tie to get the values back

for further info: https://en.cppreference.com/w/cpp/language/structured_binding

but in this case you would actually just return the predictions... so you don't even need that... just pass the rvalue of the tuple (labels, data)
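The helper described above might look roughly like this. It is a sketch under assumptions: `ToyMachine` is a stand-in written for this example and is not the Shogun API, and `train_and_apply` is a hypothetical name. The pair is ordered (features, labels) to match the `data.first` / `data.second` usage in the diff.

```cpp
#include <utility>
#include <vector>

// Stand-in for a machine, for illustration only; not the Shogun API.
struct ToyMachine
{
	std::vector<int> labels;
	bool trained = false;

	void set_labels(const std::vector<int>& l) { labels = l; }
	void train(const std::vector<double>& features)
	{
		trained = !features.empty();
	}
	// Predicts the first training label for every sample.
	std::vector<int> apply(const std::vector<double>& features) const
	{
		return std::vector<int>(
		    features.size(), labels.empty() ? 0 : labels.front());
	}
};

// Bundles the repeated set_labels / train / apply sequence and returns only
// the predictions. `data` is a (features, labels) pair.
template <typename Machine, typename Features, typename Labels>
auto train_and_apply(
    Machine& machine, const std::pair<Features, Labels>& data)
{
	machine.set_labels(data.second);
	machine.train(data.first);
	return machine.apply(data.first);
}
```

Each test body then shrinks to one call, e.g. `auto predictions = train_and_apply(machine, data);`.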

vigsterkr · 2019-07-10T10:47:04Z

tests/unit/machine/all_machines_unittest.cc

+}
+
+// TODO, generate this automatically, like in trained_model_serialization
+std::set<string> all_machines = {"LibSVM", "Perceptron", "LibLinear",


at least make a function that returns this set instead of having it here a global var....

will do once this extraction is automated
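Until the extraction is automated, the global could be wrapped in an accessor with a function-local static, which avoids a mutable global and static-initialization-order issues. A minimal sketch; only the three names visible in the diff hunk are listed, since the rest of the original set is cut off in the thread.

```cpp
#include <set>
#include <string>

// Returns the machines under test. Only the names visible in the diff are
// included here; the full list from the PR is truncated in the thread.
inline const std::set<std::string>& all_machines()
{
	static const std::set<std::string> machines = {
	    "LibSVM", "Perceptron", "LibLinear"};
	return machines;
}
```

Tests would then iterate with `for (const auto& name : all_machines())`, and the body of the accessor can later be replaced by generated code without touching the call sites.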

vigsterkr · 2019-07-10T11:13:18Z

tests/unit/machine/all_machines_unittest.cc

+
+		get_global_parallel()->set_num_threads(1);
+		machine->set_labels(data.second);
+		if (machine->has("seed"))


there's actually a function for this: seed(machine, 1);

vigsterkr · 2019-07-10T11:13:47Z

tests/unit/machine/all_machines_unittest.cc

+		auto result_single = machine->apply(data.first);
+
+		init_machine(machine);
+		get_global_parallel()->set_num_threads(4);


totally arbitrary number....

vigsterkr · 2019-07-10T11:14:28Z

tests/unit/machine/all_machines_unittest.cc

+		get_global_parallel()->set_num_threads(4);
+		machine2->set_labels(data.second);
+		if (machine2->has("seed"))
+			machine2->put("seed", 1);


I guess the seed should be the same as above, right? in that case this is again just copy-pasting the same number around, which makes refactoring a pain... use a variable.

vigsterkr · 2019-07-10T11:15:53Z

tests/unit/machine/all_machines_unittest.cc

+		machine2->set_labels(data.second);
+		if (machine2->has("seed"))
+			machine2->put("seed", 1);
+		machine2->train(data.first);


same as below: set_labels, train, apply. could you use a function for this?
the seeding part could be handled by giving that function a 4th arg with a default value of some special number that means no seeding...
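The "special number that means no seeding" idea, together with the earlier request to keep the seed in one variable, could be sketched as follows. Everything here is hypothetical: `SeedableMachine` is a stand-in rather than the Shogun API, and `kNoSeed` / `maybe_seed` are names invented for this example. The seeding step is shown on its own for brevity; it could equally be a defaulted last parameter of a combined train-and-apply helper.

```cpp
#include <map>
#include <string>

// Stand-in for a parameterised machine; illustration only, not the Shogun API.
struct SeedableMachine
{
	std::map<std::string, int> params;

	bool has(const std::string& name) const
	{
		return params.count(name) > 0;
	}
	void put(const std::string& name, int value) { params[name] = value; }
};

// Sentinel meaning "do not seed"; the concrete value is an arbitrary choice.
constexpr int kNoSeed = -1;

// Seeds the machine only if a seed was requested and the machine exposes a
// "seed" parameter. Returns true if the seed was actually set.
template <typename Machine>
bool maybe_seed(Machine& machine, int seed = kNoSeed)
{
	if (seed == kNoSeed || !machine.has("seed"))
		return false;
	machine.put("seed", seed);
	return true;
}
```

With a single `const int seed = 1;` at the top of the test, both machines can be seeded via `maybe_seed(machine, seed)` and `maybe_seed(machine2, seed)`, so the value lives in one place.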

vigsterkr · 2019-07-10T11:16:18Z

tests/unit/machine/all_machines_unittest.cc

+			machine_subsampled->put("seed", 1);
+		}
+
+		machine_subset->set_labels(labels_subset);


set_labels, train, apply... all over again

stale · 2020-02-26T15:52:05Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


karlnapf force-pushed the feature/machine_tests branch 2 times, most recently from 8c4fb25 to 123757c, on July 9, 2019 15:09


vigsterkr requested changes Jul 10, 2019


Systematic framework tests for all machines

79d7838

karlnapf force-pushed the feature/machine_tests branch from 123757c to 79d7838 on July 10, 2019 10:09

vigsterkr reviewed Jul 10, 2019


stale bot added the stale label Feb 26, 2020

gf712 self-assigned this Feb 26, 2020

stale bot removed the stale label Feb 26, 2020

gf712 removed their assignment Feb 26, 2020

gf712 added the Tag: Testing label Feb 26, 2020
