Refactor real data generators #17

Open
Miloisthegoat opened this issue Jun 3, 2024 · 0 comments
Labels
bug Something isn't working


Not strictly a bug, but its current behaviour is not ideal.

Right now, the data are not shuffled, and hitting Regenerate does nothing. So you always get the same train and test data points with these datasets.

Current method: slice 2n positive and n negative samples from the beginning of the array for Train, and 2n positive and n negative samples from the end of the array for Test (hence the negative slices in the Test function). It's not a great way to do it. It also means you can't shuffle the data.
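For reference, the slicing described above looks roughly like this. It is a minimal sketch, not the project's actual code: the function names `current_train` and `current_test` are hypothetical, and plain lists stand in for the real arrays.

```python
# Hypothetical stand-ins for the existing Train and Test generators.

def current_train(pos, neg, n):
    """Take the first 2n positive and the first n negative samples."""
    return pos[:2 * n] + neg[:n]


def current_test(pos, neg, n):
    """Take the last 2n positive and the last n negative samples."""
    return pos[-2 * n:] + neg[-n:]


# Made-up data, just to show the behaviour.
pos = [f"pos{i}" for i in range(10)]
neg = [f"neg{i}" for i in range(10)]

train = current_train(pos, neg, n=2)
test = current_test(pos, neg, n=2)

# Calling again ("Regenerate") gives exactly the same samples.
train_again = current_train(pos, neg, n=2)
```

Nothing here is random, so every call returns the same points. The two sets only stay disjoint because they come from opposite ends of the array.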

The problem is that there are two separate functions and they don't know about each other. So how does the Test generator know which samples were used for Train? I think the Train generator has to pass back the indices of the Test set, which we can then pass to the Test generator.
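One way the index hand-off could work is sketched below. This is only an illustration of the idea, under assumptions: the names `make_train`, `make_test`, `held_out` and the `seed` parameter are hypothetical, and the standard-library `random` module is used in place of whatever the real generators use.

```python
import random


def make_train(pos, neg, n, seed=None):
    """Shuffle indices, return the train samples and the unused indices."""
    rng = random.Random(seed)
    pos_idx = list(range(len(pos)))
    neg_idx = list(range(len(neg)))
    rng.shuffle(pos_idx)
    rng.shuffle(neg_idx)
    train = [pos[i] for i in pos_idx[:2 * n]] + [neg[i] for i in neg_idx[:n]]
    # Everything not used for Train is available to Test.
    held_out = (pos_idx[2 * n:], neg_idx[n:])
    return train, held_out


def make_test(pos, neg, n, held_out):
    """Draw 2n positive and n negative samples from the held-out indices."""
    pos_idx, neg_idx = held_out
    if len(pos_idx) < 2 * n or len(neg_idx) < n:
        raise ValueError("not enough held-out samples for the Test set")
    return [pos[i] for i in pos_idx[:2 * n]] + [neg[i] for i in neg_idx[:n]]


# Made-up data, just to show the behaviour.
pos = [f"pos{i}" for i in range(10)]
neg = [f"neg{i}" for i in range(10)]

train, held_out = make_train(pos, neg, n=2, seed=42)
test = make_test(pos, neg, n=2, held_out=held_out)
```

With this shape, Regenerate would just call `make_train` again with a new seed (or none) and pass the fresh `held_out` on to `make_test`, so the two sets can never overlap even though the data are shuffled.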

This combines and closes agilescientific#15 and agilescientific#23.

@kwinkunks added the bug label on Jun 3, 2024