Voice recognition (fixes #58) #59

Mikolaj · 2022-07-05T07:10:38Z

Implements #58. An attempt to recognize to which person a voice belongs in a given window of a sound file. Uses RNN.

The tests to run are: cabal test extremelyLongTest --enable-optimization -f test_seq --test-options='-p "Speech RNN"'. Example data files are not yet included.
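
As a rough illustration of the windowing idea (not the code in this PR), the sketch below splits a stream of samples into fixed-size windows and then groups consecutive windows into blocks; the commit list mentions classifying blocks of windows rather than lone windows. The names chunksOf and blocksOfWindows, the sizes used, and the choice to drop an incomplete tail are all assumptions made for the example.

```haskell
module Main (main) where

-- | Split a list into consecutive chunks of exactly n elements.
-- An incomplete trailing chunk is dropped (an assumption of this sketch).
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0 = []
  | otherwise = go xs
  where
    go ys =
      let (chunk, rest) = splitAt n ys
      in if length chunk < n then [] else chunk : go rest

-- | Hypothetical helper: cut samples into windows of windowSize samples,
-- then group the windows into blocks of blockSize windows.
blocksOfWindows :: Int -> Int -> [Float] -> [[[Float]]]
blocksOfWindows windowSize blockSize =
  chunksOf blockSize . chunksOf windowSize

main :: IO ()
main = do
  -- Ten made-up samples; real data would come from the sound files.
  let samples = map fromIntegral [1 .. 10 :: Int]
  print (blocksOfWindows 2 2 samples)
  -- prints [[[1.0,2.0],[3.0,4.0]],[[5.0,6.0],[7.0,8.0]]]
```

With ten samples, windows of two samples, and blocks of two windows, the fifth window has no partner and is discarded, so two blocks remain. Each block would then carry a single label for the speaker.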


blackhole64 · 2023-02-17T15:08:40Z

What is the status of this PR?

Mikolaj · 2023-02-17T15:13:03Z

I'm afraid it's very outdated, and the new tools it should use in place of the old ones are not ready.

blackhole64 · 2023-02-17T15:14:40Z

Which tools? Is there a PR/issue for those?

Mikolaj · 2023-02-17T15:33:10Z

To be frank, it's not yet clear what those tools should be. Benchmarks will show whether more work is needed, but first the existing benchmarks need to be ported from the old API and new ones created. The new API is not finished yet, though. Once it's finished in its current form, ideally we'd add a shaped version, not only a ranked one, but this may be too hard to do, at least initially. Once we decide on that and implement it, benchmarking is the next step.

Mikolaj added 16 commits July 5, 2022 19:18

Mock up TestSpeechRNN copy-pasting from TestMnistRNN

f937d94

Actually load some sound and label files

b3a1e45

Simplify deserialization thanks to len at file start

dd655f3

Sanity check min and max of the files additionally

502cbb5

Classify blocks of windows, not lone windows

6c6f087

Copy-paste more of Mnist RNN machinery

a1c337e

Define mapDomains

e5cb42e

Use mapDomains to obtain initial Float parameters

ff2c4fc

Expose a type error (WIP)

86b66f1

Fix the type error, which also makes the test not crash

af19dbe

Fix GHC warnings

1f40df0

Get back to 0/1 labels to make SoftMaxCrossEntropy optimization valid

e7d247b

Don't fail tests if speech files don't exist

88a4273

Use separate training data in tests

2168961

Fix trivializing the model by having only one label

f646172

Don't make it too hard varying block size between training and testing data

f5ce66a

Mikolaj force-pushed the voice-recognition branch from 7ed4e58 to f5ce66a July 5, 2022 22:27

Mikolaj added 2 commits July 6, 2022 00:51

Really don't fail tests if speech files don't exist

c218b10

Prevent haddock errors

2557b26
