Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Submit streaming code and examples #43

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

ghjdegithub
Copy link
Contributor

No description provided.

@ghjdegithub
Copy link
Contributor Author

@thewh1teagle Didn't see your comment

///
/// * `samples` - 样本数据
/// * `sample_rate` - 采样率
fn read_wave(filename: &str) -> (Vec<f32>, i32) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use internal function for that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use internal function for that

This is only for testing, and the code logic is not suitable for the framework

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's useful to keep the example short and simple as possible with only the relevant info to dive in quickly. That's why it's useful to use the internal function like in other examples and even i don't use clap in most of the examples when not necessarily

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This place is just to do safe packaging, not to do higher level packaging, and higher level to use those that were made before. The structs you made before could be structs that were wrapped with this safe.

//!
//! [sherpa-onnx]: https://github.com/k2-fsa/sherpa-onnx
//! [onnxruntime]: https://github.com/microsoft/onnxruntime
//! [Next-gen Kaldi]: https://github.com/k2-fsa/
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove all comments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

///
/// please refer to
/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html
/// to download pre-trained models
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove most of the comments in the files. you can add link to specific sherpa docs on top of the file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove most of the comments in the files. you can add link to specific sherpa docs on top of the file

Consider moving it to the entire crate document

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to avoid maintain documentation as possible. We'll handle that in another pr/issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

/// Configuration for online/streaming models
///
/// Please refer to
/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

pub(crate) _marker: PhantomData<T>,
}

pub trait State {}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use trait?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use trait?

Check out the latest code.
Simplify the online_stream parameters in online_recognizer.

@thewh1teagle
Copy link
Owner

@ghjdegithub
Maybe now you can see?

@ghjdegithub
Copy link
Contributor Author

@ghjdegithub

Maybe now you can see?

I saw it.

let (samples, sample_rate) = read_wave(&args.wave_file);

println!("Initializing recognizer (may take several seconds)");
let config = OnlineRecognizerConfig {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify configs with impl default like in other examples

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rust's default means that there can be a default case, but this configuration obviously doesn't. The best practice should be to provide a series of new methods, such as new_ transducer, new_paraformer, etc. default, there really isn't a particular default standard.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree but even using new methods for the structus will be too much expressive for irrelevant fields. so maybe it's worth using default and the examples will show how to use it correctly + sherpa-onnx will throw error with info

/// Configuration for online/streaming transducer models
///
/// Please refer to
/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to models should placed in examples. Keep the source code clean

/// Configuration for online/streaming paraformer models
///
/// Please refer to
/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove most of the comments keep tiny reference maybe just a link


impl OfflineRecognizer {
/// Frees the internal pointer of the recognition to avoid memory leak.
fn delete(&mut self) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove. You can free it directly in impl drop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a private method, and it is currently only used for drop.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the private delete methods and do the logic directly in drop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thewh1teagle I'd recommend keeping it as you might use this private method when writing unit tests.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we'll need the logic for unit tests we simply can call drop(...) that will call the drop implementation.

}

/// Free the internal pointer inside the recognizer to avoid memory leak.
fn delete(&mut self) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove but use in impl drop


impl OfflineStream {
/// Frees the internal pointer of the stream to avoid memory leak.
fn delete(&mut self) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove and keep in impl drop

}

/// Get the recognition result of the offline stream.
pub fn get_result(&self) -> Option<OfflineRecognizerResult> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return result instead of option with bail! Macro to emit errors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return result instead of option with bail! Macro to emit errors

Is there an err type for the appropriate result at the moment

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we use eyre::{Result, bail}
We can improve it in future in another PR though, I don't see real benefit of using custom errors in the crates at this stage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we use eyre::{Result, bail} We can improve it in future in another PR though, I don't see real benefit of using custom errors in the crates at this stage.

In rust practice, if you're a library that returns result, it's a good idea to define your own err and result types. This way, sub-documents can see their own error type and related descriptions. The essence is to facilitate the use of the caller.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll handle it later. see #46

@ghjdegithub
Copy link
Contributor Author

@thewh1teagle Can you just modify these PRs

@thewh1teagle
Copy link
Owner

@ghjdegithub
Please start with smaller, specific PRs from a dedicated branch in the future to make them easier to review and modify. I'll take a look at this one as soon as I have time. Thanks again for your contribution!

@ghjdegithub
Copy link
Contributor Author

@ghjdegithub Please start with smaller, specific PRs from a dedicated branch in the future to make them easier to review and modify. I'll take a look at this one as soon as I have time. Thanks again for your contribution!

Next time I used the feature/XXX branch to get it done, so I'll be lazy this time

@thewh1teagle
Copy link
Owner

Next time I used the feature/XXX branch to get it done, so I'll be lazy this time

Yes, and smaller even tiny PRs :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants