Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chap 4 suggestions #120

Open
salbert83 opened this issue Apr 10, 2021 · 3 comments · May be fixed by #115
Open

Chap 4 suggestions #120

salbert83 opened this issue Apr 10, 2021 · 3 comments · May be fixed by #115
Labels
medium-priority Grammar and typos errors. New feature

Comments

@salbert83
Copy link

Another excellent chapter! Some minor suggestions:

  1. in spam_predict, replace [wrd for wrd in email_words if wrd in vocabulary] with intersect(email_words, model.vocabulary). Also, I think the code on the webpage has a typo (vocabulary instead of model.vocabulary).
  2. Use log(probability) instead of probability to avoid numerical errors
  3. In spam_filter_accuracy, why record predictions? It is unused and not returned.

Thank you.

@salbert83
Copy link
Author

One more, replace
all_words_text = StringDocument(string([string(word, " ") for word in all_words]...))
with
all_words_text = StringDocument(join(all_words, " "))

Thanks

@salbert83
Copy link
Author

Also, think you're missing the line to load the file, something like
raw_df = DataFrame(CSV.File(email_path))

Thanks

@pefontana
Copy link
Collaborator

Hi @salbert83 !
Thanks for the suggestions!
We are currently working on an updated of chapter 4, so this comments will help us a lot
You are right in:

  • the raw_df = DataFrame(CSV.File(email_path)) line is missing
  • We have some typos errors
  • We are not using the predictions array

We are going to use the code suggestions!
In a few day we will upload the new chapter version and you will see the improvements.

Thank you very much!

@pefontana pefontana added the medium-priority Grammar and typos errors. New feature label Apr 12, 2021
@pefontana pefontana linked a pull request Apr 12, 2021 that will close this issue
pefontana added a commit that referenced this issue Apr 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
medium-priority Grammar and typos errors. New feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants