Broken Code in Section 5.3.1 #62

Open
kaybenleroll opened this issue Apr 21, 2019 · 5 comments
Comments

@kaybenleroll

The scraping code in Section 5.3.1 no longer works because most of the code in the tm.plugin.webmining package is out of date.

I tried switching the GoogleFinanceSource to YahooFinanceSource but that did not work either.

I am sure there are alternatives, but I figured it is best reported here first.

@juliasilge
Collaborator

juliasilge commented May 4, 2019

Thank you very much for this report! 🙌 I want to acknowledge it and let you know we are aware and looking for a replacement data source to use in the book.

Just to record it here, ideally we would want to find something that:

  • allows us to demonstrate how to tidy() a document-term matrix
  • is an appropriate use case for the Loughran and McDonald sentiment lexicon

This may be too high an ask, though, and we need to break these apart and integrate these two bits of information separately. @dgrtwo
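
For reference, here is a minimal sketch of what those two requirements look like in code, using a toy corpus in place of the scraped articles. The article text and the `toy_lexicon` table are invented for illustration; in the book the lexicon would come from `get_sentiments("loughran")`, which returns the same `word` and `sentiment` columns but needs a one-time download.

```r
library(tm)        # VCorpus(), VectorSource(), DocumentTermMatrix()
library(dplyr)     # inner_join(), count()
library(tidytext)  # tidy() method for document-term matrices

# Toy stand-in for scraped financial articles (invented text, not real data)
articles <- c(
  "strong profit growth despite litigation risk",
  "loss widened amid uncertainty and litigation"
)

corpus <- VCorpus(VectorSource(articles))
dtm <- DocumentTermMatrix(corpus)

# Requirement 1: tidy() turns the sparse matrix into one row per
# document-term pair, with columns document, term, and count
tidy_terms <- tidy(dtm)

# Hypothetical mini-lexicon shaped like get_sentiments("loughran");
# the word-to-sentiment assignments here are made up for the example
toy_lexicon <- data.frame(
  word = c("profit", "loss", "litigation", "uncertainty", "risk"),
  sentiment = c("positive", "negative", "litigious", "uncertainty", "uncertainty"),
  stringsAsFactors = FALSE
)

# Requirement 2: join the tidied terms to the lexicon and total the
# term counts per document and sentiment category
sentiment_counts <- tidy_terms %>%
  inner_join(toy_lexicon, by = c(term = "word")) %>%
  count(document, sentiment, wt = count)

print(sentiment_counts)
```

Any replacement data source only has to supply the `articles` step; everything after `DocumentTermMatrix()` stays the same.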

@kaybenleroll
Author

Not at all, Julia, happy to help! Let me know if you need anything with this. The book is really useful and has helped me a lot, so I am glad to contribute back. :)

@nattalides

nattalides commented Jan 25, 2020

Same issue. After a bit of searching, it looks like the Yahoo and Google services have been deprecated, so it is probably best to remove that bit.

@dgrtwo @juliasilge Do you think it would be better/easier to store a financial article dataset (as a Corpus/VCorpus/WebCorpus) in {tidytext} itself, removing the dependency on other packages? That would make it possible to demonstrate both of the bullet points you raised.

@DesmondChoy

> Thank you very much for this report! 🙌 I want to acknowledge it and let you know we are aware and looking for a replacement data source to use in the book.
>
> Just to record it here, ideally we would want to find something that:
>
>   • allows us to demonstrate how to tidy() a document-term matrix
>   • is an appropriate use case for the Loughran and McDonald sentiment lexicon
>
> This may be too high an ask, though, and we need to break these apart and integrate these two bits of information separately. @dgrtwo

How about companies' earnings call transcripts?
I stumbled upon a site that seems to provide these for free: https://news.alphastreet.com/ (Note: I'm not affiliated with them in any way)

@smmathews
Contributor

I've created a PR that explains the issue the reader is about to encounter. While this PR is still open and unresolved, it would probably be a good idea to acknowledge the issue in the text.
