This gem came about as a result of issue #457.
We want to scrape lots of Twitter Lists that aggregate all the politicians in a single country (e.g. as used by Politwoops) as a way of discovering the Twitter handles for the legislators in that country.
Writing each of those scrapers currently requires a lot of boilerplate code (see for example https://github.com/everypolitician-scrapers/twitter-colombia-senate-list/blob/master/scraper.rb), so we want to factor that out into an installable gem so that the scrapers become only a line or two long each.
This not only makes them much simpler and easier to create in the first place, but also means that when we want to do something new with those scrapers, we should only need to update a single gem, rather than hundreds of scrapers that are all slightly out of sync with each other and need to be updated individually.
bin
: Executables
lib
: Sources
spec
: Tests
Add this line to your application's Gemfile:
gem 'twitter_list', git: 'https://github.com/everypolitician/twitter_list.git'
And then execute:
$ bundle install
You could also execute:
$ bin/setup
Just make sure that bin/setup has the right permissions (run $ chmod +x bin/setup if not).
This gem will scrape the Twitter list and return an array of hashes with the following information:
- Twitter id
- Twitter name
- Twitter handle
- Link to avatar picture
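To make the shape of that return value concrete, here is a minimal sketch of working with such an array of hashes. The key names (:id, :name, :handle, :image), the sample record, and the names_by_handle helper are all assumptions made for illustration; check the gem's actual output for the real key names.

```ruby
# Illustrative only: the keys and values below are placeholders,
# not real output from the gem.
people = [
  {
    id: '1',
    name: 'Example Person',
    handle: 'example_person',
    image: 'https://example.com/avatar.png'
  }
]

# Hypothetical helper: build a lookup from Twitter handle to display name.
def names_by_handle(people)
  people.each_with_object({}) do |person, lookup|
    lookup[person[:handle]] = person[:name]
  end
end

lookup = names_by_handle(people)
puts lookup.inspect
```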
First, generate new Twitter credentials at https://apps.twitter.com/. You can then pass them either as four separate tokens or as a single string that contains all four.
twitter_list = TwitterList::Scraper.new(
  consumer_key: TWITTER_CONSUMER_KEY,
  consumer_secret: TWITTER_CONSUMER_SECRET,
  access_token: TWITTER_ACCESS_TOKEN,
  access_token_secret: TWITTER_ACCESS_TOKEN_SECRET
)
people = twitter_list.people('lechinoise', 'politic-arg')
You could also set a single variable to pass the previous four credentials at once. If you do so, separate them with a pipe (e.g. CREDENTIAL1|CREDENTIAL2|CREDENTIAL3|CREDENTIAL4) and make sure they are in the right order: consumer key, consumer secret, access token, access token secret.
twitter_list = TwitterList::Scraper.new(
  twitter_tokens: "#{TWITTER_CONSUMER_KEY}|#{TWITTER_CONSUMER_SECRET}|#{TWITTER_ACCESS_TOKEN}|#{TWITTER_ACCESS_TOKEN_SECRET}"
)
people = twitter_list.people('lechinoise', 'politic-arg')
Caveat: Make sure that your tokens don't include the pipe symbol!
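The reason for this caveat is easiest to see in code. The following is a minimal sketch of how a combined string could be split back into four credentials; split_twitter_tokens and CREDENTIAL_KEYS are hypothetical names, and the gem's own parsing may well differ.

```ruby
# Hypothetical helper, not part of the gem: split a combined token
# string into the four named credentials, in the order shown above.
CREDENTIAL_KEYS = %i[
  consumer_key consumer_secret access_token access_token_secret
].freeze

def split_twitter_tokens(tokens)
  # A limit of -1 keeps trailing empty fields, so "a|b|c|" is rejected below.
  parts = tokens.split('|', -1)
  unless parts.size == CREDENTIAL_KEYS.size && parts.none?(&:empty?)
    raise ArgumentError,
          "expected #{CREDENTIAL_KEYS.size} non-empty pipe-separated values, got #{parts.size}"
  end
  CREDENTIAL_KEYS.zip(parts).to_h
end

credentials = split_twitter_tokens('key|secret|token|token_secret')
puts credentials[:access_token]
```

A token that itself contains a pipe would produce a fifth field, so in this sketch it is rejected rather than silently misassigned.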
After checking out the repo, run bin/setup to install dependencies. Then run bundle exec rspec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
$ bundle exec rspec
Note: The test suite uses VCR to record the HTTP requests to Twitter, so that it can test against actual Twitter responses. If you want to re-record the VCR cassettes, you will have to set the TWITTER_TOKENS variable in your environment, since it is used in the spec_helper.rb file. You only need to set the environment variable if you're recording new cassettes from real Twitter responses.
For example:
$ export TWITTER_TOKENS=replace_with_twitter_tokens
To learn more about how the Twitter credentials (and in particular this variable) are set, check out the section "How to use it" above.
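As a rough sketch of how a spec helper might read this variable, the snippet below uses the real value when it is set and a placeholder otherwise, on the assumption that replaying existing cassettes does not need valid credentials. The twitter_tokens helper and PLACEHOLDER_TOKENS constant are hypothetical; the project's spec_helper.rb may handle this differently.

```ruby
# Hypothetical helper, not taken from the project's spec_helper.rb.
# Placeholder used when only replaying previously recorded cassettes.
PLACEHOLDER_TOKENS = 'key|secret|token|token_secret'

# Return the tokens from the environment, or the placeholder when the
# variable is unset or empty. `env` defaults to the process environment
# and can be replaced with a plain Hash when testing.
def twitter_tokens(env = ENV)
  value = env['TWITTER_TOKENS']
  value.nil? || value.empty? ? PLACEHOLDER_TOKENS : value
end

puts twitter_tokens('TWITTER_TOKENS' => 'a|b|c|d')
```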
Bug reports and pull requests are welcome on GitHub at https://github.com/everypolitician/twitter_list.