Session 6: Reading data from Twitter


1. Concept:

  • Tweets will be read from Twitter using Twython, a Python wrapper for Twitter’s application programming interface (API). You will learn how to use Twython to import tweets into a list.


2. Background/Overview

  • What you will refresh and learn in this session: lists, for loops, making function calls, and passing parameters to functions.


3. Lesson Plan

  • Download the file Sampler.py
  • Download the file Streamer.py
  • Before you start!

    • Each group will be assigned their own Twitter API keys, so that you don't have to worry about rate limits per key. Click here to get your group's keys
  • Save both Sampler.py and Streamer.py into your project, then right-click on the project name and click “Synchronize ‘Project name’”. Both files should now be listed under the project name.
  • Before we dive into the code:
    • Twitter’s REST APIs give programs access to read and write Twitter data. You can write a new tweet, read an author’s profile, read follower data, and more. In our case, we are using the API to read tweets from selected hashtags.
    • Twython is a Python wrapper for Twitter’s API. It takes the JSON (JavaScript Object Notation) that the API sends back and returns a Python object that is easy to use and navigate with simple Python commands.
    • We have created the Twitter App Key, App Key Secret, Access Token, and Access Token Secret for you. These strings are supplied by Twitter. Anyone with a Twitter account can generate their own tokens and keys by going to https://dev.twitter.com/apps and creating an application.
    • The variable ‘search’ is a comprehensive list of every piece of information that can be gathered from each tweet. This includes the time it was made, hashtags in the tweet, the user who tweeted it, and more. Because this list is not only hard to traverse, but extremely lengthy, we create another list called ‘tweets’ which only contains the information we need for our program to work. 
      • Any hashtag can be used, but keep in mind that certain hashtags (such as trending topics) will have much more recent tweets available than other hashtags. 
      • The count can range from 0 to 100. 
    • For every tweet in the ‘tweets’ list, we print the tweet ID (author) and the actual tweet text.
  • Right-click on the file name tab at the top, and click “Run <filename>”. You will see tweets print on your console! Any app that you use that imports tweets uses Twitter’s API to do so!
  • Reinforcement activities (below)
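
The step of trimming ‘search’ down to a smaller ‘tweets’ list can be sketched as follows. This is only an illustration, not the contents of Sampler.py: the sample dictionary is made up, its field names assume the layout of Twitter's v1.1 search response, and the helper name simplify is hypothetical. The Twython call appears only in a comment because it needs your group's real keys.

```python
# Hypothetical sample shaped like a Twitter v1.1 search response (assumption).
# With real keys, a similar dictionary would come from something like:
#   from twython import Twython
#   twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
#   search = twitter.search(q="#python", count=10)
search = {
    "statuses": [
        {
            "id_str": "1001",
            "created_at": "Mon Jun 01 12:00:00 +0000 2015",
            "text": "Learning lists in #python today",
            "user": {"screen_name": "student_a"},
            "entities": {"hashtags": [{"text": "python"}]},
        },
        {
            "id_str": "1002",
            "created_at": "Mon Jun 01 12:05:00 +0000 2015",
            "text": "No hashtags here",
            "user": {"screen_name": "student_b"},
            "entities": {"hashtags": []},
        },
    ]
}


def simplify(search_result):
    """Keep only the fields this lesson needs from each tweet (hypothetical helper)."""
    tweets = []
    for status in search_result["statuses"]:
        tweets.append({
            "id": status["id_str"],
            "author": status["user"]["screen_name"],
            "text": status["text"],
            # Collect just the hashtag words, without the '#' sign.
            "hashtags": [tag["text"] for tag in status["entities"]["hashtags"]],
        })
    return tweets


tweets = simplify(search)
for tweet in tweets:
    print(tweet["id"], tweet["author"], tweet["text"])
```

Each entry in ‘tweets’ is a small dictionary, so checking whether a tweet has any hashtags is just a matter of testing whether tweet["hashtags"] is empty.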


4. Reinforcement Activities: 

  • Download the file example1.py
    • 0. Our goal is to print tweets and figure out who tweets with hashtags. Follow the instructions in the comments (#).
    • 1. Create a Reader object
    • 2. Retrieve tweets
    • 3. Find out whether each tweet contains a hashtag
    • 4. Find the author of each tweet that contains a hashtag
  • Download the file example2.py
    • 0. Our goal is to practice making function calls to retrieve tweets. Follow the instructions below.
    • 1. Create objects
    • 2. Get tweets containing the text "Summer"
    • 3. Get tweets containing the text "Summer", with results returned in 100 seconds
    • 4. Get 5 tweets containing the text "rockies"
    • 5. Compare your results with your peer and see if you get the same tweets
    • 6. Get 10 tweets and check if any of them contain hashtags
    • 7. Compare your results with your peer and see if you get the same tweets

    • 8. Get tweets containing the text "pizza" and count the number of tweets received
    • 9. Get tweets containing the text "salad" and count the number of tweets received
    • 10. Get tweets containing the text "burger" and count the number of tweets received
    • 11. Calculate the percentage of each item out of the total number of tweets for the three items combined: Percentage of pizza = number of tweets (pizza) / number of tweets (pizza + salad + burger) × 100
    • 12. Print the rank of the items
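
The percentage and ranking steps above can be sketched in a few lines. This is a minimal illustration, not the solution to example2.py: the counts are made-up numbers, and the function names percentages and rank are hypothetical. In the activity, each count would be the number of tweets you actually received for that word.

```python
def percentages(counts):
    """Return each item's share of the combined total, as a percentage."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero when no tweets were received at all.
        return {item: 0.0 for item in counts}
    return {item: 100.0 * n / total for item, n in counts.items()}


def rank(counts):
    """Return the items ordered from most tweets to fewest."""
    return sorted(counts, key=counts.get, reverse=True)


# Hypothetical tweet counts for illustration only.
counts = {"pizza": 50, "salad": 20, "burger": 30}

shares = percentages(counts)
for position, item in enumerate(rank(counts), start=1):
    print(position, item, round(shares[item], 1))
```

Storing the counts in a dictionary keeps each word next to its number, so adding a fourth food item only requires one more entry.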


5. Solutions


6. Reporting activities:

  • On a piece of paper, answer the following questions: 
      • In the programming language Java, each list (aka array) can contain only one data type. Python has no such restriction. Give an example of a scenario where not having this restriction would be useful.
      • List any facts/ideas you have learned in this session that you believe will be useful to remember for your future in programming.
      • Explain our reinforcement activity, such as how we stored the word bank.