Sunday, February 21, 2016

Using twitter from the R console

Lately I have been messing around with R, and I decided to check out the twitteR package to see if I could post from the R console. In order to use Twitter from the R console, we need a couple of things:


  • Set up OAuth authentication for Twitter
  • Install the twitteR package

Set up OAuth authentication for Twitter
As of March 2013, OAuth authentication is required for all Twitter transactions. If you don't already have an OAuth setup, head over to Twitter here: https://apps.twitter.com/app/new

Follow the instructions. Once you are done, you will see the following 4 items:

Consumer Key (API Key)
Consumer Secret (API Secret)

Access Token
Access Token Secret


Install the twitteR package
Now, in your R program, install the twitteR package, for example by running install.packages("twitteR").
Once the package is installed, it is time to get busy...


Load the package by executing the following command:


library(twitteR)


Now it is time to set up authentication. You do that with the setup_twitter_oauth command. Below is an example; make sure to replace the keys and tokens with the values you got back when you set up OAuth on Twitter.



setup_twitter_oauth("API key", "API secret", "Access token", "Access secret")
[1] "Using direct authentication"
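
One way to avoid pasting the keys straight into a script is to read them from environment variables. This is only a sketch: the variable names (TWITTER_CONSUMER_KEY and so on) and the two helper functions are made up for this example and are not part of twitteR. Only setup_twitter_oauth comes from the package.

```r
# Hypothetical environment variable names; set them in ~/.Renviron or in the shell.
read_twitter_credentials <- function() {
  vars <- c(consumer_key    = "TWITTER_CONSUMER_KEY",
            consumer_secret = "TWITTER_CONSUMER_SECRET",
            access_token    = "TWITTER_ACCESS_TOKEN",
            access_secret   = "TWITTER_ACCESS_SECRET")
  creds <- Sys.getenv(vars, unset = NA)

  # Fail early with a readable message if anything is unset or empty
  not_set <- vars[is.na(creds) | creds == ""]
  if (length(not_set) > 0) {
    stop("Missing environment variables: ", paste(not_set, collapse = ", "))
  }

  # Return a list keyed by the short names (consumer_key, consumer_secret, ...)
  stats::setNames(as.list(unname(creds)), names(vars))
}

# Hypothetical wrapper; calling it needs the twitteR package and valid keys.
authenticate_twitter <- function() {
  creds <- read_twitter_credentials()
  twitteR::setup_twitter_oauth(creds$consumer_key, creds$consumer_secret,
                               creds$access_token, creds$access_secret)
}
```

With the four variables set, calling authenticate_twitter() should do the same thing as the hard-coded call above, without the secrets ending up in the script.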

If that is all set, we can send a tweet. To update your Twitter status, you can use the updateStatus command. It is very simple to use: you pass your status text into the function. Here is what it looks like from the console

updateStatus('testing Tweeting with twitterR package from witin Revolution R Enterprise')
[1] "DenisGobo: testing Tweeting with twitterR package from witin Revolution R Enterprise"

Here is what it looks like on Twitter


Of course, nobody is doing all of this just to update their status. The reason I am playing around with this is that I want to do Twitter searches and then store the results in a file or database. So let's do a simple search for the tag #rstats, and let's also limit the search to return only 6 results

tweets <- searchTwitter('#rstats', n=6) 
tweets

Here is what we got back. As you can see, some of the results end in …; those have been truncated

[[1]]
[1] "psousa75: RT @rquintino: @Mairos_B #sqlsatportugal session: all about R in #SqlServer 2016 #rstats https://t.co/DHrqIZrz1e"

[[2]]
[1] "millerdl: a quick script to use imgcat in #rstats https://t.co/fpUlgWNX33 https://t.co/AhCCMLewCH"

[[3]]
[1] "diana_nario: RT @KirkDBorne: Useful packages (libraries) for Data Analysis in R: https://t.co/haRKopFyly #DataScience #Rstats by @analyticsvidhya https:…"

[[4]]
[1] "emjonaitis: Hey #rstats tweeps, do you have any readings to recommend on sensitivity analysis? Books/articles/websites all welcome."

[[5]]
[1] "caryden: RT @KirkDBorne: A Complete Tutorial on Time Series Modeling in R: https://t.co/7oI6JKyU4E #MachineLearning #DataScience #Rstats by @Analyti…"

[[6]]
[1] "ArkangelScrap: RT @KirkDBorne: A Complete Tutorial on Time Series Modeling in R: https://t.co/7oI6JKyU4E #MachineLearning #DataScience #Rstats by @Analyti…"


What I really want is to convert the output to a data frame. Luckily, the twitteR package has this built in: you can use twListToDF. Here is how to do that

tweets <- searchTwitter('#rstats', n=6) 
twListToDF(tweets)

The output now has a lot more in it: you can see whether a tweet has been retweeted or favorited, as well as the latitude, longitude and more


1                             RT @rquintino: @Mairos_B #sqlsatportugal session: all about R in #SqlServer 2016 #rstats https://t.co/DHrqIZrz1e
2                                                      a quick script to use imgcat in #rstats https://t.co/fpUlgWNX33 https://t.co/AhCCMLewCH
3 RT @KirkDBorne: Useful packages (libraries) for Data Analysis in R: https://t.co/haRKopFyly #DataScience #Rstats by @analyticsvidhya https:…
4                      Hey #rstats tweeps, do you have any readings to recommend on sensitivity analysis? Books/articles/websites all welcome.
5 RT @KirkDBorne: A Complete Tutorial on Time Series Modeling in R: https://t.co/7oI6JKyU4E #MachineLearning #DataScience #Rstats by @Analyti…
6 RT @KirkDBorne: A Complete Tutorial on Time Series Modeling in R: https://t.co/7oI6JKyU4E #MachineLearning #DataScience #Rstats by @Analyti…
  favorited favoriteCount replyToSN             created truncated replyToSID
1     FALSE             0        NA 2016-02-20 20:29:54     FALSE         NA
2     FALSE             0        NA 2016-02-20 20:24:50     FALSE         NA
3     FALSE             0        NA 2016-02-20 20:16:25     FALSE         NA
4     FALSE             0        NA 2016-02-20 20:11:08     FALSE         NA
5     FALSE             0        NA 2016-02-20 20:11:06     FALSE         NA
6     FALSE             0        NA 2016-02-20 20:02:05     FALSE         NA
                  id replyToUID
1 701141750161784834         NA
2 701140474019577856         NA
3 701138356466483204         NA
4 701137026075140096         NA
5 701137018508722176         NA
6 701134750296227840         NA
                                                                            statusSource
1                Mobile Web (M5)
2 Tweetbot for Mac
3   Twitter for Android
4                     Twitter Web Client
5     Twitter for iPhone
6                     Twitter Web Client
     screenName retweetCount isRetweet retweeted longitude latitude
1      psousa75            3      TRUE     FALSE        NA       NA
2      millerdl            0     FALSE     FALSE        NA       NA
3   diana_nario           50      TRUE     FALSE        NA       NA
4    emjonaitis            0     FALSE     FALSE        NA       NA
5       caryden           41      TRUE     FALSE        NA       NA
6 ArkangelScrap           41      TRUE     FALSE        NA       NA


Now that we have a data frame, let's dump it into a CSV file. Below is the command to write the output to a CSV file

write.csv(twListToDF(tweets), file = "c:/temp/Tweets.csv")
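
Two details are worth knowing if you plan to read the file back into R. write.csv adds a column of row numbers unless you pass row.names = FALSE, and the tweet ids have 18 digits, which is more than a double can hold exactly, so they are safer read back as text. The sketch below uses a tiny stand-in data frame (two ids and screen names from the output above) and writes to R's temporary directory instead of c:/temp; both are assumptions made so the example runs anywhere.

```r
# Stand-in for twListToDF(tweets); ids kept as character strings
tweets_df <- data.frame(
  id         = c("701141750161784834", "701140474019577856"),
  screenName = c("psousa75", "millerdl"),
  stringsAsFactors = FALSE
)

# Portable path in the session's temp directory (assumption; use any folder you like)
csv_path <- file.path(tempdir(), "Tweets.csv")

# row.names = FALSE leaves out the extra column of row numbers
write.csv(tweets_df, file = csv_path, row.names = FALSE)

# Read the id column back as text so the long ids are not rounded
read_back <- read.csv(csv_path, colClasses = c(id = "character"),
                      stringsAsFactors = FALSE)
print(read_back)
```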


Here is what it looks like if you open the csv file in Excel





As you can see, each column is filled with the correct data. How about, instead of writing to a CSV file, we write the data into a database? That is pretty easy as well; we need the RODBC package to accomplish that. You can see that post here: How to store twitter search results from R into SQL Server



Monday, February 15, 2016

Started a Today I learned project on GitHub



I started my own Today I learned project on GitHub



Today I learned

A collection of concise write-ups on small things I learn day to day across a variety of languages and technologies. These are things that don't really warrant a full blog post. Idea stolen from jbranchaud/til


You can find that project here https://github.com/SQLMenace/til
The reason I did this is that it gives me an opportunity to use GitHub: all my stuff is usually backend SQL Server code, and I don't do any web or app programming. What I like about the Today I Learned project is that you can easily see everything you have learned over time. I will probably mostly add R and PowerShell items in the foreseeable future, since I am messing mostly with R on my own time and PowerShell at work. Once I start diving deeper into SQL Server 2016, I will probably add that stuff to my Today I Learned GitHub project as well.

What do you think... cool idea or not?


Again, you can find that project here https://github.com/SQLMenace/til

Monday, January 4, 2016

How I will learn Chinese

I decided to write down what I will use to learn Chinese in the coming weeks. Here are some of the things that I am currently doing and some things I will be doing in the near future.

Watch videos on YouTube
There are a bunch of videos I have already watched, and a bunch more that I have bookmarked. Here is one of the videos that I have already watched several times


I like Dani Wang's videos and will watch all of them shortly.

Here is a video that I have bookmarked and will watch this week. It is by Yangyang Cheng: a Google Hangout showing the most effective way to learn Mandarin tones and tone pairs



Listen to Podcasts
I downloaded the Learn Chinese | ChineseClass101.com podcast

I have already listened several times to Lesson #1 - What's Your Name in Chinese? It is actually fun and quite interesting. I listen to most podcasts at 1.5 speed, but I always have to remind myself to set it back to 1.0 speed for this language podcast


Visit websites
There are a bunch of sites that I have bookmarked. I have read some of these already; others are bookmarked because they were listed on the sites I did read, but I have not had enough time to read them all yet. Here is just a small list for you to check out.

Wikihow
How to Learn Mandarin Chinese

Tim Ferriss, 4 hour workweek blog
How to Learn Any Language in 3 Months
How to Learn (But Not Master) Any Language in 1 Hour
12 Rules for Learning Foreign Languages in Record Time — The Only Post You’ll Ever Need

BBC
Real Chinese - For starters
Quick Fix - Essential phrases in Chinese

Lingholic.com
How to learn Chinese

Semanda.com
Printable PDF Mandarin Flashcards


Books
CHINESE in 10 minutes a day
I have the Italian version of this book. It contains the following:

  • 132-page illustrated workbook
  • Full color throughout
  • Organized in 25 easy steps, by essential categories
  • 150 Sticky Labels for home and office
  • Ready-made Flash Cards


This is a real beginner's book and is fun to learn from. It will arrive by the end of this week, so until then I will be doing all the other stuff I mentioned in this post.

Movies
Watch a movie in Chinese and try to understand it. I own only 2 movies that are in Chinese: Hero and Crouching Tiger, Hidden Dragon. Luckily for me, my local library has a lot of Chinese movies. Here is just a small example of what they have; there are several of these cases filled with movies.


I won't get to the movies for a couple of months. At this stage there is no point: I would probably recognize 5 words in total. But I will try to watch my first movie in April or so and will let you know how that goes.