twitter – DECISION STATS

How to make inforgraphics easily- Use Infogr.am

What is an Infographic?

http://en.wikipedia.org/wiki/Infographic

Information graphics or infographics are graphic visual representations of information, data or knowledge intended to present complex information quickly and clearly.^[1]^[2] They can improve cognition by utilizing graphics to enhance the human visual system’s ability to see patterns and trends.^[3]^[4] The process of creating infographics can be referred to as data visualization, information design, or information architecture.^[2]

What is Infogr.am?

It Create infographics and interactive online charts. It’s free and super-easy!

How?

Step 1

Step 2

Create

Step 3

Choose New Infographic or New Chart?

Step 4

Create using options-you can edit the table, figures, text, colors etc

That’s it

Use Infogr.am to make inforgraphics easily!

Jetstrap for builiding websites with Twitter Bootstrap

Twitter Bootstrap is a free collection of tools for creating websites and web applications. It contains HTML and CSS-based design templates for typography, forms, buttons, charts, navigation and other interface components, as well as optional JavaScript extensions.

It is the most popular project in GitHub^[2] and is used by NASA and MSNBC among others.

———————-

If you like me, hate to get down and dirty in HTML, CSS , JQuery ( not mentioning the excellent Code Academy HTML/CSS tutorials and JQuery Track ) and want to create a pretty simple website for yourself- Jetstrap helps you build the popular Twitter Bootstrap design (very minimalistic) for websites.

And it’s free! And click and point and paste your content- and awesome CSS, HTML. Allows you to download the HTML to paste in your existing site!

Here is one I created in 5 minutes!

So lose your old website! Because not every website needs WordPress!

Try Jetstrap for Bootstrap!

Download all your tweets

Now that the Government of the United States of America has the legal power to request your information without a warrant (The Chinese love this!)

Anyways- you can also download your own twitter data. Liberate your data.

Have you looked at your own data? Go there at https://twitter.com/settings/account and review the changes.

How to be a Happy Hacker

I write on and off on hackers (see http://bit.ly/VWxSvP) and even some poetry on them (http://bit.ly/11RznQl) . During meetups, conferences, online discussions I run into them, I have interviewed them , and I have trained some of them (in analytics). Based on this decade long experience of observing hackers, and two decade long experience of hanging out with them- some thoughts on making you a better hacker, and a happier hacker even if you are a hacker activist or a hacker in enterprise software.

1) Everybody can be a hacker, but you need to know the basic attitude first. Not every Python or Java coder is a hacker. Coding is not hacking. More details here- https://decisionstats.com/2012/02/12/how-to-learn-to-be-a-hacker-easily/

2) Use tools like Coursera, Udacity, Codeacdemy to learn new languages. Even if you dont have the natural gift for memorizing syntax, some of it helps. (I forget syntax quite often. I google)

3) Learn tools like Metasploit if you want to learn the lucrative and romantic art of exploits hacking (http://www.offensive-security.com/metasploit-unleashed/Main_Page). The demand for information security is going to be huge. hackers with jobs are happy hackers.

4) Develop a serious downtime hobby.

Lets face it- your body was not designed to sit in front of a computer for 8 hours. But being a hacker will mean that commitment and maybe more.

Continue reading “How to be a Happy Hacker”

httR by Hadley #rstats

The awesome Hadley Wickham has just released the next version of httr package. Prof Hadley is currently on leave from Rice Univ and working with the tremendous geeks at R Studio . New things in the httr package-

http://blog.rstudio.org/2012/10/14/httr-0-2/

httr, a package designed to make it easy to work with web APIs. Httr is a wrapper around RCurl, and provides:

functions for the most important http verbs: GET, HEAD, PATCH, PUT, DELETE and POST.
support for OAuth 1.0 and 2.0. Use oauth1.0_token and oauth2.0_token to get user tokens, and sign_oauth1.0 and sign_oauth2.0to sign requests. The demos directory has six demos of using OAuth: three for 1.0 (linkedin, twitter and vimeo) and three for 2.0 (facebook, github, google).

I especially like the OAuth functionality as I occasionaly got flummoxed with existing R OAuth packages , and this should hopefully lead to awesome new social media analytics posts by the larger R blogger community. Also given the fact that unauthenticated API requests to Twitter are greatly expanded by OAuth authenticated requests- (see https://dev.twitter.com/docs/rate-limiting )

Unauthenticated calls are permitted 150 requests per hour. Unauthenticated calls are measured against the public facing IP of the server or device making the request.
OAuth calls are permitted 350 requests per hour and are measured against the oauth_token used in the request.

some creative use cases should see an incredible amount of cross social media analysis (not just one social media channel ) at a time.

R for Social Media Analytics ? Watch this space.. 😉

New version of httr: 0.2 (rstudio.org)

Understanding OAuth 1.0 for #rstats

The lovely lovely diagram at https://developer.linkedin.com/documents/oauth-overview is worth a thousand words and errors.

Very useful if you are trying to coax rCurl to do the job for you.

Credits-Idan Gazit

Also a great slideshare in Japanese (no! Google Translate didnt work on pdf’s and slideshares and scribds (why!!) but still very lucid on using OAuth with R for Twitter.

Why use OAuth- you get 350 calls per hour for authenticated sessions than 150 calls .

I tried but failed using registerTwitterOAuth

There is a real need for a single page where you can go and see which social netowork /website is using what kind of oAuth, which url within that website has your API keys, and the accompanying R Code for the same. Google Plus,LinkedIn, Twitter, Facebook all can be scraped better by OAuth. Something like this-

R OAuth for twitteR from Tor Ozaki

Interview John Myles White , Machine Learning for Hackers

Here is an interview with one of the younger researchers and rock stars of the R Project, John Myles White, co-author of Machine Learning for Hackers.

Ajay- What inspired you guys to write Machine Learning for Hackers. What has been the public response to the book. Are you planning to write a second edition or a next book?

John-We decided to write Machine Learning for Hackers because there were so many people interested in learning more about Machine Learning who found the standard textbooks a little difficult to understand, either because they lacked the mathematical background expected of readers or because it wasn’t clear how to translate the mathematical definitions in those books into usable programs. Most Machine Learning books are written for audiences who will not only be using Machine Learning techniques in their applied work, but also actively inventing new Machine Learning algorithms. The amount of information needed to do both can be daunting, because, as one friend pointed out, it’s similar to insisting that everyone learn how to build a compiler before they can start to program. For most people, it’s better to let them try out programming and get a taste for it before you teach them about the nuts and bolts of compiler design. If they like programming, they can delve into the details later.

We once said that Machine Learning for Hackers is supposed to be a chemistry set for Machine Learning and I still think that’s the right description: it’s meant to get readers excited about Machine Learning and hopefully expose them to enough ideas and tools that they can start to explore on their own more effectively. It’s like a warmup for standard academic books like Bishop’s.

The public response to the book has been phenomenal. It’s been amazing to see how many people have bought the book and how many people have told us they found it helpful. Even friends with substantial expertise in statistics have said they’ve found a few nuggets of new information in the book, especially regarding text analysis and social network analysis — topics that Drew and I spend a lot of time thinking about, but are not thoroughly covered in standard statistics and Machine Learning undergraduate curricula.

I hope we write a second edition. It was our first book and we learned a ton about how to write at length from the experience. I’m about to announce later this week that I’m writing a second book, which will be a very short eBook for O’Reilly. Stay tuned for details.

Ajay- What are the key things that a potential reader can learn from this book?

John- We cover most of the nuts and bolts of introductory statistics in our book: summary statistics, regression and classification using linear and logistic regression, PCA and k-Nearest Neighbors. We also cover topics that are less well known, but are as important: density plots vs. histograms, regularization, cross-validation, MDS, social network analysis and SVM’s. I hope a reader walks away from the book having a feel for what different basic algorithms do and why they work for some problems and not others. I also hope we do just a little to shift a future generation of modeling culture towards regularization and cross-validation.

Ajay- Describe your journey as a science student up till your Phd. What are you current research interests and what initiatives have you done with them?

John-As an undergraduate I studied math and neuroscience. I then took some time off and came back to do a Ph.D. in psychology, focusing on mathematical modeling of both the brain and behavior. There’s a rich tradition of machine learning and statistics in psychology, so I got increasingly interested in ML methods during my years as a grad student. I’m about to finish my Ph.D. this year. My research interests all fall under one heading: decision theory. I want to understand both how people make decisions (which is what psychology teaches us) and how they should make decisions (which is what statistics and ML teach us). My thesis is focused on how people make decisions when there are both short-term and long-term consequences to be considered. For non-psychologists, the classic example is probably the explore-exploit dilemma. I’ve been working to import more of the main ideas from stats and ML into psychology for modeling how real people handle that trade-off. For psychologists, the classic example is the Marshmallow experiment. Most of my research work has focused on the latter: what makes us patient and how can we measure patience?

Ajay- How can academia and private sector solve the shortage of trained data scientists (assuming there is one)?

John- There’s definitely a shortage of trained data scientists: most companies are finding it difficult to hire someone with the real chops needed to do useful work with Big Data. The skill set required to be useful at a company like Facebook or Twitter is much more advanced than many people realize, so I think it will be some time until there are undergraduates coming out with the right stuff. But there’s huge demand, so I’m sure the market will clear sooner or later.

The changes that are required in academia to prepare students for this kind of work are pretty numerous, but the most obvious required change is that quantitative people need to be learning how to program properly, which is rare in academia, even in many CS departments. Writing one-off programs that no one will ever have to reuse and that only work on toy data sets doesn’t prepare you for working with huge amounts of messy data that exhibit shifting patterns. If you need to learn how to program seriously before you can do useful work, you’re not very valuable to companies who need employees that can hit the ground running. The companies that have done best in building up data teams, like LinkedIn, have learned to train people as they come in since the proper training isn’t typically available outside those companies.

Of course, on the flipside, the people who do know how to program well need to start learning more about theory and need to start to have a better grasp of basic mathematical models like linear and logistic regressions. Lots of CS students seem not to enjoy their theory classes, but theory really does prepare you for thinking about what you can learn from data. You may not use automata theory if you work at Foursquare, but you will need to be able to reason carefully and analytically. Doing math is just like lifting weights: if you’re not good at it right now, you just need to dig in and get yourself in shape.

About-

John Myles White is a Phd Student in Ph.D. student in the Princeton Psychology Department, where he studies human decision-making both theoretically and experimentally. Along with the political scientist Drew Conway, he is the author of a book published by O’Reilly Media entitled “Machine Learning for Hackers”, which is meant to introduce experienced programmers to the machine learning toolkit. He is also working with Mark Hansenon a book for laypeople about exploratory data analysis.John is the lead maintainer for several R packages, including ProjectTemplate and log4r.

(TIL he has played in several rock bands!)

—–

You can read more in his own words at his blog at http://www.johnmyleswhite.com/about/

He can be contacted via social media at Google Plus at https://plus.google.com/109658960610931658914 or twitter at twitter.com/johnmyleswhite/

Please share:

Please share:

Please share:

Please share:

Related articles

Please share:

Please share:

Please share: