Tag Archives: social media
1) Blog post title should be self explanatory
2) Use categories and tags for better navigation
3) Use a theme which attracts not distracts
4) Simple language in blog writing works best
5) Useful blogs get more traffic than autobiographical blogs. Unless you are a celebrity.
6) People who enjoy writing blogs create better blogs
7) Writing a blog is like jogging. Do it every day, even when it's boring and painful, or do it as much as your schedule permits.
Tatvic, an up-and-coming startup founded by an ex-Trilogy colleague, has helped with the R for Google Analytics package. While Tatvic is into heavy-duty web analytics, they are betting big on R and using it for web analytics. David Smith, most excellent blogger-in-chief of the R universe, has blogged about them before here: http://blog.revolutionanalytics.com/2013/02/analyze-web-traffic-data-with-google-analytics-and-r.html
Here is an upcoming seminar on R in Web Analytics.
From this webinar, you will get to know:
- What is R and why should you use this tool? How to extract your Web Analytics data into R?
- How to build a predictive model using web analytics data with the help of R?
- How predictive modelling can take your analysis to the next level?
- How to carry out insightful analysis through visualization?
Who should attend: Every web analyst who wants to take his analysis to the next level.
ps- Hat tip to Caroline A
The awesome Hadley Wickham has just released the next version of the httr package. Prof Hadley is currently on leave from Rice Univ and working with the tremendous geeks at RStudio. New things in the httr package:
httr, a package designed to make it easy to work with web APIs. Httr is a wrapper around RCurl, and provides:
- functions for the most important http verbs:
- support for OAuth 1.0 and 2.0. Use oauth2.0_token to get user tokens, and sign_oauth2.0 to sign requests. The demos directory has six demos of using OAuth: three for 1.0 (linkedin, twitter and vimeo) and three for 2.0 (facebook, github, google).
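For readers who have not worked with OAuth 2.0 before, here is a rough sketch in Python (not httr itself) of what "signing" a request usually boils down to: attaching the user's access token to the outgoing request. It assumes the common bearer-token header style; httr's own sign_oauth2.0 may attach the token differently. The function name, URL and token below are made up for illustration, and nothing is actually sent over the network.

```python
from urllib.request import Request


def sign_with_bearer_token(url: str, access_token: str, method: str = "GET") -> Request:
    """Build a request carrying an OAuth 2.0 bearer token (the request is not sent)."""
    request = Request(url, method=method)
    # Assumption: the API accepts the token in an Authorization header.
    request.add_header("Authorization", f"Bearer {access_token}")
    return request


# Hypothetical endpoint and token, for illustration only.
signed = sign_with_bearer_token("https://api.example.com/v1/me", "hypothetical-token")
print(signed.get_method(), signed.full_url)
print(signed.get_header("Authorization"))
```

The same helper works for the other HTTP verbs by passing, say, method="POST".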
I especially like the OAuth functionality as I occasionally got flummoxed by existing R OAuth packages, and this should hopefully lead to awesome new social media analytics posts by the larger R blogger community. Also, given that OAuth-authenticated API requests to Twitter get a much higher rate limit than unauthenticated ones (see https://dev.twitter.com/docs/rate-limiting ),
- Unauthenticated calls are permitted 150 requests per hour. Unauthenticated calls are measured against the public facing IP of the server or device making the request.
- OAuth calls are permitted 350 requests per hour and are measured against the oauth_token used in the request.
some creative use cases should see an incredible amount of cross social media analysis (not just one social media channel at a time).
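To put the two Twitter limits quoted above in perspective, here is a back-of-the-envelope sketch in Python. The workload of 2,100 requests is a made-up number, and the calculation simply counts whole hourly windows; it ignores how Twitter actually resets its counters.

```python
import math

UNAUTHENTICATED_PER_HOUR = 150  # measured against the public-facing IP
OAUTH_PER_HOUR = 350            # measured against the oauth_token


def hours_needed(total_requests: int, limit_per_hour: int) -> int:
    """Whole hourly windows needed to make total_requests under a given limit."""
    return math.ceil(total_requests / limit_per_hour)


total = 2100  # hypothetical number of API calls for one analysis
print(hours_needed(total, UNAUTHENTICATED_PER_HOUR))  # 14
print(hours_needed(total, OAUTH_PER_HOUR))            # 6
```

So the same job drops from 14 hours to 6 once the requests are authenticated.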
R for Social Media Analytics? Watch this space.
- New version of httr: 0.2 (rstudio.org)
Here is an interview with one of the younger researchers and rock stars of the R Project, John Myles White, co-author of Machine Learning for Hackers.
Ajay- What inspired you guys to write Machine Learning for Hackers? What has been the public response to the book? Are you planning to write a second edition or a next book?
John- We decided to write Machine Learning for Hackers because there were so many people interested in learning more about Machine Learning who found the standard textbooks a little difficult to understand, either because they lacked the mathematical background expected of readers or because it wasn’t clear how to translate the mathematical definitions in those books into usable programs. Most Machine Learning books are written for audiences who will not only be using Machine Learning techniques in their applied work, but also actively inventing new Machine Learning algorithms. The amount of information needed to do both can be daunting, because, as one friend pointed out, it’s similar to insisting that everyone learn how to build a compiler before they can start to program. For most people, it’s better to let them try out programming and get a taste for it before you teach them about the nuts and bolts of compiler design. If they like programming, they can delve into the details later.
Ajay- What are the key things that a potential reader can learn from this book?
John- We cover most of the nuts and bolts of introductory statistics in our book: summary statistics, regression and classification using linear and logistic regression, PCA and k-Nearest Neighbors. We also cover topics that are less well known, but are as important: density plots vs. histograms, regularization, cross-validation, MDS, social network analysis and SVMs. I hope a reader walks away from the book having a feel for what different basic algorithms do and why they work for some problems and not others. I also hope we do just a little to shift a future generation of modeling culture towards regularization and cross-validation.
Ajay- Describe your journey as a science student up till your PhD. What are your current research interests and what initiatives have you done with them?
John- As an undergraduate I studied math and neuroscience. I then took some time off and came back to do a Ph.D. in psychology, focusing on mathematical modeling of both the brain and behavior. There’s a rich tradition of machine learning and statistics in psychology, so I got increasingly interested in ML methods during my years as a grad student. I’m about to finish my Ph.D. this year. My research interests all fall under one heading: decision theory. I want to understand both how people make decisions (which is what psychology teaches us) and how they should make decisions (which is what statistics and ML teach us). My thesis is focused on how people make decisions when there are both short-term and long-term consequences to be considered. For non-psychologists, the classic example is probably the explore-exploit dilemma. I’ve been working to import more of the main ideas from stats and ML into psychology for modeling how real people handle that trade-off. For psychologists, the classic example is the Marshmallow experiment. Most of my research work has focused on the latter: what makes us patient and how can we measure patience?
Ajay- How can academia and private sector solve the shortage of trained data scientists (assuming there is one)?
John- There’s definitely a shortage of trained data scientists: most companies are finding it difficult to hire someone with the real chops needed to do useful work with Big Data. The skill set required to be useful at a company like Facebook or Twitter is much more advanced than many people realize, so I think it will be some time until there are undergraduates coming out with the right stuff. But there’s huge demand, so I’m sure the market will clear sooner or later.
(TIL he has played in several rock bands!)
ACM, the Association for Computing Machinery (www.acm.org), is the world’s largest educational and scientific computing society, uniting computing educators, researchers and professionals to inspire dialogue, share resources and address the field’s challenges.
- the volume of data that is available is growing at a rate we have never seen before. Cisco has measured an 8 fold increase in the volume of IP traffic over the last 5 years and predicts that we will reach the zettabyte of data over IP in 2016
- more of the data is becoming publicly available. This isn’t only on social networks such as Facebook and Twitter, but joins a more general trend involving open research initiatives and open government programs
- the desired time to get meaningful results is going down dramatically. In the past 5 years we have seen the half life of data on Facebook, defined as the amount of time in which half of the public reactions to any given post (likes, shares, comments) take place, go from about 12 hours to under 3 hours currently
- our access to the net is always on via mobile device. You are always connected.
- the CPU and GPU capabilities of mobile devices are huge (an iPhone has 10 times the compute power of a Cray-1 and more graphics capabilities than early SGI workstations)
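The "half life" of a post, as defined in the list above, can be worked out from reaction timestamps alone. Here is a small Python sketch under that definition; the reaction times are invented for illustration and are not Facebook or TrendSpottr data, and the function name is hypothetical.

```python
import math


def half_life_hours(reaction_times_hours):
    """Hours after posting by which half of all reactions had taken place."""
    if not reaction_times_hours:
        raise ValueError("need at least one reaction")
    ordered = sorted(reaction_times_hours)
    # Index of the reaction that brings the running total to (at least) half.
    half_count = math.ceil(len(ordered) / 2)
    return ordered[half_count - 1]


# Hypothetical reactions (likes, shares, comments), in hours since the post went up.
reactions = [0.2, 0.5, 1.0, 1.5, 2.5, 4.0, 9.0, 20.0]
print(half_life_hours(reactions))  # 1.5
```

In this made-up example half of the eight reactions arrive within the first 1.5 hours, even though the last one trickles in 20 hours later.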
- thought leadership: by tracking content that your readership is interested in via TrendSpottr, you can be seen as a thought leader by being one of the first to share trending content on a given subject. I personally do this on my Facebook page (http://www.facebook.com/alain.chesnais) and have seen my Klout score go up dramatically as a result
- brand marketing: to be able to know when something is trending about your brand and take advantage of it as it happens.
- competitive analysis: to see what is being said about two competing elements. For instance, searching TrendSpottr for “Obama OR Romney” gives you a very good understanding of how social networks are reacting to each politician. You can also do searches like “$aapl OR $msft OR $goog” to get a sense of the current buzz for certain hi-tech stocks.
- understanding your impact in real time: to be able to see which of the content that you are posting is trending the most on social media so that you can highlight it on your main page. So if all of your content is hosted on a common domain name (ourbrand.com), searching for ourbrand.com will show you the most active of your site’s content. That can easily be set up by putting a TrendSpottr widget on your front page
Ajay- What are some of the privacy guidelines that you keep in mind- given the fact that you collect individual information but also have government agencies as potential users.
Prior to his election as ACM president, Chesnais was vice president from July 2008 – June 2010 as well as secretary/treasurer from July 2006 – June 2008. He also served as president of ACM SIGGRAPH from July 2002 – June 2005 and as SIG Governing Board Chair from July 2000 – June 2002.
As a French citizen now residing in Canada, he has more than 20 years of management experience in the software industry. He joined the local SIGGRAPH Chapter in Paris some 20 years ago as a volunteer and has continued his involvement with ACM in a variety of leadership capacities since then.
TrendSpottr is a real-time viral search and predictive analytics service that identifies the most timely and trending information for any topic or keyword. Our core technology analyzes real-time data streams and spots emerging trends at their earliest acceleration point — hours or days before they have become “popular” and reached mainstream awareness.
TrendSpottr serves as a predictive early warning system for news and media organizations, brands, government agencies and Fortune 500 companies and helps them to identify emerging news, events and issues that have high viral potential and market impact. TrendSpottr has partnered with HootSuite, DataSift and other leading social and big data companies.
Some possible electronic disruptions that threaten the electoral cycle currently underway in the United States of America are:
1) Limited Denial of Service attacks (say, for 5-8 minutes) on fundraising websites, trying to fly under the radar of network administrators, to deny the targeted fundraising website a small percentage of funds. Money remains critical to the world’s most expensive political market. Even a 5% drop in online fundraising capacity can cripple a candidate.
2) Limited Man-in-the-Middle attacks on ground volunteers to disrupt, intercept and manipulate communication flows. Basically, cyber attacks on vulnerable ground volunteers in critical counties and battleground or swing states (like Florida).
3) Electro-magnetic disruptions of electronic voting machines in critical counties and swing states (like Florida) to either disrupt, manipulate or create an impression that some manipulation has been done.
4) Use of search engine flooding (for search engine de-optimization of rival candidates' keywords) and social media flooding to disrupt the listening capabilities of sentiment analysis.
5) Selected leaks (including using digital means to create authentic, fake or edited collateral) timed to embarrass rivals or influence voters; this can be geo-coded and mass deployed.
6) Using Internet communications to selectively spam or influence independent or opinionated voters through emails, short messaging service, chat channels and social media.
7) Disrupt the Hillary for President 2016 campaign by Anonymous-Wikileak sympathetic hacktivists.