SAS X


Tal G, creator of the rbloggers.com website, has created a new blog aggregator for SAS language users at http://sas-x.com/

With almost 26 blogs already joined (I suspect many more should join), it seems like a good website for analytics users and students. My favorite SAS blog is http://statcompute.spaces.live.com/ – it's pure code, with little else.

Related-

SAS MACRO TO CALCULATE PDO (Points to Double Odds) OF A SCORECARD

A SAS MACRO FOR DECISION STUMP

A DEMO OF VECTOR AUTOREGRESSIVE FORECASTING MODEL

 

 

 

Trying out Google Prediction API from R


So I saw the news at the NY R Meetup and decided to have a go at the Prediction API package (which first started off as a blog post at

http://onertipaday.blogspot.com/2010/11/r-wrapper-for-google-prediction-api.html )

1)My OS was Ubuntu 10.10 Netbook

Ubuntu has a slight glitch, with a workaround, when installing the RCurl package, on which the Google Prediction API package depends: you first need to install the Ubuntu package libcurl4-gnutls-dev for RCurl to install.

Once you have installed that using Synaptic, simply start R.

2) Install the packages rjson and RCurl using install.packages and choosing a CRAN mirror.

3) Since googlepredictionapi is not yet on CRAN, download that package from

https://code.google.com/p/google-prediction-api-r-client/downloads/detail?name=googlepredictionapi_0.1.tar.gz&can=2&q=

4) Copy this downloaded package to your "first library" folder.

When you start R, simply run

.libPaths()[1]

and that's the folder into which you copy the googlepredictionapi package you downloaded.

5) Now the following line works at the R prompt:

> install.packages("googlepredictionapi_0.1.tar.gz", repos=NULL, type="source")

6) Uploading data to Google Storage using the GUI (rather than gsutil)

Just go to https://sandbox.google.com/storage/

and that's the Google Storage Manager.

Notes on Training Data-

Use a csv file

The first column is the score column (like 1,0 or prediction score)

There are no headers, so delete the header row from the data file and move the dependent variable to the first column. (Note that I used data from the Kaggle contest for R package recommendation at

http://kaggle.com/R?viewtype=data )
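The notes above can be sketched in base R. This is only a minimal illustration: the data frame, its column names (feature_a, feature_b, outcome) and the output file name are made up for the example and are not from the Kaggle data.

```r
# Hypothetical example data; "outcome" stands in for the dependent variable.
raw <- data.frame(
  feature_a = c(5.1, 4.9, 6.3),
  feature_b = c("x", "y", "x"),
  outcome   = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Move the dependent variable to the first column.
target <- "outcome"
training <- raw[, c(target, setdiff(names(raw), target))]

# Write a CSV with no header row and no row names.
out_file <- file.path(tempdir(), "training_data.csv")
write.table(training, out_file, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Peek at the first line to confirm the layout.
first_line <- readLines(out_file, n = 1)
print(first_line)
```

The resulting file is what you would then upload to your bucket through the Google Storage Manager.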

The good stuff:

Once you type in the basic syntax, the first time it will ask for your Google credentials (email and password).

It then starts showing you the time elapsed for training.

Now you can disconnect and go off (actually, I got disconnected by accident and came back in, say, 5 minutes, so this part is only my guess at what happened; don't blame me, test it for yourself),

and when you come back (hopefully before the token expires) you can see the status of your request (see below).

> library(rjson)
> library(RCurl)
Loading required package: bitops
> library(googlepredictionapi)
> my.model <- PredictionApiTrain(data="gs://numtraindata/training_data")
The request for training has sent, now trying to check if training is completed
Training on numtraindata/training_data: time:2.09 seconds
Training on numtraindata/training_data: time:7.00 seconds

7) Note that I changed the format from the URL where my data is located (simply go to your Google Storage Manager and right-click on the file name to get the link address: https://sandbox.google.com/storage/numtraindata/training_data.csv )

to gs://numtraindata/training_data (that kind of helps avoid any syntax error).
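If you do this conversion often, a small helper can do it for you. This is just a sketch: the function name to_gs_uri is my own (it is not part of the googlepredictionapi package), and it assumes the link address always starts with the sandbox Storage Manager prefix shown above.

```r
# Hypothetical helper: turn a Storage Manager link address into a gs:// path.
to_gs_uri <- function(https_url) {
  # Drop the (assumed) Storage Manager prefix.
  path <- sub("^https://sandbox\\.google\\.com/storage/", "", https_url)
  # Drop a trailing .csv extension, as in the example above.
  path <- sub("\\.csv$", "", path)
  paste0("gs://", path)
}

gs_uri <- to_gs_uri(
  "https://sandbox.google.com/storage/numtraindata/training_data.csv"
)
print(gs_uri)
```

The value of gs_uri is what you would pass as the data argument to PredictionApiTrain.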

8) From the kind of high level instructions at  https://code.google.com/p/google-prediction-api-r-client/, you could also try this on a local file

Usage

## Load googlepredictionapi and dependent libraries
library(rjson)
library(RCurl)
library(googlepredictionapi)

## Make a training call to the Prediction API against data in the Google Storage.
## Replace MYBUCKET and MYDATA with your data.
my.model <- PredictionApiTrain(data="gs://MYBUCKET/MYDATA")

## Alternatively, make a training call against training data stored locally as a CSV file.
## Replace MYPATH and MYFILE with your data.
my.model <- PredictionApiTrain(data="MYPATH/MYFILE.csv")

At the time of writing my data was still getting trained, so I will keep you posted on what happens.

China -United States -The Third Opium War


A brief glance through http://www.treasury.gov/resource-center/data-chart-center/tic/Documents/mfh.txt

shows that while the US added 600 billion dollars of debt over the past year, the Chinese actually reduced their exposure by 50 billion dollars.

So who has been financing US debt over the past year? It is Japan, eager to keep its currency down, and the United Kingdom, which has pumped in an extra 300 billion in T-Bills.

See the whole table at official link above or at goo.gl/qMugp

—————————————————————————————-

China still remembers the Opium Wars, in which the then-ruling Anglo-Saxon superpower used naval superiority to enforce trade and eventual political dependency. Is China unsure of the United States' brotherly, nice intentions? They certainly seem to be putting their money that way.

http://en.wikipedia.org/wiki/Opium_Wars

Britain forced the Chinese government into signing the Treaty of Nanking and the Treaty of Tianjin, also known as the Unequal Treaties, which included provisions for the opening of additional ports to unrestricted foreign trade, for fixed tariffs; for the recognition of both countries as equal in correspondence; and for the cession of Hong Kong to Britain. The British also gained extraterritorial rights. Several countries followed Britain and sought similar agreements with China. Many Chinese found these agreements humiliating and these sentiments contributed to the Taiping Rebellion (1850–1864), the Boxer Rebellion (1899–1901), and the downfall of the Qing Dynasty in 1912, putting an end to dynastic China.

———————————————————————————————-

The Koreans can always be depended on to provide the first shot in any conflict, and though an Anglo-US-Chinese conflict would be expensive, I guess as long as the cost of outstanding debt with China is less than the cost of a brief techno-war, we would see interesting games in this neighborhood. Note that China restricts major trade with the United States, particularly in software and internet services (like web advertising, Facebook, Twitter), and represents a lucrative market for big pharma (especially in psychiatric drugs) and big tech once it reforms its intellectual property rights. Software would be the opium of the 21st century, if the Chinese resist Treasury Bills as their poppy flowers. The widespread Western media coverage of murders of school kids by psychopaths is also a trade tactic to encourage the flow of more US-made medicine into the Chinese market.

It would also help create an economic revival in the United States to exaggerate the Chinese threat (remember Sputnik) and build up its own cyber spending. Any military or cyber humiliation for the ruling party in China can help create a political vacuum for more malleable and agreeable alternatives to emerge.

(to be continued)

 

R is Ready for Business™

A new 5-page brochure from Revolution Analytics. It is not that slick and has some marketing under-kill (which frankly is a surprise), but I guess Revolution Analytics does not have a full-time graphics designer to help with its collateral.

Take a look if you are curious how and why R is getting more and more ready for business.

Test drive a Chrome notebook.


Wanna test out the new Chrome OS?

Go to https://services.google.com/fb/forms/cr48basic/

and fill in the form.

Chrome

Test drive a Chrome notebook.

We have a limited number of Chrome notebooks to distribute, and we need to ensure that they find good homes. That’s where you come in. Everything is still very much a work in progress, and it’s users, like you, that often give us our best ideas about what feels clunky or what’s missing. So if you live in the United States, are at least 18 years old, and would like to be considered for our small Pilot program, please fill this out. We’ll review the requests that come in and contact you if you’ve been selected.


 

How to Analyze Wikileaks Data – R SPARQL


Drew Conway, one of the very, very few Project R voices I respected until recently, declared on his blog http://www.drewconway.com/zia/

Why I Will Not Analyze The New WikiLeaks Data

and followed it up with how HE analyzed the post announcing the non-analysis.

“If you have not visited the site in a week or so you will have missed my previous post on analyzing WikiLeaks data, which from the traffic and 35 Comments and 255 Reactions was at least somewhat controversial. Given this rare spotlight I thought it would be fun to use the infochimps API to map out the geo-location of everyone that visited the blog post over the last few days. Unfortunately, after nearly two years with the same web hosting service, only today did I realize that I was not capturing daily log files for my domain”

Anyways, non-American users of the R Project can analyze the WikiLeaks data using the R SPARQL package. I would advise American friends not to use this approach or attempt to analyze any of the data, because technically the data is still classified and its possession is illegal (which is the reason Federal employees and organizations receiving federal funds have been advised not to use this or any WikiLeaks dataset).

https://code.google.com/p/r-sparql/

Overview

R is a programming language designed for statistics.

R Sparql allows you to run SPARQL Queries inside R and store it as a R data frame.

The main objective is to allow the integration of Ontologies with Statistics.

It requires Java and rJava installed.

Example (in R console):

> library(sparql)
> data <- query("<SPARQL query>","RDF file or remote SPARQL Endpoint")

and the data is available through a remote SPARQL endpoint; see http://www.ckan.net/package/cablegate

SPARQL is an easy language to pick  up, but dammit I am not supposed to blog on my vacations.

http://code.google.com/p/r-sparql/wiki/GettingStarted

Getting Started

1. Installation

1.1 Make sure Java is installed and is the default JVM:

$ sudo apt-get install sun-java6-bin sun-java6-jre sun-java6-jdk
$ sudo update-java-alternatives -s java-6-sun

1.2 Configure R to use the correct version of Java

$ sudo R CMD javareconf

1.3 Install the rJava library

$ R
> install.packages("rJava")
> q()

1.4 Download and install the sparql library

Download: http://code.google.com/p/r-sparql/downloads/list

$ R CMD INSTALL sparql-0.1-X.tar.gz

2. Executing a SPARQL query

2.1 Start R

# Load the library
library(sparql)
# Run the query
result <- query("SELECT ... ", "http://...")
# Print the result
print(result)

3. Examples

3.1 The Query can be a string or a local file:

query("SELECT ?date ?number ?season WHERE {  ... }", "local-file.rdf")
query("my-query.rq", "local-file.rdf")

The package will detect if my-query.rq exists and will load it from the file.

3.3 The uri can be a file or an url (for remote queries):

query("SELECT ... ","local-file.db")
query("SELECT ... ","http://dbpedia.org/sparql")

3.4 Get some examples here: http://code.google.com/p/r-sparql/downloads/list
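To make the placeholders above a little more concrete, here is a hedged sketch of a complete call. The query text is a generic "first ten triples" SPARQL query that I made up for illustration, and I am assuming the package exports query() as in the wiki snippets above. The call is guarded so the script still runs if the sparql package is not installed.

```r
# A generic SPARQL query (illustrative only): fetch any ten triples.
sparql_query <- paste(
  "SELECT ?s ?p ?o",
  "WHERE { ?s ?p ?o }",
  "LIMIT 10",
  sep = "\n"
)

# Remote endpoint taken from the wiki example above.
endpoint <- "http://dbpedia.org/sparql"

# Only call the package if it is installed; query() is assumed to be the
# function shown in the wiki, returning an R data frame.
if (requireNamespace("sparql", quietly = TRUE)) {
  result <- sparql::query(sparql_query, endpoint)
  print(head(result))
} else {
  message("Package 'sparql' is not installed; the query was only built, not run.")
}
```

Building the query with paste() keeps each clause on its own line, which makes longer queries easier to edit than one long string.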

SPARQL Tutorial-

http://openjena.org/ARQ/Tutorial/index.html

Also read-

http://webr3.org/blog/linked-data/virtuoso-6-sparqlgeo-and-linked-data/

and from the favorite blog of Project R, also known as the NY Times:

http://bits.blogs.nytimes.com/2010/11/15/sorting-through-the-government-data-explosion/?twt=nytimesbits

In May 2009, the Obama administration started putting raw government data on the Web. It started with 47 data sets. Today, there are more than 270,000 government data sets, spanning every imaginable category from public health to foreign aid.

Collateral


It has always surprised me how my American friends who passionately support the First Amendment kind of always oppose the Second Amendment, and vice versa. Being a non-American, I would always take the Fifth.

An earlier WikiLeaks video showed the killing of two Reuters employees, and I am not sure who is right: the American govt for restricting access for federal employees, or the Chinese govt for restricting access to the Nobel Peace Prize.

Or all the govts of the world for all the cables they write, and all the journalists for all the stories they tell.

Merry Christmas anyways.

from Wikiquotes of another Indian.

http://en.wikiquote.org/wiki/Mohandas_Karamchand_Gandhi

Facts we would always place before our readers, whether they are palatable or not, and it is by placing them constantly before the public in their nakedness that the misunderstanding between the two communities in South Africa can be removed.

In this instance of the fire-arms, the Asiatic has been most improperly bracketed with the native. The British Indian does not need any such restrictions as are imposed by the Bill on the natives regarding the carrying of fire-arms. The prominent race can remain so by preventing the native from arming himself. Is there a slightest vestige of justification for so preventing the British Indian?

  • Comments on a court case in The Indian Opinion (25 March 1905)
  • Had we adopted non-violence as the weapon of the strong, because we realised that it was more effective than any other weapon, in fact the mightiest force in the world, we would have made use of its full potency and not have discarded it as soon as the fight against the British was over or we were in a position to wield conventional weapons. But as I have already said, we adopted it out of our helplessness. If we had the atom bomb, we would have used it against the British.
    • Speech (16 June 1947) as the official date for Indian independence approached (15 August 1947) , as quoted in Mahatma Gandhi : The Last Phase (1958) by Pyarelal, p. 326. The last sentence of this statement has sometimes been quoted as if it was being made as an affirmation of extreme hostility to the British, rather than as part of an affirmation of the strength of non-violence, and the ultimate weakness of those who needlessly resort to violence if it is within their power.
  • One of the objects of a newspaper is to understand popular feeling and to give expression to it; another is to arouse among the people certain desirable sentiments; and the third is fearlessly to expose popular defects
  • The non-violent state will be an ordered anarchy. That State is the best governed which is governed the least.

[http://www.youtube.com/watch?v=5rXPrfnU3G0&feature=player_embedded]