R and Google Docs

I was working in R and generating a huge number of graphs, which then needed to be uploaded to Google Docs for collaborative work. A bit taxed, I posted a query on the R-help list (r-help@r-project.org), asking whether any package could do the uploading for me.

My fears were right: there was no such package. But then a great programming wizard, Duncan from Omegahat, wrote back that he had just developed a package to do that.

Hallelujah!

Cloud computing / Google Docs, say hello to the R statistics package.

R, say hello to the cloud.

R can be used for data mining as well (though it is limited by RAM, since it works on data held in memory). It is an ideal fit for a statistics package offered as SaaS on something like Amazon EC2, given the high annual costs of other stats packages and the local hardware they need.

I first wrote about this fit in the Ohri Framework, which discusses the fit between cloud computing and open source statistics. R is free and has thousands of developers.

Anyway, download the RGoogleDocs package from the site below:

http://www.omegahat.org/RGoogleDocs/

"

The RGoogleDocs package

Last Release: 0.1-0 (24 Sep 2008)

This package is an example of using the RCurl and XML packages to quickly develop an interface to the Google Documents API. It was written to test RCurl and also to illustrate how to use these tools with modern Web applications. Google Documents is a REST "application" and so this is an example of using R to interface to that.

The package allows you to get a list of the documents and details about each of them, download the contents of a document, remove a document, and upload a document, even binary files.

There is some documentation for the package at this point in the form of a "user’s guide" (or in PDF form). These are included with the package.

If you want to upload binary files, you will want to install a beta version of the RCurl package.

Duncan Temple Lang <duncan@wald.ucdavis.edu>
Last modified: Wed Sep 24 11:51:45 PDT 2008

"

Tip: Google Docs for backup during Travel

My standard procedure while traveling is to upload copies of e-tickets, scanned copies of my passport and visas, insurance documents and identification to Google Docs, and to share them with my family. They can then be accessed from anywhere with an internet connection, which is quite useful if you are traveling and want to back up travel documents safely. (Note: some details hidden.)

image DOCUMENTS

image GRAPHS/SPREADSHEET

For important meetings, especially ones with bulky presentations, I upload the files to Google Docs as a backup against hardware failure. Loading them in the browser is also faster and looks better during the meeting (besides adding a wow factor).

http://docs.google.com

The only problem: Google Docs does not support PDF uploads larger than 10 MB. This is especially a problem when creating a large number of graphs using R, as the files can be heavy. The workaround is to split the PDF (if it is a normal document), or to change the number of pictures per page in R to reduce the size.

 

image PDF

image PPT

So try uploading a sample document to www.docs.google.com today and get surprised.

For creating a free collaborative space for team members across the globe, similarly check out http://sites.google.com.

If you want, you can even build your corporate intranet or knowledge-sharing wiki there, and track usage statistics through Google Analytics.


It costs nothing and there are no ads. Since Google earns so much from ads and now from phones, this is their way of changing the way the world works for the better.

P.S. You can download all data and maintain offline copies using Google Gears, and click delete in less than a minute if you need to shut it down.

Advanced Mathematical Economics

Total Cost of Mortgage Bailout (including company-specific loans etc.) = 1300 Billion Dollars

Number of US households defaulting on loans  = 100 million (assumed)

Bailout per Household = 13,000 Dollars

Deleting extravagant houses / overpaid / people at fault / above 3 bedrooms = 50 million households

New Bailout per Household = 26,000 Dollars

Lowering of EMI by Lowering of Mortgage Rate, by Fed Rate Cut = 1000 Dollars per annum

Lowering of EMI by increasing tenure to 40 years = 500 Dollars per month

Why can't we give the taxpayers' money back to the taxpayers themselves?

Risk Added to Govt Treasury = 0

Jobs lost at financial sector = 0

Houses lost by ordinary Americans = 0

Note: Numbers are close approximations of estimates, using Fermi logic of problem solving.
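The per-household figures above are plain division, so they are easy to recheck. Here is a minimal Python sketch using only the numbers quoted in this post; the variable and function names are my own, and I am reading "= 50 million households" as the number of households left after the exclusions.

```python
# Figures quoted in the post; names are illustrative only.
TOTAL_BAILOUT_USD = 1_300 * 10**9         # 1300 billion dollars
HOUSEHOLDS_DEFAULTING = 100 * 10**6       # assumed in the post
HOUSEHOLDS_AFTER_EXCLUSIONS = 50 * 10**6  # assumption: households remaining after exclusions


def bailout_per_household(total_usd, households):
    """Spread the total evenly across the given number of households."""
    return total_usd // households


per_household = bailout_per_household(TOTAL_BAILOUT_USD, HOUSEHOLDS_DEFAULTING)
per_household_after = bailout_per_household(TOTAL_BAILOUT_USD, HOUSEHOLDS_AFTER_EXCLUSIONS)

# The tenure saving is quoted per month; convert it to a yearly figure
# so it can be compared with the rate-cut saving (quoted per annum).
tenure_saving_per_year = 500 * 12

print(per_household)           # 13000
print(per_household_after)     # 26000
print(tenure_saving_per_year)  # 6000
```

Note that the two EMI reductions are quoted on different time bases (per annum versus per month), which is worth keeping in mind when comparing them.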

Can someone tell me what the problem with this logic is?

Online Market Survey: SurveyMonkey

I did a test survey for my struggling cartoon series, Blog Boy.

Here is Blog Boy:

 

 

 

image

Here are the responses from 99 people, collected using the free survey tool from www.surveymonkey.com.

Please note that the site has fabulous features, so you should visit and check it out for yourself. Anyway, back to analytics.

So, dear reader, what analytical decisions do you think follow from the survey responses?

 

Total Started Survey: 99

Total Completed Survey: 99 (100%)

Page: Default Section

1. Blog Boy is a cartoon based on Bloggers. What is the first thing that comes to your mind when you think Blog Boy?

Answer                    Response Percent   Response Count
A boy who blogs           74.7%              74
A generic term            11.1%              11
Where is blog girl?       4.0%               4
Other (please specify)    10.1%              10

answered question: 99
skipped question: 0

 

2. What is the source of cartoons for you?

Answer                    Response Percent   Response Count
Blogs                     6.1%               6
Websites                  35.4%              35
Newspaper (Daily)         53.5%              53
Newspaper (Sunday)        37.4%              37
Newsletter                2.0%               2
Other (please specify)    14.1%              14

answered question: 99
skipped question: 0

 

3. Which cartoons have you read?

Answer                          Response Percent   Response Count
Garfield and Jon                59.6%              59
Dilbert                         87.9%              87
Calvin and Hobbes               75.8%              75
I dont read cartoons anymore    8.1%               8
I dont like to read cartoons    4.0%               4
Other (please specify)          13.1%              13

answered question: 99
skipped question: 0

 

4. Are you a blogger yourself?

Answer                    Response Percent   Response Count
Yes                       21.2%              21
No                        75.8%              75
Define Blogger            4.0%               4
Other (please specify)    -                  1

answered question: 99
skipped question: 0

 

5. What other sources of humour do you go for?

Answer                          Response Percent   Response Count
TV shows                        72.7%              72
Daily spoof news                13.1%              13
Websites (please specify)       28.3%              28
Actual funny news               22.2%              22
Friends                         70.7%              70
Online chat sessions            7.1%               7
Long market research surveys    4.0%               4
Other (please specify)          10.1%              10

answered question: 99
skipped question: 0

 

6. What are the various issues that you think a blogger faces?

Response Count: 99

answered question: 99
skipped question: 0

 

7. You just became editor of a major syndicated cartoon called Blog Boy. What features would you like him to have? Blog Boy is hosted on http://www.iwannacrib.com

Answer                                                             Response Percent   Response Count
Great cute graphics                                                35.4%              35
Clean Simple graphics                                              64.6%              64
Clear words                                                        52.5%              52
Constant change in themes                                          44.4%              44
New characters                                                     27.3%              27
Feedback mechanism to allow viewers to request a custom cartoon    29.3%              29
Other (please specify)                                             -                  12

answered question: 99
skipped question: 0

 

8. How often do you read cartoons?

Answer                                     Response Percent   Response Count
Everyday                                   37.4%              37
Weekly                                     35.4%              35
Monthly                                    6.1%               6
Occasionally (once in two-three months)    14.1%              14
Rarely (once in six months)                1.0%               1
Never                                      3.0%               3
Other (please specify)                     3.0%               3

answered question: 99
skipped question: 0
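As a quick sanity check on the survey numbers, each "Response Percent" appears to be the option's count divided by the 99 respondents, and in some questions the counts add up to more than 99, which suggests respondents could tick several options. The Python sketch below recomputes this for questions 1 and 2 only; the function names are my own, and the "multiple answers" reading is an inference from the totals, not something the survey output states.

```python
RESPONDENTS = 99  # "answered question" count reported for every question

# Counts copied from questions 1 and 2 of the survey results.
q1_counts = {
    "A boy who blogs": 74,
    "A generic term": 11,
    "Where is blog girl?": 4,
    "Other (please specify)": 10,
}
q2_counts = {
    "Blogs": 6,
    "Websites": 35,
    "Newspaper (Daily)": 53,
    "Newspaper (Sunday)": 37,
    "Newsletter": 2,
    "Other (please specify)": 14,
}


def response_percent(count, respondents=RESPONDENTS):
    """Share of respondents who picked an option, rounded to one decimal."""
    return round(100 * count / respondents, 1)


def counts_exceed_respondents(counts, respondents=RESPONDENTS):
    """True when the option counts sum to more than the number of respondents."""
    return sum(counts.values()) > respondents


for answer, count in q2_counts.items():
    print(f"{answer}: {response_percent(count)}%")

print(counts_exceed_respondents(q1_counts))  # False: 74 + 11 + 4 + 10 = 99
print(counts_exceed_respondents(q2_counts))  # True: the counts sum to 147
```

This matters for any decision based on the data: where the counts exceed 99, the percentages are shares of respondents, not shares of a total that adds up to 100%.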

 

6. What are the various issues that you think a blogger faces ?

Displaying 1 - 99 of 99 responses
1. fear of acceptance nd readership
2. lack of good options to custoize the blog page
3. i am least bothered!
4. No idea
5. no clue buddy
6. Not enough exposure
7. not enough people responding to blogs unable to post blogs on the site unable to modify / re-design the blog site – graphics etc.. when the blogger wishes to do so
8. dont know
9. creating an identity of his own
10. Dearth of topics; desire for readership
11. 1. privacy 2. relevant content 3. avoiding being an offense to someone
12. frequency of updation/ popularization/ business benefits if any/ template issues.
13. Not sure
14. A good syndication mechanism
15. NA
16. – Lack of interesting material – People who comment without actually reading or understanding the post – People who go off-topic in their comments
17. Wouldn’t know as I’ve already mentioned I’m not a blogger.
18. No clue
19. Publicity
20. scarcity of novelty and limited reach to the audience.
21. Privacy?
22. Not sure; not a blogger myself
23. na
24. I guess it becomes an exercise in itself to consistently write articles/update his blog.
25. Increasing viewership of blog: increasing number of hits on his blog.
26. Finding topics deserving to be written about, the dilemma whether to use publicity stunts, the disappointment at no or useless comments
27. Don’t know coz I don’t blog
28. no comments
29. Ideas !
30. No clue
31. None!!!
32.
33. giving an opinion without having any
34. Not aware
35. Possible issues arising from the gray areas between right to freedom of speech and breaking a law Possible discrimination for expressing personal views
36. I do not Blog so not very sure
37. various blogs
38. Never blogged, no idea!
39. not a blogger but i guess motivation to continue blogging would be high on the list
40. Creativity
41. s
42. targeting the right audience he’s looking for, posting blogs relevant to his target audience, immature and unwanted responses, time to keep blog updated
43. generate content that can interest people
44. 1. laziness (time) 2. laziness (effort) 3. laziness (easier alternative to while time on net, just by clicking mouse) 4. others’ blog (blogging the brain away off your potential readers)
45. Content!
46. Bandwidth for constant updation.
47. none – blog what u want
48. none
49. none
50. spam offline access
51. Never blogged !! I guess I’m a traditionalist !!😦
52. lack of time
53. no idea
54. No idea!
55. Don’t know
56. regarding wht? ask a clear question if u want a good answer
57. Time and connectivity problems
58. No idea as i am not a blogger.
59. Content pressure
60. > How to ensure that everyone reads what he writes
61. privacy
62. too much free time
63. Creativity
64. 1. Blogs sites are seldom highlighted in main stream.. 2. Can be used to harm some1’s image.. 3. Anonymous content need to be screened esp. private ones… u know for a miscreant everything is within reach when it is on net.. 4. Many blogsites doesn’t allow lots of images insertion, edit the date of insertion (i generally post blog afer i write it on my diary) and transfer the same when i connnect online.. so all blogs posted on same day come to one single date).
65. Getting an audience to read his/her blogs
66. content creation, quality of content
67. none
68. I dont know
69. dont blog, no idea
70. put words to thought …..
71. Getting his blog known, running out of fresh ideas
72. Creating a larger network of repeated visitors
73. how to kill time
74. Readership. Curbing of freedom, in some cases.
75. i dont know
76. lack of readership abuse personal attacks
77. Online Stalkers!
78. privacy, writers block
79. na
80. No Idea
81. freedom to express opinions
82. i thought blogging is just a fad dying a natural death.
83. getting readers to come to the site
84. Um – dont know
85. Dunno
86. content and time
87. Don’t know
88. I dont care!
89. No clue
90. i dint know .
91. NA
92. Understanding what topics readers might find interesting to read about.
93. regular content
94. comments
95. No clue
96. public perception, building and keeping an audience, coming up with fresh ideas
97. I have no idea
98. I don’t blog enough to have thought about it.
99. Low traffic

 
There seems to be a potential market here, but Blog Boy is just gonna be a slog boy.

The Digital Divide Hypothesis

image

This is a screenshot of the last 1,000 visitors to Decision Stats. Notice the heavy concentration on the US East and West Coasts, the light traffic from Latin America, almost none from Africa, and some Israeli visitors from the Middle East (but not many from Arab countries). Europe is visiting, though traffic tapers off in Eastern Europe, and there is little from most of western Russia and China (language problems? proxy walls?).

It may also serve as a hypothesis for where the internet is affordable enough, and economies developed enough, for people to surf analytics web sites, and for where the digital haves and have-nots stand. Either that, or something is wrong with the web analytics software.


Bonus: Blog Boy cartoons on weekends, for levity in these interesting times.

Textbooks for free

If you want to learn without paying thousands of rupees for expensive textbooks, help is here. This is an initiative by Rice University called Connexions, and I quote: "

Connexions is:

a place to view and share educational material made of small knowledge chunks called modules that can be organized as courses, books, reports, etc. Anyone may view or contribute:

  • authors create and collaborate
  • instructors rapidly build and share custom collections
  • learners find and explore content "

The textbooks are basically mashups of modules by experts around the world. Especially check out the applied statistics textbook.

I kind of liked it better than Google Knol, but lately Google Scholar is a good way to search for knowledge too.

The site is here at http://cnx.org/


Tip: Regression Models using MS Excel

Did you know that you can create a regression tool in MS Excel for regression on up to 16 variables? You can automate it using the Record Macro feature even if you do not know Excel macros.
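For readers who want to see what such a regression tool computes underneath, here is a minimal Python sketch of ordinary least squares with a single predictor. It is not Excel's implementation and covers only one variable rather than 16; the data and the function name fit_line are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data that lies exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

With several predictors the same idea applies, but the arithmetic becomes a matrix problem, which is the part a spreadsheet tool handles for you.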

SAS/ETS and SPSS Trends can both build time series models. But so can MS Excel, by linking up cells. In addition, you can use the Solver add-in to minimize the squared error term by changing the parameters of the exponential smoothing forecasts used.
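The Solver idea can be sketched outside Excel too. The Python below is a rough illustration, not what Solver does internally: it runs simple exponential smoothing, adds up the squared one-step-ahead errors, and tries a grid of smoothing parameters to find the one with the smallest total. The grid search stands in for Solver's optimizer, and the demand series, the function names and the choice to seed the forecast with the first observation are all my own assumptions.

```python
def smoothing_sse(series, alpha):
    """Sum of squared one-step-ahead errors for simple exponential smoothing."""
    forecast = series[0]  # assumption: seed the forecast with the first observation
    sse = 0.0
    for actual in series[1:]:
        error = actual - forecast
        sse += error * error
        # Blend the latest observation with the previous forecast.
        forecast = alpha * actual + (1 - alpha) * forecast
    return sse


def best_alpha(series, steps=100):
    """Try alpha = 0.00, 0.01, ..., 1.00 and keep the one with the lowest SSE."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda a: smoothing_sse(series, a))


# Made-up series, for illustration only.
demand = [12.0, 15.0, 14.0, 18.0, 21.0, 19.0, 24.0]

alpha = best_alpha(demand)
print(alpha, smoothing_sse(demand, alpha))
```

In a spreadsheet, the loop corresponds to a column of linked forecast cells, the SSE to a single summary cell, and the grid search to letting Solver change the alpha cell.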

Regression is one of the simplest techniques to use and is the industry standard for analytics, yet it is sometimes passed over in favor of segmentation or decision trees to get even more simplicity.