Google Docs and The Olympics

I did a graphical analysis of Olympic Medal Distributions here

http://spreadsheets.google.com/ccc?key=pS9vSxWuwOlmcp-m7LUaAKA&hl=en

After ranking the countries, and doing logs of their medals, I did the graphs (they are in sheet 2) – here are the findings

1) Log Bronze Medals is like a step function across countries

2)Total Medals is like a Poisson function.

3) Log Total Medals is like a nice Lorenze curve like figure.

4) Google docs charts is not as good as excel , but they are coming close.

5)Google docs charts are no MS silverlight when it comes to showing online graphs.

But they are ready to be published on the web in 5 mins.

6) if Michael Phelps was a country , he would be number 6 or 7.

7) China leads in gold medals (equal to next three US, Russia and UK combined)

8) US leads in total medals.

9) Worst per capita medal tally (non zero) would be India. 3 Medals per billion people.This is the first time India has won more than 1 medal in an Olympics though. We are still very very very happy about it.

10) Sports and Olympics are cool stuff. Despite the country rankings, most countries love sports.

Number 10 is not based on data but on hope. A special thanks to discussions which inspired this .http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0808C&L=sas-l&D=1&O=D&P=42233

Parsing XML files easily

To parse a XML (or KML or PMML) file easily without using any complicated softwares, here is a piece of code that fits right in your excel sheet.

Just import this file using Excel, and then use the function getElement, after pasting the XML code in 1 cell.

It is used  for simply reading the xml/kml code as a text string. Just pasted all the xml code in one cell, and used the start ,end function (for example start=<constraints> and end=</constraints> to get the value of constraints in the xml code).

Simply read into the value in another cell using the getElement function.

heres the code if you ever need it.Just paste it into the VB editor of Excel to create the GetElement function (if not there already) or simply import the file in the link above.

Attribute VB_Name = “Module1”
Public Function getElement(xml As String, start As String, finish As String)
For i = 1 To Len(xml)
If Mid(xml, i, Len(start)) = start Then
For j = i + Len(start) To Len(xml)
If Mid(xml, j, Len(finish)) = finish Then
getElement = Mid(xml, i + Len(start), j – i – Len(start))
Exit Function
End If
Next j
End If
Next i
End Function

FOR Using the R Package for parsing XML …………………………reference this site –

http://www.omegahat.org/RSXML/Overview.html

or this thread from R -Help

> Lines <- ‘
+ <root>
+  <data loc=”1″>
+    <val i=”t1″> 22 </val>
+    <val i=”t2″> 45 </val>
+  </data>
+  <data loc=”2″>
+    <val i=”t1″> 44 </val>
+    <val i=”t2″> 11 </val>
+  </data>
+ </root>
+ ‘
>
> library(XML)
> doc <- xmlTreeParse(Lines, asText = TRUE, trim = TRUE, useInternalNodes = TRUE)
> root <- xmlRoot(doc)
>
> data1 <- getNodeSet(root, “//data”)[[1]]
> xmlValue(getNodeSet(data1, “//val”)[[1]])
[1] ” 22 “

SAS Analytics :Google Earth and Lex Jansen’s Site

Google earth stores values into KML files . These are almost like XMl file formats. The zipped versions of the KML file is the KMZ file .(It beats me why Google Wanted to create a zipped file format for KML ,since most KML files are extremely small).

To do any geo-coding analysis with Google Earth, here are two SAS papers from Lex Jansen’s terrific site.

1) 

Put Your Customers on the Map: Integrating SAS/GRAPH and Google Earth
(http://www2.sas.com/proceedings/forum2008/252-2008.pdf)

Daniel Kuiper, Koen Vyverman (SAS Global Forum, 2008-03)

 

and

2) Using SAS and Google Earth to Access and Display Air Pollution Data

(http://www2.sas.com/proceedings/forum2008/253-2008.pdf)
Joshua Drukenbrod, David Mintz (SAS Global Forum, 2008-03)

 

These two papers are great in the way they use Google Earth for geo coding analysis and visual representation. They however require SAS to be licensed with you.

Lex Jansen ‘s site is generally considered the de facto site to search for analytics especially related to SAS.

 

 

 

 

 

 

 

 

 

 

 

 

Upcoming Book

The great Bob Muenchen, is coming with the very good updated version of the R for SAS and SPSS users book— in September 2008 to help people learn R , if they have used only SAS or SPSS before.We first covered the earlier edition of the book here.The book adds sections on R Commander, Rattle and JGR as well as two chapters on graphics ; one on basic stats. The author runs the examples and walks though them explaining each step, especially where the results differ from SAS & SPSS. Check the new book here-

http://www.amazon.com/SAS-SPSS-Users-Statistics-Computing/dp/0387094172/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1217456813&sr=8-1