Online Education takes off

Udacity is a smaller player but welcome competition to Coursera. I think companies that have on-demand learning programs should consider donating a course to these online education players (like SAS Institute for SAS, Revolution Analytics for R, SAP, Oracle for in-memory analytics etc.).

Any takers?

http://www.udacity.com/

 

Coursera is doing a superb job with a huge number of free courses from notable professors. 111 courses!

I am of course partial to the 7 courses that are related to my field-

https://www.coursera.org/

 

 

New Amazon Instance: High I/O for NoSQL

Latest from the Amazon Cloud-

hi1.4xlarge instances come with eight virtual cores that can deliver 35 EC2 Compute Units (ECUs) of CPU performance, 60.5 GiB of RAM, and 2 TiB of storage capacity across two SSD-based storage volumes. Customers using hi1.4xlarge instances for their applications can expect over 120,000 4 KB random read IOPS, and as many as 85,000 random write IOPS (depending on active LBA span). These instances are available on a 10 Gbps network, with the ability to launch instances into cluster placement groups for low-latency, full-bisection bandwidth networking.

High I/O instances are currently available in three Availability Zones in US East (N. Virginia) and two Availability Zones in EU West (Ireland) regions. Other regions will be supported in the coming months. You can launch hi1.4xlarge instances as On Demand instances starting at $3.10/hour, and purchase them as Reserved Instances.

http://aws.amazon.com/ec2/instance-types/

High I/O Instances

Instances of this family provide very high instance storage I/O performance and are ideally suited for many high performance database workloads. Example applications include NoSQL databases like Cassandra and MongoDB. High I/O instances are backed by Solid State Drives (SSD), and also provide high levels of CPU, memory and network performance.

High I/O Quadruple Extra Large Instance

60.5 GB of memory
35 EC2 Compute Units (8 virtual cores with 4.4 EC2 Compute Units each)
2 SSD-based volumes each with 1024 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
Storage I/O Performance: Very High*
API name: hi1.4xlarge

*Using Linux paravirtual (PV) AMIs, High I/O Quadruple Extra Large instances can deliver more than 120,000 4 KB random read IOPS and between 10,000 and 85,000 4 KB random write IOPS (depending on active logical block addressing span) to applications. For hardware virtual machines (HVM) and Windows AMIs, performance is approximately 90,000 4 KB random read IOPS and between 9,000 and 75,000 4 KB random write IOPS. The maximum sequential throughput on all AMI types (Linux PV, Linux HVM, and Windows) per second is approximately 2 GB read and 1.1 GB write.
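A quick sanity check on the figures quoted above, as a rough Python sketch. It uses only numbers from the Amazon text, and it assumes that "4 KB" means 4096-byte blocks and that a month is 30 days. The function names are my own, not anything from AWS.

```python
# Back-of-the-envelope checks using only the figures quoted above.
KIB = 1024  # assumption: "4 KB" is read as 4096-byte blocks


def iops_to_mib_per_s(iops, block_bytes=4 * KIB):
    """Convert an IOPS figure at a fixed block size into MiB per second."""
    return iops * block_bytes / (KIB * KIB)


def on_demand_cost(hourly_usd, hours):
    """Total On Demand cost in USD for a given number of hours."""
    return hourly_usd * hours


total_ecu = 8 * 4.4                        # 8 virtual cores at 4.4 ECU each
read_mib_s = iops_to_mib_per_s(120_000)    # Linux PV random read figure
write_low = iops_to_mib_per_s(10_000)      # low end of random write range
write_high = iops_to_mib_per_s(85_000)     # high end of random write range
month_cost = on_demand_cost(3.10, 24 * 30)  # assumption: 30-day month

print(f"Total ECU: {total_ecu:.1f}")
print(f"Random read: {read_mib_s:.2f} MiB/s")
print(f"Random write: {write_low:.2f} to {write_high:.2f} MiB/s")
print(f"On Demand, 30 days: ${month_cost:,.2f}")
```

So 120,000 random reads of 4 KB each works out to roughly 469 MiB/s, well under the approximately 2 GB/s sequential read figure, and running one instance On Demand around the clock comes to about $2,232 for a 30-day month.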

SAS and Hadoop

Awesomely informative post in sascom magazine (whose editor I have interviewed before here at http://www.decisionstats.com/interview-alison-bolen-sas-com/ )

Great piece by Michael Ames, SAS Data Integration Product Manager.

http://www.sas.com/news/sascom/hadoop-tips.html

 

Also see SAS’s big data thingys here at

http://www.sas.com/software/high-performance-analytics/in-memory-analytics/index.html

Solutions and Capabilities Using SAS® In-Memory Analytics

  • High-Performance Analytics – Get near-real-time insights with appliance-ready analytics software designed to tackle big data and complex problems.
  • High-Performance Risk – Faster, better risk management decisions based on the most up-to-date views of your overall risk exposure.
  • High-Performance Liquidity Risk Management – Take quick, decisive actions to secure adequate funding, especially in times of volatility.
  • High-Performance Stress Testing – Make faster, more precise decisions to protect the health of the firm.
  • Visual Analytics – Explore big data using in-memory capabilities to better understand all of your data, discover new patterns and publish reports to the Web and iPad®.

(Ajay- I liked the Visual Analytics piece especially for Big Data)

Note-

 

Who made Who in #Rstats

While Bob M, my old mentor and fellow TN man, maintains the website http://r4stats.com/ on how popular R is across various forums, I am interested in who within the R community of 3 million (give or take a few) is contributing more. I am very sure that by 2014 we can have a new fork of R called Hadley R, in which all packages would be made by Hadley Wickham and you won't need anything else.

But jokes apart, since I didn't have the time to

1) scrape CRAN for all package authors

2) scrape for lines of code across all packages

3) allocate lines of code (itself a dubious software productivity metric) to various authors of R packages-

OR

1) scrape the entire R help list, or at least 2011's

2) determine who is the most frequent R question and answer user (a la SAS-L's annual MVP and rookie of the year awards)

I did the following to at least see who is talking about R across easily scrapable Q and A websites.

Stack Overflow still rules over all.

http://stackoverflow.com/tags/r/topusers shows the statistics on who made whom in R on Stack Overflow

All in all, initial ardour seems to have slowed for #Rstats on Stack Overflow? Or is it just summer?

No, the answer (credit to Rob J Hyndman) is that most(?) activity is shifting to Stats Exchange

http://stats.stackexchange.com/tags/r/topusers


You could also paste this in Notepad and make some graphs on Average Score / Answer, or even make a social network graph if you had the time.
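For the Average Score / Answer idea, a minimal Python sketch might look like the following. It assumes the pasted text has one user per line as name, score and number of answers separated by tabs; that layout, the function name and the sample rows are all made up for illustration and are not real Stack Overflow figures.

```python
import csv
import io


def average_scores(pasted_text):
    """Return (user, score per answer) pairs, highest average first.

    Assumed layout (hypothetical): name<TAB>score<TAB>answers, one per line.
    """
    rows = csv.reader(io.StringIO(pasted_text), delimiter="\t")
    results = []
    for row in rows:
        if len(row) != 3:
            continue  # skip blank or malformed lines
        name, score, answers = row[0].strip(), int(row[1]), int(row[2])
        if answers == 0:
            continue  # avoid dividing by zero for users with no answers
        results.append((name, score / answers))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Made-up rows for illustration only, not real figures.
SAMPLE = "user_a\t1200\t300\nuser_b\t900\t150\nuser_c\t50\t0\n"

for name, avg in average_scores(SAMPLE):
    print(f"{name}: {avg:.2f} points per answer")
```

The sorted list can then go straight into whatever plotting tool you prefer.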

Do NOT (Go/Bi) search for Stack Overflow API or web scraping stack overflow- it gives you all the answers on the website but 0 answers on how to scrape these websites.
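For what it is worth, the Stack Exchange API does, as far as I know, expose a route for the top answerers in a tag, which avoids scraping the HTML pages at all. The sketch below is a hedged illustration rather than a tested recipe: the API version (2.3), the route, the `site` values and the response field names are my understanding of the public API and should be checked against its documentation. The helper names are my own.

```python
import gzip
import json
import urllib.parse
import urllib.request

# Assumption: version 2.3 of the public Stack Exchange API.
API_ROOT = "https://api.stackexchange.com/2.3"


def top_answerers_url(tag, site, period="all_time"):
    """Build the URL for the top answerers in one tag on one site."""
    path = f"/tags/{urllib.parse.quote(tag)}/top-answerers/{period}"
    return f"{API_ROOT}{path}?{urllib.parse.urlencode({'site': site})}"


def parse_top_answerers(payload):
    """Extract (display name, score, answer count) from a decoded response.

    Assumes each item carries 'user', 'score' and 'post_count' fields.
    """
    return [
        (item["user"]["display_name"], item["score"], item["post_count"])
        for item in payload.get("items", [])
    ]


def fetch_top_answerers(tag, site):
    """Network call; responses are usually compressed, so decompress if so."""
    with urllib.request.urlopen(top_answerers_url(tag, site)) as resp:
        raw = resp.read()
        if resp.headers.get("Content-Encoding") == "gzip":
            raw = gzip.decompress(raw)
    return parse_top_answerers(json.loads(raw))


if __name__ == "__main__":
    # Only prints the URLs; call fetch_top_answerers() to hit the network.
    print(top_answerers_url("r", "stackoverflow"))
    print(top_answerers_url("r", "stats"))
```

Calling `fetch_top_answerers("r", "stackoverflow")` should then give rows comparable to the top users page linked above, subject to the API's rate limits.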

I have added a new website called Meta Optimize to this list based on Tal G's interview of Joseph Turian, at http://www.r-statistics.com/2010/07/statistical-analysis-qa-website-did-stackoverflow-just-lose-it-to-metaoptimize-and-is-it-good-or-bad/

http://metaoptimize.com/qa/tags/r/?sort=hottest

There are only 17 questions tagged R but it seems a lot of views are being generated.

I also decided to add views from Quora since it is a Q and A site (and one which I really like)

http://www.quora.com/R-software

Again, very few questions but a lot of followers.

The economics of software piracy

Software piracy exists because-

1) Lack of appropriate technological controls (like those on DVDs) or on BitTorrent (an innovation on the centralized server like Napster) or on streaming etc.

Technology to share content has evolved at a much higher pace than technology to restrict content from being shared or limited to purchasers.

2) Huge difference in purchasing power across the globe.

An iTunes song at 99 cents might be an okay buy in the USA, but in Asia it is very expensive. Maybe if content creators used Purchasing Power Parity to price their goods, it might make a dent.

3) State sponsored intellectual theft as another form of economic warfare- this has been going on since the West stole gunpowder and silk from the Chinese, and Intel decided to win back the IP rights to the microprocessor (from the Japanese client)

4) Lack of consensus among policy makers across the globe on who gets hurt from IP theft, but complete consensus among young people across the globe that they are doing the right thing by downloading stuff for free.

5) There is no such thing as a free lunch. Sometimes software (and movie and song) piracy helps create demand across ignored markets - I always think the NFL can be huge in India if they market it. Sometimes it forces artists to commit suicide because they give up on the life of a starving musician.

Mostly piracy has helped break profits of intermediaries between the actual creator and actual consumer.
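To make the Purchasing Power Parity idea from point 2 concrete, here is one possible reading of it as a small Python sketch: scale the US price by the ratio of the PPP conversion rate to the market exchange rate. The formula choice, the function name and the rates used are assumptions for illustration; the rates are invented, not real data for any country.

```python
def ppp_adjusted_price_usd(us_price_usd, ppp_factor, market_rate):
    """US-dollar price after scaling by the local price level.

    ppp_factor:  local currency units per international dollar (PPP rate)
    market_rate: local currency units per US dollar (market exchange rate)
    """
    if ppp_factor <= 0 or market_rate <= 0:
        raise ValueError("rates must be positive")
    return us_price_usd * ppp_factor / market_rate


# Hypothetical rates for illustration only, not real figures.
price = ppp_adjusted_price_usd(0.99, ppp_factor=20.0, market_rate=50.0)
print(f"PPP-adjusted price: ${price:.2f}")
```

With these made-up rates the 99 cent song would sell for about 40 cents in the lower-income market.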

So how to solve software piracy, assuming it is something that can be solved-

I don't know, but I do care.

I give most of my writings as CC-by-SA and that includes my poems. People (friends and family) sometimes pay me not to sing.

Pirates have existed and will exist as long as civilized men romanticize the notion of piracy and bicker between themselves for narrow gains.

  1. Ephesians 4:28 Let the thief no longer steal, but rather let him labor, doing honest work with his own hands, so that he may have something to share with anyone in need.
  2. A clean confession, combined with a promise never to commit the sin again, when offered before one who has the right to receive it, is the purest type of repentance.-Gandhi
  3. If you steal, I will wash your mouth with soap- Anonymous Mother.
  4. You shall not steal- Moses
  5. Steal may refer to: Theft, the illegal taking of another person’s property without that person’s freely-given consent; The gaining of a stolen base in baseball;