People sceptical of Facebook's analytical value should take a look at its nicely embedded analytics, which rival Google Analytics for websites and in some ways go even further. They have recently been updated as well.
They are right there under the Insights button on the left margin of your Facebook Page.
From the press release below: the maker of MapReduce-based BI software has raised $30 million in Series C funding. Given the valuation IBM recently placed on Netezza, Aster Data seems set to cross a billion-dollar valuation within the next 18-24 months, IMO.
Aster Data Closes $30 Million Series C Financing
Explosive Growth and Market Leadership Attracts New and Existing Investors
San Carlos, CA – September 22, 2010 – Aster Data, a market leader in big data management and advanced analytics, today announced that it has closed a $30 million Series C round of financing led by both new and existing investors. The company will use the new funding to accelerate growth, scale operations, and expand its global market share in the $20 billion database market – a market that is experiencing rapid growth as a result of both the explosion in data volumes across organizations and the urgent need to deliver a new class of analytics and data-driven applications. The Series C round of funding includes previous investors Sequoia Capital, JAFCO Ventures, Institutional Venture Partners, Cambrian Ventures, as well as an additional new strategic investor. Also investing in this round is early investor David Cheriton, who previously backed high-growth companies including Google and VMware, and co-founded several successful technology companies.
Today’s Series C funding announcement underscores a year of strong innovation, execution, and overall momentum for the analytic database company. Key milestones include:
Strong sales growth: Since 2008, Aster Data has doubled revenue year-over-year and secured key customers that leverage Aster Data’s platform to address the big data management problem, including MySpace, comScore, Barnes & Noble, and Akamai. Like so many organizations today, Aster Data’s customers are experiencing explosive data growth across their organizations and recognize the need for rich, advanced analytics that give them deeper insights from their data.
Key executive hires: Quentin Gallivan, former CEO of both PivotLink and Postini and EVP of worldwide sales at Verisign, recently joined the company as Chief Executive Officer. In addition, earlier this year, John Calonico, previously at Interwoven, BEA, and Autodesk, joined as Chief Financial Officer; and Nitin Donde, formerly an executive at EMC and 3PAR, joined as Executive Vice President Engineering. The strength and experience of Aster Data’s management team helps further establish a strong operational foundation for growth in 2010 and beyond.
Industry recognition: Aster Data was positioned in the “Visionaries” Quadrant of Gartner, Inc.’s Data Warehouse Database Management Systems Magic Quadrant, published in 2010*; was recently named a 2011 Tech Pioneer by the World Economic Forum; was named a “Company to Watch” in the Information Management category of TechWeb’s Intelligent Enterprise 2010 Editors’ Choice Awards; and was awarded the 2010 San Francisco Business Times Technology and Innovation Award in the Best Product and Services category.
Product Innovation: Aster Data continues to deliver ground-breaking capabilities to address the big data management and advanced analytics market need. Its recent announcement of Aster Data nCluster 4.6 includes a column data store, making it the first hybrid row and column MPP DBMS with a unified SQL and MapReduce analytic framework for advanced analytics on large data sets. This year, Aster Data also delivered the most extensive library of pre-packaged MapReduce analytics, totaling over 1,000 functions, to ease and accelerate delivery of highly advanced analytic applications.
Aster Data’s analytic database, also called a ‘Data-Analytics Server’, is specifically designed to enable organizations to cost-effectively store and analyze massive volumes of data. Aster Data leverages the power of commodity, general-purpose hardware to reduce the cost of scaling to support large data volumes, and uniquely allows analysis of all data ‘in-database’, enabling richer and faster processing of large data sets. Aster Data’s in-database analytics engine uses the power of MapReduce, a parallel processing framework created by Google.
“The funding we received in our Series C round is a strong endorsement of Aster Data’s market leadership position and the high growth potential of the big data market,” said Quentin Gallivan, Chief Executive Officer, Aster Data. “The Aster Data team has executed exceptionally well to-date and I am excited to have the resources to accelerate the growth of the company as we expand our operations and execute aggressively across all fronts.”
Here is a comparison of Windows Azure compute instances and Amazon EC2 compute instances.
Compute Instance Sizes:
Developers can choose the size of the VMs that run their application based on the application's resource requirements. Windows Azure compute instances come in four sizes to support complex applications and workloads.
Compute Instance Size | CPU | Memory | Instance Storage | I/O Performance
Small | 1.6 GHz | 1.75 GB | 225 GB | Moderate
Medium | 2 x 1.6 GHz | 3.5 GB | 490 GB | High
Large | 4 x 1.6 GHz | 7 GB | 1,000 GB | High
Extra large | 8 x 1.6 GHz | 14 GB | 2,040 GB | High
Standard Rates:
Windows Azure
Compute
Small instance (default): $0.12 per hour
Medium instance: $0.24 per hour
Large instance: $0.48 per hour
Extra large instance: $0.96 per hour
Storage
$0.15 per GB stored per month
$0.01 per 10,000 storage transactions
Content Delivery Network (CDN)
$0.15 per GB for data transfers from European and North American locations*
$0.20 per GB for data transfers from other locations*
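As a rough illustration of how these rates add up, here is a back-of-the-envelope R sketch; the rates are the ones listed above, while the usage figures (hours, gigabytes stored, transaction count) are invented purely for illustration.

```r
# Hypothetical monthly bill for one Small Windows Azure instance running
# around the clock, with 100 GB of storage and 5 million storage
# transactions. Rates are from the price list above; usage is made up.
hours_per_month <- 24 * 30                 # assume a 30-day month
compute      <- 0.12 * hours_per_month     # Small instance at $0.12 per hour
storage      <- 0.15 * 100                 # $0.15 per GB stored per month
transactions <- 0.01 * (5e6 / 1e4)         # $0.01 per 10,000 transactions

round(c(compute = compute, storage = storage,
        transactions = transactions,
        total = compute + storage + transactions), 2)
# compute 86.40, storage 15.00, transactions 5.00, total 106.40
```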
Amazon EC2 Instance Types
Standard Instances
Instances of this family are well suited for most applications.
Small Instance – default*
1.7 GB memory
1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
160 GB instance storage (150 GB plus 10 GB root partition)
32-bit platform
I/O Performance: Moderate
API name: m1.small
Large Instance
7.5 GB memory
4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
850 GB instance storage (2×420 GB plus 10 GB root partition)
64-bit platform
I/O Performance: High
API name: m1.large
Extra Large Instance
15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage (4×420 GB plus 10 GB root partition)
64-bit platform
I/O Performance: High
API name: m1.xlarge
Micro Instances
Instances of this family provide a small amount of consistent CPU resources and allow you to burst CPU capacity when additional cycles are available. They are well suited for lower-throughput applications and websites that consume significant compute cycles periodically.
Micro Instance
613 MB memory
Up to 2 EC2 Compute Units (for short periodic bursts)
EBS storage only
32-bit or 64-bit platform
I/O Performance: Low
API name: t1.micro
High-Memory Instances
Instances of this family offer large memory sizes for high throughput applications, including database and memory caching applications.
High-Memory Extra Large Instance
17.1 GB of memory
6.5 EC2 Compute Units (2 virtual cores with 3.25 EC2 Compute Units each)
420 GB of instance storage
64-bit platform
I/O Performance: Moderate
API name: m2.xlarge
High-Memory Double Extra Large Instance
34.2 GB of memory
13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
850 GB of instance storage
64-bit platform
I/O Performance: High
API name: m2.2xlarge
High-Memory Quadruple Extra Large Instance
68.4 GB of memory
26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High
API name: m2.4xlarge
High-CPU Instances
Instances of this family have proportionally more CPU resources than memory (RAM) and are well suited for compute-intensive applications.
High-CPU Medium Instance
1.7 GB of memory
5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each)
350 GB of instance storage
32-bit platform
I/O Performance: Moderate
API name: c1.medium
High-CPU Extra Large Instance
7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High
API name: c1.xlarge
Cluster Compute Instances
Instances of this family provide proportionally high CPU resources with increased network performance and are well suited for High Performance Compute (HPC) applications and other demanding network-bound applications.
Cluster Compute Quadruple Extra Large Instance
23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc1.4xlarge
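To make the comparison easier to eyeball, here is a small R sketch that puts the published memory and instance storage figures for the Azure sizes next to the closest EC2 standard instances; the pairing of sizes is my own rough mapping for illustration, not an official equivalence (there is no standard EC2 instance that lines up with Azure's Medium size).

```r
# Azure compute sizes and roughly comparable EC2 standard instances,
# using the memory (GB) and instance storage (GB) figures quoted above.
azure <- data.frame(
  azure_size = c("Small", "Medium", "Large", "Extra large"),
  memory_gb  = c(1.75, 3.5, 7, 14),
  storage_gb = c(225, 490, 1000, 2040)
)
ec2 <- data.frame(
  ec2_size       = c("m1.small", NA, "m1.large", "m1.xlarge"),
  ec2_memory_gb  = c(1.7, NA, 7.5, 15),
  ec2_storage_gb = c(160, NA, 850, 1690)
)
cbind(azure, ec2)
```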
Revolution News
Every month, we’ll bring you the latest news about Revolution’s products and events in this section. Follow us on Twitter at @RevolutionR for up-to-the-minute news and updates from Revolution Analytics!
Revolution R Enterprise 4.0 for Windows now available. Based on the latest R 2.11.1 and including the RevoScaleR package for big-data analysis in R, Revolution R Enterprise is now available for download for Windows 32-bit and 64-bit systems. Click here to subscribe; it is also available free to academia.
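For readers who have not tried RevoScaleR, here is a minimal sketch of the kind of workflow it targets: convert a large CSV file to the on-disk XDF format and then fit a model on it chunk by chunk. The function names and arguments (rxImport, rxLinMod) reflect my reading of the documentation and may differ slightly between releases; the file paths and variable names are hypothetical.

```r
# A minimal sketch, assuming the RevoScaleR package shipped with
# Revolution R Enterprise; exact function names/arguments may vary by
# release, and the file paths and variable names are hypothetical.
library(RevoScaleR)

# Convert a large CSV into the compressed, chunked XDF format on disk
rxImport(inData = "flights.csv", outFile = "flights.xdf", overwrite = TRUE)

# Fit a linear model on the XDF file without loading it all into memory
fit <- rxLinMod(ArrDelay ~ DayOfWeek + DepDelay, data = "flights.xdf")
summary(fit)
```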
New! Integrate R with web applications, BI dashboards and more with web services. RevoDeployR is a new Web Services framework that integrates dynamic R-based computations into applications for business users. It will be available September 30 with Revolution R Enterprise Server on RHEL 5. Click here to learn more.
Inside-R: A new site for the R Community. At www.inside-R.org you’ll find the latest information about R from around the Web, searchable R documentation and packages, hints and tips about R, and more. You can even add a “Download R” badge to your own web-page to help spread the word about R.
R News, Tips and Tricks from the Revolutions blog
The Revolutions blog brings you daily news and tips about R, statistics and open source. Here are some highlights from Revolutions from the past month.
R’s key role in the oil spill response: Read how NIST’s Division Chief of Statistical Engineering used R to provide critical analysis in real time to the Secretaries of Energy and the Interior, and helped coordinate the government’s response.
Animating data with R and Google Earth: Learn how to use R to create animated visualizations of geographical data with Google Earth, such as this video showing how tuna migrations intersect with the location of the Gulf oil spill.
Are baseball games getting longer? Or is it just Red Sox games? Ryan Elmore uses nonparametric regression in R to find out.
Keynote presentations from useR! 2010: the worldwide R users' conference was a great success, and there's a wealth of useful tips and information in the presentations. Videos of the keynote presentations are available too: check out in particular Frank Harrell's talk Information Allergy and Friedrich Leisch's talk on reproducible statistical research.
Looking for more R tips and tricks? Check out the monthly round-ups at the Revolutions blog.
Upcoming Events
Every month, we'll highlight some upcoming events from the R Community Calendar.
September 23: The San Diego R User Group has a meetup on BioConductor and microarray data analysis.
September 28: The Sydney Users of R Forum has a meetup on building world-class predictive models in R (with dinner to follow).
September 28: The Los Angeles R User Group presents an introduction to statistical finance with R.
September 28: The Seattle R User Group meets to discuss, “What are you doing with R?”
His argument about love is not very original, though; it was first made by these four guys.
I am going to argue that "some" R developers should be paid, while the main focus should remain on volunteer code. These developers should be paid according to the usage of their packages.
Let me expand.
Imagine the following conversation between Ross Ihaka, Norman Nie and Peter Dalgaard.
Norman- Hey Guys, Can you give me some code- I got this new startup.
Ross Ihaka and Peter Dalgaard- Sure dude. Here is 100,000 lines of code, 2000 packages and 2 decades of effort.
Norman- Thanks guys.
Ross Ihaka- Hey, what are you gonna do with this code?
Norman- I will better it. Sell it. Finally beat Jim Goodnight and his **** Proc GLM and **** Proc Reg.
Ross- Okay, but what will you give us? Will you give us some code back of what you improve?
Norman – Uh, let me explain this open core …
Peter D- Well how about some royalty?
Norman- Sure, we will throw parties at all conferences, snacks you know at user groups.
Ross – Hmm. That does not sound fair. (walks away in a huff, muttering) He takes our code, sells it, and won't share the code back.
Peter D- Doesn't sound fair. I am back to reading Hamlet, the great Dane, and writing the next edition of my book. I am glad I wrote a book; Ross didn't even write that.
Norman- Uh oh. (picks up his phone) Hey David Smith, we need to write some blog articles pronto… these open source guys, man…
———–
I think that sums up what has been going on in the dynamics of R recently. If Ross Ihaka and Robert Gentleman had adopted an open-core strategy, meaning you could create packages for R but not share changes to the original, where would we all be?
At this point, if he is reading this, David Smith, long-suffering veteran of open source flameouts, is rolling his eyes, while Tal G is wondering whether he will publish this on R-Bloggers, and if so, when.
Let's bring in another R veteran: Hadley Wickham, who wrote a book on R and also created ggplot, one of the best-quality and most widely used graphics packages.
In terms of economic utility to the end user, the ggplot package may be as useful as, if not more useful than, the foreach package developed by Revolution Computing/Analytics.
However, let's come to open-core licensing (read about it here: http://alampitt.typepad.com/lampitt_or_leave_it/2008/08/open-core-licen.html), which is where the debate is. Revolution takes the code and enhances it, in my opinion substantially, with the new XDF format for better efficiency, a web services API, and, coming next year, a GUI (thanks in advance, Dr Nie and team), and sells this enhanced R to businesses happy to pay (they are currently paying much more to Dr Goodnight and HIS guys).
Why would any sane customer buy it from Revolution if they could download exactly the same thing from http://r-project.org?
Hence the business need for Revolution Analytics to have an enhanced R: they are using a product-based software model, not a software-as-a-service model.
If Revolution gives the source code of these enhancements back to the R Core team, how will the R Core team protect the above-mentioned intellectual property, given that they have two decades of experience in simply giving code away and exchanging it freely?
Now, Revolution also has a marketing budget, and that's how they sponsor some R Core events, conferences, and after-conference snacks.
How would people decide whether they are being too generous or too stingy in their contribution (compared to the formidable generosity of the SAS Institute to its employees, stakeholders, and even third-party analysts)?
Would it not be better if Revolution shifted that aspect of the relationship from its marketing budget to its research and development budget, and came up with some sort of incentive for "SOME" developers? Even researchers need grants, assistantships, and scholarships. Revolution could set up a transparent royalty formula, say 17.5% of NEW R sales going into an R package developers' pool, which in turn would weigh usage rates of packages and need/merit before allocating the money. That would require Revolution to evolve from a startup into a more sophisticated corporation, and R Core could administer it much the same way as the John M Chambers software award/scholarship.
Don't pay all developers; it would be an insult for many of them, say Prof Harrell, creator of Hmisc, to accept payment. But Revolution could expand its developer base (and prospect for future employees) by sponsoring some R scholarships.
And I am sure that if Revolution opens up some more of its code to the community, it would find the rest of the world and its help useful. If it can't trust people like Robert Gentleman with some source code, well, he is a board member.
——————————————————————————————–
Now, to sum up some technical discussions on the new R:
1) An accepted way of benchmarking efficiencies.
2) Code review and incorporation of efficiencies.
3) Multi-threading and multi-core usage are trends to be incorporated (see the sketch after this list).
4) GUIs such as R Commander (and its plug-ins for other packages), Rattle for data mining, and Deducer need focused development. This may involve hiring user interface designers (like from Apple 😉) who will work for love AND money (even the Beatles charge royalties for that song).
5) More support for cloud computing initiatives like Biocep and Elastic-R, or Amazon AMIs for using cloud computers. Note that efficiency arguments don't matter much if you just use a Chrome browser and pay 2 cents an hour for an Amazon instance. R Core probably needs more direct involvement from Google (cloud OS makers) and Amazon, as well as even Salesforce.com (for creating Force.com apps). Even more corporates need to be involved here, as cloud computing does not have any free and open source infrastructure (yet).
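On point 3, multi-core use is already within reach through contributed packages. Here is a minimal sketch using the foreach and doMC packages on a Unix-like system (on Windows a different backend such as doSNOW would be needed); the bootstrap task and core count are arbitrary examples, and the benchmarking in point 1 can start with nothing fancier than system.time.

```r
# Minimal multi-core sketch with foreach + doMC (Unix-like systems only).
# The bootstrap example and the number of cores are arbitrary.
library(foreach)
library(doMC)
registerDoMC(cores = 4)

# Time a simple parallel bootstrap of a regression coefficient
system.time({
  boot_coefs <- foreach(i = 1:1000, .combine = c) %dopar% {
    idx <- sample(nrow(mtcars), replace = TRUE)
    coef(lm(mpg ~ wt, data = mtcars[idx, ]))["wt"]
  }
})
quantile(boot_coefs, c(0.025, 0.975))   # rough 95% interval for the slope
```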
“If something goes wrong with Microsoft, I can phone Microsoft up and have it fixed. With Open Source, I have to rely on the community.”
And the community, as much as we may love it, is unpredictable. It might care about your problem and want to fix it, then again, it may not. Anyone who has ever witnessed something online go “viral”, good or bad, will know what I’m talking about.
I had a Skype video chat with Karim Chine, and he was kind enough to walk me through the new Elastic-R portal at http://www.elastic-r.org
Basically, it lets multiple users collaborate on Excel documents as well as R projects.
Here are some screenshots, in a short presentation I made from the notes I took during K. Chine's walkthrough.
Also, Revolution Analytics is coming out with a Web Services product for R
RevoDeployR: Web Services for R
Both are very powerful uses of R for cloud computing, and it would be interesting if the original cloud computing champion, Google, got into the R Project as well.