SAS Data Loader for Hadoop is now a 90-day free trial

From-

http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms/cdh-5-3-x.html

 

SAS Data Loader for Hadoop eliminates the complexities of writing MapReduce code, with a simple, point-and-click interface that empowers business analysts to prepare, integrate and cleanse big data faster and easier than ever. In addition, data scientists and programmers can run SAS code on Hadoop in parallel for better performance and greater productivity.

 


Get Started

  1. Download and install Cloudera QuickStart VM for CDH 5.3.x.
  2. Download and install either VMware Player 6.0 or later (for Windows) or VMware Fusion 6.0 for OS X (for Mac).
  3. Download and install your 90-day free trial of SAS Data Loader for Hadoop.

And from-

http://www.sas.com/en_us/software/data-management/data-loader-hadoop.html

 

Polyglots for Data Science #python #sas #r #stats #spss #matlab #julia #octave

In the future, I think analysts will need to be polyglots: you will need to know more than one language for crunching data.

SAS, Python, R, Julia, SPSS, Matlab- Pick Any Two 😉 or Any Three.

No, you can’t count C or Java as a statistical language 🙂 🙂

Efforts to promote Polyglots in Statistical Software are-

1) R for SAS and SPSS Users (free or book)

2) R for Stata Users (book)

3) SAS and R (blog and book)

4) Using Python and R together

We probably need a Python and R for Data Analysis book, just as we have SAS and R books.

5) Matlab and R

Reference (http://mathesaurus.sourceforge.net/matlab-python-xref.pdf ) includes Python

6) Octave and R

package http://cran.r-project.org/web/packages/RcppOctave/vignettes/RcppOctave.pdf includes Matlab

reference http://cran.r-project.org/doc/contrib/R-and-octave.txt

7) Julia and Python

  • PyPlot uses the Julia PyCall package to call Python’s matplotlib directly from Julia

8) SPSS and Python is here

9) SPSS and R is as below

  • The Essentials for R for Statistics versions 22, 21, 20, and 19 are available here.
  • This link will take you to the SourceForge site where the Version 18 Essentials and Plugins are hosted.

10) Using R from Clojure – Incanter

Use embedded R from Clojure and Incanter http://github.com/jolby/rincanter

How to help your government keep the world safe using statistics #rstats #python #sas

Big Data for Big Brother. Now playing. At a computer near you. How to help water the tree of liberty using statistics?

Use R

 

or

Use Python

 


or use SAS software

SAS/CIA from the last paragraph of

ET_CD_Mumbai_Jul12.pdf


 

Top five ways to do business unethically in India

Over a decade-long career, I have often been reminded of a saying from erstwhile mentors in a long-forgotten consulting email group: it is not WHAT you KNOW, it is WHO you KNOW. The power of WHO you KNOW can defeat even what you know, have learnt or worked hard at. Accordingly, these are some wry observations on how businesses sometimes take shortcuts in India, and the whys and wherefores.

1) Regulatory Arbitrage due to Lack of Regulatory Oversight- This is especially true of labor practices. It includes under-paying Caucasians and non-Indians for internships or jobs (in the name of sponsoring the work visa). India is an extremely inexpensive place to stay in, but it is sometimes unfriendly (in terms of laws, not people) to people visiting from the West. This ranges from amusing things, like non-Indian visitors paying 10 times the price to enter the Taj Mahal, to not-so-funny things, like paying them lower salaries because they need a reason to stay on. Unfortunately, underpaying aliens happens in many countries, but it is much better regulated in the West.

2) Stealing Intellectual Property- I have often known people to take presentations and even Excel macros from the place they were working to their new place. Almost no one gets prosecuted for intellectual property theft (unless you are caught with 10,000 pirated music or film CDs).

3) Using Pirated Software- Lack of awareness of FOSS means many SMEs take shortcuts, including downloading software from Pirate Bay and using it to work for clients in the West. Example- This could be as simple as downloading SAS software from the Internet, or using WPS software for training and misrepresenting SAS Institute’s name (there is added confusion because SAS is a software product, a company and a language). There are other major companies who suffer from this too, notably Microsoft.

This could be as complex as using academic versions of enterprise software for business purposes. In each case, because of the geography, the legal risk is quite low and the returns from pirated software quite high. It also helps the unethical vendor quote lower prices than the one who is doing it straight.

One way to avoid this is to ask your vendor to show you proof of how many legal software licences it holds. This can also help in cutting down vendors’ exaggerated bench strength claims, as sometimes businesses hire many people and then put them on internal projects.

4) Illegal Trade Practices- This includes making employees sign a 1-year bond not to leave the company after they have visited the West for company work, in the name of training. This also includes abusing the loopholes in various types of visa.

5) Ignoring signed contracts and negotiating to lower prices at every step illegally, in collusion with other vendors (there is no effective anti-trust act), and exploiting the completely inadequate and lengthy process of filing court cases in India.
Almost every non-Indian client I know pays on time; almost every Indian client I know needs reminders. This is more of a mindset problem, given the reluctance to file lawsuits in India because of slow progress in the courts (India has 1.2 billion people, and per capita access to judges and lawyers is quite low). The buzzword is- How much can we settle this for? Let’s do a settlement!

In the long run, this is choking off the growth and potential of SMEs in India. In a continuing series, I will help non-Indian users with ways to use technology for legal remedies in India for intellectual property, along with known case studies and examples.

Top 10 Regrets on Learning the SAS Language

  1. I didn’t learn the SAS Macro Language enough. SAS Macros are cool, and fast. Ditto for arrays, or ODS.
  2. Not keeping up with the changes in Version 9+. Especially the hash method. (Why name a technique after a recreational drug, most unfair)
  3. Not studying more statistics theory.
  4. Flunking SAS Certification Twice.
  5. Not making enough money because customers need a solution not a p value.
  6. There is no Proc common sense. There is no Proc Clean the Data.
  7. No Macros to automate the model. Here is dirty data. There is clean model. Wait till version 16.
  8. Not getting selected by SAS R & D. Not applying to SAS R & D.
  9. Google has better voice recognition for typing notes. No Voice Recognition in SAS language to type syntax.
  10. Enhanced Editor and EG are both idiotic junk pushed by Marketing!

Inspired by true events at

http://www.sascommunity.org/wiki/Category:Bricolage

Interview Anne Milley JMP

Here is an interview with Anne Milley, Sr Director, Analytic Strategy, JMP.

Ajay- Review – How was the year 2012 for Analytics in general and JMP in particular?
Anne- 2012 was great!  Growing interest in analytics is evident—more analytics books, blogs, LinkedIn groups, conferences, training, capability, integration….  JMP had another good year of worldwide double-digit growth.

Ajay-  Forecast- What is your forecast for analytics in terms of top 5 paradigms for 2013?
Anne- In an earlier blog, I had predicted we will continue to see more lively data and information visualizations—by that I mean more interactive and dynamic graphics for both data analysts and information consumers.
We will continue to hear about big data, data science and other trendy terms. As we amass more and more data histories, we can expect to see more innovations in time series visualization. I am excited by the growing interest we see in spatial and image analysis/visualization and hope those trends continue—especially more objective, data-driven image analysis in medicine! Perhaps not a forecast, but a strong desire, to see more people realize and benefit from the power of experimental design. We are pleased that more companies—most recently SiSoft—have integrated with JMP to make DOE a more seamless part of the design engineer’s workflow.

Ajay- Cloud- Cloud Computing seems to be the next computing generation. What are JMP’s plans for cloud computing?
Anne- With so much memory and compute power on the desktop, there is still plenty of action on PCs. That said, JMP is Citrix-certified and we do see interest in remote desktop virtualization, but we don’t support public clouds.

Ajay- Events- What are your plans for the International Year of Statistics at JMP?
Anne- We kicked off our Analytically Speaking webcast series this year with John Sall in recognition of the first-ever International Year of Statistics. We have a series of blog posts on our International Year of Statistics site that features a noteworthy statistician each month, and in keeping with the goals of Statistics2013, we are happy to:

  • increase awareness of statistics and why it’s essential,
  • encourage people to consider it as a profession and/or enhance their skills with more statistical knowledge, and
  • promote innovation in the sciences of probability and statistics.

Both JMP and SAS are doing a variety of other things to help celebrate statistics all year long!

Ajay- Education Training-  How does JMP plan to leverage the MOOC paradigm (massive open online course) as offered by providers like Coursera etc.?
Anne- Thanks to you for posting this to the JMP Professional Network  on LinkedIn, where there is some great discussion on this topic.  The MOOC concept is wonderful—offering people the ability to invest in themselves, enhance their understanding on such a wide variety of topics, improve their communities….  Since more and more professors are teaching with JMP, it would be great to see courses on various areas of statistics (especially since this is the International Year of Statistics!) using JMP. JMP strives to remove complexity and drudgery from the analysis process so the analyst can stay in flow and focus on solving the problem at hand. For instance, the one-click bootstrap is a great example of something that should be promoted in an intro stats class. Imagine getting to appreciate the applied results and see the effects of sampling variability without having to know distribution theory. It’s good that people have options to enhance their skills—people can download a 30-day free trial of JMP and browse our learning library as well.

Ajay- Product- What are some of the exciting things JMP users and fans can look forward to in the next releases this year?
Anne- There are a number of enhancements and new capabilities planned for new releases of the JMP family of products, but you will have to wait to hear details…. OK, I’ll share a few!  JMP Clinical 4.1 will have more sophisticated fraud detection. We are also excited about releasing version 11 of JMP and JMP Pro this September.  JMP’s DOE capability is well-known, and we are pleased to offer a brand new class of experimental design—definitive screening designs. This innovation has already been recognized with The 2012 Statistics in Chemistry Award to Scott Allen of Novomer in collaboration with Bradley Jones in the JMP division of SAS. You will hear more about the new releases of JMP and JMP Pro at  Discovery Summit in San Antonio—we are excited to have Nate Silver as our headliner!

About-

Anne Milley directs analytic strategy in JMP Product Marketing at SAS.  Her ties to SAS began with bank failure prediction at FHLB Dallas.  Using SAS continued at 7-Eleven Corporation in Strategic Planning.  She has authored papers and served on committees for SAS Education conferences, KDD, and SIAM.  In 2008, she completed a 5-month assignment at a UK bank.  Milley completed her M.A. in Economics from Florida Atlantic University, did post-graduate work at RWTH Aachen, and is proficient in German.

JMP-

Introduced in 1989, JMP has grown into a family of statistical discovery products used worldwide in almost every industry. JMP is statistical discovery software  that links dynamic data visualization with robust statistics, in memory and on the desktop. From its beginnings, JMP software has empowered its users by enabling interactive analytics on the desktop. JMP products continue to complement – and are often deployed with – analytics solutions that provide server-based business intelligence.

 

SAS gets awesome revenues

There are 2.87 billion reasons SAS is not going anywhere in the Big Data Analytics space. Yes, that’s the revenue figure they declared- http://www.sas.com/news/preleases/2012financials.html

Of course, I have always wondered how much they earn from SAS Federal LLC (a subsidiary that caters to the lucrative and not very competitive analytics market in Intelligence) and what their revenue breakdown by product is (how much they earned from Base SAS licenses versus how much they earned from Cyber Security http://www.sas.com/industry/government/cybersecurity/index.html )

I wonder how many other analytics companies have even realized that they can help cut down the federal government costs ( or even have something close to this http://www.sas.com/industry/government/national-security/intelligence-management.html )

This year’s revenue breakdown was-

The Americas generated 47 percent of SAS’ total revenue; Europe, Middle East and Africa (EMEA) 41 percent; and Asia Pacific 12 percent.

but last year

The Americas accounted for 46 percent of total revenue; Europe, Middle East and Africa (EMEA) 42 percent; and Asia Pacific 12 percent

So Americas revenue grew faster than Europe revenue! Okay.
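The arithmetic behind that comparison can be checked with a minimal Python sketch. It assumes the 2.87 billion figure is this year's total in US dollars; the function and variable names are made up for illustration. Last year's total is not quoted here, so the sketch only compares shares: a region whose share of the total rises must have grown faster than the company as a whole. The published shares are rounded to whole percentages, so a one-point move is only a rough signal.

```python
# Regional revenue shares (percent of total) as quoted in the post.
shares_this_year = {"Americas": 47, "EMEA": 41, "Asia Pacific": 12}
shares_last_year = {"Americas": 46, "EMEA": 42, "Asia Pacific": 12}

# Assumption: 2.87 is this year's total revenue in billions of USD.
TOTAL_THIS_YEAR_BN = 2.87


def share_change(this_year, last_year):
    """Change in revenue share per region, in percentage points."""
    return {region: this_year[region] - last_year[region] for region in this_year}


def regional_revenue(total_bn, shares):
    """Approximate revenue per region in billions, from rounded shares."""
    return {region: round(total_bn * pct / 100, 2) for region, pct in shares.items()}


changes = share_change(shares_this_year, shares_last_year)
revenue = regional_revenue(TOTAL_THIS_YEAR_BN, shares_this_year)

for region in shares_this_year:
    print(f"{region}: about {revenue[region]:.2f} bn USD, "
          f"share change {changes[region]:+d} point(s)")
```

A positive share change (Americas) means faster growth than the overall total, and a negative one (EMEA) means slower growth.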
