Home » Posts tagged 'why'

Tag Archives: why

Top 7 Business Strategy Models

UPDATED POST- Some Models I use for Business Strategy- to analyze the huge reams of qualitative and uncertain data that business generates. I have added a bonus the Business canvas Model (number 2)

  1. Porters 5 forces Model-To analyze industries
  2. Business Canvas
  3. BCG Matrix- To analyze Product Portfolios
  4. Porters Diamond Model- To analyze locations
  5. McKinsey 7 S Model-To analyze teams
  6. Gernier Theory- To analyze growth of organization
  7. Herzberg Hygiene Theory- To analyze soft aspects of individuals
  8. Marketing Mix Model- To analyze marketing mix.

(more…)

JMP Student Edition

I really liked the initiatives at JMP/Academic. Not only they offer the software bundled with a textbook, which is both good common sense as well as business sense given how fast students can get confused

(Rant 1 Bundling with textbooks is something I think is Revolution Analytics should think of doing instead of just offering the academic  version for free downloading- it would be interesting to see the penetration of R academic market with Revolution’s version and the open source version with the existing strategy)

From http://www.jmp.com/academic/textbooks.shtml

Major publishers of introductory statistics textbooks offer a 12-month license to JMP Student Edition, a streamlined version of JMP, with their textbooks.

and a glance through this http://www.jmp.com/academic/pdf/jmp_se_comparison.pdf  shows it is a credible and not extremely whittled down version which would be just dishonest.

And I loved this Reference Card at http://www.jmp.com/academic/pdf/jmp10_se_quick_guide.pdf

 

Oracle, SAP- Hana, Revolution Analytics and even SAS/STAT itself can make more reference cards like this- elegant solutions for students and new learners!

More- creative-rants Honestly why do corporate sites use PDFs anymore when they can use Instapaper , or any of these SlideShare/Scribd formats to show information in a better way without diverting the user from the main webpage.

But I digress, back to JMP

 

Resources for Faculty Using JMP® Student Edition

Faculty who select a JMP Student Edition bundle for their courses may be eligible for additional resources, including course materials and training.

Special JMP® Student Edition for AP Statistics

JMP Student Edition is available in a convenient five-year license for qualified Advanced Placement statistics programs.

Try and have a look yourself at http://www.jmp.com/academic/student.shtml

 

 

 

Why Cyber War?

The Necessity of Cyber War as a better alternative to traditional warfare

 

By the time our generation is done with this living on this planet, we should have found a way to flip warfare into just another computer game.

 

  1. Cyber War does not kill people but does diminish both production as well offensive capabilities of enemy.
  2. It destroys lesser resources of the enemy irreversibly, thus leading to increased capacity to claim damages or taxes from the loser of the conflict
  3. It does not motivate general population for war hysteria thus minimizing inflationary pressures
  4. Cyber War does not divert too many goods and services (like commodities, metals, fuels) from your economy unlike traditional warfare
  5. Capacity to wage cyber war needs human resources  and can reduce asymmetry between nations in terms of resources available naturally or historically (like money , access to fuel and logistics, geography , educated population,colonial history  )
  6. It is more effective in both offensive and defensive capabilities and at a much much cheaper cost to defense budgets
  7. Most developed countries have already invested heavily in it, and it can render traditional weaponry ineffective and expensive. If you ignore investing in cyber war capabilities your defense forces would be compromised and national infrastructure can be held to ransom

 

Self-defence….is the only honourable course where there is unreadiness for self-immolation.- Gandhi.

Using Rapid Miner and R for Sports Analytics #rstats

Rapid Miner has been one of the oldest open source analytics software, long long before open source or even analytics was considered a fashion buzzword. The Rapid Miner software has been a pioneer in many areas (like establishing a marketplace for Rapid Miner Extensions) and the Rapid Miner -R extension was one of the most promising enablers of using R in an enterprise setting.
The following interview was taken with a manager of analytics for a sports organization. The sports organization considers analytics as a strategic differentiator , hence the name is confidential. No part of the interview has been edited or manipulated.

Ajay- Why did you choose Rapid Miner and R? What were the other software alternatives you considered and discarded?

Analyst- We considered most of the other major players in statistics/data mining or enterprise BI.  However, we found that the value proposition for an open source solution was too compelling to justify the premium pricing that the commercial solutions would have required.  The widespread adoption of R and the variety of packages and algorithms available for it, made it an easy choice.  We liked RapidMiner as a way to design structured, repeatable processes, and the ability to optimize learner parameters in a systematic way.  It also handled large data sets better than R on 32-bit Windows did.  The GUI, particularly when 5.0 was released, made it more usable than R for analysts who weren’t experienced programmers.

Ajay- What analytics do you do think Rapid Miner and R are best suited for?

 Analyst- We use RM+R mainly for sports analysis so far, rather than for more traditional business applications.  It has been quite suitable for that, and I can easily see how it would be used for other types of applications.

 Ajay- Any experiences as an enterprise customer? How was the installation process? How good is the enterprise level support?

Analyst- Rapid-I has been one of the most responsive tech companies I’ve dealt with, either in my current role or with previous employers.  They are small enough to be able to respond quickly to requests, and in more than one case, have fixed a problem, or added a small feature we needed within a matter of days.  In other cases, we have contracted with them to add larger pieces of specific functionality we needed at reasonable consulting rates.  Those features are added to the mainline product, and become fully supported through regular channels.  The longer consulting projects have typically had a turnaround of just a few weeks.

 Ajay- What challenges if any did you face in executing a pure open source analytics bundle ?

Analyst- As Rapid-I is a smaller company based in Europe, the availability of training and consulting in the USA isn’t as extensive as for the major enterprise software players, and the time zone differences sometimes slow down the communications cycle.  There were times where we were the first customer to attempt a specific integration point in our technical environment, and with no prior experiences to fall back on, we had to work with Rapid-I to figure out how to do it.  Compared to the what traditional software vendors provide, both R and RM tend to have sparse, terse, occasionally incomplete documentation.  The situation is getting better, but still lags behind what the traditional enterprise software vendors provide.

 Ajay- What are the things you can do in R ,and what are the things you prefer to do in Rapid Miner (comparison for technical synergies)

Analyst- Our experience has been that RM is superior to R at writing and maintaining structured processes, better at handling larger amounts of data, and more flexible at fine-tuning model parameters automatically.  The biggest limitation we’ve had with RM compared to R is that R has a larger library of user-contributed packages for additional data mining algorithms.  Sometimes we opted to use R because RM hadn’t yet implemented a specific algorithm.  The introduction the R extension has allowed us to combine the strengths of both tools in a very logical and productive way.

In particular, extending RapidMiner with R helped address RM’s weakness in the breadth of algorithms, because it brings the entire R ecosystem into RM (similar to how Rapid-I implemented much of the Weka library early on in RM’s development).  Further, because the R user community releases packages that implement new techniques faster than the enterprise vendors can, this helps turn a potential weakness into a potential strength.  However, R packages tend to be of varying quality, and are more prone to go stale due to lack of support/bug fixes.  This depends heavily on the package’s maintainer and its prevalence of use in the R community.  So when RapidMiner has a learner with a native implementation, it’s usually better to use it than the R equivalent.

Update!

I have been busy-

1) Finally my divorce came through. My advice – dont do it without a pre-nup ! Alimony means all the money.

2) Spending time on Quora after getting bored from LinkedIn, Twitter,Facebook,Google Plus,Tumblr, WordPress

See this answer to-

 What are common misconceptions about startups?

1) we will change the world
2) if we get 1% of a billion people market, we will be rich
3) if we have got funding, most of the job is done
4) lets pay ourselves high salaries since we got funded
5) our idea is awesome and cant be copied, improvised, stolen, replicated
6) startups are painless
7) it is a better life than a corporate career
8) long term vision is important than short term cash burn
9) we will never sell out or exit. never
10) its a great idea to make startups with friend

Say hello to me – http://www.quora.com/Ajay-Ohri/answers

3) Writing freelance articles on APIs for Programmable Web

Why write pro? See point 1)

Recent Articles-

http://blog.programmableweb.com/2012/07/30/predict-the-future-with-google-prediction-api/

http://blog.programmableweb.com/2012/08/01/your-store-in-the-cloud-google-cloud-storage-api/

http://blog.programmableweb.com/2012/07/27/the-romney-vs-obama-api/

4) Writing poetry on http://poemsforkush.com/. It now gets 23000 views a month. I wish I could say my poems were great, but the readers are kind (364 subscribers!) and also Google Image Search is very very kind.

5) Kicking tires with next book ” R for Cloud Computing” and be tuned for another writing announcement

6) Waiting for Paul Kent, VP, SAS Big Data to reply to my emails for interview after HE promised me!! You dont get to 105 interviews without being a bit stubborn!

7) Sighing on politics engulfing my American friends especially with regards to Chic-fil-A and Romney’s gaffes. Now thats what I call a first world problem! Protesting by eating or boycotting chicken sandwiches! In India we had the world’s biggest blackout two days in a row- and no one is attending the Hunger Fast against corruption protests!

8) Watching Olympics! Our glorious nation of 1.2 billion very smart people has managed to win 1 Bronze till today!! Michael Phelps has won more medals and more gold than the whole of  India has since the Olympics Games began!!

9) Consulting to pay the bills. includes writing R code, making presentations. Why consult when I have writing to do? See point 1)

10) Reading New York Times to get insights on Big Data and Analytics. Trust them- they know what they are doing!

Interview John Myles White , Machine Learning for Hackers

Here is an interview with one of the younger researchers  and rock stars of the R Project, John Myles White,  co-author of Machine Learning for Hackers.

Ajay- What inspired you guys to write Machine Learning for Hackers. What has been the public response to the book. Are you planning to write a second edition or a next book?

John-We decided to write Machine Learning for Hackers because there were so many people interested in learning more about Machine Learning who found the standard textbooks a little difficult to understand, either because they lacked the mathematical background expected of readers or because it wasn’t clear how to translate the mathematical definitions in those books into usable programs. Most Machine Learning books are written for audiences who will not only be using Machine Learning techniques in their applied work, but also actively inventing new Machine Learning algorithms. The amount of information needed to do both can be daunting, because, as one friend pointed out, it’s similar to insisting that everyone learn how to build a compiler before they can start to program. For most people, it’s better to let them try out programming and get a taste for it before you teach them about the nuts and bolts of compiler design. If they like programming, they can delve into the details later.

We once said that Machine Learning for Hackers  is supposed to be a chemistry set for Machine Learning and I still think that’s the right description: it’s meant to get readers excited about Machine Learning and hopefully expose them to enough ideas and tools that they can start to explore on their own more effectively. It’s like a warmup for standard academic books like Bishop’s.
The public response to the book has been phenomenal. It’s been amazing to see how many people have bought the book and how many people have told us they found it helpful. Even friends with substantial expertise in statistics have said they’ve found a few nuggets of new information in the book, especially regarding text analysis and social network analysis — topics that Drew and I spend a lot of time thinking about, but are not thoroughly covered in standard statistics and Machine Learning  undergraduate curricula.
I hope we write a second edition. It was our first book and we learned a ton about how to write at length from the experience. I’m about to announce later this week that I’m writing a second book, which will be a very short eBook for O’Reilly. Stay tuned for details.

Ajay-  What are the key things that a potential reader can learn from this book?

John- We cover most of the nuts and bolts of introductory statistics in our book: summary statistics, regression and classification using linear and logistic regression, PCA and k-Nearest Neighbors. We also cover topics that are less well known, but are as important: density plots vs. histograms, regularization, cross-validation, MDS, social network analysis and SVM’s. I hope a reader walks away from the book having a feel for what different basic algorithms do and why they work for some problems and not others. I also hope we do just a little to shift a future generation of modeling culture towards regularization and cross-validation.

Ajay- Describe your journey as a science student up till your Phd. What are you current research interests and what initiatives have you done with them?

John-As an undergraduate I studied math and neuroscience. I then took some time off and came back to do a Ph.D. in psychology, focusing on mathematical modeling of both the brain and behavior. There’s a rich tradition of machine learning and statistics in psychology, so I got increasingly interested in ML methods during my years as a grad student. I’m about to finish my Ph.D. this year. My research interests all fall under one heading: decision theory. I want to understand both how people make decisions (which is what psychology teaches us) and how they should make decisions (which is what statistics and ML teach us). My thesis is focused on how people make decisions when there are both short-term and long-term consequences to be considered. For non-psychologists, the classic example is probably the explore-exploit dilemma. I’ve been working to import more of the main ideas from stats and ML into psychology for modeling how real people handle that trade-off. For psychologists, the classic example is the Marshmallow experiment. Most of my research work has focused on the latter: what makes us patient and how can we measure patience?

Ajay- How can academia and private sector solve the shortage of trained data scientists (assuming there is one)?

John- There’s definitely a shortage of trained data scientists: most companies are finding it difficult to hire someone with the real chops needed to do useful work with Big Data. The skill set required to be useful at a company like Facebook or Twitter is much more advanced than many people realize, so I think it will be some time until there are undergraduates coming out with the right stuff. But there’s huge demand, so I’m sure the market will clear sooner or later.

The changes that are required in academia to prepare students for this kind of work are pretty numerous, but the most obvious required change is that quantitative people need to be learning how to program properly, which is rare in academia, even in many CS departments. Writing one-off programs that no one will ever have to reuse and that only work on toy data sets doesn’t prepare you for working with huge amounts of messy data that exhibit shifting patterns. If you need to learn how to program seriously before you can do useful work, you’re not very valuable to companies who need employees that can hit the ground running. The companies that have done best in building up data teams, like LinkedIn, have learned to train people as they come in since the proper training isn’t typically available outside those companies.
Of course, on the flipside, the people who do know how to program well need to start learning more about theory and need to start to have a better grasp of basic mathematical models like linear and logistic regressions. Lots of CS students seem not to enjoy their theory classes, but theory really does prepare you for thinking about what you can learn from data. You may not use automata theory if you work at Foursquare, but you will need to be able to reason carefully and analytically. Doing math is just like lifting weights: if you’re not good at it right now, you just need to dig in and get yourself in shape.
About-
John Myles White is a Phd Student in  Ph.D. student in the Princeton Psychology Department, where he studies human decision-making both theoretically and experimentally. Along with the political scientist Drew Conway, he is  the author of a book published by O’Reilly Media entitled “Machine Learning for Hackers”, which is meant to introduce experienced programmers to the machine learning toolkit. He is also working with Mark Hansenon a book for laypeople about exploratory data analysis.John is the lead maintainer for several R packages, including ProjectTemplate and log4r.

(TIL he has played in several rock bands!)

—–
You can read more in his own words at his blog at http://www.johnmyleswhite.com/about/
He can be contacted via social media at Google Plus at https://plus.google.com/109658960610931658914 or twitter at twitter.com/johnmyleswhite/

Decisionstats.com is back from a dDOS

  1. Servers were okay, it was the DNS server that got swamped.
  2. I am sorry for the downtime- hopefully you didnt even notice
  3. I have faced challenges like domain name hijacking, sql injection , malicious WP plugins and thats why shifted to a professional hosting. I stand by my vendors and their professional judgement, moving away would mean the hackers won.
  4. This was very clever to swamp the DNS provider- my compliments to the tech talent behind this.
  5. You would think that every webmaster would have a back up plan in case his site went dDOS, but surprisingly even corporate websites dont have a back up (under attack) plan

 

Follow

Get every new post delivered to your Inbox.

Join 844 other followers