Teradata Analytics

A recent announcement showing Teradata partnering with KXEN and Revolution Analytics for Teradata Analytics.

http://www.teradata.com/News-Releases/2012/Teradata-Expands-Integrated-Analytics-Portfolio/

The Latest in Open Source Emerging Software Technologies
Teradata provides customers with two additional open source technologies – “R” technology from Revolution Analytics for analytics and GeoServer technology for spatial data offered by the OpenGeo organization – both of which are able to leverage the power of Teradata in-database processing for faster, smarter answers to business questions.

In addition to the existing world-class analytic partners, Teradata supports the use of the evolving “R” technology, an open source language for statistical computing and graphics. “R” technology is gaining popularity with data scientists who are exploiting its new and innovative capabilities, which are not readily available. The enhanced “R add-on for Teradata” has a 50 percent performance improvement, it is easier to use, and its capabilities support large data analytics. Users can quickly profile, explore, and analyze larger quantities of data directly in the Teradata Database to deliver faster answers by leveraging embedded analytics.

Teradata has partnered with Revolution Analytics, the leading commercial provider of “R” technology, because of customer interest in high-performing R applications that deliver superior performance for large-scale data. “Our innovative customers understand that big data analytics takes a smart approach to the entire infrastructure and we will enable them to differentiate their business in a cost-effective way,” said David Rich, chief executive officer, Revolution Analytics. “We are excited to partner with Teradata, because we see great affinity between Teradata and Revolution Analytics – we embrace parallel computing and the high performance offered by multi-core and multi-processor hardware.”

and

The Teradata Data Lab empowers business users and leading analytic partners to start building new analytics in less than five minutes, as compared to waiting several weeks for the IT department’s assistance.

“The Data Lab within the Teradata database provides the perfect foundation to enable self-service predictive analytics with KXEN InfiniteInsight,” said John Ball, chief executive officer, KXEN. “Teradata technologies, combined with KXEN’s automated modeling capabilities and in-database scoring, put the power of predictive analytics and data mining directly into the hands of business users. This powerful combination helps our joint customers accelerate insight by delivering top-quality models in orders of magnitude faster than traditional approaches.”

Read more at

http://www.sacbee.com/2012/03/06/4315500/teradata-expands-integrated-analytics.html

Interview Prof Benjamin Alamar , Sports Analytics

Here is an interview with Prof Benjamin Alamar, founding editor of the Journal of Quantitative Analysis in Sport, a professor of sports management at Menlo College and the Director of Basketball Analytics and Research for the Oklahoma City Thunder of the NBA.

Ajay – The movie Moneyball recently sparked out mainstream interest in analytics in sports.Describe the role of analytics in sports management

Benjamin- Analytics is impacting sports organizations on both the sport and business side.
On the Sport side, teams are using analytics, including advanced data management, predictive anlaytics, and information systems to gain a competitive edge. The use of analytics results in more accurate player valuations and projections, as well as determining effective strategies against specific opponents.
On the business side, teams are using the tools of analytics to increase revenue in a variety of ways including dynamic ticket pricing and optimizing of the placement of concession stands.
Ajay-  What are the ways analytics is used in specific sports that you have been part of?

Benjamin- A very typical first step for a team is to utilize the tools of predictive analytics to help inform their draft decisions.

Ajay- What are some of the tools, techniques and software that analytics in sports uses?
Benjamin- The tools of sports analytics do not differ much from the tools of business analytics. Regression analysis is fairly common as are other forms of data mining. In terms of software, R is a popular tool as is Excel and many of the other standard analysis tools.
Ajay- Describe your career journey and how you became involved in sports management. What are some of the tips you want to tell young students who wish to enter this field?

Benjamin- I got involved in sports through a company called Protrade Sports. Protrade initially was a fantasy sports company that was looking to develop a fantasy game based on advanced sports statistics and utilize a stock market concept instead of traditional drafting. I was hired due to my background in economics to develop the market aspect of the game.

There I met Roland Beech (who now works for the Mavericks) and Aaron Schatz (owner of footballoutsiders.com) and learned about the developing field of sports statistics. I then changed my research focus from economics to sports statistics and founded the Journal of Quantitative Analysis in Sports. Through the journal and my published research, I was able to establish a reputation of doing quality, useable work.

For students, I recommend developing very strong data management skills (sql and the like) and thinking carefully about what sort of questions a general manager or coach would care about. Being able to demonstrate analytic skills around actionable research will generally attract the attention of pro teams.

About-

Benjamin Alamar, Professor of Sport Management, Menlo College

Benjamin Alamar

Professor Benjamin Alamar is the founding editor of the Journal of Quantitative Analysis in Sport, a professor of sports management at Menlo College and the Director of Basketball Analytics and Research for the Oklahoma City Thunder of the NBA. He has published academic research in football, basketball and baseball, has presented at numerous conferences on sports analytics. He is also a co-creator of ESPN’s Total Quarterback Rating and a regular contributor to the Wall Street Journal. He has consulted for teams in the NBA and NFL, provided statistical analysis for author Michael Lewis for his recent book The Blind Side, and worked with numerous startup companies in the field of sports analytics. Professor Alamar is also an award winning economist who has worked academically and professionally in intellectual property valuation, public finance and public health. He received his PhD in economics from the University of California at Santa Barbara in 2001.

Prof Alamar is a speaker at Predictive Analytics World, San Fransisco and is doing a workshop there

http://www.predictiveanalyticsworld.com/sanfrancisco/2012/agenda.php#day2-17

2:55-3:15pm

All level tracks Track 1: Sports Analytics
Case Study: NFL, MLB, & NBA
Competing & Winning with Sports Analytics

The field of sports analytics ties together the tools of data management, predictive modeling and information systems to provide sports organization a competitive advantage. The field is rapidly developing based on new and expanded data sources, greater recognition of the value, and past success of a variety of sports organizations. Teams in the NFL, MLB, NBA, as well as other organizations have found a competitive edge with the application of sports analytics. The future of sports analytics can be seen through drawing on these past successes and the developments of new tools.

You can know more about Prof Alamar at his blog http://analyticfootball.blogspot.in/ or journal at http://www.degruyter.com/view/j/jqas. His detailed background can be seen at http://menlo.academia.edu/BenjaminAlamar/CurriculumVitae

Predictive Models Ain’t Easy to Deploy

 

This is a guest blog post by Carole Ann Matignon of Sparkling Logic. You can see more on Sparkling Logic at http://my.sparklinglogic.com/

Decision Management is about combining predictive models and business rules to automate decisions for your business. Insurance underwriting, loan origination or workout, claims processing are all very good use cases for that discipline… But there is a hiccup… It ain’t as easy you would expect…

What’s easy?

If you have a neat model, then most tools would allow you to export it as a PMML model – PMML stands for Predictive Model Markup Language and is a standard XML representation for predictive model formulas. Many model development tools let you export it without much effort. Many BRMS – Business rules Management Systems – let you import it. Tada… The model is ready for deployment.

What’s hard?

The problem that we keep seeing over and over in the industry is the issue around variables.

Those neat predictive models are formulas based on variables that may or may not exist as is in your object model. When the variable is itself a formula based on the object model, like the min, max or sum of Dollar amount spent in Groceries in the past 3 months, and the object model comes with transaction details, such that you can compute it by iterating through those transactions, then the problem is not “that” big. PMML 4 introduced some support for those variables.

The issue that is not easy to fix, and yet quite frequent, is when the model development data model does not resemble the operational one. Your Data Warehouse very likely flattened the object model, and pre-computed some aggregations that make the mapping very hard to restore.

It is clearly not an impossible project as many organizations do that today. It comes with a significant overhead though that forces modelers to involve IT resources to extract the right data for the model to be operationalized. It is a heavy process that is well justified for heavy-duty models that were developed over a period of time, with a significant ROI.

This is a show-stopper though for other initiatives which do not have the same ROI, or would require too frequent model refresh to be viable. Here, I refer to “real” model refresh that involves a model reengineering, not just a re-weighting of the same variables.

For those initiatives where time is of the essence, the challenge will be to bring closer those two worlds, the modelers and the business rules experts, in order to streamline the development AND deployment of analytics beyond the model formula. The great opportunity I see is the potential for a better and coordinated tuning of the cut-off rules in the context of the model refinement. In other words: the opportunity to refine the strategy as a whole. Very ambitious? I don’t think so.

About Carole Ann Matignon

http://my.sparklinglogic.com/index.php/company/management-team

Carole-Ann Matignon Print E-mail

Carole-Ann MatignonCarole-Ann Matignon – Co-Founder, President & Chief Executive Officer

She is a renowned guru in the Decision Management space. She created the vision for Decision Management that is widely adopted now in the industry.  Her claim to fame is managing the strategy and direction of Blaze Advisor, the leading BRMS product, while she also managed all the Decision Management tools at FICO (business rules, predictive analytics and optimization). She has a vision for Decision Management both as a technology and a discipline that can revolutionize the way corporations do business, and will never get tired of painting that vision for her audience.  She speaks often at Industry conferences and has conducted university classes in France and Washington DC.

She started her career building advanced systems using all kinds of technologies — expert systems, rules, optimization, dashboarding and cubes, web search, and beta version of database replication. At Cleversys (acquired by Kurt Salmon & Associates), she also conducted strategic consulting gigs around change management.

While playing with advanced software components, she found a passion for technology and joined ILOG (acquired by IBM). She developed a growing interest in Optimization as well as Business Rules. At ILOG, she coined the term BRMS while brainstorming with her Sales counterpart. She led the Presales organization for Telecom in the Americas up until 2000 when she joined Blaze Software (acquired by Brokat Technologies, HNC Software and finally FICO).

Her 360-degree experience allowed her to gain appreciation for all aspects of a software company, giving her a unique perspective on the business. Her technical background kept her very much in touch with technology as she advanced.

Oracle launches its version of R #rstats

From-

http://www.oracle.com/us/corporate/press/1515738

Integrates R Statistical Programming Language into Oracle Database 11g

News Facts

Oracle today announced the availability of Oracle Advanced Analytics, a new option for Oracle Database 11g that bundles Oracle R Enterprise together with Oracle Data Mining.
Oracle R Enterprise delivers enterprise class performance for users of the R statistical programming language, increasing the scale of data that can be analyzed by orders of magnitude using Oracle Database 11g.
R has attracted over two million users since its introduction in 1995, and Oracle R Enterprise dramatically advances capability for R users. Their existing R development skills, tools, and scripts can now also run transparently, and scale against data stored in Oracle Database 11g.
Customer testing of Oracle R Enterprise for Big Data analytics on Oracle Exadata has shown up to 100x increase in performance in comparison to their current environment.
Oracle Data Mining, now part of Oracle Advanced Analytics, helps enable customers to easily build and deploy predictive analytic applications that help deliver new insights into business performance.
Oracle Advanced Analytics, in conjunction with Oracle Big Data ApplianceOracle Exadata Database Machine and Oracle Exalytics In-Memory Machine, delivers the industry’s most integrated and comprehensive platform for Big Data analytics.

Comprehensive In-Database Platform for Advanced Analytics

Oracle Advanced Analytics brings analytic algorithms to data stored in Oracle Database 11g and Oracle Exadata as opposed to the traditional approach of extracting data to laptops or specialized servers.
With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.
By providing direct and controlled access to data stored in Oracle Database 11g, customers can accelerate data analyst productivity while maintaining data security throughout the enterprise.
Powered by decades of Oracle Database innovation, Oracle R Enterprise helps enable analysts to run a variety of sophisticated numerical techniques on billion row data sets in a matter of seconds making iterative, speed of thought, and high-quality numerical analysis on Big Data practical.
Oracle R Enterprise drastically reduces the time to deploy models by eliminating the need to translate the models to other languages before they can be deployed in production.
Oracle R Enterprise integrates the extensive set of Oracle Database data mining algorithms, analytics, and access to Oracle OLAP cubes into the R language for transparent use by R users.
Oracle Data Mining provides an extensive set of in-database data mining algorithms that solve a wide range of business problems. These predictive models can be deployed in Oracle Database 11g and use Oracle Exadata Smart Scan to rapidly score huge volumes of data.
The tight integration between R, Oracle Database 11g, and Hadoop enables R users to write one R script that can run in three different environments: a laptop running open source R, Hadoop running with Oracle Big Data Connectors, and Oracle Database 11g.
Oracle provides single vendor support for the entire Big Data platform spanning the hardware stack, operating system, open source R, Oracle R Enterprise and Oracle Database 11g.
To enable easy enterprise-wide Big Data analysis, results from Oracle Advanced Analytics can be viewed from Oracle Business Intelligence Foundation Suite and Oracle Exalytics In-Memory Machine.

Supporting Quotes

“Oracle is committed to meeting the challenges of Big Data analytics. By building upon the analytical depth of Oracle SQL, Oracle Data Mining and the R environment, Oracle is delivering a scalable and secure Big Data platform to help our customers solve the toughest analytics problems,” said Andrew Mendelsohn, senior vice president, Oracle Server Technologies.
“We work with leading edge customers who rely on us to deliver better BI from their Oracle Databases. The new Oracle R Enterprise functionality allows us to perform deep analytics on Big Data stored in Oracle Databases. By leveraging R and its library of open source contributed CRAN packages combined with the power and scalability of Oracle Database 11g, we can now do that,” said Mark Rittman, co-founder, Rittman Mead.
Oracle Advanced Analytics — an option to Oracle Database 11g Enterprise Edition – extends the database into a comprehensive advanced analytics platform through two major components: Oracle R Enterprise and Oracle Data Mining. With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.

Oracle R Enterprise tightly integrates the open source R programming language with the database to further extend the database with Rs library of statistical functionality, and pushes down computations to the database. Oracle R Enterprise dramatically advances the capability for R users, and allows them to use their existing R development skills and tools, and scripts can now also run transparently and scale against data stored in Oracle Database 11g.

Oracle Data Mining provides powerful data mining algorithms that run as native SQL functions for in-database model building and model deployment. It can be accessed through the SQL Developer extension Oracle Data Miner to build, evaluate, share and deploy predictive analytics methodologies. At the same time the high-performance Oracle-specific data mining algorithms are accessible from R.

BENEFITS

  • Scalability—Allows customers to easily scale analytics as data volume increases by bringing the algorithms to where the data resides – in the database
  • Performance—With analytical operations performed in the database, R users can take advantage of the extreme performance of Oracle Exadata
  • Security—Provides data analysts with direct but controlled access to data in Oracle Database 11g, accelerating data analyst productivity while maintaining data security
  • Save Time and Money—Lowers overall TCO for data analysis by eliminating data movement and shortening the time it takes to transform “raw data” into “actionable information”
Oracle R Hadoop Connector Gives R users high performance native access to Hadoop Distributed File System (HDFS) and MapReduce programming framework.
This is a  R package
From the datasheet at

Predictive Analytics World Events in 2012

A new line up of Predictive Analytics World and Text Analytics World conferences and workshops are coming March through July, plus see the save-the-dates and call-for-speakers for events in Sept, Oct, and Nov.

CONFERENCE: Predictive Analytics World – San Francisco

March 4-10, 2012 in San Francisco, CA
http://predictiveanalyticsworld.com/sanfrancisco/2012
Discount Code for $150 off: AJAYBP12

CONFERENCE: Text Analytics World – San Francisco
March 6-7, 2012 in San Francisco, CA
http://textanalyticsworld.com/sanfrancisco/2012
Discount Code for $150 off: AJAYBP12

VARIOUS ANALYTICS WORKSHOPS:
A plethora of 1-day workshops are held alongside PAW and TAW
For details see: http://pawcon.com/sanfrancisco/2012/analytics_workshops.php

SEMINAR: Predictive Analytics for Business, Marketing & Web
March 22-23, 2012 in New York City, NY
July 26-27, 2012 in São Paulo, Brazil
Oct 11-12, 2012 in San Francisco
A concentrated training program lead by PAW’s chair, Eric Siegel
http://businessprediction.com

CONFERENCE: Predictive Analytics World – Toronto
April 25-26, 2012 in Toronto, Ontario
http://predictiveanalyticsworld.com/toronto/2012
Discount Code for $150 off: AJAYBP12

CONFERENCE: Predictive Analytics World – Chicago
June 25-26, 2012 in Chicago, IL
http://www.predictiveanalyticsworld.com/chicago/2012/
Discount Code for $150 off: AJAYBP12

 

From Ajay-

CONTEST- If you use the discount code AJAYBP12, you will not only get the $150 off, but you will be entered in a contest to get 2 complementary passes like I did last year . Matt Stromberg won that one

http://www.decisionstats.com/contest-2-free-passes-to-predictive-analytics-world/

 

see last year results-

http://www.decisionstats.com/congrats-to-matt-stromberg-winner-2-free-passes-to-paw-new-york/

Predictive analytics in the cloud : Angoss

I interviewed Angoss in depth here at http://www.decisionstats.com/interview-eberhard-miethke-and-dr-mamdouh-refaat-angoss-software/

Well they just announced a predictive analytics in the cloud.

 

http://www.angoss.com/predictive-analytics-solutions/cloud-solutions/

Solutions

Overview

KnowledgeCLOUD™ solutions deliver predictive analytics in the Cloud to help businesses gain competitive advantage in the areas of sales, marketing and risk management by unlocking the predictive power of their customer data.

KnowledgeCLOUD clients experience rapid time to value and reduced IT investment, and enjoy the benefits of Angoss’ industry leading predictive analytics – without the need for highly specialized human capital and technology.

KnowledgeCLOUD solutions serve clients in the asset management, insurance, banking, high tech, healthcare and retail industries. Industry solutions consist of a choice of analytical modules:

KnowledgeCLOUD for Sales/Marketing

KnowledgeCLOUD solutions are delivered via KnowledgeHUB™, a secure, scalable cloud-based analytical platform together with supporting deployment processes and professional services that deliver predictive analytics to clients in a hosted environment. Angoss industry leading predictive analytics technology is employed for the development of models and deployment of solutions.

Angoss’ deep analytics and domain expertise guarantees effectiveness – all solutions are back-tested for accuracy against historical data prior to deployment. Best practices are shared throughout the service to optimize your processes and success. Finely tuned client engagement and professional services ensure effective change management and program adoption throughout your organization.

For businesses looking to gain a competitive edge and put their data to work, Angoss is the ideal partner.

—-

Hmm. Analytics in the cloud . Reduce hardware costs. Reduce software costs . Increase profitability margins.

Hmmmmm

My favorite professor in North Carolina who calls cloud as a time sharing, are you listening Professor?

Interview Kelci Miclaus, SAS Institute Using #rstats with JMP

Here is an interview with Kelci Miclaus, a researcher working with the JMP division of the SAS Institute, in which she demonstrates examples of how the R programming language is a great hit with JMP customers who like to be flexible.

 

Ajay- How has JMP been using integration with R? What has been the feedback from customers so far? Is there a single case study you can point out where the combination of JMP and R was better than any one of them alone?

Kelci- Feedback from customers has been very positive. Some customers are using JMP to foster collaboration between SAS and R modelers within their organizations. Many are using JMP’s interactive visualization to complement their use of R. Many SAS and JMP users are using JMP’s integration with R to experiment with more bleeding-edge methods not yet available in commercial software. It can be used simply to smooth the transition with regard to sending data between the two tools, or used to build complete custom applications that take advantage of both JMP and R.

One customer has been using JMP and R together for Bayesian analysis. He uses R to create MCMC chains and has found that JMP is a great tool for preparing the data for analysis, as well as displaying the results of the MCMC simulation. For example, the Control Chart platform and the Bubble Plot platform in JMP can be used to quickly verify convergence of the algorithm. The use of both tools together can increase productivity since the results of an analysis can be achieved faster than through scripting and static graphics alone.

I, along with a few other JMP developers, have written applications that use JMP scripting to call out to R packages and perform analyses like multidimensional scaling, bootstrapping, support vector machines, and modern variable selection methods. These really show the benefit of interactive visual analysis of coupled with modern statistical algorithms. We’ve packaged these scripts as JMP add-ins and made them freely available on our JMP User Community file exchange. Customers can download them and now employ these methods as they would a regular JMP platform. We hope that our customers familiar with scripting will also begin to contribute their own add-ins so a wider audience can take advantage of these new tools.

(see http://www.decisionstats.com/jmp-and-r-rstats/)

Ajay- Are there plans to extend JMP integration with other languages like Python?

Kelci- We do have plans to integrate with other languages and are considering integrating with more based on customer requests. Python has certainly come up and we are looking into possibilities there.

 Ajay- How is R a complimentary fit to JMP’s technical capabilities?

Kelci- R has an incredible breadth of capabilities. JMP has extensive interactive, dynamic visualization intrinsic to its largely visual analysis paradigm, in addition to a strong core of statistical platforms. Since our brains are designed to visually process pictures and animated graphs more efficiently than numbers and text, this environment is all about supporting faster discovery. Of course, JMP also has a scripting language (JSL) allowing you to incorporate SAS code, R code, build analytical applications for others to leverage SAS, R and other applications for users who don’t code or who don’t want to code.

JSL is a powerful scripting language on its own. It can be used for dialog creation, automation of JMP statistical platforms, and custom graphic scripting. In other ways, JSL is very similar to the R language. It can also be used for data and matrix manipulation and to create new analysis functions. With the scripting capabilities of JMP, you can create custom applications that provide both a user interface and an interactive visual back-end to R functionality. Alternatively, you could create a dashboard using statistical and/or graphical platforms in JMP to explore the data and with the click of a button, send a portion of the data to R for further analysis.

Another JMP feature that complements R is the add-in architecture, which is similar to how R packages work. If you’ve written a cool script or analysis workflow, you can package it into a JMP add-in file and send it to your colleagues so they can easily use it.

Ajay- What is the official view on R from your organization? Do you think it is a threat, or a complimentary product or another statistical platform that coexists with your offerings?

Kelci- Most definitely, we view R as complimentary. R contributors are providing a tremendous service to practitioners, allowing them to try a wide variety of methods in the pursuit of more insight and better results. The R community as a whole is providing a valued role to the greater analytical community by focusing attention on newer methods that hold the most promise in so many application areas. Data analysts should be encouraged to use the tools available to them in order to drive discovery and JMP can help with that by providing an analytic hub that supports both SAS and R integration.

Ajay-  While you do use R, are there any plans to give back something to the R community in terms of your involvement and participation (say at useR events) or sponsoring contests.

 Kelci- We are certainly open to participating in useR groups. At Predictive Analytics World in NY last October, they didn’t have a local useR group, but they did have a Predictive Analytics Meet-up group comprised of many R users. We were happy to sponsor this. Some of us within the JMP division have joined local R user groups, myself included.  Given that some local R user groups have entertained topics like Excel and R, Python and R, databases and R, we would be happy to participate more fully here. I also hope to attend the useR! annual meeting later this year to gain more insight on how we can continue to provide tools to help both the JMP and R communities with their work.

We are also exploring options to sponsor contests and would invite participants to use their favorite tools, languages, etc. in pursuit of the best model. Statistics is about learning from data and this is how we make the world a better place.

About- Kelci Miclaus

Kelci is a research statistician developer for JMP Life Sciences at SAS Institute. She has a PhD in Statistics from North Carolina State University and has been using SAS products and R for several years. In addition to research interests in statistical genetics, clinical trials analysis, and multivariate analysis/visualization methods, Kelci works extensively with JMP, SAS, and R integration.

.