Interesting R and BI Web Event

An interesting webinar from Revolution, the vanguard of corporate R, on mixing R analytics and BI dashboards. Methinks an alliance with a BI dashboard maker could also help the Revo guys, as BI and analytics are two similar yet different markets. It could also help if you are a newbie to BI but know enough analytics/stats.

Click on the screenshot below if interested.

SUPERCHARGE BI AND DASHBOARDS WITH PREDICTIVE ANALYTICS

FREE WEBINAR WEDNESDAY, JUNE 2

Presenters:
David Smith, vice president of Marketing, Revolution Analytics
Steve Miller, president, OpenBI, LLC
Andrew Lampitt, senior director, Technology Alliances, Jaspersoft

Audience:
BI implementors seeking to integrate predictive analytics into BI dashboards;
R users and developers seeking to distribute advanced analytics to business users;
Business users seeking to improve their BI outcomes.

R Modeling with huge data

Here is a training course by the BI vendor Netezza that uses R’s analytical capabilities. It uses R in Netezza’s customized appliances.

Source-

http://www.netezza.com/userconference/pce.html#rmftfic

R Modeling for TwinFin i-Class

Objective
Learn how to use TwinFin i-Class for scaling up the R language.

Description
In this class, you’ll learn how to use R to create models using huge data and how to create R algorithms that exploit our asymmetric massively parallel (AMPP®) architecture. Netezza has seamlessly integrated with R to offload the heavy lifting of the computational processing on TwinFin i-Class. This results in higher performance and increased scalability for R. Sign up for this class to learn how to take advantage of TwinFin i-Class for your R modeling. Topics include:

  1. R CRAN package installation on TwinFin i-Class
  2. Creating models using R on TwinFin i-Class
  3. Creating R algorithms for TwinFin i-Class

Format
Hands-on classroom lecture, lab exercises, tour

Audience
Knowledgeable R users – modelers, analytic developers, data miners

Course Length
0.5 day: 12pm-4pm Wednesday, June 23 OR 8am-12pm Thursday, June 24 OR 1pm-5pm Thursday, June 24, 2010

Delivery
Enzee Universe 2010, Boston, MA

Student Prerequisites

  • Working knowledge of R and parallel computing
  • Have analytic, compute-intensive challenges
  • Understanding of data mining and analytics

SAP and BI on Demand

SAP announced BI OnDemand, a SaaS product. From the press release:

SAP AG (NYSE: SAP) today announced the SAP® BusinessObjects™ BI OnDemand solution. Targeted at casual BI users currently underserved by products on the market, the solution will deliver a complete BI toolset in one flexible offering. Its ease-of-use also will allow them to be up and running with no prior experience or training. With SAP BusinessObjects BI OnDemand, business users will be able to access and visually navigate data from any source using SAP® BusinessObjects™ Explorer software. Even casual users can then combine that data in just a few clicks, and follow a guided path that walks them through reporting and analysis. The solution will have scalable pricing models based on business need, allowing companies to easily and cost-effectively scale as required.

M2009 Interview Peter Pawlowski AsterData

Here is an interview with Peter Pawlowski, who is the MTS for Data Mining at Aster Data. I ran into Peter at the Aster Data booth during M2009 and followed up with an email interview. Also included is a presentation that he co-authored.

Ajay- Describe your career in Science leading up till today.

Peter- Went to Stanford, where I got a BS & MS in Computer Science. I did some work on automated bug-finding tools while at Stanford.
( Note- that sums up the career of almost 60 % of CS scientists)

Ajay- How is life working at Aster Data? What are the challenges and the great stuff?

Peter- Working at Aster is great fun, due to the sheer breadth and variety of the technical challenges. We have problems to solve in the optimization, languages, networking, databases, operating systems, etc. It’s been great to think about problems end-to-end & consider the impact of a change on all aspects of the system. I worked on SQL/MR in particular, which had lots of interesting challenges: how do you define the API? how do you integrate with SQL? how do you make it run fast? how do you make it scale?

Ajay- Do you think universities offer adequate preparation for in-demand skills like MapReduce, Hadoop and Business Intelligence?

Peter- Probably not BI. I learned everything I know about BI while at Aster. In terms of M/R, it’d be useful to have more hands-on experience with distributed systems while at school. We read the MapReduce paper but didn’t get a chance to actually play with M/R. I think that sort of exposure would be useful. We recently made our software available to some students taking a data mining class at Stanford, and they came up with some fascinating use cases for our system, esp. around the Netflix challenge dataset.

Ajay- Describe some of the recent engineering products that you have worked with at Aster

Peter- SQL/MR is the main aspect of nCluster that I’ve worked with. The interesting challenges are described in #2.

Ajay- All BI companies claim, as per their marketing brochures, to crunch data the fastest, at the lowest price, and at the highest quality. How would you validate your product’s performance scientifically and transparently?

Peter- I’ve found that the hardest part of judging performance is to come up with a realistic workload. There are public benchmarks out there, but they may or may not reflect the kinds of workloads that our customers want to run. Our goal is to make our customers’ experience as good as possible, so we focus on speeding up the sorts of workloads they ask about.
And here is a presentation at Slideshare.net on more of what Peter works on.

Analytics and BI for small biz

I saw a story on Warren Buffett and Goldman Sachs creating a $500 million pool for small business owners.

  • The program will contribute $200 million to community colleges, universities and other institutions to provide small- business owners with practical business education.

  • Goldman Sachs repaid the $10 billion it was given last year under the taxpayer-funded Troubled Asset Relief Program, plus dividends. The firm continues to benefit from federal guarantees on about $21 billion of long-term debt.

  • Buffett, known as the “Oracle of Omaha” for his investing prowess, is the second-richest American. Berkshire, which invests in companies ranging from retailers to insurers, paid $5 billion in September 2008 to acquire preferred stock in Goldman Sachs that pays a 10 percent dividend. Berkshire, based in Omaha, Nebraska, also gained five-year warrants to buy $5 billion of common stock at $115 per share.

  • (NOTE: The current price of GS shares is $172. That’s roughly a 50% profit on $5 billion, or about $2.5 billion, for Mr Buffett, but he is probably waiting for long-term capital gains tax rates to kick in before encashing his patriotic “Buy American. I am” warrants; see the NYT op-ed by him: http://www.nytimes.com/2008/10/17/opinion/17buffett.html )
  • A better analysis of the above Bloomberg story was given on Bloomberg itself at http://www.bloomberg.com/apps/news?pid=20601039&sid=asjp51YPDwJU
  • A small thought: could smaller businesses gain from the efficiencies of programs like SPSS, SAS and R? Or would they be better off with customized GUIs linked to their POS data?
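The back-of-the-envelope figure in the note on the Goldman Sachs warrants can be checked in a few lines. This is only a rough sketch: the function name `warrant_gain` is made up for illustration, and it ignores taxes, dilution and the time value of the warrants. It simply values $5 billion of stock bought at the $115 strike at the $172 price quoted above.

```python
def warrant_gain(notional, strike, price):
    """Paper gain from exercising warrants worth `notional` at `strike`
    and valuing the shares at `price` (taxes and dilution ignored)."""
    shares = notional / strike           # shares the warrants can buy
    gain = shares * (price - strike)     # paper profit at the current price
    return shares, gain, gain / notional


# Figures from the note: $5 billion of stock at $115, shares now at $172.
shares, gain, pct = warrant_gain(5e9, 115.0, 172.0)
print(f"{shares / 1e6:.1f}M shares, gain ${gain / 1e9:.2f}B ({pct:.1%})")
# prints: 43.5M shares, gain $2.48B (49.6%)
```

So "50% on 5 billion, about 2.5 billion" is a fair rounding of the paper gain.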

Anyway, analytics for small businesses in inventory management and sales planning could help. Joe the Plumber could do with some ETS and regression models as well.
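To give a feel for how small such a model can be, here is a minimal sketch of simple exponential smoothing, the most basic member of the exponential smoothing family. It is an illustration only: the weekly sales numbers and the choice of alpha are made up, and a real ETS model would also handle trend and seasonality and would fit alpha from the data.

```python
def simple_exponential_smoothing(series, alpha):
    """One-step-ahead forecast: each new observation gets weight `alpha`,
    the running level keeps weight (1 - alpha)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]                    # start from the first observation
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical weekly unit sales for a small shop (invented numbers).
weekly_sales = [120, 132, 101, 134, 90, 110]
forecast = simple_exponential_smoothing(weekly_sales, alpha=0.5)
print(forecast)  # 108.4375
```

A shop owner could use a number like this as next week's stocking estimate, which is the kind of inventory and sales planning help meant above.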

However, apart from Salesforce.com applications, this field seems to be totally vacant for analytics. What are IBM SPSS, SAS, or even other stats packages doing for small businesses? Are they even developing Salesforce.com applications for their own equivalent software?

The market could be an interesting one to at least run a test in. Unless you don’t believe in test and control.

See below the IBM Cognos app by IBM itself and the third-party app by Pervasive for SAP integration-

Citation-

http://sites.force.com/appexchange/listingDetail?listingId=a0N300000016YGYEA2

and

http://sites.force.com/appexchange/listingDetail?listingId=a0N300000016am1EAA

The World of Data as I think

After discussions about my performance at grad school and what exactly I want to work in, I drew the following curves.

Feel free to draw better circles- and I will include your reference here

Caution- Based upon a very ordinary understanding of extraordinary technical things.

THE WORLD OF DATA

Screenshot-18

AND WHAT I WANT TO DO IN IT

Screenshot-19

ps- What do you think? Add a comment

“Build a better mousetrap, and the world will beat a path to your door.”- Emerson

Interview Ken O’Connor Business Intelligence Consultant

Here is an interview with an industry veteran of Business Intelligence, Ken O’Connor.

Ajay- Describe your career journey across the full development cycle of Business Intelligence.

Ken- I started my career in the early 80’s in the airline industry, where I worked as an application programmer and later as a systems programmer. I took a computer science degree by night. The airline industry was one of the first to implement computer systems in the ‘60s, and the legacy of being an early adopter was that airline reservation systems were developed in Assembler. Remarkable as it sounds now, as application programmers, we wrote our own file access methods. Even more remarkable, as systems programmers, we modified the IBM supplied Operating System, originally known as the Airline Control Program (ACP), later renamed as Transaction Processing Facility (TPF). The late ‘80s saw the development of Global “Computer Reservations Systems” (CRS systems) including AMADEUS and GALILEO. I moved from Aer Lingus, a small Irish airline, to work in London on the British Airways systems, to enable the British Airways systems to share information and communicate with the new Global CRS systems.

I learnt very important lessons during those years.

* The criticality of standards

* The drive for interoperability of systems

* The drive towards information sharing

* The drive away from bespoke development

In the 90’s I returned to Dublin, where I worked as an independent consultant with IBM on many data intensive projects. On one project I was lead developer in the IBM Dublin Laboratory on the development of the Data Replication tool called “Data Propagator NonRelational”. This tool automatically propagates updates made on IMS databases to DB2 databases. On this project, we successfully piloted using the Cleanroom Development Method, as part of IBM’s drive towards Six Sigma quality.

In the past 15 years I have moved away from IT towards the business. I describe myself as a Hybrid. I believe there is a serious communications gap between business users and IT, and this is a frequent cause of project failures. I seek to bridge that gap. I ensure that requirements are clear, measurable, testable, and capable of being easily understood and signed off by business owners.

One of my favorite programmes was Euro Changeover. This was a hugely data intensive programme. It was the largest changeover undertaken by European Financial Institutions. I worked as an independent consultant with the IBM Euro Centre of Competence. I developed changeover strategies for a number of Irish Enterprises, and was the End to End IT changeover process owner in a major Irish bank. Every application and every data store holding currency sensitive data (not just amounts, but currency signs etc.) had to be converted at exactly the same time to ensure that all systems successfully switched to euro processing on 1st January 2002.

I learnt many, many lasting lessons about data the hard way on Euro Changeover programmes, such as:

* The extent to which seemingly separate applications share operational data – often without the knowledge of the owning application.

* The extent to which business users use (abuse) data fields to hold information never intended for the data field.

* The critical distinction between the underlying data (in a data store) and the information displayed to a business user.

I have worked primarily on what I call “End of food chain” projects and programmes, such as Single View of Customer, data migrations, and data population of repositories for BASEL II and Anti Money Laundering (AML) systems. Business Intelligence is another example of an “End of food chain” project. “End of food-chain” projects share the following characteristics:

* Dependent on existing data

* No control over the quality of existing data they depend on

* No control over the data entry processes by which the data they require is captured.

* The data required may have been captured many years previously.

Recently, I have shared my experience of “Enterprise wide data issues” in a series of posts on my blog, together with a process for assessing the status of those issues within an Enterprise (more details). In my experience, the success of a Business Intelligence programme and the ease with which an Enterprise completes “End of food chain” data dependent programmes directly depends on the status of the common Enterprise Wide data issues I have identified.

Ajay- Describe the educational scene for science graduates in Ireland. What steps do you think governments and universities can take to better teach science and keep young people excited about it?

Ken- I am not in a position to comment on the educational scene for science graduates in Ireland. However, I can say that currently there are insufficient numbers of school children studying science in primary and 2nd level education. There is a need to excite young people about science. There is a need for more interactive science museums, like W5 in Belfast which is hugely successful. Kids love to get involved, and practical science can be great fun.

Ajay- What are some of the key trends in business intelligence that you have seen?

Ken- Since the earliest days of my career, I have seen an ever increasing move towards standards based interoperability of systems, and interchange of data. This has accelerated dramatically in recent years. This is the good news. Further good news is the drive towards the use of external reference databases to verify the accuracy of data, at point of data entry (See blog post on Upstream prevention by Henrik Liliendahl Sørensen). One example of this drive is cloud based verification services from new companies like Ireland based Clavis Technology.

The harsh reality is that “Old hardware goes into museums, while old software goes into production every night”. Enterprises have invested vast amounts of money in legacy applications over decades. These legacy systems access legacy data in legacy data stores. This legacy data will continue to pose challenges in the delivery of Business Intelligence to the Business community that needs it. These challenges will continue to provide opportunities for Data Quality professionals.

Ajay- What is going to be the next fundamental change in this industry in your opinion?

Ken- The financial crisis will result in increased regulatory requirements. This will be good news for the Business Intelligence / Data Quality industry. In time, it will no longer be sufficient to provide the regulator with ‘just’ the information requested. The regulator will want to see the process by which the information was gathered, the process controls, and evidence of the quality of the underlying data from which the information was derived. This move will result in funding for Data Governance programmes, which will lead to increased innovation in our industry.

Ajay- Describe your startup Map My Business, your target customer and your vision for it.

Ken- I started MapMyBusiness.com as a “recession buster”. Ireland was hit particularly hard by the financial crisis. I had become over dependent on the financial services industry, and a blanket ban on the use of external consultants left me with no option but to reinvent myself. MapMyBusiness.com helps small businesses to attract clients, by getting them on Google page one. Having been burnt by an over dependence on one industry, my vision is to diversify. I believe that Data Governance is industry independent, and I am focussing on increasing my customer base for my Data Governance consultancy skills, via my company Professional IT Personnel Ltd.

Ajay- What do you do when not working with customers or blogging on your website?

Ken- I try to achieve a reasonable work/life balance. I am married with two children aged 12 and 10, and like to spend time with them, especially outdoors, walking, hiking, playing tennis etc. I am involved in my community, lobbying for improved cycling infrastructure in our area (more details). Ireland, like most countries, is facing an obesity epidemic, due to an increasingly sedentary lifestyle. Too many people get little or no exercise, and don’t have the time, willpower, or perhaps money, to regularly work out in a gym. By including “Active Travel” in our daily lives – by walking or cycling to schools and local amenities, we can get enough physical exercise to prevent obesity, and obesity related health problems. We need to make our cities, towns and villages more pedestrian and cyclist friendly, to encourage “active travel”. My voluntary work in this area introduced me to mapping (see example), and enabled me to set up MapMyBusiness.com.

Biography-

Ken O’Connor is an independent IT Consultant with almost 30 years of work experience. He specialises in Data: Data Migration, Data Population, Data Governance, Data Quality, Data Profiling… His company is called Professional IT Personnel Ltd.

Ken started his blog (Ken O’Connor Data Consultant) to share his experience and to learn from the experience of others. Dylan Jones, editor of dataqualitypro, describes Ken as a “grizzled veteran”, with almost 30 years’ experience across the full development lifecycle.