Name an industry in which top level executives are mostly white males, new recruits are mostly male (white or Indian/Chinese), women are primarily shunted into publicity relationships, social media or marketing.
Statistical Computing And Business Intelligence are the white man’s last stand to preserve an exclusive club of hail fellow well met and lets catch up after drinks culture. Newer startups are the exception in the business intelligence world , but a whiter face helps (so do an Indian or Chinese male) to attract a mostly male white venture capital industry.
I have earlier talked about technology being totally dominated by Asian males at grad student level and ASA membership almost not representing minorities like blacks and yes women- but this is about corporate culture in the traditional BI world.
If you are connected to the BI or Stat Computing world, who would you rather hire AND who have you actually hired- with identical resumes
White Male or White Female or Brown Indian Male/Female or Yellow Male/Female or Black Male or Black Female
How many Black Grad Assistants do you see in tech corridors- (Nah- it is easier to get a hard working Chinese /Indian- who smiles and does a great job at $12/hour)
How many non- Asian non white Authors do you see in technology and does that compare to pie chart below
Note_ 2010 Census numbers arent available for STEM, and I was unable to find ethnic background for various technology companies, because though these numbers are collected for legal purposes, they are not publicly shared.
Any technology company which has more than 40% women , or more than 10% blacks would be fairly representative to the US population. Anecdotal evidence suggests European employment for minorities is worse (especially for Asians) but better for women.
Any data sources to support/ refute these hypothesis are welcome for purposes of scientific inquiry.
ALISO VIEJO, Calif., Oct 19, 2010 (BUSINESS WIRE) — Predixion Software today introduced Predixion PMML Connexion(TM), an interface that provides Predixion Insight(TM), the company’s low-cost, self-service in the cloud predictive analytics solution, direct and seamless access to SAS, SPSS (IBM) and other predictive models for use by Predixion Insight customers. Predixion PMML Connexion enables companies to leverage their significant investments in legacy predictive analytics solutions at a fraction of the cost of conventional licensing and maintenance fees.
The announcement was made at the Predictive Analytics World conference in Washington, D.C. where Predixion also announced a strategic partnership with Zementis, Inc., a market leader in PMML-based solutions. Zementis is exhibiting in Booth #P2.
The Predictive Model Markup Language (PMML) standard allows for true interoperability, offering a mature standard for moving predictive models seamlessly between platforms. Predixion has fully integrated this PMML functionality into Predixion Insight, meaning Predixion Insight users can now effortlessly import PMML-based predictive models, enabling information workers to score the models in the cloud from anywhere and publish reports using Microsoft Excel(R) and SharePoint(R). In addition, models can also be written back into SAS, SPSS and other platforms for a truly collaborative, interoperable solution.
“Predixion’s investment in this PMML interface makes perfect business sense as the lion’s share of the models in existence today are created by the SAS and SPSS platforms, creating compelling opportunity to leverage existing investments in predictive and statistical models on a low-cost cloud predictive analytics platform that can be fed with enterprise, line of business and cloud-based data,” said Mike Ferguson, CEO of Intelligent Business Strategies, a leading analyst and consulting firm specializing in the areas of business intelligence and enterprise business integration. “In this economy, Predixion’s low-cost, self-service predictive analytics solutions might be welcome relief to IT organizations chartered with quickly adding additional applications while at the same time cutting costs and staffing.”
“We are pleased to be partnering with Zementis, truly a PMML market leader and innovator,” said Predixion CEO Simon Arkell. “To allow any SAS or SPSS customer to immediately score any of their predictive models in the cloud from within Predixion Insight, compare those models to those created by Predixion Insight, and share the results within Excel and Sharepoint is an exciting step forward for the industry. SAS and SPSS customers are fed up with the high prices they must pay for their business users just to access reports generated by highly skilled PhDs who are burdened by performing routine tasks and thus have become a massive bottleneck. That frustration is now a thing of the past because any information worker can now unlock the power of predictive analytics without relying on experts — for a fraction of the cost and from anywhere they can connect to the cloud,” Arkell said.
Dr. Michael Zeller, Zementis CEO, added, “Our mission is to significantly shorten the time-to-market for predictive models in any industry. We are excited to be contributing to Predixion’s self-service, cloud-based predictive analytics solution set.”
About Predixion Software
Predixion Software develops and markets collaborative predictive analytics solutions in the public and private cloud. Predixion enables self-service predictive analytics, allowing customers to use and analyze large amounts of data to make actionable decisions, all within the familiar environment of Excel and PowerPivot. Predixion customers are achieving immediate results across a multitude of industries including: retail, finance, healthcare, marketing, telecommunications and insurance/risk management.
Predixion Software is headquartered in Aliso Viejo, California with development offices in Redmond, Washington. The company has venture capital backing from established investors including DFJ Frontier, Miramar Venture Partners and Palomar Ventures. For more information please contact us at 949-330-6540, or visit us atwww.predixionsoftware.com.
About Zementis
Zementis, Inc. is a leading software company focused on the operational deployment and integration of predictive analytics and data mining solutions. Its ADAPA(R) decision engine successfully bridges the gap between science and engineering. ADAPA(R) was designed from the ground up to benefit from open standards and to significantly shorten the time-to-market for predictive models in any industry. For more information, please visit www.zementis.com.
Google Instant is a relatively newer feature in Google Search Engine- it suggests websites at each type of keyword rather than wait for you to type the whole keyword.
The impact on user experience is incredible- rather than search or scroll through the results- you are more likely to click on the almost one of the ten websites you would have seen by the time you finished typing- or just clicking on the relevant ad (which probably changes on the right margin as fast as the websites below)
This spells a death for all those who indulged in black hat SEO– or link building, link exchanging- as these techniques pushed up your rank in search page only incrementally and rarely to the top 2-3 for a keyword.
Remember the size of the screen is such that each Google instant snapshot basically shows you or rather makes you focus on the top ranked search (and then presumably type on to get a newer result- rather than scroll down as the case was before).
It would be interesting to see or research the effect of keywords in the auction pricing, as well as compare those keyword pricing with Bing.com- Maybe there should be a website api tool for advertisers -like Adwords Instant that would show them the price instantly of keywords,comparison with Bing AND the search engine results for the keyword in a visual way.
Anyways- it is a incredible innovation and it is good Google is back to the math after the flings with being “Mad Men” of advertising.
and yes- I heard there is a new movie coming- it is called “The Search Engine” 🙂
Here’s a group of questions and answers that David Smith of Revolution Analytics was kind enough to answer post the launch of the new R Package which integrates Hadoop and R- RevoScaleR
Ajay- How does RevoScaleR work from a technical viewpoint in terms of Hadoop integration?
David-The point isn’t that there’s a deep technical integration between Revolution R and Hadoop, rather that we see them as complementary (not competing) technologies. Hadoop is amazing at reliably (if slowly) processing huge volumes of distributed data; the RevoScaleR package complements Hadoop by providing statistical algorithms to analyze the data processed by Hadoop. The analogy I use is to compare a freight train with a race car: use Hadoop to slog through a distributed data set and use Map/Reduce to output an aggregated, rectangular data file; then use RevoScaleR to perform statistical analysis on the processed data (and use the speed of RevolScaleR to iterate through many model options to find the best one).
Ajay- How is it different from MapReduce and R Hipe– existing R Hadoop packages?
David- They’re complementary. In fact, we’ll be publishing a white paper soon by Saptarshi Guha, author of the Rhipe R/Hadoop integration, showing how he uses Hadoop to process vast volumes of packet-level VOIP data to identify call time/duration from the packets, and then do a regression on the table of calls using RevoScaleR. There’s a little more detail in this blog post: http://blog.revolutionanalytics.com/2010/08/announcing-big-data-for-revolution-r.html
Ajay- Is it going to be proprietary, free or licensable (open source)?
David- RevoScaleR is a proprietary package, available to paid subscribers (or free to academics) with Revolution R Enterprise. (If you haven’t seen it, you might be interested in this Q&A I did with Matt Shotwell: http://biostatmatt.com/archives/533 )
Ajay- Any existing client case studies for Terabyte level analysis using R.
David- The VOIP example above gets close, but most of the case studies we’ve seen in beta testing have been in the 10’s to 100’s of Gb range. We’ve tested RevoScaleR on larger data sets internally, but we’re eager to hear about real-life use cases in the terabyte range.
Ajay- How can I use RevoScaleR on my dual chip Win Intel laptop for say 5 gb of data.
David- One of the great things about RevoScaleR is that it’s designed to work on commodity hardware like a dual-core laptop. You won’t be constrained by the limited RAM available, and the parallel processing algorithms will make use of all cores available to speed up the analysis even further. There’s an example in this white paper (http://info.revolutionanalytics.com/bigdata.html) of doing linear regression on 13Gb of data on a simple dual-core laptop in less than 5 seconds.
AJ-Thanks to David Smith, for this fast response and wishing him, Saptarshi Guha Dr Norman Nie and the rest of guys at Revolution Analytics a congratulations for this new product launch.
Here are some thoughts on using existing statistical software for better analytics and/or business intelligence (reporting)-
1) User Interface Design Matters- Most stats software have a legacy approach to user interface design. While the Graphical User Interfaces need to more business friendly and user friendly- example you can call a button T Test or You can call it Compare > Means of Samples (with a highlight called T Test). You can call a button Chi Square Test or Call it Compare> Counts Data. Also excessive reliance on drop down ignores the next generation advances in OS- namely touchscreen instead of mouse click and point.
Given the fact that base statistical procedures are the same across softwares, a more thoughtfully designed user interface (or revamped interface) can give softwares an edge over legacy designs.
2) Branding of Software Matters- One notable whine against SAS Institite products is a premier price. But really that software is actually inexpensive if you see other reporting software. What separates a Cognos from a Crystal Reports to a SAS BI is often branding (and user interface design). This plays a role in branding events – social media is often the least expensive branding and marketing channel. Same for WPS and Revolution Analytics.
3) Alliances matter- The alliances of parent companies are reflected in the sales of bundled software. For a complete solution , you need a database plus reporting plus analytical software. If you are not making all three of the above, you need to partner and cross sell. Technically this means that software (either DB, or Reporting or Analytics) needs to talk to as many different kinds of other softwares and formats. This is why ODBC in R is important, and alliances for small companies like Revolution Analytics, WPS and Netezza are just as important as bigger companies like IBM SPSS, SAS Institute or SAP. Also tie-ins with Hadoop (like R and Netezza appliance) or Teradata and SAS help create better usage.
4) Cloud Computing Interfaces could be the edge- Maybe cloud computing is all hot air. Prudent business planing demands that any software maker in analytics or business intelligence have an extremely easy to load interface ( whether it is a dedicated on demand website) or an Amazon EC2 image. Easier interfaces win and with the cloud still in early stages can help create an early lead. For R software makers this is critical since R is bad in PC usage for larger sets of data in comparison to counterparts. On the cloud that disadvantage vanishes. An easy to understand cloud interface framework is here ( its 2 years old but still should be okay) http://knol.google.com/k/data-mining-through-cloud-computing#
5) Platforms matter- Softwares should either natively embrace all possible platforms or bundle in middle ware themselves.
Here is a case study SAS stopped supporting Apple OS after Base SAS 7. Today Apple OS is strong ( 3.47 million Macs during the most recent quarter ) and the only way to use SAS on a Mac is to do either
Why does this matter? Well SAS is free to academics and students from this year, but Mac is a preferred computer there. Well WPS can be run straight away on the Mac (though they are curiously not been able to provide academics or discounted student copies 😉 ) as per
Does this give a disadvantage based on platform. Yes. However JMP continues to be supported on Mac. This is also noteworthy given the upcoming Chromium OS by Google, Windows Azure platform for cloud computing.