Interview Rapid-I - Ingo Mierswa and Simon Fischer

Here is an interview with Dr Ingo Mierswa, CEO of Rapid-I, and Dr Simon Fischer, Head of R&D. Rapid-I makes the very popular software Rapid Miner – perhaps one of the earliest leading open source software packages in business analytics and business intelligence. It is quite easy to use and deploy, and with its extensions and innovations (including compatibility with R) it has continued to grow tremendously through the years.

In an extensive interview Ingo and Simon talk about the algorithms marketplace, extensions, big data analytics, Hadoop, mobile computing and the use of the graphical user interface in analytics.

Special thanks to Nadja from the Rapid-I communication team for helping coordinate this interview. (Statutory Blogging Disclosure- Rapid-I is a marketing partner with Decisionstats as per the terms in https://decisionstats.com/privacy-3/)

Ajay- Describe your background in science. What are the key lessons that you have learnt as a scientific researcher, and what advice would you give to new students today?

Ingo: My time as researcher really was a great experience which has influenced me a lot. I have worked at the AI lab of Prof. Dr. Katharina Morik, one of the persons who brought machine learning and data mining to Europe. Katharina always believed in what we are doing, encouraged us and gave us the space for trying out new things. Funnily enough, I never managed to use my own scientific results in any real-life project so far but I consider this as a quite common gap between science and the “real world”. At Rapid-I, however, we are still heavily connected to the scientific world and try to combine the best of both worlds: solving existing problems with leading-edge technologies.

Simon: In fact, during my academic career I have not worked in the field of data mining at all. I worked in a field some of my colleagues would probably even consider boring, and that is theoretical computer science. To be precise, my research was at the intersection of game theory and network theory. During that time, I have learnt a lot of exciting things, none of which had any business use. Still, I consider that a very valuable experience. When we at Rapid-I hire people coming to us right after graduating, I don’t care whether they know the latest technology with a fancy three-letter acronym – that will be forgotten more quickly than it came. What matters is the way you approach new problems and challenges. And that is also my recommendation to new students: work on whatever you like, as long as you are passionate about it and it brings you forward.

Ajay- How is the Rapid Miner Extensions marketplace moving along? Do you think there is scope for people to, say, create algorithms in a platform like R, and then offer that algorithm as an app for sale just like iTunes or Android apps?

 Simon: Well, of course it is not going to be exactly like iTunes or Android apps are, because of the more business-orientated character. But in fact there is a scope for that, yes. We have talked to several developers, e.g., at our user conference RCOMM, and several people would be interested in such an opportunity. Companies using data mining software need supported software packages, not just something they downloaded from some anonymous server, and that is only possible through a platform like the new Marketplace. Besides that, the marketplace will not only host commercial extensions. It is also meant to be a platform for all the developers that want to publish their extensions to a broader community and make them accessible in a comfortable way. Of course they could just place them on their personal Web pages, but who would find them there? From the Marketplace, they are installable with a single click.

Ingo: What I like most about the new Rapid-I Marketplace is the fact that people can now get something back for their efforts. Developing a new algorithm is a lot of work, in some cases even more than developing a nice app for your mobile phone. It is completely accepted that people buy apps from a store for a couple of dollars and I foresee the same for sharing and selling algorithms instead of apps. Right now, people can already share algorithms and extensions for free; one of the next versions will also support selling of those contributions. Let’s see what’s happening next, maybe we will add the option to sell complete RapidMiner workflows or even some data pools…

Ajay- What are the recent features in Rapid Miner that support cloud computing, mobile computing and tablets? How do you think the landscape for Big Data (over 1 TB) is changing, and how is Rapid Miner adapting to it?

Simon: These are areas we are very active in. For instance, we have an In-Database-Mining Extension that allows the user to run their modelling algorithms directly inside the database, without ever loading the data into memory. Using analytic databases like Vectorwise or Infobright, this technology can really boost performance. Our data mining server, RapidAnalytics, already offers functionality to send analysis processes into the cloud. In addition to that, we are currently preparing a research project dealing with data mining in the cloud. A second project is targeted towards the other aspect you mention: the use of mobile devices. This is certainly a growing market, of course not for designing and running analyses, but for inspecting reports and results. But even that is tricky: When you have a large screen you can display fancy and comprehensive interactive dashboards with drill downs and the like. On a mobile device, that does not work, so you must bring your reports and visualizations very much to the point. And this is precisely what data mining can do – and what is hard to do for classical BI.

Ingo: Then there is Radoop, which you may have heard of. It uses the Apache Hadoop framework for large-scale distributed computing to execute RapidMiner processes in the cloud. Radoop has been presented at this year’s RCOMM and people are really excited about the combination of RapidMiner with Hadoop and the scalability this brings.

 Ajay- Describe the Rapid Miner analytics certification program and what steps are you taking to partner with academic universities.

Ingo: The Rapid-I Certification Program was created to recognize professional users of RapidMiner or RapidAnalytics. The idea is that certified users have demonstrated a deep understanding of the data analysis software solutions provided by Rapid-I and how they are used in data analysis projects. Taking part in the Rapid-I Certification Program offers a lot of benefits for IT professionals as well as for employers: professionals can demonstrate their skills and employers can make sure that they hire qualified professionals. We started our certification program only about 6 months ago and about 100 professionals have been certified so far.

Simon: During our annual user conference, the RCOMM, we have plenty of opportunities to talk to people from academia. We’re also present at other conferences, e.g. at ECML/PKDD, and we are sponsoring data mining challenges and grants. We maintain strong ties with several universities all over Europe and the world, which is something that I would not want to miss. We are also cooperating with institutes like the ITB in Dublin during their training programmes, e.g. by giving lectures, etc. Also, we are leading or participating in several national or EU-funded research projects, so we are still close to academia. And we offer an academic discount on all our products 🙂

Ajay- Describe the global efforts in making Rapid Miner a truly international software including spread of developers, clients and employees.

Simon: Our clients already are very international. We have a partner network in America, Asia, and Australia, and, while I am responding to these questions, we have a training course in the US. Developers working on the core of RapidMiner and RapidAnalytics, however, are likely to stay in Germany for the foreseeable future. We need specialists for that, and it would be pointless to spread the development team over the globe. That is also owed to the agile philosophy that we are following.

Ingo: Simon is right, Rapid-I already is acting on an international level. Rapid-I now has more than 300 customers from 39 countries in the world, which is a great result for a young company like ours. We are of course very strong in Germany and also the rest of Europe, but also concentrate on more countries by means of our very successful partner network. Rapid-I continues to build this partner network and to recruit dynamic and knowledgeable partners in the future. However, extending and acting globally is definitely part of our strategic roadmap.

Biography

Dr. Ingo Mierswa is working as Chief Executive Officer (CEO) of Rapid-I. He has several years of experience in project management, human resources management, consulting, and leadership including eight years of coordinating and leading the multi-national RapidMiner developer team with about 30 developers and contributors world-wide. He wrote his PhD titled “Non-Convex and Multi-Objective Optimization for Numerical Feature Engineering and Data Mining” at the University of Dortmund under the supervision of Prof. Morik.

Dr. Simon Fischer is heading the research & development at Rapid-I. His interests include game theory and networks, the theory of evolutionary algorithms (e.g. on the Ising model), and theoretical and practical aspects of data mining. He wrote his PhD in Aachen where he worked in the project “Design and Analysis of Self-Regulating Protocols for Spectrum Assignment” within the excellence cluster UMIC. Before, he was working on the vtraffic project within the DFG Programme 1126 “Algorithms for large and complex networks”.

http://rapid-i.com/content/view/181/190/ tells you more on the various types of Rapid Miner licensing for enterprise, individual and developer versions.

(Note from Ajay- to receive an early edition invite to Radoop, click here http://radoop.eu/z1sxe)

 

How to surf anonymously on the mobile- Use Orbot

This is an interesting use case: anonymous surfing on a mobile device by using Tor through Orbot on the Android mobile OS.

Source- https://guardianproject.info/apps/orbot/
 

Orbot requires different configuration depending on the Android operating system version it is used on.

For standard Android 1.x devices (G1, MyTouch3G, Hero, Droid Eris, Cliq, Moment)

  • WEB BROWSING: You can use the Orweb Privacy Browser which we offer, which only works via Orbot and Tor.
  • For Instant Messaging, please try Gibberbot which provides integrated, optional support for Orbot and Tor.

For Android 2.x devices: Droid, Nexus, Evo, Galaxy

  • WEB BROWSING: Non-rooted devices should use Firefox for Android with our ProxyMob Add-On to browse via the Tor network. Rooted devices can take advantage of transparent proxying (see below) and do not need an additional app installed.
  • Transparent Proxying: You must root your device in order for Orbot to work transparently for all web and DNS traffic. If you root your device, whether it is 1.x or 2.x based, Orbot will automatically, transparently proxy all web traffic on port 80 and 443 and all DNS requests. This includes the built-in Browser, Gmail, YouTube, Maps and any other application that uses standard web traffic.
  • For Instant Messaging, please try Gibberbot which provides integrated, optional support for Orbot and Tor.

Interview Mike Boyarski Jaspersoft

Here is an interview with Mike Boyarski, Director of Product Marketing at Jaspersoft, which claims the largest BI community with over 14 million downloads, nearly 230,000 registered members, representing over 175,000 production deployments, 14,000 customers, across 100 countries.

Ajay- Describe your career in science from Biology to marketing great software.

Mike- I studied Biology with the assumption I’d pursue a career in medicine. It took about 2 weeks during an internship at a Los Angeles hospital to determine I should do something else. I enjoyed learning about life science, but the whole health care environment was not for me. I was initially introduced to enterprise-level software while at Applied Materials within their Microcontamination group. I was able to assist with an internal application used to collect contamination data. I later joined Oracle to work on an Oracle Forms application used to automate the production of software kits (back when documentation and CDs had to be physically shipped to recognize revenue). This gave me hands-on experience with Oracle 7, web application servers, and the software development process.

I then transitioned to product management for various products including application servers, software appliances, and Oracle’s first generation SaaS-based software infrastructure. In 2006, with the Siebel and PeopleSoft acquisitions underway, I moved on to Ingres to help re-invigorate their solid yet antiquated technology. This introduced me to commercial open source software and the broader Business Intelligence market. From Ingres I joined Jaspersoft, one of the first and most popular open source Business Intelligence vendors, serving as head of product marketing since mid 2009.

Ajay- Describe some of the new features in Jaspersoft 4.1 that help differentiate it from the rest of the crowd. What are the exciting product features we can expect from Jaspersoft over the next couple of years?

Mike- Jaspersoft 4.1 was an exciting release for our customers because we were able to extend the latest UI advancements in our ad hoc report designer to the data analysis environment. Now customers can use a unified intuitive web-based interface to perform several powerful and interactive analytic functions across any data source, whether it’s relational, non-relational, or a Big Data source.

The reality is that most (roughly 70%) of today’s BI adoption is in the form of reports and dashboards. These tools are used to drive and measure an organization’s business. However, data analysis presents the most strategic opportunity for companies because it can identify new opportunities, efficiencies, and competitive differentiation. As more data comes online, the difference between those companies that are successful and those that are not will likely be attributed to their ability to harness data analysis techniques to drive and improve business performance. Thus, with Jaspersoft 4.1 and our improved ad hoc reporting and analysis UI, we can effectively address a broader set of BI requirements for organizations of all sizes.

Ajay- What do you think is a good metric to measure the influence of an open source software product: is it revenue, number of downloads or number of users? How does Jaspersoft do by these counts?

Mike- History has shown that open source software is successful as a “bottom-up” disrupter within IT or the developer market. Today, many new software projects and startup ventures are birthed on open source software, often initiated with little to no budget. As the organization achieves success with a particular project, the next initiative tends to be larger and more strategic, often displacing what was historically solved with a proprietary solution. These larger deployments strengthen the technology over time.

Thus, how proven and battle-tested an open source solution is, often measured via downloads, deployments, community size, and community activity, usually determines its long-term success. Linux, Tomcat, and MySQL have plenty of statistics to model this lifecycle. This model is no different for open source BI.

The success to date of Jaspersoft is directly tied to its solid proven technology and the vibrancy of the community. We proudly and openly claim to have the largest BI community with over 14 million downloads, nearly 230,000 registered members, representing over 175,000 production deployments, 14,000 customers, across 100 countries. Every day, 30,000 developers are using Jaspersoft to build BI applications. Behind Excel, it’s hard to imagine a more widely used BI tool in the market. Jaspersoft could not reach these kinds of numbers with crippled or poorly architected software.

Ajay- What are your plans for leveraging cloud computing, mobile and tablet platforms, and for making Jaspersoft easier to use globally?

Google Speed Test

Here is a new service in beta that helps you test your website for speed. It continues a series of initiatives by Google to help the internet (and their own computing resources).

If you want to request access to the limited beta, use the request form here:

https://docs.google.com/spreadsheet/viewform?hl=en_US&formkey=dDdjcmNBZFZsX2c0SkJPQnR3aGdnd0E6MQ&ifq

 


The Top Statisticians in the World

http://en.wikipedia.org/wiki/John_Tukey

 

John Tukey

From Wikipedia, the free encyclopedia

John Wilder Tukey
Born: June 16, 1915, New Bedford, Massachusetts, USA
Died: July 26, 2000 (aged 85), New Brunswick, New Jersey
Residence: United States
Nationality: American
Fields: Mathematician
Institutions: Bell Labs; Princeton University
Alma mater: Brown University; Princeton University
Doctoral advisor: Solomon Lefschetz
Doctoral students: Frederick Mosteller; Kai Lai Chung
Known for: FFT algorithm; box plot; coining the term ‘bit’
Notable awards:
  • Samuel S. Wilks Award (1965)
  • National Medal of Science (USA) in Mathematical, Statistical, and Computational Sciences (1973)
  • Shewhart Medal (1976)
  • IEEE Medal of Honor (1982)
  • Deming Medal (1982)
  • James Madison Medal (1984)
  • Foreign Member of the Royal Society (1991)

John Wilder Tukey ForMemRS[1] (June 16, 1915 – July 26, 2000) was an American statistician.

Biography

Tukey was born in New Bedford, Massachusetts in 1915, and obtained a B.A. in 1936 and M.Sc. in 1937, in chemistry, from Brown University, before moving to Princeton University where he received a Ph.D. in mathematics.[2]

During World War II, Tukey worked at the Fire Control Research Office and collaborated with Samuel Wilks and William Cochran. After the war, he returned to Princeton, dividing his time between the university and AT&T Bell Laboratories.

Among many contributions to civil society, Tukey served on a committee of the American Statistical Association that produced a report challenging the conclusions of the Kinsey Report, Statistical Problems of the Kinsey Report on Sexual Behavior in the Human Male.

He was awarded the IEEE Medal of Honor in 1982 “For his contributions to the spectral analysis of random processes and the fast Fourier transform (FFT) algorithm.”

Tukey retired in 1985. He died in New Brunswick, New Jersey on July 26, 2000.

Scientific contributions

His statistical interests were many and varied. He is particularly remembered for his development with James Cooley of the Cooley–Tukey FFT algorithm. In 1970, he contributed significantly to what is today known as the jackknife estimation, also termed Quenouille-Tukey jackknife. He introduced the box plot in his 1977 book, “Exploratory Data Analysis”.

Tukey’s range test, the Tukey lambda distribution, Tukey’s test of additivity and Tukey’s lemma all bear his name. He is also the creator of several little-known methods such as the trimean and median-median line, an easier alternative to linear regression.
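As a small illustration of one of these methods: the trimean is a weighted average of the quartiles and the median, (Q1 + 2 × median + Q3) / 4. The Python sketch below is only an illustration under stated assumptions. The function name `trimean` is made up here, and the quartiles come from the standard library's `statistics.quantiles` with the inclusive method; other quartile conventions give slightly different values.

```python
from statistics import quantiles


def trimean(values):
    """Tukey's trimean: (Q1 + 2 * median + Q3) / 4.

    Assumption: quartiles use the 'inclusive' convention of
    statistics.quantiles; other conventions differ slightly.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


# A single extreme value moves the mean a lot but leaves the trimean alone.
clean = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]
print(trimean(clean), trimean(with_outlier))
```

Because it is built only from quartiles, the trimean stays put when one observation is replaced by an extreme value, which is the kind of robustness Tukey's exploratory methods aim for.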

In 1974, he developed, with Jerome H. Friedman, the concept of the projection pursuit.[3]

http://en.wikipedia.org/wiki/Ronald_Fisher

Sir Ronald Aylmer Fisher FRS (17 February 1890 – 29 July 1962) was an English statistician, evolutionary biologist, eugenicist and geneticist. Among other things, Fisher is well known for his contributions to statistics by creating Fisher’s exact test and Fisher’s equation. Anders Hald called him “a genius who almost single-handedly created the foundations for modern statistical science”[1] while Richard Dawkins named him “the greatest biologist since Darwin”.[2]

 

http://en.wikipedia.org/wiki/William_Sealy_Gosset

William Sealy Gosset (June 13, 1876–October 16, 1937) is famous as a statistician, best known by his pen name Student and for his work on Student’s t-distribution.

Born in Canterbury, England to Agnes Sealy Vidal and Colonel Frederic Gosset, Gosset attended Winchester College before reading chemistry and mathematics at New College, Oxford. On graduating in 1899, he joined the Dublin brewery of Arthur Guinness & Son.

Guinness was a progressive agro-chemical business and Gosset would apply his statistical knowledge both in the brewery and on the farm, to the selection of the best yielding varieties of barley. Gosset acquired that knowledge by study, trial and error and by spending two terms in 1906–7 in the biometric laboratory of Karl Pearson. Gosset and Pearson had a good relationship and Pearson helped Gosset with the mathematics of his papers. Pearson helped with the 1908 papers but he had little appreciation of their importance. The papers addressed the brewer’s concern with small samples, while the biometrician typically had hundreds of observations and saw no urgency in developing small-sample methods.

Another researcher at Guinness had previously published a paper containing trade secrets of the Guinness brewery. To prevent further disclosure of confidential information, Guinness prohibited its employees from publishing any papers regardless of the contained information. However, after pleading with the brewery and explaining that his mathematical and philosophical conclusions were of no possible practical use to competing brewers, he was allowed to publish them, but under a pseudonym (“Student”), to avoid difficulties with the rest of the staff.[1] Thus his most famous achievement is now referred to as Student’s t-distribution, which might otherwise have been Gosset’s t-distribution.
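To make the small-sample setting concrete, here is a minimal Python sketch of the one-sample t statistic, t = (sample mean − μ0) / (s / √n), which is compared against Student's t-distribution with n − 1 degrees of freedom. The function name `t_statistic` and the sample values are hypothetical and chosen only for illustration; they are not taken from Gosset's papers.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n)).

    s is the sample standard deviation (n - 1 in the denominator).
    The result is referred to a t-distribution with n - 1 degrees of freedom.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


# Hypothetical measurements from a handful of batches (illustrative only).
batches = [1, 2, 3]
print(t_statistic(batches, mu0=0))
```

With only a few observations the sample standard deviation is itself uncertain, which is why the statistic follows the heavier-tailed t-distribution and not the normal distribution.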

Facebook to Google Plus Migration

There is already a new tool for that, but you are on your own if your data gets redirected. Does Chrome take legal liability for malware extensions? Dunno. And yes, it works on Chrome alone (at the time of writing).

https://chrome.google.com/webstore/detail/ficlccidpkaiepnnboobcmafnnfoomga

 

Facebook Friend Exporter
Verified author: mohamedmansour.com
Free
Get *your* data contact out of Facebook to Google Contacts or CSV, whether they want you to or not.
103 ratings
5,527 users
Description
Get *your* data contact out of Facebook, whether they want you to or not. You gave them your friends and allowed them to store that data, and you have right to take it back out! Facebook doesn't own my friends. Only available in English Facebook. Any other language will not work.

SOURCE CODE: http://goo.gl/VtRCl (GitHub) fb-exporter

PRE NOTICE:
 1 - Must have English version of Facebook for this to work (you can switch)
 2 - Do not enable SSL for Facebook use HTTP not HTTPS
 3 - If you need any help running this, contact me. Commenting below will be lost.
 4 - An "Export" button will appear on Facebooks toolbar after refresh once installed.
 5 - Please disable all Facebook Extensions that you have downloaded, many of them affect the page. For example "Better Facebook" breaks this extension.

This extension will allow you to get your friends information that they shared to you: