Home » Posts tagged 'Graphical user interface'
Tag Archives: Graphical user interface
I got interviewed on moving on from Excel to R in Human Resources (HR) here at http://www.hrtecheurope.com/blog/?p=5345
“There is a lot of data out there and it’s stored in different formats. Spreadsheets have their uses but they’re limited in what they can do. The spreadsheet is bad when getting over 5000 or 10000 rows – it slows down. It’s just not designed for that. It was designed for much higher levels of interaction.
In the business world we really don’t need to know every row of data, we need to summarise it, we need to visualise it and put it into a powerpoint to show to colleagues or clients.”
And a more recent interview with my fellow IIML mate, and editor at Analytics India Magazine
AIM: Which R packages do you use the most and which ones are your favorites?
AO: I use R Commander and Rattle a lot, and I use the dependent packages. I use car for regression, and forecast for time series, and many packages for specific graphs. I have not mastered ggplot though but I do use it sometimes. Overall I am waiting for Hadley Wickham to come up with an updated book to his ecosystem of packages as they are very formidable, completely comprehensive and easy to use in my opinion, so much I can get by the occasional copy and paste code.
A surprising review at R- Bloggers.com /Intelligent Trading
The good news is that many of the large companies do not view R as a threat, but as a beneficial tool to assist their own software capabilities.
After assisting and helping R users navigate through the dense forest of various GUI interface choices (in order to get R up and running), Mr. Ohri continues to handhold users through step by step approaches (with detailed screen captures) to run R from various simple to more advanced platforms (e.g. CLOUD, EC2) in order to gather, explore, and process data, with detailed illustrations on how to use R’s powerful graphing capabilities on the back-end.
Do you want to write a review too? You can visit the site here
- What does R do? Bring people together, of course! (r-bloggers.com)
- Book Review: R for Business Analytics, A Ohri (r-bloggers.com)
I love GUIs (graphical user interfaces)- they might be TCL/TK based or GTK based or even QT based. As a researcher they help me with faster coding, as a consultant they help with faster transition of projects from startup to handover stage and as an R instructor helps me get people to learn R faster.
I wish Python had some GUIs though ;)
from the open access journal of statistical software-
JSS Special Volume 49: Graphical User Interfaces for R
Pedro M. Valero-Mora, Ruben Ledesma
Vol. 49, Issue 1, Jun 2012
Submitted 2012-06-03, Accepted 2012-06-03
Ya-Shan Cheng, Chien-Yu Peng
Vol. 49, Issue 2, Jun 2012
Submitted 2010-12-31, Accepted 2011-06-29
Joris J. Snellenburg, Sergey Laptenok, Ralf Seger, Katharine M. Mullen, Ivo H. M. van Stokkum
Vol. 49, Issue 3, Jun 2012
Submitted 2011-01-20, Accepted 2011-09-16
Marcel Austenfeld, Wolfram Beyschlag
Vol. 49, Issue 4, Jun 2012
Submitted 2011-01-05, Accepted 2012-02-20
Byron C. Wallace, Issa J. Dahabreh, Thomas A. Trikalinos, Joseph Lau, Paul Trow, Christopher H. Schmid
Vol. 49, Issue 5, Jun 2012
Submitted 2010-11-01, Accepted 2012-12-20
Bei Huang, Dianne Cook, Hadley Wickham
Vol. 49, Issue 6, Jun 2012
Submitted 2011-01-20, Accepted 2012-04-16
John Fox, Marilia S. Carvalho
Vol. 49, Issue 7, Jun 2012
Submitted 2010-12-26, Accepted 2011-12-28
Vol. 49, Issue 8, Jun 2012
Submitted 2011-02-28, Accepted 2011-09-08
Stefan Rödiger, Thomas Friedrichsmeier, Prasenjit Kapat, Meik Michalke
Vol. 49, Issue 9, Jun 2012
Submitted 2010-12-28, Accepted 2011-05-06
Vol. 49, Issue 10, Jun 2012
Submitted 2010-12-17, Accepted 2011-05-11
Vol. 49, Issue 11, Jun 2012
Submitted 2010-12-08, Accepted 2011-07-15
Here is an interview with Dr Ingo Mierswa , CEO of Rapid -I and Dr Simon Fischer, Head R&D. Rapid-I makes the very popular software Rapid Miner – perhaps one of the earliest leading open source software in business analytics and business intelligence. It is quite easy to use, deploy and with it’s extensions and innovations (including compatibility with R )has continued to grow tremendously through the years.
In an extensive interview Ingo and Simon talk about algorithms marketplace, extensions , big data analytics, hadoop, mobile computing and use of the graphical user interface in analytics.
Special Thanks to Nadja from Rapid I communication team for helping coordinate this interview.( Statuary Blogging Disclosure- Rapid I is a marketing partner with Decisionstats as per the terms in http://decisionstats.com/privacy-3/)
Ajay- Describe your background in science. What are the key lessons that you have learnt while as scientific researcher and what advice would you give to new students today.
Ingo: My time as researcher really was a great experience which has influenced me a lot. I have worked at the AI lab of Prof. Dr. Katharina Morik, one of the persons who brought machine learning and data mining to Europe. Katharina always believed in what we are doing, encouraged us and gave us the space for trying out new things. Funnily enough, I never managed to use my own scientific results in any real-life project so far but I consider this as a quite common gap between science and the “real world”. At Rapid-I, however, we are still heavily connected to the scientific world and try to combine the best of both worlds: solving existing problems with leading-edge technologies.
Simon: In fact, during my academic career I have not worked in the field of data mining at all. I worked on a field some of my colleagues would probably even consider boring, and that is theoretical computer science. To be precise, my research was in the intersection of game theory and network theory. During that time, I have learnt a lot of exciting things, none of which had any business use. Still, I consider that a very valuable experience. When we at Rapid-I hire people coming to us right after graduating, I don’t care whether they know the latest technology with a fancy three-letter acronym – that will be forgotten more quickly than it came. What matters is the way you approach new problems and challenges. And that is also my recommendation to new students: work on whatever you like, as long as you are passionate about it and it brings you forward.
Ajay- How is the Rapid Miner Extensions marketplace moving along. Do you think there is a scope for people to say create algorithms in a platform like R , and then offer that algorithm as an app for sale just like iTunes or Android apps.
Simon: Well, of course it is not going to be exactly like iTunes or Android apps are, because of the more business-orientated character. But in fact there is a scope for that, yes. We have talked to several developers, e.g., at our user conference RCOMM, and several people would be interested in such an opportunity. Companies using data mining software need supported software packages, not just something they downloaded from some anonymous server, and that is only possible through a platform like the new Marketplace. Besides that, the marketplace will not only host commercial extensions. It is also meant to be a platform for all the developers that want to publish their extensions to a broader community and make them accessible in a comfortable way. Of course they could just place them on their personal Web pages, but who would find them there? From the Marketplace, they are installable with a single click.
Ingo: What I like most about the new Rapid-I Marketplace is the fact that people can now get something back for their efforts. Developing a new algorithm is a lot of work, in some cases even more that developing a nice app for your mobile phone. It is completely accepted that people buy apps from a store for a couple of Dollars and I foresee the same for sharing and selling algorithms instead of apps. Right now, people can already share algorithms and extensions for free, one of the next versions will also support selling of those contributions. Let’s see what’s happening next, maybe we will add the option to sell complete RapidMiner workflows or even some data pools…
Ajay- What are the recent features in Rapid Miner that support cloud computing, mobile computing and tablets. How do you think the landscape for Big Data (over 1 Tb ) is changing and how is Rapid Miner adapting to it.
Simon: These are areas we are very active in. For instance, we have an In-Database-Mining Extension that allows the user to run their modelling algorithms directly inside the database, without ever loading the data into memory. Using analytic databases like Vectorwise or Infobright, this technology can really boost performance. Our data mining server, RapidAnalytics, already offers functionality to send analysis processes into the cloud. In addition to that, we are currently preparing a research project dealing with data mining in the cloud. A second project is targeted towards the other aspect you mention: the use of mobile devices. This is certainly a growing market, of course not for designing and running analyses, but for inspecting reports and results. But even that is tricky: When you have a large screen you can display fancy and comprehensive interactive dashboards with drill downs and the like. On a mobile device, that does not work, so you must bring your reports and visualizations very much to the point. And this is precisely what data mining can do – and what is hard to do for classical BI.
Ingo: Then there is Radoop, which you may have heard of. It uses the Apache Hadoop framework for large-scale distributed computing to execute RapidMiner processes in the cloud. Radoop has been presented at this year’s RCOMM and people are really excited about the combination of RapidMiner with Hadoop and the scalability this brings.
Ajay- Describe the Rapid Miner analytics certification program and what steps are you taking to partner with academic universities.
Ingo: The Rapid-I Certification Program was created to recognize professional users of RapidMiner or RapidAnalytics. The idea is that certified users have demonstrated a deep understanding of the data analysis software solutions provided by Rapid-I and how they are used in data analysis projects. Taking part in the Rapid-I Certification Program offers a lot of benefits for IT professionals as well as for employers: professionals can demonstrate their skills and employers can make sure that they hire qualified professionals. We started our certification program only about 6 months ago and until now about 100 professionals have been certified so far.
Simon: During our annual user conference, the RCOMM, we have plenty of opportunities to talk to people from academia. We’re also present at other conferences, e.g. at ECML/PKDD, and we are sponsoring data mining challenges and grants. We maintain strong ties with several universities all over Europe and the world, which is something that I would not want to miss. We are also cooperating with institutes like the ITB in Dublin during their training programmes, e.g. by giving lectures, etc. Also, we are leading or participating in several national or EU-funded research projects, so we are still close to academia. And we offer an academic discount on all our products :-)
Ajay- Describe the global efforts in making Rapid Miner a truly international software including spread of developers, clients and employees.
Simon: Our clients already are very international. We have a partner network in America, Asia, and Australia, and, while I am responding to these questions, we have a training course in the US. Developers working on the core of RapidMiner and RapidAnalytics, however, are likely to stay in Germany for the foreseeable future. We need specialists for that, and it would be pointless to spread the development team over the globe. That is also owed to the agile philosophy that we are following.
Ingo: Simon is right, Rapid-I already is acting on an international level. Rapid-I now has more than 300 customers from 39 countries in the world which is a great result for a young company like ours. We are of course very strong in Germany and also the rest of Europe, but also concentrate on more countries by means of our very successful partner network. Rapid-I continues to build this partner network and to recruit dynamic and knowledgeable partners and in the future. However, extending and acting globally is definitely part of our strategic roadmap.
Dr. Ingo Mierswa is working as Chief Executive Officer (CEO) of Rapid-I. He has several years of experience in project management, human resources management, consulting, and leadership including eight years of coordinating and leading the multi-national RapidMiner developer team with about 30 developers and contributors world-wide. He wrote his Phd titled “Non-Convex and Multi-Objective Optimization for Numerical Feature Engineering and Data Mining” at the University of Dortmund under the supervision of Prof. Morik.
Dr. Simon Fischer is heading the research & development at Rapid-I. His interests include game theory and networks, the theory of evolutionary algorithms (e.g. on the Ising model), and theoretical and practical aspects of data mining. He wrote his PhD in Aachen where he worked in the project “Design and Analysis of Self-Regulating Protocols for Spectrum Assignment” within the excellence cluster UMIC. Before, he was working on the vtraffic project within the DFG Programme 1126 “Algorithms for large and complex networks”.
http://rapid-i.com/content/view/181/190/ tells you more on the various types of Rapid Miner licensing for enterprise, individual and developer versions.
(Note from Ajay- to receive an early edition invite to Radoop, click here http://radoop.eu/z1sxe)
I am hoping to put this on my pre-ordered or Amazon Wish list. The book the common people who wanted to do data mining with , but were unable to ask aloud they didnt know much. It is written by the seminal Australian authority on data mining Dr Graham Williams whom I interviewed here at http://decisionstats.com/2009/01/13/interview-dr-graham-williams/
Data Mining for the masses using an ergonomically designed Graphical User Interface.
Thank you Springer. Thank you Dr Graham Williams
Data Mining with Rattle and R
The Art of Excavating Data for Knowledge Discovery
Series: Use R
1st Edition., 2011, XX, 409 p. 150 illus. in color.
Softcover, ISBN 978-1-4419-9889-7
Due: August 29, 201154,95 €
- Encourages the concept of programming with data – more than just pushing data through tools, but learning to live and breathe the data
- Accessible to many readers and not necessarily just those with strong backgrounds in computer science or statistics
- Details some of the more popular algorithms for data mining, as well as covering model evaluation and model deployment
Data mining is the art and science of intelligent data analysis. By building knowledge from information, data mining adds considerable value to the ever increasing stores of electronic data that abound today. In performing data mining many decisions need to be made regarding the choice of methodology, the choice of data, the choice of tools, and the choice of algorithms.
Throughout this book the reader is introduced to the basic concepts and some of the more popular algorithms of data mining. With a focus on the hands-on end-to-end process for data mining, Williams guides the reader through various capabilities of the easy to use, free, and open source Rattle Data Mining Software built on the sophisticated R Statistical Software. The focus on doing data mining rather than just reading about data mining is refreshing.
The book covers data understanding, data preparation, data refinement, model building, model evaluation, and practical deployment. The reader will learn to rapidly deliver a data mining project using software easily installed for free from the Internet. Coupling Rattle with R delivers a very sophisticated data mining environment with all the power, and more, of the many commercial offerings.
Content Level » Research
Keywords » Data mining
Related subjects » Physical & Information Science
On a whim, I took the all time stats of my blog posts (more than 1000 posts) , and tried to plot their distribution.
Basically I copied and pasted all the data in a Google docs spreadsheet. and I created dummy codes (like URL1, URL2…. URL 500)
Next I downloaded the….
I wasnt in the mood for downloading and uploading stuff so I decided to use GGPLOT using Jeroen’s Application at http://www.stat.ucla.edu/~jeroen/
I used the mirror server that Dataspora provides as I have had latency issues with Jeroen’s website.
I got this error while trying to connect the Dataspora App to my Google spreadsheet
The page you have requested cannot be displayed. Another site was requesting access to your Google Account, but sent a malformed request. Please contact the site that you were trying to use when you received this message to inform them of the error. A detailed error message follows:
The site “http://dataspora.com” has not been registered.
Oh dear! Back to Jeroen’s /UCLA’s page.
I get this warning but it still manages to log in
This website has not registered with Google to establish a secure connection for authorization requests. We recommend that you continue the process only if you trust the following destination:
wow it works! thats cloud computing now so I wonder why Google and Amazon continue to ignore the rApache, and Jeroen’s cloud app . Surely their Google Fusion Tables can be always improved or tweaked. Not to mention the next gen version of R which will have its own server
Pretty cool screenshot (but click to see more)
I get the following pretty graph. Hadley Wickham would be ashamed of me by now.
What went wrong- well one page has 36000 views . Scale is the key to graphical coherence . So I redo- delete home page in Google spreadsheet ,reimport replot. ( I didnt know how to modify data in the cloud app, maybe we need a cloud PlyR) I redo it again as I have a big outlier-The top 10 Statistical GUI article which ironically has only 5 GUIs in that article but hush dont tell to high quality search engine)
So again Belatedly I discover something called layer in ggplot.
I give up. I rather prefer hist() I go to my favorite GUI Rattle, but it has some dating issues with the dll of GTK+
So I go to John Fox’s simple GUI. R Commander- is the best GUI if you use Occam’s Razor, and I am using Occam’s Chainsaw now.
I get the analysis I want in 12 secs
Summary- GGPLot is more complicated than base graphics engine.
Deducer GUI is not as simple too
R Commander is the best GUI because it retains simplicity
Ignore long tail of internet only at your peril
Almost 2/3 rds of my daily traffic of 400+ comes from old archived content That is why Search Engine Optimization and Alerts for Keywords are CRITICAL for any poor soul trying to write on a blog (which has no journal like prestige nor rewards)
If you make life easier for the search engine, it being a fair chap, rewards you well
Existing web traffic estimates like Comscore and Google Trends ignore this long tail
Comments are welcome (Data is pasted below of 500 rows X 2 columns if you can come up with a better analysis)
Since SAS has ignored web analytics and Google Analytics is hmm hmm, this could be an area of opportunity for R developers as well to create a web analytics package.
- Cloud Computing May Decrease Your API Call Limit (programmableweb.com)
- Book: ggplot2 by Hadley Wickham (r-bloggers.com)
- Google Instant Search: What does this mean for advertisers? (wpromote.com)
- 2 Fun and Useful Goog,e Spreadsheet Tricks (searchenginejournal.com)
- R Graphs Resources (decisionstats.com)
- The Importance of the Long Tail with Keywords and Phrases (businessbloggingtips.com)
- As Google Retools its Search Engine, Content Farms Lose Traction (xconomy.com)