What is Radoop? Quite possibly an exciting mix of analytics and big data computing
What is Radoop?
Hadoop is an excellent tool for analyzing large data sets, but it lacks an easy-to-use graphical interface. RapidMiner is an excellent tool for data analytics, but its data size is limited by the memory available, and a single machine is often not enough to run the analyses in reasonable time. In this project, we combine the strengths of both tools and provide a RapidMiner extension for editing and running ETL, data analytics and machine learning processes over Hadoop.
We have closely integrated the highly optimized data analytics capabilities of Hive and Mahout, and the user-friendly interface of RapidMiner to form a powerful and easy-to-use data analytics solution for Hadoop.
and what’s new
Radoop 0.3 released – fully graphical big data analytics
Today, Radoop took a major step forward with its 0.3 release. The new version of the visual big data analytics package adds full support for all major Hadoop distributions used these days: Apache Hadoop 0.20.2, 0.20.203, 1.0 and Cloudera’s Distribution Including Apache Hadoop 3 (CDH3). It also adds support for large clusters by allowing the namenode, the jobtracker and the Hive server to reside on different nodes.
As Radoop’s promise is to make big data analytics easier, the 0.3 release also focuses on improving the user interface. It has an enhanced breakpointing system that lets you investigate intermediate results, and it adds dozens of quick fixes, so common process design mistakes become much easier to solve.
There are many further improvements and fixes, so please consult the release notes for more details. Radoop is in private beta mode, but heading towards a public release in Q2 2012. If you would like to get early access, please apply at the signup page or describe your use case by email (beta at radoop.eu).
Radoop 0.3 (15 February 2012)
- Support for Apache Hadoop 0.20.2, 0.20.203, 1.0 and Cloudera’s Distribution Including Apache Hadoop 3 (CDH3) in a single release
- Support for clusters with separate master nodes (namenode, jobtracker, Hive server)
- Enhanced breakpointing to evaluate intermediate results
- Dozens of quick fixes for the most common process design errors
- Improved process design and error reporting
- New welcome perspective to help in the first steps
- Many bugfixes and performance improvements
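To picture what "separate master nodes" means in practice, here is a minimal Python sketch that treats the namenode, the jobtracker and the Hive server as three independent endpoints. It is only an illustration: the class name, host names and port numbers are assumptions, and it does not show how Radoop itself stores its connection settings. The two property names, `fs.default.name` and `mapred.job.tracker`, are the client-side keys used by Hadoop 0.20 and 1.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterEndpoints:
    """Hypothetical container for the three master services of a cluster."""

    namenode_host: str
    jobtracker_host: str
    hive_host: str
    namenode_port: int = 8020    # assumed value; depends on the cluster
    jobtracker_port: int = 8021  # assumed value; depends on the cluster
    hive_port: int = 10000       # assumed value; depends on the cluster

    def hadoop_properties(self) -> dict:
        """Build Hadoop 0.20/1.0-style client properties for these hosts."""
        return {
            "fs.default.name": f"hdfs://{self.namenode_host}:{self.namenode_port}",
            "mapred.job.tracker": f"{self.jobtracker_host}:{self.jobtracker_port}",
        }

    def hive_address(self) -> str:
        """Return the host:port pair of the Hive server."""
        return f"{self.hive_host}:{self.hive_port}"


# Example host names are placeholders, one per master service.
endpoints = ClusterEndpoints("nn.example.com", "jt.example.com", "hive.example.com")
for key, value in endpoints.hadoop_properties().items():
    print(f"{key} = {value}")
print(f"hive server = {endpoints.hive_address()}")
```

On a small cluster all three host names would simply be the same machine; the point of the 0.3 change is that they no longer have to be.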
Radoop 0.2.2 (6 December 2011)
- More Aggregate functions and distinct option
- Generate ID operator for convenience
- Numerous bug fixes and improvements
- Improved user interface
Radoop 0.2.1 (16 September 2011)
- Set Role and Data Multiplier operators
- Management panel for testing Hadoop connections
- Stability improvements for Hive access
- Further small bugfixes and improvements
Radoop 0.2 (26 July 2011)
- Three new algorithms: Fuzzy K-Means, Canopy, and Dirichlet clustering
- Three new data preprocessing operators: Normalize, Replace, and Replace Missing Values
- Significant speed improvements in data transmission and interactive analytics
- Increased stability and speedup for K-Means
- More flexible settings for Join operations
- More meaningful error messages
- Other small bugfixes and improvements
Radoop 0.1 (14 June 2011)
Initial release with 26 operators for data transmission, data preprocessing, and one clustering algorithm.
Note that RapidMiner also has a great R extension, so you can combine R, a graphical interface and big data analytics, which is now easier and more powerful than ever.