Machine Learning to Translate Code from different programming languages

Google Translate has been a pioneer in using machine learning for translating various languages (and so is the awesome Google Transliterate)

I wonder if they can expand it to programming languages and not just human languages.

 

Issues in converting  translating programming language code

1) Paths referred for stored objects

2) Object Names should remain the same and not translated

3) Multiple Functions have multiple uses , sometimes function translate is not straightforward

I think all these issues are doable, solveable and more importantly profitable.

 

I look forward to the day a iOS developer can convert his code to Android app code by simple upload and download.

Anonymous grows up and matures…Anonanalytics.com

I liked the design, user interfaces and the conceptual ideas behind the latest Anonymous hactivist websites (much better than the shabby graphic design of Wikileaks, or Friends of Wikileaks, though I guess they have been busy what with Julian’s escapades and Syrian emails)

 

I disagree  (and let us agree to disagree some of the time)

with the complete lack of respect for Graphical User Interfaces for tools. If dDOS really took off due to LOIC, why not build a GUI for SQL Injection (or atleats the top 25 vulnerability testing as by this list http://www.sans.org/top25-software-errors/

Shouldnt Tor be embedded within the next generation of Loic.

Automated testing tools are used by companies like Adobe (and others)… so why not create simple GUI for the existing tools.., I may be completely offtrack here.. but I think hacker education has been a critical misstep[ that has undermined Western Democracies preparedness for Cyber tactics by hostile regimes)…. how to create the next generation of hackers by easy tutorials (see codeacademy and build appropriate modules)

-A slick website to be funded by Bitcoins (Money can buy everything including Mastercard and Visa, but Bitcoins are an innovative step towards an internet economy  currency)

-A collobrative wiki

http://wiki.echelon2.org/wiki/Main_Page

Seriously dude, why not make this a part of Wikipedia- (i know Jimmy Wales got shifty eyes, but can you trust some1 )

-Analytics for Anonymous (sighs! I should have thought about this earlier)

http://anonanalytics.com/ (can be used to play and bill both sides of corporate espionage and be cyber private investigators)

What We Do

We provide the public with investigative reports exposing corrupt companies. Our team includes analysts, forensic accountants, statisticians, computer experts, and lawyers from various jurisdictions and backgrounds. All information presented in our reports is acquired through legal channels, fact-checked, and vetted thoroughly before release. This is both for the protection of our associates as well as groups/individuals who rely on our work.

_and lastly creative content for Pinterest.com and Public Relations ( what next-? Tom Cruise to play  Julian Assange in the new Movie ?)

http://www.par-anoia.net/ />Potentially Alarming Research: Anonymous Intelligence AgencyInformation is and will be free. Expect it. ~ Anonymous

Links of interest

  • Latest Scientology Mails (Austria)
  • Full FBI call transcript
  • Arrest Tracker
  • HBGary Email Viewer
  • The Pirate Bay Proxy
  • We Are Anonymous – Book
  • To be announced…

 

Rapid Miner User Conference 2012

One of those cool conferences that is on my bucket list- this time in Hungary (That’s a nice place)

But I am especially interested in seeing how far Radoop has come along !

Disclaimer- Rapid Miner has been a Decisionstats.com sponsor  for many years. It is also a very cool software but I like the R Extension facility even more!

—————————————————————

and not very expensive too compared to other User Conferences in Europe!-

http://rcomm2012.org/index.php/registration/prices

Information about Registration

  • Early Bird registration until July 20th, 2012.
  • Normal registration from July 21st, 2012 until August 13th, 2012.
  • Latest registration from August 14th, 2012 until August 24th, 2012.
  • Students have to provide a valid Student ID during registration.
  • The Dinner is included in the All Days and in the Conference packages.
  • All prices below are net prices. Value added tax (VAT) has to be added if applicable.

Prices for Regular Visitors

Days and Event
Early Bird Rate
Normal Rate
Latest Registration
Tuesday

(Training / Development 1)

190 Euro 230 Euro 280 Euro
Wednesday + Thursday

(Conference)

290 Euro 350 Euro 420 Euro
Friday

(Training / Development 2 and Exam)

190 Euro 230 Euro 280 Euro
All Days

(Full Package)

610 Euro 740 Euro 900 Euro

Prices for Authors and Students

In case of students, please note that you will have to provide a valid student ID during registration.

Days and Event
Early Bird Rate
Normal Rate
Latest Registration
Tuesday

(Training / Development 1)

90 Euro 110 Euro 140 Euro
Wednesday + Thursday

(Conference)

140 Euro 170 Euro 210 Euro
Friday

(Training / Development 2 and Exam)

90 Euro 110 Euro 140 Euro
All Days

(Full Package)

290 Euro 350 Euro 450 Euro
Time
Slot
Tuesday
Training / Workshop 1
Wednesday
Conference 1
Thursday
Conference 2
Friday
Training / Workshop 2
09:00 – 10:30
Introductory Speech
Ingo Mierswa; Rapid-I 

Data Analysis

 

NeurophRM: Integration of the Neuroph framework into RapidMiner
Miloš Jovanović, Jelena Stojanović, Milan Vukićević, Vera Stojanović, Boris Delibašić (University of Belgrade)

To be announced (Invited Talk)
To be announced

 

Recommender Systems

 

Extending RapidMiner with Recommender Systems Algorithms
Matej Mihelčić, Nino Antulov-Fantulin, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute)

Implementation of User Based Collaborative Filtering in RapidMiner
Sérgio Morais, Carlos Soares (Universidade do Porto)

Parallel Training / Workshop Session

Advanced Data Mining and Data Transformations

or

Development Workshop Part 2

10:30 – 12:30
Data Analysis

Nearest-Neighbor and Clustering based Anomaly Detection Algorithms for RapidMiner
Mennatallah Amer, Markus Goldstein (DFKI)

Customers’ LifeStyle Targeting on Big Data using Rapid Miner
Maksim Drobyshev (LifeStyle Marketing Ltd)

Robust GPGPU Plugin Development for RapidMiner
Andor Kovács, Zoltán Prekopcsák (Budapest University of Technology and Economics)

Extensions

Image Mining Extension – Year After
Radim Burget, Václav Uher, Jan Mašek (Brno University of Technology)

Incorporating R Plots into RapidMiner Reports
Peter Jeszenszky (University of Debrecen)

An Octave Extension for RapidMiner
Sylvain Marié (Schneider Electric)

12:30 – 13:30
Lunch
Lunch
Lunch
13:30 – 15:00
Parallel Training / Workshop Session

Basic Data Mining and Data Transformations

or

Development Workshop Part 1

Applications

Application of RapidMiner in Steel Industry Research and Development
Bengt-Henning Maas, Hakan Koc, Martin Bretschneider (Salzgitter Mannesmann Forschung)

A Comparison of Data-driven Models for Forecast River Flow
Milan Cisty, Juraj Bezak (Slovak University of Technology)

Portfolio Optimization Using Local Linear Regression Ensembles in Rapid Miner
Gábor Nagy, Tamás Henk, Gergő Barta (Budapest University of Technology and Economics)

Unstructured Data


Processing Data Streams with the RapidMiner Streams-Plugin
Christian Bockermann, Hendrik Blom (TU Dortmund)

Automated Creation of Corpuses for the Needs of Sentiment Analysis
Peter Koncz, Jan Paralic (Technical University of Kosice)

 

Demonstration

 

News from the Rapid-I Labs
Simon Fischer; Rapid-I

This short session demonstrates the latest developments from the Rapid-I lab and will let you how you can build powerful analysis processes and routines by using those RapidMiner tools.

Certification Exam
15:00 – 17:00
Book Presentation and Game Show

Data Mining for the Masses: A New Textbook on Data Mining for Everyone
Matthew North (Washington & Jefferson College)

Matthew North presents his new book “Data Mining for the Masses” introducing data mining to a broader audience and making use of RapidMiner for practical data mining problems.

 

Game Show
Did you miss last years’ game show “Who wants to be a data miner?”? Use RapidMiner for problems it was never created for and beat the time and other contestants!

User Support

Get some Coffee for free – Writing Operators with RapidMiner Beans
Christian Bockermann, Hendrik Blom (TU Dortmund)

Meta-Modeling Execution Times of RapidMiner operators
Matija Piškorec, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute) 

19:00
Social Event (Conference Dinner)
Social Event (Visit of Bar District)

 

Training: Basic Data Mining and Data Transformations

This is a short introductory training course for users who are not yet familiar with RapidMiner or only have a few experiences with RapidMiner so far. The topics of this training session include

  • Basic Usage
    • User Interface
    • Creating and handling RapidMiner repositories
    • Starting a new RapidMiner project
    • Operators and processes
    • Loading data from flat files
    • Storing data, processes, and results
  • Predictive Models
    • Linear Regression
    • Naïve Bayes
    • Decision Trees
  • Basic Data Transformations
    • Changing names and roles
    • Handling missing values
    • Changing value types by discretization and dichotimization
    • Normalization and standardization
    • Filtering examples and attributes
  • Scoring and Model Evaluation
    • Applying models
    • Splitting data
    • Evaluation methods
    • Performance criteria
    • Visualizing Model Performance

 

Training: Advanced Data Mining and Data Transformations

This is a short introductory training course for users who already know some basic concepts of RapidMiner and data mining and have already used the software before, for example in the first training on Tuesday. The topics of this training session include

  • Advanced Data Handling
    • Sampling
    • Balancing data
    • Joins and Aggregations
    • Detection and removal of outliers
    • Dimensionality reduction
  • Control process execution
    • Remember process results
    • Recall process results
    • Loops
    • Using branches and conditions
    • Exception handling
    • Definition of macros
    • Usage of macros
    • Definition of log values
    • Clearing log tables
    • Transforming log tables to data

 

Development Workshop Part 1 and Part 2

Want to exchange ideas with the developers of RapidMiner? Or learn more tricks for developing own operators and extensions? During our development workshops on Tuesday and Friday, we will build small groups of developers each working on a small development project around RapidMiner. Beginners will get a comprehensive overview of the architecture of RapidMiner before making the first steps and learn how to write own operators. Advanced developers will form groups with our experienced developers, identify shortcomings of RapidMiner and develop a new extension which might be presented during the conference already. Unfinished work can be continued in the second workshop on Friday before results might be published on the Marketplace or can be taken home as a starting point for new custom operators.

Talking on Big Data Analytics

I am going  being sponsored to a Government of India sponsored talk on Big Data Analytics at Bangalore on Friday the 13 th of July. If you are in Bangalore, India you may drop in for a dekko. Schedule and Abstracts (i am on page 7 out 9) .

Your tax payer money is hard at work- (hassi majak only if you are a desi. hassi to fassi.)

13 July 2012 (9.30 – 11.00 & 11.30 – 1.00)
Big Data Big Analytics
The talk will showcase using open source technologies in statistical computing for big data, namely the R programming language and its use cases in big data analysis. It will review case studies using the Amazon Cloud, custom packages in R for Big Data, tools like Revolution Analytics RevoScaleR package, as well as the newly launched SAP Hana used with R. We will also review Oracle R Enterprise. In addition we will show some case studies using BigML.com (using Clojure) , and approaches using PiCloud. In addition it will showcase some of Google APIs for Big Data Analysis.

Lastly we will talk on social media analysis ,national security use cases (i.e. cyber war) and privacy hazards of big data analytics.

Schedule

View more presentations from Ajay Ohri.
Abstracts

View more documents from Ajay Ohri.

 

Working with a large number of files for reading into R #rstats

Using the dir() and list.files() commands lists all the files in a particular directory. These can be interactively read by R, by referencing to specific parts of the list created by the above two commands. This is useful when you are working with a large number of files, that get generated or re-generated after specific time periods (like web server log files)

> getwd()
[1] “C:/Users/KUs/Documents”
> path=”C:/Users/KUs/Desktop/tester”
> dir(path)
[1] “tester.csv” “tester2.csv” “tester3.csv””tester4.csv”
> setwd(path)
> read.table(file=dir(path)[1],sep=”t”,header=T)
X1 X2 X3 X4
1 to be 2 B

> read.table(file=dir(path)[4],sep=”,”,header=T)
zoo bee doo bee.1 daa
1 12 32 43 34 qwerty

Saving Output in R for Presentations

While SAS language has a beautifully designed ODS (Output Delivery System) for saving output from certain analysis in excel files (and html and others), in R one can simply use the object, put it in a write.table and save it a csv file using the file parameter within write.table.

As a business analytics consultant, the output from a Proc Means, Proc Freq (SAS) or a summary/describe/table command (in R) is to be presented as a final report. Copying and pasting is not feasible especially for large amounts of text, or remote computers.

Using the following we can simple save the output  in R

 

> getwd()
[1] “C:/Users/KUs/Desktop/Ajay”
> setwd(“C:\Users\KUs\Desktop”)

#We shifted the directory, so we can save output without putting the entire path again and again for each step.

#I have found the summary command most useful for initial analysis and final display (particularly during the data munging step)

nams=summary(ajay)

# I assigned a new object to the analysis step (summary), it could also be summary,names, describe (HMisc) or table (for frequency analysis),
> write.table(nams,sep=”,”,file=”output.csv”)

Note: This is for basic beginners in R using it for business analytics dealing with large number of variables.

 

pps: Note

If you have a large number of files in a local directory to be read in R, you can avoid typing the entire path again and again by modifying the file parameter in the read.table and changing the working directory to that folder

 

setwd(“C:/Users/KUs/Desktop/”)
ajayt1=read.table(file=”test1.csv”,sep=”,”,header=T)

ajayt2=read.table(file=”test2.csv”,sep=”,”,header=T)

 

and so on…

maybe there is a better approach somewhere on Stack Overflow or R help, but this will work just as well.

you can then merge the objects created ajayt1 and ajayt2… (to be continued)

%d bloggers like this: