Tag Archives: Languages
Here is an interview with Jason Kuo, who works with SAP Analytics as Group Solutions Marketing Manager. Jason answers questions on SAP Analytics and its increasing involvement with the R statistical language.
Ajay- What made you choose R as the language to tie together important parts of your technology platform like HANA and SAP Predictive Analysis? Did you consider other languages like Julia or Python?
Jason- It’s the most popular. Over 50% of statisticians and data analysts use R. With 3,500+ algorithms it’s arguably the most comprehensive statistical analysis language. That said, we are not closing the door on others.
Ajay- When did you first start getting interested in R as an analytics platform?
Jason- SAP has been tracking R for 5+ years. With R’s explosive growth over the last year or two, it made sense for us to dramatically increase our investment in R.
Ajay- Can we expect SAP to give back to the R community like Google and Revolution Analytics do, by sponsoring package development or sponsoring user meets and conferences?
Will we see SAP’s R HANA package at this year’s R conference, useR! 2012 in Nashville?
Jason- Yes. We plan to provide a specific driver for HANA tables for input of the data to native R. This is planned for the end of 2012. We’ll then review our event strategy. SAP has been a sponsor of Predictive Analytics World for several years and was indeed a founding sponsor. We may be attending this year’s R conference in Nashville.
Ajay- What has been some of the initial customer feedback on your analytics expansion and offerings?
Jason- We have completed two very successful pilots of the R Integration for HANA with two of SAP’s largest customers.
Jason has over 15 years of BI and data warehousing industry experience. Having worked at Oracle, Business Objects, and now SAP, Jason has been involved in numerous technical marketing roles involving performance management dashboards, information management, text analysis, predictive analytics, and now big data. He has a bachelor of science in operations research from the University of Michigan.
Here is an interview with Charlie Parker, head of large scale online algorithms at http://bigml.com
Ajay- Describe your own personal background in scientific computing, and how you came to be involved with machine learning, cloud computing and BigML.com.
Charlie- I am a machine learning Ph.D. from Oregon State University. Francisco Martin (our founder and CEO), Adam Ashenfelter (the lead developer on the tree algorithm), and I were all studying machine learning at OSU around the same time. We all went our separate ways after that.
Francisco started Strands and turned it into a 100+ million dollar company building recommender systems. Adam worked for CleverSet, a probabilistic modeling company that was eventually sold to Cisco, I believe. I worked for several years in the research labs at Eastman Kodak on data mining, text analysis, and computer vision.
When Francisco left Strands to start BigML, he brought in Justin Donaldson, who is a brilliant visualization guy from Indiana, and an ex-Googler named Jose Ortega, who is responsible for most of our data infrastructure. They pulled in Adam and me a few months later. We also have Poul Petersen, a former Strands employee, who manages our herd of servers. He is a wizard and makes everyone else’s life much easier.
Ajay- You use Clojure for the back end of BigML.com. Are there any other languages and packages you are considering? What makes Clojure such a good fit for cloud computing?
Charlie- Clojure is a great language because it offers you all of the benefits of Java (extensive libraries, cross-platform compatibility, easy integration with things like Hadoop, etc.) but has the syntactical elegance of a functional language. This makes our code base small and easy to read as well as powerful.
We’ve had occasional issues with speed, but that just means writing the occasional function or library in Java. As we build towards processing data at the Terabyte level, we’re hoping to create a framework that is language-agnostic to some extent. So if we have some great machine learning code in C, for example, we’ll use Clojure to tie everything together, but the code that does the heavy lifting will still be in C. For the API and Web layers, we use Python and Django, and Justin is a huge fan of HaXe for our visualizations.
Ajay- Current support is for Decision Trees. When can we see SVM, K Means Clustering and Logit Regression?
Charlie- Right now we’re focused on perfecting our infrastructure and giving you new ways to put data in the system, but expect to see more algorithms appearing in the next few months. We want to make sure they are as beautiful and easy to use as the trees are. Without giving too much away, the first new thing we will probably introduce is an ensemble method of some sort (such as Boosting or Bagging). Clustering is a little further away but we’ll get there soon!
Ajay- How can we use the BigML.com API from R and Python?
Charlie- We have a public GitHub repo for the language bindings. https://github.com/bigmlcom/io Right now, there are only bash scripts, but that should change very soon. The Python bindings should be there in a matter of days, and the R bindings in probably a week or two. Clojure and Java bindings should follow shortly after that. We’ll have a blog post about it each time we release a new language binding. http://blog.bigml.com/
Ajay- How can we predict large numbers of observations using a Model that has been built and pruned (model scoring)?
Charlie- We are in the process of refactoring our backend right now for better support for batch prediction and model evaluation. This is something that is probably only a few weeks away. Keep your eye on our blog for updates!
Ajay- How can we export models built in BigML.com for scoring data locally?
Charlie- This is as simple as a call to our API. https://bigml.com/developers/models The call gives you a JSON object representing the tree that is roughly equivalent to a PMML-style representation.
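To make the idea of local scoring concrete, here is a minimal Python sketch of walking a tree stored as JSON. The layout of the JSON (the `predicate`, `children` and `output` keys, and the `calls` and `tenure` fields) is an assumption made up for illustration, not the actual schema returned by the BigML API.

```python
import json
import operator

# Hypothetical exported model. This layout is an assumption for illustration,
# not the real BigML response format.
MODEL_JSON = """
{
  "output": "stay",
  "children": [
    {"predicate": {"field": "calls", "operator": "<=", "value": 3},
     "output": "stay"},
    {"predicate": {"field": "calls", "operator": ">", "value": 3},
     "output": "churn",
     "children": [
       {"predicate": {"field": "tenure", "operator": "<=", "value": 12},
        "output": "churn"},
       {"predicate": {"field": "tenure", "operator": ">", "value": 12},
        "output": "stay"}
     ]}
  ]
}
"""

# Map the operator strings used in the JSON to Python comparison functions.
OPS = {"<=": operator.le, ">": operator.gt, "=": operator.eq}


def matches(predicate, row):
    """Return True if the row satisfies a single split predicate."""
    compare = OPS[predicate["operator"]]
    return compare(row[predicate["field"]], predicate["value"])


def predict(node, row):
    """Descend from the root until no child matches, then return that node's output."""
    while node.get("children"):
        for child in node["children"]:
            if matches(child["predicate"], row):
                node = child
                break
        else:
            break  # no branch matched, so fall back to the current node
    return node["output"]


def score_batch(tree, rows):
    """Score many observations with the same tree."""
    return [predict(tree, row) for row in rows]


tree = json.loads(MODEL_JSON)
```

Calling `score_batch(tree, rows)` on a list of dictionaries scores a whole batch locally once the model has been downloaded; a real exported model would need its own field names and operators mapped in the same way.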
You can read about Charlie Parker at http://www.linkedin.com/pub/charles-parker/11/85b/4b5 and the rest of the BigML team at
I was just reading up on my weekly to-read list and came across this interesting method. It is called Play Color Cipher-
Each character (capital letters, small letters, numbers (0-9), symbols on the keyboard) in the plain text is substituted with a color block from the available 18 decillion colors in the world, and at the receiving end the cipher text block (in color) is decrypted into a plain text block. It overcomes problems like “meet-in-the-middle attacks, birthday attacks and brute-force attacks”.
It also reduces the size of the plain text by a factor of four when it is encrypted into cipher text, without any loss of content. The cipher text occupies very little buffer space; hence transmission through the channel is very fast. With this, the cost of transportation through the channel comes down.
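As a rough illustration of the substitution idea only, here is a toy Python sketch that maps each character to an RGB color chosen by a keyed shuffle. This is not the published Play Color Cipher: the alphabet, the 24-bit RGB color space, and the use of a seeded `random.Random` as the key schedule are all assumptions for the sketch, and a plain one-to-one substitution like this offers no real security.

```python
import random
import string

# Assumed alphabet for the sketch: letters, digits, keyboard symbols, space.
ALPHABET = string.ascii_letters + string.digits + string.punctuation + " "


def build_table(key):
    """Assign every character a distinct 24-bit RGB color, determined by the key."""
    rng = random.Random(key)  # toy key schedule, not cryptographically secure
    codes = rng.sample(range(256 ** 3), len(ALPHABET))  # distinct colors
    return {
        ch: ((code >> 16) & 255, (code >> 8) & 255, code & 255)
        for ch, code in zip(ALPHABET, codes)
    }


def encrypt(text, key):
    """Replace each character with its color block (an (r, g, b) tuple)."""
    table = build_table(key)
    return [table[ch] for ch in text]


def decrypt(blocks, key):
    """Invert the substitution using the same key."""
    inverse = {color: ch for ch, color in build_table(key).items()}
    return "".join(inverse[tuple(block)] for block in blocks)
```

Both sides rebuild the same table from the shared key, so `decrypt(encrypt(message, key), key)` returns the original message for any text drawn from `ALPHABET`.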
Visual Cryptography is indeed an interesting topic-
Visual cryptography, an emerging cryptography technology, uses the characteristics of human vision to decrypt encrypted
images. It needs neither cryptography knowledge nor complex computation. For security concerns, it also ensures that hackers
cannot perceive any clues about a secret image from individual cover images. Since Naor and Shamir proposed the basic
model of visual cryptography, researchers have published many related studies.
Visual cryptography (VC) schemes hide the secret image into two or more images which are called
shares. The secret image can be recovered simply by stacking the shares together without any complex
computation involved. The shares are very safe because separately they reveal nothing about the secret image.
Visual cryptography provides a secure way to transfer images on the Internet. The advantage
of visual cryptography is that it exploits human eyes to decrypt secret images.
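The share-and-stack idea quoted above can be sketched in a few lines of Python. This is a minimal illustration in the spirit of the basic two-share scheme, assuming a black-and-white image stored as rows of 0 (white) and 1 (black), with each pixel expanded into two subpixels; the function names are my own.

```python
import secrets

# The two possible subpixel patterns for one pixel (1 = black, 0 = white).
PATTERNS = ((0, 1), (1, 0))


def make_shares(secret):
    """Split a 0/1 image into two shares, each twice as wide as the secret."""
    share1, share2 = [], []
    for row in secret:
        row1, row2 = [], []
        for pixel in row:
            pattern = PATTERNS[secrets.randbelow(2)]  # fresh random choice per pixel
            if pixel == 0:
                other = pattern  # white: both shares carry the same pattern
            else:
                other = (1 - pattern[0], 1 - pattern[1])  # black: complementary
            row1.extend(pattern)
            row2.extend(other)
        share1.append(row1)
        share2.append(row2)
    return share1, share2


def stack(share_a, share_b):
    """Overlay two transparencies: a subpixel is black if it is black on either."""
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(share_a, share_b)]


def decode(stacked):
    """Read a pixel as black only when both of its subpixels are black."""
    return [
        [row[i] & row[i + 1] for i in range(0, len(row), 2)]
        for row in stacked
    ]
```

Each share on its own shows exactly one black subpixel per pixel, chosen at random, so it looks like uniform noise. After stacking, white pixels stay half black while black pixels become fully black, which is the contrast the eye picks up.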
Color Visual Cryptography Scheme Using Meaningful Shares
Visual cryptography for color images
- Visual Crypto – One-time Image Create two secure images from one by Robert Hansen
- Visual Crypto Java Applet at the University of Regensburg
- Visual Cryptography Kit Software to create image layers
- On-line Visual Crypto Applet by Leemon Baird
- Extended Visual Cryptography (pdf) by Mizuho Nakajima and Yasushi Yamaguchi
- Visual Cryptography Paper by Moni Naor and Adi Shamir
- Visual Crypto Talk (pdf) by Frederik Vercauteren ESAT Leuven
- University of Salerno web page on visual cryptography
- Visual Crypto Page by Doug Stinson
- Constructions and Bounds for Visual Cryptography
Lecture Notes in Computer Science 1099 (1996), 416-428 (23rd International Colloquium on Automata, Languages and Programming).
- Visual Cryptography for General Access Structures
Information and Computation 129 (1996), 86-106 (this paper is an expanded and revised version of the conference paper).
- On the Contrast in Visual Cryptography Schemes
Journal of Cryptology 12 (1999), 261-289.
- Extended Schemes for Visual Cryptography
Theoretical Computer Science 250 (2001), 143-161.
- Threshold Visual Cryptography Schemes With Specified Whiteness Levels of Reconstructed Pixels
Designs, Codes and Cryptography 25 (2002), 15-61.
- Contrast Optimal Threshold Visual Cryptography Schemes
SIAM J. on Discrete Math. 16 (2003), 224-261.
- “Visual Cryptography: Seeing is Believing” available here,
- example- face http://cacr.uwaterloo.ca/~dstinson/VCS-happyface.html
- flag http://cacr.uwaterloo.ca/~dstinson/VCS-flag.html
- pi http://cacr.uwaterloo.ca/~dstinson/VCS-pi.html
- Moni Naor and Adi Shamir, Visual Cryptography, Eurocrypt 94. Postscript, gzipped Postscript
- Moni Naor and Adi Shamir, Visual Cryptography II , Cambridge Workshop on Protocols, 1996. Postscript, gzipped Postscript
- Moni Naor and Benny Pinkas, Visual Authentication , Crypto 97. Postscript, gzipped Postscript
Ajay- I think a combination of sharing and color ciphers would prove more helpful to secure Internet Communication than existing algorithms. It also levels the playing field from computationally rich players to creative coders.
Integrates R Statistical Programming Language into Oracle Database 11g
Comprehensive In-Database Platform for Advanced Analytics
Oracle Advanced Analytics, an option to Oracle Database 11g Enterprise Edition, extends the database into a comprehensive advanced analytics platform through two major components: Oracle R Enterprise and Oracle Data Mining. With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.
Oracle R Enterprise tightly integrates the open source R programming language with the database to further extend the database with R’s library of statistical functionality, and pushes down computations to the database. Oracle R Enterprise dramatically advances the capability for R users and allows them to use their existing R development skills and tools; their scripts can now also run transparently and scale against data stored in Oracle Database 11g.
Oracle Data Mining provides powerful data mining algorithms that run as native SQL functions for in-database model building and model deployment. It can be accessed through the SQL Developer extension Oracle Data Miner to build, evaluate, share and deploy predictive analytics methodologies. At the same time the high-performance Oracle-specific data mining algorithms are accessible from R.
Oracle R Hadoop Connector: Gives R users high performance native access to Hadoop Distributed File System (HDFS) and MapReduce programming framework.
A new contest from a relatively new website. This one is fast and furious and has a decent chunk of money!
Sun, 26 February 2012 05:00 AM UTC
Results Announced by: Mon, 05 March 2012 05:00 AM UTC
Category: Text Analytics | Function: Aerospace & Aviation
Analysis of sentiment and its intensity: feedback from airport guests
ABC (name intentionally obfuscated) is one of the best-managed and most profitable airports in India. As with all well-managed airports, ABC would like to understand how guests feel about their experience when traveling through, using or transiting its airport. ABC has a website where guests can leave a comment, agree or disagree with others’ comments, or respond to a comment to confirm or negate the expressed opinion.
The goal of this contest is to create a summarization of the opinions, feelings and sentiments expressed in the comments left behind by guests on the website. This information is being provided as data for solvers. Some understanding of the intensity of the opinion, feeling or sentiment will also be useful. For example, if there is a consistent demand for more spas across guest conversations, it needs to be highlighted. Consistent positive or negative sentiments and opinions need to be discovered and highlighted.
Guest comments have been crawled and provided to you. The data consists of approximately 1,000 comments from guests, including the timestamps of those comments. Personal information (name, email etc.) has been hidden. This data is publicly available.
Participants may submit entries before the deadline. If a participant submits multiple entries, the entry submitted last before the deadline will be considered as the participant’s submission.
The following deliverables are expected to be submitted:
Timeline and Prizes:
This contest begins on 16 Feb 2012 and will last 9 days.
1) All software has bugs. Sometimes this is because people have been told to code in a hurry to meet shipping deadlines. Sometimes it is due to the way metal and other software interact with it. Mostly it is karma.
2) In the 21st century, it is okay to insult someone over his software, but not over most other things. Sometimes I think people are passionate not just for their own software but to just diss the other guys. It is a politically convenient release.
3) Bloggers writing about software are full of bull-by products. If they were any good at writing code, they would not have time to write a blog. Mostly bloggers on code are people whose coding enthusiasm is more than their coding competence.
4) Software is easier than it looks to people who know it. To those who don’t know how to code, it will always be a bit of magic.
5) Despite immense progress, initiatives and encouragement, the number of females writing code is too low. Comparatively, figuratively and literally. If you are a male and want a social life, get into marketing while the hair is still black.
Man walks into bar. Says to woman at bar: “Hey, what do you do? Me, I write code.”
6) People who write software end up making more money not just because they create useful stuff that helps get work done faster or helps reduce boredom for people. They make more money because they are mostly passionate, logical problem thinkers, focused, hard working and better read on a variety of subjects than others. That’s your cue to how to make money even if you cannot code.
7) I would rather write much more code than write poetry. But I sometimes think they are related. Just manipulating words in different languages to manipulate output in different machines or people.
8) Kids should be taught software at an early age, as that is a skill that helps in their education and thinking. More education for the kids!
9) Laying off talented software people because you found a cheaper, younger alternative halfway across the globe is sometimes evil. It is also inevitable. Learn more software as you grow older.
10) The best software is the one in your head. It was written by a better programmer too.