- Much more progress has been made in storing, querying and analyzing huge amounts of personally identifiable information than in encrypting it.
- Big Data is as much a dual-use technology for governments and corporations as uranium is for building bombs or power plants.
- There is as much potential revenue in encrypted data streams in the cloud era as there was in antivirus software in the PC era.
- Tracking citizens totally is evil. The total cost of such programs is not justified by the terrorism plots thwarted by Big Data’s cyber spying. At best, I can understand governments spying on citizens of other countries to gain advantages in trade.
- American dominance of cyber spying and Big Data threatens to undermine America’s credibility as the de facto leader of the Internet. It proves that China’s vision of a walled-off internet makes sense, and that is a dangerous precedent which could lead to the breakup of the internet along national boundaries of electronic firewalls.
Author: Ajay Ohri
Writing on APIs for Programmable Web
I have been writing freelance on APIs for Programmable Web. Here is an updated list of the articles; many of them would be of interest to analytics users. Note: some of these are interviews, and they are in bold. Note to regular readers: I keep updating this list, and at each update I bring it to the front page and then let newer blog posts slide it down!
Scoreoid Aims to Gamify the World Using APIs January 27th, 2014
Plot.ly’s Plot to Visualize More Data January 22nd, 2014
LumenData’s Acquisition of Algorithms.io is a Win-Win January 8th, 2014
Yactraq API Sees Huge Growth in 2013 January 6th, 2014
Scrape.it Describes a Better Way to Extract Data December 20th, 2013
Exclusive Interview: App Store Analytics API December 4th, 2013
APIs Enter 3d Printing Industry November 29th, 2013
PW Interview: José Luis Martinez of Textalytics November 6th, 2013
PW Interview Simon Chan PredictionIO November 5th, 2013
PW Interview: Scott Gimpel Founder and CEO FantasyData.com October 23rd, 2013
PW Interview Brandon Levy, cofounder and CEO of Stitch Labs October 8th, 2013
PW Interview: Jolo Balbin Co-Founder Text Teaser September 18th, 2013
PW Interview: Bob Bickel CoFounder Redline13 July 29th, 2013
PW Interview: Brandon Wirtz CTO Stremor.com July 4th, 2013
PW Interview: Andy Bartley, CEO Algorithms.io June 4th, 2013
PW Interview: Francisco J Martin, CEO BigML.com 2013/05/30
PW Interview: Tal Rotbart Founder- CTO, SpringSense 2013/05/28
PW Interview: Jeh Daruwala CEO Yactraq API, Behavioral Targeting for videos 2013/05/13
PW Interview: Michael Schonfeld of Dwolla API on Innovation Meeting the Payment Web 2013/05/02
PW Interview: Stephen Balaban of Lambda Labs on the Face Recognition API 2013/04/29
PW Interview: Amber Feng, Stripe API, The Payment Web 2013/04/24
PW Interview: Greg Lamp and Austin Ogilvie of Yhat on Shipping Predictive Models via API 2013/04/22
Google Mirror API documentation is open for developers 2013/04/18
PW Interview: Ricky Robinett, Ordr.in API, Ordering Food meets API 2013/04/16
PW Interview: Jacob Perkins, Text Processing API, NLP meets API 2013/04/10
Amazon EC2 On Demand Windows Instances -Prices reduced by 20% 2013/04/08
Amazon S3 API Requests prices slashed by half 2013/04/02
PW Interview: Stuart Battersby, Chatterbox API, Machine Learning meets Social 2013/04/02
PW Interview: Karthik Ram, rOpenSci, Wrapping all science APIs 2013/03/20
Viralheat Human Intent API- To buy or not to buy 2013/03/13
Interview Tammer Kamel CEO and Founder Quandl 2013/03/07
YHatHQ API: Calling Hosted Statistical Models 2013/03/04
Quandl API: A Wikipedia for Numerical Data 2013/02/25
Amazon Redshift API is out of limited preview and available! 2013/02/18
Windows Azure Media Services REST API 2013/02/14
Data Science Toolkit Wraps Many Data Services in One API 2013/02/11
Diving into Codecademy’s API Lessons 2013/01/31
Google APIs finetuning Cloud Storage JSON API 2013/01/29
Springer APIs- Fostering Innovation via API Contests 2012/11/20
Statistically programming the web – Shiny,HttR and RevoDeploy API 2012/11/19
Google Cloud SQL API- Bigger ,Faster and now Free 2012/11/12
A Look at the Web’s Most Popular API -Google Maps API 2012/10/09
Cloud Storage APIs for the next generation Enterprise 2012/09/26
Last.fm API: Sultan of Musical APIs 2012/09/12
Socrata Data API: Keeping Government Open 2012/08/29
BigML API Gets Bigger 2012/08/22
Bing APIs: the Empire Strikes Back 2012/08/15
Google Cloud SQL: Relational Database on the Cloud 2012/08/13
Google BigQuery API Makes Big Data Analytics Easy 2012/08/05
Your Store in The Cloud -Google Cloud Storage API 2012/08/01
Predict the future with Google Prediction API 2012/07/30
The Romney vs Obama API 2012/07/27
Related articles
- API Evangelist (programmableweb.com) July 4th, 2013
Interview: Linkurious aims to simplify graph databases
Here is an interview with a really interesting startup, Linkurious, and its co-founders Sebastien Heymann (also co-founder of Gephi) and Jean Villedieu. They are hoping to make graph databases easier to use and thus spur on their usage.
Linkurious (L) -A lot of businesses are struggling to understand the connections within their data. Who are the persons connected to this financial transaction? What happens to the telecommunication network if this antenna fails? Who is the most influential person in this community? There are a lot of questions that involve a deep understanding of graphs. Most business intelligence and data visualization tools are not adapted for these questions because they have a hard time handling queries about connections and because their interface is not suited for network visualization.
I noticed this because I co-founded a graph visualization tool called Gephi a few years ago. It quickly became a reference, and the software was downloaded 250k times last year. It really helped people understand the connections in their data in a new way.
In 2013, this success inspired me to found Linkurious. The idea is to provide a solution that’s easy to use to democratize graph visualization.
What does it mean?
We want to help people understand the connections in their data. Linkurious is really easy to use and optimized for the exploration of graphs.
You can install it in minutes. Then, it gives you a search interface through which you can query the data. What’s special about our software is that the result of your search is represented as a graph that you can explore dynamically. Contrary to Gephi or other graph visualization tools, Linkurious only shows you a limited subset of your data and not the whole graph. The goal here is to focus on what users are looking for and help them find an answer faster.
In order to do that, Linkurious also comes with the ability to filter nodes or color them according to their properties. This way, it’s much faster to understand the data.
L- Linkurious is largely based on a stack of open-source technologies. We rely on Neo4j, the leading graph database, to store and access the data. Neo4j can handle really large datasets, which means that our users can access the information much faster than with a traditional SQL database. Neo4j also comes with a query language that allows “smart search”, locating nodes and relationships based on rules like “what’s the shortest path between these 2 nodes?” or “who among the close network of this person has been to London and loves sushi?”. That’s the kind of thing that Facebook delivers via Graph Search, and it’s exciting to see these technologies applied in the business world.
We also use Nodejs, Sigmajs and ElasticSearch.
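To make the “shortest path between these 2 nodes” kind of question concrete, here is a minimal sketch in plain Python. It uses breadth-first search over a small, made-up adjacency list. The node names and the `shortest_path` helper are hypothetical, and this is only an illustration of the idea, not how Neo4j or Linkurious implement it.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path (fewest hops) from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue  # already visited
            previous[neighbour] = node
            if neighbour == goal:
                # Walk back from the goal to the start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical toy network: who is connected to whom.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}

print(shortest_path(graph, "alice", "erin"))
```

A graph database answers the same question declaratively and at much larger scale; the sketch only shows what is being asked.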
L- There really are a lot of use cases for graph visualization and we are learning about new ones almost every day. There are well-known applications that are connected to security. For example, graph databases are great for identifying suspicious patterns across a variety of data sources. People using false identities to defraud banks tend to share addresses, phone numbers or names. Without graphs, it’s hard to see how they are connected and they tend to remain undetected until it’s too late. Graph visualization can be triggered by alert systems. Then, analysts can investigate the data and decide whether the alert should be escalated or not.
In the telecom industry, you can use graphs to map your network, identify weak links and assess the potential impact of a failure (i.e. impact analysis). Graph visualization helps you understand this information and better manage the network.
We also have clients in the logistics, health and consulting industries. Every data-oriented industry needs data visualization tools, and graphs offer powerful ways to ask new questions and reveal unforeseen information.
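The fraud example above, where people with false identities share addresses or phone numbers, comes down to finding groups of records that are linked directly or through a chain of shared values. Here is a minimal sketch in plain Python using a union-find structure. The records, the field names and the `linked_groups` helper are all invented for illustration, and a real system would query a graph database rather than an in-memory list.

```python
from collections import defaultdict


def linked_groups(records, keys=("address", "phone")):
    """Group record ids that share any value of the given keys,
    directly or through a chain of shared values."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_seen = {}  # (key, value) -> id of the first record carrying it
    for r in records:
        for key in keys:
            value = r.get(key)
            if value is None:
                continue
            if (key, value) in first_seen:
                union(first_seen[(key, value)], r["id"])
            else:
                first_seen[(key, value)] = r["id"]

    groups = defaultdict(list)
    for r in records:
        groups[find(r["id"])].append(r["id"])
    # Only groups with more than one record are worth an analyst's attention.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical account records.
records = [
    {"id": "A", "address": "12 Oak St", "phone": "555-0101"},
    {"id": "B", "address": "12 Oak St", "phone": "555-0199"},
    {"id": "C", "address": "9 Elm Rd", "phone": "555-0199"},
    {"id": "D", "address": "4 Pine Ave", "phone": "555-0150"},
]

print(linked_groups(records))
```

Here A and C never share a value directly, but both are tied to B, which is exactly the kind of indirect connection that is hard to spot in a flat table.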
L- There are a lot of challenges with creating and sustaining a startup. I think the bigger ones are not necessarily location-related. The main issue is to build something people want. It’s certainly been our biggest challenge. We’ve used a lean startup approach to ship a prototype of our product as fast as we could. The first version of Linkurious was buggy and didn’t attract much interest from customers. But we did get feedback from a few people who really liked it. Since then, we’ve been focusing on them to develop our vision of Linkurious. We are pleased with the results. I think we are on the right path, but it’s really a journey.
As for the more location-related challenges, I think France usually gets a bad rap for not being startup-friendly. Our experience has been quite the contrary. There are administrative annoyances, but we also enjoy generous benefits, access to great engineers and a burgeoning startup ecosystem!
The mission of Linkurio.us is to help users access and navigate graph databases in a simple manner so they can make sense of their data.
Some of their interesting solutions are here.
Interview Anne Milley JMP
An interview with noted analytics thought leader Anne Milley from JMP. Anne talks of statistics, cloud computing, culture of JMP, globalization and analytics in general.
DecisionStats(DS) How was 2013 as a year for statistics in general and JMP in particular?
Anne Milley- (AM) I’d say the first-ever International Year of Statistics (Statistics2013) was a great success! We hope to carry some of that momentum into 2014. We are fans of the UK’s 10-year GetStats campaign—they are in the third year, and it seems to be going really well. JMP had a very good year as well, with worldwide double-digit growth again. We are pleased to have launched version 11 of JMP and JMP Pro last year at our annual Discovery Summit user conference.
DS- Any cloud computing plans for JMP?
AM- We are exploring options, but with memory and storage still so incredibly cheap on the desktop, the responsiveness of local, in-memory computing on Macs or Windows operating systems remains compelling. John Sall said it best in a blog post he wrote in December. It is our intention to have a public cloud offering in 2014.
DS- Describe the company culture and environment in the JMP division. Any global plans?
AM- John Sall’s passion to bring interactive, intuitive data visualization and analysis on the desktop continues. There is a strong commitment in the JMP division to speeding the statistical discovery process and making it fun. It’s a powerfully motivating factor to work in an environment where that passion and purpose are shared, and where we get to interact with customers who are also passionate users of JMP, many of whom use JMP and SAS together.
While a majority of JMP personnel are in Cary, North Carolina, almost half the staff are contributing from other states and countries. JMP is sold everywhere we have SAS offices (in 59 countries). JMP has localized versions in seven languages, and we keep getting requests for more.
DS- You have been a SAS Institute veteran for 15 years now. What are some of the ups and downs you remember as milestones in the field of analytics?
AM- The most exciting milestone is that analytics has been getting more attention in the last few years, thanks to a combination of factors. Analytics is a very inclusive term (statistics, optimization, data mining, machine learning, data science, etc.), but statistics is the main discipline we draw on when we are trying to make informed decisions in the face of uncertainty. In the early days of data mining, there was a tension between statisticians and data miners/machine learners, but we now have a richer set of methods (with more solid theoretic underpinnings) with which to analyze data and make better decisions. We have better ways to automate parts of the model-building process as well, which is important with ever-wider data. In the early days of data mining, I remember many reacting with “Why spend so much time dredging through opportunistically collected data, when statistics has so much more to offer, like design of experiments?” There is still some merit to that, and maybe we will see the pendulum swing back to doing more with information-rich data.
DS- What are your top three forecasts for analytics technology in 2014?
AM- My perspective may be different than others on what’s trending in analytics technology, but as we try to do more with more data, here are my top three picks:
- We will continue to innovate new ways to visualize data and statistical output to capitalize on our high visual bandwidth. (Examples of some of our recent innovations can be found on the JMP Blog.)
- We will continue to see innovative ways to create more analytic bandwidth and democratize analytics (for example, more quickly build and deploy analytic applications and interactive visualizations for others to use).
- We will see more integration with commonly used analytical tools and infrastructure to help analysts be more productive.
DS- How do you maintain work-life balance?
AM- I enjoy what I do and the great people I work with; that is part of what motivates me each day and is added to the long list of things for which I’m grateful. Outside of work, I enjoy spending time with family, regular exercise, organic gardening and other creative pursuits.
DS- As a senior technology management person working for the past 15 years, do you think technology is a better employer for women than it was in the 1990s? What steps can be taken to improve this?
AM- I certainly see more support for women in technology with various women-in-technology organizations and programs around the world. And I also see more encouragement for girls and young women to get more exposure to science, technology, engineering, math, and statistics and consider the career options knowledge of these areas could bring. But there is more to do. I would like to add statistics to the STEM list explicitly since many still consider statistics a branch of math and don’t appreciate that statistics is the science/language of science. (Florence Nightingale said that statistics is “the most important science in the whole world.”) This year, we will see the first Women in Statistics Conference “enticing, elevating, and empowering careers of women in statistics.” There are several organizations and programs out there advocating for women in science, engineering, statistics and math, which is great. The resources such organizations provide for networking, mentoring, career development and making role models more visible are important in raising awareness on what the impediments are and how to overcome them. We should all read Sheryl Sandberg’s re-release of Lean In for Graduates (due out in April). Thank you for asking this question!
About–Anne Milley SR DIRECTOR, ANALYTIC STRATEGY, JMP
The Seven C’s of Viral Content -What makes content viral online?
Viral-
Definition-(of an image, video, piece of information, etc.) circulated rapidly and widely from one Internet user to another.
- Channels– Some content goes viral on particular channels (like 4chan or Tumblr) while it gets ignored on other social media channels.
- Content – The type of content should match the audience type (technical or non-technical) and the channel used for dissemination (like Pinterest or Tumblr for images).
- Celebrity– Getting an endorsement from a celebrity (say, one with a high enough influence score) greatly helps viral content reach beyond its initial network.
- Credibility or Network Effects- People find it easier to like or share content that has already proved to be viral or has passed a certain threshold. Some people will like content only if it is already very successful.
- Customers -Content consumers can be influencers, sharers, innovators, or passive. It is critical to meet a certain threshold of certain customer types to hit viral counts.
- Context– One man’s viral content is another man’s spam.
- Circulation – How easy is it to circulate the content? To share it or show appreciation? To add customized comments? This affects how viral content becomes, though it is more a function of the hosting website than of the content itself.
Bonus: the 8th C – Cuteness and Catiness – On the internet, cute babies and cats rule in a duopoly.
2013 in review
The WordPress.com stats helper monkeys prepared a 2013 annual report for this blog.
Here’s an excerpt:
The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 150,000 times in 2013. If it were an exhibit at the Louvre Museum, it would take about 6 days for that many people to see it.
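The arithmetic behind that comparison is easy to check. A minimal sketch, assuming the Louvre’s visitors are spread evenly over a 365-day year (a simplification that is not stated in the excerpt); the variable names are just for illustration.

```python
# Figures quoted in the excerpt above.
louvre_visitors_per_year = 8_500_000
blog_views_2013 = 150_000

# Assumption: visitors are spread evenly over a 365-day year.
visitors_per_day = louvre_visitors_per_year / 365
days_needed = blog_views_2013 / visitors_per_day

print(f"{visitors_per_day:,.0f} visitors per day")
print(f"{days_needed:.1f} days to match the blog's views")
```

Under that assumption the result lands between six and seven days, which is consistent with the “about 6 days” in the report.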
2013 Thank You Note
I would like to write a thank you note to some of the people who helped make Decisionstats.com possible. We had a total of 150,644 views this year. For that, I have to thank you, dear readers, for putting up with me. It is now our seventh year.
| Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 13,940 | 12,153 | 12,948 | 13,371 | 12,778 | 12,085 | 12,894 | 11,934 | 9,914 | 14,764 | 12,907 | 10,956 | 150,644 |
I would like to thank Chris (of Mashape) for helping me with some of the interviews I wrote here. I did 26 interviews this year for Programmable Web and a total of 30+ articles, including the interviews, in 2013.
Of course, we have now reached 116 excellent interviews on Decisionstats.com alone (see http://goo.gl/V6UsCG ). I would like to thank each one of the interviewees who took precious time to fill out the questions.
Sponsors- I would like to thank Dr Eric Siegel (individually as an author and as founder chair of www.pawcon.com ), Nadja and Ingo (for Rapid-Miner), Dr Jonathan (for Datamind), Chris M (for Statace.com ), Gergely (Author) and many more who, during all these six years, have kept us afloat and the servers warm in these days of cold reflection, including Gregory (of KDNuggets.com) and the erstwhile AsterData founders.
Training Partners- I would like to thank Lovleen Bhatia of Edureka for giving me the opportunity to make http://www.edureka.in/r-for-analytics (which now has 1721 learners as per http://www.edureka.in/ ).
I would also like to say a special thank you to Jigsaw Academy for giving me the opportunity to create the first affordable and quality R course in Asia: http://analyticstraining.com/2013/jigsaw-completes-training-of-300-students-on-r/
These training courses, including those by Datamind and Coursera, remain a formidable and affordable alternative to many others catching up in the analytics education game in India (an issue I wrote about here).
I would also like to thank each and every one of my students (past and present) and everyone in the #rstats and SAS-L communities, including people who may have been left out.
Thank you, sir, for helping me and Decisionstats.com!
I wish each one of you a very happy and joyous New Year and a great and prosperous 2014!

