Interview: Linkurious aims to simplify graph databases

Here is an interview with a really interesting startup, Linkurious, and its co-founders Sebastien Heymann (also co-founder of Gephi) and Jean Villedieu. They hope to make graph databases easier to use and thus spur their adoption.

DecisionStats (DS)- How did you come to set up your startup?

Linkurious (L)- A lot of businesses are struggling to understand the connections within their data. Who are the people connected to this financial transaction? What happens to the telecommunication network if this antenna fails? Who is the most influential person in this community? Many such questions require a deep understanding of graphs. Most business intelligence and data visualization tools are not suited to these questions because they have a hard time handling queries about connections and because their interfaces are not designed for network visualization.
I noticed this because I co-founded a graph visualization software called Gephi a few years ago. It quickly became a reference and the software was downloaded 250k times last year. It really helped people understand the connections in their data in a new way.
In 2013, this success inspired me to found Linkurious. The idea is to provide a solution that’s easy to use to democratize graph visualization.

What does it mean?
We want to help people understand the connections in their data. Linkurious is really easy to use and optimized for the exploration of graphs.
You can install it in minutes. Then, it gives you a search interface through which you can query the data. What’s special about our software is that the result of your search is represented as a graph that you can explore dynamically. Unlike Gephi or other graph visualization tools, Linkurious only shows you a limited subset of your data and not the whole graph. The goal here is to focus on what users are looking for and help them find an answer faster.
In order to do that, Linkurious also comes with the ability to filter nodes or color them according to their properties. This way, it’s much faster to understand the data.

DS- How do you support packages from Python, R and other languages like Julia? What is Linkurious based on?

L- Linkurious is largely based on a stack of open-source technologies. We rely on Neo4j, the leading graph database, to store and access the data. Neo4j can handle really large datasets, which means that our users can access the information much faster than with a traditional SQL database. Neo4j also comes with a query language that allows “smart search”: locating nodes and relationships based on rules like “what’s the shortest path between these 2 nodes?” or “who among the close network of this person has been to London and loves sushi?”. That’s the kind of thing that Facebook delivers via Graph Search, and it’s exciting to see these technologies applied in the business world.
We also use Node.js, Sigma.js and Elasticsearch.
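
To make the “shortest path between these 2 nodes” question concrete, here is a minimal sketch in plain Python. It is not Neo4j’s query language and not how Linkurious works internally; the `shortest_path` helper, the `social_graph` data and the names in it are all made up for illustration, and the search is an ordinary breadth-first search over an in-memory dictionary.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal, or None if unreachable.

    graph is a dict mapping each node to a list of its neighbors
    (a hypothetical toy structure, not a graph database).
    """
    if start == goal:
        return [start]
    previous = {start: None}  # remembers how each node was first reached
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue  # already reached by a path that is at least as short
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back from goal to start, then reverse the result.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Invented example data: who knows whom.
social_graph = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Bob", "Carol", "Erin"],
    "Erin": ["Dave"],
}

print(shortest_path(social_graph, "Alice", "Erin"))
```

A graph database answers the same kind of question with a declarative query over data stored on disk, so a loop like this is only meant to show what the question means.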

DS- Can you name a few case studies where enterprises have used graph analysis to great benefit?

L- There really are a lot of use cases for graph visualization and we are learning about new ones almost every day. There are well-known applications connected to security. For example, graph databases are great for identifying suspicious patterns across a variety of data sources. People using false identities to defraud banks tend to share addresses, phone numbers or names. Without graphs, it’s hard to see how they are connected and they tend to remain undetected until it’s too late. Graph visualization can be triggered by alert systems. Then, analysts can investigate the data and decide whether the alert should be escalated or not.
In the telecom industry, you can use graphs to map your network, identify weak links and assess the potential impact of a failure (i.e. impact analysis). Graph visualization helps you understand this information and better manage the network.

We also have clients in the logistics, health and consulting industries. Every data-oriented industry needs data visualization tools, and graphs offer powerful ways to ask new questions and reveal unforeseen information.
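
The fraud example in the answer above (false identities that share addresses or phone numbers) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the account records, field names and the `find_linked_accounts` helper are hypothetical, and real fraud detection involves far more than shared fields. The idea is to treat accounts as nodes, link any two that share an identifier value, and report the connected groups.

```python
from collections import defaultdict


def find_linked_accounts(accounts):
    """Group accounts that share an identifier, directly or through a chain.

    accounts maps an account id to a dict of identifier fields
    (hypothetical data layout chosen for this sketch).
    """
    # Index accounts by each (field, value) pair they use.
    by_identifier = defaultdict(set)
    for account_id, fields in accounts.items():
        for field, value in fields.items():
            by_identifier[(field, value)].add(account_id)

    # Two accounts are neighbors if they share at least one identifier.
    neighbors = defaultdict(set)
    for members in by_identifier.values():
        for account_id in members:
            neighbors[account_id] |= members - {account_id}

    # Collect connected components with a simple depth-first traversal.
    seen = set()
    groups = []
    for account_id in accounts:
        if account_id in seen:
            continue
        component = set()
        stack = [account_id]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        if len(component) > 1:  # isolated accounts are not reported
            groups.append(sorted(component))
    return groups


# Invented records: acct-1 and acct-2 share an address,
# acct-2 and acct-3 share a phone number, acct-4 shares nothing.
accounts = {
    "acct-1": {"address": "12 Rue A", "phone": "555-0101"},
    "acct-2": {"address": "12 Rue A", "phone": "555-0102"},
    "acct-3": {"address": "9 Rue B", "phone": "555-0102"},
    "acct-4": {"address": "3 Rue C", "phone": "555-0199"},
}

print(find_linked_accounts(accounts))
```

Note that acct-1 and acct-3 have no field in common, yet they end up in the same group through acct-2. That indirect link is the kind of connection that is hard to spot in separate tables.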

DS- What are some of the challenges with creating, sustaining and maintaining a cutting-edge technology startup in Europe and France?

L- There are a lot of challenges with creating and sustaining a startup. I think the bigger ones are not necessarily location-related. The main issue is to build something people want. It’s certainly been our biggest challenge. We’ve used a lean startup approach to ship a prototype of our product as fast as we could. The first version of Linkurious was buggy and didn’t attract much interest from customers. But we did get feedback from a few people who really liked it. Since then, we’ve been focusing on them to develop our vision of Linkurious. We are pleased with the results. I think we are on the right path, but it’s really a journey.
As for the more location-related challenges, I think France usually gets a bad rep for not being startup-friendly. Our experience has been quite the contrary. There are administrative annoyances, but we also enjoy generous benefits, access to great engineers and a burgeoning startup ecosystem!


The mission of Linkurious is to help users access and navigate graph databases in a simple manner so they can make sense of their data.

Some of their interesting solutions are here.

Interview Anne Milley JMP

An interview with noted analytics thought leader Anne Milley from JMP. Anne talks about statistics, cloud computing, the culture of JMP, globalization and analytics in general.

DecisionStats (DS)- How was 2013 as a year for statistics in general and JMP in particular?

Anne Milley (AM)- I’d say the first-ever International Year of Statistics (Statistics2013) was a great success! We hope to carry some of that momentum into 2014. We are fans of the UK’s 10-year GetStats campaign; they are in the third year, and it seems to be going really well. JMP had a very good year as well, with worldwide double-digit growth again. We are pleased to have launched version 11 of JMP and JMP Pro last year at our annual Discovery Summit user conference.

DS-  Any cloud computing plans for JMP?

AM- We are exploring options, but with memory and storage still so incredibly cheap on the desktop, the responsiveness of local, in-memory computing on Macs or Windows operating systems remains compelling. John Sall said it best in a blog post he wrote in December.  It is our intention to have a public cloud offering in 2014.

DS- Describe the company culture and environment in the JMP division. Any global plans?

AM- John Sall’s passion to bring interactive, intuitive data visualization and analysis on the desktop continues. There is a strong commitment in the JMP division to speeding the statistical discovery process and making it fun. It’s a powerfully motivating factor to work in an environment where that passion and purpose are shared, and where we get to interact with customers who are also passionate users of JMP, many of whom use JMP and SAS together.

While a majority of JMP personnel are in Cary, North Carolina, almost half the staff are contributing from other states and countries. JMP is sold everywhere we have SAS offices (in 59 countries). JMP has localized versions in seven languages, and we keep getting requests for more.

DS- You have been a SAS Institute veteran for 15 years now. What are some of the ups and downs you remember as milestones in the field of analytics?

AM- The most exciting milestone is that analytics has been getting more attention in the last few years, thanks to a combination of factors. Analytics is a very inclusive term (statistics, optimization, data mining, machine learning, data science, etc.), but statistics is the main discipline we draw on when we are trying to make informed decisions in the face of uncertainty. In the early days of data mining, there was a tension between statisticians and data miners/machine learners, but we now have a richer set of methods (with more solid theoretic underpinnings) with which to analyze data and make better decisions. We have better ways to automate parts of the model-building process as well, which is important with ever-wider data. In the early days of data mining, I remember many reacting with “Why spend so much time dredging through opportunistically collected data, when statistics has so much more to offer, like design of experiments?” There is still some merit to that, and maybe we will see the pendulum swing back to doing more with information-rich data.

DS- What are your top three forecasts for analytics technology in 2014?

AM- My perspective may be different than others on what’s trending in analytics technology, but as we try to do more with more data, here are my top three picks:

  • We will continue to innovate new ways to visualize data and statistical output to capitalize on our high visual bandwidth. (Examples of some of our recent innovations can be found on the JMP Blog.)

  • We will continue to see innovative ways to create more analytic bandwidth and democratize analytics—for example, more quickly build and deploy analytic applications and interactive visualizations for others to use.

  • We will see more integration with commonly used analytical tools and infrastructure to help analysts be more productive.

DS-  How do you maintain work-life balance?

AM- I enjoy what I do and the great people I work with; that is part of what motivates me each day and is added to the long list of things for which I’m grateful. Outside of work, I enjoy spending time with family, regular exercise, organic gardening and other creative pursuits.

DS- As a senior technology management person working for the past 15 years, do you think technology is a better employer for women than it was in the 1990s? What steps can be taken to improve this?

AM- I certainly see more support for women in technology with various women-in-technology organizations and programs around the world. And I also see more encouragement for girls and young women to get more exposure to science, technology, engineering, math, and statistics and consider the career options knowledge of these areas could bring. But there is more to do. I would like to add statistics to the STEM list explicitly since many still consider statistics a branch of math and don’t appreciate that statistics is the science/language of science. (Florence Nightingale said that statistics is “the most important science in the whole world.”) This year, we will see the first Women in Statistics Conference “enticing, elevating, and empowering careers of women in statistics.” There are several organizations and programs out there advocating for women in science, engineering, statistics and math, which is great. The resources such organizations provide for networking, mentoring, career development and making role models more visible are important in raising awareness on what the impediments are and how to overcome them. We should all read Sheryl Sandberg’s re-release of Lean In for Graduates (due out in April). Thank you for asking this question!


Anne oversees product management and analytic strategy in JMP Product Marketing. She is a contributing faculty member for the International Institute of Analytics.

The Seven C’s of Viral Content - What makes content viral online?


Definition of viral- (of an image, video, piece of information, etc.) circulated rapidly and widely from one Internet user to another.

  1. Channels– Some content goes viral on particular channels (like 4chan or Tumblr) while being ignored on other social media channels.
  2. Content– The type of content should match the audience type (technical or non-technical) and the channel used for dissemination (like Pinterest or Tumblr for images).
  3. Celebrity– An endorsement from a celebrity (say, one with a high enough influence score) greatly helps viral content reach beyond its initial network.
  4. Credibility or Network Effects– People find it easier to like or share content that has already proved viral or has passed a certain threshold. Some people will like content simply because it is already very successful.
  5. Customers– Content consumers can be influencers, sharers, innovators, or passive. It is critical to reach a certain threshold of certain customer types to hit viral counts.
  6. Context– One man’s viral content is another man’s spam.
  7. Circulation– How easy is it to circulate the content? To share it or show appreciation? To add customized comments? This affects how viral content becomes, though it is more a function of the hosting website than of the content itself.

Bonus: the 8th C – Cuteness and Catiness – On the internet, cute babies and cats rule in a duopoly.


2013 in review

The stats helper monkeys prepared a 2013 annual report for this blog.

Here’s an excerpt:

The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 150,000 times in 2013. If it were an exhibit at the Louvre Museum, it would take about 6 days for that many people to see it.
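
For the curious, the “about 6 days” figure can be checked with a quick calculation. This is a rough sketch that assumes the Louvre’s visitors are spread evenly over 365 days (ignoring closing days and seasonal peaks); the `days_to_match` helper is just an illustrative name.

```python
def days_to_match(views, visitors_per_year, days_per_year=365):
    """Days needed for a venue to receive `views` visitors at an even daily rate."""
    visitors_per_day = visitors_per_year / days_per_year
    return views / visitors_per_day


# Figures quoted in the excerpt above.
days_needed = days_to_match(views=150_000, visitors_per_year=8_500_000)
print(f"{days_needed:.1f} days")
```

The result lands between 6 and 7 days, which matches the rounded figure in the report.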

Click here to see the complete report.