Indian Culture and Indian Startups

Some things unique to Indian Startups-

1) Meet my co-founder, who is my relative – Indian startups tend to have more relatives as co-founders than the American startups I have seen. American founders do hire relatives as early employees, but much less often than in India.

Basically this is because Indian culture values family over business. It is also because family members can be trusted more in our society than friends (see below).

2) Payment is delayed, Boss – One common excuse startups face in India, and sometimes contribute to, is that payments, invoices and promised monetary benefits are delayed.

Basically this is because Indian culture values honesty over money.

3) Follow up – Send an email. Then send a WhatsApp message. Then call. Then meet. This is mostly over things that were committed verbally by or to a tech startup in India.

Basically this is because Indian culture values communication and interaction over honesty.

4) Stock options are in the mail – Indian startup employees are promised stock options on day 1, but they will always be ready next year. American startups either don't promise anything or they give you a contract for it.

Basically this is because Indian culture values entrepreneurship over employment.

5) Sarkaar (there is no Govt in technology startups) – This is laissez-faire, or pure capitalism. No govt to protect you and no govt to help you. Republican Americans should take note.

Basically this is because the Indian Government values technology corporations over technology entrepreneurship, because technology corporations lobby it better than technology entrepreneurs do. This is almost exactly the same as in America.

Basically, in both American and Indian startup culture, money talks and cash is king.

For the love of Data : Interview with DataJoy Founder James Allen

Describe how you came up with the idea of setting up DataJoy. What are some of the things that you learnt while creating ShareLaTeX?

The idea for DataJoy came organically as we talked to users of ShareLaTeX about the difficulties in their research workflow, beyond just paper writing. Like with LaTeX, Python and R have a high learning curve for new users. Having to first worry about installing them and getting a working environment set up is a difficult hurdle for people when they just want to start getting a feel for the language itself. Basically we want to let you write and run your first few lines of Python or R as quickly as possible.

There are also the difficulties that people face with collaboration. Getting someone else in a position to be able to run your code can be hard, especially if you use a lot of specialist packages or specific versions. If you’re actively working together on some code, making sure you don’t get in each other’s way is difficult. Version control systems have a very steep learning curve and need your entire team to use it. We think the real-time nature of DataJoy is a nice middle ground that lets everyone work together without fear of overwriting or disrupting your collaborator’s work, but has no learning curve.

With ShareLaTeX, we realised that there is a huge silent majority of students and researchers who may not be very tech-literate but are actively engaged in the academic process. These people just want to achieve their end goal, whether it’s submitting an assignment, writing a paper, or analysing some data. They aren’t posting on Stack Overflow or reading blogs about best practices because they don’t care about the technology, they only care about getting their work done. These are the people who we’ve found that we can help the most.

I can set up an IPython Notebook server on Amazon and also use RStudio Server (or just use an AMI which has both). What advantages does DataJoy give me as a data scientist? How is it different from R-fiddle?

Absolutely, and I don’t think DataJoy will ever replace this use-case. If you’re advanced enough in your understanding of your tools, and the infrastructure behind them then setting up a server on Amazon for yourself has a lot of benefits. However, there are a lot more people out there who want the benefits of a cloud environment, but wouldn’t know where to start with setting up their own server and are more focused on the results of their research than in learning how to do so.

Even as someone who does know how to set up such a server though, it’s still an extra piece of infrastructure that you need to manage and support. If you use DataJoy then you can let us do that for you and just focus on your actual data science workflow.

What are some of the ways that you have thought of monetizing this model of creating infrastructure for data scientists?

It’s still very early days and we’re still learning about the needs of different users, but I think there are likely to be 2 or 3 main sources of revenue for DataJoy:

  • Individual accounts for users looking for more compute resources or more advanced features,
  • Group and site licenses for teams in enterprises, universities or teaching that are looking to move their whole team’s workflow to DataJoy,
  • Onsite installations

Are you thinking of expanding to include things like Spark et al for users?

We’re focusing on Python and R at the moment to make sure that we can provide the best user experience for these languages. However, our long term goal is to make DataJoy language agnostic so that you can bring your favourite language and toolchain and we’ll be able to support it. We have a very flexible infrastructure on the backend and the limitation to Python and R at the moment is to keep things simpler for us and users.

What are some case studies that you want to share?

We’re really excited about how DataJoy is being used in classrooms all over the world. I haven’t asked permission to share these stories publicly, so without naming names, I’m aware of a lecturer who is using DataJoy to run classes in an interactive way that just wasn’t possible before. He can present the lesson as code in DataJoy on the projector, and have all his students logged into the same project on their laptops. Students can fill in chunks of code as the lesson progresses and it appears immediately on the projector and on other students’ screens.

Likewise, another lecturer is using DataJoy as a way to distribute assignments to students and if they get stuck, she can quickly log in directly to the student’s project and help them debug it. This has saved her lots of unnecessary hassle of getting the student to email her the code, and then fighting with possible version mismatches or missing dependencies. Being able to see the problem in exactly the same context as the student has been invaluable.

These cases are really exciting to us because they open up completely new ways of teaching that just weren’t possible before.

Do you intend to make the code for DataJoy open source or for users who want to run their own DataJoy server on premise?

Yes, absolutely! ShareLaTeX is already open source and available for users to run individual instances. The DataJoy code base is branched from ShareLaTeX and is still in our open GitHub repositories. The only problem with DataJoy at the moment is that the infrastructure for running Python and R code on our backend is quite tied in to our specific architecture. As soon as we work out how to abstract that so that it can run easily anywhere, we will release DataJoy as an open source project.

What else is on your product roadmap for DataJoy?

At the moment we have two main focuses: Improving DataJoy for teaching, and improving the ease of use for new Python/R users. We want to make it easier for teachers to manage large classes of students and work interactively with them. We also want to make sure that we remove the roadblocks that new Python or R users face, including making error messages more clear, making it ridiculously easy to install any package (even ones that need compiling from source) and providing help, tutorials and examples at the right times.

Describe your own journey as a developer hacker and entrepreneur. What advice would you give to young people entering data science and devops today?

I came to ShareLaTeX and DataJoy after doing a PhD in theoretical physics at Durham University which I finished in early 2013. I’d always had an interest in programming, and worked as a part-time web developer for a web hosting company while I was an undergraduate at Edinburgh studying maths. As a PhD student, I’d written a prototype LaTeX editor that had a bit of traction, and teamed up with my co-founder Henry to work on ShareLaTeX in 2012. Henry comes from a strong software development background and has helped me mature a lot as a software developer to be able to write and maintain large scale services.

I don’t have much experience doing data science directly, but my advice for all aspects of life would be to reach out and talk to as many people as possible, especially if they are doing interesting or different work from you. Only by getting lots of opinions (sometimes conflicting!) can you start to build up a realistic view of the world. Surrounding yourself with people you can learn from is very important too, and part of this. If you can’t find people in real life, then find good people to listen to online. Of course, always evaluate what they say with a critical eye :).

How would DataJoy enable coding, or even learning to code, on mobile phones?

We’d love to support DataJoy on mobile devices, but they present a number of unique technical challenges. We’ve found that what makes a nice user interface on a PC does not transfer to a tablet/phone very well, and so we’d need to redesign the whole experience. We also have to work with poorer network connections, and offline usage. These are problems that we’re excited to tackle because I think it would let people work in ways with Python and R that haven’t been possible before, but for now we’re focused on improving the desktop/laptop experience.

(PS – I love DataJoy, and I have no commercial interests at all in them. I just get a kick from kicking tires in R and Python in a browser WITHOUT any installation hassles)

https://www.getdatajoy.com/

Google is watching you and how

Here is some R code we have written.

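library(jsonlite)
# Read the Google Takeout location history export
a=fromJSON("/home/rstudio/R/Takeout/Location History/LocationHistory.json")
b=as.data.frame(a)

# Coordinates are stored as integers scaled by 1e7; timestamps are in milliseconds
mygoog=NULL
mygoog$latitude=b$locations.latitudeE7/10000000
mygoog$longitude=b$locations.longitudeE7/10000000
mygoog$time=as.POSIXct(as.numeric(b$locations.timestampMs)/1000 , origin="1970-01-01")

mygoog=as.data.frame(mygoog)

library(ggmap)
# The start of this call was lost from the post; get_googlemap() and the
# map center (median of the recorded points) are assumptions
Map <- get_googlemap(center = c(lon = median(mygoog$longitude), lat = median(mygoog$latitude)),
zoom = 12,
size = c(640, 640),
scale = 2, maptype = c("terrain"),
color = "color")

# Draw the full location history as a path on the map
plot1 <- ggmap(Map) +
geom_path(data = mygoog, aes(x = longitude, y = latitude
),
alpha = I(0.5),
size = 0.8)
suppressWarnings(print(plot1))

# Keep only the points recorded after a given time and plot again
mygoog2=mygoog[mygoog$time > as.POSIXct("2015-09-21 12:09:31"), ]
plot1 <- ggmap(Map) +

geom_path(data = mygoog2, aes(x = longitude, y = latitude
),
alpha = I(0.5),
size = 0.8)
suppressWarnings(print(plot1))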
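
For readers coming from the Python side, the two conversions in the R code above (E7 integers to degrees, millisecond timestamps to date-times) and the time filter can be sketched with just the standard library. This is a rough sketch and not the code we ran: the inline sample values are made up, the helper names parse_locations and after are hypothetical, and times are treated as UTC, whereas as.POSIXct in the R code uses the local time zone.

```python
import json
from datetime import datetime, timezone

# Made-up sample in the layout the R code assumes: a "locations" list
# whose entries carry latitudeE7, longitudeE7 and timestampMs.
SAMPLE = """
{"locations": [
  {"timestampMs": "1442700000000", "latitudeE7": 286139000, "longitudeE7": 772090000},
  {"timestampMs": "1443000000000", "latitudeE7": 286200000, "longitudeE7": 772100000}
]}
"""


def parse_locations(raw):
    """Turn E7 integers into degrees and millisecond stamps into UTC datetimes."""
    rows = []
    for loc in json.loads(raw)["locations"]:
        seconds = int(loc["timestampMs"]) / 1000  # milliseconds -> seconds
        rows.append({
            "latitude": loc["latitudeE7"] / 1e7,    # degrees
            "longitude": loc["longitudeE7"] / 1e7,  # degrees
            "time": datetime.fromtimestamp(seconds, tz=timezone.utc),
        })
    return rows


def after(rows, cutoff):
    """Keep only the rows recorded strictly after the cutoff."""
    return [row for row in rows if row["time"] > cutoff]


rows = parse_locations(SAMPLE)
cutoff = datetime(2015, 9, 21, 12, 9, 31, tzinfo=timezone.utc)
recent = after(rows, cutoff)

for row in recent:
    print(row["time"].isoformat(), row["latitude"], row["longitude"])
```

With a real Takeout file you would read the JSON from disk instead of the SAMPLE string; the field names are taken from the R code, not checked against the current export format.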

Using R now is closer and more similar to just using Python #rstats #python

Some developments (these should interest Microsoft, which is basically the leading player in enterprise solutions for R after acquiring Revolution R outright, with RStudio being headed by another Microsoft alum):

  1. you can install R using miniconda http://continuum.io/conda-for-R
  2. you can run R using Jupyter notebooks

http://irkernel.github.io/

and see

http://continuum.io/blog/conda-jupyter-irkernel

“R Essentials” setup

The Anaconda team has created an “R Essentials” bundle with the IRKernel and over 80 of the most used R packages for data science, including dplyr, shiny, ggplot2, tidyr, caret and nnet.

Downloading “R Essentials” requires conda. Miniconda includes conda, Python, and a few other necessary packages, while Anaconda includes all this and over 200 of the most popular Python packages for science, math, engineering, and data analysis. Users may install all of Anaconda at once, or they may install Miniconda at first and then use conda to install any other packages they need, including any of the packages in Anaconda.

Once you have conda, you may install “R Essentials” into the current environment:

conda install -c r r-essentials

or create a new environment just for “R essentials”:

conda create -n my-r-env -c r r-essentials

Jupyter

Jupyter provides a great notebook interface to write your analysis and share it with your peers. Open a shell and run this command to start the Jupyter notebook interface in your browser:

jupyter notebook

Start a new R notebook:

You can immediately write and run R code in the notebook cells.

  3. Running R from within Python – yeah!

http://blog.revolutionanalytics.com/2015/09/using-r-with-jupyter-notebooks.html

Step 1: install Miniconda

Step 2: open an OS terminal window:

conda install -c r ipython-notebook r-irkernel
ipython notebook

Using R Within the IPython Notebook

Using the rmagic extension, users can run R code from within the IPython Notebook. This example Notebook demonstrates this capability.

http://nbviewer.ipython.org/github/olgabot/ipython/blob/2.x/examples/Builtin%20Extensions/R%20Magics.ipynb

  4. Use Docker!

https://hub.docker.com/r/jupyter/datascience-notebook/

Jupyter Notebook Data Science Stack

What it Gives You

  • Jupyter Notebook server v4.0.x
  • Conda Python 3.4.x and Python 2.7.x environments
  • pandas, matplotlib, scipy, seaborn, scikit-learn, scikit-image, sympy, cython, patsy, statsmodel, cloudpickle, dill, numba, bokeh pre-installed
  • Conda R v3.2.x and channel
  • plyr, devtools, dplyr, ggplot2, tidyr, shiny, rmarkdown, forecast, stringr, rsqlite, reshape2, nycflights13, caret, rcurl, and randomforest pre-installed
  • Julia v0.3.x with Gadfly and RDatasets pre-installed
  • Unprivileged user jovyan (uid=1000, configurable, see options) in group users (gid=100) with ownership over /home/jovyan and /opt/conda
  • Options for HTTPS, password auth, and passwordless sudo

Basic Use

The following command starts a container with the Notebook server listening for HTTP connections on port 8888 without authentication configured.

docker run -d -p 8888:8888 jupyter/datascience-notebook

DataJoy brings you an online way of doing #rstats and #python

I love the sleek and simple interface at DataJoy, and it quickly enabled me to start coding in the cloud in both R and Python without installing RStudio Server or IPython Notebook Server.

Amazing stuff and just two clicks away to test it out for free.

 

Check it here

https://www.getdatajoy.com/

Genius Triumphs and other Narcos Lessons

from lefsetz.com/wordpress/ Narcos Lessons

GENIUS TRIUMPHS

It’s not something you learn in books, but an ability you’re born with, that you believe in, that you exercise. Success in life is about analysis. That’s what they teach in the elite institutions that those going to lesser colleges miss out on. Facts are irrelevant, you can look them up online. But how to put them together to create something new… That’s what the stars know, how to hold two contradictory thoughts in their brain simultaneously and then 3-D model the future. Geniuses are one step ahead and are ultimately decried and hated for it. They have insight the rest of us lack. Or as Gretzky put it, skate to where the puck is going, not where it is right now.

INFORMATION IS EVERYTHING

That’s why superstars are on the phone all day, why they cultivate relationships. Life is war and if you want to triumph you need to know where all the bodies are buried, who is relevant and who is not. Once someone focuses on personal hurts by irrelevant people you know it’s time to move on. Winners focus on the prize, and never remove their eyes. So, once again, collect information, and then synthesize it into a plan.

BEWARE OF ADVERTISING AND PR

The narcos hired a PR firm, which charged them deep five digits for a logo that accomplished little. Use your PR team to navigate media outlets, advertising firms can come up with ideas, but you’re in the driver’s seat, you have to guide them and make the final decisions.

LOYALTY IS EVERYTHING

If you’ve got no one you can count on, who’ll take a bullet for you, you’re lost. Life is a team sport, which is why loners end up on the sidelines.

SOMEONE’S GOT TO BE THE LEADER

Once you’re ceding territory to another, not wanting to be perceived as aggressive, you’re lost. Natural leaders want the power the same way LeBron wants the ball. If you’re questioning yourself, if you’re letting someone else go first, you’re doomed.

LEADERS AREN’T ALWAYS RIGHT

Gustavo gets Pablo to change course, but only after he hears Pablo out. He who speaks first often fails. Let others talk, and then gently nudge them in the direction you believe should be pursued.

THERE’S NO HONOR AMONG THIEVES

Not only in drugs, but tech and music too. Bill Gates and Steve Jobs are not lovable teddy bears. Same deal with label heads. They need to make choices to stay in power and win, choices that you might think are illegal or abhor. You can quit, but if you won’t do these same things you’re never going to win. That’s what they don’t tell you about life… School teaches you to conform but rules are for suckers. If you don’t believe everything is up for grabs, you’re a loser.

PIVOT

Pablo Escobar didn’t start out in drugs. Don’t be married to who you are. A musician, a coder, a… Winners are always open to opportunities. If you’re not willing to change your mind, change course, you’re not going to win.

IF IT’S TOO GOOD TO BE TRUE, IT IS

We want to believe, and therefore we give charismatic leaders the benefit of the doubt when we shouldn’t. People are mindless sheep who will do anything if given attention, especially by someone rich, powerful and famous.

FAMILY IS EVERYTHING

There’s a reason your spouse can’t be forced to testify against you in court. Choose your spouse wisely and teach your children well.

EVERYBODY LOSES IN THE END

When you hang it out that far, you’re gonna get caught. What did the DEA man say… “The bad guys need to get lucky every time. The good guys just need to get lucky once.”

EVERYTHING’S NEGOTIABLE

If you don’t have the chutzpah to ask for the unthinkable, you’re not dreaming big enough.

MONEY TALKS AND JUST ABOUT EVERYBODY HAS A PRICE

And if they’re reluctant to conform, just threaten their family, see above.

LADIES LOVE OUTLAWS, AND MEN DO TOO

We want to believe we’ve got a way out, that’s why we lionize the renegades.

THE LITTLE PEOPLE DON’T COUNT

They’re uninformed and stupid and can be manipulated and are expendable. Don’t believe me? You keep the Fortune 500 alive, but do these corporations care about you as they pollute, overcharge and pay no taxes? The sooner you wake up and realize the game is rigged, the earlier you get on the path of success.

LOVE WILL GET YOU IN TROUBLE

Following your johnson is the best way to be blown off course. If you can’t say no to the little man, the big man may not survive.

WINNERS SLEEP WITH ONE EYE OPEN

They know someone else wants their perch.

WORK IS BORING, THE SPOILS ARE INTERESTING

Just ask a banker. If money is the byproduct you’re looking for, don’t be surprised if the work is drudgery. Which is why fat cats are always looking to invest in the entertainment business. And are always ripped off by the lifers in it. Don’t go where you don’t know. At best you have one area of expertise. Nobody knows everything and no one wins it all. Life is a game where the board is wiped clean every generation and no one has a sense of history. Do you think they’ll be talking about Pablo Escobar fifty years from now? Probably not, but we’re intrigued by those who make it from the bottom on smarts alone, who are willing to challenge institutions and take big risks, because…we usually are not.

YOU WANT TO BE IN THE ROOM

That’s where all the action takes place, where the story is hashed out. By the time you read it in the newspaper it’s usually wrong. If you want to win, if you want to be a player, you’ve got to find a way to get inside the room, where decisions are made and most of what is said never seeps out of the walls. It’s all about uncovering the REAL STORY. And you can only do this by knowing the right people and being in the right place at the right time. People love to talk, they’ll tell you anything. As long as you can keep a secret…

DON’T WRITE ANYTHING INCRIMINATING DOWN

That’s the mark of an amateur. Success is all about plausible deniability.