Okay, Julian got hacked by the oldest trap in the world, but it is still pertinent.
So how the fuck do we control these evil hackers?
Show them money
Show them Jesus
Throw them in Jail
The correct solution to bring talented members of the technical community back into the nice air-conditioned corporate tent of technology is to:
Sponsor Hack My Website contests- Winners have to share techniques
Sponsor Hack This Search Engine Rank contests- Winners have to share techniques
General amnesty for people who have hacked before, provided they share techniques and agree to join security teams.
Sponsor Hack This Login ID contests- Winners have to share techniques and work to develop a foolproof system.
Unfortunately, this will never happen. Even the big granddaddy, Google, is willing to define hacking contests only in the narrow frame of technical hacks, rather than system-breach hacks, because system breaches generally happen at the people level.
An internal cover-your-assets mentality prevents technology and media employees from reaching out and helping create a secure online platform, thus harming shareholders.
Destruction testing (even in a controlled sandbox) of online systems would reveal the underbelly of corporate information technology.
I mean, who wants to sponsor a hack contest that makes you look bad, even though it is much more expensive to suffer a hacking attack that decreases the share price but does not affect your salary?
We have gone in for crowd sourced coding.
How about incentivizing crowd-sourced systems design for a secure and free internet?
Note: correlation between making pipe bombs and tattoo art is not the same as causation. Correlation is not causation unless Google comes out with http://causation.google.com
and we see that West Virginia likes to search for “how to make a bomb” 🙂 yeah, right 😉
If you do a Google search for "Data Mining Blog", one blog has come out on top for the past several years. data mining blog – Google Search http://bit.ly/kEdPlE
To honor 5 years of Sandro Saitta’s blog (yes, that's 5 years!), we have an exclusive interview with him in which he reveals his unique sauce for cool techie blogging.
Ajay- Describe your journey as a scientist and data miner, from early experiences, to schooling to your work/research/blogging.
Sandro- My first experience with data mining was my master's project. I used a decision tree to predict pollen concentration for the following week using input data such as wind, temperature and rain. The fact that an algorithm can make a computer learn from experience was really amazing to me. I found it so interesting that I started a PhD in data mining. This time, the field of application was civil engineering. Civil engineers put a lot of sensors on their structures in order to understand how they behave. With all these sensors they generate a lot of data. To interpret these data, I used data mining techniques such as feature selection and clustering. I started my blog, Data Mining Research, during my PhD, to share with other researchers.
I then started applying data mining in the stock market as my first job in industry. I realized the difference between image recognition, where a 99% correct classification rate is state of the art, and the stock market, where you’re happy with 55%. However, the company ambiance was not as good as I thought, so I moved to consulting. There, I applied data mining in behavioral targeting to increase click-through rates. When you compare the number of customers who click with the ones who don’t, then you really understand what class imbalance means. A few months ago, I accepted a very good opportunity at SICPA. I’m looking forward to resolving new challenges there.
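Sandro's point about class imbalance can be made concrete with a minimal sketch. The numbers below are hypothetical (1,000 impressions and 5 clicks, not figures from the interview), and `baseline_metrics` is a name invented for this illustration. It scores a do-nothing model that predicts "no click" for every impression.

```python
# Hypothetical illustration of class imbalance in click-through data.
# The impression and click counts are assumptions, not data from the interview.

def baseline_metrics(n_impressions: int, n_clicks: int) -> dict:
    """Score a trivial model that predicts 'no click' for every impression."""
    if n_impressions <= 0 or not 0 < n_clicks <= n_impressions:
        raise ValueError("need 0 < n_clicks <= n_impressions")
    # Every non-click is classified correctly; every click is missed.
    correct = n_impressions - n_clicks
    return {
        "click_rate": n_clicks / n_impressions,
        "accuracy": correct / n_impressions,
        "recall_on_clicks": 0 / n_clicks,  # the model never predicts a click
    }

metrics = baseline_metrics(n_impressions=1000, n_clicks=5)
print(metrics)
```

With these assumed counts the useless model is 99.5% accurate yet finds none of the clickers, which is why accuracy alone says little on imbalanced data.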
Ajay- Your blog is the top ranked blog for “data mining blog”. Could you share some tips on better blogging for analytics and technical people?
Sandro- It’s always difficult to start a blog, since at the beginning you have no readers. Writing for nobody may seem stupid, but it is not. By writing my first posts during my PhD I was reorganizing my ideas. I was expressing concepts which were not always clear to me. I thus learned a lot and also improved my English level. Of course, it’s still not perfect, but I hope most people can understand me.
Next come the readers. A few dozen each week first. To increase this number, I then started to learn SEO (Search Engine Optimization) by reading books and blogs. I tested many techniques that increased Data Mining Research visibility in the blogosphere. I think SEO is interesting when you already have some content published (which means not at the very beginning of your blog). After a while, once your blog is nicely ranked, the main task is to work on the content of the blog. To be of interest, your content must be particular: original, informative or provocative for example. I also had the chance to have a good visibility thanks to well-known people in the field like Kevin Hillstrom, Gregory Piatetsky-Shapiro, Will Dwinnell / Dean Abbott, Vincent Granville, Matthew Hurst and many others.
Ajay- What's your favorite statistical software, and which software packages have you worked with? Could you compare and contrast them as well?
Sandro- My favorite software at this point is SAS. I worked with it for two years. Once you know the language, you can perform ETL and data mining so easily. It’s also very fast compared to others. There are a lot of tools for data mining, but I cannot think of a tool that is as powerful as SAS and, at the same time, has a high-level programming language behind it.
I also worked with R and Matlab. R is very nice since you have all the up-to-date data mining algorithms implemented. However, working in memory is not always a good choice, especially for ETL. Matlab is an excellent tool for prototyping. It’s not so fast and certainly not made for ETL, but the price is low given all the possibilities for data mining. In my opinion, SAS is the best choice for ETL and a good choice for data mining. Of course, there is the price.
Ajay- What are your favorite techniques and training resources for teaching the basics of data mining to, say, statisticians or business management graduates?
Sandro- I’m the kind of guy who likes to read books. I read data mining books one after the other. The fact that the same concepts are explained differently (and by different people) helps a lot in learning a topic like data mining. Of course, nothing replaces experience in the field. You can read hundreds of books, you will still not be a good practitioner until you really apply data mining in specific fields. My second choice after books is blogs. By reading data mining blogs, you will really see the issues and challenges in the field. It’s still not experience, but we are closer. Finally, web resources and networks such as KDnuggets of course, but also AnalyticBridge and LinkedIn.
Ajay- Describe your hobbies and how they help you, if at all, in your professional life.
Sandro- One of my hobbies is reading. I read a lot of books about data mining, SEO, Google as well as Sci-Fi and Fantasy. I’m a big fan of Asimov by the way. My other hobby is playing tennis. I think I simply use my hobbies as a way to find equilibrium in my life. I always try to find the best balance between work, family, friends and sport.
Ajay- What are your plans for your website for 2011-2012?
Sandro- I will continue to publish guest posts and interviews. I think it is important to let other people express themselves about data mining topics. I will not write about my current applications due to the policies of my current employer. But don’t worry, I still have a lot to write, whether it is technical or not. I will also put more emphasis on my experience with data mining, advice for data miners, tips and tricks, and of course book reviews!
Standard Disclosure of Blogging- Sandro awarded me the People's Choice award for his blog for 2010 and carried out my interview. There is a lot of love between our respective WordPress blogs, but to reassure our puritan American readers, it is platonic and intellectual.
About Sandro S-
Sandro Saitta is a Data Mining Research Engineer at SICPA Security Solutions. He is also a blogger at Data Mining Research (www.dataminingblog.com). His interests include data mining, machine learning, search engine optimization and website marketing.
Jake Gyllenhaal has always been a dear chap from his Donnie Darko days. So when you mash some quantum physics (parabolic calculus, as per the movie) with science fiction to capture terrorists (a very topical topic), you get Source Code: an investigative and recursive logical thriller. Lead actress Vera Farmiga from The Departed (remember the scene of making love to Comfortably Numb) seems a bit bored here, and the tension never crackles. This is a movie for science fiction or action thriller fans, not geeks, and the name Source Code is a bit misleading, as it should probably be called Complex Event Processing. It is also a terrible name to search for in Google Image Search: you don't get movie images at all.
The movie is very watchable, but it won't be winning any Hugo awards yet.
From the marvelous, lovely Journal of Statistical Software, ignored by mainstream corporatia but beloved by academia, here is one more interesting and very timely paper.
It can be used to grade students' homework and to catch terrorists (as in plagiarists) and search engine spam linkers. Enjoy!
Here is a wonderful example of a geeky, nerdy corporate player encouraging education in the liberal arts (the designers of the GUIs and the phones of the future).
Google sponsored Doodle 4 Google. (It is also quite a challenge to traditional brand managers who want to tightly control the image of the brand. I once waited 12 days for an official logo to appear on this blog.)
Mash Hash Smash
To all the Larries of the world
the Joes, the Jims, the Steves
A shout out, much respect, you have unfurled.
Mash
First create a mash, an aggregated stash
Cross Domain platforms, the blurring of silo edges,
Intra department rivalries have no place in Klaatu's world
Create a mash-up, shut up and draw
Do not swallow your own FUD in Aaker’s brand identity trap
Cognitive Bias leads to agency conflict
Ignore the little guys who bond in dorms at your peril,
Statistically the young still have more time than the boomers
to adapt, to change, to nurture the trees that stood for 300 million years
Hash
Congratulations, you got some money
Enjoy the lunch, the San Francisco strip pubs, attention
Working furiously as the launch date approaches
Don't leave your prototype when you schmooze
remember your homies and bro
Nice people last nice products have faltered
Everyone hates pompous software marketing
Create a hash tag in your blog
make title tag both tweet optimized and web search crawlable
If you are good to the search engine
It being fair will reward you well
On the internet phonies don’t last except in bubbles
Smash
When the economy is down is a good time to hire top talent at negotiable rates
When the economy is up is a good time to move from naive sales forecast models
The ego of the CEO is less valuable than the death of an elephant
Elephants don't dance but peacocks do
Use Fear Doubt Uncertainty selectively
Karma revolves FUD PIE on your face
be generous to schools you didn't study in
The poor have a right to college in India and China
Your medicine costs 1/10th in Asia
Outsource but don't betray your nation
Outsource everything except your pride
Enjoy the show, welcome to the ride