Interview Dan Steinberg Founder Salford Systems

Here is an interview with Dan Steinberg, Founder and President of Salford Systems (http://www.salford-systems.com/ )

Ajay- Describe your journey from academia to technology entrepreneurship. What are the key milestones or turning points that you remember?

 Dan- When I was in graduate school studying econometrics at Harvard,  a number of distinguished professors at Harvard (and MIT) were actively involved in substantial real world activities.  Professors that I interacted with, or studied with, or whose software I used became involved in the creation of such companies as Sun Microsystems, Data Resources, Inc. or were heavily involved in business consulting through their own companies or other influential consultants.  Some not involved in private sector consulting took on substantial roles in government such as membership on the President’s Council of Economic Advisors. The atmosphere was one that encouraged free movement between academia and the private sector so the idea of forming a consulting and software company was quite natural and did not seem in any way inconsistent with being devoted to the advancement of science.

Ajay- What are the latest products by Salford Systems? Any future product plans or modifications to work on Big Data analytics, mobile computing and cloud computing?

 Dan- Our central set of data mining technologies are CART, MARS, TreeNet, RandomForests, and PRIM, and we have always maintained feature rich logistic regression and linear regression modules. In our latest release scheduled for January 2012 we will be including a new data mining approach to linear and logistic regression allowing for the rapid processing of massive numbers of predictors (e.g., one million columns), with powerful predictor selection and coefficient shrinkage. The new methods allow not only classic techniques such as ridge and lasso regression, but also sub-lasso model sizes. Clear tradeoff diagrams between model complexity (number of predictors) and predictive accuracy allow the modeler to select an ideal balance suitable for their requirements.
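To make the shrinkage and predictor selection idea concrete, here is a minimal Python sketch of lasso regression fitted by coordinate descent. It is not Salford's implementation: the function names, the toy data and the penalty values are all assumptions chosen for illustration, and the toy design has orthogonal columns so the effect of the penalty is easy to see.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero; anything within the penalty becomes exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(rows, targets, penalty, sweeps=200):
    """Minimise (1/2n)*sum((y - Xb)^2) + penalty*sum(|b|) by cyclic coordinate descent.

    Assumes every predictor column has at least one non-zero value.
    """
    n = len(rows)
    p = len(rows[0])
    coefs = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0   # covariance of column j with the residual that ignores column j
            norm = 0.0  # sum of squares of column j
            for row, y in zip(rows, targets):
                partial = sum(coefs[k] * row[k] for k in range(p) if k != j)
                rho += row[j] * (y - partial)
                norm += row[j] ** 2
            coefs[j] = soft_threshold(rho / n, penalty) / (norm / n)
    return coefs


# Hypothetical toy data: a 2^3 factorial design (orthogonal columns).
ROWS = [[a, b, c] for a in (1, -1) for b in (1, -1) for c in (1, -1)]
# The target depends strongly on column 0, weakly on column 1, not at all on column 2.
TARGETS = [3 * r[0] + 0.2 * r[1] for r in ROWS]

for penalty in (0.0, 0.1, 0.5, 5.0):
    coefs = lasso_coordinate_descent(ROWS, TARGETS, penalty)
    kept = sum(1 for c in coefs if abs(c) > 1e-9)
    print(f"penalty={penalty}: predictors kept={kept}, coefficients={coefs}")
```

Raising the penalty drops the weak predictor first and shrinks the strong one, which is the complexity versus accuracy tradeoff described above. Ridge regression would shrink the coefficients without setting any of them exactly to zero.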

The new version of our data mining suite, Salford Predictive Modeler (SPM), also includes two important extensions to the boosted tree technology at the heart of TreeNet. The first, Importance Sampled Learning Ensembles (ISLE), is used for the compression of TreeNet tree ensembles. Starting with, say, a 1,000 tree ensemble, the ISLE compression might well reduce this down to 200 reweighted trees. Such compression will be valuable when models need to be executed in real time. The compression rate is always under the modeler’s control, meaning that if a deployed model may only contain, say, 30 trees, then the compression will deliver an optimal 30-tree weighted ensemble. Needless to say, compression of tree ensembles should be expected to be lossy, and how much accuracy is lost when extreme compression is desired will vary from case to case. Prior to ISLE, practitioners have simply truncated the ensemble to the maximum allowable size. The new methodology will substantially outperform truncation.

The second major advance is RULEFIT, a rule extraction engine that starts with a TreeNet model and decomposes it into the most interesting and predictive rules. RULEFIT is also a tree ensemble post-processor and offers the possibility of improving on the original TreeNet predictive performance. One can think of the rule extraction as an alternative way to explain and interpret an otherwise complex multi-tree model. The rules extracted are similar conceptually to the terminal nodes of a CART tree but the various rules will not refer to mutually exclusive regions of the data.
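The difference between mutually exclusive tree nodes and overlapping rules can be sketched in a few lines of Python. This is only an illustration, not the RULEFIT engine: the rule names, thresholds, field names and the tiny hand-written tree are hypothetical.

```python
# Hypothetical rules, each a conjunction of simple conditions (like a path in one tree).
RULES = {
    "frequent_caller": lambda r: r["calls"] >= 4,
    "high_balance": lambda r: r["balance"] > 5000,
    "young_and_high_balance": lambda r: r["age"] < 30 and r["balance"] > 5000,
}


def cart_leaf(record):
    """A single hand-written tree: every record lands in exactly one terminal node."""
    if record["balance"] > 5000:
        return "leaf_A" if record["age"] < 30 else "leaf_B"
    return "leaf_C"


def matching_rules(record):
    """Rules are not mutually exclusive: a record may satisfy several, or none."""
    return sorted(name for name, test in RULES.items() if test(record))


def rule_features(record):
    """0/1 indicators, one per rule, usable as inputs to a sparse linear model."""
    return [1 if RULES[name](record) else 0 for name in sorted(RULES)]


customer = {"age": 25, "balance": 8000, "calls": 5}
print(cart_leaf(customer))        # one terminal node
print(matching_rules(customer))   # possibly several rules at once
print(rule_features(customer))
```

In a rule ensemble, indicators like these are given weights by a regularized linear fit, so a record's prediction is a sum over every rule it satisfies rather than the value of a single node.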

Ajay- You have led teams that have won multiple data mining competitions. What are some of your favorite techniques or approaches to a data mining problem?

 Dan- We only enter competitions involving problems for which our technology is suitable, generally, classification and regression. In these areas, we are  partial to TreeNet because it is such a capable and robust learning machine. However, we always find great value in analyzing many aspects of a data set with CART, especially when we require a compact and easy to understand story about the data. CART is exceptionally well suited to the discovery of errors in data, often revealing errors created by the competition organizers themselves. More than once, our reports of data problems have been responsible for the competition organizer’s decision to issue a corrected version of the data and we have been the only group to discover the problem.

In general, tackling a data mining competition is no different than tackling any analytical challenge. You must start with a solid conceptual grasp of the problem and the actual objectives, and the nature and limitations of the data. Following that comes feature extraction, the selection of a modeling strategy (or strategies), and then extensive experimentation to learn what works best.

Ajay- I know you have created your own software. But is there other software that you use or like to use?

 Dan- For analytics we frequently test open source software to make sure that our tools will in fact deliver the superior performance we advertise. In general, if a problem clearly requires technology other than that offered by Salford, we advise clients to seek other consultants expert in that other technology.

 Ajay- Your software is installed at 3500 sites including 400 universities as per http://www.salford-systems.com/company/aboutus/index.html What is the key to managing and keeping so many customers happy?

Dan- First, we have taken great pains to make our software reliable and we make every effort to avoid problems related to bugs. Our testing procedures are extensive and we have experts dedicated to stress-testing software. Second, our interface is designed to be natural, intuitive, and easy to use, so the challenges to the new user are minimized. Also, clear documentation, help files, and training videos round out how we allow the user to look after themselves. Should a client need to contact us, we try to achieve 24-hour turnaround on tech support issues and monitor all tech support activity to ensure timeliness, accuracy, and helpfulness of our responses. WebEx/GoToMeeting and other internet-based contact permit real-time interaction.

 Ajay- What do you do to relax and unwind?

 Dan- I am in the gym almost every day combining weight and cardio training. No matter how tired I am before the workout I always come out energized so locating a good gym during my extensive travels is a must. I am also actively learning Portuguese so I look to watch a Brazilian TV show or Portuguese dubbed movie when I have time; I almost never watch any form of video unless it is available in Portuguese.

 Biography-

http://www.salford-systems.com/blog/dan-steinberg.html

Dan Steinberg, President and Founder of Salford Systems, is a well-respected member of the statistics and econometrics communities. In 1992, he developed the first PC-based implementation of the original CART procedure, working in concert with Leo Breiman, Richard Olshen, Charles Stone and Jerome Friedman. In addition, he has provided consulting services on a number of biomedical and market research projects, which have sparked further innovations in the CART program and methodology.

Dr. Steinberg received his Ph.D. in Economics from Harvard University, and has given full day presentations on data mining for the American Marketing Association, the Direct Marketing Association and the American Statistical Association. After earning a PhD in Econometrics at Harvard, Steinberg began his professional career as a Member of the Technical Staff at Bell Labs, Murray Hill, and then served as Assistant Professor of Economics at the University of California, San Diego. A book he co-authored on Classification and Regression Trees was awarded the 1999 Nikkei Quality Control Literature Prize in Japan for excellence in statistical literature promoting the improvement of industrial quality control and management.

His consulting experience at Salford Systems has included complex modeling projects for major banks worldwide, including Citibank, Chase, American Express, Credit Suisse, and has included projects in Europe, Australia, New Zealand, Malaysia, Korea, Japan and Brazil. Steinberg led the teams that won first place awards in the KDDCup 2000, and the 2002 Duke/TeraData Churn modeling competition, and the teams that won awards in the PAKDD competitions of 2006 and 2007. He has published papers in economics, econometrics, computer science journals, and contributes actively to the ongoing research and development at Salford.

Interview Jaime Fitzgerald President Fitzgerald Analytics

Here is an interview with noted analytics expert Jaime Fitzgerald, of Fitzgerald Analytics.

Ajay-Describe your career journey from being a Harvard economist to being a text analytics thought leader.

 Jaime- I was attracted to economics because of the logic, the structured and systematic approach to understanding the world and to solving problems. In retrospect, this is the same passion for logic in problem solving that drives my business today.

About 15 years ago, I began working in consulting and initially took a traditional career path. I worked for well-known strategy consulting firms including First Manhattan Consulting Group, Novantas LLC, Braun Consulting, and for the former Japan-focused division of Deloitte Consulting, which had spun off as an independent entity. I was the only person in their New York City office for whom Japanese was not the first language.

While I enjoyed traditional consulting, I was especially passionate about the role of data, analytics, and process improvement. In traditional strategy consulting, these are important factors, but I had a vision for a “next generation” approach to strategy consulting that would be more transparent, more robust, and more focused on the role that information, analysis, and process plays in improving business results. I often explain that while my firm is “not your father’s consulting model,” we have incorporated key best practices from traditional consulting, and combined them with an approach that is more data-centric, technology-centric, and process-centric.

At the most fundamental level, I was compelled to found Fitzgerald Analytics more than six years ago by my passion for the role information plays in improving results, and ultimately improving lives. In my vision, data is an asset waiting to be transformed into results, including profit as well as other results that matter deeply to people. For example, one of the most fulfilling aspects of our work at Fitzgerald Analytics is our support of non-profits and social entrepreneurs, whom we help increase their scale and their success in achieving their goals.

Ajay- How would you describe analytics as a career option to future students? What do you think are the most essential qualities an analytics career requires?

Jaime- My belief is that analytics will be a major driver of job-growth and career growth for decades. We are just beginning to unlock the full potential of analytics, and already the demand for analytic talent far exceeds the supply.

To succeed in analytics, the most important quality is logic. Many people believe that math or statistical skills are the most important quality, but in my experience, the most essential trait is what I call “ThoughtStyle” — critical thinking, logic, an ability to break down a problem into components, into sub-parts.

Ajay- What are your favorite techniques and methodologies in text analytics? How do you see social media and Big Data analytics as components of text analytics?

Jaime- We do a lot of work for our clients measuring Customer Experience, by which I mean the experience customers have when interacting with our clients. For example, we helped a major brokerage firm to measure 12 key “Moments that Matter,” including the operational aspects of customer service, customer satisfaction and sentiment, and ultimately customer behavior. Clients care about this a lot, because customer experience drives customer loyalty, which in turn drives customer behavior and customer profitability.

Text analytics plays a key role in these projects because much of our data on customer sentiment comes via unstructured text data. For example, we have access to call center transcripts and notes, to survey responses, and to social media comments.

We use a variety of methods, some of which I’m not in a position to describe in great detail. But at a high level, I would say that our favorite text analytics methodologies are “hybrid solutions” which use a two-step process to answer key questions for clients:

Step 1: convert unstructured data into key categorical variables (for example, using contextual analysis to flag users who are critical vs. neutral vs. advocates)

Step 2: linking sentiment categories to customer behavior and profitability (for example, linking customer advocacy and loyalty with customer profits as well as referral volume, to define the ROI that clients accrue for customer satisfaction improvements)
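The two steps above can be sketched in Python. This is a deliberately simple stand-in, not Fitzgerald Analytics' method: a keyword count replaces real contextual analysis, and the word lists, function names, comments and profit figures are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists standing in for real contextual analysis.
POSITIVE = {"great", "recommend", "love", "helpful"}
NEGATIVE = {"terrible", "slow", "cancel", "frustrated"}


def sentiment_category(comment):
    """Step 1: turn free text into a categorical variable."""
    words = set(re.findall(r"[a-z']+", comment.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "advocate"
    if score < 0:
        return "critical"
    return "neutral"


def average_profit_by_category(customers):
    """Step 2: link each sentiment category to a business outcome."""
    profits = defaultdict(list)
    for comment, profit in customers:
        profits[sentiment_category(comment)].append(profit)
    return {category: sum(values) / len(values) for category, values in profits.items()}


# Made-up (comment, annual profit) pairs for illustration only.
CUSTOMERS = [
    ("Great service, I would recommend you", 1200.0),
    ("The app is slow and I am frustrated", 300.0),
    ("I called about my statement", 650.0),
    ("Love the helpful advisors.", 1000.0),
]

for category, profit in sorted(average_profit_by_category(CUSTOMERS).items()):
    print(category, profit)
```

Comparing average profit across categories is the simplest possible link between sentiment and value; a real ROI estimate would also need to control for other customer characteristics.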

Ajay- Describe your consulting company- Fitzgerald Analytics and some of the work that you have been engaged in.

 Jaime- Our mission is to “illuminate reality” using data and to convert Data to Dollars for our clients. We have a track record of doing this well, with concrete and measurable results in the millions of dollars. As a result, 100% of our clients have engaged us for more than one project: a 100% client loyalty rate.

Our specialties–and most frequent projects–include customer profitability management projects, customer segmentation, customer experience management, balanced scorecards, and predictive analytics. We are often engaged to address high-stakes analytic questions, including issues that help to set long-term strategy. In other cases, clients hire us to help them build their internal capabilities. We have helped build several brand new analytic teams for clients, which continue to generate millions of dollars of profits with their fact-based recommendations.

Our methodology is based on Stephen Covey’s principle: “begin with the end in mind,” the concept of starting with the client’s goal and working backwards from there. I often explain that our methods are what you would have gotten if Stephen Covey had been a data analyst…we are applying his principles to the world of data analytics.

Ajay- Analytics requires more and more data while privacy requires the least possible data. What do you think are the guidelines that need to be built for sharing internet browsing and user activity data, and do we need regulations just like we do for sharing financial data?

 Jaime- Great question. This is an essential challenge of the big data era. My perspective is that firms who depend on user data for their analysis need to take responsibility for protecting privacy by using data management best practices. Best practices to adequately “mask” or remove private data exist…the problem is that these best practices are often not applied. For example, Facebook’s practice of sharing unique user IDs with third-party application companies has generated a lot of criticism, and could have been avoided by applying data management best practices which are well known among the data management community.

If I were able to influence public policy, my recommendation would be to adopt a core set of simple but powerful data management standards that would protect consumers from perhaps 95% of the privacy risks they face today. The number one standard would be to prohibit sharing of static, personally identifiable user IDs between companies in a manner that creates “privacy risk.” Companies can track unique customers without using a static ID…they need to step up and do that.

Ajay- What is your favorite text analytics software to work with?

Jaime- Because much of our work is deeply embedded into client operations and systems, we often use the software our clients already prefer. We avoid recommending specific vendors unless our client requests it. In tandem with our clients and alliance partners, we have particular respect for Autonomy, Open Text, Clarabridge, and Attensity.

Biography-

http://www.fitzgerald-analytics.com/jaime_fitzgerald.html

The Founder and President of Fitzgerald Analytics, Jaime has developed a distinctively quantitative, fact-based, and transparent approach to solving high stakes problems and improving results.  His approach enables translation of Data to Dollars™ using methodologies clients can repeat again and again.  He is equally passionate about the “human side of the equation,” and is known for his ability to link the human and the quantitative, both of which are needed to achieve optimal results.

Experience: During more than 15 years serving clients as a management strategy consultant, Jaime has focused on customer experience and loyalty, customer profitability, technology strategy, information management, and business process improvement.  Jaime has advised market-leading banks, retailers, manufacturers, media companies, and non-profit organizations in the United States, Canada, and Singapore, combining strategic analysis with hands-on implementation of technology and operations enhancements.

Career History: Jaime began his career at First Manhattan Consulting Group, specialists in financial services, and was later a Co-Founder at Novantas, the strategy consultancy based in New York City.  Jaime was also a Manager for Braun Consulting, now part of Fair Isaac Corporation, and for Japan-based Abeam Consulting, now part of NEC.

Background: Jaime is a graduate of Harvard University with a B.A. in Economics.  He is passionate and supportive of innovative non-profit organizations, their effectiveness, and the benefits they bring to our society.

Upcoming Speaking Engagements: Jaime is a frequent speaker on analytics, information management strategy, and data-driven profit improvement. He recently gave keynote presentations on Analytics in Financial Services for The Data Warehousing Institute, the New York Technology Council, and the Oracle Financial Services Industry User Group. A list of Jaime’s most interesting presentations on analytics can be found here.

He will be presenting a client case study this fall at Text Analytics World re:   “New Insights from ‘Big Legacy Data’: The Role of Text Analytics” 

Connecting with Jaime: Jaime can be found on LinkedIn and Twitter. He edits the Fitzgerald Analytics Blog.

Credit Downgrade of USA and Triple A Whining

As a person trained, deployed and often asked to comment on macroeconomic shenanigans, I have the following observations to make on the downgrade of US debt by S&P.

1) A credit rating is both a mathematical exercise of debt versus net worth and a judgment of the intention to repay. Given the recent deadlock in the United States legislature on the debt ceiling, it is natural and correct to assume that holding US debt is slightly more risky in 2011 than in 2001. That means if US debt was AAA in 2001, it sure is slightly more risky in 2011.

2) Politicians are criticized the world over in democracies including India, UK and US. This is natural, healthy and enforced through the checks and balances in each country's constitution. At the time of writing this, there are protests in India on corruption, in UK on economic disparities, in US on debt vs tax vs spending, and in Israel on inflation. It is the maturity of the media as well as the average educational level of the citizenry that amplifies and inflames or dampens sentiment regarding policy and business.

3) Conspicuous consumption has failed both at an environmental and economic level. Cheap debt to buy things you do not need may have made good macro economic sense as long as the things were made by people locally but that is no longer the case. Outsourcing is not all evil, but it sure is not a perfect solution to economics and competitiveness. Outsourcing is good or outsourcing is bad- well it depends.

4) In 1944, the US took debt to fight Nazism, build atomic power and generally wage a lot of war and lots of dual use inventions. In 2004-2010 the US took debt to fight wars in Iraq, Afghanistan and bail out banks and automobile companies. Some erosion in the values represented by a free democracy has taken place, much to the delight of authoritarian regimes (who have managed to survive Google and Facebook).

5) A Double A rating is still quite a good rating. No one is moving out of US Treasuries. I mean seriously, what are your alternative financial resources to park your government or central bank assets: euro, gold, oil, rare earth futures, metals or yen??

6) Income disparity as a trigger for social unrest in UK, France and other parts is an ominous looming threat that may lead to more action than the poor maths of S&P. It has been some time since riots occurred in the United States and I believe in time series and cycles, especially given the rising Gini coefficients.

Gini indices for the United States at various times, according to the US Census Bureau:[8][9][10]

  • 1929: 45.0 (estimated)
  • 1947: 37.6 (estimated)
  • 1967: 39.7 (first year reported)
  • 1968: 38.6 (lowest index reported)
  • 1970: 39.4
  • 1980: 40.3
  • 1990: 42.8
    • (Recalculations made in 1992 added a significant upward shift for later values)
  • 2000: 46.2
  • 2005: 46.9
  • 2006: 47.0 (highest index reported)
  • 2007: 46.3
  • 2008: 46.69
  • 2009: 46.8
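For readers unfamiliar with the measure, here is a minimal Python sketch of how a Gini index can be computed from a list of incomes, using the mean absolute difference form and scaling to 0-100 as in the figures above. The function name and sample incomes are made up for illustration; official figures are computed from survey data with their own adjustments, which this sketch does not attempt.

```python
def gini_index(incomes):
    """Gini index on a 0-100 scale: 0 is perfect equality.

    Uses G = sum_i sum_j |x_i - x_j| / (2 * n^2 * mean), multiplied by 100.
    Assumes non-negative incomes with a positive mean.
    """
    n = len(incomes)
    mean = sum(incomes) / n
    total_gap = sum(abs(a - b) for a in incomes for b in incomes)
    return 100.0 * total_gap / (2 * n * n * mean)


# Hypothetical incomes, not real data.
print(gini_index([10, 10, 10, 10]))   # everyone earns the same
print(gini_index([0, 0, 0, 1]))       # one person earns everything
```

With only four people the second case gives 75 rather than 100, because the index for a single earner among n people is 100 * (1 - 1/n).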

7) Again I am slightly suspicious of an American corporation downgrading American governmental debt when it failed to reconcile numbers by 2 trillion and famously managed to avoid downgrading Lehman Brothers. What are the political affiliations of the S&P board? What are their backgrounds? Check the facts, Watson.

The Chinese government should be concerned if it is holding >1000 tonnes of gold and >1 trillion of US Treasuries, lest we have a third opium war (as either gold or US Treasuries will burst). Opium in 1850, like US Treasuries in 2010, has no inherent value except for those addicted to it.

8) Ron Paul and Paul Krugman are the two extremes of economic ideology in the US.

Reminds me of the old saying- Robbing Peter to pay Paul. Both the Pauls seem equally unhappy and biased.

I have to read both WSJ and NYT to make sense of what actually is happening in the US as opinionated journalism has managed to elbow out fact based journalism. Do we need analytics in journalism education/ reporting?

9) Panic buying and selling would lead to short term arbitrage positions. People like W. Buffett made more money in the crash of 2008 than people did in the boom years of 2006-7.

If stocks are cheap, buy on the dips. Acquire companies before they go for IPOs. Go buy your own stock if you are sitting on a pile of cash. Buy some technology patents in cloud, mobile, tablet and statistical computing if you have a lot of cash and need to buy some long term assets.

10) Follow all advice above at own risk and no liability to this author 😉

 

Newer Doctrines for Newer Wars

On Memorial Day, some thoughts on the convergence of revolutions in technology and warfare-

 

War – 

War is an openly declared state of organized conflict, typified by extreme aggression, societal disruption, and high mortality

1) Disrupting command and control objects is the primary stage of attack. Evading detection of your own command and control objects while retaining secure channels of communication with redundant lines of control is the primary stage of defense.

2) Pre-emptive strikes are in. Reactive all out wars are out. Countries will no longer “declare war” before going to war. They already don't.

3) Commando/Special Forces/terror strike/guerrilla warfare weapons, tactics and technology will be in big demand. So will specialist trainers.

4) Improving the predictability of your own detect and destroy mechanisms, and disrupting the predictability of enemy detect and react mechanisms will be hugely in- even more than commissioning one more submarine and one more aircraft type.

5) Countries will revert to ancient tribal paradigms in fast shifting alliances for economics as well as geopolitics. Very stupidly, religion can be a factor in warfare even in the 21st century.

 

6) Number of kills per weapon fired will converge to a constant. Risks of secondary collateral damage will need to have a higher weightage because they spur more retaliatory attacks. Fewer prisoners of war, higher KIA/MIA ratio.

7) Fewer civilian casualties than all previous wars. This includes fewer civilian casualties even in nuclear war than previous nuclear scenarios.

8) War is a business. It will not be allowed to disrupt global supply chains for more than 2-3 weeks (or inventory replenishment of critical goods and/or services). Commodities will lead to wars explicitly, especially since nuclear energy is discredited and carbon energy is diminishing. Expect synchronization with financial derivatives activity. War futures, anyone?

9) The Geneva Convention is overdue for an update. Call it Geneva Convention 3.0. The United Nations will remain critical to preventing or hastening global conflicts (remember the League of Extraordinary Nations).

10) Economic weapons, climate changing weapons, and sky weapons will emerge. Expect newer kinds of gunpowder to be invented. Cyber weapons and hackers will be in demand. That's the only bright spot.

Happy Memorial Day.

 

Enjoy that freedom to eat a barbecue- it was paid for in more blood than you will ever care to know.

 


Free and Open Source cannot get basic economics correct


Before you rev up those keyboards, and shoot off a snarky comment- consider this statement- there are many ways to run (and ruin) economies. But they still have not found a replacement for money. Yes, happiness is important. Search Engine is good.

So unless they start a new branch of economics with lots more motivational theory and psychology and a lot less quant, especially for open source projects, money, revenue and sales are the only true measure of success in enterprise software. Particularly if you have competitors who are making more money selling the same class of software.

Popularity contests are for high school quarterbacks, so even if your open source software is popular in downloads, email discussions, stack overflow or …

Interview Luis Torgo Author Data Mining with R


Here is an interview with Prof Luis Torgo, author of the recent best seller “Data Mining with R-learning with case studies”.

Ajay- Describe your career in science. How do you think more young people can be made interested in science?

Luis- My interest in science only started after I finished my degree. I entered a research lab at the University of Porto and started working on Machine Learning, around 1990. Since then I’ve been involved generally in data analysis topics both from a research perspective as well as from a more applied point of view through interactions with industry partners on several projects. I spent most of my career at the Faculty of Economics of the University of Porto, but since 2008 I’ve been at the department of Computer Science of the Faculty of Sciences of the same university. At the same time I’ve been a researcher at LIAAD / Inesc Porto LA (www.liaad.up.pt).

I like a lot what I do and like science and the “scientific way of thinking”, but I cannot say that I’ve always thought of this area as my “place”. Most of all I like solving challenging problems through data analysis. If that translates into some scientific outcome then I’m more satisfied, but that is not my main goal, though I’m kind of “forced” to think about that because of the constraints of an academic career.

That does not mean I’m not passionate about science, I just think there are many more ways of “doing science” than what is reflected in the usual “scientific indicators” that most institutions seem to be more and more obsessed about.

As for getting young people interested in science, that is a hard question that I’m not sure I’m qualified to answer. I do tend to think that young people are more responsive to concrete examples of problems they think are interesting and that science helps in solving, as a way of finding a motivation for facing the hard work they will encounter in a scientific career. I do believe in case studies as a nice way to learn and motivate, and thus my book 😉

Ajay- Describe your new book “Data Mining with R, learning with case studies”. Why did you choose a case study based approach? Who is the target audience? What is your favorite case study from the book?

Luis- This book is about learning how to use R for data mining. The book follows a “learn by doing it” approach to data mining instead of the more common theoretical description of the available techniques in this discipline. This is accomplished by presenting a series of illustrative case studies for which all necessary steps, code and data are provided to the reader. Moreover, the book has an associated web page (www.liaad.up.pt/~ltorgo/DataMiningWithR) where all code inside the book is given so that easy copy-paste is possible for the more lazy readers.

The language used in the book is very informal without many theoretical details on the used data mining techniques. For obtaining these theoretical insights there are already many good data mining books some of which are referred in “further readings” sections given throughout the book. The decision of following this writing style had to do with the intended target audience of the book.

In effect, the objective was to write a monograph that could be used as a supplemental book for practical classes on data mining that exist in several courses, but at the same time that could be attractive to professionals working on data mining in non-academic environments, and thus the choice of this more practically oriented approach.

As for my favorite case study, that is a hard question for an author… still I would probably choose the “Predicting Stock Market Returns” case study (Chapter 3). Not only because I like this challenging problem, but mainly because the case study addresses all aspects of knowledge discovery in a real world scenario and not only the construction of predictive models. It tackles data collection, data pre-processing, model construction, transforming predictions into actions using different trading policies, using business-related performance metrics, implementing a trading simulator for “real-world” evaluation, and laying out grounds for constructing an online trading system.

Obviously, for all these steps there are far too many options to describe and evaluate all of them in one chapter. Still, I do believe that it is important for the reader to see the overall picture, and to read about the relevant questions on this problem and some possible paths that can be followed at these different steps.

In other words: do not expect to become rich with the solution I describe in the chapter !
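The step of turning predictions into actions and scoring them with a business metric can be sketched as follows. The book's own code is in R; this is a separate Python illustration, and the policy, the threshold, the stake and the numbers are all hypothetical, with trading costs ignored.

```python
def simulate_threshold_policy(predicted_returns, actual_returns, threshold=0.01, stake=1000.0):
    """Toy trading simulator.

    Policy: go long when the predicted return exceeds the threshold, go short
    when it is below minus the threshold, otherwise stay out of the market.
    Returns the total profit and the number of trades opened.
    """
    profit = 0.0
    trades = 0
    for predicted, actual in zip(predicted_returns, actual_returns):
        if predicted > threshold:
            profit += stake * actual   # long position gains when the price rises
            trades += 1
        elif predicted < -threshold:
            profit -= stake * actual   # short position gains when the price falls
            trades += 1
    return profit, trades


# Made-up model predictions and realised returns for four periods.
predicted = [0.02, -0.03, 0.005, 0.015]
actual = [0.01, -0.02, 0.03, -0.01]

profit, trades = simulate_threshold_policy(predicted, actual)
print(f"trades={trades}, profit={profit:.2f}")
```

The same predictions can look good on a statistical error measure and still lose money under a given policy, which is why the evaluation is done on profit rather than on prediction error alone.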

Ajay- Apart from R, what other data mining software do you use or have used in the past? How would you compare their advantages and disadvantages with R?

Luis- I’ve played around with Clementine, Weka, RapidMiner and Knime, but really only for teaching purposes, with no serious use or evaluation in the context of data mining projects. For the latter I mainly use R or software developed by myself (either in R or other languages). In this context, I do not think it is fair to compare R with these or other tools as I lack serious experience with them. I can, however, tell you about what I see as the main pros and cons of R. The main reason for using R is not only the power of the tool, which never stops surprising me in terms of what already exists and keeps appearing as contributions of an ever growing community, but mainly the ability to rapidly transform ideas into prototypes. As for its drawbacks, I would probably mention the lack of efficiency when compared to other alternatives and the problem of data set sizes being limited by main memory.

I know that there are several efforts around for solving this latter issue not only from the community (e.g. http://cran.at.r-project.org/web/views/HighPerformanceComputing.html), but also from the industry (e.g. Revolution Analytics), but I would prefer that at this stage this would be a standard feature of the language so that the “normal” user need not worry about it. But then this is a community effort and if I’m not happy with the current status instead of complaining I should do something about it!

Ajay- Describe your writing habit. How do you set about writing the book: did you write a fixed amount daily or do you write in bursts?

Luis- Unfortunately, I write in bursts whenever I find some time for it. This is much more tiring and time consuming as I need to read back material far too often, but I cannot afford dedicating too much consecutive time to a single task. Actually, I frequently tease my PhD students when they “complain” about the lack of time for doing what they have to, that they should learn to appreciate the luxury of having a single task to complete because it will probably be the last time in their professional life!

Ajay- What do you do to relax or unwind when not working?

Luis- For me, the best way to relax from work is by playing sports. When I’m involved in some game I reset my mind and forget about all other things and this is very relaxing for me. Apart from sports, I enjoy spending time with my family and friends a lot. A good and long dinner with friends over a good bottle of wine can do miracles when I’m too stressed with work! Finally, I do love traveling around with my family.

Luis Torgo

Short Bio: Luis Torgo has a degree in Systems and Informatics Engineering and a PhD in Computer Science. He is an Associate Professor of the Department of Computer Science of the Faculty of Sciences of the University of Porto. He is also a researcher of the Laboratory of Artificial Intelligence and Data Analysis (LIAAD) belonging to INESC Porto LA. Luis Torgo has been an active researcher in Machine Learning and Data Mining for more than 20 years. He has led several academic and industrial Data Mining research projects. Luis Torgo has followed the R project almost since its beginning, using it in his research activities. He teaches R at different levels and has given several courses in different countries.

For reading “Data Mining with R” – you can visit this site, also to avail of a 20% discount the publishers have generously given (message below)-

For more information and to place an order, visit us at http://www.crcpress.com.  Order online and apply 20% Off discount code 907HM at checkout.  CRC is pleased to offer free standard shipping on all online orders!

link to the book page  http://www.crcpress.com/product/isbn/9781439810187

Price: $79.95
Cat. #: K10510
ISBN: 9781439810187
ISBN 10: 1439810184
Publication Date: November 09, 2010
Number of Pages: 305
Availability: In Stock
Binding(s): Hardback