Interview Rob J Hyndman Forecasting Expert #rstats

Here is an interview with Prof Rob J Hyndman who has created many time series forecasting methods and authored books as well as R packages on the same.

Ajay- Describe your journey from being a student of science to a Professor. What were some key turning points along that journey?
 
Rob- I started a science honours degree at the University of Melbourne in 1985. By the end of 1985 I found myself simultaneously working as a statistical consultant (having completed all of one year of statistics courses!). For the next three years I studied mathematics, statistics and computer science at university, and tried to learn whatever I needed to in order to help my growing group of clients. Often we would cover things in classes that I’d already taught myself through my consulting work. That really set the trend for the rest of my career. I’ve always been an academic on the one hand, and a statistical consultant on the other. The consulting work has led me to learn a lot of things that I would not otherwise have come across, and has also encouraged me to focus on research problems that are of direct relevance to the clients I work with.
I never set out to be an academic. In fact, I thought that I would get a job in the business world as soon as I finished my degree. But once I completed the degree, I was offered a position as a statistical consultant within the University of Melbourne, helping researchers in various disciplines and doing some commercial work. After a year, I was getting bored doing only consulting, and I thought it would be interesting to do a PhD. I was lucky enough to be offered a generous scholarship which meant I was paid more to study than to continue working.
Again, I thought that I would probably go and get a job in the business world after I finished my PhD. But I finished it early and my scholarship was going to be cut off once I submitted my thesis. So instead, I offered to teach classes for free at the university and delayed submitting my thesis until the scholarship period ran out. That turned out to be a smart move because the university saw that I was a good teacher, and offered me a lecturing position starting immediately I submitted my thesis. So I sort of fell into an academic career.
I’ve kept up the consulting work part-time because it is interesting, and it gives me a little extra money. But I’ve also stayed an academic because I love the freedom to be able to work on anything that takes my fancy.
Ajay- Describe your upcoming book on Forecasting.
 
Rob- My first textbook on forecasting (with Makridakis and Wheelwright) was written a few years after I finished my PhD. It has been very popular, but it costs a lot of money (about $140 on Amazon). I estimate that I get about $1 for every book sold. The rest goes to the publisher (Wiley) and all they do is print, market and distribute it. I even typeset the whole thing myself and they print directly from the files I provided. It is now about 15 years since the book was written and it badly needs updating. I had a choice of writing a new edition with Wiley or doing something completely new. I decided to do a new one, largely because I didn’t want a publisher to make a lot of money out of students using my hard work.
It seems to me that students try to avoid buying textbooks and will search around looking for suitable online material instead. Often the online material is of very low quality and contains many errors.
As I wasn’t making much money on my textbook, and the facilities now exist to make online publishing very easy, I decided to try a publishing experiment. So my new textbook will be online and completely free. So far it is about 2/3 completed and is available at http://otexts.com/fpp/. I am hoping that my co-author (George Athanasopoulos) and I will finish it off before the end of 2012.
The book is intended to provide a comprehensive introduction to forecasting methods. We don’t attempt to discuss the theory much, but provide enough information for people to use the methods in practice. It is tied to the forecast package in R, and we provide code to show how to use the various forecasting methods.
The idea of online textbooks makes a lot of sense. They are continuously updated so if we find a mistake we fix it immediately. Also, we can add new sections, or update parts of the book, as required rather than waiting for a new edition to come out. We can also add richer content including video, dynamic graphics, etc.
For readers that want a print edition, we will be aiming to produce a print version of the book every year (available via Amazon).
I like the idea so much I’m trying to set up a new publishing platform (otexts.com) to enable other authors to do the same sort of thing. It is taking longer than I would like to make that happen, but probably next year we should have something ready for other authors to use.
Ajay- How can we make textbooks cheaper for students as well as compensate authors fairly?
 
Rob- Well free is definitely cheaper, and there are a few businesses trying to make free online textbooks a reality. Apart from my own efforts, http://www.flatworldknowledge.com/ is producing a lot of free textbooks. And textbookrevolution.org is another great resource.
With otexts.com, we will compensate authors in two ways. First, the print versions of a book will be sold (although at a vastly cheaper rate than other commercial publishers). The royalties on print sales will be split 50/50 with the authors. Second, we plan to have some features of each book available for subscription only (e.g., solutions to exercises, some multimedia content, etc.). Again, the subscription fees will be split 50/50 with the authors.
Ajay- Suppose a person who used to use forecasting software from another company decides to switch to R. How easy and lucid do you think the current documentation on the R website is for business analytics practitioners such as these in the corporate world?
 
Rob- The documentation on the R website is not very good for newcomers, but there are a lot of other R resources now available. One of the best introductions is Matloff’s “The Art of R Programming”. Provided someone has done some programming before (e.g., VBA, Python or Java), learning R is a breeze. The people who have trouble are those who have only ever used menu interfaces such as Excel. Then they are not only learning R, but learning to think about computing in a different way from what they are used to, and that can be tricky. However, it is well worth it. Once you know how to code, you can do so much more. I wish some basic programming was part of every business and statistics degree.
If you are working in a particular area, then it is often best to find a book that uses R in that discipline. For example, if you want to do forecasting, you can use my book (otexts.com/fpp/). Or if you are using R for data visualization, get hold of Hadley Wickham’s ggplot2 book.
Ajay- In a long and storied career, what is the best forecast you ever made? And the worst?
 
 Rob- Actually, my best work is not so much in making forecasts as in developing new forecasting methodology. I’m very proud of my forecasting models for electricity demand which are now used for all long-term planning of electricity capacity in Australia (see  http://robjhyndman.com/papers/peak-electricity-demand/  for the details). Also, my methods for population forecasting (http://robjhyndman.com/papers/stochastic-population-forecasts/ ) are pretty good (in my opinion!). These methods are now used by some national governments (but not Australia!) for their official population forecasts.
Of course, I’ve made some bad forecasts, but usually when I’ve tried to do more than is reasonable given the available data. One of my earliest consulting jobs involved forecasting the sales for a large car manufacturer. They wanted forecasts for the next fifteen years using less than ten years of historical data. I should have refused as it is unreasonable to forecast that far ahead using so little data. But I was young and naive and wanted the work. So I did the forecasts, and they were clearly outside the company’s (reasonable) expectations, and they then refused to pay me. Lesson learned. It’s better to refuse work than do it poorly.

Probably the biggest impact I’ve had is in helping the Australian government forecast the national health budget. In 2001 and 2002, they had underestimated health expenditure by nearly $1 billion in each year which is a lot of money to have to find, even for a national government. I was invited to assist them in developing a new forecasting method, which I did. The new method has forecast errors of the order of plus or minus $50 million which is much more manageable. The method I developed for them was the basis of the ETS models discussed in my 2008 book on exponential smoothing (www.exponentialsmoothing.net). And now anyone can use the method with the ets() function in the forecast package for R.
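For readers who have not met exponential smoothing before, here is a minimal Python sketch of simple exponential smoothing, the most basic member of that family. It is not the ETS method described above and not the ets() function from the forecast package; the function name, the demand figures and the smoothing weight are all made up for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast for `series`.

    Hypothetical helper for illustration only: each new observation
    updates a running "level" as a weighted average of the observation
    and the previous level.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        raise ValueError("series must not be empty")

    level = series[0]  # start the level at the first observation
    for y in series[1:]:
        # Higher alpha puts more weight on the most recent observation.
        level = alpha * y + (1 - alpha) * level
    return level


# Made-up demand figures, assumed purely for the example.
demand = [10.0, 12.0, 11.0, 13.0]
forecast = simple_exponential_smoothing(demand, alpha=0.5)
print(forecast)
```

A full ETS model goes further by adding optional trend and seasonal components and estimating the smoothing weights from the data, which this sketch does not attempt.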
About-
Rob J Hyndman is Professor of Statistics in the Department of Econometrics and Business Statistics at Monash University and Director of the Monash University Business & Economic Forecasting Unit. He is also Editor-in-Chief of the International Journal of Forecasting and a Director of the International Institute of Forecasters. Rob is the author of over 100 research papers in statistical science. In 2007, he received the Moran medal from the Australian Academy of Science for his contributions to statistical research, especially in the area of statistical forecasting. For 25 years, Rob has maintained an active consulting practice, assisting hundreds of companies and organizations. His recent consulting work has involved forecasting electricity demand, tourism demand, the Australian government health budget and case volume at a US call centre.

Interview Beth Schultz Editor AllAnalytics.com

Here is an interview with Beth Schultz, Editor in Chief, AllAnalytics.com.

AllAnalytics.com (http://www.allanalytics.com/) is the new online community on predictive analytics, and it's a bit different in emphasizing quality more than just quantity. Beth is a veteran of tech journalism and online communities.

Ajay- Describe your journey in technology journalism and communication. What are the other online communities that you have been involved with?

Beth- I’m a longtime IT journalist, having begun my career covering the telecommunications industry at the brink of AT&T’s divestiture — many eons ago. Over the years, I’ve covered the rise of internal corporate networking; the advent of the Internet and creation of the Web for business purposes; the evolution of Web technology for use in building intranets, extranets, and e-commerce sites; the move toward a highly dynamic next-generation IT infrastructure that we now call cloud computing; and development of myriad enterprise applications, including business intelligence and the analytics surrounding them. I have been involved in developing online B2B communities primarily around next-generation enterprise IT infrastructure and applications. In addition, Shawn Hessinger, our community editor, has been involved in myriad Web sites aimed at creating community for small business owners.

 Ajay- Technology geeks get all the money while journalists get a story. Comments please

Beth- Great technology geeks — those being the ones with technology smarts as well as business savvy — do stand to make a lot of money. And some pursue that to all ends (with many entrepreneurs gunning for the acquisition) while others more or less fall into it. Few journalists, at least few tech journalists, have big dollars in mind. The gratification for journalists comes in being able to meet these folks, hear and deliver their stories — as appropriate — and help explain what makes this particular technology geek developing this certain type of product or service worth paying attention to.

Ajay- Describe what you are trying to achieve with the All Analytics community and how it seeks to differentiate itself from other players in this space.

Beth- With AllAnalytics.com, we’re concentrating on creating the go-to site for CXOs, IT professionals, line-of-business managers, and other professionals to share best practices, concrete experiences, and research about data analytics, business intelligence, information optimization, and risk management, among many other topics. We differentiate ourselves by featuring excellent editorial content from a top-notch group of bloggers, access to industry experts through weekly chats, ongoing lively and engaging message board discussions, and biweekly debates.

We’re a new property, and clearly in rapid building mode. However, we’ve already secured some of the industry’s most respected BI/analytics experts to participate as bloggers. For example, a small sampling of our current lineup includes the always-intriguing John Barnes, a science fiction novelist and statistics guru; Sandra Gittlen, a longtime IT journalist with an affinity for BI coverage; Olivia Parr-Rud, an internationally recognized expert in BI and organizational alignment; Tom Redman, a well-known data-quality expert; and Steve Williams, a leading BI strategy consultant. I blog daily as well, and in particular love to share firsthand experiences of how organizations are benefiting from the use of BI, analytics, data warehousing, etc. We’ve featured inside looks at analytics initiatives at companies such as 1-800-Flowers.com, Oberweis Dairy, the Cincinnati Zoo & Botanical Garden, and Thomson Reuters, for example.

In addition, we’ve hosted instant e-chats with Web and social media experts Joe Stanganelli and Pierre DeBois, and this Friday, Aug. 26, at 3 p.m. ET we’ll be hosting an e-chat with Marshall Sponder, Web metrics guru and author of the newly published book, Social Media Analytics: Effective Tools for Building, Interpreting, and Using Metrics. (Readers interested in participating in the chat do need to fill out a quick registration form, available here: http://www.allanalytics.com/register.asp . The chat is available here: http://www.allanalytics.com/messages.asp?piddl_msgthreadid=241039&piddl_msgid=439898#msg_439898 .)

Experts participating in our biweekly debate series, called Point/Counterpoint, have broached topics such as BI in the cloud, mobile BI and whether an analytics culture is truly possible to build.

Ajay- What are some tips about writing tech stories that you would like to share with aspiring bloggers?

Beth- I suppose my best advice is this: Don’t write about technology for technology’s sake. Always strive to tell the audience why they should care about a particular technology, product, or service. How might a reader use it to his or her company’s advantage, and what are the potential benefits? Improved productivity, increased revenue, better customer service? Providing anecdotal evidence goes a long way toward delivering that message, as well.

Ajay- What are the other IT world websites that have made a mark on the internet?

Beth- I’d be remiss if I didn’t give a shout out to UBM TechWeb sites, including InformationWeek, which has long charted the use of IT within the enterprise; Dark Reading, a great source for folks interested in securing an enterprise’s information assets; and Light Reading, which takes the pulse of the telecom industry.

 Biography- 

Beth Schultz has more than two decades of experience as an IT writer and editor. Most recently, she brought her expertise to bear writing thought-provoking editorial and marketing materials on a variety of technology topics for leading IT publications and industry players. Previously, she oversaw multimedia content development, writing and editing for special feature packages at Network World. Beth has a keen ability to identify business and technology trends, developing expertise through in-depth analysis and early-adopter case studies. Over the years, she has earned more than a dozen national and regional editorial excellence awards for special issues from American Business Media, American Society of Business Press Editors, Folio.net, and others.

 

Machine Learning Contest

New Contest at http://www.ecmlpkdd2011.org/dcOverview.php

 

 

Discovery Challenge Overview


 

General description: tasks and dataset

VideoLectures.net is a free and open access multimedia repository of video lectures, mainly of research and educational character. The lectures are given by distinguished scholars and scientists at the most important and prominent events like conferences, summer schools, workshops and science promotional events from many fields of Science. The portal is aimed at promoting science, exchanging ideas and fostering knowledge sharing by providing high quality didactic contents not only to the scientific community but also to the general public. All lectures, accompanying documents, information and links are systematically selected and classified through the editorial process taking into account also users’ comments.

The ECML-PKDD 2011 Discovery Challenge is organized in order to improve the website’s current recommender system. The challenge consists of two main tasks and a “side-by” contest. The provided data is for both of the tasks, and it is up to the contestants how it will be used for learning (building up) a recommender.

Due to the nature of the problem, each of the tasks has its own merit: task 1 simulates new-user and new-item recommendation (cold-start mode), task 2 simulates clickstream based recommendation (normal mode).

Viva Libre Office


The Document Foundation is happy to announce the release candidate of
LibreOffice 3.3.1. This release candidate is the first in a series of
frequent bugfix releases on top of our LibreOffice 3.3 product. Please
be aware that LibreOffice 3.3.1 RC1 is not yet ready for production
use; you should continue to use LibreOffice 3.3 for that.

http://listarchives.documentfoundation.org/www/announce/msg00028.html

Following is the list of changes against LibreOffice 3.3:

Key changes at a glance:

* Numerous translation updates
* new mimetype icons for LibreOffice – explained here:
http://luxate.blogspot.com/2011/01/not-even-included-but-already-improved.html
* quite a few crasher fixes

Detailed change log:

* translation updates
* Removed old/unmaintained icon themes
* Fix for https://bugzilla.novell.com/show_bug.cgi?id=664516: Don’t
use a reference or the default formula string will be changed
* Install bash completion for oo* wrappers when enabled
(https://bugzilla.novell.com/show_bug.cgi?id=665402)
* Build fix: get the stlport compat workaround working for gcc 4.6.0
* Build fix: no ddraw.h or ddraw.lib in the June 2010 DirectX SDK,
removed usage
* Windows installer: padded nologobanner.bmp, new size is 102×58
* removed gd – Gaelic, ky – Kirghiz, pap – Papiamento, ti – Tigrinya,
ms – Malay, ps – Pashto, ur – Urdu. UI localization does not exist
in these languages. So it makes no sense to ship packages.
* Build fix: pass thru PYTHON, found by configure. Will be used by
filter/source/config/fragments/makefile.mk.
* Upgraded libwpd (WordPerfect filter) to 0.9.1
* Fixed BrOffice Windows start menu branding
* Removed language code ‘kid’. kid is not Koshin, but key id pseudo
language which is good for debugging UI but should not be included
in the product
* Added ca_XV and ast language/local name and description
* Fixed incorrect page number in page preview mode
(https://bugs.freedesktop.org/show_bug.cgi?id=33155). When the
window is large enough to show several ‘Page X’ strings,
the page number was not properly incremented.
* Fixed incorrect import of cell attributes from Excel
documents. When a cell with non-default formatting attribute starts
with non-first row in a column, the filter would incorrectly apply
the same format to all the cells above it if they didn’t have any
formats.
* Ubuntu: fix for lp#696527 – enable human icon theme in LibreOffice
* Fix for https://bugzilla.redhat.com/show_bug.cgi?id=673819 crash on
changing position of drawing object in header.
* Changed OpenOffice.org to LibreOffice in nsplugin
* Added Occitan dictionary
* Added Ukrainian dictionaries
* Fix window focus for langpack installation on Mac –
https://bugs.freedesktop.org/show_bug.cgi?id=33056
* Added/modified NLPsolver translations from Pootle
* Fix for https://bugzilla.novell.com/show_bug.cgi?id=655763
* Fix for RTF export crasher
(https://bugzilla.novell.com/show_bug.cgi?id=656503)
* Use LibreOffice as product name for EPS Creator header
* Parse svg ‘color’ property (fixes
https://bugs.freedesktop.org/show_bug.cgi?id=33551)
* Use double instead of float in writerfilter import
* Build fix: use PYTHON as passed through by set_soenv.in.
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33237 remove
debug line
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33237 – fixes
ole object import for writer (docx)
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33249
rename OOo -> LibO on Getting Support Page
* Fix ooxml import: handle css::table::BorderLine in addition to
css::table::BorderLine2 That means that table cell properties are
correctly set on import again.
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33258
wikihelp: Improve the check for existence of the localized help.
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33994 – fixes
several crashes around config UNO API
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=30879
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=32872
Implementation names weren’t matching with xcu.
* Fix: don’t pushback and process a corrupt extension
* Fix: wikihelp – do not check for existence of the localized
help. In case we do not have the help installed, it is up to the
online service to decide the fallback in case a language version is
not available.
* Fix README: change su urpmi to sudo urpmi for Mandriva section
* Fix README formatting –
https://bugs.freedesktop.org/show_bug.cgi?id=32741 – using CRLF
instead of LF on WIN platform
* Fix README: word wrap at column 75 for better readability
* Build fix: KDE3 library search order
(https://bugs.freedesktop.org/show_bug.cgi?id=32797). Use LINKFLAGS
instead of STDLIBS.
* Start using technical.dic instead of oracle.dic
(https://bugs.freedesktop.org/show_bug.cgi?id=31798)
* Build fix: add explicit QRegion* for clipRegion to fix compile of
kde backend
* Cleanup: removed obsolete m_bSingleAltPress
* Remove the menu when Left Alt Key was pressed for GTK
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33459: use
year of era in long format for zh_TW by default
* Fix wrong collation for Catalan language
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=31271 wrong
line break with “(”
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=32561 – crash
when iterating over the database types.
* Default currency for Estonia should be Euro – fixes
https://bugs.freedesktop.org/show_bug.cgi?id=33160
* Avoid a pointless GetHelpText() call in the toolbox. Fixes
https://bugs.freedesktop.org/show_bug.cgi?id=33315. GetHelpText()
can be quite heavy, see
https://bugs.freedesktop.org/show_bug.cgi?id=33088.
* Paint toolbar handle positioned properly
(https://bugs.freedesktop.org/show_bug.cgi?id=32558)
* Build fix: move cxxabi.h after stl headers to workaround gcc 4.6.0
and stlport
* Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33355
manipulate also the C runtime’s environment
* Fix for CTL/Other Default Font #i25247#, #i25561#, #i48064#,
#i92341#
* RTF export crasher
(https://bugzilla.novell.com/show_bug.cgi?id=656503)
* Fixed an infinite loop in RTF exporter
* UI: translations need more space on word count dialog, made space
for it.
* Fix for https://bugzilla.novell.com/show_bug.cgi?id=660816 improve
formfield checkbox binary export (and import)

Again a BIG Thank You!

Again, what's LibreOffice?

What does LibreOffice give you?

Writer is the word processor inside LibreOffice. Use it for everything, from dashing off a quick letter to producing an entire book with tables of contents, embedded illustrations, bibliographies and diagrams. The while-you-type auto-completion, auto-formatting and automatic spelling checking make difficult tasks easy (but are easy to disable if you prefer). Writer is powerful enough to tackle desktop publishing tasks such as creating multi-column newsletters and brochures. The only limit is your imagination.

Calc tames your numbers and helps with difficult decisions when you’re weighing the alternatives. Analyze your data with Calc and then use it to present your final output. Charts and analysis tools help bring transparency to your conclusions. A fully-integrated help system makes easier work of entering complex formulas. Add data from external databases such as SQL or Oracle, then sort and filter them to produce statistical analyses. Use the graphing functions to display a large number of 2D and 3D graphics from 13 categories, including line, area, bar, pie, X-Y, and net – with the dozens of variations available, you’re sure to find one that suits your project.

Impress is the fastest and easiest way to create effective multimedia presentations. Stunning animation and sensational special effects help you convince your audience. Create presentations that look even more professional than the standard presentations you commonly see at work. Get your colleagues’ and bosses’ attention by creating something a little bit different.

Draw lets you build diagrams and sketches from scratch. A picture is worth a thousand words, so why not try something simple with box and line diagrams? Or else go further and easily build dynamic 3D illustrations and special effects. It’s as simple or as powerful as you want it to be.

Base is the database front-end of the LibreOffice suite. With Base, you can seamlessly integrate your existing database structures into the other components of LibreOffice, or create an interface to use and administer your data as a stand-alone application. You can use imported and linked tables and queries from MySQL, PostgreSQL or Microsoft Access and many other data sources, or design your own with Base, to build powerful front-ends with sophisticated forms, reports and views. Support is built-in or easily addable for a very wide range of database products, notably the standardly-provided HSQL, MySQL, Adabas D, Microsoft Access and PostgreSQL.

Math is a simple equation editor that lets you lay-out and display your mathematical, chemical, electrical or scientific equations quickly in standard written notation. Even the most-complex calculations can be understandable when displayed correctly. E=mc².

LibreOffice also comes configured with a PDF file creator, meaning you can distribute documents that you’re sure can be opened and read by users of almost any computing device or operating system.

Download LibreOffice now and try it out today.

http://www.libreoffice.org/features/

 

A Software called Prezi

For an added dimension in your multimedia presentations, rather than plain old PowerPoint, take a look at Prezi (from www.Prezi.com).

I used Google Docs to create a free standard presentation, and downloaded it in PowerPoint format.

I then used Prezi to create the zooming effects.

I then used XVidCap, a screen-capture tool (a Linux counterpart to Camtasia), to record the screen.

Using Youtube’s audio swap I mixed the soundtrack on it.