Here are some broad guidelines for graphs from EIA.gov, so you could call these the official graphical guidelines of the US government.
They can be really useful for sites planning to get into the Tableau Software/NYT/Guardian infographic mode, or even for communities of blogs that have a recurring need to display graphical plots, particularly since communication, statistics and design are different areas of expertise, usually held by different people.
Energy Information Administration Standard 2009-25
Title: Statistical Graphs
Superseded Version: Standard 2002-25
Purpose: To ensure the utility (usefulness to intended users) and objectivity (accuracy, clarity, completeness, and lack of bias) of energy information presented in statistical graphs.
Applicability: All EIA information products.
Required Actions:
Graphs should be used to show and compare changes, trends and/or relationships, and to assist users in visualizing the conclusions drawn from the data represented.
A Summary report from Rexer Analytics Annual Survey
HIGHLIGHTS from the 4th Annual Data Miner Survey (2010):
• FIELDS & GOALS: Data miners work in a diverse set of fields. CRM / Marketing has been the #1 field in each of the past four years. Fittingly, “improving the understanding of customers”, “retaining customers” and other CRM goals are also the goals identified by the most data miners surveyed.
• ALGORITHMS: Decision trees, regression, and cluster analysis continue to form a triad of core algorithms for most data miners. However, a wide variety of algorithms are being used. This year, for the first time, the survey asked about Ensemble Models, and 22% of data miners report using them.
A third of data miners currently use text mining and another third plan to in the future.
• MODELS: About one-third of data miners typically build final models with 10 or fewer variables, while about 28% generally construct models with more than 45 variables.
• TOOLS: After a steady rise across the past few years, the open source data mining software R overtook other tools to become the tool used by more data miners (43%) than any other. STATISTICA, which has also been climbing in the rankings, is selected as the primary data mining tool by the most data miners (18%). Data miners report using an average of 4.6 software tools overall. STATISTICA, IBM SPSS Modeler, and R received the strongest satisfaction ratings in both 2010 and 2009.
• TECHNOLOGY: Data Mining most often occurs on a desktop or laptop computer, and frequently the data is stored locally. Model scoring typically happens using the same software used to develop models. STATISTICA users are more likely than other tool users to deploy models using PMML.
• CHALLENGES: As in previous years, dirty data, explaining data mining to others, and difficult access to data are the top challenges data miners face. This year data miners also shared best practices for overcoming these challenges. The best practices are available online.
• FUTURE: Data miners are optimistic about continued growth in the number of projects they will be conducting, and growth in data mining adoption is the number one “future trend” identified. There is room to improve: only 13% of data miners rate their company’s analytic capabilities as “excellent” and only 8% rate their data quality as “very strong”.
Please contact us if you have any questions about the attached report or this annual research program. The 5th Annual Data Miner Survey will be launching next month. We will email you an invitation to participate.
My only thought: since most data miners use multiple tools, including free tools as well as paid software, perhaps a pie chart of market share by revenue and volume would be handy.
Some ideas on comparing diverse data mining projects by data size or complexity would also help.
Here is a brief dataset I put together after one hour of cutting and pasting from WordPress.com’s creative data style formats. It shows spam, comments, traffic, and number of posts written monthly.
Clearly, monthly traffic is directly related to the number of posts I write (say, Traffic = A + B * Posts).
But spam shows discontinuous growth, especially after a big month (in which Reddit helped).
Akismet had some missing historical values (which is curious).
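The "A + B * Posts" relationship can be checked with an ordinary least-squares fit. The sketch below is only an illustration: the function name fit_line and the sample numbers are assumptions, since the actual monthly posts and traffic figures are not reproduced in this post. The sample points are deliberately generated from a known line (A = 1000, B = 150) so that the fit can be sanity-checked.

```python
def fit_line(posts, traffic):
    """Ordinary least squares for traffic = a + b * posts (hypothetical helper)."""
    n = len(posts)
    if n != len(traffic) or n < 2:
        raise ValueError("need two equally sized series with at least 2 points")
    mean_x = sum(posts) / n
    mean_y = sum(traffic) / n
    # Sum of squared deviations of x, and cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in posts)
    if sxx == 0:
        raise ValueError("posts must vary to estimate a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(posts, traffic))
    b = sxy / sxx            # extra views per additional post
    a = mean_y - b * mean_x  # baseline views with zero posts
    return a, b


# Placeholder inputs, NOT the blog's real numbers: points built from a known
# line so the fit should recover the intercept and slope used to build them.
example_posts = [10, 20, 30, 40]
example_traffic = [1000 + 150 * p for p in example_posts]

a, b = fit_line(example_posts, example_traffic)
print(f"Traffic is roughly {a:.0f} + {b:.0f} * Posts")
```

With real monthly data, a large gap between a month's actual traffic and a + b * posts (for example, the Reddit month) would show up as a big residual.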
Carole-Ann’s 2011 Predictions for Decision Management
For Ajay Ohri on DecisionStats.com
What were the top 5 events in 2010 in your field?
Maturity: the Decision Management space was made up of technology vendors, big and small, that typically focused on one or two aspects of this discipline. Over the past few years, we have seen a lot of consolidation in the industry – first with Business Intelligence (BI) then Business Process Management (BPM) and lately in Business Rules Management (BRM) and Advanced Analytics. As a result the giant Platform vendors have helped create visibility for this discipline. Lots of tiny clues finally bubbled up in 2010 to attest to the increasing activity around Decision Management. For example, more products than ever were named Decision Manager; companies advertised for Decision Managers as a job title in their job section; most people understand what I do when I am introduced in a social setting!
Boredom: unfortunately, as the industry matures, inevitably innovation slows down… At the main BRMS shows we heard complaints here and there that the technology was stalling. We heard it from vendors like Red Hat (Drools) and we heard it from bored end-users hoping for some excitement at Business Rules Forum’s vendor panel. They sadly did not get it.
Scrum: I am not thinking about the methodology there! If you have ever seen a rugby game, you can probably understand why this is the term that comes to mind when I look at the messy & confusing technology landscape. Feet blindly try to kick the ball out while superhuman forces are randomly moving the whole pack – or so it felt when I played! Business Users in search of Business Solutions are facing more and more technology choices that feel like comparing apples to oranges. There is value in all of them and each one addresses a specific aspect of Decision Management but I regret that the industry did not simplify the picture in 2010. On the contrary! Many buzzwords were created or at least made popular last year, creating even more confusion on a muddy field. A few examples: Social CRM, Collaborative Decision Making, Adaptive Case Management, etc. Don’t get me wrong, I *do* like the technologies. I sympathize with the decision maker that is trying to pick the right solution though.
Information: Analytics have been used for years of course but the volume of data surrounding us has been growing to unparalleled levels. We can blame or thank (depending on our perspective) Social Media for that. Sites like Facebook and LinkedIn have made it possible and easy to publish relevant (as well as fluffy) information in real-time. As we all started to get the hang of it and potentially over-publish, technology evolved to enable the storage, correlation and analysis of humongous volumes of data that we could not dream of before. 25 billion tweets were posted in 2010. Every month, over 30 billion pieces of data are shared on Facebook alone. This is not just about vanity and marketing though. This data can be leveraged for the greater good. Carlos pointed to some fascinating facts about catastrophic event response teams getting organized thanks to crowd-sourced information. We are also seeing, in the Decision Management world, more and more applicability for those very technologies that have been developed for the needs of Big Data – I’ll name for example Hadoop that Carlos (yet again) discussed in his talks at Rules Fest end of 2009 and 2010.
Self-Organization: it may be a side effect of the Social Media movement but I must admit that I was impressed by the success of self-organizing initiatives. Granted, this last trend has nothing to do with Decision Management per se but I think it is a great evolution worth noting. Let me point to a couple of examples. I usually attend traditional conferences and tradeshows in which the content can be good but is sometimes terrible. I was pleasantly surprised by the professionalism and attendance at *un-conferences* such as P-Camp (P stands for Product – an event for Product Managers). When you think about it, it is already difficult to get a show together when people are dedicated to the tasks. How crazy is it to have volunteers set one up with no budget and no agenda? Well, people simply show up to do their part and everyone has fun voting on-site for what seems the most appealing content at the time. Crowdsourcing applied to shows: it works! Similar experience with meetups or tweetups. I also enjoyed attending some impromptu Twitter jam sessions on a given topic. Social Media is certainly helping people reach out and get together in person or virtually and that is wonderful!
What are the top three trends you see in 2011?
Performance: I might be cheating here. I was very bullish about predicting much progress for 2010 in the area of Performance Management in your Decision Management initiatives. I believe that progress was made but Carlos did not give me full credit for the right prediction… Okay, I am a little optimistic on timeline… I admit it… If it did not fully happen in 2010, can I predict it again in 2011? I think that companies want to better track their business performance in order to correct the trajectory of course but also to improve their projections. I see that it is turning into reality already here and there. I expect it to become a trend in 2011!
Insight: Big Data being available all around us with new technologies and algorithms will continue to propagate in 2011 leading to more widely spread Analytics capabilities. The buzz at Analytics shows on Social Network Analysis (SNA) is a sign that there is interest in those kinds of things. There is tremendous information that can be leveraged for smart decision-making. I think there will be more of that in 2011 as initiatives launched in 2010 will mature into material results.
Collaboration: Social Media for the Enterprise is a discipline in the making. Social Media was initially seen for the most part as a Marketing channel. Over the years, companies have started experimenting with external communities and ideation capabilities with moderate success. The few strategic initiatives started in 2010 by “old-fashioned” companies seem to be an indication that we are past the early adopters. This discipline may very well materialize in 2011 as a core capability, well, or at least a new trend. I believe that capabilities such as Chatter, offered by Salesforce, will transform (slowly) how people interact in the workplace and leverage the volumes of social data captured in LinkedIn and other Social Media sites. Collaboration is of course a topic of interest for me personally. I even signed up for Kare Anderson’s collaboration collaboration site – yes, twice the word “collaboration”: it is really about collaborating on collaboration techniques. Even though collaboration does not require Social Media, this medium offers perspectives not available until now.
Brief Bio-
Carole-Ann is a renowned guru in the Decision Management space. She created the vision for Decision Management that is widely adopted now in the industry. Her claim to fame is the strategy and direction of Blaze Advisor, the then-leading BRMS product, while she also managed all the Decision Management tools at FICO (business rules, predictive analytics and optimization). She has a vision for Decision Management both as a technology and a discipline that can revolutionize the way corporations do business, and will never get tired of painting that vision for her audience. She speaks often at Industry conferences and has conducted university classes in France and Washington DC.
Leveraging her Master’s degree in Applied Mathematics / Computer Science from a “Grande Ecole” in France, she started her career building advanced systems using all kinds of technologies: expert systems, rules, optimization, dashboarding and cubes, web search, and a beta version of database replication. At Cleversys (acquired by Kurt Salmon & Associates), she also conducted strategic consulting gigs, mostly around change management.
While playing with advanced software components, she found a passion for technology and joined ILOG (acquired by IBM). She developed a growing interest in Optimization as well as Business Rules. At ILOG, she coined the term BRMS while brainstorming with her Sales counterpart. She led the Presales organization for Telecom in the Americas up until 2000 when she joined Blaze Software (acquired by Brokat Technologies, HNC Software and finally FICO).
Her 360-degree experience allowed her to gain appreciation for all aspects of a software company, giving her a unique perspective on the business. Her technical background kept her very much in touch with technology as she advanced.
She also became addicted to Twitter in the process. She is active on all kinds of social media, always looking for new digital experiences!
Outside of work, Carole-Ann loves spending time with her two boys. They grow fruits in their Northern California home and cook all together in the French tradition.
My annual traffic to this blog was almost 99,000 views. Add in additional views on networking sites plus the 400-plus RSS readers, and I can say traffic was about 120,000 for 2010. Nice. Thanks for reading, and I hope it was worth your time. (This is a long post and will take almost 440 seconds to read, but the summary has just been given.)
My intent is either to inform you, to give you something useful, or at least to give you something interesting.
Monthly views for 2010 are below.
2010     Jan      Feb      Mar      Apr      May      Jun
         6,311    4,701    4,922    5,463    6,493    4,271

         Jul      Aug      Sep      Oct      Nov      Dec      Total
         5,041    5,403    17,913   16,430   11,723   10,096   98,767
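As a quick check of the monthly figures above, here is a small sketch that re-adds them and looks at how much of the year's traffic came in the last four months. The variable names are just illustrative; the numbers are the ones listed in the table.

```python
# Monthly views for 2010, copied from the table above
monthly_views = {
    "Jan": 6311, "Feb": 4701, "Mar": 4922, "Apr": 5463,
    "May": 6493, "Jun": 4271, "Jul": 5041, "Aug": 5403,
    "Sep": 17913, "Oct": 16430, "Nov": 11723, "Dec": 10096,
}
reported_total = 98767  # the "Total" figure shown in the table

# Re-add the months and compare against the reported total
computed_total = sum(monthly_views.values())
totals_match = computed_total == reported_total

# Busiest month and the share of annual views from September to December
peak_month = max(monthly_views, key=monthly_views.get)
last_four = sum(monthly_views[m] for m in ("Sep", "Oct", "Nov", "Dec"))
last_four_share = last_four / computed_total

print("Computed total:", computed_total, "matches reported:", totals_match)
print("Peak month:", peak_month)
print(f"Sep-Dec share of annual views: {last_four_share:.1%}")
```

The months do add up to the reported total, and more than half of the year's views arrived from September onwards.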
Sandro Saita from http://www.dataminingblog.com/ just named me for an award on his blog (but my surname is ohRi; Sandro left me without an R. What would I be without R :)) ).
Aw! I am touched. Google for “Data Mining Blog” and you will see that Sandro is the best there is in data mining writing.
“
DMR People Award 2010
There are a lot of active people in the field of data mining. You can discuss with them on forums. You can read their blogs. You can also meet them in events such as PAW or KDD. Among the people I follow on a regular basis, I have elected:
Ajay Ori
He has been very active in 2010, especially on his blog . Good work Ajay and continue sharing your experience with us!”
What did I write in 2010? Stuff.
What did you read on this blog? Well, that's the top posts list.
Well, I guess I owe Tal G for almost 9,000 views (incidentally, I stopped posting my blog to R-Bloggers and Analyticbridge blogs, due to SEO keyword reasons and some spam I was getting; see below).
Still reading this post? Gosh, let me sell you some advertising. It is only $100 a month (yes, it's a recession).
Advertisers are treated on a first-in, last-out (FILO) basis.
I have been told I am obsessed with SEO, but I don't care much for search engines apart from Google, and yes, SEO is an interesting science (they should really rename it GEO, or Google Engine Optimization).
Apparently Hadley Wickham and Donald Farmer are big keywords for me so I should be more respectful I guess.
Search Terms for 365 days ending 2010-12-31 (Summarized)
Search                            Views
libre office                        925
facebook analytics                  798
test drive a chrome notebook        467
test drive a chrome notebook.       215
r gui                               203
data mining                         163
wps sas lawsuit                     158
wordle.net                          133
wps sas                             123
google maps jet ski                 123
test drive chrome notebook           96
sas wps                              89
sas wps lawsuit                      85
chrome notebook test drive           83
decision stats                       83
best statistics software             74
hadley wickham                       72
google maps jetski                   72
libreoffice                          70
doug savage                          65
hive tutorial                        58
funny india                          56
spss certification                   52
donald farmer microsoft              51
best statistical software            49
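Several of the search terms listed above are the same query with the words reordered or a stray full stop added. The sketch below folds those variants together to show what the keywords add up to. The normalization rule (lowercase, drop trailing punctuation, ignore the word "a", ignore word order) is my own assumption, not something WordPress.com does, and it deliberately does not merge spacing variants such as "libre office" and "libreoffice".

```python
import string
from collections import defaultdict

# (search term, views) pairs copied from the list above
search_views = [
    ("libre office", 925), ("facebook analytics", 798),
    ("test drive a chrome notebook", 467),
    ("test drive a chrome notebook.", 215),
    ("r gui", 203), ("data mining", 163), ("wps sas lawsuit", 158),
    ("wordle.net", 133), ("wps sas", 123), ("google maps jet ski", 123),
    ("test drive chrome notebook", 96), ("sas wps", 89),
    ("sas wps lawsuit", 85), ("chrome notebook test drive", 83),
    ("decision stats", 83), ("best statistics software", 74),
    ("hadley wickham", 72), ("google maps jetski", 72),
    ("libreoffice", 70), ("doug savage", 65), ("hive tutorial", 58),
    ("funny india", 56), ("spss certification", 52),
    ("donald farmer microsoft", 51), ("best statistical software", 49),
]


def normalize(term):
    """Build a grouping key that ignores case, trailing punctuation,
    the filler word 'a', and word order (an assumed, simplistic rule)."""
    cleaned = term.lower().strip().rstrip(string.punctuation)
    words = [w for w in cleaned.split() if w != "a"]
    return " ".join(sorted(words))


# Sum the views of all variants that share a grouping key
grouped = defaultdict(int)
for term, views in search_views:
    grouped[normalize(term)] += views

# Print the grouped terms from most to least viewed
for key, views in sorted(grouped.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{views:5d}  {key}")
```

Grouped this way, the four Chrome notebook variants together account for 861 views, which puts them just behind "libre office".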
What about outgoing links? Apparently I need to find a way to ask Google to pay me for the free advertising I gave their Chrome notebook launch. But since their search engine and browser are free to me, I guess we are even-steven.
Clicks for 365 days ending 2010-12-31 (Summarized)