Careers in #Rstats

I saw a posting for careers at Revolution Analytics. Now I am probably on the wrong side of an H1 visa and the C/R skill-o-meter, but these look great for any aspiring R coder. The list includes freelance opportunities as well.

http://www.revolutionanalytics.com/aboutus/careers.php

We have many opportunities opening up—among them:

Job Title | Location
Pre-sales Consultants / Technical Sales | Palo Alto, CA
Parallel Computing Developer | Palo Alto, CA or Seattle, WA
R Programmer (Freelance) | Palo Alto, CA
Software Training Course Developer (Freelance) | Palo Alto, CA
Build / Release Engineer | Seattle, WA
QA Engineer | Seattle, WA
Technical Writer | Seattle, WA

Please send your resume to careers@revolutionanalytics.com

2) Indeed.com

Searching for “R” jobs (with the quotes) rather than just R jobs gives better results in search engines and on job sites. It is still a tough keyword to search for, but it is getting better.

You can get alerts through the send-by-email option or through this RSS feed: http://www.indeed.co.in/rss?q=%22R%22++analytics+jobs
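A feed URL like the one above can also be built programmatically, which helps if you want alerts for several quoted keywords. The following is a minimal sketch, assuming the feed only needs a `q` parameter as in the URL shown; the function name and the default search terms are hypothetical, and the code only builds the string without contacting the site.

```python
from urllib.parse import urlencode


def indeed_rss_url(keyword, extra_terms="analytics jobs",
                   base="http://www.indeed.co.in/rss"):
    """Build a feed URL with the keyword wrapped in double quotes.

    The base URL is taken from the post; whether the feed still
    responds at that address is not checked here.
    """
    # Quoting the keyword asks the search engine for the exact term "R"
    query = f'"{keyword}" {extra_terms}'
    # urlencode percent-encodes the quotes (%22) and turns spaces into +
    return f"{base}?{urlencode({'q': query})}"


url = indeed_rss_url("R")
print(url)
```

Swapping in another keyword, for example `indeed_rss_url("SAS")`, gives the matching feed for that term.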

3) http://icrunchdata.com/

I Crunch Data has a good number of analytics jobs, and again, searching with the keyword in quotes (“R”) turns up lots of jobs here:

http://www.icrunchdata.com/ViewJob.aspx?id=334914&keys=%22R%22

There used to be a Google Group on R jobs, but it was too low-volume compared to the actual number of R jobs out there.

Note that the big demand is for analytics, and knowing more than one platform helps your job search more than knowing just a single language.

So how useful is Data.gov anyway?

As per official statistics, not many people download data from it.

Why don't they just donate the data and save taxpayers some money?

http://www.data.gov/metric

Summary

Agency/Sub-Agency/Organization | Raw Datasets (high-value) | Tools (high-value) | Geodata | Total | Latest Entry | # of times downloaded within the last week*
TOTAL | 3,486 (2,163) | 1,071 (393) | 386,429 | 390,986 | 08/24/2011 | 0

* These numbers represent the number of times a user has clicked on the “XML” or “CSV” (for example) links in the Raw Data Catalogs to download datasets and user downloads of tools in the Tool Catalog available in these categories.

But apparently lots of people still like it:

http://www.data.gov/metric/visitorstats/monthlyredirecttrend

More public data repositories:

Google http://www.google.com/publicdata/directory

DataMob http://datamob.org/datasets

Amazon http://aws.amazon.com/publicdatasets/

DataMarket http://datamarket.com/

Infochimps http://www.infochimps.com/

SEC (EDGAR) http://www.sec.gov/edgar/searchedgar/companysearch.html

More lists of lists

http://www.kdnuggets.com/2011/02/free-public-datasets.html

But who got more downloads last week than Data.gov?

Interview Eberhard Miethke and Dr. Mamdouh Refaat, Angoss Software

Here is an interview with Eberhard Miethke and Dr. Mamdouh Refaat, of Angoss Software. Angoss is a global leader in delivering business intelligence software and predictive analytics solutions that help businesses capitalize on their data by uncovering new opportunities to increase sales and profitability and to reduce risk.

Ajay- Describe your personal journey in software. How can we guide young students to pursue more useful software development than just gaming applications?

Mamdouh- I started using computers a long time ago, when they were programmed using punched cards! First in Fortran, then C, later C++, and then the rest. Computers and software were viewed as technical/engineering tools, and that’s why we can still see the heavy technical orientation of command languages such as Unix shells and even the Windows command shell. However, with the introduction of database systems and Microsoft Office apps, it was clear that business would be the primary user and field of application for software. My personal trip in software started with scientific applications, then business and database systems, and finally statistical software, which you can think of as a return to the more scientific orientation. However, with the wide acceptance by businesses of statistical methods in fields such as marketing and risk management, it is a fast-growing field that is in need of a lot of innovation.

Ajay- Angoss makes multiple data mining and analytics products. Could you please introduce us to your product portfolio and the specific data analytics needs it serves?

a- Attached please find our main product flyers for KnowledgeSTUDIO and KnowledgeSEEKER. We have a third product called StrategyBuilder, which is an add-on to the decision tree modules. This is also described in the flyer.

(See the Angoss Knowledge Studio Product Guide April2011 and http://www.scribd.com/doc/63176430/Angoss-Knowledge-Seeker-Product-Guide-April2011 )

Ajay- The trend in analytics is toward big data and cloud computing, with Hadoop enabling the processing of massive data sets on scalable infrastructure. What are your plans for cloud computing, as well as tablet-based and mobile-based computing?

a- This is an area where the plan is still being figured out in all organizations. The current explosion of data collected from mobile phones, text messages, and social websites will need radically new applications that can utilize the data from these sources. Current applications are based on the relational database paradigm designed in the 70’s through the 90’s of the 20th century.

But data sources are generating data in volumes and formats that are challenging this paradigm and will need a set of new tools, and possibly programming languages, to fit these needs. Cloud computing and tablet-based and mobile computing (the latter two are the same thing in my opinion, just different sizes of device) are also technologies that have not yet been explored in analytics.

The approach taken so far by most companies, including Angoss, is to rely on new XML-based standards to represent data structures for the particular models. In this case, it is the PMML (Predictive Model Markup Language) standard, which allows interoperability between analytics applications. Standardizing the representation of models is viewed as the first step toward implementing these models on emerging platforms, be that the cloud, mobile, or social networking websites.

The second challenge cited above is the rapidly increasing size of the data to be analyzed. Angoss identified this challenge early on and currently offers in-database analytics drivers for several database engines: Netezza, Teradata and SQL Server.

These drivers allow our analytics products to translate their routines into efficient SQL-based scripts that run in the database engine to exploit its performance as well as the powerful hardware on which it runs. Thus, instead of copying the data to a staging format for analytics, these drivers allow the data to be analyzed “in-place” within the database without moving it.

This offers performance, security and integrity. Performance is improved because of the use of well-tuned database engines running on powerful hardware.

Extra security is achieved by not copying the data to other platforms, which could be less secure. And finally, the integrity of the results is vastly improved by making sure that the results are always obtained by analyzing the up-to-date data residing in the database rather than an older copy of the data, which could be obsolete by the time the analysis is concluded.
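To make the in-database idea concrete, here is a minimal sketch of how a small decision tree could be rendered as a SQL expression and scored where the data lives. It is an illustration only and not Angoss's implementation: the tree, the `accounts` table, and its column names are hypothetical, and an in-memory SQLite database stands in for engines such as Netezza, Teradata or SQL Server.

```python
import sqlite3


def tree_to_sql(node):
    """Render a decision tree (nested dicts) as a nested SQL CASE expression."""
    if "score" in node:  # leaf node: emit the score literal
        return str(node["score"])
    condition = f"{node['column']} {node['op']} {node['value']}"
    return (f"CASE WHEN {condition} "
            f"THEN {tree_to_sql(node['yes'])} "
            f"ELSE {tree_to_sql(node['no'])} END")


# Hypothetical two-level tree; columns and thresholds are made up
tree = {
    "column": "days_overdue", "op": ">", "value": 60,
    "yes": {"score": 0.8},
    "no": {"column": "balance", "op": ">", "value": 1000,
           "yes": {"score": 0.4}, "no": {"score": 0.1}},
}

scoring_sql = (f"SELECT account_id, {tree_to_sql(tree)} AS default_score "
               "FROM accounts ORDER BY account_id")

# SQLite stands in for the production database engine
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER, days_overdue INTEGER, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)",
                 [(1, 90, 500.0), (2, 10, 5000.0), (3, 10, 100.0)])

# The model runs inside the engine; only the scores leave the database
rows = conn.execute(scoring_sql).fetchall()
print(scoring_sql)
print(rows)
conn.close()
```

The point of the design is that the rows never leave the engine: only the generated query goes in and only the scores come out.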

Ajay- What are the principal competing products to your offerings, and what makes your products special or differentiated in value to them (for each customer segment)?

a- There are two major players in today’s market that we usually encounter as competitors: SAS and IBM.

SAS offers a data mining workbench in the form of SAS Enterprise Miner, which is closely tied to SAS data mining methodology known as SEMMA.

On the other hand, IBM has recently acquired SPSS, which offered its Clementine data mining software. IBM has now rebranded Clementine as IBM SPSS Modeler.

In comparison to these products, our KnowledgeSTUDIO and KnowledgeSEEKER offer three main advantages: ease of use; affordability; and ease of integration into existing BI environments.

Angoss products were designed to look and feel like popular Microsoft Office applications. This makes the software very quick to learn. Typically, an intermediate-level analyst needs only 2-3 days of training to become proficient in the use of the software with all its advanced features.

Another important feature of Angoss software products is their integration with the Base SAS product and SQL-based database engines. All predictive models generated by Angoss can be automatically translated to SAS and SQL scripts. This allows the generation of scoring code for these common platforms. While the software interface simplifies all the tasks to allow business users to take advantage of the value added by predictive models, the software includes advanced options to allow experienced statisticians to fine-tune their models by adjusting all model parameters as needed.

In addition, Angoss offers a unique product called StrategyBuilder, which allows the analyst to add key performance indicators (KPIs) to predictive models. KPIs such as profitability, market share, and loyalty usually need to be calculated in conjunction with any sales and marketing campaign. Therefore, StrategyBuilder was designed to integrate such KPIs with the results of a predictive model in order to render the appropriate treatment for each customer segment. These results are all integrated into a deployment strategy that can also be translated into execution code in SQL or SAS.

The above competitive features offered by Angoss software products are behind its success in serving over 4000 users from over 500 clients worldwide.

Ajay- Describe a major case study where using Angoss software helped save a large amount of revenue/costs through innovative data mining.

a- Rogers Telecommunications Inc. is one of the largest Canadian telecommunications providers, serving over 8.5 million customers with revenue of 11.1 billion Canadian dollars (2009). In 2008, Rogers engaged Angoss to help with the problem of accounts receivable that had been ballooning for a period of 18 months.

The problem was approached by using a set of predictive models to improve the efficiency of the call centre serving the collections process. The first set of models was designed to find accounts likely to default ahead of time in order to take preventative measures. A second set of models was designed to optimize the call centre resources to focus on delinquent accounts likely to pay back most of the outstanding balance. Accounts that were identified as not likely to pay back quickly were good candidates for “Early-out” treatment, by forwarding them directly to collection agencies. Angoss hosted Rogers’ data and provided, at regular intervals, the lists of accounts for each treatment to be deployed by the call centre dialler. As a result, Rogers estimated an improvement of 10% in the collected sums.

Biography-

Mamdouh has been active in consulting, research, and training in various areas of information technology and software development for the last 20 years. He has worked on numerous projects with major organizations in North America and Europe in the areas of data mining, business analytics, business analysis, and engineering analysis. He has held several consulting positions for solution providers including Predict AG in Basel, Switzerland, and ANGOSS Corp. Mamdouh is the Director of Professional Services for the EMEA region of ANGOSS Software. Mamdouh received his PhD in engineering from the University of Toronto and his MBA from the University of Leeds, UK.

Mamdouh is the author of:

"Credit Risk Scorecards: Development and Implementation Using SAS"
"Data Preparation for Data Mining Using SAS" (The Morgan Kaufmann Series in Data Management Systems)

and co-author of:

"Data Mining: Know It All" (Morgan Kaufmann)



Eberhard Miethke works as a senior sales executive for Angoss.

About Angoss-

Angoss is a global leader in delivering business intelligence software and predictive analytics to businesses looking to improve performance across sales, marketing and risk. With a suite of desktop, client-server and in-database software products and Software-as-a-Service solutions, Angoss delivers powerful approaches to turn information into actionable business decisions and competitive advantage.

Angoss software products and solutions are user-friendly and agile, making predictive analytics accessible and easy to use.

Interview Scott Gidley CTO and Founder, DataFlux

Here is an interview with Scott Gidley, CTO and co-founder of leading data quality company DataFlux. DataFlux is a part of SAS Institute, and in 2011 it acquired Baseline Consulting and launched the latest version of its Master Data Management product.

Search Engine Advertising sweet spot for arbitrage.

Assume I am a blogger using both Adsense and Adwords.

Suppose Adwords costs me X dollars per click, and Adsense pays me Y dollars per click.

Then a unique arbitrage opportunity would arise if

Y times CTR on my blog > X times CTR on my Ad Campaign
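The condition can be checked with a few lines of code. This is a minimal sketch that evaluates the inequality exactly as written above; the function name and all the rates and click-through figures are made-up illustrative values, not real Adsense or Adwords numbers.

```python
def arbitrage_margin(adsense_cpc, blog_ctr, adwords_cpc, campaign_ctr):
    """Return Y * CTR(blog) - X * CTR(campaign).

    A positive value means the inequality in the post holds.
    """
    return adsense_cpc * blog_ctr - adwords_cpc * campaign_ctr


# Illustrative numbers only: Y = $0.50 per click, X = $0.10 per click
margin = arbitrage_margin(adsense_cpc=0.50, blog_ctr=0.02,
                          adwords_cpc=0.10, campaign_ctr=0.05)
print("Opportunity exists:", margin > 0)

# If the Adwords rate rises to meet the Adsense rate, the margin disappears
converged = arbitrage_margin(adsense_cpc=0.50, blog_ctr=0.02,
                             adwords_cpc=0.50, campaign_ctr=0.05)
print("After convergence:", converged > 0)
```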

Is it possible? Theoretically, yes. In the long tail of the Internet, yes.

However, the rates would converge after a time lag: either the Adsense rate would go lower or the Adwords rate would go higher.

Is there a tool that you can use to pump keywords with short-lived arbitrage opportunities, much like trading algos and quants do in finance?

Just asking !

Hint: it's a trick math puzzle 🙂

SAS Blogs gets a makeover

SAS blogs gets a much-needed makeover. Now if only they would share some more of the social media analytics rather than just social media on analytics 🙂 One of the more professionally designed, managed and passionate corporate blog series, in my opinion.

Seriously, 27 blogs and not one blog on social media analytics, despite being the leading maker of such software!

http://www.sas.com/software/customer-intelligence/social-media-analytics/