Cloud Computing using Python

I liked the new features in PiCloud, a cloud computing platform for Python. Python is increasingly popular as a computational language, and the cloud is where hardware is headed, at least as of 2011-12.

http://www.picloud.com/

The new feature allows you to publish your own Python functions as URLs. Why would you want to publish a function?

  • To call your Python functions from a programming language other than Python.
  • To use PiCloud from Google App Engine, which does not support PiCloud's native client library.
  • To easily setup a scalable RPC system.

Here’s a peek at the interface:

You publish a Python function

cloud.rest.publish(your_func, 'myfunction')

We give you a URL back:

https://api.picloud.com/r/2/myfunction/

You make an HTTP request using your method of choice to the URL

curl -k -u 'key:secret_key' https://api.picloud.com/r/2/myfunction/
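Since the published function is just an authenticated URL, it can be called from any language that can make an HTTP request. Here is a minimal sketch in R using the RCurl package that mirrors the curl call above; 'key:secret_key' and the myfunction URL are the placeholders from PiCloud's example, not real credentials.

library(RCurl)

# Call the published PiCloud function over REST, mirroring the curl example:
# -u 'key:secret_key' becomes userpwd, and -k becomes ssl.verifypeer = FALSE.
result <- getURL("https://api.picloud.com/r/2/myfunction/",
                 userpwd        = "key:secret_key",  # placeholder credentials
                 httpauth       = 1L,                # HTTP basic authentication
                 ssl.verifypeer = FALSE)
cat(result)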

It certainly is an interesting development, and I wonder how other languages can adopt this paradigm as well.
For R, as of now http://www.cloudnumbers.com/ seems to be the only player offering the language in the cloud.
It would be exciting to see more players in the cloud statistical analytics space.

 

Interview Dan Steinberg Founder Salford Systems

Here is an interview with Dan Steinberg, Founder and President of Salford Systems (http://www.salford-systems.com/ )

Ajay- Describe your journey from academia to technology entrepreneurship. What are the key milestones or turning points that you remember.

 Dan- When I was in graduate school studying econometrics at Harvard,  a number of distinguished professors at Harvard (and MIT) were actively involved in substantial real world activities.  Professors that I interacted with, or studied with, or whose software I used became involved in the creation of such companies as Sun Microsystems, Data Resources, Inc. or were heavily involved in business consulting through their own companies or other influential consultants.  Some not involved in private sector consulting took on substantial roles in government such as membership on the President’s Council of Economic Advisors. The atmosphere was one that encouraged free movement between academia and the private sector so the idea of forming a consulting and software company was quite natural and did not seem in any way inconsistent with being devoted to the advancement of science.

 Ajay- What are the latest products by Salford Systems? Any future product plans or modifications to work on Big Data analytics, mobile computing and cloud computing.

 Dan- Our central data mining technologies are CART, MARS, TreeNet, RandomForests, and PRIM, and we have always maintained feature-rich logistic regression and linear regression modules. In our latest release, scheduled for January 2012, we will be including a new data mining approach to linear and logistic regression allowing for the rapid processing of massive numbers of predictors (e.g., one million columns), with powerful predictor selection and coefficient shrinkage. The new methods allow not only classic techniques such as ridge and lasso regression, but also sub-lasso model sizes. Clear tradeoff diagrams between model complexity (number of predictors) and predictive accuracy allow the modeler to select an ideal balance suitable for their requirements.
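As a rough illustration of the complexity-versus-accuracy tradeoff Dan describes, here is a sketch using the open-source glmnet package on simulated data; this is not Salford's implementation, just the same lasso idea.

library(glmnet)

set.seed(1)
n <- 200; p <- 1000                          # many more predictors than observations
x <- matrix(rnorm(n * p), n, p)
y <- drop(x[, 1:5] %*% rnorm(5)) + rnorm(n)  # only 5 predictors truly matter

# Lasso (alpha = 1) with cross-validation: each lambda corresponds to a different
# balance between model size (non-zero coefficients) and predictive accuracy.
cv <- cv.glmnet(x, y, alpha = 1)
plot(cv)                                # the complexity/accuracy tradeoff curve
sum(coef(cv, s = "lambda.1se") != 0)    # non-zero coefficients (including intercept)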

The new version of our data mining suite, Salford Predictive Modeler (SPM), also includes two important extensions to the boosted tree technology at the heart of TreeNet. The first, Importance Sampled Learning Ensembles (ISLE), is used for the compression of TreeNet tree ensembles. Starting with, say, a 1,000-tree ensemble, the ISLE compression might well reduce this down to 200 reweighted trees. Such compression will be valuable when models need to be executed in real time. The compression rate is always under the modeler's control, meaning that if a deployed model may only contain, say, 30 trees, then the compression will deliver an optimal 30-tree weighted ensemble. Needless to say, compression of tree ensembles should be expected to be lossy, and how much accuracy is lost when extreme compression is desired will vary from case to case. Prior to ISLE, practitioners simply truncated the ensemble to the maximum allowable size. The new methodology will substantially outperform truncation.

The second major advance is RULEFIT, a rule extraction engine that starts with a TreeNet model and decomposes it into the most interesting and predictive rules. RULEFIT is also a tree ensemble post-processor and offers the possibility of improving on the original TreeNet predictive performance. One can think of the rule extraction as an alternative way to explain and interpret an otherwise complex multi-tree model. The rules extracted are similar conceptually to the terminal nodes of a CART tree but the various rules will not refer to mutually exclusive regions of the data.

 Ajay- You have led teams that have won multiple data mining competitions. What are some of your favorite techniques or approaches to a data mining problem.

 Dan- We only enter competitions involving problems for which our technology is suitable, generally, classification and regression. In these areas, we are  partial to TreeNet because it is such a capable and robust learning machine. However, we always find great value in analyzing many aspects of a data set with CART, especially when we require a compact and easy to understand story about the data. CART is exceptionally well suited to the discovery of errors in data, often revealing errors created by the competition organizers themselves. More than once, our reports of data problems have been responsible for the competition organizer’s decision to issue a corrected version of the data and we have been the only group to discover the problem.

In general, tackling a data mining competition is no different than tackling any analytical challenge. You must start with a solid conceptual grasp of the problem and the actual objectives, and the nature and limitations of the data. Following that comes feature extraction, the selection of a modeling strategy (or strategies), and then extensive experimentation to learn what works best.

 Ajay- I know you have created your own software. But is there other software that you use or like to use?

 Dan- For analytics we frequently test open source software to make sure that our tools will in fact deliver the superior performance we advertise. In general, if a problem clearly requires technology other than that offered by Salford, we advise clients to seek other consultants expert in that other technology.

 Ajay- Your software is installed at 3,500 sites, including 400 universities, as per http://www.salford-systems.com/company/aboutus/index.html . What is the key to managing and keeping so many customers happy?

 Dan- First, we have taken great pains to make our software reliable and we make every effort to avoid problems related to bugs. Our testing procedures are extensive and we have experts dedicated to stress-testing software. Second, our interface is designed to be natural, intuitive, and easy to use, so the challenges to the new user are minimized. Clear documentation, help files, and training videos round out how we help users look after themselves. Should a client need to contact us, we try to achieve 24-hour turnaround on tech support issues and monitor all tech support activity to ensure the timeliness, accuracy, and helpfulness of our responses. WebEx, GoToMeeting, and other internet-based contact permit real-time interaction.

 Ajay- What do you do to relax and unwind?

 Dan- I am in the gym almost every day combining weight and cardio training. No matter how tired I am before the workout I always come out energized so locating a good gym during my extensive travels is a must. I am also actively learning Portuguese so I look to watch a Brazilian TV show or Portuguese dubbed movie when I have time; I almost never watch any form of video unless it is available in Portuguese.

 Biography-

http://www.salford-systems.com/blog/dan-steinberg.html

Dan Steinberg, President and Founder of Salford Systems, is a well-respected member of the statistics and econometrics communities. In 1992, he developed the first PC-based implementation of the original CART procedure, working in concert with Leo Breiman, Richard Olshen, Charles Stone and Jerome Friedman. In addition, he has provided consulting services on a number of biomedical and market research projects, which have sparked further innovations in the CART program and methodology.

Dr. Steinberg received his Ph.D. in Economics from Harvard University and has given full-day presentations on data mining for the American Marketing Association, the Direct Marketing Association, and the American Statistical Association. After earning his Ph.D., Steinberg began his professional career as a Member of the Technical Staff at Bell Labs, Murray Hill, and then as Assistant Professor of Economics at the University of California, San Diego. A book he co-authored on Classification and Regression Trees was awarded the 1999 Nikkei Quality Control Literature Prize in Japan for excellence in statistical literature promoting the improvement of industrial quality control and management.

His consulting experience at Salford Systems has included complex modeling projects for major banks worldwide, including Citibank, Chase, American Express, and Credit Suisse, and has included projects in Europe, Australia, New Zealand, Malaysia, Korea, Japan and Brazil. Steinberg led the teams that won first place awards in the KDD Cup 2000 and the 2002 Duke/Teradata Churn modeling competition, and the teams that won awards in the PAKDD competitions of 2006 and 2007. He has published papers in economics, econometrics, and computer science journals, and contributes actively to the ongoing research and development at Salford.

Using Google Fusion Tables from #rstats

I was quite happy to see Google Fusion Tables within Google Docs. Databases as a service? Not quite, but still quite good, and let's see how it goes.

https://www.google.com/fusiontables/DataSource?dsrcid=implicit&hl=en_US&pli=1

http://googlesystem.blogspot.com/2011/09/fusion-tables-new-google-docs-app.html

 

But what interests me more is the developer's guide:

http://code.google.com/apis/fusiontables/docs/developers_guide.html

The Google Fusion Tables API is a set of statements that you can use to search for and retrieve Google Fusion Tables data, insert new data, update existing data, and delete data. The API statements are sent to the Google Fusion Tables server using HTTP GET requests (for queries) and POST requests (for inserts, updates, and deletes) from a Web client application. The API is language agnostic: you can write your program in any language you prefer, as long as it provides some way to embed the API calls in HTTP requests.

The Google Fusion Tables API does not provide the mechanism for submitting the GET and POST requests. Typically, you will use an existing code library that provides such functionality; for example, the code libraries that have been developed for the Google GData API. You can also write your own code to implement GET and POST requests.
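As a rough sketch of what such a request can look like from R (using the RCurl package), the snippet below sends a query as a GET request; the auth token is a placeholder, and a fuller R client appears later in this post.

library(RCurl)

auth  <- "YOUR_AUTH_TOKEN"   # placeholder: token obtained from Google ClientLogin
query <- "SHOW TABLES"       # queries go over GET; inserts/updates/deletes use POST

csv <- getForm("http://tables.googlelabs.com/api/query",
               sql = query,
               .opts = list(httpheader = c(Authorization =
                              paste("GoogleLogin auth=", auth, sep = ""))))
cat(csv)                      # results are returned as CSV text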

Also see http://code.google.com/apis/fusiontables/docs/sample_code.html

 

Google Fusion Tables API Sample Code

Libraries

SQL API

  • Python: Fusion Tables Python Client Library (public repository: fusion-tables-client-python/), with samples
  • PHP: Fusion Tables PHP Client Library (public repository: fusion-tables-client-php/), with samples

Featured Samples

An easy way to learn how to use an API can be to look at sample code. The list above provides links to some basic samples for each of the languages shown. This section highlights particularly interesting samples for the Fusion Tables API.

  • cURL (SQL API): Hello, cURL - a simple example showing how to use curl to access Fusion Tables.
  • Google Apps Script (SQL API)
  • Java (SQL API): Hello, World - a simple walkthrough that shows how the Google Fusion Tables API statements work; and an OAuth example on fusion-tables-api, in which the Google Fusion Tables team shows how OAuth authorization enables you to use the Google Fusion Tables API from a foreign web server with delegated authorization.
  • Python (Docs List API): Docs List Example - demonstrates how to list tables, set permissions on tables, and move a table to a folder.
  • Android (Java; SQL API): Basic Sample Application - a demo application that shows how to create a crowd-sourcing application allowing users to report potholes and save the data to a Fusion Table.
  • JavaScript - FusionTablesLayer (FusionTablesLayer, Google Maps API): using the FusionTablesLayer, you can display data on a Google Map. Also check out FusionTablesLayer Builder, which generates all the code necessary to include a Google Map with a Fusion Table Layer on your own website.
  • JavaScript - Google Chart Tools (Google Chart Tools): using the Google Chart Tools, you can request data from Fusion Tables to use in visualizations or to display directly in an HTML page. Note: responses are limited to 500 rows of data.

External Resources

Google Fusion Tables is dedicated to providing code examples that illustrate typical uses, best practices, and really cool tricks. If you do something with the Google Fusion Tables API that you think would be interesting to others, please contact us at googletables-feedback@google.com about adding your code to our Examples page.

  • Shape Escape - a tool for uploading shape files to Fusion Tables.
  • GDAL - the OGR Simple Feature Library has incorporated Fusion Tables as a supported format.
  • Arc2Cloud - Arc2Earth has included support for upload to Fusion Tables via Arc2Cloud.
  • Java and Google App Engine - ODK Aggregate is an App Engine application by the Open Data Kit team that uses Google Fusion Tables to store survey data collected through input forms on Android mobile phones.
  • R package - Andrei Lopatenko has written an R interface to Fusion Tables so that Fusion Tables can be used as the data store for R.
  • Ruby - Simon Tokumine has written a Ruby gem for access to Fusion Tables from Ruby.

 

Update: You can use Google Fusion Tables from within R using the script at http://andrei.lopatenko.com/rstat/fusion-tables.R (reproduced below; it relies on the RCurl package).

 

# The functions below use the RCurl package for the HTTP requests.
library(RCurl)

# Authenticate with Google ClientLogin and return an auth token for later calls.
ft.connect <- function(username, password) {
  url = "https://www.google.com/accounts/ClientLogin";
  params = list(Email = username, Passwd = password, accountType="GOOGLE", service= "fusiontables", source = "R_client_API")
 connection = postForm(uri = url, .params = params)
 if (length(grep("error", connection, ignore.case = TRUE))) {
 	stop("The wrong username or password")
 	return ("")
 }
 authn = strsplit(connection, "\nAuth=")[[c(1,2)]]
 auth = strsplit(authn, "\n")[[c(1,1)]]
 return (auth)
}

# Placeholder: no explicit disconnect is required for this client.
ft.disconnect <- function(connection) {
}

# Send a Fusion Tables SQL statement (e.g. INSERT, CREATE TABLE) as an HTTP POST.
ft.executestatement <- function(auth, statement) {
      url = "http://tables.googlelabs.com/api/query"
      params = list( sql = statement)
      connection.string = paste("GoogleLogin auth=", auth, sep="")
      opts = list( httpheader = c("Authorization" = connection.string))
      result = postForm(uri = url, .params = params, .opts = opts)
      if (length(grep("<HTML>\n<HEAD>\n<TITLE>Parse error", result, ignore.case = TRUE))) {
      	stop(paste("incorrect sql statement:", statement))
      }
      return (result)
}

# List the user's Fusion Tables, returning a data frame of table ids and names.
ft.showtables <- function(auth) {
   url = "http://tables.googlelabs.com/api/query"
   params = list( sql = "SHOW TABLES")
   connection.string = paste("GoogleLogin auth=", auth, sep="")
   opts = list( httpheader = c("Authorization" = connection.string))
   result = getForm(uri = url, .params = params, .opts = opts)
   tables = strsplit(result, "\n")
   tableid = c()
   tablename = c()
   for (i in 2:length(tables[[1]])) {
     	str = tables[[c(1,i)]]
   	    tnames = strsplit(str,",")
   	    tableid[i-1] = tnames[[c(1,1)]]
   	    tablename[i-1] = tnames[[c(1,2)]]
   	}
   	tables = data.frame( ids = tableid, names = tablename)
    return (tables)
}

# Return a data frame describing a table's columns (id, name, type), given a table id.
ft.describetablebyid <- function(auth, tid) {
   url = "http://tables.googlelabs.com/api/query"
   params = list( sql = paste("DESCRIBE", tid))
   connection.string = paste("GoogleLogin auth=", auth, sep="")
   opts = list( httpheader = c("Authorization" = connection.string))
   result = getForm(uri = url, .params = params, .opts = opts)
   columns = strsplit(result,"\n")
   colid = c()
   colname = c()
   coltype = c()
   for (i in 2:length(columns[[1]])) {
     	str = columns[[c(1,i)]]
   	    cnames = strsplit(str,",")
   	    colid[i-1] = cnames[[c(1,1)]]
   	    colname[i-1] = cnames[[c(1,2)]]
   	    coltype[i-1] = cnames[[c(1,3)]]
   	}
   	cols = data.frame(ids = colid, names = colname, types = coltype)
    return (cols)
}

ft.describetable <- function (auth, table_name) {
   table_id = ft.idfromtablename(auth, table_name)
   result = ft.describetablebyid(auth, table_id)
   return (result)
}

ft.idfromtablename <- function(auth, table_name) {
    tables = ft.showtables(auth)
	tableid = tables$ids[tables$names == table_name]
	return (tableid)
}

# Download an entire Fusion Table into an R data frame.
ft.importdata <- function(auth, table_name) {
	tableid = ft.idfromtablename(auth, table_name)
	columns = ft.describetablebyid(auth, tableid)
	column_spec = ""
	# columns has one row per table column; iterate over its rows
	for (i in 1:nrow(columns)) {
		column_spec = paste(column_spec, columns[i, 2])
		if (i < nrow(columns)) {
			column_spec = paste(column_spec, ",", sep="")
		}
	}
	# start with a single dummy row holding the column names; data rows are appended below
	mdata = matrix(columns$names,
	              nrow = 1, ncol = nrow(columns),
	              dimnames = list(c("dummy"), columns$names), byrow=TRUE)
	select = paste("SELECT", column_spec)
	select = paste(select, "FROM")
	select = paste(select, tableid)
	result = ft.executestatement(auth, select)
    numcols = length(columns)
    rows = strsplit(result, "\n")
    for (i in 3:length(rows[[1]])) {
    	row = strsplit(rows[[c(1,i)]], ",")
    	mdata = rbind(mdata, row[[1]])
   	}
   	output.frame = data.frame(mdata[2:length(mdata[,1]), 1])
   	for (i in 2:ncol(mdata)) {
   		output.frame = cbind(output.frame, mdata[2:length(mdata[,i]),i])
   	}
   	colnames(output.frame) = columns$names
    return (output.frame)
}

# Wrap a value in quotes when to_quote is TRUE (used for non-numeric columns).
quote_value <- function(value, to_quote = FALSE, quote = "'") {
	 ret_value = ""
     if (to_quote) {
     	ret_value = paste(quote, paste(value, quote, sep=""), sep="")
     } else {
     	ret_value = value
     }
     return (ret_value)
}

# Convert one row of values to a comma-separated SQL value list,
# quoting values whose column type is not numeric.
converttostring <- function(arr, separator = ", ", column_types) {
	con_string = ""
	if (length(arr) > 1) {
		for (i in 1:(length(arr) - 1)) {
			value = quote_value(arr[i], tolower(column_types[i]) != "number")
			con_string = paste(con_string, value)
			con_string = paste(con_string, separator, sep="")
		}
	}

	if (length(arr) >= 1) {
		value = quote_value(arr[length(arr)], tolower(column_types[length(arr)]) != "number")
		con_string = paste(con_string, value)
	}
	return (con_string)
}

# Upload a data frame to a Fusion Table, optionally creating the table first
# (all columns are created as strings); rows are inserted in batches of 500.
ft.exportdata <- function(auth, input_frame, table_name, create_table) {
	if (create_table) {
       create.table = "CREATE TABLE "
       create.table = paste(create.table, table_name)
       create.table = paste(create.table, "(")
       cnames = colnames(input_frame)
       for (columnname in cnames) {
         create.table = paste(create.table, columnname)
    	 create.table = paste(create.table, ":string", sep="")
    	   if (columnname != cnames[length(cnames)]){
    		  create.table = paste(create.table, ",", sep="")
           }
       }
      create.table = paste(create.table, ")")
      result = ft.executestatement(auth, create.table)
    }
    if (length(input_frame[,1]) > 0) {
    	tableid = ft.idfromtablename(auth, table_name)
	    columns = ft.describetablebyid(auth, tableid)
	    column_spec = ""
	    for (i in 1:length(columns$names)) {
		   column_spec = paste(column_spec, columns[i, 2])
		   if (i < length(columns$names)) {
			  column_spec = paste(column_spec, ",", sep="")
		   }
	    }
    	insert_prefix = "INSERT INTO "
    	insert_prefix = paste(insert_prefix, tableid)
    	insert_prefix = paste(insert_prefix, "(")
    	insert_prefix = paste(insert_prefix, column_spec)
    	insert_prefix = paste(insert_prefix, ") values (")
    	insert_suffix = ");"
    	insert_sql_big = ""
    	for (i in 1:length(input_frame[,1])) {
    		data = unlist(input_frame[i,])
    		values = converttostring(data, column_types  = columns$types)
    		insert_sql = paste(insert_prefix, values)
    		insert_sql = paste(insert_sql, insert_suffix) ;
    		insert_sql_big = paste(insert_sql_big, insert_sql)
    		if (i %% 500 == 0) {
    			ft.executestatement(auth, insert_sql_big)
    			insert_sql_big = ""
    		}
    	}
        if (nchar(insert_sql_big) > 0) {
            ft.executestatement(auth, insert_sql_big)
        }
    }
}
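A hypothetical usage sketch of the functions above; the e-mail, password, and table names are placeholders.

# Hypothetical usage of the client functions above (placeholder credentials).
auth <- ft.connect("you@gmail.com", "your-password")

tables <- ft.showtables(auth)            # data frame of table ids and names
print(tables)

df <- ft.importdata(auth, "mytable")     # pull a Fusion Table into a data frame
head(df)

# push an R data frame up as a new Fusion Table
df_out <- data.frame(city = c("Delhi", "Mumbai"), population = c(16.7, 12.4))
ft.exportdata(auth, input_frame = df_out, table_name = "cities",
              create_table = TRUE)

ft.disconnect(auth)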

Interview Eberhard Miethke and Dr. Mamdouh Refaat, Angoss Software

Here is an interview with Eberhard Miethke and Dr. Mamdouh Refaat, of Angoss Software. Angoss is a global leader in delivering business intelligence software and predictive analytics solutions that help businesses capitalize on their data by uncovering new opportunities to increase sales and profitability and to reduce risk.

Ajay-  Describe your personal journey in software. How can we guide young students to pursue more useful software development than just gaming applications.

 Mamdouh- I started using computers a long time ago, when they were programmed using punched cards! First in Fortran, then C, later C++, and then the rest. Computers and software were viewed as technical/engineering tools, and that's why we can still see the heavy technical orientation of command languages such as Unix shells and even the Windows command shell. However, with the introduction of database systems and Microsoft Office apps, it was clear that business would be the primary user and field of application for software. My personal journey in software started with scientific applications, then business and database systems, and finally statistical software, which you can think of as a return to the more scientific orientation. However, with the wide acceptance by businesses of the application of statistical methods in different fields such as marketing and risk management, it is a fast-growing field in need of a lot of innovation.

Ajay- Angoss makes multiple data mining and analytics products. Could you please introduce us to your product portfolio and the specific data analytics needs they serve.

a- Attached please find our main product flyers for KnowledgeSTUDIO and KnowledgeSEEKER. We have a 3rd product called “strategy builder” which is an add-on to the decision tree modules. This is also described in the flyer.

(see- Angoss Knowledge Studio Product Guide April2011  and http://www.scribd.com/doc/63176430/Angoss-Knowledge-Seeker-Product-Guide-April2011  )

Ajay- The trend in analytics is big data and cloud computing, with Hadoop enabling the processing of massive data sets on scalable infrastructure. What are your plans for cloud computing, tablet-based as well as mobile-based computing.

a- This is an area where the plan is still being figured out in all organizations. The current explosion of data collected from mobile phones, text messages, and social websites will need radically new applications that can utilize the data from these sources. Current applications are based on the relational database paradigm designed in the 70’s through the 90’s of the 20th century.

But data sources are generating data in volumes and formats that challenge this paradigm and will need a set of new tools, and possibly programming languages, to fit these needs. Cloud computing and tablet/mobile computing (which are the same thing in my opinion, just different sizes of device) are two technologies that have not yet been explored in analytics.

The approach taken so far by most companies, including Angoss, is to rely on new XML-based standards to represent data structures for the particular models. In this case, it is the PMML (Predictive Model Markup Language) standard, which allows interoperability between analytics applications. Standardizing the representation of models is viewed as the first step toward implementing these models on emerging platforms, be that the cloud, mobile, or social networking websites.
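Angoss's own PMML export is not shown here, but as a sketch of the general idea, the open-source R pmml package can serialize a fitted model to a PMML document that other scoring engines can consume.

library(pmml)   # converts fitted R models to PMML documents
library(XML)    # saveXML, for writing the XML document to disk

# Fit a simple model and export it as PMML, which any PMML-aware scoring
# engine (in-database, cloud, or mobile) could then consume.
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)
doc <- pmml(fit)
saveXML(doc, file = "iris_lm.pmml")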

The second challenge cited above is the rapidly increasing size of the data to be analyzed. Angoss has already identified this challenge early on and is currently offering in-database analytics drivers for several database engines: Netezza, Teradata and SQL Server.

These drivers allow our analytics products to translate their routines into efficient SQL-based scripts that run in the database engine to exploit its performance as well as the powerful hardware on which it runs. Thus, instead of copying the data to a staging format for analytics, these drivers allow the data to be analyzed “in-place” within the database without moving it.

This offers performance, security and integrity benefits. Performance is improved by using well-tuned database engines running on powerful hardware.

Extra security is achieved by not copying the data to other platforms, which could be less secure. And finally, the integrity of the results is vastly improved by making sure that results are always obtained by analyzing the up-to-date data residing in the database, rather than an older copy that could be obsolete by the time the analysis is concluded.

Ajay- What are the principal competing products to your offerings, and what makes your products special or differentiated in value to them (for each customer segment).

a- There are two major players in today's market that we usually encounter as competitors: SAS and IBM.

SAS offers a data mining workbench in the form of SAS Enterprise Miner, which is closely tied to SAS data mining methodology known as SEMMA.

On the other hand, IBM has recently acquired SPSS, which offered the Clementine data mining software. IBM has now rebranded Clementine as IBM SPSS Modeler.

In comparison to these products, our KnowledgeSTUDIO and KnowledgeSEEKER offer three main advantages: ease of use; affordability; and ease of integration into existing BI environments.

Angoss products were designed to look and feel like popular Microsoft Office applications. This makes the learning curve very gentle: typically, an intermediate-level analyst needs only 2-3 days of training to become proficient in the use of the software with all its advanced features.

Another important feature of Angoss software products is their integration with the SAS/Base product and SQL-based database engines. All predictive models generated by Angoss can be automatically translated to SAS and SQL scripts, which allows the generation of scoring code for these common platforms. While the software interface simplifies all the tasks so that business users can take advantage of the value added by predictive models, it also includes advanced options that allow experienced statisticians to fine-tune their models by adjusting all model parameters as needed.

In addition, Angoss offers a unique product called StrategyBuilder, which allows the analyst to add key performance indicators (KPI’s) to predictive models. KPI’s such as profitability, market share, and loyalty are usually required to be calculated in conjunction with any sales and marketing campaign. Therefore, StrategyBuilder was designed to integrate such KPI’s with the results of a predictive model in order to render the appropriate treatment for each customer segment. These results are all integrated into a deployment strategy that can also be translated into an execution code in SQL or SAS.

The competitive features described above are behind the success of Angoss software products in serving over 4,000 users from over 500 clients worldwide.

Ajay -Describe a major case study where using Angoss software helped save a big amount of revenue/costs by innovative data mining.

a- Rogers Telecommunications Inc. is one of the largest Canadian telecommunications providers, serving over 8.5 million customers with revenue of 11.1 billion Canadian dollars (2009). In 2008, Rogers engaged Angoss to help with the problem of ballooning accounts receivable over a period of 18 months.

The problem was approached by improving the efficiency of the call centre serving the collections process with a set of predictive models. The first set of models was designed to find accounts likely to default ahead of time, in order to take preventative measures. A second set of models was designed to optimize call centre resources to focus on delinquent accounts likely to pay back most of the outstanding balance. Accounts identified as not likely to pay back quickly were good candidates for "early-out" treatment, by forwarding them directly to collection agencies. Angoss hosted Rogers' data and provided, at regular intervals, the lists of accounts for each treatment to be deployed by the call centre dialler. As a result, Rogers estimated an improvement of 10% in the collected sums.

Biography-

Mamdouh has been active in consulting, research, and training in various areas of information technology and software development for the last 20 years. He has worked on numerous projects with major organizations in North America and Europe in the areas of data mining, business analytics, business analysis, and engineering analysis. He has held several consulting positions for solution providers, including Predict AG in Basel, Switzerland, and at ANGOSS Corp. Mamdouh is the Director of Professional Services for the EMEA region of ANGOSS Software. He received his PhD in engineering from the University of Toronto and his MBA from the University of Leeds, UK.

Mamdouh is the author of:

"Credit Risk Scorecards: Development and Implmentation using SAS"
 "Data Preparation for Data Mining Using SAS",
 (The Morgan Kaufmann Series in Data Management Systems) (Paperback)
 and co-author of
 "Data Mining: Know it all",Morgan Kaufmann



Eberhard Miethke works as a senior sales executive for Angoss.

 

About Angoss-

Angoss is a global leader in delivering business intelligence software and predictive analytics to businesses looking to improve performance across sales, marketing and risk. With a suite of desktop, client-server and in-database software products and Software-as-a-Service solutions, Angoss delivers powerful approaches to turn information into actionable business decisions and competitive advantage.

Angoss software products and solutions are user-friendly and agile, making predictive analytics accessible and easy to use.

Interview Scott Gidley CTO and Founder, DataFlux

Here is an interview with Scott Gidley, CTO and co-founder of leading data quality company DataFlux. DataFlux is a part of SAS Institute and in 2011 acquired Baseline Consulting, besides launching the latest version of their Master Data Management product. Continue reading "Interview Scott Gidley CTO and Founder, DataFlux"

Interview Beth Schultz Editor AllAnalytics.com

Here is an interview with Beth Schultz, Editor in Chief of AllAnalytics.com.

AllAnalytics.com ( http://www.allanalytics.com/ ) is the new online community on predictive analytics, and it's a bit different in emphasizing quality more than just quantity. Beth is a veteran of tech journalism and communities.

Ajay-Describe your journey in technology journalism and communication. What are the other online communities that you have been involved with?

Beth- I’m a longtime IT journalist, having begun my career covering the telecommunications industry at the brink of AT&T’s divestiture — many eons ago. Over the years, I’ve covered the rise of internal corporate networking; the advent of the Internet and creation of the Web for business purposes; the evolution of Web technology for use in building intranets, extranets, and e-commerce sites; the move toward a highly dynamic next-generation IT infrastructure that we now call cloud computing; and development of myriad enterprise applications, including business intelligence and the analytics surrounding them. I have been involved in developing online B2B communities primarily around next-generation enterprise IT infrastructure and applications. In addition, Shawn Hessinger, our community editor, has been involved in myriad Web sites aimed at creating community for small business owners.

 Ajay- Technology geeks get all the money while journalists get a story. Comments, please.

Beth- Great technology geeks — those being the ones with technology smarts as well as business savvy — do stand to make a lot of money. And some pursue that to all ends (with many entrepreneurs gunning for the acquisition) while others more or less fall into it. Few journalists, at least few tech journalists, have big dollars in mind. The gratification for journalists comes in being able to meet these folks, hear and deliver their stories — as appropriate — and help explain what makes this particular technology geek developing this certain type of product or service worth paying attention to.

 Ajay- Describe what you are trying to achieve with the All Analytics community and how it seeks to differentiate itself with other players in this space.

 Beth- With AllAnalytics.com, we're concentrating on creating the go-to site for CXOs, IT professionals, line-of-business managers, and other professionals to share best practices, concrete experiences, and research about data analytics, business intelligence, information optimization, and risk management, among many other topics. We differentiate ourselves by featuring excellent editorial content from a top-notch group of bloggers, access to industry experts through weekly chats, ongoing lively and engaging message board discussions, and biweekly debates.

We're a new property, and clearly in rapid building mode. However, we've already secured some of the industry's most respected BI/analytics experts to participate as bloggers. For example, a small sampling of our current lineup includes the always-intriguing John Barnes, a science fiction novelist and statistics guru; Sandra Gittlen, a longtime IT journalist with an affinity for BI coverage; Olivia Parr-Rud, an internationally recognized expert in BI and organizational alignment; Tom Redman, a well-known data-quality expert; and Steve Williams, a leading BI strategy consultant. I blog daily as well, and in particular love to share firsthand experiences of how organizations are benefiting from the use of BI, analytics, data warehousing, etc. We've featured inside looks at analytics initiatives at companies such as 1-800-Flowers.com, Oberweis Dairy, the Cincinnati Zoo & Botanical Garden, and Thomson Reuters, for example.

In addition, we've hosted instant e-chats with Web and social media experts Joe Stanganelli and Pierre DeBois, and this Friday, Aug. 26, at 3 p.m. ET we'll be hosting an e-chat with Marshall Sponder, Web metrics guru and author of the newly published book, Social Media Analytics: Effective Tools for Building, Interpreting, and Using Metrics. (Readers interested in participating in the chat do need to fill out a quick registration form, available here: http://www.allanalytics.com/register.asp . The chat is available here: http://www.allanalytics.com/messages.asp?piddl_msgthreadid=241039&piddl_msgid=439898#msg_439898 .)

Experts participating in our biweekly debate series, called Point/Counterpoint, have broached topics such as BI in the cloud, mobile BI and whether an analytics culture is truly possible to build.

Ajay-  What are some tips you would like to share about writing tech stories to aspiring bloggers.

Beth- I suppose my best advice is this: Don’t write about technology for technology’s sake. Always strive to tell the audience why they should care about a particular technology, product, or service. How might a reader use it to his or her company’s advantage, and what are the potential benefits? Improved productivity, increased revenue, better customer service? Providing anecdotal evidence goes a long way toward delivering that message, as well.

Ajay- What are the other IT world websites that have made a mark on the internet.

Beth- I’d be remiss if I didn’t give a shout out to UBM TechWeb sites, including InformationWeek, which has long charted the use of IT within the enterprise; Dark Reading, a great source for folks interested in securing an enterprise’s information assets; and Light Reading, which takes the pulse of the telecom industry.

 Biography- 

Beth Schultz has more than two decades of experience as an IT writer and editor. Most recently, she brought her expertise to bear writing thought-provoking editorial and marketing materials on a variety of technology topics for leading IT publications and industry players. Previously, she oversaw multimedia content development, writing and editing for special feature packages at Network World. Beth has a keen ability to identify business and technology trends, developing expertise through in-depth analysis and early-adopter case studies. Over the years, she has earned more than a dozen national and regional editorial excellence awards for special issues from American Business Media, American Society of Business Press Editors, Folio.net, and others.

 

Google Cloud Connect

 

Interestingly, a Google plug-in to share Microsoft Office documents on the cloud.

Google Cloud Connect is a plug-in for Microsoft Office® 2003, 2007, and 2010 that lets you share and edit Microsoft Word, PowerPoint, and Excel documents simultaneously with other people in your organization. You get the collaboration benefits of Google Docs, while still using Microsoft Office.

http://tools.google.com/dlpage/cloudconnect

Google Cloud Connect for Microsoft Office

Google Cloud Connect for Microsoft Office brings collaborative multi-person editing to the familiar Microsoft® Office experience. You can share, backup, and simultaneously edit Microsoft Word, PowerPoint®, and Excel® documents with coworkers.

Watch the videos below to learn how Google Cloud Connect teaches your old docs new tricks.

Learn how Cloud Connect helped Mazda Raceway Laguna Seca (English only)

Help and information for users
Help and information for administrators

Download Google Cloud Connect

It’s free and downloads in seconds.

Requirements:

  • Windows XP with .NET Framework 2.0, Windows Vista, or Windows 7
  • Microsoft Office 2003, Office 2007, or Office 2010