Anonymous Hacker

Digital salmon swimming alone upstream
Sacrificing sleep for bits of some dream
Patiently stalking impatient trolls
Denying web crawlers omnipotent stroll
Digitally Dividing and conquering
Masquerading charlatans of new age ball
Sheep like server waits for hacking hammer to fall

Amazing nights but boring days
Lord forgive us our digital merry ways
Going back to the day job in the morn
Keeping up appearances to earn bread and daily corn

Masking emotions daily outrage
Waging lonesome wars of rage
we are small but proud
forced sometimes kneel and crawl
hacker cracker phisher phracker
wont you be my friend after it all

by –

Ajay Ohri

https://plus.google.com/116302364907696741272

Happy Guy Fawkes Day-

Rcpp Workshop in San Francisco Oct 8th

Rcpp Workshop in San Francisco Oct 8th

Following the successful one-day master class on Rcpp preceding this year’s R/Finance conference, a full-day master class on Rcpp and related topics which will be held on Saturday, October 8, in San Francisco.

Join Dirk Eddelbuettel for six hours of detailed and hands-on instructions and discussions aroundRcpp, inline, RInside, RcppArmadillo, RcppGSL, RcppEigen and other packages—in an intimate small-group setting.

The full-day format allows combining an introductory morning session with a more advanced afternoon session while leaving room for sufficient breaks. We plan on having about six hours of instructions, a one-hour lunch break and two half-hour coffee breaks (and lunch and refreshments will be provided).

Morning session: “A Hands-on Introduction to R and C++”

The morning session will provide a practical introduction to the Rcpp package (and other related packages). The focus will be on simple and straightforward applications of Rcpp in order to extend R and/or to significantly accelerate the execution of simple functions.

The tutorial will cover the inline package which permits embedding of self-contained C, C++ or FORTRAN code in R scripts. We will also discuss RInside, to easily embed the R engine code in C++ applications, as well as standard Rcpp extension packages such as RcppArmadillo and RcppEigen for linear algebra (via highly expressive templated C++ libraries) and RcppGSL.

Afternoon session: “Advanced R and C++ Topics”

The afternoon tutorial will provide a hands-on introduction to more advanced Rcpp features. It will cover topics such as writing packages that use Rcpp, how Rcpp modules and the new R ReferenceClasses interact, and how Rcpp sugar lets us write C++ code that is often as expressive as R code. Another possible topic, time permitting, may be writing glue code to extend Rcpp to other C++ projects.

We also expect to leave some time to discuss problems brought by the class participants.

October 8, 2011 – San Franciso

AMA Executive Conference Center
@ the Marriott Hotel
55 4th Street, 2nd Level
San Francisco, CA 94103
Tel. 415-442-6770

Instructor Bio

Dirk E has been contributing packages to CRAN for nearly a decade. Among these are RQuantLib, digest, littler, random, RPostgreSQL, as well the Rcpp family of packages comprising Rcpp, RInside, RcppClassic, RcppExamples, RcppDE, RcppArmadillo and RcppEigen. He maintains the CRAN Task Views for Finance as well as High-Performance Computing, and is a founding co-organiser of the annual R / Finance conferences in Chicago. He has Ph.D. in Financial Econometrics from EHESS (Paris), and works in Chicago as a Quantitative Strategist.

Jaspersoft releasing new version – 4.2

Jaspersoft is planning to launch its version 4.2 to the world.

http://www.jaspersoft.com/event/upcoming-webinar-introducing-jaspersoft-42?elq=c0e7a97601f84a8399b1abc5cc84bbe5

Upcoming Webinar: Introducing Jaspersoft 4.2

Webinar

Date: September 29, 2011
Time: 10:00 AM PT/1:00 PM ET
Duration: 60 minutes
Language: English

Whether your building business intelligence (BI) solutions for your organization or for your customers, one thing is likely: your users want access to information anytime, anywhere. The challenge is getting the right information, to the right person, on the right device, without breaking your budget.

You can see precisely what we mean at the Jaspersoft 4.2 launch webinar on Thursday, September 29th.

Join us and see how Jaspersoft 4.2 can deliver superior choice for organizations looking to deliver information to end users, wherever they are. Jaspersoft is focused on providing modern, usable, affordable BI for everyone.
●      Discover the new product capabilities that will improve BI access for your users
●      See Jaspersoft 4.2 live demos
●      Join Japersoft experts and fellow technology professionals in a real-time, interactive discussion.

Register to reserve your seat, today!

Denial of Service Attacks against Hospitals and Emergency Rooms

One of the most frightening possibilities of cyber warfare is to use remotely deployed , or timed intrusion malware to disturb, distort, deny health care services.

Computer Virus Shuts Down Georgia Hospital

A doctor in an Emergency Room depends on critical information that may save lives if it is electronic and comes on time. However this electronic information can be distorted (which is more severe than deleting it)

The electronic system of a Hospital can also be overwhelmed. If there can be built Stuxnet worms on nuclear centrifuge systems (like those by Siemens), then the widespread availability of health care systems means these can be reverse engineered for particularly vicious cyber worms.

An example of prime area for targeting is Veterans Administration for veterans of armed forces, but also cyber attacks against electronic health records.

Consider the following data points-

http://threatpost.com/en_us/blogs/dhs-warns-about-threat-mobile-devices-healthcare-051612

May 16, 2012, 9:03AM

DHS’s National Cybersecurity and Communications Integration Center (NCCIC) issued the unclassfied bulletin, “Attack Surface: Healthcare and Public Health Sector” on May 4. In it, DHS warns of a wide range of security risks, including that could expose patient data to malicious attackers, or make hospital networks and first responders subject to disruptive cyber attack

http://publicintelligence.net/nccic-medical-device-cyberattacks/

National Cybersecurity and Communications Integration Center Bulletin

The Healthcare and Public Health (HPH) sector is a multi-trillion dollar industry employing over 13 million personnel, including approximately five million first-responders with at least some emergency medical training, three million registered nurses, and more than 800,000 physicians.

(U) A significant portion of products used in patient care and management including diagnosis and treatment are Medical Devices (MD). These MDs are designed to monitor changes to a patient’s health and may be implanted or external. The Food and Drug Administration (FDA) regulates devices from design to sale and some aspects of the relationship between manufacturers and the MDs after sale. However, the FDA cannot regulate MD use or users, which includes how they are linked to or configured within networks. Typically, modern MDs are not designed to be accessed remotely; instead they are intended to be networked at their point of use. However, the flexibility and scalability of wireless networking makes wireless access a convenient option for organizations deploying MDs within their facilities. This robust sector has led the way with medical based technology options for both patient care and data handling.

(U) The expanded use of wireless technology on the enterprise network of medical facilities and the wireless utilization of MDs opens up both new opportunities and new vulnerabilities to patients and medical facilities. Since wireless MDs are now connected to Medical information technology (IT) networks, IT networks are now remotely accessible through the MD. This may be a desirable development, but the communications security of MDs to protect against theft of medical information and malicious intrusion is now becoming a major concern. In addition, many HPH organizations are leveraging mobile technologies to enhance operations. The storage capacity, fast computing speeds, ease of use, and portability render mobile devices an optimal solution.

(U) This Bulletin highlights how the portability and remote connectivity of MDs introduce additional risk into Medical IT networks and failure to implement a robust security program will impact the organization’s ability to protect patients and their medical information from intentional and unintentional loss or damage.

…

(U) According to Health and Human Services (HHS), a major concern to the Healthcare and Public Health (HPH) Sector is exploitation of potential vulnerabilities of medical devices on Medical IT networks (public, private and domestic). These vulnerabilities may result in possible risks to patient safety and theft or loss of medical information due to the inadequate incorporation of IT products, patient management products and medical devices onto Medical IT Networks. Misconfigured networks or poor security practices may increase the risk of compromised medical devices. HHS states there are four factors which further complicate security resilience within a medical organization.

1. (U) There are legacy medical devices deployed prior to enactment of the Medical Device Law in 1976, that are still in use today.

2. (U) Many newer devices have undergone rigorous FDA testing procedures and come equipped with design features which facilitate their safe incorporation onto Medical IT networks. However, these secure design features may not be implemented during the deployment phase due to complexity of the technology or the lack of knowledge about the capabilities. Because the technology is so new, there may not be an authoritative understanding of how to properly secure it, leaving open the possibilities for exploitation through zero-day vulnerabilities or insecure deployment configurations. In addition, new or robust features, such as custom applications, may also mean an increased amount of third party code development which may create vulnerabilities, if not evaluated properly. Prior to enactment of the law, the FDA required minimal testing before placing on the market. It is challenging to localize and mitigate threats within this group of legacy equipment.

3. (U) In an era of budgetary restraints, healthcare facilities frequently prioritize more traditional programs and operational considerations over network security.

4. (U) Because these medical devices may contain sensitive or privacy information, system owners may be reluctant to allow manufactures access for upgrades or updates. Failure to install updates lays a foundation for increasingly ineffective threat mitigation as time passes.

…

(U) Implantable Medical Devices (IMD): Some medical computing devices are designed to be implanted within the body to collect, store, analyze and then act on large amounts of information. These IMDs have incorporated network communications capabilities to increase their usefulness. Legacy implanted medical devices still in use today were manufactured when security was not yet a priority. Some of these devices have older proprietary operating systems that are not vulnerable to common malware and so are not supported by newer antivirus software. However, many are vulnerable to cyber attacks by a malicious actor who can take advantage of routine software update capabilities to gain access and, thereafter, manipulate the implant.

(U) During an August 2011 Black Hat conference, a security researcher demonstrated how an outside actor can shut off or alter the settings of an insulin pump without the user’s knowledge. The demonstration was given to show the audience that the pump’s cyber vulnerabilities could lead to severe consequences. The researcher that provided the demonstration is a diabetic and personally aware of the implications of this activity. The researcher also found that a malicious actor can eavesdrop on a continuous glucose monitor’s (CGM) transmission by using an oscilloscope, but device settings could not be reprogrammed. The researcher acknowledged that he was not able to completely assume remote control or modify the programming of the CGM, but he was able to disrupt and jam the device.

http://www.healthreformwatch.com/category/electronic-medical-records/

February 7, 2012

Since the data breach notification regulations by HHS went into effect in September 2009, 385 incidents affecting 500 or more individuals have been reported to HHS, according to its website.

http://www.darkdaily.com/cyber-attacks-against-internet-enabled-medical-devices-are-new-threat-to-clinical-pathology-laboratories-215#axzz1yPzItOFc

February 16 2011

One high-profile healthcare system that regularly experiences such attacks is the Veterans Administration (VA). For two years, the VA has been fighting a cyber battle against illegal and unwanted intrusions into their medical devices

http://www.mobiledia.com/news/120863.html

DEC 16, 2011

Malware in a Georgia hospital’s computer system forced it to turn away patients, highlighting the problems and vulnerabilities of computerized systems.

The computer infection started to cause problems at the Gwinnett Medical Center last Wednesday and continued to spread, until the hospital was forced to send all non-emergency admissions to other hospitals.

More doctors and nurses than ever are using mobile devices in healthcare, and hospitals are making patient records computerized for easier, convenient access over piles of paperwork.

http://www.doctorsofusc.com/uscdocs/locations/lac-usc-medical-center

As one of the busiest public hospitals in the western United States, LAC+USC Medical Center records nearly 39,000 inpatient discharges, 150,000 emergency department visits, and 1 million ambulatory care visits each year.

http://www.healthreformwatch.com/category/electronic-medical-records/

If one jumbo jet crashed in the US each day for a week, we’d expect the FAA to shut down the industry until the problem was figured out. But in our health care system, roughly 250 people die each day due to preventable error

http://www.pcworld.com/article/142926/are_healthcare_organizations_under_cyberattack.html

Feb 28, 2008

“There is definitely an uptick in attacks,” says Dr. John Halamka, CIO at both Beth Israel Deaconess Medical Center and Harvard Medical School in the Boston area. “Privacy is the foundation of everything we do. We don’t want to be the TJX of healthcare.” TJX is the Framingham, Mass-based retailer which last year disclosed a massive data breach involving customer records.

Dr. Halamka, who this week announced a project in electronic health records as an online service to the 300 doctors in the Beth Israel Deaconess Physicians Organization,

Interview Markus Schmidberger ,Cloudnumbers.com

Here is an interview with Markus Schmidberger, Senior Community Manager for cloudnumbers.com. Cloudnumbers.com is the exciting new cloud startup for scientific computing. It basically enables transition to a R and other platforms in the cloud and makes it very easy and secure from the traditional desktop/server model of operation.

Ajay- Describe the startup story for setting up Cloudnumbers.com

Markus- In 2010 the company founders Erik Muttersbach (TU München), Markus Fensterer (TU München) and Moritz v. Petersdorff-Campen (WHU Vallendar) started with the development of the cloud computing environment. Continue reading “Interview Markus Schmidberger ,Cloudnumbers.com”

Interview Dan Steinberg Founder Salford Systems

Here is an interview with Dan Steinberg, Founder and President of Salford Systems (http://www.salford-systems.com/ )

Ajay- Describe your journey from academia to technology entrepreneurship. What are the key milestones or turning points that you remember.

Dan- When I was in graduate school studying econometrics at Harvard, a number of distinguished professors at Harvard (and MIT) were actively involved in substantial real world activities. Professors that I interacted with, or studied with, or whose software I used became involved in the creation of such companies as Sun Microsystems, Data Resources, Inc. or were heavily involved in business consulting through their own companies or other influential consultants. Some not involved in private sector consulting took on substantial roles in government such as membership on the President’s Council of Economic Advisors. The atmosphere was one that encouraged free movement between academia and the private sector so the idea of forming a consulting and software company was quite natural and did not seem in any way inconsistent with being devoted to the advancement of science.

Ajay- What are the latest products by Salford Systems? Any future product plans or modification to work on Big Data analytics, mobile computing and cloud computing.

Dan- Our central set of data mining technologies are CART, MARS, TreeNet, RandomForests, and PRIM, and we have always maintained feature rich logistic regression and linear regression modules. In our latest release scheduled for January 2012 we will be including a new data mining approach to linear and logistic regression allowing for the rapid processing of massive numbers of predictors (e.g., one million columns), with powerful predictor selection and coefficient shrinkage. The new methods allow not only classic techniques such as ridge and lasso regression, but also sub-lasso model sizes. Clear tradeoff diagrams between model complexity (number of predictors) and predictive accuracy allow the modeler to select an ideal balance suitable for their requirements.

The new version of our data mining suite, Salford Predictive Modeler (SPM), also includes two important extensions to the boosted tree technology at the heart of TreeNet. The first, Importance Sampled learning Ensembles (ISLE), is used for the compression of TreeNet tree ensembles. Starting with, say, a 1,000 tree ensemble, the ISLE compression might well reduce this down to 200 reweighted trees. Such compression will be valuable when models need to be executed in real time. The compression rate is always under the modeler’s control, meaning that if a deployed model may only contain, say, 30 trees, then the compression will deliver an optimal 30-tree weighted ensemble. Needless to say, compression of tree ensembles should be expected to be lossy and how much accuracy is lost when extreme compression is desired will vary from case to case. Prior to ISLE, practitioners have simply truncated the ensemble to the maximum allowable size. The new methodology will substantially outperform truncation.

The second major advance is RULEFIT, a rule extraction engine that starts with a TreeNet model and decomposes it into the most interesting and predictive rules. RULEFIT is also a tree ensemble post-processor and offers the possibility of improving on the original TreeNet predictive performance. One can think of the rule extraction as an alternative way to explain and interpret an otherwise complex multi-tree model. The rules extracted are similar conceptually to the terminal nodes of a CART tree but the various rules will not refer to mutually exclusive regions of the data.

Ajay- You have led teams that have won multiple data mining competitions. What are some of your favorite techniques or approaches to a data mining problem.

Dan- We only enter competitions involving problems for which our technology is suitable, generally, classification and regression. In these areas, we are partial to TreeNet because it is such a capable and robust learning machine. However, we always find great value in analyzing many aspects of a data set with CART, especially when we require a compact and easy to understand story about the data. CART is exceptionally well suited to the discovery of errors in data, often revealing errors created by the competition organizers themselves. More than once, our reports of data problems have been responsible for the competition organizer’s decision to issue a corrected version of the data and we have been the only group to discover the problem.

In general, tackling a data mining competition is no different than tackling any analytical challenge. You must start with a solid conceptual grasp of the problem and the actual objectives, and the nature and limitations of the data. Following that comes feature extraction, the selection of a modeling strategy (or strategies), and then extensive experimentation to learn what works best.

Ajay- I know you have created your own software. But are there other software that you use or liked to use?

Dan- For analytics we frequently test open source software to make sure that our tools will in fact deliver the superior performance we advertise. In general, if a problem clearly requires technology other than that offered by Salford, we advise clients to seek other consultants expert in that other technology.

Ajay- Your software is installed at 3500 sites including 400 universities as per http://www.salford-systems.com/company/aboutus/index.html What is the key to managing and keeping so many customers happy?

Dan- First, we have taken great pains to make our software reliable and we make every effort to avoid problems related to bugs. Our testing procedures are extensive and we have experts dedicated to stress-testing software . Second, our interface is designed to be natural, intuitive, and easy to use, so the challenges to the new user are minimized. Also, clear documentation, help files, and training videos round out how we allow the user to look after themselves. Should a client need to contact us we try to achieve 24-hour turn around on tech support issues and monitor all tech support activity to ensure timeliness, accuracy, and helpfulness of our responses. WebEx/GotoMeeting and other internet based contact permit real time interaction.

Ajay- What do you do to relax and unwind?

Dan- I am in the gym almost every day combining weight and cardio training. No matter how tired I am before the workout I always come out energized so locating a good gym during my extensive travels is a must. I am also actively learning Portuguese so I look to watch a Brazilian TV show or Portuguese dubbed movie when I have time; I almost never watch any form of video unless it is available in Portuguese.

Biography-

http://www.salford-systems.com/blog/dan-steinberg.html

Dan Steinberg, President and Founder of Salford Systems, is a well-respected member of the statistics and econometrics communities. In 1992, he developed the first PC-based implementation of the original CART procedure, working in concert with Leo Breiman, Richard Olshen, Charles Stone and Jerome Friedman. In addition, he has provided consulting services on a number of biomedical and market research projects, which have sparked further innovations in the CART program and methodology.

Dr. Steinberg received his Ph.D. in Economics from Harvard University, and has given full day presentations on data mining for the American Marketing Association, the Direct Marketing Association and the American Statistical Association. After earning a PhD in Econometrics at Harvard Steinberg began his professional career as a Member of the Technical Staff at Bell Labs, Murray Hill, and then as Assistant Professor of Economics at the University of California, San Diego. A book he co-authored on Classification and Regression Trees was awarded the 1999 Nikkei Quality Control Literature Prize in Japan for excellence in statistical literature promoting the improvement of industrial quality control and management.

His consulting experience at Salford Systems has included complex modeling projects for major banks worldwide, including Citibank, Chase, American Express, Credit Suisse, and has included projects in Europe, Australia, New Zealand, Malaysia, Korea, Japan and Brazil. Steinberg led the teams that won first place awards in the KDDCup 2000, and the 2002 Duke/TeraData Churn modeling competition, and the teams that won awards in the PAKDD competitions of 2006 and 2007. He has published papers in economics, econometrics, computer science journals, and contributes actively to the ongoing research and development at Salford.

Using Google Fusion Tables from #rstats

But after all that- I was quite happy to see Google Fusion Tables within Google Docs. Databases as a service ? Not quite but still quite good, and lets see how it goes.

https://www.google.com/fusiontables/DataSource?dsrcid=implicit&hl=en_US&pli=1

http://googlesystem.blogspot.com/2011/09/fusion-tables-new-google-docs-app.html

But what interests me more is

http://code.google.com/apis/fusiontables/docs/developers_guide.html

The Google Fusion Tables API is a set of statements that you can use to search for and retrieve Google Fusion Tables data, insert new data, update existing data, and delete data. The API statements are sent to the Google Fusion Tables server using HTTP GET requests (for queries) and POST requests (for inserts, updates, and deletes) from a Web client application. The API is language agnostic: you can write your program in any language you prefer, as long as it provides some way to embed the API calls in HTTP requests.

The Google Fusion Tables API does not provide the mechanism for submitting the GET and POST requests. Typically, you will use an existing code library that provides such functionality; for example, the code libraries that have been developed for the Google GData API. You can also write your own code to implement GET and POST requests.

Also see http://code.google.com/apis/fusiontables/docs/sample_code.html

Google Fusion Tables API Sample Code

Libraries

SQL API

Language	Library	Public repository	Samples
Python	Fusion Tables Python Client Library	fusion-tables-client-python/	Samples
PHP	Fusion Tables PHP Client Library	fusion-tables-client-php/	Samples

Featured Samples

An easy way to learn how to use an API can be to look at sample code. The table above provides links to some basic samples for each of the languages shown. This section highlights particularly interesting samples for the Fusion Tables API.

SQL API

Language	Featured samples	API version
cURL	Hello, cURLA simple example showing how to use curl to access Fusion Tables.	SQL API
Google Apps Script	Google Apps Script ExampleA simple example showing how to run the SHOW TABLES query in Google Spreadsheets.	SQL API
Java	Hello, WorldA simple walkthrough that shows how the Google Fusion Tables API statements work. OAuth example on fusion-tables-apiThe Google Fusion Tables team shows how OAuth authorization enables you to use the Google Fusion Tables API from a foreign web server with delegated authorization.	SQL API
Python	Docs List ExampleDemonstrates how to: List tables Set permissions on tables Move a table to a folder	Docs List API
Android (Java)	Basic Sample ApplicationDemo application shows how to create a crowd-sourcing application that allows users to report potholes and save the data to a Fusion Table.	SQL API
JavaScript – FusionTablesLayer	Using the FusionTablesLayer, you can display data on a Google Map The basics Basic FusionTablesLayer example Multiple Layers per map Heatmap Changing the layer query Using a select menu Using a slider On zoom Using checkboxes Autocomplete Textbox Spatial Queries (see also “Working with Geographic Data”) Find features within a radius Find features within a bounding box Find the n nearest features Dynamic Styling Basic Styling Demo Multiple Columns (with Legend!) Advanced Custom Info Windows Info Window with Driving Directions Custom Marker Icons Legend Pan and Zoom to User-Entered Address Also check out FusionTablesLayer Builder, which generates all the code necessary to include a Google Map with a Fusion Table Layer on your own website.	FusionTablesLayer, Google Maps API
JavaScript – Google Chart Tools	Using the Google Chart Tools, you can request data from Fusion Tables to use in visualizations or to display directly in an HTML page. Note: responses are limited to 500 rows of data. Basic Request Data Table Visualization Bar Chart Visualization Word Cloud Visualization	Google Chart Tools

External Resources

Google Fusion Tables is dedicated to providing code examples that illustrate typical uses, best practices, and really cool tricks. If you do something with the Google Fusion Tables API that you think would be interesting to others, please contact us at googletables-feedback@google.com about adding your code to our Examples page.

Shape EscapeA tool for uploading shape files to Fusion Tables.
GDALOGR Simple Feature Library has incorporated Fusion Tables as a supported format.
Arc2CloudArc2Earth has included support for upload to Fusion Tables via Arc2Cloud.
Java and Google App EngineODK Aggregate is an AppEngine application by the Open Data Kit team, uses Google Fusion Tables to store survey data that is collected through input forms on Android mobile phones. Notable code:
- Create a Fusion Table – FusionTableServlet.java
- Insert data into a Fusion Table – SubmissionFusionTable.java
R packageAndrei Lopatenko has written an R interface to Fusion Tables so Fusion Tables can be used as the data store for R.
RubySimon Tokumine has written a Ruby gem for access to Fusion Tables from Ruby.

Updated-You can use Google Fusion Tables from within R from http://andrei.lopatenko.com/rstat/fusion-tables.R

ft.connect <- function(username, password) {
  url = "https://www.google.com/accounts/ClientLogin";
  params = list(Email = username, Passwd = password, accountType="GOOGLE", service= "fusiontables", source = "R_client_API")
 connection = postForm(uri = url, .params = params)
 if (length(grep("error", connection, ignore.case = TRUE))) {
 	stop("The wrong username or password")
 	return ("")
 }
 authn = strsplit(connection, "\nAuth=")[[c(1,2)]]
 auth = strsplit(authn, "\n")[[c(1,1)]]
 return (auth)
}

ft.disconnect <- function(connection) {
}

ft.executestatement <- function(auth, statement) {
      url = "http://tables.googlelabs.com/api/query"
      params = list( sql = statement)
      connection.string = paste("GoogleLogin auth=", auth, sep="")
      opts = list( httpheader = c("Authorization" = connection.string))
      result = postForm(uri = url, .params = params, .opts = opts)
      if (length(grep("<HTML>\n<HEAD>\n<TITLE>Parse error", result, ignore.case = TRUE))) {
      	stop(paste("incorrect sql statement:", statement))
      }
      return (result)
}

ft.showtables <- function(auth) {
   url = "http://tables.googlelabs.com/api/query"
   params = list( sql = "SHOW TABLES")
   connection.string = paste("GoogleLogin auth=", auth, sep="")
   opts = list( httpheader = c("Authorization" = connection.string))
   result = getForm(uri = url, .params = params, .opts = opts)
   tables = strsplit(result, "\n")
   tableid = c()
   tablename = c()
   for (i in 2:length(tables[[1]])) {
     	str = tables[[c(1,i)]]
   	    tnames = strsplit(str,",")
   	    tableid[i-1] = tnames[[c(1,1)]]
   	    tablename[i-1] = tnames[[c(1,2)]]
   	}
   	tables = data.frame( ids = tableid, names = tablename)
    return (tables)
}

ft.describetablebyid <- function(auth, tid) {
   url = "http://tables.googlelabs.com/api/query"
   params = list( sql = paste("DESCRIBE", tid))
   connection.string = paste("GoogleLogin auth=", auth, sep="")
   opts = list( httpheader = c("Authorization" = connection.string))
   result = getForm(uri = url, .params = params, .opts = opts)
   columns = strsplit(result,"\n")
   colid = c()
   colname = c()
   coltype = c()
   for (i in 2:length(columns[[1]])) {
     	str = columns[[c(1,i)]]
   	    cnames = strsplit(str,",")
   	    colid[i-1] = cnames[[c(1,1)]]
   	    colname[i-1] = cnames[[c(1,2)]]
   	    coltype[i-1] = cnames[[c(1,3)]]
   	}
   	cols = data.frame(ids = colid, names = colname, types = coltype)
    return (cols)
}

ft.describetable <- function (auth, table_name) {
   table_id = ft.idfromtablename(auth, table_name)
   result = ft.describetablebyid(auth, table_id)
   return (result)
}

ft.idfromtablename <- function(auth, table_name) {
    tables = ft.showtables(auth)
	tableid = tables$ids[tables$names == table_name]
	return (tableid)
}

ft.importdata <- function(auth, table_name) {
	tableid = ft.idfromtablename(auth, table_name)
	columns = ft.describetablebyid(auth, tableid)
	column_spec = ""
	for (i in 1:length(columns)) {
		column_spec = paste(column_spec, columns[i, 2])
		if (i < length(columns)) {
			column_spec = paste(column_spec, ",", sep="")
		}
	}
	mdata = matrix(columns$names,
	              nrow = 1, ncol = length(columns),
	              dimnames(list(c("dummy"), columns$names)), byrow=TRUE)
	select = paste("SELECT", column_spec)
	select = paste(select, "FROM")
	select = paste(select, tableid)
	result = ft.executestatement(auth, select)
    numcols = length(columns)
    rows = strsplit(result, "\n")
    for (i in 3:length(rows[[1]])) {
    	row = strsplit(rows[[c(1,i)]], ",")
    	mdata = rbind(mdata, row[[1]])
   	}
   	output.frame = data.frame(mdata[2:length(mdata[,1]), 1])
   	for (i in 2:ncol(mdata)) {
   		output.frame = cbind(output.frame, mdata[2:length(mdata[,i]),i])
   	}
   	colnames(output.frame) = columns$names
    return (output.frame)
}

quote_value <- function(value, to_quote = FALSE, quote = "'") {
	 ret_value = ""
     if (to_quote) {
     	ret_value = paste(quote, paste(value, quote, sep=""), sep="")
     } else {
     	ret_value = value
     }
     return (ret_value)
}

converttostring <- function(arr, separator = ", ", column_types) {
	con_string = ""
	for (i in 1:(length(arr) - 1)) {
		value = quote_value(arr[i], column_types[i] != "number")
		con_string = paste(con_string, value)
	    con_string = paste(con_string, separator, sep="")
	}

    if (length(arr) >= 1) {
    	value = quote_value(arr[length(arr)], column_types[length(arr)] != "NUMBER")
    	con_string = paste(con_string, value)
    }
}

ft.exportdata <- function(auth, input_frame, table_name, create_table) {
	if (create_table) {
       create.table = "CREATE TABLE "
       create.table = paste(create.table, table_name)
       create.table = paste(create.table, "(")
       cnames = colnames(input_frame)
       for (columnname in cnames) {
         create.table = paste(create.table, columnname)
    	 create.table = paste(create.table, ":string", sep="")
    	   if (columnname != cnames[length(cnames)]){
    		  create.table = paste(create.table, ",", sep="")
           }
       }
      create.table = paste(create.table, ")")
      result = ft.executestatement(auth, create.table)
    }
    if (length(input_frame[,1]) > 0) {
    	tableid = ft.idfromtablename(auth, table_name)
	    columns = ft.describetablebyid(auth, tableid)
	    column_spec = ""
	    for (i in 1:length(columns$names)) {
		   column_spec = paste(column_spec, columns[i, 2])
		   if (i < length(columns$names)) {
			  column_spec = paste(column_spec, ",", sep="")
		   }
	    }
    	insert_prefix = "INSERT INTO "
    	insert_prefix = paste(insert_prefix, tableid)
    	insert_prefix = paste(insert_prefix, "(")
    	insert_prefix = paste(insert_prefix, column_spec)
    	insert_prefix = paste(insert_prefix, ") values (")
    	insert_suffix = ");"
    	insert_sql_big = ""
    	for (i in 1:length(input_frame[,1])) {
    		data = unlist(input_frame[i,])
    		values = converttostring(data, column_types  = columns$types)
    		insert_sql = paste(insert_prefix, values)
    		insert_sql = paste(insert_sql, insert_suffix) ;
    		insert_sql_big = paste(insert_sql_big, insert_sql)
    		if (i %% 500 == 0) {
    			ft.executestatement(auth, insert_sql_big)
    			insert_sql_big = ""
    		}
    	}
        ft.executestatement(auth, insert_sql_big)
    }
}

Please share:

Please share:

Upcoming Webinar: Introducing Jaspersoft 4.2

Webinar

Register to reserve your seat, today!

Please share:

National Cybersecurity and Communications Integration Center Bulletin

Please share:

Please share:

Please share:

Google Fusion Tables API Sample Code

Libraries

SQL API

Featured Samples

SQL API

External Resources

Please share: