Predictive Analytics World Conference –New York City and London, UK

Please use the following code to get a 15% discount on the 2 Day Conference Pass: AJAYNY11.

October 17-21, 2011 – New York City, NY (pawcon.com/nyc)
Nov 30 – Dec 1, 2011 – London, UK (pawcon.com/london)

Predictive Analytics World (pawcon.com) is the business-focused event for predictive analytics
professionals, managers and commercial practitioners, covering today’s commercial deployment of
predictive analytics, across industries and across software vendors. The conference delivers case
studies, expertise, and resources to achieve two objectives:

1) Bigger wins: Strengthen the business impact delivered by predictive analytics

2) Broader capabilities: Establish new opportunities with predictive analytics

Case Studies: How the Leading Enterprises Do It

Predictive Analytics World focuses on concrete examples of deployed predictive analytics. The leading
enterprises have signed up to tell their stories, so you can hear from the horse’s mouth precisely how
Fortune 500 analytics competitors and other top practitioners deploy predictive modeling, and what
kind of business impact it delivers.

PAW NEW YORK CITY 2011

PAW’s NYC program is the richest and most diverse yet, featuring over 40 sessions across three tracks
– including both X and Y tracks, and an “Expert/Practitioner” track — so you can witness how predictive
analytics is applied at major companies.

PAW NYC’s agenda covers hot topics and advanced methods such as ensemble models, social data,
search marketing, crowdsourcing, blackbox trading, fraud detection, risk management, survey analysis,
and other innovative applications that benefit organizations in new and creative ways.

WORKSHOPS: PAW NYC also features five full-day pre- and post-conference workshops that
complement the core conference program. Workshop agendas include advanced predictive modeling
methods, hands-on training, an intro to R (the open source analytics system), and enterprise decision
management.

For more see http://www.predictiveanalyticsworld.com/newyork/2011/

PAW LONDON 2011

PAW London’s agenda covers hot topics and advanced methods such as risk management, uplift
(incremental lift) modeling, open source analytics, and crowdsourcing data mining. Case study
presentations cover campaign targeting, churn modeling, next-best-offer, selecting marketing channels,
global analytics deployment, email marketing, HR candidate search, and other innovative applications
that benefit organizations in new and creative ways.

Join PAW and access the best keynotes, sessions, workshops, exposition, expert panel, live demos,
networking coffee breaks, reception, birds-of-a-feather lunches, brand-name enterprise leaders, and
industry heavyweights in the business.

For more see http://www.predictiveanalyticsworld.com/london

CROSS-INDUSTRY APPLICATIONS

Predictive Analytics World is the only conference of its kind, delivering vendor-neutral sessions across
verticals such as banking, financial services, e-commerce, education, government, healthcare, high
technology, insurance, non-profits, publishing, social gaming, retail and telecommunications.

And PAW covers the gamut of commercial applications of predictive analytics, including response
modeling, customer retention with churn modeling, product recommendations, fraud detection, online
marketing optimization, human resource decision-making, law enforcement, sales forecasting, and
credit scoring.

Why bring together such a wide range of endeavors? No matter how you use predictive analytics, the
story is the same: Predictively scoring customers optimizes business performance. Predictive analytics
initiatives across industries leverage the same core predictive modeling technology, share similar project
overhead and data requirements, and face common process challenges and analytical hurdles.

RAVE REVIEWS:

“Hands down, best applied, analytics conference I have ever attended. Great exposure to cutting-edge
predictive techniques and I was able to turn around and apply some of those learnings to my work
immediately. I’ve never been able to say that after any conference I’ve attended before!”

Jon Francis
Senior Statistician
T-Mobile

Read more: Articles and blog entries about PAW can be found at http://www.predictiveanalyticsworld.com/pressroom.php

VENDORS. Meet the vendors and learn about their solutions, software and service. Discover the best
predictive analytics vendors available to serve your needs – learn what they do and see how they
compare.

COLLEAGUES. Mingle, network and hang out with your best and brightest colleagues. Exchange
experiences over lunch, coffee breaks and the conference reception connecting with those professionals
who face the same challenges as you.

GET STARTED. If you’re new to predictive analytics, kicking off a new initiative, or exploring new ways
to position it at your organization, there’s no better place to get your bearings than Predictive Analytics
World. See what other companies are doing, witness vendor demos, participate in discussions with the
experts, network with your colleagues and weigh your options!

For more information:
http://www.predictiveanalyticsworld.com

View videos of PAW Washington DC, Oct 2010 — now available on-demand:
http://www.predictiveanalyticsworld.com/online-video.php

What is predictive analytics? See the Predictive Analytics Guide:
http://www.predictiveanalyticsworld.com/predictive_analytics.php

If you’d like our informative event updates, sign up at:
http://www.predictiveanalyticsworld.com/signup-us.php

To sign up for the PAW group on LinkedIn, see:
http://www.linkedin.com/e/gis/1005097

For inquiries e-mail regsupport@risingmedia.com or call (717) 798-3495.

Contest for SAS Users and Students

Here’s a new contest for SAS users. The prizes are books, so students should be interested as well.

From http://www.sascommunity.org/mwiki/images/b/bc/PointsforprizesRules.pdf

HOW TO ENTER: To qualify for entry, go to the sasCommunity.org web site located at http://www.sascommunity.org/wiki/Main_Page
between April 11, 2011 and May 9, 2011 and either add or edit valid content as described herein to earn award points.
Creation of a first time profile on www.sascommunity.org will earn 1,000 points. For each valid article creation or edit, 100
points will be earned. Articles and subsequent edits should adhere to the sasCommunity.org terms of use as outlined on
http://www.sascommunity.org/wiki/sasCommunity:Terms_of_Use. All points’ accumulation will end at 5:00 PM GMT on
May 9, 2011 and only those points earned between 8:00 AM GMT on April 11, 2011 and 5:00 PM GMT on May 9, 2011
will be counted in this contest. Contest entries made through the Internet will be declared made by the registered user of
the sasCommunity.org profile account. Sponsor is not responsible for phone, technical, network, electronic, computer
hardware or software failures of any kind, misdirected, incomplete, garbled or delayed transmissions. Sponsor will not be
responsible for incorrect or inaccurate entry information, whether caused by entrants or by any of the equipment or
programming associated with or utilized in the contest.
ELIGIBILITY: The contest is open to all sasCommunity.org members 18 years of age or older on the start date of the
contest. Void where prohibited by law. Employees (including immediate family members and/or those living in the same 
household of each), the Sponsor, members of the sasCommunity.org Advisory Board, SAS Global Users Group Executive 
Board, their advertising, promotion and production agencies, the affiliated companies of each, and the immediate family 
members of each are not eligible. 

PRIZE: Three (3) prizes will be awarded based on total points accumulated during the contest as follows:
 1st Place: 3 SAS® Press books - not to exceed $250 in combined retail value;
 2nd Place: 2 SAS® Press books - not to exceed $150 in combined retail value; and
 3rd Place: 1 SAS® Press book - not to exceed $100 in retail value.

What’s New

http://www.sascommunity.org/wiki/Main_Page

New Points for Prizes Contest
Points for Prizes Contest
Win SAS books!
Contribute content or SAS code to sasCommunity.org for your chance to WIN! To qualify, simply add or edit articles between April 11, 2011 and May 9, 2011 (GMT). Creation of a first-time profile on sasCommunity.org gives you 1,000 points. For each valid article creation or edit, 100 points will be earned. The user with the most points collected during this time wins SAS Press Books!

Become a sasCommunity Guru
Thanks for Contributing to sasCommunity.org!
New sasCommunity.org Point System
The sasCommunity support team has been hard at work adding new features and is pleased to announce a points system that recognizes each user’s contributions to the site. Every time you contribute by creating a page, updating it, or just doing a little wiki gardening, you earn points. Earning points is automatic and simple – all you have to do is contribute! Creating your account starts you with 1000 points and all the current users have been credited with points dating back to the site coming online in April 2007.

Comparing Bit Torrent Downloaders

I personally like UTorrent on Windows and KTorrent on Linux.

While I am no expert on this, I like anything that gets the data down faster while maximizing my pipe’s efficiency.

I also prefer torrenting to the sudo apt-get method of downloading software, or the zip/unzip, tar/untar, install/make routine.

Torrenting is a simpler way of sharing applications, but sadly it is not used much by the stats computing community to share downloads.

Also, I think any dashboard or visualization should be sorted (not alphabetically, but numerically or categorically).

SORT THE DASHBOARD. KEEP IT SORTED.

So I am partially recreating, after sorting, the data viz from http://en.wikipedia.org/wiki/Comparison_of_BitTorrent_clients

BitTorrent client | Magnet URI | Super-seeding | Embedded tracker | UPnP[81] | NAT Port Mapping Protocol | NAT traversal[82] | DHT[83] | Peer exchange | Encryption | UDP tracker | LPD
µTorrent | Yes | Yes[95] | Yes[96] | Yes[97] | Yes | Yes[98] | Yes[99] | Yes[85] | Yes[100] | Yes | Yes[101]
BitSpirit[11] | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No
BitTorrent 6 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes[85] | Yes | Yes | Yes
OneSwarm | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No
qBittorrent | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
SoMud | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Vuze (formerly Azureus) | Yes | Yes | Yes | Yes | Yes | Yes[102] | Yes[87] | Yes | Yes | Yes | No
BitComet | Yes | Yes | Separate download | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No
Tixati[43] | Yes | Yes | No | Yes | No | No | Yes | Yes | Yes | Yes | Partial
Aria2 | Yes | No | Yes | No | No | No | Yes | Yes | Yes | Yes | Yes
Tribler | Yes | No | Yes | Yes | Yes | No | Yes | Yes | Yes | No | No
Bitflu | Yes | No | No | No | No | No | Yes | Yes | No | Yes | No
Deluge | Yes | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Flush | Yes | No | No | Yes | Yes | No | Yes | Yes | No | No | Yes
KTorrent | Yes | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Partial
Shareaza | Yes | No | No | Yes | Yes | No | Yes[93] | Yes | No | No | No
Transmission | Yes | No | No | Yes | Yes | Yes | Yes | Yes[94] | Yes | No | Yes
LimeWire | Partial | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | No
BitTyrant | No | Yes[citation needed] | Yes | Yes | Yes | Yes[86] | Yes[87] | Yes | Yes | No | No
BitTornado | No | Yes | Yes[84] | Yes | No | No | No | No | Yes | No | No
Torrent Swapper | No | Yes | Yes[84] | Yes | No | No | No | Yes | No | No | No
Localhost | No | Yes | Yes | Yes | No | Yes | Yes[89] | No | No | No | No
Meerkat Bittorrent Client | No | Yes | No | Yes | Yes | Yes | Yes | No | Yes | No | No
rTorrent | No | Yes | No | No | No | No | Yes | Yes | Yes | Yes | No[92]
TorrentFlux | No | Yes | No | Yes | No | No | No | No | Yes | No | No
TorrentVolve | No | Partial[76] | No | Partial[76] | Partial[76] | Partial[76] | Partial[76] | Partial[76] | Partial[76] | Partial[76] | No
Opera | No | No | Yes[90] | No | No | No | No | Yes[91] | No | No | No
BitTorrent 5 / Mainline | No | No | Yes[84] | Yes | Yes | No | Yes | Yes | Yes | No | No
ABC | No | No | Yes | Yes | No | No | No | No | No | No | No
Blog Torrent | No | No | Yes | No | No | No | No | No | No | No | No
MLDonkey | No | No | Yes | Yes | Yes | No | No | No | No | Yes | No
Tomato Torrent | No | No | Yes | No | No | No | Yes | No | No | No | No
Acquisition | No | No | No | No | Yes | No | No | No | No | No | No
Arctic Torrent | No | No | No | No | No | No | No | Yes | No | No | No
BitLet | No | No | No | Yes | No | No | No | No | No | No | No
BitLord | No | No | No | Yes | No | Yes | No | Yes | No | Yes | No
BitThief | No | No | No | No | No | No | No | No | No | No | No
Bits on Wheels | No | No | No | No | No | No | No | No | No | No | No
BTG | No | No | No | Yes | Yes | No | Yes | Yes | Yes | Yes | No
BTPD | No | No | No | No | No | No | No | No | No | No | No
FlashGet | No | No | No | No | No | No | Yes | No | Yes | No | No
Folx | No | No | No | Yes | Yes | No | Yes | Yes | No | Yes | No
Free Download Manager | No | No | No | No | No | No | Yes | Yes | No | No | No
G3 Torrent | No | No | No | No | No | No | No | No | No | No | No
Gnome BitTorrent | No | No | No | No | No | No | No | No | No | No | No
Halite | No | No | No | Yes | Yes | No | Yes | No | Yes | No[88] | No
QTorrent | No | No | No | No | No | No | No | No | No | No | No
Rufus | No | No | No | No | No | No | No | No | No | No | No
SymTorrent | No | No | No | N/A | N/A | N/A | No | No | No | No | No
Tonido Torrent | No | No | No | Yes | Yes | Yes | Yes | No | No | No | No
Torium | No | No | No | Yes | No | No | Yes | No | No | No | No
ZipTorrent | No | No | No | Yes | Yes | No | No | Yes | No | No | No
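
For anyone who wants to redo this kind of sort themselves, here is a minimal Python sketch of one way to get an ordering like the one above: rank each cell value (Yes before Partial before No, which is my assumption about the intended order) and compare the clients feature by feature, left to right. Only five rows and the first three feature columns are copied from the table; the names FEATURES, ROWS, RANK, sort_key and sorted_clients are made up for the illustration.

```python
# A few rows copied from the table above (first three feature columns only).
FEATURES = ["Magnet URI", "Super-seeding", "Embedded tracker"]
ROWS = {
    "KTorrent": ["Yes", "No", "No"],
    "qBittorrent": ["Yes", "Yes", "Yes"],
    "rTorrent": ["No", "Yes", "No"],
    "Tixati": ["Yes", "Yes", "No"],
    "LimeWire": ["Partial", "Yes", "Yes"],
}

# Assumed ranking of cell values: Yes first, then Partial, then No.
RANK = {"Yes": 0, "Partial": 1, "No": 2}


def sort_key(item):
    """Sort key: per-feature ranks left to right, then client name as tie-breaker."""
    name, values = item
    # Unknown values (for example "N/A") sort after everything else.
    return [RANK.get(value, len(RANK)) for value in values], name.lower()


def sorted_clients(rows):
    """Return client names ordered categorically by their feature values."""
    return [name for name, _ in sorted(rows.items(), key=sort_key)]


print(sorted_clients(ROWS))
```

The point of the design is that the sort is categorical rather than alphabetical: clients that support the most features in the leftmost columns float to the top, so the dashboard reads from "does everything" down to "does nothing".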

PySpread Magic

I have just been working with PySpread, on a 1 million by 1 million spreadsheet. Python sure looks promising as the way ahead for stat computing. To install it, you need to run

sudo apt-get install python-numpy python-rpy python-scipy python-gmpy wxpython*

then cd to the untarred bz2 file from http://pyspread.sourceforge.net/download.html and run the installer, like

:~/Downloads$ cd pyspread-0.1.2

:~/Downloads/pyspread-0.1.2$ sudo python setup.py install

http://pyspread.sourceforge.net/

by Martin Manns

 

About

Pyspread is a cross-platform Python spreadsheet application. It is based on and written in the programming language Python.

Instead of spreadsheet formulas, Python expressions are entered into the spreadsheet cells. Each expression returns a Python object that can be accessed from other cells. These objects can represent anything including lists or matrices.

Features
  • Three dimensional grid with up to 85,899,345 rows and 14,316,555 columns (64 bit systems, depends on row height and column width). Note that a million cells require about 500 MB of memory.
  • Complex data types such as lists, trees or matrices within a single cell.
  • Macros for functionalities that are too complex for a single Python expression.
  • Python module access from each cell, which allows:
    • Arbitrary size rational numbers (via gmpy),
    • Fixed point decimal numbers for business calculations, (via the decimal module from the standard library)
    • Advanced statistics including plotting functions (via RPy)
    • Much more via <your favourite module>.
  • CSV import and export
  • Clipboard access
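
To make the "Python expressions in cells" idea concrete, here is a toy sketch in plain Python. It is not pyspread's code or its cell-access syntax; the ToyGrid class and the cell() helper are invented for this illustration, and the standard-library fractions module stands in for gmpy's rational numbers. Each cell stores an expression string, and evaluating a cell returns an ordinary Python object that other cells can read.

```python
from decimal import Decimal
from fractions import Fraction


class ToyGrid:
    """Toy model of a sheet whose cells hold Python expressions (not pyspread's code)."""

    def __init__(self):
        self.expressions = {}  # (row, col) -> expression string

    def put(self, row, col, expression):
        """Store an expression string in a cell."""
        self.expressions[(row, col)] = expression

    def value(self, row, col):
        """Evaluate a cell; `cell(row, col)` inside an expression reads another cell."""
        namespace = {"cell": self.value, "Decimal": Decimal, "Fraction": Fraction}
        return eval(self.expressions[(row, col)], namespace)


grid = ToyGrid()
grid.put(0, 0, "[Decimal('19.99'), Decimal('5.01')]")  # a list inside one cell
grid.put(0, 1, "sum(cell(0, 0))")                       # reads the list from (0, 0)
grid.put(0, 2, "Fraction(1, 3) + Fraction(1, 6)")       # exact rational arithmetic
```

Calling grid.value(0, 1) adds the two Decimal prices exactly, and grid.value(0, 2) gives an exact Fraction rather than a rounded float. The sketch has no cycle detection, and eval runs whatever code it is given, so only feed it expressions you trust.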

Warning

The concept of pyspread allows doing everything from each cell that a Python script can do. This powerful feature has its drawbacks. A spreadsheet may very well delete your hard drive or send your data via the Internet. Of course this is a non-issue if you sandbox properly or if you only use self developed spreadsheets.

Since this is not the case for everyone (see discussion at lwn.net), a GPG signature based trust model for spreadsheet files has been introduced. It ensures that only your own trusted files are executed on loading. Untrusted files are displayed in safe mode. You can approve a file manually. Inspect carefully.

 

How to Analyze Wikileaks Data – R SPARQL

Drew Conway, one of the very, very few Project R voices I used to respect until recently, declared on his blog http://www.drewconway.com/zia/

Why I Will Not Analyze The New WikiLeaks Data

and followed it up with how HE analyzed the post announcing the non-analysis.

“If you have not visited the site in a week or so you will have missed my previous post on analyzing WikiLeaks data, which from the traffic and 35 Comments and 255 Reactions was at least somewhat controversial. Given this rare spotlight I thought it would be fun to use the infochimps API to map out the geo-location of everyone that visited the blog post over the last few days. Unfortunately, after nearly two years with the same web hosting service, only today did I realize that I was not capturing daily log files for my domain”

Anyway, non-American users of R can analyze the WikiLeaks data using the R SPARQL package. I would advise American friends not to use this approach or attempt to analyze any of the data, because technically the data is still classified and its possession is illegal (which is the reason federal employees and organizations receiving federal funds have been advised not to use this or any WikiLeaks dataset).

https://code.google.com/p/r-sparql/

Overview

R is a programming language designed for statistics.

R Sparql allows you to run SPARQL queries inside R and store the results as an R data frame.

The main objective is to allow the integration of Ontologies with Statistics.

It requires Java and rJava installed.

Example (in R console):

> library(sparql)
> data <- query("<SPARQL query>","RDF file or remote SPARQL Endpoint")

and the data is in a remote SPARQL endpoint listed at http://www.ckan.net/package/cablegate

SPARQL is an easy language to pick up, but dammit I am not supposed to blog on my vacations.

http://code.google.com/p/r-sparql/wiki/GettingStarted

Getting Started

1. Installation

1.1 Make sure Java is installed and is the default JVM:

$ sudo apt-get install sun-java6-bin sun-java6-jre sun-java6-jdk
$ sudo update-java-alternatives -s java-6-sun

1.2 Configure R to use the correct version of Java

$ sudo R CMD javareconf

1.3 Install the rJava library

$ R
> install.packages("rJava")
> q()

1.4 Download and install the sparql library

Download: http://code.google.com/p/r-sparql/downloads/list

$ R CMD INSTALL sparql-0.1-X.tar.gz

2. Executing a SPARQL query

2.1 Start R

# Load the library
library(sparql)
# Run the query
result <- query("SELECT ... ", "http://...")
# Print the result
print(result)

3. Examples

3.1 The Query can be a string or a local file:

query("SELECT ?date ?number ?season WHERE {  ... }", "local-file.rdf")
query("my-query.rq", "local-file.rdf")

The package will detect if my-query.rq exists and will load it from the file.
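
The "string or file" behaviour described above can be sketched in a few lines of Python. This only illustrates the idea and is not how the R package itself is implemented; the function name load_query is hypothetical.

```python
import os


def load_query(query_or_path):
    """Return SPARQL text: the file's contents if the argument names an
    existing file, otherwise the argument itself."""
    if os.path.isfile(query_or_path):
        with open(query_or_path, encoding="utf-8") as handle:
            return handle.read()
    return query_or_path
```

So load_query("my-query.rq") would return the file's contents when that file exists in the working directory, and load_query("SELECT ...") simply returns the string unchanged. One consequence of this design is that a misspelled file name is silently treated as query text, so any error shows up later, when the query is parsed.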

3.3 The URI can be a file or a URL (for remote queries):

query("SELECT ... ","local-file.db")
query("SELECT ... ","http://dbpedia.org/sparql")
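
For readers without the R package, a remote query can also be sent by hand. The sketch below uses only the Python standard library and assumes the endpoint follows the standard SPARQL protocol (a GET request with a query parameter) and can return results in the SPARQL JSON results format. The function names build_sparql_request and run_sparql and the sample query are mine, not part of r-sparql.

```python
import json
import urllib.parse
import urllib.request


def build_sparql_request(endpoint, query):
    """Build a GET request that passes the query in the `query` parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_sparql(endpoint, query, timeout=30):
    """Send the query and return the result rows as plain dicts (needs network access)."""
    request = build_sparql_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return [
        {name: binding["value"] for name, binding in row.items()}
        for row in payload["results"]["bindings"]
    ]


# Building the request does not touch the network; run_sparql() does.
request = build_sparql_request(
    "http://dbpedia.org/sparql", "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"
)
print(request.full_url)
```

Each returned row is a dict keyed by variable name, which is roughly the list-of-records shape you would then load into a data frame. The sketch assumes the endpoint URL has no query string of its own.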

3.4 Get some examples here: http://code.google.com/p/r-sparql/downloads/list

SPARQL Tutorial-

http://openjena.org/ARQ/Tutorial/index.html

Also read-

http://webr3.org/blog/linked-data/virtuoso-6-sparqlgeo-and-linked-data/

and from the favorite blog of Project R- Also known as NY Times

http://bits.blogs.nytimes.com/2010/11/15/sorting-through-the-government-data-explosion/?twt=nytimesbits

In May 2009, the Obama administration started putting raw government data on the Web. It started with 47 data sets. Today, there are more than 270,000 government data sets, spanning every imaginable category from public health to foreign aid.

Machine Addictions

in the middle of essential and inevitable tasks
restless inner conscience wakens and asks
stuck again today to the computer are we now
please remind me this state we reached how

oh we had bills to pay student loans to repay
once we got hooked t’was easy to be carried away
just a matter of time before inevitable voices query
this is my machine that I want to marry

I spend more time with him/her as it is
the Machinery is devoted with focused loyalties
meanwhile the non machine world goes round
strives forth on things less profound

as we stroke the keys and click the mouse
machine addictions will only add to human grouse

Rwui: Creating R Web Interfaces on the go

Here is a great R application created by http://sysbio.mrc-bsu.cam.ac.uk

R Wui for creating R Web Interfaces

It’s been there for some time now, but presumably R Apache is better known.

From-

http://sysbio.mrc-bsu.cam.ac.uk/Rwui/tutorial/Rwui_Rnews_final.pdf

The web application Rwui is used to create web interfaces for running R scripts. All the code is generated automatically so that a fully functional web interface for an R script can be downloaded and up and running in a matter of minutes.

Rwui is aimed at R script writers who have scripts that they want people unversed in R to use. The script writer uses Rwui to create a web application that will run their R script. Rwui allows the script writer to do this without them having to do any web application programming, because Rwui generates all the code for them.

The script writer designs the web application to run their R script by entering information on a sequence of web pages. The script writer then downloads the application they have created and installs it on their own server.

http://sysbio.mrc-bsu.cam.ac.uk/Rwui/tutorial/Technical_Report.pdf

Features of web applications created by Rwui

  1. Whole range of input items available if required – text boxes, checkboxes, file upload etc.
  2. Facility for uploading of an arbitrary number of files (for example, microarray replicates).
  3. Facility for grouping uploaded files (for example, into ‘Diseased’ and ‘Control’ microarray data files).
  4. Results files displayed on results page and available for download.
  5. Results files can be e-mailed to the user.
  6. Interactive results files using image maps.
  7. Repeat analyses with different parameters and data files – new results added to results list, as a link to the corresponding results page.
  8. Real time progress information (text or graphical) displayed when running the application.

Requirements

In order to use the completed web applications created by Rwui you will need:

  1. A Java webserver such as Tomcat version 5.5 or later.
  2. Java version 1.5
  3. R – a version compatible with your R script(s).

Using Rwui

Using Rwui to create a web application for an R script simply involves:

  1. Entering details about your R script on a sequence of web pages.
  2. Rwui is quite flexible so you can backtrack, edit and insert, as you design your application.
  3. Rwui then generates the web application, which is Java based and platform independent.
  4. The application can be downloaded either as a .zip or .tgz file.
  5. Unpacked, the download contains all the source code and a .war file.
  6. Once the .war file is copied to the Tomcat webapps directory, the application is ready to use.
  7. Application details are saved in an ‘application definition file’ for reuse and modification.
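
The download, unpack and copy steps above could be scripted roughly as follows. This is a sketch under my own assumptions: the function name deploy_war and all paths are hypothetical, and it relies only on what the steps say, namely that the download is a .zip or .tgz file containing a .war file that goes into the Tomcat webapps directory.

```python
import shutil
from pathlib import Path


def deploy_war(archive, webapps_dir, work_dir):
    """Unpack an Rwui download and copy its .war file(s) into Tomcat's webapps directory."""
    work = Path(work_dir)
    # unpack_archive picks the format from the extension (.zip, .tgz, .tar.gz, ...).
    shutil.unpack_archive(str(archive), str(work))
    wars = sorted(work.rglob("*.war"))
    if not wars:
        raise FileNotFoundError("no .war file found in " + str(archive))
    # copy2 returns the destination path of each copied file.
    return [Path(shutil.copy2(str(war), str(webapps_dir))) for war in wars]
```

A call such as deploy_war("myapp.tgz", "/path/to/tomcat/webapps", "/tmp/myapp") (all three paths are placeholders) returns the list of deployed .war paths. As with any archive, only unpack downloads that come from a source you trust.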

Interested? Go click and check out a new web app from http://sysbio.mrc-bsu.cam.ac.uk/Rwui/ in a matter of minutes.