Top 10 Games on Linux -sudo update

Here are some cool games I like to play on my Ubuntu 10.10 – I think they run on most other versions of Linux as well. 1) Open Arena– First person Shooter– This is like Quake Arena- very very nice graphics and good for playing for a couple of hours while taking a break. It is available here- http://openarena.ws/smfnews.php ideally if you have a bunch of gaming friends, playing on a local network or internet is quite mind blowing entertaining. And it’s free! 2) Armagetron– This is based on the TRON game of light cycles-It is available here at http://www.armagetronad.net/ or you can use Synaptic packages manager for all the games mentioned here

If violence or cars is not your stuff and you like puzzles like Sudoko, well just install the application Sudoko from http://gnome-sudoku.sourceforge.net/ Also recommended for people of various ages as it has multiple levels.

If you ever liked Pinball play the open source version from download at http://pinball.sourceforge.net/ Alternatively you can go to Ubuntu Software Centre>Games>Arcade>Emilio>Pinball and you can also build your own pinball if you like the game well enough. 5) Pacman/Njam- Clone of the original classic game. Downloadable from http://www.linuxcompatible.org/news/story/pacman_for_linux.html 6) Gweled– This is free clone version of Bejeweled. It now has a new website at http://gweled.org/ http://linux.softpedia.com/progDownload/Gweled-Download-3449.html

Gweled is a GNOME version of a popular PalmOS/Windows/Java game called “Bejeweled” or “Diamond Mine”. The aim of the game is to make alignment of 3 or more gems, both vertically or horizontally by swapping adjacent gems. The game ends when there are no possible moves left. Here are some key features of “Gweled”: · exact same gameplay as the commercial versions · SVG original graphics

7) Hearts – For this card game classis you can use Ubuntu software to install the package or go to http://linuxappfinder.com/package/gnome-hearts 8) Card Games- KPatience has almost 14 card games including solitaire, and free cell. 9) Sauerbraten -First person shooter with good network play, edit maps capabilities. You can read more here- http://sauerbraten.org/ 10) Tetris-KBlocks Tetris is the classic game. If you like classic slow games- Tetris is the best. and I like the toughest Tetris game -Bastet http://fph.altervista.org/prog/bastet.html Even an xkcd toon for it

Ubuntu 10.10 Alternatives (lockergnome.com)
Maciej Danielski: Tried Ubuntu 10.10 for a week and now back on #! CrunchBang Linux (meanmachine.wordpress.com)
7 Predictions For Open Source in 2011 (pcworld.com)
Bodhi Linux Get Software Page Goes Live (jeffhoogland.blogspot.com)
Interview with Matt Asay of Canonical (interviews.slashdot.org)
Macbuntu Makes your Linux Desktop Look Like Mac OS X [Downloads] (lifehacker.com)
Fix VirtualBox’s Guest Additions in Ubuntu 10.10 [Linux Tip] (lifehacker.com)
Linux Mint 10 Boasts New Menu And Theme (lockergnome.com)
Ubuntu tablet rumored for early 2011 launch (go.theregister.com)
Alien Arena 2011 Released (techie-buzz.com)
This Is What $10,000 Worth of Top Tier Pinball Play Looks Like [Clips] (kotaku.com)
6 Fun Ways To Explore Ubuntu 10.10 [Linux] (makeuseof.com)
4 Fun Party Games Using Networked Computers (makeuseof.com)

Protected: Using SAS and C/C++ together

Google Cloud Print -print documents from the internet

Print Jobs just got easier- especially if you prefer one printer, use Google Chrome, and can take 2 minutes to set up your printer to print from anywhere in the world through the internet.

It’s called Google Cloud Print– and it makes my life a lot easier when I travel and need to give to printer at home some documents to print rather than rely on external printers. See screenshots below and check out http://www.google.com/cloudprint/ for more

Secure Printing with Google Cloud Print (kinlane.com)
Print from Your Phone with Google Cloud Print [Cloud Print] (lifehacker.com)
Introduction to Google Cloud Print (kinlane.com)
Google Cloud Print – Print from Anywhere Anytime (searchenginepeople.com)
Are You Printing from the Google Cloud? (chris.pirillo.com)
Print From Your Phone To Your Printer With Google Cloud Print (businessinsider.com)
Print from your phone with Gmail for mobile and Google Cloud Print (gmailblog.blogspot.com)

Heritage Health Prize- Data Mining Contest for 3mill USD

An animation of the quicksort algorithm sortin... — Image via Wikipedia

If Netflix was about 1 mill USD to better online video choices, here is a chance to earn serious money, write great code, and save lives!

From http://www.heritagehealthprize.com/

Heritage Health Prize
Launching April 4

Laptop

More than 71 Million individuals in the United States are admitted to
hospitals each year, according to the latest survey from the American
Hospital Association. Studies have concluded that in 2006 well over
$30 billion was spent on unnecessary hospital admissions. Each of
these unnecessary admissions took away one hospital bed from someone
else who needed it more.

http://www.heritagehealthprize.com/competition.php

Prize Goal & Participation

The goal of the prize is to develop a predictive algorithm that can identify patients who will be admitted to the hospital within the next year, using historical claims data.

Official registration will open in 2011, after the launch of the prize. At that time, pre-registered teams will be notified to officially register for the competition. Teams must consent to be bound by final competition rules.

Registered teams will develop and test their algorithms. The winning algorithm will be able to predict patients at risk for an unplanned hospital admission with a high rate of accuracy. The first team to reach the accuracy threshold will have their algorithms confirmed by a judging panel. If confirmed, a winner will be declared.

The competition is expected to run for approximately two years. Registration will be open throughout the competition.

Data Sets

Registered teams will be granted access to two separate datasets of de-identified patient claims data for developing and testing algorithms: a training dataset and a quiz/test dataset. The datasets will be comprised of de-identified patient data. The datasets will include:

Outpatient encounter data
Hospitalization encounter data
Medication dispensing claims data, including medications
Outpatient laboratory data, including test outcome values

The data for each de-identified patient will be organized into two sections: “Historical Data” and “Admission Data.” Historical Data will represent three years of past claims data. This section of the dataset will be used to predict if that patient is going to be admitted during the Admission Data period. Admission Data represents previous claims data and will contain whether or not a hospital admission occurred for that patient; it will be a binary flag.

Data The training dataset includes several thousand anonymized patients and will be made available, securely and in full, to any registered team for the purpose of developing effective screening algorithms.

The quiz/test dataset is a smaller set of anonymized patients. Teams will only receive the Historical Data section of these datasets and the two datasets will be mixed together so that teams will not be aware of which de-identified patients are in which set. Teams will make predictions based on these data sets and submit their predictions to HPN through the official Heritage Health Prize web site. HPN will use the Quiz Dataset for the initial assessment of the Team’s algorithms. HPN will evaluate and report back scores to the teams through the prize website’s leader board.

Scores from the final Test Dataset will not be made available to teams until the accuracy thresholds are passed. The test dataset will be used in the final judging and results will be kept hidden. These scores are used to preserve the integrity of scoring and to help validate the predictive algorithms.

Teams can begin developing and testing their algorithms as soon as they are registered and ready. Teams will log onto the official Heritage Health Prize website and submit their predictions online. Comparisons will be run automatically and team accuracy scores will be posted on the leader board. This score will be only on a portion of the predictions submitted (the Quiz Dataset), the additional results will be kept back (the Test Dataset).

Form

Once a team successfully scores above the accuracy thresholds on the online testing (quiz dataset), final judging will occur. There will be three parts to this judging. First, the judges will confirm that the potential winning team’s algorithm accurately predicts patient admissions in the Test Dataset (again, above the thresholds for accuracy).

Next, the judging panel will confirm that the algorithm does not identify patients and use external data sources to derive its predictions. Lastly, the panel will confirm that the team’s algorithm is authentic and derives its predictive power from the datasets, not from hand-coding results to improve scores. If the algorithm meets these three criteria, it will be declared the winner.

Failure to meet any one of these three parts will disqualify the team and the contest will continue. The judges reserve the right to award second and third place prizes if deemed applicable.

HPN Health Prize: The X-Prize of Health Care (medicineandtechnology.com)
$3 million machine learning prize (heritagehealthprize.com)
Heritage Providers Continues to Promote $3 Million Dollar Prize to Create An Algorithm To Predict and Prevents Hospitalizations (ducknetweb.blogspot.com)
Netflix Prize-Style Competition Predicts Hospitalizations (fastcompany.com)
For Data Crunchers, A Glittering Prize (online.wsj.com)
The American Hospital Association Awards Its Exclusive Endorsement to HR Solutions’ Physician Engagement Survey (prweb.com)

Google Refine

An interesting data cleaning software from Google at

https://code.google.com/p/google-refine/

From the page at

https://code.google.com/p/google-refine/wiki/UserGuide

The Basics

First, although Google Refine might start out looking like a spreadsheet program (Microsoft Excel, Google Spreadsheets, etc.), don’t expect it to work like a spreadsheet program. That’s almost like expecting a database to work like a text editor.

Google Refine is NOT for entering new data one cell at a time. It is NOT for doing accounting.

Google Refine is for applying transformations over many existing cells in bulk, for the purpose of cleaning up the data, extending it with more data from other sources, and getting it to some form that other tools can consume.

To use Google Refine, think in big patterns. For example, to spot errors, think

Show me every row where the string length of the customer’s name is longer than 50 characters (because I suspect that the customer’s address is mistakenly included in the name field)
Show me every row where the contract fee is less than 1 (because I suspect the fee was entered in unit of thousand dollars rather than dollars)
Show me every row where the description field (scraped from some web site) contains “&” (because I suspect it wasn’t decoded properly)

To edit data, think

For every row where the contract fee is less than 1, multiply the fee by 1000.
For every row where the customer name contains a comma (it has been entered as “last_name, first_name”), split the name by the comma, reverse the array, and join it back with a space (producing “first_name last_name”)

To specify patterns, use filters and facets. Typically, you create a filter or facet on a particular column. For example, you can create a numeric facet on the “contract fee” column and adjust its range selector to select values less than 1. If the default facet doesn’t do what you want, you can configure it (by clicking “change” on the facet’s header). For example, you can create a text facet with on the same “contract fee” column with this expression:

  value < 1

It will show 2 choices: true and false. Just select true. Then, invoke the Transform command on that same column and enter the expression

  value * 1000

That Transform command affects only rows where the “contract fee” cell contains a value less than 1.

You can use several filters and facets together. Only rows that are selected by all facets and filters will be shown in the data table. For example, say you have two text facets, one on the “contract fee” column with the expression

  value < 1

and another on the “state” column (with the default expression). If you select “true” in the first facet and “Nevada” in the second, then you will only see rows for contracts in Nevada with fees less than 1.

Analogies

Databases

If you have programmed databases before (performing SQL queries), then what Google Refine works should be quite familiar to you. Creating filters and facets and selecting something in them is like performing this SELECT statement:

  SELECT *
  WHERE ... constraints determined by selection in facets and filters ...

And invoking the Transform command on a column while having some filters and facets selected is like performing this UPDATE statement

  UPDATE whole_table SET column_X = ... expression ...
  WHERE ... constraints determined by selection in facets and filters ...

The difference between Google Refine and databases is that the facets show you choices that you can select, whereas databases assume that you already know what’s in the data.

Transforming spreadsheets into SKOS with Google Refine (semantic-web.at)
Adding geographical information to a spreadsheet based on postcodes – Google Refine and APIs (onlinejournalismblog.com)
Chapter 1. Using Google Refine to Clean Messy Data – ProPublica (propublica.org)

HIGHLIGHTS from REXER Survey :R gives best satisfaction

A Summary report from Rexer Analytics Annual Survey

HIGHLIGHTS from the 4^th Annual Data Miner Survey (2010):

• FIELDS & GOALS: Data miners work in a diverse set of fields. CRM / Marketing has been the #1 field in each of the past four years. Fittingly, “improving the understanding of customers”, “retaining customers” and other CRM goals are also the goals identified by the most data miners surveyed.

• ALGORITHMS: Decision trees, regression, and cluster analysis continue to form a triad of core algorithms for most data miners. However, a wide variety of algorithms are being used. This year, for the first time, the survey asked about Ensemble Models, and 22% of data miners report using them.
A third of data miners currently use text mining and another third plan to in the future.

• MODELS: About one-third of data miners typically build final models with 10 or fewer variables, while about 28% generally construct models with more than 45 variables.

• TOOLS: After a steady rise across the past few years, the open source data mining software R overtook other tools to become the tool used by more data miners (43%) than any other. STATISTICA, which has also been climbing in the rankings, is selected as the primary data mining tool by the most data miners (18%). Data miners report using an average of 4.6 software tools overall. STATISTICA, IBM SPSS Modeler, and R received the strongest satisfaction ratings in both 2010 and 2009.

• TECHNOLOGY: Data Mining most often occurs on a desktop or laptop computer, and frequently the data is stored locally. Model scoring typically happens using the same software used to develop models. STATISTICA users are more likely than other tool users to deploy models using PMML.

• CHALLENGES: As in previous years, dirty data, explaining data mining to others, and difficult access to data are the top challenges data miners face. This year data miners also shared best practices for overcoming these challenges. The best practices are available online.

• FUTURE: Data miners are optimistic about continued growth in the number of projects they will be conducting, and growth in data mining adoption is the number one “future trend” identified. There is room to improve: only 13% of data miners rate their company’s analytic capabilities as “excellent” and only 8% rate their data quality as “very strong”.

Please contact us if you have any questions about the attached report or this annual research program. The 5^th Annual Data Miner Survey will be launching next month. We will email you an invitation to participate.

Information about Rexer Analytics is available at www.RexerAnalytics.com. Rexer Analytics continues their impressive journey see http://www.rexeranalytics.com/Clients.html

|My only thought- since most data miners are using multiple tools including free tools as well as paid software, Perhaps a pie chart of market share by revenue and volume would be handy.

Also some ideas on comparing diverse data mining projects by data size, or complexity.

Skills of a good data miner (zyxo.wordpress.com)
7 Data Blogs To Explore (readwriteweb.com)
FBI Data-Mining Program:Total Information Awareness (alitarhini.wordpress.com)

Whats new in the latest version of R

Figure 2: The argument φ and modulus r locate ... — Image via Wikipedia

CHANGES IN R VERSION 2.12.2: http://cran.r-project.org/src/base/NEWS

  SIGNIFICANT USER-VISIBLE CHANGES:

    â€¢ Complex arithmetic (notably z^n for complex z and integer n) gave
      incorrect results since R 2.10.0 on platforms without C99 complex
      support.  This and some lesser issues in trignometric functions
      have been corrected.

      Such platforms were rare (we know of Cygwin and FreeBSD).
      However, because of new compiler optimizations in the way complex
      arguments are handled, the same code was selected on x86_64 Linux
      with gcc 4.5.x at the default -O2 optimization (but not at -O).

    â€¢ There is a workaround for crashes seen with several packages on
      systems using zlib 1.2.5: see the INSTALLATION section.

  NEW FEATURES:

    â€¢ PCRE has been updated to 8.12 (two bug-fix releases since 8.10).

    â€¢ rep(), seq(), seq.int() and seq_len() report more often when the
      first element is taken of an argument of incorrect length.

    â€¢ The Cocoa back-end for the quartz() graphics device on Mac OS X
      provides a way to disable event loop processing temporarily
      (useful, e.g., for forked instances of R).

    â€¢ kernel()'s default for m was not appropriate if coef was a set of
      coefficients.  (Reported by Pierre Chausse.)

    â€¢ bug.report() has been updated for the current R bug tracker,
      which does not accept emailed submissions.

    â€¢ R CMD check now checks for the correct use of $(LAPACK_LIBS) (as
      well as $(BLAS_LIBS)), since several CRAN recent submissions have
      ignored â€˜Writing R Extensionsâ€™.

  INSTALLATION:

    â€¢ The zlib sources in the distribution are now built with all
      symbols remapped: this is intended to avoid problems seen with
      packages such as XML and rggobi which link to zlib.so.1 on
      systems using zlib 1.2.5.

    â€¢ The default for FFLAGS and FCFLAGS with gfortran on x86_64 Linux
      has been changed back to -g -O2: however, setting -g -O may still
      be needed for gfortran 4.3.x.

  PACKAGE INSTALLATION:

    â€¢ A LazyDataCompression field in the DESCRIPTION file will be used
      to set the value for the --data-compress option of R CMD INSTALL.

    â€¢ Files R/sysdata.rda of more than 1Mb are now stored in the
      lazyload daabase using xz compression: this for example halves
      the installed size of package Imap.

    â€¢ R CMD INSTALL now ensures that directories installed from inst
      have search permission for everyone.

      It no longer installs files inst/doc/Rplots.ps and
      inst/doc/Rplots.pdf.  These are almost certainly left-overs from
      Sweave runs, and are often large.

  DEPRECATED & DEFUNCT:

    â€¢ The â€˜experimentalâ€™ alternative specification of a name space via
      .Export() etc is now deprecated.

    â€¢ zip.file.extract() is now deprecated.

    â€¢ Zip-ing data sets in packages (and hence R CMD INSTALL
      --use-zip-data and the ZipData: yes field in a DESCRIPTION file)
      is deprecated: using efficiently compressed .rda images and
      lazy-loading of data has superseded it.

  BUG FIXES:

    â€¢ identical() could in rare cases generate a warning about
      non-pairlist attributes on CHARSXPs.  As these are used for
      internal purposes, the attribute check should be skipped.
      (Reported by Niels Richard Hansen).

    â€¢ If the filename extension (usually .Rnw) was not included in a
      call to Sweave(), source references would not work properly and
      the keep.source option failed.  (PR#14459)

    â€¢ format.data.frame() now keeps zero character column names.

    â€¢ pretty(x) no longer raises an error when x contains solely
      non-finite values. (PR#14468)

    â€¢ The plot.TukeyHSD() function now uses a line width of 0.5 for its
      reference lines rather than lwd = 0 (which caused problems for
      some PDF and PostScript viewers).

    â€¢ The big.mark argument to prettyNum(), format(), etc. was inserted
      reversed if it was more than one character long.

    â€¢ R CMD check failed to check the filenames under man for Windows'
      reserved names.

    â€¢ The "Date" and "POSIXt" methods for seq() could overshoot when to
      was supplied and by was specified in months or years.

    â€¢ The internal method of untar() now restores hard links as file
      copies rather than symbolic links (which did not work for
      cross-directory links).

    â€¢ unzip() did not handle zip files which contained filepaths with
      two or more leading directories which were not in the zipfile and
      did not already exist.  (It is unclear if such zipfiles are valid
      and the third-party C code used did not support them, but
      PR#14462 created one.)

    â€¢ combn(n, m) now behaves more regularly for the border case m = 0.
      (PR#14473)

    â€¢ The rendering of numbers in plotmath expressions (e.g.
      expression(10^2)) used the current settings for conversion to
      strings rather than setting the defaults, and so could be
      affected by what has been done before.  (PR#14477)

    â€¢ The methods of napredict() and naresid() for na.action =
      na.exclude fits did not work correctly in the very rare event
      that every case had been omitted in the fit.  (Reported by Simon
      Wood.)

    â€¢ weighted.residuals(drop0=TRUE) returned a vector when the
      residuals were a matrix (e.g. those of class "mlm").  (Reported
      by Bill Dunlap.)

    â€¢ Package HTML index files /html/00Index.html were generated
      with a stylesheet reference that was not correct for static
      browsing in libraries.

    â€¢ ccf(na.action = na.pass) was not implemented.

    â€¢ The parser accepted some incorrect numeric constants, e.g. 20x2.
      (Reported by Olaf Mersmann.)

    â€¢ format(*, zero.print) did not always replace the full zero parts.

    â€¢ Fixes for subsetting or subassignment of "raster" objects when
      not both i and j are specified.

    â€¢ R CMD INSTALL was not always respecting the ZipData: yes field of
      a DESCRIPTION file (although this is frequently incorrectly
      specified for packages with no data or which specify lazy-loading
      of data).

      R CMD INSTALL --use-zip-data was incorrectly implemented as
      --use-zipdata since R 2.9.0.

    â€¢ source(file, echo=TRUE) could fail if the file contained #line
      directives.  It now recovers more gracefully, but may still
      display the wrong line if the directive gives incorrect
      information.

    â€¢ atan(1i) returned NaN+Infi (rather than 0+Infi) on platforms
      without C99 complex support.

    â€¢ library() failed to cache S4 metadata (unlike loadNamespace())
      causing failures in S4-using packages without a namespace (e.g.
      those using reference classes).

    â€¢ The function qlogis(lp, log.p=TRUE) no longer prematurely
      overflows to Inf when exp(lp) is close to 1.

    â€¢ Updating S4 methods for a group generic function requires
      resetting the methods tables for the members of the group (patch
      contributed by Martin Morgan).

    â€¢ In some circumstances (including for package XML), R CMD INSTALL
      installed version-control directories from source packages.

    â€¢ Added PROTECT calls to some constructed expressions used in C
      level eval calls.

    â€¢ utils:::create.post() (used by bug.report() and help.request())
      failed to quote arguments to the mailer, and so often failed.

    â€¢ bug.report() was naive about how to extract maintainer email
      addresses from package descriptions, so would often try mailing
      to incorrect addresses.

    â€¢ debugger() could fail to read the environment of a call to a
      function with a ... argument.  (Reported by Charlie Roosen.)    â€¢ prettyNum(c(1i, NA), drop0=TRUE) or str(NA_complex_) now work
      correctly.

R 2.12.2 scheduled for February 25 (revolutionanalytics.com)
Sweave Tutorial 3: Console Input and Output – Multiple Choice Test Analysis (r-bloggers.com)

3) Sudoko–

4) Pinball

That’s all for holiday season folks, the top 10 lists is based on almost 3 decades of gaming experience, but beauty is the eye of the beholder- so happy gaming for free.

Related Articles

Please share:

Related Articles

Please share:

Heritage Health Prize Launching April 4

Prize Goal & Participation

Data Sets

Related Articles

Please share:

The Basics

Analogies

Databases

Related Articles

Please share:

Related Articles

Please share:

Related Articles

Please share:

Heritage Health Prize
Launching April 4