Using R from within Python

Python logo
Image via Wikipedia

I came across this excellent JSS paper at www.jstatsoft.org/v35/c02/paper

on a Python package called PypeR which allows you to use R from within Python using the pipe functionality.

It is an interesting package and given Python’s increasing buzz , one worthy to be checked out by people using or thinking Python in their packages.

























Citation:
	@article{Xia:McClelland:Wang:2010:JSSOBK:v35c02,
	  author =	"Xiao-Qin Xia and Michael McClelland and Yipeng Wang",
	  title =	"PypeR, A Python Package for Using R in Python",
	  journal =	"Journal of Statistical Software, Code Snippets",
	  volume =	"35",
	  number =	"2",
	  pages =	"1--8",
	  day =  	"30",
	  month =	"7",
	  year = 	"2010",
	  CODEN =	"JSSOBK",
	  ISSN = 	"1548-7660",
	  bibdate =	"2010-03-23",
	  URL =  	"http://www.jstatsoft.org/v35/c02",
	  accepted =	"2010-03-23",
	  acknowledgement = "",
	  keywords =	"",
	  submitted =	"2009-10-23",
	}

 

PySpread Magic

Python logo
Image via Wikipedia

Just working with PySpread- and worked on a 1 million by 1 million spreadsheet- Python sure looks promising for the way ahead for stat computing ( you need to

sudo apt-get install python-numpy python-rpy python-scipy python-gmpy wxpython*,

cd to the untarred bz2 file from http://pyspread.sourceforge.net/download.html,  (like

:~/Downloads$ cd pyspread-0.1.2

:~/Downloads/pyspread-0.1.2

sudo python setup.py install

)

http://pyspread.sourceforge.net/

by Martin Manns

 

about Pyspread is a cross-platform Python spreadsheet application. It is based on and written in the programming language Python.

Instead of spreadsheet formulas, Python expressions are entered into the spreadsheet cells. Each expression returns a Python object that can be accessed from other cells. These objects can represent anything including lists or matrices.

Pyspread screenshot
features
  • Three dimensional grid with up to 85,899,345 rows and 14,316,555 columns (64 bit systems, depends on row height and column width). Note that a million cells require about 500 MB of memory.
  • Complex data types such as lists, trees or matrices within a single cell.
  • Macros for functionalities that are too complex for a single Python expression.
  • Python module access from each cell, which allows:
    • Arbitrary size rational numbers (via gmpy),
    • Fixed point decimal numbers for business calculations, (via the decimal module from the standard library)
    • Advanced statistics including plotting functions (via RPy)
    • Much more via <your favourite module>.
  • CSV import and export
  • Clipboard access
Pyspread screenshot

warning The concept of pyspread allows doing everything from each cell that a Python script can do. This powerful feature has its drawbacks. A spreadsheet may very well delete your hard drive or send your data via the Internet. Of course this is a non-issue if you sandbox properly or if you only use self developed spreadsheets.

Since this is not the case for everyone (see discussion at lwn.net), a GPG signature based trust model for spreadsheet files has been introduced. It ensures that only your own trusted files are executed on loading. Untrusted files are displayed in safe mode. You can approve a file manually. Inspect carefully.

 

%d bloggers like this: