Here comes PySpread- 85,899,345 rows and 14,316,555 columns

What's new: one more open source analytics package, built like a spreadsheet and able to import a million cells.

From http://pyspread.sourceforge.net/index.html

About

Pyspread is a cross-platform Python spreadsheet application. It is based on and written in the programming language Python.

Instead of spreadsheet formulas, Python expressions are entered into the spreadsheet cells. Each expression returns a Python object that can be accessed from other cells. These objects can represent anything including lists or matrices.

Features

In pyspread, cells expect Python expressions and return Python objects. Therefore, complex data types such as lists, trees or matrices can be handled within a single cell. Macros can be used for functions that are too complex for a single expression.

Since Python modules can be used directly, without external scripts, arbitrary-size rational numbers (via gmpy), fixed-point decimal numbers for business calculations (via the decimal module from the standard library), and advanced statistics including plotting functions (via RPy) can be used in the spreadsheet. Everything is directly available from each cell. Just use the grid.
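To show why exact number types matter in a cell, here is a minimal sketch using only the standard library. Fraction is my stand-in for the gmpy rationals mentioned above (an assumption made so the example runs without extra installs), and the variable names are purely illustrative.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floats cannot represent 0.1 exactly, so the sum drifts.
float_total = 0.1 + 0.2

# Decimal keeps base-10 digits exactly, which suits currency amounts.
decimal_total = Decimal("0.10") + Decimal("0.20")

# Fraction is the standard library's exact rational type; it stands in
# here for gmpy rationals (assumption, not what pyspread ships with).
third = Fraction(1, 3)
rational_total = third + third + third

print(float_total == 0.3)                # False
print(decimal_total == Decimal("0.30"))  # True
print(rational_total == 1)               # True
```

Any of these expressions could be typed into a cell as-is, since a cell only needs a Python expression that returns an object.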

Data can be imported and exported using CSV files or the clipboard. Other forms of data exchange are possible using external Python modules.
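A rough sketch of what a CSV round trip into a cell grid could look like, using the standard csv module. The helpers import_csv and export_csv and the dict-of-cells layout are hypothetical names of mine, not pyspread's own import code.

```python
import csv
import io

def import_csv(text, top=0, left=0):
    """Return {(row, col): source string} for each non-empty CSV field."""
    cells = {}
    for r, record in enumerate(csv.reader(io.StringIO(text))):
        for c, field in enumerate(record):
            if field != "":
                cells[(top + r, left + c)] = field
    return cells

def export_csv(cells):
    """Write the bounding box of the cells (from row 0, col 0) as CSV text."""
    if not cells:
        return ""
    n_rows = max(r for r, _ in cells) + 1
    n_cols = max(c for _, c in cells) + 1
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for r in range(n_rows):
        # Empty positions are written as empty fields.
        writer.writerow([cells.get((r, c), "") for c in range(n_cols)])
    return out.getvalue()

sample = "1,2,3\n4,,6\n"
grid = import_csv(sample)
print(grid[(1, 2)])                # 6
print(export_csv(grid) == sample)  # True
```

Only non-empty fields are stored, so a mostly empty sheet costs memory only for the cells that hold something.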

In order to simplify sparse matrix editing, pyspread features a three-dimensional grid that can be sized up to 85,899,345 rows and 14,316,555 columns (on 64-bit systems; the limit depends on row height and column width). Note that importing a million cells requires about 500 MB of memory.
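Some back-of-the-envelope arithmetic on those figures, as a sketch. It assumes decimal megabytes (the source does not say MB or MiB) and that memory grows linearly with the number of filled cells, which is my simplification rather than anything the project states.

```python
MB = 10**6  # assumption: decimal megabytes

quoted_memory = 500 * MB   # "about 500 MB"
quoted_cells = 10**6       # "a million cells"
bytes_per_cell = quoted_memory / quoted_cells

max_rows = 85_899_345
max_cols = 14_316_555
total_positions = max_rows * max_cols  # positions in one table, roughly 1.2e15

def estimated_memory_mb(filled_cells):
    """Linear estimate of memory for a given number of filled cells."""
    return filled_cells * bytes_per_cell / MB

print(bytes_per_cell)                  # 500.0
print(estimated_memory_mb(2_000_000))  # 1000.0
```

At roughly 500 bytes per cell, filling every position of a maximum-size table would take hundreds of petabytes, so the huge grid only makes sense for sparse data where most cells stay empty.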

The concept of pyspread allows doing everything from each cell that a Python script can do. This may very well include deleting your hard drive or sending your data via the Internet. Of course this is a non-issue if you sandbox properly or if you only use self-developed spreadsheets. Since this is not the case for everyone (see the discussion at lwn.net), a GPG-signature-based trust model for spreadsheet files has been introduced. It ensures that only your own trusted files are executed on loading. Untrusted files are displayed in safe mode. You can trust a file manually, but inspect it carefully first.
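The "verify first, evaluate only if trusted" idea can be sketched in a few lines. To keep it runnable with the standard library alone, this uses an HMAC with a local secret key as a stand-in for a GPG signature; that swap is my assumption and not how pyspread does it, and sign and load_cells are hypothetical names.

```python
import hashlib
import hmac

def sign(content: bytes, key: bytes) -> str:
    """Stand-in for a GPG signature: HMAC-SHA256 over the file content."""
    return hmac.new(key, content, hashlib.sha256).hexdigest()

def load_cells(content: bytes, signature: str, key: bytes):
    """Return (mode, cells); cell code runs only if the signature matches."""
    sources = content.decode("utf-8").splitlines()
    trusted = hmac.compare_digest(sign(content, key), signature)
    if trusted:
        return "trusted", [eval(src) for src in sources]
    # Safe mode: show the code as text, execute nothing.
    return "safe", sources

key = b"local secret"
doc = b"1 + 1\nlen('abc')"
good_signature = sign(doc, key)

print(load_cells(doc, good_signature, key))
# ('trusted', [2, 3])

tampered = doc + b"\n__import__('os').getcwd()"
print(load_cells(tampered, good_signature, key)[0])
# safe
```

The point of the design is that a file edited by someone else no longer matches your signature, so its cells show up as plain text until you have read them and decided to trust the file.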

Requirements

Pyspread runs on Linux, Windows and *nix platforms with GTK+ support. There are reports that it works with MacOS X as well. If you would like to contribute by testing on OS X, please contact me.

Dependencies

Highly recommended for full functionality

  • PyMe >=0.8.1. Note for Windows™ users: if you want to use signatures without compiling PyMe, try out Gpg4win.
  • gmpy >=1.1.0
  • rpy >=1.0.3

Maturity

Pyspread is in early Beta release. This means that the core functionality is fully implemented but the program needs testing and polish.

and from the wiki

http://sourceforge.net/apps/mediawiki/pyspread/index.php?title=Main_Page

a spreadsheet with more powerful functions and data structures that are accessible inside each cell. Something like Python that empowers you to do things quickly. And yes, it should be free and it should run on Linux as well as on Windows. I looked around and found nothing that suited me. Therefore, I started pyspread.

Concept

  • Each cell accepts any input that works in a Python command line.
  • The inputs are parsed and evaluated by Python’s eval command.
  • The result objects are accessible via a 3D numpy object array.
  • String representations of the result objects are displayed in the cells.
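The concept above can be sketched in a few lines. This toy version keeps cell code in a dict keyed by (row, column, table) instead of the 3D numpy object array the wiki describes, so it runs without numpy. The class name Grid and the name S for reaching other cells are assumptions made for illustration.

```python
class Grid:
    """Toy 3D grid: maps (row, col, table) keys to Python source strings."""

    def __init__(self):
        self.code = {}  # key -> expression text typed into the cell

    def __setitem__(self, key, expression):
        self.code[key] = expression

    def __getitem__(self, key):
        # Empty cells yield None; filled cells are evaluated on demand.
        if key not in self.code:
            return None
        # S is the (assumed) name a cell uses to read other cells.
        return eval(self.code[key], {"S": self})

    def display(self, key):
        # What the spreadsheet would show: the result's string form.
        return str(self[key])

grid = Grid()
grid[0, 0, 0] = "[1, 2, 3]"
grid[1, 0, 0] = "sum(S[0, 0, 0]) * 2"

print(grid.display((0, 0, 0)))  # [1, 2, 3]
print(grid[1, 0, 0])            # 12
```

There is no cycle detection or caching here, so a cell that refers to itself would recurse until Python gives up. A real implementation has to deal with that.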

Benefits

  • Each cell returns a Python object. This object can be anything including arrays and third party library objects.
  • Generator expressions can be used efficiently for data manipulation.
  • Efficient numpy slicing is used.
  • numpy methods are accessible for the data.
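A small illustration of the generator expression and slicing points. A plain Python list stands in for one column of the grid (an assumption to avoid requiring numpy); numpy slices of the real grid use the same start:stop:step notation.

```python
# One column of cell results; None marks an empty cell.
column = [3, None, 4, None, 12]

# Generator expression: values are consumed one at a time by sum(),
# so no intermediate list is built.
sum_of_squares = sum(v * v for v in column if v is not None)

# Slicing with a step picks every second cell of the column.
every_other = column[::2]

print(sum_of_squares)  # 169
print(every_other)     # [3, 4, 12]
```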

Installation

  1. Download the pyspread tarball or zip and unzip it at a convenient place.
  2. If you do not have them already, get and install Python, wxPython and numpy.
     If you want the examples to work, install gmpy, R and rpy.
     Really do check the version requirements that are mentioned on http://pyspread.sf.net
  3. Get install privileges (e.g. become root).
  4. Change into the directory and type
         python setup.py install
     Windows: Replace “python” with your Python interpreter (absolute path).
  5. Become normal user again.
  6. Start pyspread by typing
         pyspread
  7. Enjoy.

Next on the spreadsheet wishlist: an MSI bundle or Windows self-installer with all dependencies bundled in, linking to PostgreSQL ;) etc.

Way to go, Mr Martin Manns

mmanns < at > gmx < dot > net

