Tag Archives: MetaData
Teradata updates Teradata-R
March 17, 2012 3:46 pm
The Teradata add-on package for R
teradataR is an R package (library) that lets R users connect to Teradata, establish data frames (the R data format) that map to Teradata tables, and call in-database analytic functions within Teradata. R users can therefore work within their R console environment while leveraging the in-database functions developed with Teradata Warehouse Miner. The package provides 44 analytical functions and an additional 20 data connection and R infrastructure functions. In addition, we’ve added a function that lists the stored procedures within Teradata and another that provides the capability to call them from R.
- 20 Functions to enable R infrastructure to operate with Teradata
- tdConnect – Connect to Teradata via ODBC
- td.data.frame – Establish data frame connections to a Teradata table
- 44 in-database analytical functions callable from R. A sample of the functions includes:
- Descriptive statistics: Overlap, histogram, frequency, statistics, matrix functions, and values analysis
- Reorganization functions: join, merge and samples
- Transformations: bincode, recode, rescale, sigmoid, zscore and null replacement
- K-Means clustering and Score K-Means
- Statistical tests: ks, dagostino.pearson, shapiro.wilk, binomial, and wilcoxon
- R language features: nrow, ncol, min, max, summary, as.data.frame, and dim
- Tools and R functions that allow users to create their own custom analytic functions that are callable from R.
- Teradata Warehouse Miner can capture any analytic stream including UDFs and create a stored procedure
- Analytic process to create new derived predictive variables can be captured as a stored procedure.
- Entire process to create or update an analytical data set can be captured as a stored procedure.
- R function can list all the stored procedures within Teradata.
- R function can call a stored procedure that runs in-database
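The transformation names in the list above (bincode, rescale, sigmoid, zscore, null replacement) refer to standard column transformations. As a rough illustration of the arithmetic only, here is a minimal Python sketch. It is not teradataR code and does not reflect how Teradata implements these functions in-database; the function names, parameters, the use of a population standard deviation, and the equal-width binning are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def zscore(values):
    """Standardise each value as (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(x - mu) / sigma for x in values]


def rescale(values, lower=0.0, upper=1.0):
    """Linearly map the values onto the interval [lower, upper]."""
    lo, hi = min(values), max(values)
    return [lower + (x - lo) * (upper - lower) / (hi - lo) for x in values]


def sigmoid(values):
    """Apply the logistic function 1 / (1 + e^-x) to each value."""
    return [1.0 / (1.0 + math.exp(-x)) for x in values]


def null_replace(values, replacement):
    """Replace missing values (None here) with a fixed replacement."""
    return [replacement if x is None else x for x in values]


def bincode(values, bins):
    """Assign each value a 1-based index over equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((x - lo) / width) + 1, bins) for x in values]


if __name__ == "__main__":
    print(rescale([10, 20, 30]))
    print(null_replace([1, None, 3], 0))
    print(bincode([0, 5, 10], 2))
```

The point of running such steps in-database is that the column never has to be pulled into R memory; the sketch only shows what each transformation computes on a small list.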
TeradataR allows R users to leverage all the benefits of in-database processing with Teradata:
- Eliminate data movement from Teradata to the R framework for key data intensive tasks.
- Leverage the speed of Teradata database’s parallel processing to run analytics against big data.
- Ability to operate within the R console environment.
- Embed your frequently performed tasks to run in-database.
- R and TeradataR are free downloads.
Source-
http://developer.teradata.com/applications/articles/in-database-analytics-with-teradata-r
This package allows users of R to interact with a Teradata database. R is an open source language for statistical computing and graphics. R provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering) and graphical techniques, and is highly extensible. Users can use many statistical functions directly against the Teradata system without having to extract the data into memory.
Enhancements included with this new 1.0.1 release include:
- teradataR User Guide
- addition of Mac OS X Package
- addition of Red Hat Linux Package (added 2/23/12)
- summary has been enhanced to run faster
- JDBC support added to allow Windows or Mac users to run the package with JDBC
- td.data.frame enhanced to allow support for manipulation to add columns and expressions
- td.data.frame enhanced to use Teradata 14.0 Fastpath Transform Functions (see Appendix B)
- td.tapply function added to apply a select group of functions to columns of an array
From-
http://downloads.teradata.com/download/applications/teradata-r
and
A new R package for Red Hat Linux has been added to the teradataR 1.0.1 release. This new package provides the same functionality as in the previously released Windows and Mac OS X packages, but is built for Red Hat Linux. This version was built and tested on Red Hat Linux 6.2 32-bit. (The R version for Red Hat Linux is 2.14.1)
Installing this package is the same as for any normal R package: just extract it into your R library area, or use the install.packages command with the file path.
from-
http://developer.teradata.com/tag/r
and
With plenty of prolific and enthusiastic developers, the number of packages for R is expected to grow tremendously. Statisticians and analysts using these packages will find innovative ways to use data to answer their research and business questions. And as organizations become more willing to rely on open-source software for mission-critical tasks, R is poised to become an essential tool for analyzing our complex world.
Source-
http://www.teradatamagazine.com/v09n03/Connections/R-you-ready/
From the user guide-
http://downloads.teradata.com/download/applications/teradata-r
teradataR allows R users to easily connect to Teradata, establish td data frames (virtual R data frames) to Teradata and to call in-database analytic functions within Teradata. This allows R users to work within their R console environment while leveraging the in-database functions.
A Function List
teradataR-package Allow access to Teradata via R
as.data.frame.td.data.frame Convert td data frame to a data frame
as.td.data.frame Coerce to a td data frame
dim.td.data.frame Dimensions of a td data frame
hist.td.data.frame Histograms
is.td.data.frame Is an Object a Teradata Data Frame
is.td.expression Is an Object a Teradata Expression
mean.td.data.frame Arithmetic Mean
median.td.data.frame Median Value
min.td.data.frame Minima
predict.kmeans Kmeans Model Prediction
print.td.data.frame Show contents of a td data frame
sum.td.data.frame Sum of column
summary.td.data.frame Summary of Teradata Data Frame
td.bincode Create Table of Bincode Values
td.binomial Binomial Test
td.binomialsign Binomial Sign Test
td.call.sp Locate and call stored procedure
td.cor Correlation Matrix
td.cov Covariance Matrix
td.dagostino.pearson D’Agostino Pearson Test
td.data.frame Teradata Data Frames
td.f.oneway One way F Test
td.factanal Factor Analysis
td.freq Frequency Analysis
td.hist Histograms
td.join Join Tables in Teradata
td.kmeans K-Means Clustering
td.ks Kolmogorov Smirnov Test
td.lilliefors Lilliefors Test
td.merge Merge Rows of Teradata Tables
td.mode Mode Value of Column
td.mwnkw Mann-Whitney/Kruskal Wallis Test
td.nullreplace Replace Null Values
td.overlap Overlap
td.quantiles Quantile Values
td.rank Rank
td.recode Recode
td.rescale Rescale Values of Column
td.sample Sample Rows
td.shapiro.wilk Shapiro Wilk
td.sigmoid Sigmoid Transformation
td.smirnov Smirnov Test
td.solve Solve a system of equations
td.stats General Statistics
td.t.paired T Test Paired
td.t.unpaired T Test Unpaired
td.t.unpairedi T Test – Unpaired Indicator
td.values Values
td.wilcoxon Wilcoxon Test
td.zscore Zscore Transformation
tdClose Close connection
tdConnect Connect to Teradata database
tdMetadataDB Set metadata database
tdQuery Query Teradata Database
teradataR Allow access to Teradata via R
[.td.data.frame Extract Teradata Data Frame
[<-.td.data.frame Replace value of Teradata Data Frame
What's new in the latest version of R
February 27, 2011 12:21 am
CHANGES IN R VERSION 2.12.2: http://cran.r-project.org/src/base/NEWS
SIGNIFICANT USER-VISIBLE CHANGES:
- Complex arithmetic (notably z^n for complex z and integer n) gave incorrect results since R 2.10.0 on platforms without C99 complex support. This and some lesser issues in trigonometric functions have been corrected. Such platforms were rare (we know of Cygwin and FreeBSD). However, because of new compiler optimizations in the way complex arguments are handled, the same code was selected on x86_64 Linux with gcc 4.5.x at the default -O2 optimization (but not at -O).
- There is a workaround for crashes seen with several packages on systems using zlib 1.2.5: see the INSTALLATION section.
NEW FEATURES:
- PCRE has been updated to 8.12 (two bug-fix releases since 8.10).
- rep(), seq(), seq.int() and seq_len() report more often when the first element is taken of an argument of incorrect length.
- The Cocoa back-end for the quartz() graphics device on Mac OS X provides a way to disable event loop processing temporarily (useful, e.g., for forked instances of R).
- kernel()'s default for m was not appropriate if coef was a set of coefficients. (Reported by Pierre Chausse.)
- bug.report() has been updated for the current R bug tracker, which does not accept emailed submissions.
- R CMD check now checks for the correct use of $(LAPACK_LIBS) (as well as $(BLAS_LIBS)), since several recent CRAN submissions have ignored ‘Writing R Extensions’.
INSTALLATION:
- The zlib sources in the distribution are now built with all symbols remapped: this is intended to avoid problems seen with packages such as XML and rggobi which link to zlib.so.1 on systems using zlib 1.2.5.
- The default for FFLAGS and FCFLAGS with gfortran on x86_64 Linux has been changed back to -g -O2: however, setting -g -O may still be needed for gfortran 4.3.x.
PACKAGE INSTALLATION:
- A LazyDataCompression field in the DESCRIPTION file will be used to set the value for the --data-compress option of R CMD INSTALL.
- Files R/sysdata.rda of more than 1Mb are now stored in the lazyload database using xz compression: this for example halves the installed size of package Imap.
- R CMD INSTALL now ensures that directories installed from inst have search permission for everyone. It no longer installs files inst/doc/Rplots.ps and inst/doc/Rplots.pdf. These are almost certainly left-overs from Sweave runs, and are often large.
DEPRECATED & DEFUNCT:
- The ‘experimental’ alternative specification of a name space via .Export() etc is now deprecated.
- zip.file.extract() is now deprecated.
- Zip-ing data sets in packages (and hence R CMD INSTALL --use-zip-data and the ZipData: yes field in a DESCRIPTION file) is deprecated: using efficiently compressed .rda images and lazy-loading of data has superseded it.
BUG FIXES:
- identical() could in rare cases generate a warning about non-pairlist attributes on CHARSXPs. As these are used for internal purposes, the attribute check should be skipped. (Reported by Niels Richard Hansen).
- If the filename extension (usually .Rnw) was not included in a call to Sweave(), source references would not work properly and the keep.source option failed. (PR#14459)
- format.data.frame() now keeps zero character column names.
- pretty(x) no longer raises an error when x contains solely non-finite values. (PR#14468)
- The plot.TukeyHSD() function now uses a line width of 0.5 for its reference lines rather than lwd = 0 (which caused problems for some PDF and PostScript viewers).
- The big.mark argument to prettyNum(), format(), etc. was inserted reversed if it was more than one character long.
- R CMD check failed to check the filenames under man for Windows' reserved names.
- The "Date" and "POSIXt" methods for seq() could overshoot when to was supplied and by was specified in months or years.
- The internal method of untar() now restores hard links as file copies rather than symbolic links (which did not work for cross-directory links).
- unzip() did not handle zip files which contained filepaths with two or more leading directories which were not in the zipfile and did not already exist. (It is unclear if such zipfiles are valid and the third-party C code used did not support them, but PR#14462 created one.)
- combn(n, m) now behaves more regularly for the border case m = 0. (PR#14473)
- The rendering of numbers in plotmath expressions (e.g. expression(10^2)) used the current settings for conversion to strings rather than setting the defaults, and so could be affected by what has been done before. (PR#14477)
- The methods of napredict() and naresid() for na.action = na.exclude fits did not work correctly in the very rare event that every case had been omitted in the fit. (Reported by Simon Wood.)
- weighted.residuals(drop0=TRUE) returned a vector when the residuals were a matrix (e.g. those of class "mlm"). (Reported by Bill Dunlap.)
- Package HTML index files /html/00Index.html were generated with a stylesheet reference that was not correct for static browsing in libraries.
- ccf(na.action = na.pass) was not implemented.
- The parser accepted some incorrect numeric constants, e.g. 20x2. (Reported by Olaf Mersmann.)
- format(*, zero.print) did not always replace the full zero parts.
- Fixes for subsetting or subassignment of "raster" objects when not both i and j are specified.
- R CMD INSTALL was not always respecting the ZipData: yes field of a DESCRIPTION file (although this is frequently incorrectly specified for packages with no data or which specify lazy-loading of data). R CMD INSTALL --use-zip-data was incorrectly implemented as --use-zipdata since R 2.9.0.
- source(file, echo=TRUE) could fail if the file contained #line directives. It now recovers more gracefully, but may still display the wrong line if the directive gives incorrect information.
- atan(1i) returned NaN+Infi (rather than 0+Infi) on platforms without C99 complex support.
- library() failed to cache S4 metadata (unlike loadNamespace()) causing failures in S4-using packages without a namespace (e.g. those using reference classes).
- The function qlogis(lp, log.p=TRUE) no longer prematurely overflows to Inf when exp(lp) is close to 1.
- Updating S4 methods for a group generic function requires resetting the methods tables for the members of the group (patch contributed by Martin Morgan).
- In some circumstances (including for package XML), R CMD INSTALL installed version-control directories from source packages.
- Added PROTECT calls to some constructed expressions used in C level eval calls.
- utils:::create.post() (used by bug.report() and help.request()) failed to quote arguments to the mailer, and so often failed.
- bug.report() was naive about how to extract maintainer email addresses from package descriptions, so would often try mailing to incorrect addresses.
- debugger() could fail to read the environment of a call to a function with a ... argument. (Reported by Charlie Roosen.)
- prettyNum(c(1i, NA), drop0=TRUE) or str(NA_complex_) now work correctly.
LibreOffice 3.3 released
January 13, 2011 11:57 am
What does LibreOffice give you?
http://www.libreoffice.org/features/
WRITER is the word processor inside LibreOffice. Use it for everything, from dashing off a quick letter to producing an entire book with tables of contents, embedded illustrations, bibliographies and diagrams. The while-you-type auto-completion, auto-formatting and automatic spelling checking make difficult tasks easy (but are easy to disable if you prefer). Writer is powerful enough to tackle desktop publishing tasks such as creating multi-column newsletters and brochures. The only limit is your imagination.
CALC tames your numbers and helps with difficult decisions when you’re weighing the alternatives. Analyze your data with Calc and then use it to present your final output. Charts and analysis tools help bring transparency to your conclusions. A fully-integrated help system makes easier work of entering complex formulas. Add data from external databases such as SQL or Oracle, then sort and filter them to produce statistical analyses. Use the graphing functions to display a large number of 2D and 3D graphics from 13 categories, including line, area, bar, pie, X-Y, and net – with the dozens of variations available, you’re sure to find one that suits your project.
IMPRESS is the fastest and easiest way to create effective multimedia presentations. Stunning animation and sensational special effects help you convince your audience. Create presentations that look even more professional than the standard presentations you commonly see at work. Get your colleagues’ and bosses’ attention by creating something a little bit different.
DRAW lets you build diagrams and sketches from scratch. A picture is worth a thousand words, so why not try something simple with box and line diagrams? Or else go further and easily build dynamic 3D illustrations and special effects. It’s as simple or as powerful as you want it to be.
BASE is the database front-end of the LibreOffice suite. With Base, you can seamlessly integrate into your existing database structures. Based on imported and linked tables and queries from MySQL, PostgreSQL or Microsoft Access and many other data sources, you can build powerful databases containing forms, reports, views and queries. Full integration is possible with the in-built HSQL database.
MATH is a simple equation editor that lets you lay out and display your mathematical, chemical, electrical or scientific equations quickly in standard written notation. Even the most complex calculations can be understandable when displayed correctly. E=mc²
The Document Foundation just announced release candidate 3 of LibreOffice.
New Features-
http://www.libreoffice.org/download/new-features/

General
- Added the LibreColors to the palette;
- Added Quickstarter for Unix builds;
- Introduced Linux “Libertine G” and Linux “Biolinum G” fonts;
- Implement import of alpha channel for RGBA .tiffs [http://bugs.freedesktop.org/show_bug.cgi?id=30472];
- Show all appropriate formats by default on “Save As” [http://qa.openoffice.org/issues/show_bug.cgi?id=113141];
- Use radio buttons for mutually exclusive menu options;
- Replace the “Help Support” menu item by the “License Information” one;
- Load and save documents in flat XML;
- Made Help system available via the WikiHelp;
- Option to enable saving of documents at all times (see Tools -> Options -> LibreOffice -> General -> “Allow to save document…”).
Calc
- [http://bugs.freedesktop.org/show_bug.cgi?id=30559]: Added new tab page ‘Compatibility’ in the Options dialog;
- Better default key bindings;
- Use Ctrl-Shift-D to launch selection list in LibreOffice;
- Added new image file used in the “insert new sheet” button. This image is not visible in read-only mode;
- Fix fake small caps resizing factor [http://qa.openoffice.org/issues/show_bug.cgi?id=1526];
- Added dotted/dashed borders in Calc;
- Added icons for toggling sheet grids in Calc;
- Better performance and interoperability on Excel doc import;
- Better performance on DBF import;
- Slightly better performance on ODS import;
- Possibility to use English formula names;
- Distributed alignment – allows one to specify ‘distributed’ horizontal alignment and ‘justified’ and ‘distributed’ vertical alignments within cells. This is notably useful for CJK locales;
- Support for 3 different formula syntaxes: Calc A1, Excel A1 and Excel R1C1;
- Configurable argument and array separators in formula expressions;
- External reference works within OFFSET function;
- Hitting TAB during auto-complete commits current selection and moves to the next cell;
- Shift-TAB cycles through auto-complete selections;
- Find and replace skips those cells that are filtered out (thus hidden);
- Protecting sheet provides two additional sheet protection options, to optionally limit cursor placement in protected and unprotected areas;
- Copying a range highlights the range being copied. It also allows you to paste it by hitting ENTER key. Hitting ESC removes the range highlight;
- Jumping to and from references in formula cells via “Ctrl-[” and “Ctrl-]”;
- Cell cursor stays at the original cell during range selection.
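To make the three formula syntaxes mentioned above more concrete, here is a small Python sketch of how the same cell reference looks in each. It assumes the usual conventions: Calc A1 separates sheet and cell with a dot, Excel A1 uses an exclamation mark, and Excel R1C1 writes the cell as row and column numbers. The syntax labels (`calc_a1`, `excel_a1`, `excel_r1c1`) and function names are hypothetical, and only absolute single-cell references are handled (relative R1C1 forms such as R[1]C[1] are out of scope).

```python
import re


def column_to_number(letters):
    """Convert column letters to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    number = 0
    for ch in letters.upper():
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def a1_to_r1c1(ref):
    """Convert an A1-style cell reference such as 'B3' to absolute R1C1 ('R3C2')."""
    match = re.fullmatch(r"\$?([A-Za-z]+)\$?([0-9]+)", ref)
    if match is None:
        raise ValueError(f"not an A1-style reference: {ref!r}")
    letters, row = match.groups()
    return f"R{int(row)}C{column_to_number(letters)}"


def qualify(sheet, ref, syntax):
    """Write a sheet-qualified reference in one of the three syntaxes."""
    # Hypothetical labels for the syntaxes named in the feature list.
    separators = {"calc_a1": ".", "excel_a1": "!", "excel_r1c1": "!"}
    cell = a1_to_r1c1(ref) if syntax == "excel_r1c1" else ref
    return f"{sheet}{separators[syntax]}{cell}"


if __name__ == "__main__":
    for syntax in ("calc_a1", "excel_a1", "excel_r1c1"):
        print(syntax, qualify("Sheet1", "B3", syntax))
```

Range references and relative addressing would need more parsing than this sketch attempts.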
Writer
- AutoCorrections match case of the words that AutoCorrect replaces. (Issuezilla 2838);
- Ability to turn off number recognition in Writer;
- RTF export (from GSoC);
- Port of Lotus Word Pro filter;
- New dialog box for title page.
Impress/Draw
- PPTX chart import feature;
- [http://qa.openoffice.org/issues/show_bug.cgi?id=112421] Make “Presenter Screen” default to laptop, not projector;
- Improve randomization in “Dissolve” transition.
Math
- Default to just printing the formula itself in Math;
- [http://qa.openoffice.org/issues/show_bug.cgi?id=113400] Maths brackets malformed in presentation mode.
Base
- [http://qa.openoffice.org/issues/show_bug.cgi?id=112597] Added display properties to control shapes.
Development
- UNO APIs for size and moveProtect of notes;
- Via Issuezilla bug #i80184: allow addition of drawing documents to gallery via API.
Productivity Enhancements
- New custom properties handling;
- Embedding of standard PDF fonts;
- New “Narrow” font family;
- Increased document protection in Writer and Calc;
- Automatic decimal digits for “General” format in Calc;
- 1 million rows in a spreadsheet;
- New options for CSV (Comma-Separated Value) importation in Calc;
- Insert drawing objects in charts;
- Hierarchical axis labels for charts;
- Improved slide layout handling in Impress;
- Manual setting for primary key support for databases;
- Support of Read-Only database registration;
- New Math command: ‘nospace’.
Internationalization
- Additional locale data.
Usability and Interface
- Common search toolbar;
- New easier-to-use print interface;
- More options for changing case;
- Redesign of thesaurus;
- Resetting of text to the default language in Writer;
- Text rendering of form controls in Writer;
- Changed defaults for charts;
- Colored sheet tabs in Calc;
- Adaptation to marked selection for filter area in Calc;
- Sort dialog box for DataPilot in Calc;
- Display custom names for DataPilot fields, items and totals in Calc.
Developer Features and Extensibility
- Grid control enhancements;
- New MetaData node for database;
- Extending database drivers using extensions.

