Open Source Compiler for SAS Language: GNU Dap

I am still testing this out.

But if you know a bit more about make and compiling on Ubuntu, check out

http://www.gnu.org/software/dap/

I loved the humorous introduction:

Dap is a small statistics and graphics package based on C. Version 3.0 and later of Dap can read SBS programs (based on the utterly famous, industry standard statistics system with similar initials – you know the one I mean)! The user wishing to perform basic statistical analyses is now freed from learning and using C syntax for straightforward tasks, while retaining access to the C-style graphics and statistics features provided by the original implementation. Dap provides core methods of data management, analysis, and graphics that are commonly used in statistical consulting practice (univariate statistics, correlations and regression, ANOVA, categorical data analysis, logistic regression, and nonparametric analyses).

Anyone familiar with the basic syntax of C programs can learn to use the C-style features of Dap quickly and easily from the manual and the examples contained in it; advanced features of C are not necessary, although they are available. (The manual contains a brief introduction to the C syntax needed for Dap.) Because Dap processes files one line at a time, rather than reading entire files into memory, it can be, and has been, used on data sets that have very many lines and/or very many variables.

I wrote Dap to use in my statistical consulting practice because the aforementioned utterly famous, industry standard statistics system is (or at least was) not available on GNU/Linux and costs a bundle every year under a lease arrangement. And now you can run programs written for that system directly on Dap! I was generally happy with that system, except for the graphics, which are all but impossible to use,  but there were a number of clumsy constructs left over from its ancient origins.

http://www.gnu.org/software/dap/#Sample output

  • Unbalanced ANOVA
  • Crossed, nested ANOVA
  • Random model, unbalanced
  • Mixed model, balanced
  • Mixed model, unbalanced
  • Split plot
  • Latin square
  • Missing treatment combinations
  • Linear regression
  • Linear regression, model building
  • Ordinal cross-classification
  • Stratified 2×2 tables
  • Loglinear models
  • Logit  model for linear-by-linear association
  • Logistic regression

    Sounds too good to be true: GNU Dap joins WPS Workbench and Dulles Open’s Carolina as the third SAS language compiler (besides the now-defunct BASS software). See http://en.wikipedia.org/wiki/SAS_language#Controversy

     

    Also see http://en.wikipedia.org/wiki/DAP_(software)

    Dap was written to be a free replacement for SAS, but users are assumed to have a basic familiarity with the C programming language in order to permit greater flexibility. Unlike R it has been designed to be used on large data sets.

    It has been designed to cope with very large data sets, even when the size of the data exceeds the size of the computer’s memory.
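The line-at-a-time idea can be illustrated outside Dap as well. The following is a minimal R sketch, not Dap code: it assumes a hypothetical one-column numeric file (created here as a temporary file) and reads it through a connection in small chunks, so that only a running sum and a count are kept in memory.

```r
# Illustration only: stream a file in chunks instead of loading it whole.
# The file, its contents and the chunk size are assumptions for this sketch.
data_file <- tempfile(fileext = ".txt")
writeLines(as.character(1:1000), data_file)  # hypothetical data: 1..1000

con <- file(data_file, open = "r")
running_sum <- 0
running_n <- 0
repeat {
  chunk <- readLines(con, n = 100)   # read at most 100 lines at a time
  if (length(chunk) == 0) break      # end of file reached
  values <- as.numeric(chunk)
  running_sum <- running_sum + sum(values)
  running_n <- running_n + length(values)
}
close(con)

streamed_mean <- running_sum / running_n
print(streamed_mean)
```

Memory use depends on the chunk size rather than on the file size, which is the property the Dap description emphasizes.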

    Protovis: a graphical toolkit for visualization

    I just found out about a new data visualization tool called Protovis http://vis.stanford.edu/protovis/ex/

    Protovis composes custom views of data with simple marks such as bars and dots. Unlike low-level graphics libraries that quickly become tedious for visualization, Protovis defines marks through dynamic properties that encode data, allowing inheritance, scales and layouts to simplify construction.

    Protovis is free and open-source and is a Stanford project. It has been used in the web interface R-Node (which I will talk about later).

    http://squirelove.net/r-node/doku.php

    Conventional

    While Protovis is designed for custom visualization, it is still easy to create many standard chart types. These simpler examples serve as an introduction to the language, demonstrating key abstractions such as quantitative and ordinal scales, while hinting at more advanced features, including stack layout.

    Custom

    Many charting libraries provide stock chart designs, but offer only limited customization; Protovis excels at custom visualization design through a concise representation and precise control over graphical marks. These examples, including a few recreations of unusual historical designs, demonstrate the language’s expressiveness.

     

     

    Try Protovis today 🙂 http://vis.stanford.edu/protovis/

    It uses JavaScript and SVG for web-native visualizations; no plugin required (though you will need a modern web browser)! Although programming experience is helpful, Protovis is mostly declarative and designed to be learned by example.

    Viva LibreOffice

    The Document Foundation is happy to announce the release candidate of
    LibreOffice 3.3.1. This release candidate is the first in a series of
    frequent bugfix releases on top of our LibreOffice 3.3 product. Please
    be aware that LibreOffice 3.3.1 RC1 is not yet ready for production
    use; you should continue to use LibreOffice 3.3 for that.

    http://listarchives.documentfoundation.org/www/announce/msg00028.html

    Following is the list of changes against LibreOffice 3.3:

    Key changes at a glance:

    * Numerous translation updates
    * new mimetype icons for LibreOffice – explained here:
    http://luxate.blogspot.com/2011/01/not-even-included-but-already-improved.html
    * quite a few crasher fixes

    Detailed change log:

    * translation updates
    * Removed old/unmaintained icon themes
    * Fix for https://bugzilla.novell.com/show_bug.cgi?id=664516: Don’t
    use a reference or the default formula string will be changed
    * Install bash completion for oo* wrappers when enabled
    (https://bugzilla.novell.com/show_bug.cgi?id=665402)
    * Build fix: get the stlport compat workaround working for gcc 4.6.0
    * Build fix: no ddraw.h or ddraw.lib in the June 2010 DirectX SDK,
    removed usage
    * Windows installer: padded nologobanner.bmp, new size is 102×58
    * removed gd – Gaelic, ky – Kirghiz, pap – Papiamento, ti – Tigrinya,
    ms – Malay, ps – Pashto, ur – Urdu. UI localization does not exist
    in these languages. So it makes no sense to ship packages.
    * Build fix: pass thru PYTHON, found by configure. Will be used by
    filter/source/config/fragments/makefile.mk.
    * Upgraded libwpd (WordPerfect filter) to 0.9.1
    * Fixed BrOffice Windows start menu branding
    * Removed language code ‘kid’. kid is not Koshin, but key id pseudo
    language which is good for debugging UI but should not be included
    in the product
    * Added ca_XV and ast language/local name and description
    * Fixed incorrect page number in page preview mode
    (https://bugs.freedesktop.org/show_bug.cgi?id=33155). When the
    window is large enough to show several ‘Page X’ strings,
    the page number was not properly incremented.
    * Fixed incorrect import of cell attributes from Excel
    documents. When a cell with non-default formatting attribute starts
    with non-first row in a column, the filter would incorrectly apply
    the same format to all the cells above it if they didn’t have any
    formats.
    * Ubuntu: fix for lp#696527 – enable human icon theme in LibreOffice
    * Fix for https://bugzilla.redhat.com/show_bug.cgi?id=673819 crash on
    changing position of drawing object in header.
    * Changed OpenOffice.org to LibreOffice in nsplugin
    * Added Occitan dictionary
    * Added Ukrainian dictionaries
    * Fix window focus for langpack installation on Mac –
    https://bugs.freedesktop.org/show_bug.cgi?id=33056
    * Added/modified NLPsolver translations from Pootle
    * Fix for https://bugzilla.novell.com/show_bug.cgi?id=655763
    * Fix for RTF export crasher
    (https://bugzilla.novell.com/show_bug.cgi?id=656503)
    * Use LibreOffice as product name for EPS Creator header
    * Parse svg ‘color’ property (fixes
    https://bugs.freedesktop.org/show_bug.cgi?id=33551)
    * Use double instead of float in writerfilter import
    * Build fix: use PYTHON as passed through by set_soenv.in.
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33237 remove
    debug line
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33237 – fixes
    ole object import for writer (docx)
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33249
    rename OOo -> LibO on Getting Support Page
    * Fix ooxml import: handle css::table::BorderLine in addition to
    css::table::BorderLine2 That means that table cell properties are
    correctly set on import again.
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33258
    wikihelp: Improve the check for existence of the localized help.
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33994 – fixes
    several crashes around config UNO API
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=30879
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=32872
    Implementation names weren’t matching with xcu.
    * Fix: don’t pushback and process a corrupt extension
    * Fix: wikihelp – do not check for existence of the localized
    help. In case we do not have the help installed, it is up to the
    online service to decide the fallback in case a language version is
    not available.
    * Fix README: change su urpmi to sudo urpmi for Mandriva section
    * Fix README formatting –
    https://bugs.freedesktop.org/show_bug.cgi?id=32741 – using CRLF
    instead of LF on WIN platform
    * Fix README: word wrap at column 75 for better readability
    * Build fix: KDE3 library search order
    (https://bugs.freedesktop.org/show_bug.cgi?id=32797). Use LINKFLAGS
    instead of STDLIBS.
    * Start using technical.dic instead of oracle.dic
    (https://bugs.freedesktop.org/show_bug.cgi?id=31798)
    * Build fix: add explicit QRegion* for clipRegion to fix compile of
    kde backend
    * Cleanup: removed obsolete m_bSingleAltPress
    * Remove the menu when Left Alt Key was pressed for GTK
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33459: use
    year of era in long format for zh_TW by default
    * Fix wrong collation for Catalan language
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=31271 wrong
    line break with “(”
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=32561 – crash
    when iterating over the database types.
    * Default currency for Estonia should be Euro – fixes
    https://bugs.freedesktop.org/show_bug.cgi?id=33160
    * Avoid a pointless GetHelpText() call in the toolbox. Fixes
    https://bugs.freedesktop.org/show_bug.cgi?id=33315. GetHelpText()
    can be quite heavy, see
    https://bugs.freedesktop.org/show_bug.cgi?id=33088.
    * Paint toolbar handle positioned properly
    (https://bugs.freedesktop.org/show_bug.cgi?id=32558)
    * Build fix: move cxxabi.h after stl headers to workaround gcc 4.6.0
    and stlport
    * Fix for https://bugs.freedesktop.org/show_bug.cgi?id=33355
    manipulate also the C runtime’s environment
    * Fix for CTL/Other Default Font #i25247#, #i25561#, #i48064#,
    #i92341#
    * RTF export crasher
    (https://bugzilla.novell.com/show_bug.cgi?id=656503)
    * Fixed an infinite loop in RTF exporter
    * UI: translations need more space on word count dialog, made space
    for it.
    * Fix for https://bugzilla.novell.com/show_bug.cgi?id=660816 improve
    formfield checkbox binary export (and import)

    Again a BIG Thank You!

    Again, what’s LibreOffice?

    What does LibreOffice give you?

    Writer is the word processor inside LibreOffice. Use it for everything, from dashing off a quick letter to producing an entire book with tables of contents, embedded illustrations, bibliographies and diagrams. The while-you-type auto-completion, auto-formatting and automatic spelling checking make difficult tasks easy (but are easy to disable if you prefer). Writer is powerful enough to tackle desktop publishing tasks such as creating multi-column newsletters and brochures. The only limit is your imagination.

    Calc tames your numbers and helps with difficult decisions when you’re weighing the alternatives. Analyze your data with Calc and then use it to present your final output. Charts and analysis tools help bring transparency to your conclusions. A fully-integrated help system makes easier work of entering complex formulas. Add data from external databases such as SQL or Oracle, then sort and filter them to produce statistical analyses. Use the graphing functions to display a large number of 2D and 3D graphics from 13 categories, including line, area, bar, pie, X-Y, and net – with the dozens of variations available, you’re sure to find one that suits your project.

    Impress is the fastest and easiest way to create effective multimedia presentations. Stunning animation and sensational special effects help you convince your audience. Create presentations that look even more professional than the standard presentations you commonly see at work. Get your colleagues’ and bosses’ attention by creating something a little bit different.

    Draw lets you build diagrams and sketches from scratch. A picture is worth a thousand words, so why not try something simple with box and line diagrams? Or else go further and easily build dynamic 3D illustrations and special effects. It’s as simple or as powerful as you want it to be.

    Base is the database front-end of the LibreOffice suite. With Base, you can seamlessly integrate your existing database structures into the other components of LibreOffice, or create an interface to use and administer your data as a stand-alone application. You can use imported and linked tables and queries from MySQL, PostgreSQL or Microsoft Access and many other data sources, or design your own with Base, to build powerful front-ends with sophisticated forms, reports and views. Support is built-in or easily addable for a very wide range of database products, notably the standardly-provided HSQL, MySQL, Adabas D, Microsoft Access and PostgreSQL.

    Math is a simple equation editor that lets you lay out and display your mathematical, chemical, electrical or scientific equations quickly in standard written notation. Even the most complex calculations can be understandable when displayed correctly. E=mc².

    LibreOffice also comes configured with a PDF file creator, meaning you can distribute documents that you’re sure can be opened and read by users of almost any computing device or operating system.

    Download LibreOffice now and try it out today.

    http://www.libreoffice.org/features/

     

    Lyx Releases 2

    LyX releases a new version. Now if only there was a SIMPLE way to put R code in an existing LyX text class (having tried Sweave and sweaved myself into knots!) 😦

    I also hope Ubuntu Linux 10.10 netbook fixes the curious case of the disappearing menu bar in LyX

    see https://bugs.launchpad.net/ubuntu/+source/indicator-appmenu/+bug/619811

    (Hint: start LyX from the terminal using:
    QT_X11_NO_NATIVE_MENUBAR=1 lyx)

    Latest news from the LyX website:

    http://www.lyx.org/News#item2

    We are pleased to announce the release of LyX 1.6.9

     

    Beta Release: LyX 2.0.0 beta 4 released.

    February 6, 2011

    We are pleased to announce the fourth public pre-release of LyX 2.0.0.
    Apart from the usual bug fixing, we fixed random crashes connected with the new background export and compilation feature.

    As far as new features are concerned, it is now possible

    • to set the table width,
    • customize the language package per document,
    • export LyX files as a single archive containing linked material (e.g. images) directly via export menu.

     

    Since this is most probably the last beta release, we also added a converter for old (1.6) preference files, which are now automatically checked on startup.

     

    Windows Azure and Amazon Free offer

    For high-performance computing folks: try out Azure for free.

    http://www.microsoft.com/windowsazure/offers/popup/popup.aspx?lang=en&locale=en-US&offer=MS-AZR-0001P#compute

    Windows Azure Platform
    Introductory Special

    This promotional offer enables you to try a limited amount of the Windows Azure platform at no charge. The subscription includes a base level of monthly compute hours, storage, data transfers, a SQL Azure database, Access Control transactions and Service Bus connections at no charge. Please note that any usage over this introductory base level will be charged at standard rates.

    Included each month at no charge:

    • Windows Azure
      • 25 hours of a small compute instance
      • 500 MB of storage
      • 10,000 storage transactions
    • SQL Azure
      • 1GB Web Edition database (available for first 3 months only)
    • Windows Azure platform AppFabric
      • 100,000 Access Control transactions
      • 2 Service Bus connections
    • Data Transfers (per region)
      • 500 MB in
      • 500 MB out

    Any monthly usage in excess of the above amounts will be charged at the standard rates. This introductory special will end on March 31, 2011 and all usage will then be charged at the standard rates.

    Standard Rates:

    Windows Azure

    • Compute*
      • Extra small instance**: $0.05 per hour
      • Small instance (default): $0.12 per hour
      • Medium instance: $0.24 per hour
      • Large instance: $0.48 per hour
      • Extra large instance: $0.96 per hour
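To see what "usage in excess" could mean in practice, here is a rough R sketch. It uses only the 25 free hours and the $0.12 small-instance rate quoted above; the 720-hour month (30 days, running around the clock) is an assumption, and storage, transactions and data transfer are left out.

```r
# Rates and the 25 free hours come from the offer text above;
# the 720-hour month (30 days x 24 hours) is an assumption.
hours_in_month <- 30 * 24
free_hours <- 25
small_rate <- 0.12   # USD per hour, small compute instance

billable_hours <- max(hours_in_month - free_hours, 0)
monthly_cost <- billable_hours * small_rate

cat(sprintf("Billable hours: %d, compute cost: $%.2f\n",
            billable_hours, monthly_cost))
```

In other words, the free allowance covers roughly one day of a continuously running small instance; anything left running all month is billed almost entirely at the standard rate.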

     

    http://aws.amazon.com/ec2/pricing/

    Free Tier*

    As part of AWS’s Free Usage Tier, new AWS customers can get started with Amazon EC2 for free. Upon sign-up, new AWS customers receive the following EC2 services each month for one year:

    • 750 hours of EC2 running Linux/Unix Micro instance usage
    • 750 hours of Elastic Load Balancing plus 15 GB data processing
    • 10 GB of Amazon Elastic Block Storage (EBS) plus 1 million IOs, 1 GB snapshot storage, 10,000 snapshot Get Requests and 1,000 snapshot Put Requests
    • 15 GB of bandwidth in and 15 GB of bandwidth out aggregated across all AWS services

     

    Paid Instances-

     

    Standard On-Demand Instances      Linux/UNIX Usage    Windows Usage
      Small (Default)                 $0.085 per hour     $0.12 per hour
      Large                           $0.34 per hour      $0.48 per hour
      Extra Large                     $0.68 per hour      $0.96 per hour
    Micro On-Demand Instances
      Micro                           $0.02 per hour      $0.03 per hour
    High-Memory On-Demand Instances
      Extra Large                     $0.50 per hour      $0.62 per hour
      Double Extra Large              $1.00 per hour      $1.24 per hour
      Quadruple Extra Large           $2.00 per hour      $2.48 per hour
    High-CPU On-Demand Instances
      Medium                          $0.17 per hour      $0.29 per hour
      Extra Large                     $0.68 per hour      $1.16 per hour
    Cluster Compute Instances
      Quadruple Extra Large           $1.60 per hour      N/A*
    Cluster GPU Instances
      Quadruple Extra Large           $2.10 per hour      N/A*
    * Windows is not currently available for Cluster Compute or Cluster GPU Instances.
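For a quick side-by-side, the hourly prices can be turned into monthly figures. This is a small R sketch using the EC2 rates listed above and the Azure small-instance rate quoted earlier in this post; the 720-hour month and the choice of instances to compare are assumptions, and nothing but compute hours is counted.

```r
hours_in_month <- 30 * 24   # assumption: a 30-day month, running 24x7

rates <- data.frame(
  offering = c("EC2 Micro (Linux)", "EC2 Small (Linux)",
               "EC2 Small (Windows)", "Azure Small"),
  usd_per_hour = c(0.02, 0.085, 0.12, 0.12)
)
rates$usd_per_month <- rates$usd_per_hour * hours_in_month
print(rates)

# The EC2 free tier's 750 Micro hours exceed even a 31-day month
free_tier_covers_month <- 750 >= 31 * 24
print(free_tier_covers_month)
```

The instance definitions are not identical across the two providers, so the numbers are indicative only.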

     

    NOTE- Amazon Instance definitions differ slightly from Azure definitions

    http://aws.amazon.com/ec2/instance-types/

    Available Instance Types

    Standard Instances

    Instances of this family are well suited for most applications.

    Small Instance – default*

    1.7 GB memory
    1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
    160 GB instance storage
    32-bit platform
    I/O Performance: Moderate
    API name: m1.small

    Large Instance

    7.5 GB memory
    4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
    850 GB instance storage
    64-bit platform
    I/O Performance: High
    API name: m1.large

    Extra Large Instance

    15 GB memory
    8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
    1,690 GB instance storage
    64-bit platform
    I/O Performance: High
    API name: m1.xlarge

    Micro Instances

    Instances of this family provide a small amount of consistent CPU resources and allow you to burst CPU capacity when additional cycles are available. They are well suited for lower throughput applications and web sites that consume significant compute cycles periodically.

    Micro Instance

    613 MB memory
    Up to 2 EC2 Compute Units (for short periodic bursts)
    EBS storage only
    32-bit or 64-bit platform
    I/O Performance: Low
    API name: t1.micro

    High-Memory Instances

    Instances of this family offer large memory sizes for high throughput applications, including database and memory caching applications.

    High-Memory Extra Large Instance

    17.1 GB of memory
    6.5 EC2 Compute Units (2 virtual cores with 3.25 EC2 Compute Units each)
    420 GB of instance storage
    64-bit platform
    I/O Performance: Moderate
    API name: m2.xlarge

    High-Memory Double Extra Large Instance

    34.2 GB of memory
    13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
    850 GB of instance storage
    64-bit platform
    I/O Performance: High
    API name: m2.2xlarge

    High-Memory Quadruple Extra Large Instance

    68.4 GB of memory
    26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each)
    1690 GB of instance storage
    64-bit platform
    I/O Performance: High
    API name: m2.4xlarge

    High-CPU Instances

    Instances of this family have proportionally more CPU resources than memory (RAM) and are well suited for compute-intensive applications.

    High-CPU Medium Instance

    1.7 GB of memory
    5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each)
    350 GB of instance storage
    32-bit platform
    I/O Performance: Moderate
    API name: c1.medium

    High-CPU Extra Large Instance

    7 GB of memory
    20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
    1690 GB of instance storage
    64-bit platform
    I/O Performance: High
    API name: c1.xlarge

    Cluster Compute Instances

    Instances of this family provide proportionally high CPU resources with increased network performance and are well suited for High Performance Compute (HPC) applications and other demanding network-bound applications. Learn more about use of this instance type for HPC applications.

    Cluster Compute Quadruple Extra Large Instance

    23 GB of memory
    33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
    1690 GB of instance storage
    64-bit platform
    I/O Performance: Very High (10 Gigabit Ethernet)
    API name: cc1.4xlarge

    Cluster GPU Instances

    Instances of this family provide general-purpose graphics processing units (GPUs) with proportionally high CPU and increased network performance for applications benefitting from highly parallelized processing, including HPC, rendering and media processing applications. While Cluster Compute Instances provide the ability to create clusters of instances connected by a low latency, high throughput network, Cluster GPU Instances provide an additional option for applications that can benefit from the efficiency gains of the parallel computing power of GPUs over what can be achieved with traditional processors. Learn more about use of this instance type for HPC applications.

    Cluster GPU Quadruple Extra Large Instance

    22 GB of memory
    33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
    2 x NVIDIA Tesla “Fermi” M2050 GPUs
    1690 GB of instance storage
    64-bit platform
    I/O Performance: Very High (10 Gigabit Ethernet)
    API name: cg1.4xlarge

    versus-

    Windows Azure compute instances come in five unique sizes to enable complex applications and workloads.

    Compute Instance Size   CPU           Memory    Instance Storage   I/O Performance
    Extra Small             1 GHz         768 MB    20 GB*             Low
    Small                   1.6 GHz       1.75 GB   225 GB             Moderate
    Medium                  2 x 1.6 GHz   3.5 GB    490 GB             High
    Large                   4 x 1.6 GHz   7 GB      1,000 GB           High
    Extra large             8 x 1.6 GHz   14 GB     2,040 GB           High

    *There is a limitation on the Virtual Hard Drive (VHD) size if you are deploying a Virtual Machine role on an extra small instance. The VHD can only be up to 15 GB.

     

     

    Challenges of Analyzing a dataset (with R)

    Analyzing data can have many challenges associated with it. In the case of business analytics data, these challenges or constraints can have a marked effect on the quality and timeliness of the analysis as well as the expected versus actual payoff from the analytical results.

    Challenges of Analytical Data Processing-

    1) Data Formats- Reading in complete data, without losing any part (or metadata) or adding superfluous details (that increase the scope). Technical constraints of data formats are relatively easy to navigate thanks to ODBC and to well-documented, easily searchable syntax and language.

    The costs of additional data augmentation (should we pay for additional credit bureau data to be appended?) and the time needed to store and process the data (every column needed for analysis adds as many values as the dataset has rows, which becomes time-consuming if you are considering an extra 100 variables with a few million rows), but above all business relevance and quality guidelines, ensure that basic data input and massaging form a considerable part of the whole analytical project timeline.
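A back-of-the-envelope R sketch makes the storage side of this concrete. It assumes 3 million rows as a stand-in for "a few million" and that every extra variable is a plain numeric column, which R stores as 8 bytes per value.

```r
n_rows <- 3e6          # assumption: "a few million rows"
n_extra_cols <- 100    # the extra 100 variables mentioned above
bytes_per_value <- 8   # R stores numeric (double) values in 8 bytes

extra_bytes <- n_rows * n_extra_cols * bytes_per_value
extra_gib <- extra_bytes / 1024^3
cat(sprintf("Extra memory: about %.2f GiB\n", extra_gib))

# Cross-check the 8-bytes-per-value figure on a small numeric vector
small <- numeric(1e6)
print(object.size(small), units = "MB")
```

Under these assumptions the additional columns alone need a bit over 2 GiB of memory before any analysis has started.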

    2) Data Quality- Perfect data exists in a perfect world. The price of perfect information is one that a business will mostly never budget or wait for. Delivering inferences and results based on summaries of data that has missing, invalid, or outlier values embedded within it makes the role of the analyst just as important as whichever tool is chosen to remove outliers, replace missing values, or treat invalid data.
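The three treatments named above (invalid data, outliers, missing values) might look like this in base R. This is only a sketch: the income vector is made up, and the 1.5 x IQR rule and median imputation are common defaults chosen for illustration, not the only or necessarily the right choices for a given business problem.

```r
# Hypothetical column with a missing value, an invalid value and an outlier
income <- c(42, 38, NA, 45, -1, 40, 400, 39)

# 1. Treat invalid data: a negative income is impossible here, so mark it missing
income[!is.na(income) & income < 0] <- NA

# 2. Flag outliers with the 1.5 * IQR rule
q <- unname(quantile(income, c(0.25, 0.75), na.rm = TRUE))
fence <- 1.5 * (q[2] - q[1])
is_outlier <- !is.na(income) & (income < q[1] - fence | income > q[2] + fence)

# 3. Replace missing and outlying values with the median of the remaining values
clean <- income
clean[is_outlier] <- NA
clean[is.na(clean)] <- median(clean, na.rm = TRUE)

print(clean)
```

Each step encodes a judgement call (what counts as invalid, how wide the fence is, what to impute), which is exactly where the analyst matters more than the tool.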

    3) Project Scope-

    How much data? How much Analytical detail versus High Level Summary? Timelines for delivery as well as refresh of data analysis? Checks (statistical as well as business)?

    How easy is it to load and implement the new analysis in the existing Information Technology infrastructure? These are some of the outer parameters that can limit your analytical project scope, your analytical tool choice, and your processing methodology.

    4) Output Results vis-à-vis stakeholder expectation management-

    Stakeholders like to see results, not constraints, hypotheses, assumptions, p-values, or chi-square values. Output results need to be streamlined into a decision management process to justify the investment of human time and effort in an analytical project; choosing, training on, and navigating the complexities and constraints of analytical tools are a subset of that. Optimum use of graphical display is part of aligning results into a form more palatable to stakeholders, provided the graphics are done nicely.

    E.g., Marketing wants to get more sales, so they need a clear campaign to target certain customers via specific channels with specified collateral. To support their business judgement, business analytics needs to validate, cross-validate and sometimes invalidate this business decision-making with clear, transparent methods and processes.

    Given a dataset, the basic analytical steps that an analyst will perform with R are as follows. This is meant as a note for analysts at a beginner level with R.

    Package-specific syntax

    update.packages() #This updates all installed packages
    install.packages("package1") #This installs a package locally, a one-time event (the package name must be quoted)
    library(package1) #This loads the specified package in the current R session, which needs to be done every R session

    CRAN -> LOCAL HARD DISK -> R SESSION is the top-to-bottom hierarchy of package storage and invocation.
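The install-once, load-every-session pattern can be written so that it is safe to rerun. A minimal sketch follows; it uses "stats", a package that ships with R, purely so the example runs without a network connection, and any CRAN package name could be substituted.

```r
pkg <- "stats"   # assumption: stand-in package that is already on the local disk

# Install from CRAN only if the package is not yet on the local hard disk
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Attach it for the current R session (needed once per session)
library(pkg, character.only = TRUE)

# Confirm that the package is now on the search path
print(paste0("package:", pkg) %in% search())
```

`character.only = TRUE` is needed because the package name is held in a variable rather than typed directly.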

    ls() #This lists all objects or datasets currently active in the R session

    > names(assetsCorr)  #This gives the names of variables within a dataframe
     [1] "AssetClass"            "LargeStocksUS"         "SmallStocksUS"
     [4] "CorporateBondsUS"      "TreasuryBondsUS"       "RealEstateUS"
     [7] "StocksCanada"          "StocksUK"              "StocksGermany"
    [10] "StocksSwitzerland"     "StocksEmergingMarkets"

    > str(assetsCorr) #gives complete structure of dataset
    'data.frame':    12 obs. of  11 variables:
    $ AssetClass           : Factor w/ 12 levels "CorporateBondsUS",..: 4 5 2 6 1 12 3 7 11 9 ...
    $ LargeStocksUS        : num  15.3 16.4 1 0 0 ...
    $ SmallStocksUS        : num  13.49 16.64 0.66 1 0 ...
    $ CorporateBondsUS     : num  9.26 6.74 0.38 0.46 1 0 0 0 0 0 ...
    $ TreasuryBondsUS      : num  8.44 6.26 0.33 0.27 0.95 1 0 0 0 0 ...
    $ RealEstateUS         : num  10.6 17.32 0.08 0.59 0.35 ...
    $ StocksCanada         : num  10.25 19.78 0.56 0.53 -0.12 ...
    $ StocksUK             : num  10.66 13.63 0.81 0.41 0.24 ...
    $ StocksGermany        : num  12.1 20.32 0.76 0.39 0.15 ...
    $ StocksSwitzerland    : num  15.01 20.8 0.64 0.43 0.55 ...
    $ StocksEmergingMarkets: num  16.5 36.92 0.3 0.6 0.12 ...

    > dim(assetsCorr) #gives the dimensions: number of observations, then number of variables
    [1] 12 11

    str(Dataset) – This gives the structure of the dataset (note structure gives both the names of variables within dataset as well as dimensions of the dataset)

    head(dataset,n1) gives the first n1 rows of dataset while
    tail(dataset,n2) gives the last n2 rows of a dataset where n1,n2 are numbers and dataset is the name of the object (here a data frame that is being considered)

    summary(dataset) gives you a brief summary of all variables while

    library(Hmisc)
    describe(dataset) gives a detailed description of the variables
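To try these inspection commands without a dataset of your own, the sketch below runs them on mtcars, a small data frame that ships with R. The choice of mtcars and of the row counts is an assumption for illustration; Hmisc's describe() is left out so that nothing needs to be installed.

```r
# mtcars is a built-in example data frame, used here as a stand-in dataset
print(names(mtcars))        # variable names
print(dim(mtcars))          # 32 observations, 11 variables
str(mtcars)                 # names, types and dimensions in one view

first_rows <- head(mtcars, 3)   # first 3 rows
last_rows <- tail(mtcars, 2)    # last 2 rows
print(first_rows)
print(last_rows)

print(summary(mtcars$mpg))  # brief summary of a single variable
print(summary(mtcars))      # brief summary of every variable
```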

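    Simple graphics can be produced by

    hist(Dataset1$variable1) #variable1 is a placeholder for a numeric column; base hist() needs a numeric vector (with Hmisc loaded, hist(Dataset1) draws one histogram per variable)
    and
    plot(Dataset1)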
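Here is a small runnable sketch of both calls, again using the built-in mtcars data frame as a stand-in. Writing to a temporary PDF file is an assumption made so the example also works where no screen device is available; in an interactive session the pdf() and dev.off() lines can simply be dropped.

```r
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)   # open a file-based graphics device

# Histogram of one numeric variable
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")

# Scatterplot of two variables
plot(mtcars$wt, mtcars$mpg, xlab = "Weight (1000 lbs)", ylab = "mpg")

# plot() on a data frame draws a scatterplot matrix of its columns
plot(mtcars[, c("mpg", "wt", "hp")])

invisible(dev.off())   # close the device so the file is written
print(plot_file)
```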

    As you can see in the above cases, there are multiple ways to get even a basic analysis of data in R; however, most of the syntax commands are intuitively understood (like hist for histogram, t.test for t-test, plot for plot).

    For detailed analysis throughout the scope of a project, a business analytics user is recommended to use multiple GUIs and multiple packages. Even for highly specific and specialized analytical tasks, it is worth checking for a GUI that incorporates the required package.