send email by R

For automated report delivery I have often used send email options in BASE SAS. For R, for scheduling tasks and sending me automated mails on completion of tasks I have two R options and 1 Windows OS scheduling option. Note red font denotes the parameters that should be changed. Anything else should NOT be changed.

Option 1-

Use the mail package at

http://cran.r-project.org/web/packages/mail/mail.pdf

> library(mail)

Attaching package: ‘mail’

The following object(s) are masked from ‘package:sendmailR’:

sendmail

>
> sendmail(“ohri2007@gmail.com“, subject=”Notification from R“,message=“Calculation finished!”, password=”rmail”)
[1] “Message was sent to ohri2007@gmail.com! You have 19 messages left.”

Disadvantage- Only 20 email messages by IP address per day. (but thats ok!)

Option 2-

use sendmailR package at http://cran.r-project.org/web/packages/sendmailR/sendmailR.pdf

install.packages()
library(sendmailR)
from <- sprintf(“<sendmailR@%s>”, Sys.info()[4])
to <- “<ohri2007@gmail.com>”
subject <- “Hello from R
body <- list(“It works!”, mime_part(iris))
sendmail(from, to, subject, body,control=list(smtpServer=”ASPMX.L.GOOGLE.COM”))

 

 

BiocInstaller version 1.2.1, ?biocLite for help
> install.packages(“sendmailR”)
Installing package(s) into ‘/home/ubuntu/R/library’
(as ‘lib’ is unspecified)
also installing the dependency ‘base64’

trying URL ‘http://cran.at.r-project.org/src/contrib/base64_1.1.tar.gz&#8217;
Content type ‘application/x-gzip’ length 61109 bytes (59 Kb)
opened URL
==================================================
downloaded 59 Kb

trying URL ‘http://cran.at.r-project.org/src/contrib/sendmailR_1.1-1.tar.gz&#8217;
Content type ‘application/x-gzip’ length 6399 bytes
opened URL
==================================================
downloaded 6399 bytes

BiocInstaller version 1.2.1, ?biocLite for help
* installing *source* package ‘base64’ …
** package ‘base64’ successfully unpacked and MD5 sums checked
** libs
gcc -std=gnu99 -I/usr/local/lib64/R/include -I/usr/local/include -fpic -g -O2 -c base64.c -o base64.o
gcc -std=gnu99 -shared -L/usr/local/lib64 -o base64.so base64.o -L/usr/local/lib64/R/lib -lR
installing to /home/ubuntu/R/library/base64/libs
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices …
** testing if installed package can be loaded
BiocInstaller version 1.2.1, ?biocLite for help

* DONE (base64)
BiocInstaller version 1.2.1, ?biocLite for help
* installing *source* package ‘sendmailR’ …
** package ‘sendmailR’ successfully unpacked and MD5 sums checked
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices …
** testing if installed package can be loaded
BiocInstaller version 1.2.1, ?biocLite for help

* DONE (sendmailR)

The downloaded packages are in
‘/tmp/RtmpsM222s/downloaded_packages’
> library(sendmailR)
Loading required package: base64
> from <- sprintf(“<sendmailR@%s>”, Sys.info()[4])
> to <- “<ohri2007@gmail.com>”
> subject <- “Hello from R”
> body <- list(“It works!”, mime_part(iris))
> sendmail(from, to, subject, body,
+ control=list(smtpServer=”ASPMX.L.GOOGLE.COM”))
$code
[1] “221”

$msg
[1] “2.0.0 closing connection ff2si17226764qab.40”

Disadvantage-This worked when I used the Amazon Cloud using the BioConductor AMI (for free 2 hours) at http://www.bioconductor.org/help/cloud/

It did NOT work when I tried it use it from my Windows 7 Home Premium PC from my Indian ISP (!!) .

It gave me this error

or in wait_for(250) :
SMTP Error: 5.7.1 [180.215.172.252] The IP you’re using to send mail is not authorized

 

PAUSE–

ps Why do this (send email by R)?

Note you can add either of the two programs of the end of the code that you want to be notified automatically. (like daily tasks)

This is mostly done for repeated business analytics tasks (like reports and analysis that need to be run at specific periods of time)

pps- What else can I do with this?

Can be modified to include sms or tweets  or even blog by email by modifying the   “to”  location appropriately.

3) Using Windows Task Scheduler to run R codes automatically (either the above)

or just sending an email

got to Start>  All Programs > Accessories >System Tools > Task Scheduler ( or by default C:Windowssystem32taskschd.msc)

Create a basic task

Now you can use this to run your daily/or scheduled R code  or you can send yourself email as well.

and modify the parameters- note the SMTP server (you can use the ones for google in example 2 at ASPMX.L.GOOGLE.COM)

and check if it works!

 

Related

 Geeky Things , Bro

Configuring IIS on your Windows 7 Home Edition-

note path to do this is-

Control Panel>All Control Panel Items> Program and Features>Turn Windows features on or off> Internet Information Services

and

http://stackoverflow.com/questions/709635/sending-mail-from-batch-file

 

R for Predictive Modeling- PAW Toronto

A nice workshop on using R for Predictive Modeling by Max Kuhn Director, Nonclinical Statistics, Pfizer is on at PAW Toronto.

Workshop

Monday, April 23, 2012 in Toronto
Full-day: 9:00am – 4:30pm

R for Predictive Modeling:
A Hands-On Introduction

Intended Audience: Practitioners who wish to learn how to execute on predictive analytics by way of the R language; anyone who wants “to turn ideas into software, quickly and faithfully.”

Knowledge Level: Either hands-on experience with predictive modeling (without R) or hands-on familiarity with any programming language (other than R) is sufficient background and preparation to participate in this workshop.


What prior attendees have exclaimed


Workshop Description

This one-day session provides a hands-on introduction to R, the well-known open-source platform for data analysis. Real examples are employed in order to methodically expose attendees to best practices driving R and its rich set of predictive modeling packages, providing hands-on experience and know-how. R is compared to other data analysis platforms, and common pitfalls in using R are addressed.

The instructor, a leading R developer and the creator of CARET, a core R package that streamlines the process for creating predictive models, will guide attendees on hands-on execution with R, covering:

  • A working knowledge of the R system
  • The strengths and limitations of the R language
  • Preparing data with R, including splitting, resampling and variable creation
  • Developing predictive models with R, including decision trees, support vector machines and ensemble methods
  • Visualization: Exploratory Data Analysis (EDA), and tools that persuade
  • Evaluating predictive models, including viewing lift curves, variable importance and avoiding overfitting

Hardware: Bring Your Own Laptop
Each workshop participant is required to bring their own laptop running Windows or OS X. The software used during this training program, R, is free and readily available for download.

Attendees receive an electronic copy of the course materials and related R code at the conclusion of the workshop.


Schedule

  • Workshop starts at 9:00am
  • Morning Coffee Break at 10:30am – 11:00am
  • Lunch provided at 12:30 – 1:15pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 4:30pm

Instructor

Max Kuhn, Director, Nonclinical Statistics, Pfizer

Max Kuhn is a Director of Nonclinical Statistics at Pfizer Global R&D in Connecticut. He has been applying models in the pharmaceutical industries for over 15 years.

He is a leading R developer and the author of several R packages including the CARET package that provides a simple and consistent interface to over 100 predictive models available in R.

Mr. Kuhn has taught courses on modeling within Pfizer and externally, including a class for the India Ministry of Information Technology.

Source-

http://www.predictiveanalyticsworld.com/toronto/2012/r_for_predictive_modeling.php

How to use Bit Torrents

I really liked the software Qbittorent available from http://www.qbittorrent.org/ I think bit torrents should be the default way of sharing huge content especially software downloads. For protecting intellectual property there should be much better codes and software keys than presently available.

The qBittorrent project aims to provide a Free Software alternative to µtorrent. Additionally, qBittorrent runs and provides the same features on all major platforms (Linux, Mac OS X, Windows, OS/2, FreeBSD).

qBittorrent is based on Qt4 toolkit and libtorrent-rasterbar.

qBittorrent v2 Features

  • Polished µTorrent-like User Interface
  • Well-integrated and extensible Search Engine
    • Simultaneous search in most famous BitTorrent search sites
    • Per-category-specific search requests (e.g. Books, Music, Movies)
  • All Bittorrent extensions
    • DHT, Peer Exchange, Full encryption, Magnet/BitComet URIs, …
  • Remote control through a Web user interface
    • Nearly identical to the regular UI, all in Ajax
  • Advanced control over trackers, peers and torrents
    • Torrents queueing and prioritizing
    • Torrent content selection and prioritizing
  • UPnP / NAT-PMP port forwarding support
  • Available in ~25 languages (Unicode support)
  • Torrent creation tool
  • Advanced RSS support with download filters (inc. regex)
  • Bandwidth scheduler
  • IP Filtering (eMule and PeerGuardian compatible)
  • IPv6 compliant
  • Sequential downloading (aka “Download in order”)
  • Available on most platforms: Linux, Mac OS X, Windows, OS/2, FreeBSD
So if you are new to Bit Torrents- here is a brief tutorial
Some terminology from

Tracker

tracker is a server that keeps track of which seeds and peers are in the swarm.

Seed

Seed is used to refer to a peer who has 100% of the data. When a leech obtains 100% of the data, that peer automatically becomes a Seed.

Peer

peer is one instance of a BitTorrent client running on a computer on the Internet to which other clients connect and transfer data.

Leech

leech is a term with two meanings. Primarily leech (or leeches) refer to a peer (or peers) who has a negative effect on the swarm by having a very poor share ratio (downloading much more than they upload, creating a ratio less than 1.0)
1) Download and install the software from  http://www.qbittorrent.org/
2) If you want to search for new files, you can use the nice search features in here
3) If you want to CREATE new bit torrents- go to Tools -Torrent Creator
4) For sharing content- just seed the torrent you just created. What is seeding – hey did you read the terminology in the beginning?
5) Additionally –
From

Trackers: Below are some popular public trackers. They are servers which help peers to communicate.

Here are some good trackers you can use:

 

http://open.tracker.thepiratebay.org/announce
http://www.torrent-downloads.to:2710/announce
http://denis.stalker.h3q.com:6969/announce
udp://denis.stalker.h3q.com:6969/announce
http://www.sumotracker.com/announce

and

Super-seeding

When a file is new, much time can be wasted because the seeding client might send the same file piece to many different peers, while other pieces have not yet been downloaded at all. Some clients, like ABCVuzeBitTornado, TorrentStorm, and µTorrent have a “super-seed” mode, where they try to only send out pieces that have never been sent out before, theoretically making the initial propagation of the file much faster. However the super-seeding becomes less effective and may even reduce performance compared to the normal “rarest first” model in cases where some peers have poor or limited connectivity. This mode is generally used only for a new torrent, or one which must be re-seeded because no other seeds are available.
Note- you use this tutorial and any or all steps at your own risk. I am not legally responsible for any mishaps you get into. Please be responsible while being an efficient bit tor renter. That means respecting individual property rights.

Using Two Operating Systems for RATTLE, #Rstats Data Mining GUI

Using a virtual partition is slightly better than using a dual boot system. That is because you can keep the specialized operating system (usually Linux) within the main operating system (usually Windows), browse and alternate between the two operating system just using a simple command, and can utilize the advantages of both operating system.

Also you can create project specific discs for enhanced security.

In my (limited ) Mac experience, the comparisons of each operating system are-

1) Mac-  Both robust and aesthetically designed OS, the higher price and hardware-lockin for Mac remains a disadvantage. Also many stats and analytical software just wont work on the Mac

2) Windows- It is cheaper than Mac and easier to use than Linux. Also has the most compatibility with applications (usually when not crashing)

3) Linux- The lightest and most customized software in the OS class, free to use, and has many lite versions for newbies. Not compatible with mainstream corporate IT infrastructure as of 2011.

I personally use VMWare Player for creating the virtual disk (as much more convenient than the wubi.exe method)  from http://www.vmware.com/support/product-support/player/  (and downloadable from http://downloads.vmware.com/d/info/desktop_downloads/vmware_player/3_0)

That enables me to use Ubuntu on the alternative OS- keeping my Windows 7 for some Windows specific applications . For software like Rattle, the R data mining GUI , it helps to use two operating systems, in view of difficulties in GTK+.

Installing Rattle on Windows 7 is a major pain thanks to backward compatibility issues and version issues of GTK, but it installs on Ubuntu like a breeze- and it is very very convenient to switch between the two operating systems

Download Rattle from http://rattle.togaware.com/ and test it on the dual OS arrangement to see yourself.

 

 

 

 

 

US-CERT Incident Reporting System

Here are some resources if your cyber resources have been breached. Note the form doesnot use CAPTCHA at all

US-CERT Incident Reporting System (their head Randy Vickers quit last week)

https://forms.us-cert.gov/report/

Using the US-CERT Incident Reporting SystemIn order for us to respond appropriately, please answer the questions as completely and accurately as possible. Questions that must be answered are labeled “Required”. As always, we will protect your sensitive information. This web site uses Secure Sockets Layer (SSL) to provide secure communications. Your browser must allow at least 40-bit encryption. This method of communication is much more secure than unencrypted email.  Continue reading “US-CERT Incident Reporting System”

Top 25 Errors in Programming that lead to hacker attacks

I am elaborating an earlier article on https://decisionstats.com/top-25-most-dangerous-software-errors/ based on my continued research into cyber conflict and strategy. My inputs are in italics – the rest is a condensed article for further thought.

This is thus a very useful initiative for the world to follow and upgrade their cyber security.

It is in accordance with the US policy to secure its cyber infrastructure (http://www.whitehouse.gov/the-press-office/remarks-president-securing-our-nations-cyber-infrastructure)  and countries like India, and even Europe as well as other nations could do well to atleast benchmark their own security practices in software and digital infrastructure with it. There seems to much better technical coordination between rogue hackers than patriotic hackers imho 😉


The Department of Homeland Security of the United States of America has just launched a list of top 25 errors in programming or creating software that increase vulnerability to hacking attacks. The list which is available at http://cwe.mitre.org/top25/index.html lists down a methodology fo measuring vulnerability called Common Weakness Scoring System (CWSS) and uses that score to rank the various errors as well as suggestions to eliminate these weaknesses or errors.
Measuring Weaknesses

The importance of a weakness (that arises due to software bugs) may vary depending on business usage or project implementation, the technologies , operating systems and computing environments in use, and the risk or threat perception.The Common Weakness Scoring System (CWSS) provides a mechanism for scoring weaknesses. and provides a framework for prioritizing security errors (“weaknesses”) that are discovered in software applications.
Identifying Weaknesses
For example the number 1 weakness is shown with
1CWE-89: Improper Neutralization of Special Elements used in an SQL Command (‘SQL Injection’).
The rest of the weaknesses are

RANK SCORE ID NAME
[1] 93.8 CWE-89 Improper Neutralization of Special Elements used in an SQL Command (‘SQL Injection’)
[2] 83.3 CWE-78 Improper Neutralization of Special Elements used in an OS Command (‘OS Command Injection’)
[3] 79.0 CWE-120 Buffer Copy without Checking Size of Input (‘Classic Buffer Overflow’)
[4] 77.7 CWE-79 Improper Neutralization of Input During Web Page Generation (‘Cross-site Scripting’)
[5] 76.9 CWE-306 Missing Authentication for Critical Function
[6] 76.8 CWE-862 Missing Authorization
[7] 75.0 CWE-798 Use of Hard-coded Credentials
[8] 75.0 CWE-311 Missing Encryption of Sensitive Data
[9] 74.0 CWE-434 Unrestricted Upload of File with Dangerous Type
[10] 73.8 CWE-807 Reliance on Untrusted Inputs in a Security Decision
[11] 73.1 CWE-250 Execution with Unnecessary Privileges
[12] 70.1 CWE-352 Cross-Site Request Forgery (CSRF)
[13] 69.3 CWE-22 Improper Limitation of a Pathname to a Restricted Directory (‘Path Traversal’)
[14] 68.5 CWE-494 Download of Code Without Integrity Check
[15] 67.8 CWE-863 Incorrect Authorization
[16] 66.0 CWE-829 Inclusion of Functionality from Untrusted Control Sphere
[17] 65.5 CWE-732 Incorrect Permission Assignment for Critical Resource
[18] 64.6 CWE-676 Use of Potentially Dangerous Function
[19] 64.1 CWE-327 Use of a Broken or Risky Cryptographic Algorithm
[20] 62.4 CWE-131 Incorrect Calculation of Buffer Size
[21] 61.5 CWE-307 Improper Restriction of Excessive Authentication Attempts
[22] 61.1 CWE-601 URL Redirection to Untrusted Site (‘Open Redirect’)
[23] 61.0 CWE-134 Uncontrolled Format String
[24] 60.3 CWE-190 Integer Overflow or Wraparound
[25] 59.9 CWE-759 Use of a One-Way Hash without a Salt
Details of each weakness is given by http://cwe.mitre.org/top25/index.html#Details
It includes Summary , Weakness Prevalence, Consequences, Remediation Cost, Ease of Detection ,Attacker Awareness and Attack Frequency .In addition the following sections describe each software vulnerability in detail- Technical Details ,Code Examples ,Detection Methods ,References,Prevention and Mitigation, Related CWEs and Related Attack Patterns.
Other important software weaknesses are –

[26] CWE-770: Allocation of Resources Without Limits or Throttling
[27] CWE-129: Improper Validation of Array Index
[28] CWE-754: Improper Check for Unusual or Exceptional Conditions
[29] CWE-805: Buffer Access with Incorrect Length Value
[30] CWE-838: Inappropriate Encoding for Output Context
[31] CWE-330: Use of Insufficiently Random Values
[32] CWE-822: Untrusted Pointer Dereference
[33] CWE-362: Concurrent Execution using Shared Resource with Improper Synchronization (‘Race Condition’)
[34] CWE-212: Improper Cross-boundary Removal of Sensitive Data
[35] CWE-681: Incorrect Conversion between Numeric Types
[36] CWE-476: NULL Pointer Dereference
[37] CWE-841: Improper Enforcement of Behavioral Workflow
[38] CWE-772: Missing Release of Resource after Effective Lifetime
[39] CWE-209: Information Exposure Through an Error Message
[40] CWE-825: Expired Pointer Dereference
[41] CWE-456: Missing Initialization
Mitigating Weaknesses
Here is an example of the new matrix for migrations that also list the top 25 errors . This thus shows a way to fix the weaknesses and relative impact on each weakness by the following mitigations.
http://cwe.mitre.org/top25/mitigations.html#MitigationMatrix

Effectiveness ratings include:

  • High: The mitigation has well-known, well-understood strengths and limitations; there is good coverage with respect to variations of the weakness.
  • Moderate: The mitigation will prevent the weakness in multiple forms, but it does not have complete coverage of the weakness.
  • Limited: The mitigation may be useful in limited circumstances, only be applicable to a subset of this weakness type, require extensive training/customization, or give limited visibility.
  • Defense in Depth (DiD): The mitigation may not necessarily prevent the weakness, but it may help to minimize the potential impact when an attacker exploits the weakness.

Within the matrix, the following mitigations are identified:

 

  • M1: Establish and maintain control over all of your inputs.
  • M2: Establish and maintain control over all of your outputs.
  • M3: Lock down your environment.
  • M4: Assume that external components can be subverted, and your code can be read by anyone.
  • M5: Use industry-accepted security features instead of inventing your own.

The following general practices are omitted from the matrix:

  • GP1: Use libraries and frameworks that make it easier to avoid introducing weaknesses.
  • GP2: Integrate security into the entire software development lifecycle.
  • GP3: Use a broad mix of methods to comprehensively find and prevent weaknesses.
  • GP4: Allow locked-down clients to interact with your software.

 

M1 M2 M3 M4 M5 CWE
High DiD Mod CWE-22: Improper Limitation of a Pathname to a Restricted Directory (‘Path Traversal’)
Mod High DiD Ltd CWE-78: Improper Neutralization of Special Elements used in an OS Command (‘OS Command Injection’)
Mod High Ltd CWE-79: Improper Neutralization of Input During Web Page Generation (‘Cross-site Scripting’)
Mod High DiD Ltd CWE-89: Improper Neutralization of Special Elements used in an SQL Command (‘SQL Injection’)
Mod DiD Ltd CWE-120: Buffer Copy without Checking Size of Input (‘Classic Buffer Overflow’)
Mod DiD Ltd CWE-131: Incorrect Calculation of Buffer Size
High DiD Mod CWE-134: Uncontrolled Format String
Mod DiD Ltd CWE-190: Integer Overflow or Wraparound
High CWE-250: Execution with Unnecessary Privileges
Mod Mod CWE-306: Missing Authentication for Critical Function
Mod CWE-307: Improper Restriction of Excessive Authentication Attempts
DiD CWE-311: Missing Encryption of Sensitive Data
High CWE-327: Use of a Broken or Risky Cryptographic Algorithm
Ltd CWE-352: Cross-Site Request Forgery (CSRF)
Mod DiD Mod CWE-434: Unrestricted Upload of File with Dangerous Type
DiD CWE-494: Download of Code Without Integrity Check
Mod Mod Ltd CWE-601: URL Redirection to Untrusted Site (‘Open Redirect’)
Mod High DiD CWE-676: Use of Potentially Dangerous Function
Ltd DiD Mod CWE-732: Incorrect Permission Assignment for Critical Resource
High CWE-759: Use of a One-Way Hash without a Salt
DiD High Mod CWE-798: Use of Hard-coded Credentials
Mod DiD Mod Mod CWE-807: Reliance on Untrusted Inputs in a Security Decision
High High High CWE-829: Inclusion of Functionality from Untrusted Control Sphere
DiD Mod Mod CWE-862: Missing Authorization
DiD Mod CWE-863: Incorrect Authorization