So many R Packages Everywhere, which one do I use? #rstats

Some thoughts on R Packages

  • CRAN is no longer the sole repository for useful R packages. Others include R-Forge, Google Code and, increasingly, GitHub.
  • CRAN lacks the flexibility and social aspect of GitHub.
  • CRAN Task Views are the only subject-wide listing of R packages. The categorization, however, is based more on methods than on use cases or business domains.
  • There are multiple R packages for the same thing. Which one do I use? Only Stack Overflow helps with that. There is no rating and no recommendation system.
  • The suggested-packages feature of an R package needs better, automatic association analysis. Right now it is manual and depends on the package author and maintainer.
  • Quis custodiet ipsos custodes? Who guards the guardians of R packages? In an era of cyber security, we need better transparency on security measures within R packages, especially given the international nature of the project. I am very sure I (or anyone) can create R code to communicate discreetly, especially on Windows.

  • I would rather not install anything on my local machine, and instead read the package directly from CRAN. CRAN was designed in an era of low bandwidth; this needs to be upgraded.
  • Note that I am respectfully refraining from commenting on the atrocious aesthetics of the home website. Many statisticians see no use in making R user friendly. My professors at U Tenn (from which I dropped out after 2 semesters) were horrified when I took courses in graphic design, as I wanted to know more about the A and B that go into the A/B testing of statistical design. Now that I am getting older, I get horrified by the lack of HTML, CSS and jQuery skills among some of the brightest programmers in this project.
  • Please comment below.

 

Running R and RStudio Server on Red Hat Linux RHEL #rstats

Installing R

  • sudo rpm -ivh http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm

(OR sudo rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm )

THEN

  • sudo yum install R

THEN

  • sudo R

(and to paste in Linux Window- just use Shift + Insert)

To Install RStudio (from http://www.rstudio.com/ide/download/server)

32-bit

  •  wget http://download2.rstudio.org/rstudio-server-0.97.320-i686.rpm
  •  sudo yum install --nogpgcheck rstudio-server-0.97.320-i686.rpm

OR 64-bit

  •  wget http://download2.rstudio.org/rstudio-server-0.97.320-x86_64.rpm
  •  sudo yum install --nogpgcheck rstudio-server-0.97.320-x86_64.rpm

Then

  • sudo rstudio-server verify-installation
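
The choice between the 32-bit and 64-bit downloads above can be scripted. The following is only a minimal sketch and not part of the RStudio instructions: the function name rstudio_rpm_for_arch is made up for illustration, and it assumes the version number 0.97.320 quoted above is still the one you want.

```shell
# Map a machine architecture (as printed by `uname -m`) to the RPM file name
# used in this post. Hypothetical helper; 0.97.320 is the version quoted above.
rstudio_rpm_for_arch() {
    case $1 in
        x86_64)
            echo "rstudio-server-0.97.320-x86_64.rpm" ;;
        i386|i486|i586|i686)
            echo "rstudio-server-0.97.320-i686.rpm" ;;
        *)
            echo "unsupported architecture: $1" >&2
            return 1 ;;
    esac
}

# Usage (not run here):
#   rpm_file=$(rstudio_rpm_for_arch "$(uname -m)") && wget "http://download2.rstudio.org/$rpm_file"
```

Keeping the mapping in one function means only one place changes when a newer version is released.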

Changing Firewalls in your RHEL

-Change to Root

  • sudo bash 

-Change directory

  • cd /etc/sysconfig

-Read Iptables ( or firewalls file)

  • vi iptables

(to quit vi, press Escape, then type a colon : and then q)

-Change Iptables to open port 8787

  • /sbin/iptables -A INPUT -p tcp --dport 8787 -j ACCEPT
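
Note that a rule added with iptables -A lives only in the running firewall. On RHEL 6 the usual way to write it to /etc/sysconfig/iptables is sudo service iptables save. Also, if the INPUT chain already ends with a REJECT rule, an appended rule may never be reached, and inserting with -I instead of -A is one common workaround. To check whether a saved rules file already opens a port, something like the sketch below may help. The function name has_accept_rule is hypothetical, and the pattern only covers simple rules of the form shown above.

```shell
# Return success if the rules file ($1) contains an ACCEPT rule
# for the given TCP destination port ($2). Hypothetical helper.
has_accept_rule() {
    rules_file=$1
    port=$2
    # "--" stops grep from treating the pattern as an option.
    grep -Eq -- "-p tcp .*--dport ${port} .*-j ACCEPT" "$rules_file"
}

# Usage (not run here):
#   has_accept_rule /etc/sysconfig/iptables 8787 && echo "port 8787 is open in the saved rules"
```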

Add new user name (here newuser1)

  • sudo useradd newuser1

Change password in new user name

  • sudo passwd newuser1

Now just login to IPADDRESS:8787 with user name and password above

(credit- IBM SmartCloud Support ,http://www.youtube.com/watch?v=woVjq83gJkg&feature=player_embedded, Rstudio help, David Walker http://datamgmt.com/installing-r-and-rstudio-on-redhat-or-centos-linux/, www.google.com ,Michael Grieb)
 

 

Running R GUI on Google Compute

I wanted to run R GUIs (rattle, Rcmdr, Deducer) on my Google Compute instance, but couldn't figure out how to enable X11.

Initially I just tried to enable X11 forwarding in the local ssh (Ubuntu) and remote sshd (GCE), but it still needed something more.

Note that I use gedit to edit files locally (since it is easier) and vi to edit files remotely (because I didn't have a graphical environment there yet). I used the vi help from the link here. Basically, sudo vi filename opens the file; you scroll down and press Insert to write your changes, then hit Escape, then type :wq to save and quit (or :q! to quit WITHOUT saving). Your mouse is quite useless and the arrow keys don't help much in vi, I assure you.

[local]
/etc/ssh/ssh_config or ~/.ssh/config
ForwardX11 yes

restarted local ssh

[remote]
/etc/ssh/sshd_config
X11Forwarding yes

restarted remote sshd

Well, this is how it is done. The following is a copy and paste from the actual discussion:

There are two steps you have to do in order to run X-windows applications on your instance.

1) You have to install some X-windows applications on your instance.  I used the command
sudo apt-get install xterm
which works on Ubuntu.  On Centos, you would use the command
yum install xterm
but I didn’t test that.
2) You have to create an X-windows tunnel through SSH.  You do that with the -X switch to the gcutil ssh command:
 gcutil ssh --ssh_arg -X INSTANCE
When you login to the instance, verify that the tunnel is in place.
rman@test-pd:~$ echo $DISPLAY
localhost:10.0
rman@test-pd:~$
By way of contrast, this is what it looks like if the tunnel didn’t work:
rman@test-pd:~$ echo $DISPLAY
rman@test-pd:~$
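
The DISPLAY check described above can be wrapped in a small function, so you get a clear message before launching rattle or Rcmdr. This is only a sketch; the function name x11_tunnel_ok is made up for illustration.

```shell
# Succeed if DISPLAY is set to a non-empty value, which is what a working
# X11 tunnel through ssh looks like on the remote side. Hypothetical helper.
x11_tunnel_ok() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 tunnel looks active: DISPLAY=$DISPLAY"
    else
        echo "DISPLAY is empty: X11 forwarding is not in place" >&2
        return 1
    fi
}

# Usage on the instance (not run here):
#   x11_tunnel_ok && xterm &
```

A non-empty DISPLAY only shows that the variable is set; starting xterm is still the real test that windows reach your local screen.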

Hat Tip- gce discussion group on google groups  https://groups.google.com/forum/#!forum/gce-discussion  and Jeff Silverman from the GCE team.

Denial of Service Attacks against Hospitals and Emergency Rooms

One of the most frightening possibilities of cyber warfare is the use of remotely deployed or timed intrusion malware to disturb, distort or deny health care services.

Computer Virus Shuts Down Georgia Hospital

A doctor in an emergency room depends on critical electronic information that may save lives if it arrives on time. However, this information can be distorted, which is more dangerous than deleting it.

The electronic systems of a hospital can also be overwhelmed. If Stuxnet-style worms can be built against nuclear centrifuge control systems (like those made by Siemens), then the widespread availability of health care systems means these can be reverse engineered to build particularly vicious cyber worms.

A prime target is the Veterans Administration, which serves veterans of the armed forces, but cyber attacks against electronic health records in general are also a concern.

Consider the following data points-

http://threatpost.com/en_us/blogs/dhs-warns-about-threat-mobile-devices-healthcare-051612

May 16, 2012, 9:03AM

DHS’s National Cybersecurity and Communications Integration Center (NCCIC) issued the unclassified bulletin, “Attack Surface: Healthcare and Public Health Sector” on May 4. In it, DHS warns of a wide range of security risks, including some that could expose patient data to malicious attackers, or make hospital networks and first responders subject to disruptive cyber attack.

http://publicintelligence.net/nccic-medical-device-cyberattacks/

National Cybersecurity and Communications Integration Center Bulletin

The Healthcare and Public Health (HPH) sector is a multi-trillion dollar industry employing over 13 million personnel, including approximately five million first-responders with at least some emergency medical training, three million registered nurses, and more than 800,000 physicians.

(U) A significant portion of products used in patient care and management including diagnosis and treatment are Medical Devices (MD). These MDs are designed to monitor changes to a patient’s health and may be implanted or external. The Food and Drug Administration (FDA) regulates devices from design to sale and some aspects of the relationship between manufacturers and the MDs after sale. However, the FDA cannot regulate MD use or users, which includes how they are linked to or configured within networks. Typically, modern MDs are not designed to be accessed remotely; instead they are intended to be networked at their point of use. However, the flexibility and scalability of wireless networking makes wireless access a convenient option for organizations deploying MDs within their facilities. This robust sector has led the way with medical based technology options for both patient care and data handling.

(U) The expanded use of wireless technology on the enterprise network of medical facilities and the wireless utilization of MDs opens up both new opportunities and new vulnerabilities to patients and medical facilities. Since wireless MDs are now connected to Medical information technology (IT) networks, IT networks are now remotely accessible through the MD. This may be a desirable development, but the communications security of MDs to protect against theft of medical information and malicious intrusion is now becoming a major concern. In addition, many HPH organizations are leveraging mobile technologies to enhance operations. The storage capacity, fast computing speeds, ease of use, and portability render mobile devices an optimal solution.

(U) This Bulletin highlights how the portability and remote connectivity of MDs introduce additional risk into Medical IT networks and failure to implement a robust security program will impact the organization’s ability to protect patients and their medical information from intentional and unintentional loss or damage.

(U) According to Health and Human Services (HHS), a major concern to the Healthcare and Public Health (HPH) Sector is exploitation of potential vulnerabilities of medical devices on Medical IT networks (public, private and domestic). These vulnerabilities may result in possible risks to patient safety and theft or loss of medical information due to the inadequate incorporation of IT products, patient management products and medical devices onto Medical IT Networks. Misconfigured networks or poor security practices may increase the risk of compromised medical devices. HHS states there are four factors which further complicate security resilience within a medical organization.

1. (U) There are legacy medical devices deployed prior to enactment of the Medical Device Law in 1976, that are still in use today. Prior to enactment of the law, the FDA required minimal testing before placing on the market. It is challenging to localize and mitigate threats within this group of legacy equipment.

2. (U) Many newer devices have undergone rigorous FDA testing procedures and come equipped with design features which facilitate their safe incorporation onto Medical IT networks. However, these secure design features may not be implemented during the deployment phase due to complexity of the technology or the lack of knowledge about the capabilities. Because the technology is so new, there may not be an authoritative understanding of how to properly secure it, leaving open the possibilities for exploitation through zero-day vulnerabilities or insecure deployment configurations. In addition, new or robust features, such as custom applications, may also mean an increased amount of third party code development which may create vulnerabilities, if not evaluated properly.

3. (U) In an era of budgetary restraints, healthcare facilities frequently prioritize more traditional programs and operational considerations over network security.

4. (U) Because these medical devices may contain sensitive or privacy information, system owners may be reluctant to allow manufacturers access for upgrades or updates. Failure to install updates lays a foundation for increasingly ineffective threat mitigation as time passes.

(U) Implantable Medical Devices (IMD): Some medical computing devices are designed to be implanted within the body to collect, store, analyze and then act on large amounts of information. These IMDs have incorporated network communications capabilities to increase their usefulness. Legacy implanted medical devices still in use today were manufactured when security was not yet a priority. Some of these devices have older proprietary operating systems that are not vulnerable to common malware and so are not supported by newer antivirus software. However, many are vulnerable to cyber attacks by a malicious actor who can take advantage of routine software update capabilities to gain access and, thereafter, manipulate the implant.

(U) During an August 2011 Black Hat conference, a security researcher demonstrated how an outside actor can shut off or alter the settings of an insulin pump without the user’s knowledge. The demonstration was given to show the audience that the pump’s cyber vulnerabilities could lead to severe consequences. The researcher that provided the demonstration is a diabetic and personally aware of the implications of this activity. The researcher also found that a malicious actor can eavesdrop on a continuous glucose monitor’s (CGM) transmission by using an oscilloscope, but device settings could not be reprogrammed. The researcher acknowledged that he was not able to completely assume remote control or modify the programming of the CGM, but he was able to disrupt and jam the device.

http://www.healthreformwatch.com/category/electronic-medical-records/

February 7, 2012

Since the data breach notification regulations by HHS went into effect in September 2009, 385 incidents affecting 500 or more individuals have been reported to HHS, according to its website.

http://www.darkdaily.com/cyber-attacks-against-internet-enabled-medical-devices-are-new-threat-to-clinical-pathology-laboratories-215#axzz1yPzItOFc

February 16 2011

One high-profile healthcare system that regularly experiences such attacks is the Veterans Administration (VA). For two years, the VA has been fighting a cyber battle against illegal and unwanted intrusions into their medical devices.

 

http://www.mobiledia.com/news/120863.html

 DEC 16, 2011
Malware in a Georgia hospital’s computer system forced it to turn away patients, highlighting the problems and vulnerabilities of computerized systems.

The computer infection started to cause problems at the Gwinnett Medical Center last Wednesday and continued to spread, until the hospital was forced to send all non-emergency admissions to other hospitals.

More doctors and nurses than ever are using mobile devices in healthcare, and hospitals are making patient records computerized for easier, convenient access over piles of paperwork.

http://www.doctorsofusc.com/uscdocs/locations/lac-usc-medical-center

As one of the busiest public hospitals in the western United States, LAC+USC Medical Center records nearly 39,000 inpatient discharges, 150,000 emergency department visits, and 1 million ambulatory care visits each year.

http://www.healthreformwatch.com/category/electronic-medical-records/

If one jumbo jet crashed in the US each day for a week, we’d expect the FAA to shut down the industry until the problem was figured out. But in our health care system, roughly 250 people die each day due to preventable error

http://www.pcworld.com/article/142926/are_healthcare_organizations_under_cyberattack.html

Feb 28, 2008

“There is definitely an uptick in attacks,” says Dr. John Halamka, CIO at both Beth Israel Deaconess Medical Center and Harvard Medical School in the Boston area. “Privacy is the foundation of everything we do. We don’t want to be the TJX of healthcare.” TJX is the Framingham, Mass-based retailer which last year disclosed a massive data breach involving customer records.

Dr. Halamka, who this week announced a project in electronic health records as an online service to the 300 doctors in the Beth Israel Deaconess Physicians Organization,

Using Two Operating Systems for RATTLE, #Rstats Data Mining GUI

Using a virtual machine is slightly better than using a dual-boot system. That is because you can keep the specialized operating system (usually Linux) within the main operating system (usually Windows), switch between the two operating systems with a simple command, and use the advantages of both.

Also, you can create project-specific virtual disks for enhanced security.

In my (limited) Mac experience, the operating systems compare as follows:

1) Mac- A robust and aesthetically designed OS, but the higher price and hardware lock-in remain a disadvantage. Also, many stats and analytical software packages just won't work on the Mac.

2) Windows- It is cheaper than Mac and easier to use than Linux. It also has the most compatibility with applications (usually when not crashing).

3) Linux- The lightest and most customizable software in the OS class, free to use, with many lite versions for newbies. Not compatible with mainstream corporate IT infrastructure as of 2011.

I personally use VMware Player to create the virtual disk (it is much more convenient than the wubi.exe method), from http://www.vmware.com/support/product-support/player/ (downloadable from http://downloads.vmware.com/d/info/desktop_downloads/vmware_player/3_0).

That enables me to use Ubuntu as the alternative OS while keeping my Windows 7 for some Windows-specific applications. For software like Rattle, the R data mining GUI, it helps to use two operating systems, in view of the difficulties with GTK+.

Installing Rattle on Windows 7 is a major pain thanks to backward compatibility and version issues with GTK, but it installs on Ubuntu like a breeze, and it is very convenient to switch between the two operating systems.

Download Rattle from http://rattle.togaware.com/ and test it on the two-OS arrangement to see for yourself.

 

 

 

 

 

Top 25 Errors in Programming that lead to hacker attacks

I am elaborating on an earlier article at https://decisionstats.com/top-25-most-dangerous-software-errors/ based on my continued research into cyber conflict and strategy. My inputs are in italics – the rest is a condensed article for further thought.

This is thus a very useful initiative for the world to follow and upgrade their cyber security.

It is in accordance with the US policy to secure its cyber infrastructure (http://www.whitehouse.gov/the-press-office/remarks-president-securing-our-nations-cyber-infrastructure), and countries like India, as well as Europe and other nations, could do well to at least benchmark their own security practices in software and digital infrastructure against it. There seems to be much better technical coordination between rogue hackers than between patriotic hackers imho 😉


The Department of Homeland Security of the United States of America has just launched a list of the top 25 errors in programming or creating software that increase vulnerability to hacking attacks. The list, which is available at http://cwe.mitre.org/top25/index.html, lays out a methodology for measuring vulnerability called the Common Weakness Scoring System (CWSS) and uses that score to rank the various errors, along with suggestions to eliminate these weaknesses or errors.
Measuring Weaknesses

The importance of a weakness (that arises due to software bugs) may vary depending on business usage or project implementation, the technologies, operating systems and computing environments in use, and the risk or threat perception. The Common Weakness Scoring System (CWSS) provides a mechanism for scoring weaknesses and a framework for prioritizing security errors (“weaknesses”) that are discovered in software applications.
Identifying Weaknesses
For example, the number 1 weakness is
[1] CWE-89: Improper Neutralization of Special Elements used in an SQL Command (‘SQL Injection’).
The full ranked list of weaknesses is

RANK SCORE ID NAME
[1] 93.8 CWE-89 Improper Neutralization of Special Elements used in an SQL Command (‘SQL Injection’)
[2] 83.3 CWE-78 Improper Neutralization of Special Elements used in an OS Command (‘OS Command Injection’)
[3] 79.0 CWE-120 Buffer Copy without Checking Size of Input (‘Classic Buffer Overflow’)
[4] 77.7 CWE-79 Improper Neutralization of Input During Web Page Generation (‘Cross-site Scripting’)
[5] 76.9 CWE-306 Missing Authentication for Critical Function
[6] 76.8 CWE-862 Missing Authorization
[7] 75.0 CWE-798 Use of Hard-coded Credentials
[8] 75.0 CWE-311 Missing Encryption of Sensitive Data
[9] 74.0 CWE-434 Unrestricted Upload of File with Dangerous Type
[10] 73.8 CWE-807 Reliance on Untrusted Inputs in a Security Decision
[11] 73.1 CWE-250 Execution with Unnecessary Privileges
[12] 70.1 CWE-352 Cross-Site Request Forgery (CSRF)
[13] 69.3 CWE-22 Improper Limitation of a Pathname to a Restricted Directory (‘Path Traversal’)
[14] 68.5 CWE-494 Download of Code Without Integrity Check
[15] 67.8 CWE-863 Incorrect Authorization
[16] 66.0 CWE-829 Inclusion of Functionality from Untrusted Control Sphere
[17] 65.5 CWE-732 Incorrect Permission Assignment for Critical Resource
[18] 64.6 CWE-676 Use of Potentially Dangerous Function
[19] 64.1 CWE-327 Use of a Broken or Risky Cryptographic Algorithm
[20] 62.4 CWE-131 Incorrect Calculation of Buffer Size
[21] 61.5 CWE-307 Improper Restriction of Excessive Authentication Attempts
[22] 61.1 CWE-601 URL Redirection to Untrusted Site (‘Open Redirect’)
[23] 61.0 CWE-134 Uncontrolled Format String
[24] 60.3 CWE-190 Integer Overflow or Wraparound
[25] 59.9 CWE-759 Use of a One-Way Hash without a Salt
Details of each weakness are given at http://cwe.mitre.org/top25/index.html#Details
Each entry includes Summary, Weakness Prevalence, Consequences, Remediation Cost, Ease of Detection, Attacker Awareness and Attack Frequency. In addition, the following sections describe each software vulnerability in detail: Technical Details, Code Examples, Detection Methods, References, Prevention and Mitigation, Related CWEs and Related Attack Patterns.
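
To make one of these weaknesses concrete, here is a small sketch of CWE-78 (OS Command Injection) as it tends to show up in shell scripts. It is my own illustration and not taken from the CWE site: the function names is_safe_name and count_lines_in are hypothetical, and the allowlist (letters, digits, dot, dash, underscore) is just one possible policy.

```shell
# UNSAFE pattern (shown only as a comment): building a command string from input.
#   eval "wc -l $name"      # name='x; rm -rf ~' would run the second command

# Accept only names made of letters, digits, dot, dash and underscore,
# and reject empty names or names starting with a dash (option injection).
is_safe_name() {
    case $1 in
        ''|-*|*[!A-Za-z0-9._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Count lines in a file in the current directory, refusing unsafe names.
# The name is passed as a single quoted word and never re-parsed by the shell.
count_lines_in() {
    is_safe_name "$1" || { echo "rejected unsafe file name" >&2; return 1; }
    wc -l < "./$1"
}
```

Because a slash is not in the allowlist, the same check also refuses names like ../../etc/passwd, which overlaps with CWE-22 (Path Traversal).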
Other important software weaknesses are –

[26] CWE-770: Allocation of Resources Without Limits or Throttling
[27] CWE-129: Improper Validation of Array Index
[28] CWE-754: Improper Check for Unusual or Exceptional Conditions
[29] CWE-805: Buffer Access with Incorrect Length Value
[30] CWE-838: Inappropriate Encoding for Output Context
[31] CWE-330: Use of Insufficiently Random Values
[32] CWE-822: Untrusted Pointer Dereference
[33] CWE-362: Concurrent Execution using Shared Resource with Improper Synchronization (‘Race Condition’)
[34] CWE-212: Improper Cross-boundary Removal of Sensitive Data
[35] CWE-681: Incorrect Conversion between Numeric Types
[36] CWE-476: NULL Pointer Dereference
[37] CWE-841: Improper Enforcement of Behavioral Workflow
[38] CWE-772: Missing Release of Resource after Effective Lifetime
[39] CWE-209: Information Exposure Through an Error Message
[40] CWE-825: Expired Pointer Dereference
[41] CWE-456: Missing Initialization
Mitigating Weaknesses
Here is an example of the new matrix of mitigations, which also lists the top 25 errors. It shows a way to fix the weaknesses and the relative impact of each of the following mitigations on each weakness.
http://cwe.mitre.org/top25/mitigations.html#MitigationMatrix

Effectiveness ratings include:

  • High: The mitigation has well-known, well-understood strengths and limitations; there is good coverage with respect to variations of the weakness.
  • Moderate: The mitigation will prevent the weakness in multiple forms, but it does not have complete coverage of the weakness.
  • Limited: The mitigation may be useful in limited circumstances, only be applicable to a subset of this weakness type, require extensive training/customization, or give limited visibility.
  • Defense in Depth (DiD): The mitigation may not necessarily prevent the weakness, but it may help to minimize the potential impact when an attacker exploits the weakness.

Within the matrix, the following mitigations are identified:

 

  • M1: Establish and maintain control over all of your inputs.
  • M2: Establish and maintain control over all of your outputs.
  • M3: Lock down your environment.
  • M4: Assume that external components can be subverted, and your code can be read by anyone.
  • M5: Use industry-accepted security features instead of inventing your own.
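
As an illustration of M4 and M5 applied to CWE-494 (Download of Code Without Integrity Check), here is a minimal sketch of verifying a downloaded file against a published SHA-256 checksum before installing it. The function names sha256_of and verify_sha256 are hypothetical, the sketch assumes either sha256sum or shasum is installed, and the expected checksum must come from a source you trust.

```shell
# Print the SHA-256 hex digest of a file, using whichever tool is available.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        shasum -a 256 "$1" | awk '{print $1}'
    fi
}

# Succeed only if the file ($1) matches the expected digest ($2).
verify_sha256() {
    [ "$(sha256_of "$1")" = "$2" ]
}

# Usage (not run here; EXPECTED is a placeholder for a published checksum):
#   verify_sha256 package.rpm "$EXPECTED" && sudo yum install package.rpm
```

This relies on a standard hash tool rather than a home-made check, which is the point of M5.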

The following general practices are omitted from the matrix:

  • GP1: Use libraries and frameworks that make it easier to avoid introducing weaknesses.
  • GP2: Integrate security into the entire software development lifecycle.
  • GP3: Use a broad mix of methods to comprehensively find and prevent weaknesses.
  • GP4: Allow locked-down clients to interact with your software.

 

M1 M2 M3 M4 M5 CWE
High DiD Mod CWE-22: Improper Limitation of a Pathname to a Restricted Directory (‘Path Traversal’)
Mod High DiD Ltd CWE-78: Improper Neutralization of Special Elements used in an OS Command (‘OS Command Injection’)
Mod High Ltd CWE-79: Improper Neutralization of Input During Web Page Generation (‘Cross-site Scripting’)
Mod High DiD Ltd CWE-89: Improper Neutralization of Special Elements used in an SQL Command (‘SQL Injection’)
Mod DiD Ltd CWE-120: Buffer Copy without Checking Size of Input (‘Classic Buffer Overflow’)
Mod DiD Ltd CWE-131: Incorrect Calculation of Buffer Size
High DiD Mod CWE-134: Uncontrolled Format String
Mod DiD Ltd CWE-190: Integer Overflow or Wraparound
High CWE-250: Execution with Unnecessary Privileges
Mod Mod CWE-306: Missing Authentication for Critical Function
Mod CWE-307: Improper Restriction of Excessive Authentication Attempts
DiD CWE-311: Missing Encryption of Sensitive Data
High CWE-327: Use of a Broken or Risky Cryptographic Algorithm
Ltd CWE-352: Cross-Site Request Forgery (CSRF)
Mod DiD Mod CWE-434: Unrestricted Upload of File with Dangerous Type
DiD CWE-494: Download of Code Without Integrity Check
Mod Mod Ltd CWE-601: URL Redirection to Untrusted Site (‘Open Redirect’)
Mod High DiD CWE-676: Use of Potentially Dangerous Function
Ltd DiD Mod CWE-732: Incorrect Permission Assignment for Critical Resource
High CWE-759: Use of a One-Way Hash without a Salt
DiD High Mod CWE-798: Use of Hard-coded Credentials
Mod DiD Mod Mod CWE-807: Reliance on Untrusted Inputs in a Security Decision
High High High CWE-829: Inclusion of Functionality from Untrusted Control Sphere
DiD Mod Mod CWE-862: Missing Authorization
DiD Mod CWE-863: Incorrect Authorization

Revolution releases R Windows for Academics for free


Based on the official email from them, God bless the merry coders at Revo-

Revolution Analytics has just released Revolution R Enterprise 4.3 for 32-bit and 64-bit Windows, a significant step forward in enterprise data analytics.  It features an updated RevoScaleR package for scalable, fast (multicore), and extensible data analysis with R. Revolution R Enterprise 4.3 for Windows also provides R 2.12.2, and includes an enhanced R Productivity Environment (RPE), a full-featured integrated development environment with visual debugging capabilities. Also available is an updated Windows release of our deployment server solution, RevoDeployR 1.2, designed to help you deliver R analytics via the Web.

As a registered user of the Academic version of Revolution R Enterprise for Windows, you can take advantage of these improvements by downloading and installing Revolution R Enterprise 4.3 today. You can install Revolution R Enterprise 4.3 side-by-side with your existing Revolution R Enterprise installations; there is no need to uninstall previous versions.