Working with Cloudera’s VM and Python and R

  1. Download the Cloudera QuickStart VM from https://www.cloudera.com/downloads/quickstart_vms/5-12.html
  2. Boot it up in VMware Workstation Player (download from https://my.vmware.com/en/web/vmware/free#desktop_end_user_computing/vmware_workstation_player/12_0), following the instructions at https://community.cloudera.com/t5/Hadoop-101-Training-Quickstart/How-to-setup-Cloudera-Quickstart-Virtual-Machine/ta-p/35056
    1. Select File > Open.
    2. In the file selection window, find and select the virtual machine package or configuration file for the virtual machine to open. Virtual machine package files have the extension .vmwarevm. Virtual machine configuration files have the extension .vmx. You can view a file’s extension by selecting File > Get info.
    3. Click the Open button. VMware opens the virtual machine and powers it on.

  3. Download PuTTY from (seriously dude) https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html
  4. Log in to the Cloudera VM using PuTTY. IP address for connecting: 192.168.72.128
    1. Username and password – cloudera
  5. Install R using – sudo yum install R (see https://cran.r-project.org/bin/linux/redhat/README)
  6. For Python, see the latest version at http://repo.continuum.io/archive/
  7. cd /opt
  8. sudo wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh
  9. bash Anaconda3-5.0.1-Linux-x86_64.sh
  10. Accept all conditions!
  11. Type jupyter notebook to launch Python in a notebook
  12. For RStudio
    1. See download link https://download1.rstudio.org/rstudio-1.1.383-x86_64.rpm from https://www.rstudio.com/products/rstudio/download/#download
    2.  sudo wget https://download1.rstudio.org/rstudio-1.1.383-x86_64.rpm
    3. sudo bash
    4. yum install rstudio-1.1.383-x86_64.rpm
  13. For RStudio Server (a better alternative, since RStudio didn't work above)
    1. Instructions from https://www.rstudio.com/products/rstudio/download-server/
    2. $ wget https://download2.rstudio.org/rstudio-server-rhel-1.1.383-x86_64.rpm
      $ sudo yum install --nogpgcheck rstudio-server-rhel-1.1.383-x86_64.rpm
    3. Open http://localhost:8787/ in a browser inside the VM and log in with cloudera as both the username and the password
    4. Install packages as needed 🙂
    5. To check RStudio sessions, type this on the command line:

sudo rstudio-server active-sessions 
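
The install steps above could be gathered into one script, roughly as sketched below. The function names (anaconda_url, rstudio_server_url, run, install_all) and the DRY_RUN switch are made up for illustration and are not part of any tool; the versions and URLs are the ones used in this post. By default the script only prints the commands, and it assumes it is run inside the VM as the cloudera user. The Anaconda installer is still interactive, so you still have to accept its prompts by hand.

```shell
#!/usr/bin/env bash
# Sketch only: collects the install commands from the steps above.
# Helper names and DRY_RUN are illustrative assumptions, not from any tool.

ANACONDA_VERSION="5.0.1"
RSTUDIO_VERSION="1.1.383"

# Build the download URLs used in the steps above from a version string.
anaconda_url() {
  echo "https://repo.continuum.io/archive/Anaconda3-$1-Linux-x86_64.sh"
}

rstudio_server_url() {
  echo "https://download2.rstudio.org/rstudio-server-rhel-$1-x86_64.rpm"
}

# Print the command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_all() {
  local anaconda_installer="Anaconda3-${ANACONDA_VERSION}-Linux-x86_64.sh"
  local rstudio_rpm="rstudio-server-rhel-${RSTUDIO_VERSION}-x86_64.rpm"

  run sudo yum install R
  run cd /opt
  # sudo for the downloads, on the assumption that /opt is owned by root
  run sudo wget "$(anaconda_url "$ANACONDA_VERSION")"
  run bash "$anaconda_installer"
  run sudo wget "$(rstudio_server_url "$RSTUDIO_VERSION")"
  run sudo yum install --nogpgcheck "$rstudio_rpm"
  run sudo rstudio-server active-sessions
}

install_all
```

Run it once as is to review the printed commands, then run it again with DRY_RUN=0 set in the environment to actually execute them. RStudio desktop is left out because it did not work in the steps above.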


Hat tip – http://linuxpitstop.com/install-anaconda-miniconda-conda-on-ubuntu-centos-linux/

https://www.vultr.com/docs/how-to-install-rstudio-server-on-centos-7

https://support.rstudio.com/hc/en-us/articles/200532327-Managing-the-Server

TO BE CONTINUED

Author: Ajay Ohri

http://about.me/ajayohri
