Load Multiple CSV Files in PySpark

from pyspark.sql import SparkSession

# Create (or reuse) a local Spark session
spark = SparkSession.builder \
    .master("local") \
    .appName("Data Exploration") \
    .getOrCreate()

# Load every CSV file in the folder into one Spark DataFrame;
# DROPMALFORMED skips rows that cannot be parsed
data2 = spark.read.format("csv") \
    .option("header", "true") \
    .option("mode", "DROPMALFORMED") \
    .load("/home/Desktop/input/*.csv")

Convert Many Columns to Float in PySpark

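from pyspark.sql.functions import col

# Cast every column of data7 to float, one column at a time
for col_name in data7.columns:
    data7 = data7.withColumn(col_name, col(col_name).cast("float"))

# Check that every column is now of type float
data7.printSchema()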

How to run PySpark through Jupyter notebook via Docker

Install Docker first, then run:

docker run -it -p 8888:8888 jupyter/pyspark-notebook

Then open the URL (with its login token) that the container prints in the terminal.

 

Source: https://levelup.gitconnected.com/using-docker-and-pyspark-134cd4cab867

After Life TV Series Review

A well-written story tells itself better than any acting, directing or choreography can. After Life is a sweet, tangy, pretty and yet honest tale of grieving by the brilliant Ricky Gervais.

I hope it gets a second season, but it looks tough.

Captain Marvel: The Quick Review

In a world where the internet is ruled by funny cats, the cat is the sole redeeming feature. Nearly all the top actors are wasted in a movie that makes Wonder Woman seem like Casablanca. The special effects are cartoonish and the direction is a setback for the directors. Almost all your money's worth comes from the two post-credits scenes, one of which has a 🐱.