MLflow on Azure Databricks

On Azure Databricks you can create experiments using MLflow.

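import mlflow
from datetime import datetime

notebook_path = '/Users/Ajay/Folder'
mlflow.set_experiment(notebook_path + '_experiments')

# curr_ts is not defined in the original snippet; a timestamp string is assumed here
curr_ts = datetime.now().strftime('%Y%m%d_%H%M%S')
with mlflow.start_run(run_name='ExperimentRun' + curr_ts):
    pass  # log parameters, metrics, and models for this run here
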

Converting Spark DataFrame to Pandas DataFrame

%python
df = spark.sql("select * from name_csv")

pandas_df = df.toPandas()

Creating SQL Table using Spark

acc_1 = spark.sql("create table test_spark as select columns, column, columnc from table where to_date(ac_opn_dt) < '2012-07-01'")

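from pyspark.sql.types import StructType

# Given a pandas DataFrame, return a Spark DataFrame.
# define_structure(column, dtype) is not shown in this post; it is expected to
# return a StructField for the given column name and pandas dtype.
def pandas_to_spark(pandas_df):
    columns = list(pandas_df.columns)
    types = list(pandas_df.dtypes)
    struct_list = []
    for column, typo in zip(columns, types):
        struct_list.append(define_structure(column, typo))
    p_schema = StructType(struct_list)
    return sqlContext.createDataFrame(pandas_df, p_schema)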
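
pandas_to_spark calls define_structure, which is not defined in this post. As one possible alternative (a sketch, not the original helper), the schema can be written as a DDL string, which createDataFrame also accepts in place of a StructType. The dtype mapping, the ddl_schema name, and the sample column names other than ac_opn_dt are assumptions made for illustration.

```python
# Assumed mapping from pandas dtype names to Spark SQL type names;
# any dtype not listed here falls back to string.
PANDAS_TO_SPARK_SQL = {
    "int64": "bigint",
    "int32": "int",
    "float64": "double",
    "float32": "float",
    "bool": "boolean",
    "datetime64[ns]": "timestamp",
}


def ddl_schema(dtypes):
    """Build a DDL schema string from {column name: pandas dtype name}."""
    fields = []
    for column, dtype_name in dtypes.items():
        spark_type = PANDAS_TO_SPARK_SQL.get(dtype_name, "string")
        # Backtick-quote the name so spaces or punctuation survive parsing.
        quoted = "`" + column.replace("`", "``") + "`"
        fields.append(quoted + " " + spark_type)
    return ", ".join(fields)


# Hypothetical columns, standing in for pandas_df.dtypes
schema = ddl_schema({
    "ac_id": "int64",
    "balance": "float64",
    "ac_opn_dt": "datetime64[ns]",
    "branch": "object",
})
print(schema)
```

In a notebook, the dictionary could be built with {c: str(t) for c, t in pandas_df.dtypes.items()} and the resulting string passed as the schema argument of spark.createDataFrame. Columns of dtype object that hold anything other than strings would likely need converting first.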

2020 Data Science Survey

News from Rexer Analytics

The 2020 Data Science Survey is live!!  
Please take the survey.  And let me know what you think of it.
— Access Code:  ZVWX3
If you have time, please also help spread the word about the survey.

  • Tell the analytic people in your network about the survey
    • Forward this email 
    • Anyone doing anything in the field of analytics is welcome to take the survey
  • Consider posting something about it in online analytic communities or social media
Please use and share these links:
— Access Code:  ZVWX3
           It is OK to share this Access Code with others:  It can be used by multiple people
— Our main Data Science Survey page also has more survey information & FREE downloads of the 2007-2017 Survey Summary Reports