I have a sample PySpark DataFrame (shown below). I want to convert it to another type of DataFrame so that I can apply pandas and NLTK methods to it.
ID  Text     Number
A   hello    456
C   goodbye  862
F   yes      111
G   no       323
I tried the code below:
def function_1(input_df):
    df = input_df
    df = df.toPandas()
    return df
I get the following error when I run the code:
You returned a pandas.DataFrame in a pyspark workbook. You must return a pyspark.sql.DataFrame in this workbook.
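The error message suggests that the workbook only checks the type of the returned object, so one possible workaround is to do the pandas work inside the function and convert back to a Spark DataFrame before returning. The following is a minimal sketch, not a confirmed solution for this environment. The helper name process_with_pandas and the TokenCount column are hypothetical, the whitespace split is only a stand-in for whatever NLTK call is needed, and it assumes SparkSession.builder.getOrCreate() returns the session the workbook is already using.

```python
import pandas as pd


def process_with_pandas(pdf: pd.DataFrame) -> pd.DataFrame:
    """Pandas-side step (hypothetical helper).

    The whitespace split is a placeholder for an NLTK tokenizer.
    """
    out = pdf.copy()
    out["TokenCount"] = out["Text"].str.split().str.len()
    return out


def function_1(input_df):
    """Accept a pyspark.sql.DataFrame and return a pyspark.sql.DataFrame."""
    # Imported here so the pandas-only helper above can be tried without Spark.
    from pyspark.sql import SparkSession

    # Assumption: this picks up the session already running in the workbook.
    spark = SparkSession.builder.getOrCreate()

    pdf = input_df.toPandas()          # collects every row onto the driver
    pdf = process_with_pandas(pdf)     # pandas / NLTK work happens here
    return spark.createDataFrame(pdf)  # convert back before returning


# Pandas-only check of the helper, using the sample rows from the question.
sample = pd.DataFrame(
    {
        "ID": ["A", "C", "F", "G"],
        "Text": ["hello", "goodbye", "yes", "no"],
        "Number": [456, 862, 111, 323],
    }
)
result = process_with_pandas(sample)
```

Note that toPandas() pulls the whole dataset into driver memory, so this approach is only reasonable when the data is small enough to fit there.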