I want to use this syntax:
sc.textFile(','.join(files))
However, I also need to match each line to the text file it came from so I can save it to a database later on. Is there a way to attach the file name to the RDD, or otherwise know which file is currently being read? In the end I want a DataFrame with two string columns: the line contents and the corresponding file name.