Read multiple files, but figure out which file I am currently on

I want to use this syntax:

sc.textFile(','.join(files))

However, I also need to match each line to the text file it came from, so that I can save both to a database later on. Is there a way to attach the file name to the RDD, or otherwise know which file is currently being read? In the end I want a DataFrame with two string columns: the line contents and the corresponding file name.
