Timestamp conversion changes data format

When I cast one of the columns in a DataFrame to the timestamp data type:

from pyspark.sql.functions import unix_timestamp

data_frame = self.spark.sparkContext.parallelize([
    ('Joe', '1995-08-01T00:00:01.000+0000'),
    ('Kent', '1995-08-01T00:00:01.000+0000'),
    ('Tim', '1995-08-01T00:00:01.000+0000')
]).toDF(['firstName', 'dob'])

# Parse the string into epoch seconds, then cast those seconds to a timestamp
format = "yyyy-MM-dd'T'HH:mm:ss.SSSZ"
data_frame = data_frame.withColumn('dob', unix_timestamp('dob', format).cast('timestamp'))

I get the result as

(firstName='Joe', dob=datetime.datetime(1995, 8, 1, 2, 0, 1))

However, I would like to keep the value exactly as it is and change only the data type, something like this:

(firstName='Joe', dob=1995-08-01T00:00:01.000+0000)

How can I convert the column this way?
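
A note on the two-hour shift in the result above. One plausible reading, offered as an assumption rather than a confirmed diagnosis, is that the stored instant is unchanged and is only being displayed as a naive wall-clock time in the local time zone of the machine. The sketch below uses only the Python standard library, not Spark, and assumes a local zone of UTC+2 (the name `assumed_local` is hypothetical and chosen for illustration):

```python
from datetime import datetime, timedelta, timezone

# The string from the question, carrying an explicit +0000 (UTC) offset.
raw = '1995-08-01T00:00:01.000+0000'

# Parse into a timezone-aware datetime; %z accepts the +0000 offset form.
parsed = datetime.strptime(raw, '%Y-%m-%dT%H:%M:%S.%f%z')

# Assumption: the local zone of the machine was UTC+2 on that date.
assumed_local = timezone(timedelta(hours=2))

# Same instant, rendered as a naive wall-clock time in the assumed zone.
local_naive = parsed.astimezone(assumed_local).replace(tzinfo=None)

print(repr(local_naive))  # datetime.datetime(1995, 8, 1, 2, 0, 1)
```

If that reading is right, nothing was lost in the cast: 00:00:01 at +0000 and 02:00:01 at +0200 are the same moment, and only the rendering differs.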
