Donate. I desperately need donations to survive due to my health

Get paid by answering surveys Click here

Click here to donate

Remote/Work from Home jobs

EXECUTOR and JVM creation by Apache SPARK

We have 25 Executors running, each with 5 cores(cpus).

our data is 10 Million records

We have flatMap function which runs for every record.

Our understanding from documentation of SPARK is that each executor represents one JVM, is this correct?

However, from our observation SPARK is creating 10 Million JVMs for processing of flatMap

Kindly clarify our doubt.

EDIT

We are loading 3rd party Context in static block.

our assumption was that it will be loaded once for every JVM.

we run AWR report and found that context loading queries are running for 10 Million times.

Hence, we are assuming that SPARK is creating JVM for each record in flatmap

This flatmap is executed when we call count operation on output dataframe of flatmap.

Comments