I have a large dataset similar to the example below:
ID | CODE | STUDY | AMOUNT | COL_NAME |
---|---|---|---|---|
111 | 5611 | ABCD | 56.17 | ID |
211 | 5411 | GFED | 451.1 | AMOUNT |
311 | 3212 | YTRA | 687.3 | STUDY |
For each row, I want to take the value of the column whose name is stored in COL_NAME and put it in a new column (COL_VALUE) in the same dataframe, as below:
ID | CODE | STUDY | AMOUNT | COL_NAME | COL_VALUE |
---|---|---|---|---|---|
111 | 5611 | ABCD | 56.17 | ID | 111 |
211 | 5411 | GFED | 451.1 | AMOUNT | 451.1 |
311 | 3212 | YTRA | 687.3 | STUDY | YTRA |
I am currently using a loop with .collect() to populate the values, but it takes a lot of time. I would like to know more efficient ways to do the same thing that work for a large dataset.