I have huge text files, each larger than 20 GB. I need to loop through every record and, when a record meets certain criteria, overwrite part or all of that line. I've coded this in Python using the in_place module. It works well for files smaller than 1 GB, but for 20 GB files it runs for more than 4-5 hours. Can anyone suggest how I can improve performance here? I've been racking my brain over this.
Here's my code:
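import in_place

with in_place.InPlace('hugefile.txt') as f:
    for line in f:
        if condition:  # placeholder for my actual criteria
            line = overwritedata  # placeholder for the replacement text
        # Write every line back, changed or not; a line that is skipped
        # (as with the else/continue I had before) is dropped from the file.
        f.write(line)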
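For comparison, here is a minimal sketch of the same loop using only the standard library: it streams the file in binary mode through large buffers into a temporary file and then swaps that file into place. I have not measured whether this is actually faster on a 20 GB file. The names `rewrite_lines`, `should_replace`, `replacement` and `buffer_size` are hypothetical, made up for this illustration, and the 16 MB buffer is a guess rather than a tuned value.

```python
import os
import tempfile


def rewrite_lines(path, should_replace, replacement, buffer_size=16 * 1024 * 1024):
    """Stream `path` line by line into a temp file, then swap it into place.

    `should_replace` and `replacement` are hypothetical callables that take
    and return bytes; they stand in for the real criteria and overwrite data.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final swap stays on
    # the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb", buffering=buffer_size) as dst, \
                open(path, "rb", buffering=buffer_size) as src:
            for line in src:  # binary mode: no decoding, lines end in b"\n"
                if should_replace(line):
                    line = replacement(line)
                dst.write(line)  # every line is written, changed or not
        os.replace(tmp_path, path)  # swap the rewritten file into place
    except BaseException:
        # Clean up the partial temp file and leave the original untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A call would look like `rewrite_lines('hugefile.txt', lambda l: l.startswith(b'X'), lambda l: b'Y' + l[1:])`, where both lambdas are placeholders. Working on bytes avoids decoding and re-encoding every line, which only works if the criteria can be checked on raw bytes.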