Shuffling an out-of-core large data file in Python

  1. Set a buffer_size so that the large file can be split into chunks that each fit in memory.
  2. Read the next buffer_size portion of the large file sequentially and shuffle that chunk in memory.
  3. Write the shuffled chunk to the output file sequentially.
  4. Repeat steps 2 and 3 until the whole file has been processed.
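
The steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes the file is line-oriented text and that buffer_size counts lines per chunk. The function name shuffle_large_file, the seed parameter, and the file paths are hypothetical names introduced for this example.

```python
import random
from itertools import islice


def shuffle_large_file(input_path, output_path, buffer_size=100_000, seed=None):
    """Shuffle a text file chunk by chunk.

    Assumption: one record per line, and buffer_size is the number of
    lines that can comfortably be held in memory at once.
    """
    rng = random.Random(seed)  # seed is optional, for reproducible runs
    with open(input_path, "r", encoding="utf-8") as src, \
         open(output_path, "w", encoding="utf-8") as dst:
        while True:
            # Step 2: read the next chunk of at most buffer_size lines.
            chunk = list(islice(src, buffer_size))
            if not chunk:
                break  # Step 4: stop once the input is exhausted.
            # A final line without a trailing newline would otherwise be
            # glued to its neighbour after shuffling.
            if not chunk[-1].endswith("\n"):
                chunk[-1] += "\n"
            rng.shuffle(chunk)  # shuffle this chunk in memory
            # Step 3: append the shuffled chunk to the output file.
            dst.writelines(chunk)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    shuffle_large_file("large_input.txt", "shuffled_output.txt", buffer_size=100_000)
```

Note that with this approach rows are only reordered within their own chunk; a row never moves across a chunk boundary, so a larger buffer_size gives a more thorough shuffle.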
