00:00 Omit rows. When you test an algorithm for data processing or machine learning, you often don’t need the entire dataset. It’s convenient to load only a subset of the data to speed up the process.
The pandas read_csv() function has optional parameters for selecting which rows to load:
skiprows: either the number of rows to skip at the beginning of the file, if it's an integer, or the zero-based indices of the rows to skip, if it's a list-like object.
skipfooter: the number of rows to skip at the end of the file.
nrows: the number of rows to read.
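A minimal sketch of the three parameters may help here. The in-memory CSV text, its column names, and its values are hypothetical stand-ins for a real file, and the variable names are chosen only for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file: one header row plus six data rows.
CSV_TEXT = """COUNTRY,POP
CHN,1398.72
IND,1351.16
USA,329.74
IDN,268.07
BRA,210.32
PAK,205.71
"""

# nrows: read only the first three data rows after the header.
first_three = pd.read_csv(io.StringIO(CSV_TEXT), nrows=3)

# skipfooter: drop the last two rows. The C parser does not support
# skipfooter, so the Python engine is requested explicitly.
no_footer = pd.read_csv(io.StringIO(CSV_TEXT), skipfooter=2, engine="python")

# skiprows as a list: skip the lines with zero-based indices 1 and 3.
# Index 0 is the header line, so it is kept.
skip_some = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=[1, 3])

# skiprows as an integer: skip the first two lines of the file. This
# includes the header, so column names are supplied by hand.
skip_start = pd.read_csv(
    io.StringIO(CSV_TEXT), skiprows=2, header=None, names=["COUNTRY", "POP"]
)

print(first_three)
print(no_footer)
print(skip_some)
print(skip_start)
```

Note that an integer skiprows counts lines from the very top of the file, so it also skips the header unless you account for it.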
The instances of the Python built-in class range behave like sequences. The first row of the file data.csv is the header row. It has the index 0, so pandas loads it in. The second row, with index 1, corresponds to the label CHN, and pandas skips it. The third row, with index 2 and the label IND, is loaded, and so on.
If you want to choose rows randomly, then skiprows could be a list or NumPy array with pseudorandom numbers obtained either with pure Python or with NumPy.
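The two approaches described above can be sketched as follows. This is an illustration under stated assumptions, not the lesson's exact code: the CSV text is a hypothetical stand-in for data.csv, the range bounds are chosen to fit its six data rows, and the random selection uses pure Python with a fixed seed so the run is repeatable.

```python
import io
import random

import pandas as pd

# Hypothetical stand-in for data.csv: one header row plus six data rows.
CSV_TEXT = """COUNTRY,POP
CHN,1398.72
IND,1351.16
USA,329.74
IDN,268.07
BRA,210.32
PAK,205.71
"""

# range(1, 7, 2) yields 1, 3, 5, so pandas skips every other row
# starting with the first data row. Index 0 (the header) is kept.
every_other = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=range(1, 7, 2))

# Random selection: pick three of the six data-row indices to skip.
# The range starts at 1 so the header line is never skipped.
rng = random.Random(0)  # seeded only to make the example repeatable
n_data_rows = 6
skip = sorted(rng.sample(range(1, n_data_rows + 1), k=3))
sampled = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=skip)

print(every_other)
print(skip)
print(sampled)
```

Choosing the indices to skip, rather than the indices to keep, means you need to know the number of rows in the file in advance.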