What Is a File?
At its core, a file is a contiguous set of bytes used to store data. This data is organized in a specific format and can be anything as simple as a text file or as complicated as a program executable. In the end, these bytes are translated into binary 1s and 0s for easier processing by the computer.
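To make the "bytes become 1s and 0s" idea concrete, here is a minimal sketch. It builds a few bytes in memory instead of reading a real file, and the helper name to_bits is only an illustration, not something from the lesson.

```python
# A file's contents are just bytes; here a few are built in memory.
data = "Hi!".encode("ascii")


def to_bits(raw: bytes) -> str:
    """Return each byte as an 8-digit binary string, separated by spaces."""
    return " ".join(format(byte, "08b") for byte in raw)


print(list(data))     # the byte values as integers: [72, 105, 33]
print(to_bits(data))  # the same bytes as bits: 01001000 01101001 00100001
```

The same helper would work on bytes read from disk with open(path, "rb").read().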
Files on most modern file systems are composed of three main parts:
- Header: metadata about the contents of the file (file name, size, type, and so on)
- Data: contents of the file as written by the creator or editor
- End of file (EOF): special character that indicates the end of the file
What this data represents depends on the format specification used, which is typically represented by an extension. For example, a file that has an extension of .gif most likely conforms to the Graphics Interchange Format specification. There are hundreds, if not thousands, of file extensions out there.
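An extension is only a hint, which is why the text says "most likely". As a rough sketch, you can compare the extension with the first bytes of the content: GIF data begins with the ASCII signature GIF87a or GIF89a. The function names and sample bytes below are made up for illustration.

```python
from pathlib import Path

# The two signatures that GIF data can start with.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def has_gif_extension(name: str) -> bool:
    """Check only the file name, not the contents."""
    return Path(name).suffix.lower() == ".gif"


def looks_like_gif(raw: bytes) -> bool:
    """Check only the leading bytes of the contents."""
    return raw.startswith(GIF_SIGNATURES)


# Hypothetical sample content: a GIF signature followed by a few more bytes.
sample = b"GIF89a\x01\x00\x01\x00"

print(has_gif_extension("picture.gif"))  # True
print(looks_like_gif(sample))            # True
print(looks_like_gif(b"just some text")) # False
```

A file named picture.gif that fails the second check has a misleading extension.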
00:06 A file is a contiguous set of bytes used to store data. It can be small and simple, such as the text files you’ve seen so far, or it can be large and complicated, such as a program executable, and everything in between. File format types are usually represented with a file extension, such as .txt for text files, .csv for comma-separated values files, .gif, .htm, and many hundreds, if not thousands, of other file formats which are available. File format specifications can usually be found on the internet. Here’s the one for CSV files, but they can be in-depth and necessarily technical.
00:49 For all but the simplest file formats, it’s often best to use a Python package that is dedicated to handling the contents, giving you methods to easily access the data inside.
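As one example of that approach, the standard library's csv module handles comma-separated values for you. This is a small sketch with made-up sample data held in memory, so there is no real file involved.

```python
import csv
import io

# Made-up CSV content; io.StringIO stands in for an open text file.
raw_text = "name,language\nGuido,Python\nDennis,C\n"

# DictReader uses the first row as column names and parses each later row.
reader = csv.DictReader(io.StringIO(raw_text))
rows = list(reader)

for row in rows:
    print(row["name"], "->", row["language"])
```

With a real file, you would pass the object returned by open(path, newline="") to csv.DictReader instead of the StringIO object.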
01:01 And while in the next section you’ll be looking at the contents of some complicated files, this isn’t the way you would handle them normally in practice.
01:10 This is to give you an idea of what’s under the hood and in the files, and also to allow you to flex your muscles with the skills you’ve learned so far.
01:22 With an eye towards the next section, let’s see what’s inside a file. They’re often made up of three parts. Firstly, the header, which provides metadata about the contents, format of the file, file type, size, data, et cetera. But not all files have these.
01:39 Next up, the content. This is what we’re normally most interested in, and in the case of files such as text files, this is all there is to them. Finally, an end of file character that indicates the end of the file to Python.
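In practice, Python code usually notices the end of a file through the result of a read call, not by inspecting a character itself: once nothing is left, read() returns an empty result. The sketch below assumes an in-memory stream in place of a real file.

```python
import io

# io.BytesIO stands in for a file opened in binary mode.
stream = io.BytesIO(b"abc")

chunks = []
while True:
    chunk = stream.read(2)  # ask for up to 2 bytes at a time
    if not chunk:           # an empty result means the end was reached
        break
    chunks.append(chunk)

print(chunks)  # [b'ab', b'c']
```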
01:56 With this in mind, you can now move on to the next section and use your Python skills to look inside some common file types.
billlittlewood on Nov. 6, 2019
sorry to nit-pick, but rather than being a contiguous set of bytes used to store data, I’d like to suggest a different description: a contiguous series of bytes used to store data.
michelnakhla on Aug. 25, 2019
Excellent videos with crisp clear explanations.