File Input/Output
What is IO?
Input/Output is usually stylized as I/O or IO.
The concept of:
- Sending information from inside our program to an outside system
- Receiving information from an outside system and using it inside our program
We can send and receive information from many sources:
- Operating system
- Network
- Files
- Databases
- etc.
Files
Files are units of data stored in the computer’s file system. Files exist in most operating systems.
The concept of a file is based on the metaphor of a file cabinet. A file cabinet has folders which contain pieces of paper (files). The pieces of paper have writing on them (data).
Files have data and attributes.
Data
Files contain a sequence of bytes, like 01000001 01110000 01110000 .... This is their data.
flowchart LR
A@{ shape: card, label: "**File**
01000001 01110000 01110000 01101100
01100101 01110011 00100000 01100001
01110010 01100101 00100000 01110100
01100001 01110011 01110100 01111001
00100001
" }
Attributes
| Attribute | Description | Example |
|---|---|---|
| Filename | The unique identifier of the file. Usually has an extension indicating the type of data the file contains. | pokemon.txt |
| Size | The number of bytes in the file | 1,024 bytes |
| Last Modified | The operating system keeps track of the date when the file was most recently changed | 28 Feb 2025 10:09 AM |
| Permissions | The operating system can allow only certain users or processes to access a file. Certain operations can be restricted too. | Allowed Users: admins; Allowed Operations: read-only |
Binary vs. Text Files
There are two main types of files: binary and text files.
Binary
The contents of binary files are just their bytes, as-is. These files are not human-readable. Opening them in a text editor will result in meaningless characters.
flowchart LR
A@{ shape: card, label: "**Binary File**
01000001 01110000 01110000 01101100
01100101 01110011 00100000 01100001
01110010 01100101 00100000 01110100
01100001 01110011 01110100 01111001
00100001
" }
Binary File Examples
| Category | File types |
|---|---|
| Images | .jpg, .png |
| Audio | .mp3, .wav |
| Video | .mp4, .mov |
| Executable Machine Code | .exe, .o, or no extension |
Text File
The contents of a text file represent characters, according to an encoding, like ASCII or UTF-8. These files are human-readable. Opening a text file in a text editor will result in human-readable characters, because the text editor applies the encoding, transforming the bytes into characters.
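To make the idea of "applying an encoding" concrete, here is a minimal sketch that turns the bytes from the diagrams above into characters, assuming plain ASCII (one byte per character). The function name decode_ascii is made up for this illustration.

```cpp
#include <bitset>
#include <sstream>
#include <string>

// Hypothetical helper: converts space-separated 8-bit groups into text,
// treating each byte as one ASCII character.
std::string decode_ascii(const std::string& bits) {
    std::istringstream groups(bits);
    std::string group;
    std::string text;
    while (groups >> group) {
        // std::bitset can be built from a string of '0' and '1' characters.
        std::bitset<8> byte(group);
        text += static_cast<char>(byte.to_ulong());
    }
    return text;
}
```

Passing the first six bytes of the diagram, "01000001 01110000 01110000 01101100 01100101 01110011", yields the text Apples. A text editor does the same kind of translation for every byte in the file.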
Text File Examples
| Category | File types |
|---|---|
| XML | .xml |
| JSON | .json |
| Tab-Delimited | .tsv |
| Comma-Delimited | .csv |
| Unstructured | .txt |
File IO
File IO is implemented with streams. Why might file IO be implemented with streams?
- Reduced Memory Use — Reading an entire file into memory is often unnecessary. Streams let us examine just one “chunk” at a time, saving memory.
- Speed — Reading an entire file at once could be very slow, especially if the file is massive. Streams let us control the amount we read, saving time.
Implementation
File streams are provided by the fstream header in the C++ Standard Library. The functionality is separated into two classes:
- ifstream: Input File Stream
- ofstream: Output File Stream
ifstream
ifstream is a stream for reading input from a file.
input file stream
i f stream → ifstream
flowchart LR
A[File]-- ifstream-->B[C++ Program]
ofstream
ofstream is a stream for writing output to a file.
output file stream
o f stream → ofstream
flowchart LR
A[C++ Program]-- ofstream-->B[File]
Example
Quiz
Which streams can work with the operating system IO?
Answer: cin & cout
Which streams can work with file IO?
Answer: ifstream & ofstream
Which manipulators can affect the cin & cout streams?
Answer:
- Make tables with setw(n), left, right, setfill(c)
- Format decimals with setprecision(n), fixed, scientific, showpoint
- Affect the buffer with endl, flush
Which manipulators can affect the ifstream & ofstream streams?
Answer: The same ones!
This is a benefit of the stream concept. The same techniques can be shared among different streams, reusing your existing knowledge.
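As a small illustration of that reuse, the sketch below applies the table and decimal manipulators to an ofstream in the same way we would with cout. The function name write_row, the column widths, and the sample values are assumptions made for this example.

```cpp
#include <fstream>
#include <iomanip>
#include <string>

// Writes one table row to a file: a left-aligned name padded to 10
// characters, then a right-aligned weight with one decimal place.
// Returns false if the file could not be opened.
bool write_row(const std::string& filename, const std::string& name, double weight) {
    std::ofstream out(filename);
    if (!out.is_open()) {
        return false;
    }
    out << std::left << std::setw(10) << name
        << std::right << std::setw(6) << std::fixed << std::setprecision(1) << weight
        << '\n';
    return true;
}
```

Calling write_row("table.txt", "Pikachu", 6.0) would produce a line with the name padded to 10 columns followed by 6.0 right-aligned in 6 columns.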
Reading & Writing Files
How to Read a Text File
- Allocate an ifstream
- Open the file with its name
- Read the file with stream functions
- Close the file
How to Write a Text File
- Allocate an ofstream
- Open the file with its name
- Write to the file with stream functions
- Close the file
ifstream Interface
| Method | Use |
|---|---|
| .open(s) | Open a file by file name (string) |
| .is_open() | Whether the .open(s) action was successful |
| .fail() | Whether the most recent “read” action failed |
| .close() | Close the connection to the file |
| ifstream >> variable | Extract file contents into the variable, stopping at the next whitespace |
| getline(ifstream, variable) | Extract file contents into the variable, stopping at the next newline |
zyBook Table 10.3.1: Stream error state flags and functions to check error state
Reading a Text File
Read pokemon.txt into a vector<string>
pokemon.txt
main.cpp
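The original listing is not available here, so the following is a minimal sketch of the four reading steps, written as a function so it can be reused. The name read_lines is an assumption, and the sketch assumes pokemon.txt holds one name per line.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Reads every line of a text file into a vector<string>.
// Returns an empty vector if the file could not be opened.
std::vector<std::string> read_lines(const std::string& filename) {
    std::vector<std::string> lines;
    std::ifstream file;                  // 1. Allocate an ifstream
    file.open(filename);                 // 2. Open the file with its name
    if (!file.is_open()) {
        return lines;
    }
    std::string line;
    while (std::getline(file, line)) {   // 3. Read with stream functions
        lines.push_back(line);
    }
    file.close();                        // 4. Close the file
    return lines;
}
```

From main, this could be called as read_lines("pokemon.txt"), and the returned vector would hold one element per line of the file.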
Choosing the “Chunk”
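The size of each "chunk" depends on which stream function does the reading. As a sketch under assumed names (read_words and read_whole_lines are invented here), the two functions below read the same file word by word with >> and line by line with getline.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Chunk = one whitespace-separated word per read.
std::vector<std::string> read_words(const std::string& filename) {
    std::ifstream file(filename);
    std::vector<std::string> words;
    std::string word;
    while (file >> word) {
        words.push_back(word);
    }
    return words;
}

// Chunk = one whole line per read; spaces inside the line are kept.
std::vector<std::string> read_whole_lines(const std::string& filename) {
    std::ifstream file(filename);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(file, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

For a file whose first line is "Mr. Mime", read_words splits it into two chunks ("Mr." and "Mime"), while read_whole_lines keeps it as one. Prefer getline when a single value may contain spaces.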
Analysis
How does this work?
- ifstream provides an interface to access the file system and process the text file.
- line provides a “container” to store each piece of the file as a string, as it is “streamed”.
- while loops until an (implicit) call to .fail() returns true, either because there is nothing left to read or an error occurred.
Why does it work?
What type of expression does while expect?
while expects a boolean.
- >> and getline() both return a reference to the stream.
- ifstream can be converted into a bool implicitly by the compiler.
- The conversion is the opposite of the .fail() result, computed automatically: true after a successful read, false after a failed one.
- Trying to read after end-of-file is reached causes .fail() to return true.
- Therefore, the while loop ends when a read is attempted after end-of-file is reached.
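The reasoning above can be checked with a small experiment. This sketch (LoopReport and count_reads are names invented for the illustration) counts how many reads succeed and records what the stream reports once the loop has ended.

```cpp
#include <fstream>
#include <string>

// What the stream looked like after the read loop finished.
struct LoopReport {
    int reads;          // number of successful getline calls
    bool failed_after;  // .fail() once the loop has ended
    bool hit_eof;       // .eof() once the loop has ended
};

LoopReport count_reads(const std::string& filename) {
    std::ifstream file(filename);
    std::string line;
    int reads = 0;
    // getline returns the stream; in a condition the stream converts to
    // bool, and that conversion is true exactly when .fail() is false.
    while (std::getline(file, line)) {
        ++reads;
    }
    return {reads, file.fail(), file.eof()};
}
```

For a two-line file, the loop body runs twice. The third getline finds nothing to read, so .fail() becomes true and the loop stops.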
Another Option
We can also call .fail() directly to stop reading at end-of-file, but we must do the checks in a certain order.
This approach is not recommended because it is easy to get the operations out of order by mistake.
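For comparison, here is a sketch of the explicit version with the checks in a safe order: attempt the read, check .fail(), and only then use the data. The function name read_lines_explicit is an assumption.

```cpp
#include <fstream>
#include <string>
#include <vector>

std::vector<std::string> read_lines_explicit(const std::string& filename) {
    std::ifstream file(filename);
    std::vector<std::string> lines;
    std::string line;
    while (true) {
        std::getline(file, line);   // 1. Attempt the read
        if (file.fail()) {          // 2. Check before using the data
            break;
        }
        lines.push_back(line);      // 3. Use the data
    }
    return lines;
}
```

Swapping steps 2 and 3 would store one extra, stale value after the final failed read, which is the kind of mistake the while (getline(...)) idiom avoids.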
Writing a Text File
Let’s extend the program above to copy the text input file, adding line numbers. Use the insertion operator << to insert into an ofstream.
pokemon.txt
main.cpp
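The original listing is not available here, so this is one possible sketch of the copy step. The function name copy_with_line_numbers and the "1: " prefix format are assumptions.

```cpp
#include <fstream>
#include <string>

// Copies a text file line by line, prefixing each line with its number.
// Returns false if either file could not be opened.
bool copy_with_line_numbers(const std::string& in_name, const std::string& out_name) {
    std::ifstream in(in_name);
    if (!in.is_open()) {
        return false;
    }
    std::ofstream out(out_name);
    if (!out.is_open()) {
        return false;
    }
    std::string line;
    int number = 1;
    while (std::getline(in, line)) {
        out << number << ": " << line << '\n';   // insert into the ofstream
        ++number;
    }
    return true;  // both streams close automatically when they go out of scope
}
```

The input is opened and checked before the output so that a missing input file does not leave an empty output file behind.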
Write Modes
By default, opening a file with an ofstream discards the file's existing contents, so anything we write replaces what was there. We can change that behavior by providing a second argument to the open method, for example ios::app to append instead.
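A minimal sketch of the two modes side by side. The function names are invented for this example; std::ios::app is the standard flag for append mode.

```cpp
#include <fstream>
#include <string>

// Default mode: existing contents are discarded when the file is opened.
bool overwrite_with_line(const std::string& filename, const std::string& text) {
    std::ofstream out;
    out.open(filename);
    if (!out.is_open()) {
        return false;
    }
    out << text << '\n';
    return true;
}

// Append mode: every write goes to the end of the existing contents.
bool append_line(const std::string& filename, const std::string& text) {
    std::ofstream out;
    out.open(filename, std::ios::app);
    if (!out.is_open()) {
        return false;
    }
    out << text << '\n';
    return true;
}
```

Calling overwrite_with_line and then append_line on the same file leaves two lines in it. Calling overwrite_with_line again leaves only the newest line.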
Append to File Exercise
Extend the program above to append the contents of pokemon-chunk.txt to the output file. Note: Each time we run the code, the contents of pokemon-chunk.txt will be appended again.
What will be the expected output, given the following files?
pokemon.txt
pokemon-chunk.txt
What will be the expected output if we run the code again?
ifstream & ofstream in Functions
File streams can be passed into functions as arguments. This is useful for abstracting logic into a function for reuse.
File streams must be passed by reference.
Why?
ifstream & ofstream are not copyable because streams are intended to exist once and be shared throughout your code. Passing by value would require making a copy.
Syntax
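A sketch of the parameter syntax, using two hypothetical helper functions. The & after the stream type is what makes each parameter a reference.

```cpp
#include <fstream>
#include <string>

// The stream is taken by reference; removing the & would not compile,
// because file streams cannot be copied.
int count_lines(std::ifstream& in) {
    int count = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++count;
    }
    return count;
}

// Output streams follow the same rule.
void write_heading(std::ofstream& out, const std::string& title) {
    out << "== " << title << " ==\n";
}
```

The caller opens the stream and passes it in, for example count_lines(file), so the function works with whichever file the caller chose.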
Buffers
Writing and reading files in the underlying operating system is slow. File streams increase speed by using a buffer to batch the reads and writes made by our code. This has two benefits:
- Increased read & write speed
- Increased disk life. Each write to the file system is one step closer to the disk wearing out.
Reading
- Opening the file creates a buffer & fills it up with the first chunk of the file.
- Reading moves the data from the buffer to the destination in your code.
- The buffer “automatically” refills itself when it needs to supply more data but it is already empty.
flowchart LR
F@{shape: docs, label: File System}
A(Code)
B(Buffer)
O((Open))
O-- Creates -->B
F-- Refills When Empty -->B
B-- Read -->A
Writing
- Opening the file for writing creates an empty buffer.
- Writing streams data from your code into the buffer.
- The buffer is written to disk each time it becomes full. Closing the file writes the remaining contents of the buffer to disk, if any.
flowchart LR
F@{shape: docs, label: File System}
B(Buffer)
O((Open))
A(Code)
O -- Creates -->B
B-- Writes When Full -->F
A-- Write -->B
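We do not have to wait for the buffer to fill or for the file to close: the flush manipulator and the .flush() method force the buffer to be written immediately. The sketch below (write_flush_and_peek is an invented name) writes a line, flushes, and then reads the file through a second stream while the first is still open. Whether unflushed data would be visible to the second stream depends on the buffer size and the implementation, so the sketch only shows the flushed case.

```cpp
#include <fstream>
#include <string>

// Writes text, flushes, and returns what an independent reader sees
// while the writing stream is still open.
std::string write_flush_and_peek(const std::string& filename, const std::string& text) {
    std::ofstream out(filename);
    out << text << '\n';   // lands in the stream's buffer first
    out.flush();           // hand the buffered bytes to the operating system now

    std::ifstream in(filename);
    std::string seen;
    std::getline(in, seen);
    return seen;           // out closes automatically after this return
}
```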
Errors
There are several errors which can happen when working with file streams.
Summary
| Problem | Solution |
|---|---|
| Nonexistent File | .is_open() |
| Improper Variable Type | Don’t make mistakes ☹️ |
| Reading Past EOF | Stop reading when .fail() == true |
| Overwriting Existing File | Detect whether the file exists before writing |
Problem: Opening a file which does not exist
A file with the provided name might not exist, or the code may be denied access. Check that the file is open with .is_open() before reading.
Fixed by terminating the program if the file could not be opened.
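A sketch of the check, written as a helper so that main can decide how to terminate. The function name open_for_reading and the error message wording are assumptions.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Tries to open the file. On failure, prints a message to cerr and
// returns false so the caller can stop instead of reading garbage.
bool open_for_reading(std::ifstream& file, const std::string& filename) {
    file.open(filename);
    if (!file.is_open()) {
        std::cerr << "Could not open " << filename << '\n';
        return false;
    }
    return true;
}
```

In main, the program can terminate early with a nonzero exit code: if (!open_for_reading(file, "pokemon.txt")) return 1;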
Problem: Assigning data to a variable which cannot hold the data
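If we extract content into a variable whose type cannot hold it (for example, reading the text Pikachu into an int), the extraction fails. The stream enters its fail state, and later reads do nothing until that state is cleared.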
Fixed by storing into a variable with the right type.
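The sketch below shows both the problem and the fix on the first token of a file. ReadAttempt and read_first_token are names invented for the illustration. The .clear() call is included only so the demonstration can continue reading after the failure.

```cpp
#include <fstream>
#include <string>

struct ReadAttempt {
    bool int_read_failed;  // did reading into an int fail?
    std::string as_text;   // the same token read into a string instead
};

ReadAttempt read_first_token(const std::string& filename) {
    std::ifstream file(filename);
    ReadAttempt result{false, ""};

    int number = 0;
    file >> number;                        // fails if the token is not a number
    result.int_read_failed = file.fail();

    if (result.int_read_failed) {
        file.clear();                      // reset the error state so reads work again
        file >> result.as_text;            // a string can hold any token
    }
    return result;
}
```

If the file starts with Pikachu, int_read_failed is true and as_text holds Pikachu. If the file starts with 25, the int read succeeds.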
Problem: Trying to read when there is nothing left
If we attempt to read when there is nothing to read, logic errors will result. We must check if the read was successful each time, before using the data. Stop reading at the end-of-file. Use an idiom which handles EOF and errors. See discussion above.
Example
pokemon.txt
main.cpp
Fixed by looping until end-of-file
Problem: Accidentally Overwriting an existing File
Writing to a file will overwrite any existing file with the same name. A programmer can detect whether a given file already exists by attempting to open the file for reading, then checking .fail().
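A sketch of that check, with invented names file_exists and write_if_new. One caveat of this approach: a file that exists but cannot be opened for reading (for example, because of permissions) looks the same as a missing file.

```cpp
#include <fstream>
#include <string>

// Treats "can be opened for reading" as "exists".
bool file_exists(const std::string& filename) {
    std::ifstream probe(filename);
    return !probe.fail();
}

// Writes only when no file with that name could be opened.
// Returns false if the write was refused or the file could not be created.
bool write_if_new(const std::string& filename, const std::string& text) {
    if (file_exists(filename)) {
        return false;
    }
    std::ofstream out(filename);
    if (!out.is_open()) {
        return false;
    }
    out << text << '\n';
    return true;
}
```

The first call to write_if_new with a fresh name creates the file. A second call with the same name returns false and leaves the original contents alone.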
Appendix: Example Programs
Read a TSV File
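The original program is not available here, so this is one possible sketch. It reads each line with getline, then splits the line on tab characters by running getline again on a string stream with '\t' as the delimiter. The function name read_tsv and any column layout are assumptions.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Reads a tab-delimited file into rows of string fields.
// Returns an empty result if the file could not be opened.
std::vector<std::vector<std::string>> read_tsv(const std::string& filename) {
    std::vector<std::vector<std::string>> rows;
    std::ifstream file(filename);
    std::string line;
    while (std::getline(file, line)) {
        std::istringstream fields(line);   // treat one line as its own stream
        std::vector<std::string> row;
        std::string field;
        while (std::getline(fields, field, '\t')) {  // stop at each tab
            row.push_back(field);
        }
        rows.push_back(row);
    }
    return rows;
}
```

For a line such as Pikachu, Electric, and 25 separated by tabs, the row holds three strings. Numeric columns would still need converting, for example with std::stoi. An empty final field on a line is dropped by this simple approach.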