Read a file line by line in C - secure fgets idiom

October 03, 2009 at 02:57 PM | categories: Technical, C, UNIX

A pretty common thing to do in any program is to read a file line by line. In interpreted or managed languages this is trivial: the standard libraries make it super easy for you. Just look at how simple it is to do this in Python or Perl or even shell. In C it's a little more complicated, because you have to think up front about how much memory you need, and the standard library is kind of crufty. You always have to worry about overflowing buffers or dereferencing a NULL pointer.

However, there is a nice libc function available on BSD-derived platforms (including Mac OS X) - fgetln(3). This function makes it nice and easy to read arbitrary-length lines from a file in a safe way. Unfortunately, it's not available in GNU libc - that is to say, if you use this function, your program won't compile on Linux. It's not a trivial libc function to port - unlike, say, strlcpy - since it relies on private details of the FILE structure. Those private details aren't the same in glibc, so it doesn't work out of the box.

While GNU libc doesn't provide fgetln(3), it does provide its own similar function, getline(3). Of course, if you use this function - which is a bit uglier than fgetln(3) in my opinion - your program won't work on BSD libc systems. So basically, neither of these functions is usable if you want your program to be reasonably portable. Pretty much everything I write in C I want to work on at least Linux, the BSDs, and Mac OS X.

You could write your own line-reading function on top of ANSI C. Or you might be able to get away with using the existing ANSI function, fgets(3). You need to be careful with fgets, however: you can easily introduce bugs if you don't cover all the error cases. The other big problem with fgets is that you need to know in advance the maximum length of the lines you are going to read, otherwise you'll end up with truncation.
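If you do go the roll-your-own route, one possible shape is to call fgets repeatedly into a heap buffer and double the buffer whenever a line doesn't fit. The sketch below is only an illustration of that idea, not a drop-in replacement for fgetln(3) or getline(3): the name read_line, the starting size of 128 bytes, and the ownership rule (caller frees) are all assumptions made for the example, and it does not try to cope with NUL bytes embedded in the input.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * read_line() is a hypothetical helper, not part of any libc.
 * It reads one line of any length from fp into a malloc'd buffer,
 * strips the trailing newline, and returns the buffer.
 * It returns NULL at end of file, on a read error, or if memory
 * runs out. The caller must free() the result.
 */
char *
read_line(FILE *fp)
{
    size_t cap = 128, len = 0;      /* 128 is an arbitrary starting size */
    char *buf = malloc(cap), *tmp;

    if (buf == NULL)
        return NULL;
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';    /* complete line: drop the newline */
            return buf;
        }
        if (cap - len < 2) {        /* buffer is full: double it */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (len > 0 && !ferror(fp))
        return buf;                 /* last line had no trailing newline */
    free(buf);
    return NULL;
}
```

The trade-off is an allocation per line and a free() the caller has to remember, which is exactly the bookkeeping a fixed stack buffer avoids.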
In most applications, you can get away with a kilobyte or two on the stack for each line and be OK. In some places, though, it could be a deal killer. Anyway, here is a good idiom for using fgets:

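char buf[MAXLINELEN];
while (fgets(buf, sizeof(buf), ifp) != NULL) {
    /* Strip the trailing newline, if there is one. */
    buf[strcspn(buf, "\n")] = '\0';
    if (buf[0] == '\0')
        continue;    /* skip empty lines */
    /* ... process buf ... */
}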
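The explanation of why you use strcspn() can be found in the OpenBSD fgets(3) manual page. In short, strcspn(buf, "\n") returns the index of the first newline, or the length of the string if there isn't one, so the assignment is always in bounds. The more obvious buf[strlen(buf) - 1] = '\0' writes before the start of the buffer if the string is empty, and chops off a real character if the line has no trailing newline.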
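The idiom on its own silently splits any line that doesn't fit in the buffer, so it's worth spelling out one way to notice that case. The following is a minimal sketch under a few assumptions: MAXLINELEN is set to 1024 purely for illustration, count_nonempty_lines is a made-up helper name, and the policy of skipping overlong lines (rather than failing outright) is just one choice among several.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINELEN 1024    /* assumed limit, chosen for this example */

/*
 * count_nonempty_lines() is a hypothetical helper built around the
 * fgets idiom. It counts the non-empty lines in ifp and stores in
 * *toolong the number of lines that did not fit in the buffer
 * (those are skipped, not counted). Returns -1 on a read error.
 */
long
count_nonempty_lines(FILE *ifp, long *toolong)
{
    char buf[MAXLINELEN];
    long count = 0;
    int c;

    *toolong = 0;
    while (fgets(buf, (int)sizeof(buf), ifp) != NULL) {
        size_t n = strcspn(buf, "\n");

        if (buf[n] != '\n') {
            /*
             * No newline in buf: either this is the final line of a
             * file that lacks a trailing newline, or the line (plus
             * its newline) was too long. Peek at the next character
             * to tell the two apart.
             */
            c = getc(ifp);
            if (c != EOF) {
                /* Too long: discard the rest of the line. */
                while (c != '\n' && c != EOF)
                    c = getc(ifp);
                (*toolong)++;
                continue;
            }
        }
        buf[n] = '\0';
        if (buf[0] == '\0')
            continue;    /* skip empty lines */
        count++;
    }
    return ferror(ifp) ? -1 : count;
}
```

Checking ferror() after the loop matters because fgets returns NULL both at end of file and on a read error, and only the first of those is a normal way for the loop to finish.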

Niall O'Higgins is an author and software developer. He wrote the O'Reilly book MongoDB and Python. He also develops Strider Open Source Continuous Deployment and offers full-stack consulting services at
