The Worst C Example Ever May 21, 2007
Posted by Imran Ghory in Software development.trackback
The Example:
strcpy(str, getpass("Enter password: "));
The first time I saw this I was shocked at how many problems this tiny example of code has (there’s a list of problems further on – if you want to try and figure out the problems yourself try to write a working getpass() function based upon this caller).
To put this example in context it came from Question 6.5 of the comp.lang.c FAQ in which this example is given as an example for how to achieve what the following piece of code is trying to do:
extern char *getpass();
char str[10];
str = getpass("Enter password: ");
For those of you who aren’t C programmers the above chunk of code won’t work as getpass() is returning a pointer to an array (in C strings are represented as arrays of chars) and C doens’t support implicit array copying.
However the “fixed” example is almost as bad as the original, only made marginally better by the fact it technically works. However it is horrendous as an example of good coding practice.
So what’s wrong with it ?
The newbie programmer has almost certainly implemented getpass() using the following method:
char* getpass(void)
{
char password[10];
scanf("%s\n", password);
return password;
}
This reads the password into a local variable which it then returns a pointer to it. The worst thing about this is the code will actually work sometimes and not other times leaving the programmer in a terrible confusion (local variables are allocated on the stack hence will get overwritten after the function ends).
A more spohisticated way to write this function and how it would be generally be done would be to use dynamically allocated memory as in the following:
char* getpass(void)
{
char* password = (char*) malloc(10);
scanf("%9s\n", password);
return password;
}
However in the original bad example the code is throwing away the pointer which getpass() returns, which means that it’s not possible to free that memory. Hence you end up with a memory leak, not good. So this solution is not good in the context.
The best possible getpass() you could get away with is to declare a static char array variable which you return a pointer to
char* getpass(void)
{
static char password[10];
scanf("%9s\n", password);
return password;
}
However this makes the function non-thread-safe and non-renterant which limits its use, and as well as that you end up storing the password in memory for the permanent life of the program.
So as you can see it’s impossible for even a good programmer to write a good getpass() function that can be called by strcpy(str, getpass(“Enter password: “)) , let alone the beginners that this document is targeted at.
A bonus bug
Plus for bonus points, there’s yet another sutble bug in this code. It uses strcpy() in an unsafe manner. Even professional developers often slip up when using strcpy, generally professionals always want to use strlcpy as strcpy is almost never used correctly. The reason it’s bad is the following scenario, imagine getpass() was changed to be the following:
char* getpass(void)
{
static char password[100];
scanf("%99s\n", password);
return password;
}
What makes it unsafe is that strcpy doesn’t know how many characters to copy, hence it just keeps copying from the source to the destination until it runs out of characters. Now that getpass() returns more than ten characters what happens ? – if you’re lucky the program will crash – if you’re unlucky the person typing in the password can take over the program and bypass the password protection.
So What ?
Well hopefully the above has convinced you that if you’re writing for beginners you should try to write examples that not only work, but also show good development practices. Especially online where space isn’t as major a consideration as it is in books.
Ugh, another “security through flailing paranoia” post. Ask yourself if your users are malicious. If not, move on.
It has its own problems, but:
#define getpass(s) ({ char *c = alloca(10); scanf(“%9s”, c); c; })
How would you improve that section of the comp.lang.c faq? Its kept simple for a reason, to answer a specific question. So the example serves its purpose. If you’ve ever spent some time in comp.lang.c then you’ll know that particular FAQ entry will never be made longer at the expense of clarity to the question being asked.
Ask yourself if your users are malicious. If not, move on.
Why have a password if you trust all of your users?
@engtech: why post under a weak pseudonym if someone smart and determined enough can still track you down through your IP? There are many different definitions of “enough security”.
Are you aware that getpass(3) is in libc and doesn’t suffer from any of those problems? (Albeit being deprecated for other good reasons.) I don’t think the FAQ is assuming that someone implemented it on their own… if anything, I think that question is just a little outdated.
[…] The Worst C Example Ever The Example: strcpy(str, getpass(“Enter password: “)); The first time I saw this I was shocked at how many problems […] […]
squirrel: getpass() is known to have security bugs on several platforms (see Viega, et al’s Building Secure Software). The OpenGroup’s specification also marks getpass() as being legacy and discourages it’s use.
Googling indicates most implementations use the static variable approach with a fixed length. Which as I’ve said is the only half-decent approach even for the problems it has.
That kind of stuff is so hard seriously… I think codes are hard to remember…
Sorry to sound stupid, but isn’t the whole approach wrong?
Would you not malloc the space for the password and PASS THE ARRAY POINTER in to getpass, then FREE THE MEMORY in the same code block which made the malloc to isolate your memory allocation to one place?
If you needed to manage array size then add another parameter to getpass with a length parameter preventing an overflow.
Pixolut: Yes.
“The Worst C Example Ever”
You call that the worst? please! just check http://www.ioccc.org.
I do not understand why people have to program in C where most robust programming languages are available live VB or VC#, where it would be possible more robust programs in lots less time!
Canvas of life – The lives of real people
Canvas of life Sorry for the previous bad link.
Razib,
because those languages are shit horrible.
anyone worth a damn codes and uses C correctly, anyone who doesn’t suffers.
The correct solution is one that behaves as defined by its documented implementation. see man page for details.
I agree that the code example is not good practice, primarily because of the risk strcpy() presents. There came a point in my C development experience where I didn’t use it most of the time. I used a macro, which used lstrcpy() instead, to do the same thing more safely (this was years ago).
I agree the version with the static variable is not thread safe, but I disagree that it’s not re-entrant. The way static locals work (at least they used to) is they’d initialize the first time the function is called, and every call after that the initialization step is skipped. You can still put a new value in password each time getpass() is called. If you want to see this in action, check out the strtok() function sometime.
I dislike the version that allocates memory on the heap inside the function. It’d be better to require the caller to allocate the memory, and pass in the pointer. That way the caller will be more cognizant of the fact that the memory needs to be freed afterwards.
Most help artliecs on the web are inaccurate or incoherent. Not this!
VgLw2i sveplfldbccz
r9GDND qwxirrwnvagj
OK, could use strncopy!
Moral of the story? C is a shitty programming language that leaves the door open for such rubbish.
Most modern programming languages won’t allow these kinds of mistakes and flag an error when you compile – this includes Java and C#. Even older languages that are contemporaries of C, e.g., Pascal will flag this kind of bad usage when you compile.
If you are serious about programming and want to spend most of your valuable time coding applications use a modern language.
If on the other hand, you have the time to hack around and want to spend more time debugging than programming, C will keep you happy for the rest of your life.
Not sure I agree with you Guru. It’s helpful to have intelligent compilers that look for code which is now known to be linked with vulnerabilities, but that’s no substitute for good programming.
Even accepting the need for this kind of check, the fault lies with the particular compiler not the language itself.
> Ugh, another “security through flailing paranoia” post. Ask yourself if your users are malicious. If not, move on.
Then ask yourself if memory leaks and random crashes are acceptable. If not, then pay attention.
whydo c programmers all have to reinvent the weel – non of this namby pamby hippy C stuff bring back FORTRAM IV I say 🙂
the beauty of c is also its downfall. it is one of the few languages (outside of machine code) that will allow a programmer to do nearly anything he wants without restriction. unfortunately, the onus is on the programmer to know what he is doing and that’s not always the case.
modern, heavily typed languages help take away a lot of the potential for memory leaks, security holes, and such but at the same time i think a lot of solid coding practices are forgotten (or worse, disregarded) because the compiler catches most simple mistakes and oversights.
why not use strdup() instead of strcpy()?
I’ll just go ahead and add to that fire,
No, children, C is not any part of anything bad you’ve just stated above. In fact yes, it’s beautiful, and no that’s not its downfall.
And like most things, you fear what you do not understand, so I suspect that those of you (esp. those touting any language with a “Visual” in front of it, ahem) who rag on C’s “inadequacies” just need to go pick up a copy of K&R C and read the whole thing before asking any stupid questions.
If you can’t write assembly or C then how can you possibly be a half decent programmer in a high level language like Java, Ruby, C# or VB.Net? ASM and C exposes you the programmer to the issues underlying any of those higher level languages.
C++ and objective C are like an intermediate step between higher level languages and C, and in fact I would recommend that anyone thinking that they want to write seriously for high level languages like Java or C# should learn a little ASM, C and C++ to understand the why and how of the language intermediary (virtual machine) and the underlying operating system frameworks. Last time I checked there were not many kernels written in Java…for a reason.
Those people beating up C might think differently when they have to performance tune a large multi threaded Java application and try to figure out what is actually going on under the surface in the VM or between the VM and the host operating system.
@MJH: Is it really heavily typed languages that solve this? Isn’t it the case that languages with garbage collection or retain/release memory paradigms that help more, regardless of the type model? (I’m thinking Python, Java, Obj-C, and so on). This isn’t even modern. How about Lisp?
@pixolut: I completely agree, and, piling on, I wonder if students now graduate without ever learning compiler design — another good training ground for understanding how code behaves. Or, lacking that, how about debugging with only an assembler-level debugger for a while? That’ll teach them youngsters!
I recently graduated from UIUC, and I can assure you that my program teaches the basics of compiler design. We never used any assembly debuggers, though we did learn MIPS assembly. What I’m seeing in that program, as well as other programs (CMU does this), is that Computer Science as a field is starting to segment into distinct categories. Of course, this is going to throw a wrench into the common “CS graduates should know the entire stack”, but really it’s just shortening the stack. I literally learned how to design an architecture and build a CPU from ICs and resistors. While that is great from a generalist’s point of view, it certainly doesn’t bring me any closer to advancing the state of the art in Computer Science.
CMU has a separate department within CS dedicated entirely to Human-Computer Interactions. Right now, the field is populated with a combination of designers, psychologists, and CS guys. A program to fill that niche honestly has little need for assembly, compiler theory, or numerical methods.
T.,
With all due respect, I understand the feeling that washes over a person once one comes to “understand” the deep intricacies of any topic via having read a seminal book on the topic. However, I feel that being required to read such a thing before being able to ask questions about the standard difficulties one encounters when coding C is too curt, akin to RTFM, Stockholm Syndrome-style. Yes, it’s a badge of honor, but it’s not worth anything to anyone but your peers. If it seems to most everyone coding in C that C has some really wacky problems inherent to it, then the onus is on you to refrain from RTFMing the loln00bz. The onus is on you to justify obvious shortcomings of C (which happen to be side-effects of its truly obvious advantages).
Even if C’s shortcomings are just the fact that most people who start coding in C don’t follow the properly-prescribed way of doing so. I mean, do YOU read the entire book that comes with your cellphone before you dive in and start messing around with its features? No. Like TCP/IP, the human way is a “best effort” attempt at extracting what “just works”, even if it’s “highly illogical”, and even if, in the long term, there would be significant time-savings to learn things “the proper way”. Some of us trade depth for breadth sometimes and wish for smoother translation between the breadth component of our existence and the depth of yours. Without snippety YOU CLEARLY DIDNT READ uZ TEXTBOOKZ,NOOBIAN.
“Computer, end program.”
Shouldn’t this be an example of why C is inappropriate for human use?
^ no. it should be an example of why only few people deserve to be considered real programmers.
Meh, I know there’s a flamewar going on but I just *have* to say this (like a few other folks.)
The correct code is entirely logical. People saying “C isn’t for humans” just aren’t used to how manual memory management works (the alloca() version isn’t a good idea either.)
As a decent C programmer, and that’s what our original poster is trying to say, you know what to do here. The callee will give the called function operational space. (See numerous functions in libc; fgets(), recv(), …)
Oh and for the record, just malloc()ing senselessly isn’t such a good idea either because you have to check if you actually got memory allocated.
If I had to do this I’d use a char[128] or so, and then pass it to the getpass() function which would _naturally_ only copy so many chars into a given char *.
To #23, Patrick:
Using strdup() isn’t really sensible either, although it’ll probably work. The issue is that strdup() and friends can all return NULL when OOM (Out Of Memory) occurs, which just piles on to the code’s length.
The other issue is that a general rule of thumb in C is to never expect to have to call free() yourself. If you’re making an API that handles, say, foobars, you’d probably have a foobar_new() which returns a new structure of that type, and a foobar_del() or likewise, even if it just does a simple free().
Reasoning behind it is that it allows the API to expand or shrink. It is a crucial part of API design; never let the user (the user of the API) fiddle with the internals. There’s a Google TechTalk on it, I’d link it but I’m much too tired.
This is a great example! I’d use this in an interview. The many reactions are also very indicative of people’s mindsets and maturity levels.
I love when a seemingly simple snippet of code demonstrates so many layers of understanding and misunderstanding.
To all those who do not understand why to program in C, several of the posts address this well. I’ll add this:
Cis essentially a ‘portable assembly languge’, and it is in fact more portable than any other language, when some consistent care is taken to be portable. The language is very expressive and occupies the closest thing to a sweet-spot between the machine model and the human mental model, and it is quite readable, if a modicum of discipline and consistency is applied when writing in it.
There are more platforms out there than most programmers generally imagine. Everything from embedded systems that run the telephone networks to various realtime industrial processes, to global scale transaction processing on mainframes, to network management, robust database internals, cryptography, and operating systems are often written in C. It’s not all just Windows, Linux and the Web.
edgerbAnnuare
Bded
You cant pass a string to a function that cast its parameters to void:
the header for getpass should be getpass(char*) not getpass(void)
این نبا مشاوره و پیشنهاد مناسب درخصوص کسانی است
که می خواهند حوالی محتویات شبیه بوسیله این مطلب مطالعه بیشتری داشته باشند
عالی است مطلب تصنیف شده توسط شما چقدر متحسن
نگارش شده است و من مطمئن هستم که این
پست درخصوص متعدد از وبلاگ نویسان پسندیده است
معنای مطلب ی این مطلب از نظر چقدر از افراد
نازل و کامل نیست ولی جان نثار از محتواها سایت ها و بلاگ ها پیرامون این مطلب در وب وجود دارند که می توان از آنها استفاده بهتری داشت
من به خاطر جلوه گر کردن این مطلب صفحه های زیادی را جستجو کردم و به نظرم این مطلب می
تواند درخصوص متعدد از بازدید کنندگان مناسب باشد
keep us up to date like this. Thanks for sharing.