Research Advice

Diomidis Spinellis
Department of Management Science and Technology
Athens University of Economics and Business
Athens, Greece
dds@aueb.gr

Guidelines for Citing Referenced Material

Introduction

The correct citation of referenced material is an important aspect of scientific publications. The following guidelines provide a quick starting point for creating and managing citations. The guidelines are structured as follows: first we outline the type of information that can appear in cited items (citation elements), then, for every type of item (article, book, thesis) we indicate information that is required or optional (citation contents), and finally we outline citation formats, give some examples, and describe the tools we use. This guide is intended to be an efficient reference for creating scientific citations. It is biased towards the citation formats supported by BibTeX. It is not intended to be complete or authoritative.

Citation Elements

The following list includes all elements that can appear in citations. Author and editor lists are separated by "and" in BibTeX files; in citations they are typically separated by a comma, with an "and" appearing before the last one.
Address: Publisher's address. For major publishers, just the city.
Annote: Annotation.
Author: First Last or Last, First. Multiple authors are separated by "and".
Booktitle: Title, properly capitalised.
Chapter: A chapter number.
Edition: Edition of the book, e.g. second.
Editor: First Last or Last, First. Multiple editors are separated by "and".
HowPublished: How the work was published, if it was published in an unusual way.
Institution: Institution that published it.
Journal: Journal name. Abbreviations may exist ($TEXINPUTS/*.bst).
Key: Used for alphabetizing and creating a label when there is no author.
Month: Month of publication, using the usual abbreviations.
Note: Additional information to help the reader.
Number: Number of a journal, magazine, or technical report.
Organization: Organization sponsoring the conference.
Pages: Page numbers or range.
Publisher: Publisher's name.
School: Name of the school where the thesis was written.
Series: The name of a series or set of books.
Title: The work's title.
Type: Type of a technical report, e.g. Research Note.
Volume: The volume of a journal or multivolume book.
Year: The year of publication. Numerals only.
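
The difference between the BibTeX convention and the printed convention for name lists can be illustrated with a short Python sketch. The function names are hypothetical, and the code handles only the two simple name forms listed above ("First Last" and "Last, First"); it makes no attempt to reproduce BibTeX's full name parsing (brace-protected names, "von" and "Jr" parts).

```python
import re


def to_first_last(name: str) -> str:
    """Normalise "Last, First" to "First Last"; leave "First Last" unchanged."""
    if "," in name:
        last, first = (part.strip() for part in name.split(",", 1))
        return f"{first} {last}"
    return name.strip()


def format_authors(bibtex_field: str) -> str:
    """Turn a BibTeX author or editor field into a citation-style name list."""
    # In a BibTeX file, names are separated by the word "and".
    names = [to_first_last(n) for n in re.split(r"\s+and\s+", bibtex_field.strip())]
    if len(names) <= 2:
        return " and ".join(names)
    # In the citation, names are separated by commas, with "and" before the last.
    return ", ".join(names[:-1]) + ", and " + names[-1]


print(format_authors("Kernighan, Brian W. and Ritchie, Dennis M."))
```

Whether a comma precedes the final "and" depends on the bibliography style; the sketch includes one.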

Citation Contents

The following sections indicate required and optional items for different types of cited material.

Article

Title: Required
Author: Required
Journal: Required
Volume: Optional
Number: Optional
Pages: Optional
Month: Optional
Year: Required
URL: Optional
Note: Optional

Book

Title: Required
Edition: Optional
Series: Optional
Volume: Optional
Author: Required or include Editor
Editor: Required or include Author
Publisher: Required
Address: Optional
Month: Optional
Year: Required
Note: Optional
ISBN: Optional

Booklet

Address: Optional
Author: Optional
HowPublished: Optional
Key: Optional (needed if no Author)
Month: Optional
Note: Optional
Title: Required
Year: Optional

InBook

Address: Optional
Author: Required or include Editor
Chapter: Required or include Pages
Edition: Optional
Editor: Required or include Author
Month: Optional
Note: Optional
Pages: Required or include Chapter
Publisher: Required
Series: Optional
Title: Required
Volume: Optional
Year: Required

InCollection

Author: Required
Title: Required
Chapter: Optional
Pages: Required
Editor: Optional
Booktitle: Required
Publisher: Required
Address: Optional
Month: Optional
Year: Required
Note: Optional

InProceedings

Title: Required
Author: Required
Booktitle: Required
Address: Optional
Month: Optional
Year: Required
Organization: Optional
Pages: Optional
Editor: Optional
Publisher: Optional
Note: Optional
URL: Optional

Manual

Address: Optional
Annote: Optional (annotation)
Author: Optional
Edition: Optional
Key: Optional (needed if no Author)
Month: Optional
Note: Optional
Organization: Optional
Title: Required
Year: Optional

MastersThesis

Address: Optional
Author: Required
Month: Optional
Note: Optional
School: Required
Title: Required
Year: Required

Misc

Author: Optional
HowPublished: Optional
Key: Optional (needed if no Author)
Month: Optional
Note: Optional
Title: Optional
Year: Optional

PhDThesis

Address: Optional
Author: Required
Month: Optional
Note: Optional
School: Required
Title: Required
Year: Required

Proceedings

Title: Required
Editor: Optional
Note: Optional
Organization: Optional
Address: Optional
Publisher: Optional
Month: Optional
Year: Required

TechReport

Author: Required
Title: Required
Note: Optional
Type: Optional
Number: Optional
Month: Optional
Year: Required
Institution: Required
Address: Optional

Unpublished

Author: Required
Month: Optional
Note: Required
Title: Required
Year: Optional
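
The required-field rules above lend themselves to a mechanical check before an entry is added to a bibliography database. The following Python sketch covers only a few of the entry types, transcribed from the lists above. The names REQUIRED and missing_fields are hypothetical, and this is not how BibTeX itself reports missing fields.

```python
# Required fields for a few entry types, transcribed from the lists above.
# Each inner tuple is a group of alternatives: at least one must be present.
REQUIRED = {
    "article": [("title",), ("author",), ("journal",), ("year",)],
    "book": [("title",), ("author", "editor"), ("publisher",), ("year",)],
    "inbook": [("title",), ("author", "editor"), ("chapter", "pages"),
               ("publisher",), ("year",)],
    "techreport": [("author",), ("title",), ("institution",), ("year",)],
    "unpublished": [("author",), ("title",), ("note",)],
}


def missing_fields(entry_type: str, fields: dict) -> list:
    """Return a description of each required field group the entry lacks."""
    # Field names are case-insensitive; empty values count as absent.
    present = {name.lower() for name, value in fields.items() if str(value).strip()}
    groups = REQUIRED[entry_type.lower()]
    return [" or ".join(group) for group in groups if not present.intersection(group)]


entry = {
    "title": "Power laws in software",
    "journal": "ACM Transactions on Software Engineering and Methodology",
    "year": "2008",
}
print(missing_fields("article", entry))  # the author field is missing
```

Representing "Required or include Editor" as a group of alternatives keeps the either/or rules for Book and InBook in the same table as the simple ones.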

Citations in the Text

There are a number of established forms for referencing a citation in the publication text. The reference should be unambiguous and the format used should be consistent. Some popular styles include:
[Author-Initial(s)Year]
as in [Spi97] (single author) or [WKS82] (multiple authors) or [Knu88b] (multiple works by the same author in the same year).
[Number]
as in [12]. Citations are then numbered by order of occurrence in the document or by the order in which they appear when sorted by author name.
Superscript number
as in text¹², numbered as described in the previous case.
Author (year)
as in Spinellis (1997) or Kernighan and Ritchie (1978), or (Knuth 1981). Append a lowercase letter (a, b, c) for multiple works by the same author in the same year. The format Author (year) is used in narrative form, as used by Knuth (1983), while the format (Author year) is used when the reference is outside the flow of the text (Knuth 1983). We recommend against using this reference style, as it confuses bibliographic tools without offering any significant benefits.
Choose the format appropriate for the publication you are writing for and use it consistently. We prefer the first format, as it helps us identify the reference in the text and is efficiently supported by BibTeX.
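
As a rough illustration of how labels in the first format are put together, here is a minimal Python sketch. The function name alpha_label is hypothetical, and the rule is inferred only from the examples above (three letters of a single author's surname, or the surname initials of multiple authors, followed by a two-digit year). BibTeX's own alpha style applies further rules, for instance for long author lists, that are not reproduced here.

```python
def alpha_label(surnames: list, year: int, suffix: str = "") -> str:
    """Build a label such as Spi97 from author surnames and a year."""
    if len(surnames) == 1:
        # Single author: the first three letters of the surname.
        stem = surnames[0][:3]
    else:
        # Multiple authors: the initial of each surname.
        stem = "".join(name[0].upper() for name in surnames)
    # Two-digit year, plus a letter to separate works from the same year.
    return f"{stem}{year % 100:02d}{suffix}"


print(alpha_label(["Spinellis"], 1997))             # Spi97
print(alpha_label(["Kernighan", "Ritchie"], 1978))  # KR78
print(alpha_label(["Knuth"], 1988, "b"))            # Knu88b
```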

Citation Formats

Citations to Electronic Data

Citations to data that is available in electronic format should follow the guidelines for traditional formats, appending at the end the following:

It is generally preferable to cite traditional sources over Internet pages, as the latter tend to be rather volatile. When you do cite material on the web, archive it using WebCite (http://www.webcitation.org/63NhkqLsO).

Examples:

Advice for Writing BibTeX Entries

Many electronic libraries provide the ability to export a reference in BibTeX format. However, these references often contain errors and style bugs. Before incorporating a reference into your database ensure that the following hold.

Examples

Tools and Links

I manage bibliography lists and automatically create citations using BibTeX, a companion program for the LaTeX text-processing system. LaTeX, BibTeX and instructions can be found on CTAN: the Comprehensive TeX Archive Network (http://www.ctan.org/). Extensive bibliography lists in BibTeX format are maintained on many Internet sites such as the Networked Computer Science Technical Reports Library (http://www.ncstrl.org/). For the occasions when I am forced to use Microsoft Word, I have developed a set of BibTeX styles that create Microsoft Word RTF files. More information is also available from Dana Jacobsen's Survey of Bibliographic Tools (http://www.ecst.csuchico.edu/~jacobsd/bib/tools/index.html). Some other tools that you may wish to examine are ProCite, EndNote, Reference Manager, RefViz, and WriteNote.

References

Preparing a Poster Presentation

How to Set Up a Brilliant Poster Stand

Elements that make up a good poster stand:

Drafting the Poster Advice

Poster Examples: Good Use of Color

The following are some good examples of posters that use color to stand out from the crowd. Click on the images for a larger version.

Poster - 07.07.2004
(EURO XX 2004)

Poster - 07.07.2004
(EURO XX 2004)

18.11.2004
(MSAD 2004)

Poster Examples: Liberal Use of Graphical Elements

The following are some good examples of posters that use graphics to bring their message across. Click on the images for a larger version.

Poster - 07.07.2004
(EURO XX 2004)

Poster - 07.07.2004
(EURO XX 2004)

18.11.2004
(MSAD 2004)

ICSE 2006
(ICSE 2006)

ICSE 2008
(ICSE 2008)

Poster Examples: Graphics that Tell a Story

The following are some good examples of posters that use graphics to guide the reader through the story. Click on the images for a larger version. See also the interesting layout examples.

Poster - 07.07.2004
(EURO XX 2004)

Poster - 07.07.2004
(EURO XX 2004)

Poster - 07.07.2004
(EURO XX 2004)

Poster Examples: Interesting Layout

The following are some good examples of posters with an interesting layout. Click on the images for a larger version.

Poster - 07.07.2004
The layout is self-referential. The work describes the use of Voronoi diagrams (http://en.wikipedia.org/wiki/Voronoi_diagram) for vote districting, and the poster layout follows the same scheme. This poster won the conference's best poster award. (EURO XX 2004)

18.11.2004
(MSAD 2004)

ICSE 2006
This slide (although a bit crowded) uses numbering and nested layouts to guide the reading order. (ICSE 2006)

Poster Examples: Adding Hardware

The following are some good examples of posters that have additional "hardware" elements pasted on them. Click on the images for a larger version.

Poster - ICSE 2006
The slide includes a folder with copies of it, and an area for adding comments. (ICSE 2006)

Poster - ICSE 2006
Another approach for collecting comments: a notepad and a pen hanging on a pin (ICSE 2006)

Poster - ICSE 2006
This slide has copies and business cards attached to a paper fastener. (ICSE 2006)

Poster - ICSE 2008
This slide includes a blank area where a projector shows a demo of the software. (ICSE 2008)

Poster Examples: Making do Without a Big Format Printer

You don't necessarily need access to a big-format printer to create a good poster. If you find yourself stranded without suitable hardware (e.g. wanting to create a poster on the spot at the conference) you can improvise by assembling printed A4 sheets, or even by writing and drawing your poster on flip-chart paper. Here are two examples. Click on the images for a larger version.

Poster - SPLASH 2012
This poster is assembled mainly from printed A4 sheets, placed in an interesting pattern. (SPLASH 2012)

Poster - SPLASH 2012
This poster is written by hand on flip-chart and A4 sheets. The placement of the A4 sheets is used to indicate how they relate to the larger flip-chart sheets. (SPLASH 2012)

Poster Counterexamples

The following are some counterexamples of the poster design techniques I described. To protect the guilty, you cannot click on the images for a larger version.

Poster - 24.05.2007
The paper's pages printed on an A0 sheet

Poster - 24.05.2007
The paper typeset and printed on an A0 sheet; this is worse than the previous example, because our eye can't follow printed lines spanning half a metre.

Poster - 07.07.2004
Too much text, set too small; not enough color (EURO XX 2004)

Poster - 07.07.2004
Too much information (text and small diagrams), glossy paper (EURO XX 2004)

Poster - 07.07.2004
Too much text, set too small (EURO XX 2004)

Stand Examples

The following are some good examples of stands following the guidelines I described. Click on the images for a larger version.

Poster - 07.07.2004
The perfect setup: poster, book, physical demo, laptop, presentation. The board allows the demonstration of Voronoi diagrams (http://en.wikipedia.org/wiki/Voronoi_diagram) using nails and rubber bands. This poster won the conference's best poster award. (EURO XX)

18.11.2004
A live hardware demo (MSAD 2004)

18.11.2004
Closeup of the live hardware demo (MSAD 2004)

18.11.2004
The perfect packing for the demo (MSAD 2004)

Packing List

Bring with you the following items:

Don't forget to update your web site before you leave. Keep copies of the promotional material (leaflet, publication) in a web-accessible directory. You may find them useful if you need to print additional copies from an Internet cafe or a public PC room.

PhD Student and Supervisor Resources

A Reading List for PhD Students (and their Supervisors)

PhD Student Achievements

This page contains significant achievements of PhD students I supervise.
26 February 2009
The book Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design (http://oreilly.com/catalog/9780596517984/) (O'Reilly, 2008, ISBN 9780596517984) co-edited by Georgios Gousios obtained the top rank in the amazon.co.uk Software Architecture books category. The book's royalties are donated to the international humanitarian aid organisation Médecins Sans Frontières (http://www.msf.org/).
February 2009
Vassilis Karakoidas is awarded funding for his research through AUEB's Funding Programme for Basic Research (PEVE).
January 2009
Dimitris Mitropoulos publishes a paper in a high-impact journal: SDriver: Location-specific signatures prevent SQL injection attacks. Computers and Security, 2009.
October 13th, 2008
Vasileios Vlachos is elected Lecturer in the Department of Computer Science and Telecommunications at the Technological Educational Institution of Larissa. The subject of his post is the Development and Security of Internet Applications.
September 2008
Vasileios Vlachos publishes a paper he co-authored during his PhD study in a top-impact journal: Power laws in software. ACM Transactions on Software Engineering and Methodology, 18(1):1–26, September 2008. Article 2.
June 2008
Highly Commended Paper distinction. The paper A PRoactive Malware Identification System based on the Computer Hygiene Principles (Information Management and Computer Security, 15(4):295-312, 2007) co-authored by Vasileios Vlachos was awarded (http://info.emeraldinsight.com/authors/literati/awards.htm?jr=imcs) by Emerald (http://www.emeraldinsight.com/) publishers with the "Highly Commended Paper" distinction. The award was given by the journal's editorial board to three papers as part of the "Literati Network Awards for Excellence 2008".
May 2008
Vassilis Karakoidas publishes a paper in a high-impact journal: FIRE/J — optimizing regular expression searches with generative programming. Software: Practice & Experience, 38(6):557–573, May 2008.
November 17th, 2007
Konstantinos Chorianopoulos is elected Lecturer in the Department of Informatics at the Ionian University.
July 24th, 2007
Vasileios Vlachos successfully defends his PhD thesis.
April 2006
The paper A survey of peer-to-peer content distribution technologies (ACM Computing Surveys, 36(4):335–371, December 2004) co-authored by Stephanos Androutsellis Theotokis obtained the top yearly download rank in the ACM's digital library popular magazine and computing surveys articles category.
ACM Digital Library download ranks
Table from the Communications of the ACM Volume 49, Number 4 (2006), Pages 29-30.
January 2006
Stephanos Androutsellis Theotokis, Georgios Gousios, and Konstantinos Stroggylos are awarded a scholarship through the framework of the "Reinforcement Programme of Human Research Manpower" (PENED) co-financed by National and Community Funds (25% from the Greek Ministry of Development-General Secretariat of Research and Technology and 75% from E.U.-European Social Fund).
December 2004
Stephanos Androutsellis Theotokis publishes a paper in a top-impact journal: A survey of peer-to-peer content distribution technologies. ACM Computing Surveys, 36(4):335–371, December 2004.
July 2004
Vasileios Vlachos is awarded a scholarship co-funded by the European Social Fund and National Resources - EPEAEK II - IRAKLITOS Fellowships for research of Athens University of Economics and Business.
June 2004
Vasileios Vlachos and Stephanos Androutsellis Theotokis publish a paper in a high-impact journal: Security applications of peer-to-peer networks. Computer Networks, 45(2):195–205, June 2004.
May 10th, 2004
Konstantinos Chorianopoulos successfully defends his PhD thesis.
May 2004
Konstantinos Chorianopoulos publishes a paper in a high-impact journal: User interface development for interactive television: Extending a commercial DTV platform to the virtual channel API. Computers & Graphics, 28(2):157–166, April 2004.
April 2004
Vasileios Vlachos coordinated the student team that won third place in the national phase of the Microsoft Imagine Cup 2004 competition.
December 2004
The paper A survey of peer-to-peer content distribution technologies (ACM Computing Surveys, 36(4):335–371, December 2004) co-authored by Stephanos Androutsellis Theotokis obtained the top monthly download rank in the ACM's digital library popular magazine and computing surveys articles category.
ACM Digital Library download ranks
Table from the Communications of the ACM Volume 48, Number 9 (2005), Pages 29-30.
June 16th, 2002
Konstantinos Raptis successfully defends his PhD thesis.
June 2000
Konstantinos Raptis publishes a paper in a high-impact journal: Component mining: A process and its pattern language. Information and Software Technology, 42(9):609–617, June 2000.

The PhD Game

0. BEGINNING. Throw away sanity to start.
1. Your supervisor gives you project title. Go on 3 spaces.
2. (blank)
3. You are full of enthusiasm. Have another turn.
4. Realise supervisor has given nothing but project title.
5. Go to library - you can't understand catalogue! Miss one turn.
6. The important reference has gone missing in the library. Back 2 spaces.
7. (blank)
8. Need supervisor's help. Miss one turn finding her.
9. Supervisor makes a comment you don't understand. Go back two spaces.
10. Do extra work on first year report. Extra turn.
11. Examiners not impressed by first year report. Throw 1 to continue.
12. END OF FIRST YEAR.
13. Things don't go well. You become disillusioned. Miss one turn.
14. (blank)
15. You become depressed. Miss 2 turns.
16. You become more depressed. Miss 3 turns.
17. Change project. Go back to beginning.
18. Change supervisor. Throw 6 to continue. Otherwise go back 6 spaces.
19. Do lab demonstrations to get some dosh. Go on 2 spaces.
20. (blank)
21. Lab demos take up too much of your time. Back 4 spaces.
22. (blank)
23. Specimens incorrectly labelled. Go back to 20.
24. Experiments are working. Go on 4 spaces.
25. END OF SECOND YEAR. No results. Who cares?
26. Work every weekend for two months. Go on 6 spaces.
27. Beer monster strikes. Spend 1 turn recovering.
28. You begin to think you will never finish. You are probably right.
29. (blank)
30. You spend more time complaining than working. Miss 1 turn.
31. You realise your mates are earning 5 times your grant. Have a good cry.
32. You are asked why you started a PhD. Miss a turn finding a reason.
33. You are offered a job. You may continue or retire from game.
34. Start writing up. Now you are really depressed. Miss 5 turns.
35. (blank)
36. Your data has just been published by rival group. Go back to 28.
37. Your thesis will disprove external examiner's work. Go back to 28.
38. It proves impossible to write up and work. Go to 33.
39. Hard disk crashes. Back 3 spaces.
40. You decide PhD isn't worth the bother. Withdraw now. Game over.
41. You are asked to resubmit thesis. Back to 33.
42. Your PhD is awarded. Congratulations, now join dole queue!

*Matrix by somebody at the Jenner Institute (http://www.jenner.ac.uk/), who deserves the copyright, with minor modifications by Kohei Watanabe.

The Nine Types of Principal Investigators

From The NIH Catalyst, Volume 3, page 23.

Delaying Higher Degree Completion

Collated by Diana Bental (D.Bental@lancaster.ac.uk) with the help of contributions from many PhD students, past and present.

Purpose

This document is intended for supervisors of students registered for higher degrees in just about any University department anywhere.

Though the presentation of the document is deliberately light in tone, the contents are based on a collation of feedback from a fairly large number of students currently pursuing their studies for a Ph.D.

Acknowledgement

This document is extremely close to that published in the AISB Quarterly (No. 80, Summer 1992), the quarterly magazine of the Society for the Study of Artificial Intelligence and the Simulation of Behaviour. It has been slightly edited by Paul Brna.

"All the information here has in fact been contributed by PhD students, past and present. Much of what is written here has been exaggerated for effect, but it is all based on students' real experiences and some of it is no more than a literal description of what has happened to them." (page 60, AISBQ No 80, Summer 1992)

Many PhD students have collaborated to provide the insights that are found within. Our thanks go to them, and to those that helped in pulling the contributions together into such a formidable body of knowledge.

Thesis Prevention: Advice to Supervisors

As you will be aware, Professor Hacker in his wisdom supervises a great many Higher degree students. Prof Hacker is currently angling for research money for his Automated Thesis Adviser, and it is his aim that no student of his should do anything which requires any input from him until he has obtained the grant for, researched, developed and completed the Automated Thesis Adviser which will replace him.

Clearly, it is not easy to prevent reasonably intelligent and mildly motivated students (such as ourselves) from producing useful work. Nevertheless, he has developed some excellent techniques for Thesis Prevention which we feel may be of use to others, and which we, Professor Hacker's research students, present here for your enlightenment and entertainment. If you, as a supervisor, wish to prevent your students from researching and writing up a thesis, or indeed doing anything useful at all, we hope you will take inspiration from Prof. Hacker's example.

On Arrival: Settling In

Try to be away when the student arrives. Out of the country is preferable, but in today's economic climate Prof. Hacker acknowledges that it is also acceptable to be merely in another city. In this case, your student cannot try to set up any kind of regular contact with you, and will be forced to become independent of you early on.

Supervisions

Initially, Prof. Hacker attempted to shelve the whole problem of supervisions by simply refusing to see his students at all. He would smile at them on his way out of the tea room, realising that this was as much supervision as any student could expect, especially if he occasionally discussed the weather with them when meeting in the corridor. He was forced to drop this approach when his department laid down some guidelines which insisted that supervisors should actually sit down in the same room as students every few weeks and discuss the students' work. This was only a temporary setback to the intrepid Prof. Hacker, of the sort that spurs a good researcher on to new heights. It was at this point that he made some stunning discoveries about how to use these meetings to achieve depths of demotivation previously beyond human imagining.

Basic Etiquette

Here are some guidelines which, if adhered to strictly for even quite a short time, will convey the desired message to the student: a student's work is unimportant, uninteresting and not worth anybody's time, not even their supervisor's.

Punctuality

Arrive late for all appointments with the student. If you can't manage that, then be occupied in some long and complicated task when the student arrives and be sure to finish the task before turning your attention to the student.

Concentration

Encourage interruptions. Do not cut callers short with the rude statement that you are in a meeting. Never re-route telephone calls. Ask the secretaries to route all their calls through your office as a return favour for all those times you've re-routed your calls. Encourage your head of department, researchers from overseas and your three-year-old child to call at these times. Make any outgoing calls that you suddenly realise are necessary.

If supervisions are held in your office (and they needn't be) make sure that you have a keyboard handy. This is so that you can, in the middle of any detailed explanations that your student may indulge in, reach for the keyboard and read your mail. Prof. Hacker likes to get his workstation to emit distracting beeps at random intervals.

Reliability

Cancel meetings frequently on the flimsiest pretexts that you can. Do not ever tell students that the meeting is cancelled, but let them come prepared for a supervision and find the room empty. (If the students have prepared for the supervision, that is 90 per cent of the benefit anyway, so don't feel that you are depriving them.)

The Group Supervision

Try as far as possible to conduct the supervision of several students simultaneously. The students can talk to each other, thus decreasing your need to contribute. If they are all working on unrelated projects and share no common terminology, their attempts to hold a useful discussion should provide you with much diversion.

Productivity is increased even further if this is done as a lunch-time exercise. After all, you have to eat sometime, and if you can do this and fulfill your obligations to your students at the same time, so much the better.

Prof. Hacker warns that only experienced supervisors should attempt simultaneous supervision of more than two students. Note also that fewer than two is really not cost effective and in this case you should try to turn up as late as possible, grab your lunch and be busy eating for most of the next 15 minutes (which is the recommended duration for such supervisions).

Preparing for Supervisions

Do not prepare for any supervision. If you have an excellent memory, know all the background to the student's project and see the student often, then this technique will not help you. But if not, then your failure to take note of what the student has been doing and/or your failure to look back over your notes will enable you to start each supervision from scratch, requiring the student to explain and justify every step of background to their work before they can discuss any real problems with you. Do this one well enough and you will never have to discuss any real technical problems with your student.

Content of Supervisions

You will find that you are expected to talk during supervisions. Prof. Hacker prefers to avoid the strain of listening critically to students' ideas, and still more to avoid the strain of thinking up helpful and detailed advice.

Avoid at all times any discussion of practical possibilities. Inspire the student by using supervision meetings to soliloquise on all your vaguest and most esoteric ideas, particularly on philosophical issues. Tell lengthy anecdotes to illustrate a point which the student will have forgotten by the time you finish.

Your students will also expect you to respond to their ideas. Prof. Hacker has demonstrated that three quite different techniques may be expected to produce the same effect.

  1. Always agree with any suggestions a student makes. At first, this will boost their confidence beyond their wildest expectations which means they won't come back for supervision for a long time. The next time you use this approach they will become suspicious that whatever they say is enthusiastically accepted, however ludicrous, so they won't come for supervision since they don't trust you.

  2. Always disagree with what the student says. This is more dangerous since it is confrontational and so should only be attempted by persons of large stature or with a black belt in an appropriate martial art. It is a good way of ridding yourself of students, with the added possibility of unlimited earnings from suing for assault. If you have been unable to prevent a student from progressing deeply into a thesis, you can discourage the student by commenting only on the weak aspects of the work and assuming that the student will know, perhaps by psychic projection, that you think the rest is good.

  3. Maintain a strict neutrality to avoid unfairly influencing the student. This is far less obvious than either of the two previous approaches and it still frees you from having to think about what the student is doing. Never give clear approval or disapproval of any ideas the student comes up with, so that they don't know if the idea should be followed up or abandoned. The student, unlike you, is unfamiliar with doing a Higher Degree, so it would be unfair to bias their ideas of what is appropriate.

Research Guidance

Directing the Area and Scope of your Students' Research

A good way to prevent your students from doing any useful research is to ensure that they choose the right topic. An ideal topic is one that the student isn't interested in, and that the supervisor knows nothing about. Prof. Hacker is especially pleased with a topic if the department lacks the facilities required to pursue it, and if any results are likely to be inconclusive.

The department accidentally played right into Prof. Hacker's hands when it instigated the requirement that students submit a thesis proposal at the end of their first year. A feebler supervisor would have given in and tried to ensure that students produce a detailed and well thought out proposal by this deadline. Prof. Hacker is made of sterner stuff. By following techniques given in this section and the reading techniques given in the section below throughout his students' first year, Prof. Hacker was able to use this deadline to panic his students into choosing the right sort of topic - for his purposes.

Discourage students from following up their initial interests. Post-graduate work is a chance to explore new areas! Suggest subject areas that they know nothing about, so that they spend a year or two trying to understand an area before they find out that it's not worth the trouble to pursue.

Suggest that the student should apply a promising technique to a useless area, such as applying termination proof theory to Cobol programs.

Suggest that students should research a "related" area to their current research since the two areas share a common word in their titles, even though they are light years apart (see the following section on reading). This could set them on the wrong track for years.

Wait a year or two and then find a good reason why it would be pointless for the student to continue their current line of research. Refer them to the paper that reports someone else having done the work they intend to do, or explain that the equipment or facilities that the project depends upon will be unavailable. Remember that just because you know that a research group of thirty staff is working on a topic that your student is investigating alone, or that your equipment bid is unlikely to be funded, you don't have to tell the student immediately. You wouldn't want to discourage them, after all.

Finally, gild the lily. Prof. Hacker is delighted to report that having been initially sceptical about a student's choice of project and having suggested that the student spend several months preparing some alternative proposals, he was able to inform the student that the student's original proposal was indeed the best.

Directing the Student's Reading

Guidance on reading is vital. Prof. Hacker's aim is to ensure that his students' reading lists increase in length exponentially.

If ever a student raises an interesting point that Prof. Hacker fears might lead to a technical discussion, he exclaims "Ah yes, you really must read what Whizzbang and Genius have to say about that in their theses at the University of Obscurity, Darkest Peru in about, oh, 1965". He makes it quite clear that there is no point discussing the topic further until the student has read the vital reference (or better still, five or six of them).

The choice of reference material should be guided by a generate and test procedure. Prof. Hacker generates appropriate reference material by looking for titles that share a common word with the student's topic regardless of context. He filters out inappropriate references by making sure that all the references he gives are very hard to dig out. (Never actually produce one to lend to your students, for students are independent researchers who must not be spoon-fed.) Prof. Hacker prefers to mention theses done in remote corners of the world and of 1960s or 1970s vintage.

The consistent application of these guidelines should put the student into a sufficiently desperate state that they will settle on a completely inappropriate topic when they have to write their thesis proposal (as discussed in the previous section).

Writing Up

If your students get this far (and if you follow all our guidelines strictly, we trust that your students will not), you will need to assist your students and ensure that they never finish writing up. Prof. Hacker takes care to identify every concept referred to in his students' work and reminds students that there must be a background chapter on each concept in the thesis, with an accompanying related work section. As the number of these chapters grows (we suspect factorially, see our appendix on complexity theory), the task becomes very off-putting. If a student actually attempts the task it is guaranteed to produce a nervous breakdown, as each of these background chapters then requires further elaboration in itself, and so on recursively.

Reading

Students will expect that you will read technical papers that they have written, however badly worded, boring and pointless they may be. There are two main approaches to preventing students from giving you things to read. Applied with sufficient vigour they may prevent your student from ever writing anything at all.

  1. Do not write comments on anything that the student has written. This conveys the impression that you have not read the paper without providing the student with any concrete evidence that could be used against you. You can make verbal comments. These give the impression that yes, you did read the paper but you found it too pointless to be worth searching for a pencil. If you wish to convey the impression that you read the paper with pencil in hand and thought nothing of the contents, you can simply dip into the middle of the paper and correct a minor grammatical error.

  2. Allow several months to elapse before reading (or claiming to have read) anything the student writes. This is risky with drafts of conference papers which may have a deadline for submission, but is an adequate way to deal with thesis chapters, thesis proposals and other half-baked nonsense. This method is especially useful when applied to something that you have asked the student to write.

A caveat: we suggest that beginners apply only one of these two techniques to any one piece of work. Only the experienced can apply both and maintain a balance which will not cause an explosion.

Prof. Hacker occasionally takes a more subtle approach, in which comments are always written but are content free or (better still) ambiguous, thus leaving the student with the work of incorporating the wrong ideas into their paper.

If Prof. Hacker makes any comments on style or content, e.g. that some sentence should be re-written in a particular way, Prof. Hacker tries to remember to reverse the comments in the next draft. This can be applied ad infinitum, or at least until he forgets to do it.

Publications

Prof. Hacker believes that it is an excellent idea for a supervisor to add his or her name to all of a student's published work. This is justified for two reasons. Firstly, you are doing the student a favour because you are a more famous researcher and therefore your name as co-author will mean that the paper is more likely to be accepted. Secondly, you are the student's supervisor and therefore you are naturally the inspiration for everything the student publishes. This will delight the student even further if you have been practising all the other techniques proposed here, especially those suggested in the section on reading.

Extra-Mural Activities

All supervisors should encourage their students to make contacts in other institutions and to broaden their range of interests. The ideal way to do this is to ask students to organise a conference or workshop, preferably on a topic unrelated to their thesis work.

Conclusions

We believe that we have gathered together a collection of techniques that will be of widespread use in the slowing down and prevention of the production of theses for Higher Degrees. We have emphasised the many ways in which a supervisor can contribute, and the great variety of approaches to the prevention of theses. Finally, we would like to think that Professor Hacker's supervision techniques were unique to him but we fear that they are not.

A Letter Regarding Attendance Time

And, yes, the recipient was a former student (http://www.carreira.ethz.ch/people/former_members).

(From the web page of Jinghai Rao (http://www.cs.cmu.edu/~jinghai/), brought to my attention by Vassilis Prevelakis (http://vp.cs.drexel.edu/).)

189 Things (Not) to Do at or for your Thesis Defense (in no particular order)

From: mnsotn#NoSpam.picard.cs.wisc.edu (Christopher Bovitz)
From The NIH Catalyst, Volume 3, page 23.


Written by Peter Dutton, Jim Lalopoulos, Alison Berube, and Jeff Cohen,
grad students extraordinaire (#1-101).
Appended by Chris Bovitz, grad student grandioso (#102-131).
(#132 from Mary C. Liles).
Patricia Whitson and a few others (#130-...)
  1. "Ladies and Gentlemen, please rise for the singing of our National Anthem..."
  2. Charge 25 cents a cup for coffee.
  3. "Charge the mound" when a professor beans you with a high fast question.
  4. Interpretive dance.
  5. "Musical accompaniment provided by..."
  6. Stage your own death/suicide.
  7. Lead the spectators in a Wave.
  8. Have a sing-a-long.
  9. "You call THAT a question? How the hell did they make you a professor?"
  10. "Ladies and Gentlemen, as I dim the lights, please hold hands and concentrate so that we may channel the spirit of Lord Kelvin..."
  11. Have bodyguards outside the room to "discourage" certain professors from sitting in.
  12. Puppet show.
  13. Group prayer.
  14. Animal sacrifice to the god of the Underworld.
  15. Sell T-shirts to recoup the cost of copying, binding, etc.
  16. "I'm sorry, I can't hear you - there's a banana in my ear!"
  17. Imitate Groucho Marx.
  18. Mime.
  19. Hold a Tupperware party.
  20. Have a bikini-clad model be in charge of changing the overheads.
  21. "Everybody rhumba!!"
  22. "And it would have worked if it weren't for those meddling kids..."
  23. Charge a cover and check for ID.
  24. "In protest of our government's systematic and brutal oppression of minorities..."
  25. "Anybody else as drunk as I am?"
  26. Smoke machines, dramatic lighting, pyrotechnics...
  27. Use a Super Soaker to point at people.
  28. Surreptitiously fill the room with laughing gas.
  29. Door prizes and a raffle.
  30. "Please phrase your question in the form of an answer..."
  31. "And now, a word from our sponsor..."
  32. Present your entire talk in iambic pentameter.
  33. Whine piteously, beg, cry...
  34. Switch halfway through your talk to Pig Latin. Or Finnish Pig Latin.
  35. The Emperor's New Slides ("only fools can't see the writing...")
  36. Table dance (you or an exotic dancer).
  37. Fashion show.
  38. "Yo, a smooth shout out to my homies..."
  39. "I'd like to thank the Academy..."
  40. Minstrel show (blackface, etc.).
  41. Previews, cartoons, and the Jimmy Fund.
  42. Pass the collection basket.
  43. Two-drink minimum.
  44. Black tie only.
  45. "Which reminds me of a story - A Black guy, a Chinese guy, and a Jew walked into a bar..."
  46. Incite a revolt.
  47. Hire the Goodyear Blimp to circle the building.
  48. Release a flock of doves.
  49. Defense by proxy.
  50. "And now a reading from the Book of Mormon..."
  51. Leave Jehovah's Witness pamphlets scattered about.
  52. "There will be a short quiz after my presentation..."
  53. "Professor Robinson, will you marry me?"
  54. Bring your pet boa.
  55. Tell ghost stories.
  56. Do a "show and tell".
  57. Food fight.
  58. Challenge a professor to a duel. Slapping him with a glove is optional.
  59. Halftime show.
  60. "Duck, duck, duck, duck... GOOSE!"
  61. "OK - which one of you farted?"
  62. Rimshot.
  63. Sell those big foam "We're number #1 (sic)" hands.
  64. Pass out souvenir matchbooks.
  65. 3-ring defense.
  66. "Tag - you're it!"
  67. Circulate a vicious rumor that the Dead will be opening, making sure that it gets on the radio stations, and escape during all the commotion.
  68. Post signs: "Due to a computer error at the Registrar's Office, the original room is not available, and the defense has been relocated to (made-up non-existent room number)"
  69. Hang a pinata over the table and have a strolling mariachi band.
  70. Make each professor remove an item of clothing for each question he asks.
  71. Rent a billboard on the highway proclaiming "Thanks for passing me Professors X,Y, and Z" - BEFORE your defense happens.
  72. Have a make-your-own-sundae table.
  73. Make committee members wear silly hats.
  74. Simulate your experiment with a virtual reality system for the spectators.
  75. Do a soft-shoe routine.
  76. Throw a masquerade defense, complete with bobbing for apples and pin-the-tail-on-the-donkey.
  77. Use a Greek Chorus to highlight important points.
  78. "The responsorial psalm can be found on page 124 of the thesis..."
  79. Tap dance.
  80. Vaudeville.
  81. "I'm sorry Professor Smith, I didn't say 'SIMON SAYS any questions?'. You're out."
  82. Flex and show off those massive pecs.
  83. Dress in top hat and tails.
  84. Hold a pre-defense pep rally, complete with cheerleaders, pep band, and a bonfire.
  85. Detonate a small nuclear device in the room. Or threaten to.
  86. Shadow puppets.
  87. Show slides of your last vacation.
  88. Put your overheads on a film strip. Designate a professor to be in charge of turning the strip when the tape recording beeps.
  89. Same as #88, but instead of a tape recorder, go around the room making a different person read the pre-written text for each picture.
  90. "OK, everybody - heads down on the desk until you show me you can behave."
  91. Call your advisor "sweetie".
  92. Have everyone pose for a group photo.
  93. Instant replay.
  94. Laugh maniacally.
  95. Talk with your mouth full.
  96. Start speaking in tongues.
  97. Explode.
  98. Implode.
  99. Spontaneously combust.
  100. Answer every question with a question.
  101. Moon everyone in the room after you are done.
  102. Rearrange the chairs into a peace symbol.
  103. Refer to yourself in the third person, like Julius Caesar did.
  104. Mention your professor as "my helper."
  105. Say that you'd like to thank a few people. Pull out the White Pages. Start reading.
  106. Advertise it as "pot luck".
  107. Talk in Klingonese.
  108. Dress like your favorite character from "Star Trek".
  109. Ask imaginary helpers to change transparencies; fly off the handle when they don't.
  110. Wear a trenchcoat. And nothing else.
  111. Dress in a Wild West style.
  112. Go dressed in scuba gear. Use the oxygen tank.
  113. Preface with the story of your life.
  114. Wear a swimsuit from the opposite sex: man - wear a bikini, woman - wear trunks.
  115. Have bodyguards on your sides as you talk. The bigger, the better. Have a questioner thrown out "as an example."
  116. Have someone wheel in a big cake with you in it. Jump out and begin.
  117. Perform your defense as a Greek tragedy, kill yourself offstage when you're done.
  118. Half way through, break down. Go to your professor, curl up on his or her lap and call him or her "Mommy". Suck your thumb.
  119. Suddenly develop Tourette's Syndrome.
  120. Suddenly develop the China Syndrome.
  121. "This defense has been sponsored by the fine people at (your favorite corporation)..."
  122. Secede from the U.S. Give yourself political asylum.
  123. Talk in Canadianese - add an "eh" after every sentence.
  124. When a professor asks you a question, argue with your imaginary twin over the final answer.
  125. Videotape it ahead of time, and get someone to set it up to show. Come in the back and sit there. When your tape is done, ask for questions. In person.
  126. Have every person pick a "CB" handle. Enforce their usage. Talk in CB lingo. End every statement with "good buddy." End every question with "over."
  127. Provide party favors. Noisy ones.
  128. Frequently ask if anyone has to go to the potty.
  129. Mention that you have to hurry because "Hard Copy" is on in 15 minutes.
  130. Dress like your school mascot.
  131. Urge your committee that if they like your defense enough to tell two friends, and then they'll tell two friends, and so on, and so on...
  132. Show up in drag accompanied by the Drag Queens you met at last night's performance and proclaim your thesis presentation will instead discuss: "Blue Eyeshadow: Our Friend Or Foe?"
From: smitch#NoSpam.alcor.concordia.ca (Sidney N. Mitchell)
  133. Plead the fifth amendment if you can't answer a question.
  134. Keep your back to the committee during the presentation and defense phases.
  135. Answer only questions that begin with sir and end with sir. (Tell your committee this beforehand.)
  136. Limit the number of questions that you will allow, and then when the limit is almost reached, go into aerobics terminology... four more...three more...two more..and...rest.
  137. Ignore the committee and say "I think that young man/lady at the back has a question".
  138. Have your parents call your committee members repeatedly the week before your defense to tell them how expensive it is putting a child through graduate school etc.
  139. At the defense, have your parents sit directly behind your committee.
  140. Burp, pass gas, scratch (anywhere repeatedly), and pick your nose.
  141. "Laugh, will you? Well, they laughed at Galileo, they laughed at Einstein..."
  142. Hand out 3-D glasses.
  143. "I'm rubber, you're glue..."
  144. Go into labor (especially for men).
  145. Give your entire speech in a "Marvin Martian" accent.
  146. "I don't know - I didn't write this."
  147. Before your defense, build trapdoors underneath all the seats.
  148. Swing in through the window, yelling a la Tarzan.
  149. Lock the department head and his secretary out of the defense room. And the coffee lounge, the department office, the copy room, and the mail room. Heck, lock them out of the building. And refuse to sell them stamps. (NOTE: This is an inside gripe, based on conditions that existed in the ME department at WPI while we were there. Sorry.)
  150. Roll credits at the end. Include a "key grip", and a "best boy".
  151. Hang a disco ball in the center of the room. John Travolta pose optional.
  152. Invite the homeless.
  153. "I could answer that, but then I'd have to kill you"
  154. Hide.
  155. Get a friend to ask the first question. Draw a blank-loaded gun and "shoot" him. Have him make a great scene of dying (fake blood helps). Turn to the stunned audience and ask "any other wise-ass remarks?"
  156. Same as #155, except use real bullets.
  157. "Well, I saw it on the internet, so I figured it might be a good idea..."
  158. Wear clown makeup, a clown wig, clown shoes, and a clown nose. And nothing else.
  159. Use the words "marginalized", "empowerment", and "patriarchy".
  160. Play Thesis Mad Libs.
  161. Try to use normal printed paper on the overhead projector.
  162. Do your entire defense operatically.
  163. Invite your parents. Especially if they are fond of fawning over you. ("We always knew he was such an intelligent child")
  164. Flash "APPLAUSE" and "LAUGHTER" signs.
  165. Mosh pit.
  166. Have cheerleaders. ("Gimme an 'A'!!")
  167. Bring Howard Cosell out of retirement to do color commentary.
  168. "I say Hallelujah, brothers and sisters!"
  169. Claim political asylum.
  170. Traffic reports every 10 minutes on the 1's.
  171. Introduce the "Eyewitness Thesis Team". Near the end of your talk, cut to Jim with sports and Alison with the weather.
  172. Live radio and TV coverage.
  173. Hang a sign that says "Thank you for not asking questions"
  174. Bring a microphone. Point it at the questioner, talk-show style.
  175. Use a TelePrompTer.
  176. "Take my wife - please!"
  177. Refuse to answer questions unless they phrase the question as a limerick.
  178. Have everyone bring wine glasses. When they clink the glasses with a spoon, you have to kiss your thesis. Or your advisor.
  179. Offer a toast.
  180. Firewalk.
  181. Start giving your presentation 15 minutes early.
  182. Play drinking thesis games. Drink for each overhead. Drink for each question. Chug for each awkward pause. This goes for the audience as well.
  183. Swoop in with a cape and tights, Superman style.
  184. "By the power of Greyskull..."
  185. Use any past or present Saturday Night Live catchphrase. Not.
  186. Stand on the table.
  187. Sell commercial time for your talk and ad space on your overheads.
  188. Hold a raffle.
  189. "You think this defense was bad? Let me read this list to show you what I COULD have done..."
(FINAL NOTE: Depending on the subject of your thesis, some of these things, such as tap dance, virtual reality, or reading from the Book of Mormon might be entirely appropriate, of course.)

(FINAL FINAL NOTE: Circulate this list freely if you'd like, but please remember to credit Peter, Jim, and Alison as the major authors.)

Recommended C Style and Coding Standards

Author List

L.W. Cannon
R.A. Elliott
L.W. Kirchhoff
J.H. Miller
J.M. Milner
R.W. Mitze
E.P. Schan
N.O. Whittington
Bell Labs

Henry Spencer
Zoology Computer Systems
University of Toronto

David Keppel
EECS, UC Berkeley
CS&E, University of Washington

Mark Brader
SoftQuad Incorporated
Toronto

Diomidis Spinellis
Department of Technology and Management
Athens University of Economics and Business
Athens, Greece
dds@aueb.gr (mailto:dds@aueb.gr)

Introduction

This document is a modified version of a document from a committee formed at AT&T's Indian Hill labs to establish a common set of coding standards and recommendations for the Indian Hill community. The scope of this work is C coding style. Good style should encourage consistent layout, improve portability, and reduce errors. This work does not cover functional organization, or general issues such as the use of gotos. We have tried to combine previous work [1,6,8] on C style into a uniform set of standards that should be appropriate for any project using C, although parts are biased towards particular systems. The opinions in this document do not reflect the opinions of all authors. Please direct comments and suggestions to the last author. Of necessity, these standards cannot cover all situations. Experience and informed judgement count for much. Programmers who encounter unusual situations should consult either experienced C programmers or code written by experienced C programmers (preferably following these rules).

Ultimately, the goal of these standards is to increase portability, reduce maintenance, and above all improve clarity.

Many of the style choices here are somewhat arbitrary. Mixed coding style is harder to maintain than bad coding style. When changing existing code it is better to conform to the style (indentation, spacing, commenting, naming conventions) of the existing code than it is to blindly follow this document. This is particularly relevant when coding Microsoft Windows programs which depend on the Microsoft style of declarations and coding.

``To be clear is professional; not to be clear is unprofessional.'' - Sir Ernest Gowers.

File Organization

A file consists of various sections that should be separated by several blank lines. Although there is no maximum length limit for source files, files with more than about 1000 lines are cumbersome to deal with. The editor may not have enough temp space to edit the file, compilations will go more slowly, etc. Many rows of asterisks, for example, present little information compared to the time it takes to scroll past, and are discouraged. Lines longer than 79 columns are not handled well by all terminals or windows and should be avoided if possible. Excessively long lines which result from deep indenting are often a symptom of poorly-organized code.

File Naming Conventions

File names are made up of a base name, and an optional period and suffix. The first character of the name should be a letter and all characters (except the period) should be lower-case letters and numbers. The base name should be eight or fewer characters and the suffix should be three or fewer characters (four, if you include the period). These rules apply to both program files and default files used and produced by the program (e.g., ``rogue.sav'').

Some compilers and tools require certain suffix conventions for names of files [5]. The following suffixes are required:

The following conventions are universally followed:

In addition, it is conventional to use ``Makefile'' for the control file for make (for systems that support it) and ``README'' for a summary of the contents of the directory or directory tree.

Program Files

The suggested order of sections for a program file is as follows:

  1. First in the file is a prologue that tells what is in that file. A description of the purpose of the objects in the files (whether they be functions, external data declarations or definitions, or something else) is more useful than a list of the object names. The prologue also contains author(s), revision control information, copyright message, references, etc.
    /*
     * bitmap -- Routines that operate on square bitmaps
     *
     * (C) Copyright Yoyodyne Enterprises.  All rights reserved.
     *
     * Author: John Smith
     *
     * $Header$
     *
     */
  2. Any header file includes should be next. If the include is for a non-obvious reason, the reason should be commented. In most cases, system include files like stdio.h should be included before user include files.
  3. Any defines and typedefs that apply to the file as a whole are next. One normal order is to have ``constant'' macros first, then ``function'' macros, then typedefs and enums.
  4. Next come the global (external) data declarations, usually in the order: externs, non-static globals, static globals. If a set of defines applies to a particular piece of global data (such as a flags word), the defines should be immediately after the data declaration or embedded in structure declarations, indented to put the defines one level deeper than the first keyword of the declaration to which they apply.
  5. The functions come last, and should be in some sort of meaningful order. Like functions should appear together. A ``depth-first'' approach (functions defined as soon as possible before their calls) is preferred over a ``breadth-first'' approach (functions on a similar level of abstraction together). Considerable judgement is called for here. If defining large numbers of essentially-independent utility functions, consider alphabetical order.
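
The ordering above could be sketched as follows for a small file. This is only an illustration under our own assumptions: the file's purpose and every name in it (TALLY_MAX, SMALLER, tally_t, tally_add and so on) are hypothetical and do not come from the original standard.

```c
/*
 * tally -- Routines that keep a bounded running total
 *
 * Hypothetical example file; all names are illustrative.
 */

#include <assert.h>             /* system include files first */

#define TALLY_MAX       (1000)  /* ``constant'' macros first */
#define SMALLER(a, b)   ((a) < (b) ? (a) : (b)) /* then ``function'' macros */

typedef int             tally_t;        /* then typedefs and enums */

int                     tally_calls;    /* non-static global */
static tally_t          tally_total;    /* static global */

/*
 * Return a + b, saturating at TALLY_MAX.
 */
static tally_t
saturate(tally_t a, tally_t b)
{
        return SMALLER(a + b, TALLY_MAX);
}

/*
 * Add n (which must not be negative) to the running total
 * and return the new total.
 */
tally_t
tally_add(tally_t n)
{
        assert(n >= 0);
        tally_calls++;
        tally_total = saturate(tally_total, n);
        return tally_total;
}
```

The helper saturate is defined just before its only caller, which is one way to read the ``depth-first'' ordering.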

Header Files

Header files are files that are included in other files prior to compilation by the C preprocessor. Some, such as stdio.h, are defined at the system level and must be included by any program using the standard I/O library. Header files are also used to contain data declarations and defines that are needed by more than one program. Header files should be functionally organized, i.e., declarations for separate subsystems should be in separate header files. Also, if a set of declarations is likely to change when code is ported from one machine to another, those declarations should be in a separate header file.

Avoid private header filenames that are the same as library header filenames. The statement #include "math.h" will include the standard library math header file if the intended one is not found in the current directory. If this is what you want to happen, comment this fact. Don't use absolute pathnames for header files. Use the <name> construction for getting them from a standard place, or define them relative to the current directory. The ``include-path'' option of the C compiler (-I on many systems) is the best way to handle extensive private libraries of header files; it permits reorganizing the directory structure without having to alter source files.

Header files that declare functions or external variables should be included in the file that defines the function or variable. That way, the compiler can do type checking and the external declaration will always agree with the definition.
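
As a rough sketch of why this helps, the fragment below shows the declarations of a hypothetical header counter.h inline, followed by the definitions that a matching counter.c would contain. In a real project the first part would be a separate file brought in with #include "counter.h"; the names counter_step and counter_next are our own inventions.

```c
#include <assert.h>

/* ---- what a hypothetical counter.h would declare ---- */
extern int      counter_step;           /* increment used by counter_next */
extern long     counter_next(void);     /* returns the updated count */

/* ---- counter.c: sees the declarations above, then defines them ---- */
int             counter_step = 2;

static long     count;          /* relies on C's default zero initialization */

/*
 * Advance the count by counter_step and return the new value.
 */
long
counter_next(void)
{
        assert(counter_step > 0);
        count += counter_step;
        return count;
}
```

Because the compiler sees both the declaration and the definition, changing one without the other (say, defining counter_next to return int) is reported as a conflicting declaration instead of passing silently.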

Defining variables in a header file is often a poor idea. Frequently it is a symptom of poor partitioning of code between files. Also, some objects like typedefs and initialized data definitions cannot be seen twice by the compiler in one compilation. On some systems, repeating uninitialized declarations without the extern keyword also causes problems. Repeated declarations can happen if include files are nested and will cause the compilation to fail.

Header files should not be nested. The prologue for a header file should, therefore, describe what other headers need to be #included for the header to be functional. In extreme cases, where a large number of header files are to be included in several different source files, it is acceptable to put all common #includes in one include file.

It is common to put the following into each .h file to prevent accidental double-inclusion.

#ifndef EXAMPLE_H
#define EXAMPLE_H
/* body of example.h file */
/* ...  */
#endif /* EXAMPLE_H */

This double-inclusion mechanism should not be relied upon, particularly to perform nested includes.

Other Files

It is conventional to have a file called ``README'' to document both ``the bigger picture'' and issues for the program as a whole. For example, it is common to include a list of all conditional compilation flags and what they mean. It is also common to list files that are machine dependent, etc.

Comments

``When the code and the comments disagree, both are probably wrong.'' - Norm Schryer

The comments should describe what is happening, how it is being done, what parameters mean, which globals are used and which are modified, and any restrictions or bugs. Avoid, however, comments that are clear from the code, as such information rapidly gets out of date. Comments that disagree with the code are of negative value. Short comments should be what comments, such as ``compute mean value'', rather than how comments such as ``sum of values divided by n''. C is not assembler; putting a comment at the top of a 3-10 line section telling what it does overall is often more useful than a comment on each line describing micrologic.
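
A small sketch of the difference, using a hypothetical function mean of our own: the prologue says what the function does and what its parameters mean, the single comment inside the body is a ``what'' comment, and no line carries a comment that merely restates the code.

```c
#include <assert.h>

/*
 * Return the mean of the n values in v; n must be positive.
 */
double
mean(const double *v, int n)
{
        double          sum = 0.0;      /* total of the values seen so far */
        int             i;

        assert(n > 0);

        /* compute mean value */
        for (i = 0; i < n; i++)
                sum += v[i];
        return sum / n;
}
```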

Comments should justify offensive code. The justification should be that something bad will happen if unoffensive code is used. Just making code faster is not enough to rationalize a hack; the performance must be shown to be unacceptable without the hack. The comment should explain the unacceptable behavior and describe why the hack is a ``good'' fix.

Comments that describe data structures, algorithms, etc., should be in block comment form with the opening /* in columns 1-2, a * in column 2 before each line of comment text, and the closing */ in columns 2-3.

/*
 *      Here is a block comment.
 *      The comment text should be tabbed or spaced over uniformly.
 *      The opening slash-star and closing star-slash are alone on a line.
 */

Note that grep '^.\*' will catch all block comments in the file. Some automated program-analysis packages use different characters before comment lines as a marker for lines with specific items of information. In particular, a line with a `-' in a comment preceding a function is sometimes assumed to be a one-line summary of the function's purpose. Very long block comments such as drawn-out discussions and copyright notices often start with /* in columns 1-2, no leading * before lines of text, and the closing */ in columns 1-2. Block comments inside a function are appropriate, and they should be tabbed over to the same tab setting as the code that they describe. One-line comments alone on a line should be indented to the tab setting of the code that follows.

if (argc > 1) {
        /* Get input file from command line. */
        if (freopen(argv[1], "r", stdin) == NULL) {
                perror (argv[1]);
        }
}

Very short comments may appear on the same line as the code they describe, and should be tabbed over to separate them from the statements. If more than one short comment appears in a block of code they should all be tabbed to the same tab setting.

if (a == EXCEPTION) {
        b = TRUE;                       /* special case */
} else {
        b = isprime(a);                 /* works only for odd a */
}

Declarations

Global declarations should begin in column 1. All external data declarations should be preceded by the extern keyword. If an external variable is an array that is defined with an explicit size, then the array bounds must be repeated in the extern declaration unless the size is always encoded in the array (e.g., a read-only character array that is always null-terminated). Repeated size declarations are particularly beneficial to someone picking up code written by another.
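
The following sketch illustrates the rule about array bounds. The names NDAYS, rainfall, banner and rain_on are hypothetical; the extern lines stand for what a header would carry, and the definitions for what a single source file would carry.

```c
#include <assert.h>

#define NDAYS   (7)

/* ---- as the declarations would appear in a header ---- */
extern int              rainfall[NDAYS];        /* bounds repeated from the definition */
extern const char       banner[];       /* size encoded by the terminating null */

/* ---- the definitions, in one source file ---- */
int                     rainfall[NDAYS] = { 3, 0, 0, 12, 7, 0, 1 };
const char              banner[] = "weekly rainfall (mm)";

/*
 * Return the rainfall recorded on the given day (0 to NDAYS - 1).
 */
int
rain_on(int day)
{
        assert(day >= 0 && day < NDAYS);
        return rainfall[day];
}
```

A reader who sees only the extern line for rainfall still learns how many elements it has, which is the benefit the paragraph above describes.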

The ``pointer'' qualifier, `*', should be with the variable name rather than with the type.

char            *s, *t, *u;
instead of
char*   s, t, u;
which is wrong, since `t' and `u' do not get declared as pointers.

Unrelated declarations, even of the same type, should be on separate lines. A comment describing the role of the object being declared should be included, with the exception that a list of #defined constants does not need comments if the constant names are sufficient documentation. The names, values, and comments are usually tabbed so that they line up underneath each other. Use the tab character rather than blanks (spaces). For structure and union template declarations, each element should be alone on a line with a comment describing it. The opening brace ({) should be on the same line as the structure tag, and the closing brace (}) should be in column 1.

struct boat {
        int             wllength;       /* water line length in meters */
        int             type;           /* see below */
        long            sailarea;       /* sail area in square mm */
};

/* defines for boat.type */
#define KETCH   (1)
#define YAWL    (2)
#define SLOOP   (3)
#define SQRIG   (4)
#define MOTOR   (5)

These defines are sometimes put right after the declaration of type, within the struct declaration, with enough tabs after the `#' to indent the define one level more than the structure member declarations. When the actual values are unimportant, the enum facility is better.

enum bt { KETCH=1, YAWL, SLOOP, SQRIG, MOTOR };
struct boat {
        int             wllength;       /* water line length in meters */
        enum bt         type;           /* what kind of boat */
        long            sailarea;       /* sail area in square mm */
};

Any variable whose initial value is important should be explicitly initialized, or at the very least should be commented to indicate that C's default initialization to zero is being relied upon. The empty initializer, ``{}'', should never be used. Structure initializations should be fully parenthesized with braces. Constants used to initialize longs should be explicitly long. Use capital letters; for example, the long constant two written as ``2l'' looks a lot like ``21'', the number twenty-one.

int             x = 1;
char            *msg = "message";
struct boat     winner[] = {
        { 40, YAWL, 6000000L },
        { 28, MOTOR, 0L },
        { 0 },
};

In any file which is part of a larger whole rather than a self-contained program, maximum use should be made of the static keyword to make functions and variables local to single files. Variables in particular should be accessible from other files only when there is a clear need that cannot be filled in another way. Such usage should be commented to make it clear that another file's variables are being used; the comment should name the other file. If your debugger hides static objects you need to see during debugging, declare them as STATIC and #define STATIC as needed.
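
One possible arrangement of the STATIC convention is sketched below. The DEBUG flag and the names last_id, advance and new_id are assumptions made for the example; the original text does not prescribe them.

```c
#include <assert.h>

/*
 * Compile with -DDEBUG when the debugger hides static objects;
 * STATIC then expands to nothing and the names become visible.
 */
#ifdef DEBUG
#define STATIC
#else
#define STATIC          static
#endif

STATIC int      last_id;        /* last id handed out; relies on zero initialization */

/*
 * Advance last_id, wrapping back to 1 after limit.
 */
STATIC void
advance(int limit)
{
        last_id = (last_id % limit) + 1;
}

/*
 * Return a fresh id in the range 1..limit.  This is the only
 * name in the file meant to be used from other files.
 */
int
new_id(int limit)
{
        assert(limit > 0);
        advance(limit);
        return last_id;
}
```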

The most important types should be highlighted by typedeffing them, even if they are only integers, as the unique name makes the program easier to read (as long as there are only a few things typedeffed to integers!). Avoid typedeffing structures and unions, as this hides the fact that an object is composite from the code reader.

The return type of functions should always be declared. Always use function prototypes. One common mistake is to omit the declaration of external math functions that return double. The compiler then assumes that the return value is an integer and the bits are dutifully converted into a (meaningless) floating point value.

``C takes the point of view that the programmer is always right.'' - Michael DeCorte

Function Declarations

Each function should be preceded by a block comment prologue that gives a short description of what the function does and (if not clear) how to use it. Discussion of non-trivial design decisions and side-effects is also appropriate. Avoid duplicating information clear from the code.

The function return type should be alone on a line, (optionally) indented one stop. ``Tabstops'' can be blanks (spaces) inserted by your editor in clumps of 2, 4, or 8. Do not default to int; if the function does not return a value then it should be given return type void. If the value returned requires a long explanation, it should be given in the prologue; otherwise it can be on the same line as the return type, tabbed over. The function name (and the formal parameter list) should be alone on a line, in column 1. Destination (return value) parameters should generally be first (on the left). All local declarations and code within the function body should be tabbed over one stop. The opening brace of the function body should be alone on a line beginning in column 1.

Each parameter should be declared (do not default to int). In general the role of each variable in the function should be described. This may either be done in the function comment or, if each declaration is on its own line, in a comment on that line. Loop counters called ``i'', string pointers called ``s'', and integral types called ``c'' and used for characters are typically excluded. If a group of functions all have a like parameter or local variable, it helps to call the repeated variable by the same name in all functions. (Conversely, avoid using the same name for different purposes in related functions.) Like parameters should also appear in the same place in the various argument lists.

Comments for parameters and local variables should be tabbed so that they line up underneath each other. Local variable declarations should be separated from the function's statements by a blank line.

Be careful when you use or declare functions that take a variable number of arguments (``varargs''). Always use the ``stdarg.h'' header definitions and do not rely on item order on the stack.
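
As a minimal sketch of the ``stdarg.h'' interface, the following function sums a caller-supplied number of int arguments. The name sum_ints and the convention of passing the count first are assumptions made for this example.

```c
#include <stdarg.h>

/*
 *      Return the sum of the `count' int arguments that follow `count'.
 *      The caller must pass exactly `count' values of type int.
 */
long
sum_ints(int count, ...)
{
        va_list ap;             /* walks the unnamed arguments */
        long    total = 0L;     /* running sum */
        int     i;

        va_start(ap, count);
        for (i = 0; i < count; i++)
                total += va_arg(ap, int);
        va_end(ap);
        return (total);
}
```

A call such as sum_ints(3, 1, 2, 3) reads its arguments only through va_arg, so it makes no assumption about how they are laid out on the stack.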

If the function uses any external variables (or functions) that are not declared globally in the file, these should have their own declarations in the function body using the extern keyword.

Avoid local declarations that override declarations at higher levels. In particular, local variables should not be redeclared in nested blocks. Although this is valid C, the potential confusion is enough that lint will complain about it when given the -h option.

Whitespace

int i;main(){for(;i["]<i;++i){--i;}"];read('-'-'-',i+++"hell\
o, world!\n",'/'/'/'));}read(j,i,p){write(j/p+p,i---j,i/i);}

- Dishonorable mention, Obfuscated C Code Contest, 1984.
Author requested anonymity.

Use vertical and horizontal whitespace generously. Indentation and spacing should reflect the block structure of the code; e.g., there should be at least 2 blank lines between the end of one function and the comments for the next.

A long string of conditional operators should be split onto separate lines.

if (foo->next==NULL && totalcount<needed && needed<=MAX_ALLOT
        && server_active(current_input)) { ...
Might be better as
if (foo->next == NULL
        && totalcount < needed && needed <= MAX_ALLOT
        && server_active(current_input))
{
        ...
Similarly, elaborate for loops should be split onto different lines.
for (curr = *listp, trail = listp;
        curr != NULL;
        trail = &(curr->next), curr = curr->next )
{
        ...
Other complex expressions, particularly those using the ternary ?: operator, are best split on to several lines, too.
c = (a == b)
        ? d + f(a)
        : f(b) - d;
Keywords that are followed by expressions in parentheses should be separated from the left parenthesis by a blank. (The sizeof operator is an exception.) Blanks should also appear after commas in argument lists to help separate the arguments visually. On the other hand, macro definitions with arguments must not have a blank between the name and the left parenthesis, otherwise the C preprocessor will not recognize the argument list.

Examples

/*
 *      Determine if the sky is blue by checking that it isn't night.
 *      CAVEAT: Only sometimes right.  May return TRUE when the answer
 *      is FALSE.  Consider clouds, eclipses, short days.
 *      NOTE: Uses `hour' from `hightime.c'.  Returns `int' for
 *      compatibility with the old version.
 */
int                                             /* true or false */
skyblue(void)
{
        extern int      hour;           /* current hour of the day */

        return (hour >= MORNING && hour <= EVENING);
}
/*
 *      Find the last element in the linked list
 *      pointed to by nodep and return a pointer to it.
 *      Return NULL if there is no last element.
 */
node_t *
tail(node_t *nodep)
{
        node_t  *np;            /* advances to NULL */
        node_t  *lp;            /* follows one behind np */

        if (nodep == NULL)
                return (NULL);
        for (np = lp = nodep; np != NULL; lp = np, np = np->next)
                ;       /* VOID */
        return (lp);
}

Simple Statements

There should be only one statement per line unless the statements are very closely related.

case FOO:         oogle (zork);  boogle (zork);  break;
case BAR:         oogle (bork);  boogle (zork);  break;
case BAZ:         oogle (gork);  boogle (bork);  break;
The null body of a for or while loop should be alone on a line and commented so that it is clear that the null body is intentional and not missing code.
while (*dest++ = *src++)
        ;       /* VOID */

Do not default the test for non-zero, i.e.

if (f() != FAIL)
is better than
if (f())
even though FAIL may have the value 0 which C considers to be false. An explicit test will help you out later when somebody decides that a failure return should be -1 instead of 0. Explicit comparison should be used even if the comparison value will never change; e.g., ``if (!(bufsize % sizeof(int)))'' should be written instead as ``if ((bufsize % sizeof(int)) == 0)'' to reflect the numeric (not boolean) nature of the test. A frequent trouble spot is using strcmp to test for string equality, where the result should never ever be defaulted. The preferred approach is to define a macro STREQ.
#define STREQ(a, b) (strcmp((a), (b)) == 0)
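
A short usage sketch for STREQ follows. The function name isyes is hypothetical.

```c
#include <string.h>

#define STREQ(a, b)     (strcmp((a), (b)) == 0)

/*
 *      Return non-zero if `reply' is exactly "yes".
 *      Note that a defaulted test, if (strcmp(reply, "yes")),
 *      would instead be true when the strings DIFFER.
 */
int
isyes(const char *reply)
{
        return (STREQ(reply, "yes"));
}
```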

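The non-zero test is often defaulted for predicates and other functions or expressions which meet the following restrictions:

Evaluates to 0 for false, nothing else.
Is named so that the meaning of (say) a `true' return is absolutely obvious. Call a predicate isvalid or valid, not checkvalid.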

It is common practice to declare a boolean type ``bool'' in a global include file. The special names improve readability immensely.

typedef int     bool;
#define FALSE   0
#define TRUE    1
or
typedef enum { NO=0, YES } bool;

Even with these declarations, do not check a boolean value for equality with 1 (TRUE, YES, etc.); instead test for inequality with 0 (FALSE, NO, etc.). Most functions are guaranteed to return 0 if false, but only non-zero if true. Thus,

if (func() == TRUE) { ...
must be written
if (func() != FALSE) { ...
It is even better (where possible) to rename the function/variable or rewrite the expression so that the meaning is obvious without a comparison to true or false (e.g., rename to isvalid()).
if (isvalid()) { ...
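
The reason for testing against FALSE rather than TRUE can be seen in a small sketch. OPT_VERBOSE and isverbose are hypothetical names; the function is a perfectly reasonable predicate that happens to return 4, not 1, for true.

```c
#define FALSE           0
#define TRUE            1
#define OPT_VERBOSE     0x04    /* hypothetical option bit */

/*
 *      Return non-zero (not necessarily 1) if the verbose bit is set.
 */
int
isverbose(unsigned int options)
{
        return ((int)(options & OPT_VERBOSE));
}
```

With the bit set, isverbose(options) != FALSE holds, but isverbose(options) == TRUE does not, because the value returned is 4.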

There is a time and a place for embedded assignment statements. In some constructs there is no better way to accomplish the results without making the code bulkier and less readable.

while ((c = getchar()) != EOF) {
        process the character
}
The ++ and -- operators count as assignment statements. So, for many purposes, do functions with side effects. Using embedded assignment statements to improve run-time performance is also possible. However, one should consider the tradeoff between increased speed and decreased maintainability that results when embedded assignments are used in artificial places. For example,
a = b + c;
d = a + r;
should not be replaced by
d = (a = b + c) + r;
even though the latter may save one cycle. In the long run the time difference between the two will decrease as the optimizer gains maturity, while the difference in ease of maintenance will increase as the human memory of what's going on in the latter piece of code begins to fade.

Goto statements should be used sparingly, as in any well-structured code. The main place where they can be usefully employed is to break out of several levels of switch, for, and while nesting, although the need to do such a thing may indicate that the inner constructs should be broken out into a separate function, with a success/failure return code.

        for (...) {
                while (...) {
                        ...
                        if (disaster)
                                goto error;
            
                }
        }
        ...
error:
        clean up the mess
When a goto is necessary the accompanying label should be alone on a line and tabbed one stop to the left of the code that follows. The goto should be commented (possibly in the block header) as to its utility and purpose. Continue should be used sparingly and near the top of the loop. Break is less troublesome.

Compound Statements

A compound statement is a list of statements enclosed by braces. There are many common ways of formatting the braces. Please be consistent with our local standard. When editing someone else's code, always use the style used in that code.

control {
        statement;
        statement;
}

The style above is called ``K&R style'', and is preferred if you haven't already got a favorite. With K&R style, the else part of an if-else statement and the while part of a do-while statement should appear on the same line as the close brace. With most other styles, the braces are always alone on a line.

When a block of code has several labels (unless there are a lot of them), the labels are placed on separate lines. The fall-through feature of the C switch statement (that is, when there is no break between a code segment and the next case statement) must be commented for future maintenance. A lint-style comment/directive is best.

switch (expr) {
case ABC:
case DEF:
        statement;
        break;
case UVW:
        statement;
        /*FALLTHROUGH*/
case XYZ:
        statement;
        break;
}

Here, the last break is unnecessary, but is required because it prevents a fall-through error if another case is added later after the last one. The default case, if used, should be last and does not require a break if it is last.

Whenever an if-else statement has a compound statement for either the if or else section, the statements of both the if and else sections should both be enclosed in braces (called fully bracketed syntax).

if (expr) {
        statement;
} else {
        statement;
        statement;
}
Braces are also essential in if-if-else sequences with no second else such as the following, which will be parsed incorrectly if the brace after (ex1) and its mate are omitted:
if (ex1) {
        if (ex2) {
                funca();
        }
} else {
        funcb();
}

An if-else with else if should be written with the else conditions left-justified.

if (STREQ (reply, "yes")) {
        statements for yes
        ...
} else if (STREQ (reply, "no")) {
        ...
} else if (STREQ (reply, "maybe")) {
        ...
} else {
        statements for default
        ...
}
The format then looks like a generalized switch statement and the tabbing reflects the switch between exactly one of several alternatives rather than a nesting of statements.

Do-while loops should always have braces around the body.

Forever loops should be coded using the for(;;) construct, and not the while(1) construct. Do not use braces for single statement blocks.

for (;;)
        function();

Sometimes an if causes an unconditional control transfer via break, continue, goto, or return. The else should be implicit and the code should not be indented.

if (level > limit)
        return (OVERFLOW);
normal();
return (level);
The ``flattened'' indentation tells the reader that the boolean test is invariant over the rest of the enclosing block.

Operators

Unary operators should not be separated from their single operand. Generally, all binary operators except `.' and `->' should be separated from their operands by blanks. Some judgement is called for in the case of complex expressions, which may be clearer if the ``inner'' operators are not surrounded by spaces and the ``outer'' ones are.

If you think an expression will be hard to read, consider breaking it across lines. Splitting at the lowest-precedence operator near the break is best. Since C has some unexpected precedence rules, expressions involving mixed operators should be parenthesized. Too many parentheses, however, can make a line harder to read because humans aren't good at parenthesis-matching.
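
One of the unexpected precedence rules is that == binds more tightly than &. The sketch below shows the parenthesized form; MASK and lowclear are hypothetical names.

```c
#define MASK    0x0F            /* hypothetical mask for the low four bits */

/*
 *      Return 1 if none of the MASK bits are set in `flags', else 0.
 *      Without the inner parentheses, `flags & MASK == 0' would be
 *      parsed as `flags & (MASK == 0)', because == binds tighter
 *      than &, and the result would always be 0.
 */
int
lowclear(unsigned int flags)
{
        return ((flags & MASK) == 0);
}
```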

There is a time and place for the binary comma operator, but generally it should be avoided. The comma operator is most useful to provide multiple initializations or operations, as in for statements. Complex expressions, for instance those with nested ternary ?: operators, can be confusing and should be avoided if possible. There are some macros like getchar where both the ternary operator and comma operators are useful. The logical expression operand before the ?: should be parenthesized and both return values must be the same type.
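
As an illustration of the comma operator in a for statement, the following sketch reverses a string in place using two indices that are initialized and stepped together. The function name reverse is an assumption for this example.

```c
#include <string.h>

/*
 *      Reverse the string `s' in place and return it.
 */
char *
reverse(char *s)
{
        size_t  i;              /* index moving up from the start */
        size_t  j;              /* index moving down from the end */
        char    tmp;            /* holds one character during a swap */

        if (s[0] == '\0')
                return (s);     /* nothing to do; avoids strlen(s) - 1 wrapping */
        for (i = 0, j = strlen(s) - 1; i < j; i++, j--) {
                tmp = s[i];
                s[i] = s[j];
                s[j] = tmp;
        }
        return (s);
}
```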

Naming Conventions

Individual projects will no doubt have their own naming conventions. There are some general rules however.

In general, global names (including enums) should have a common prefix identifying the module that they belong with. Globals may alternatively be grouped in a global structure. Typedeffed names often have ``_t'' appended to their name.

Avoid names that might conflict with various standard library names. Some systems will include more library code than you want. Also, your program may be extended someday.

Also note the following (from [15]):

``Length is not a virtue in a name; clarity of expression is. A global variable rarely used may deserve a long name, maxphysaddr say. An array index used on every line of a loop needn't be named any more elaborately than i. Saying index or elementnumber is more to type (or calls upon your text editor) and obscures the details of the computation. When the variable names are huge, it's harder to see what's going on. This is partly a typographic issue; consider

for(i=0 to 100)
        array[i]=0;
vs.
for(elementnumber=0 to 100)
        array[elementnumber]=0;
The problem gets worse fast with real examples. Indices are just notation, so treat them as such.''

``Pointers also require sensible notation. np is just as mnemonic as nodepointer if you consistently use a naming convention from which np means ``node pointer'' is easily derived.''

``As in all other aspects of readable programming, consistency is important in naming. If you call one variable maxphysaddr, don't call its cousin lowestaddress.''

``Finally, I prefer minimum-length but maximum-information names, and then let the context fill in the rest. Globals, for instance, typically have little context when they are used, so their names need to be relatively evocative. Thus I say maxphysaddr (not MaximumPhysicalAddress) for a global variable, but np not NodePointer for a pointer locally defined and used. This is largely a matter of taste, but taste is relevant to clarity.

I eschew embedded capital letters in names; to my prose-oriented eyes, they are too awkward to read comfortably. They jangle like bad typography.'' ``Procedure names should reflect what they do; function names should reflect what they return. Functions are used in expressions, often in things like if's, so they need to read appropriately.

if(checksize(x))
is unhelpful because we can't deduce whether checksize returns true on error or non-error; instead
if(validsize(x))
makes the point clear and makes a future mistake in using the routine less likely.''

Constants

Numerical constants should not be coded directly. The #define feature of the C preprocessor should be used to give constants meaningful names. Symbolic constants make the code easier to read. Defining the value in one place also makes it easier to administer large programs since the constant value can be changed uniformly by changing only the define. The enumeration data type is a better way to declare variables that take on only a discrete set of values, since additional type checking is often available. At the very least, any directly-coded numerical constant must have a comment explaining the derivation of the value.

Constants should be defined consistently with their use; e.g. use 540.0 for a float instead of 540 with an implicit float cast. There are some cases where the constants 0 and 1 may appear as themselves instead of as defines. For example if a for loop indexes through an array, then

for (i = 0; i < ARYBOUND; i++)
is reasonable while the code
door_t *front_door = opens(door[i], 7);
if (front_door == 0)
        error("can't open %s\n", door[i]);
is not. In the last example front_door is a pointer. When a value is a pointer it should be compared to NULL instead of 0. NULL is defined in the standard header files stdio.h and stdlib.h. Even simple values like 1 or 0 are often better expressed using defines like TRUE and FALSE (sometimes YES and NO read better).

Simple character constants should be defined as character literals rather than numbers. Non-text characters are discouraged as non-portable. If non-text characters are necessary, particularly if they are used in strings, they should be written using an escape sequence of three octal digits rather than one (e.g., `\007'). Even so, such usage should be considered machine-dependent and treated as such.

Macros

Complex expressions can be used as macro parameters, and operator-precedence problems can arise unless all occurrences of parameters have parentheses around them. There is little that can be done about the problems caused by side effects in parameters except to avoid side effects in expressions (a good idea anyway) and, when possible, to write macros that evaluate their parameters exactly once. There are times when it is impossible to write macros that act exactly like functions.
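
The operator-precedence problem can be shown with a small sketch. The macro and function names are hypothetical.

```c
#define SQUARE_BAD(x)   (x * x)         /* parameter not parenthesized */
#define SQUARE(x)       ((x) * (x))     /* every occurrence parenthesized */

/*
 *      Expands to (1 + 2 * 1 + 2), which is 5, not 9.
 */
int
bad_square(void)
{
        return (SQUARE_BAD(1 + 2));
}

/*
 *      Expands to ((1 + 2) * (1 + 2)), which is 9.
 */
int
good_square(void)
{
        return (SQUARE(1 + 2));
}
```

Even the parenthesized SQUARE evaluates its parameter twice, so an argument with side effects, such as SQUARE(i++), is still unsafe.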

Some macros also exist as functions (e.g., getc and fgetc). The macro should be used in implementing the function so that changes to the macro will be automatically reflected in the function. Care is needed when interchanging macros and functions since function parameters are passed by value, while macro parameters are passed by name substitution. Carefree use of macros requires that they be declared carefully.

Macros should avoid using globals, since the global name may be hidden by a local declaration. Macros that change named parameters (rather than the storage they point at) or may be used as the left-hand side of an assignment should mention this in their comments. Macros that take no parameters but reference variables, are long, or are aliases for function calls should be given an empty parameter list, e.g.,

#define OFF_A() (a_global+OFFSET)
#define BORK()  (zork())
#define SP3()   if (b) { int x; av = f (&x); bv += x; }

Macros save function call/return overhead, but when a macro gets long, the effect of the call/return becomes negligible, so a function should be used instead.

In some cases it is appropriate to make the compiler ensure that a macro is terminated with a semicolon.

if (x==3)
    SP3();
else
    BORK();
If the semicolon is omitted after the call to SP3, then the else will (silently!) become associated with the if in the SP3 macro. With the semicolon, the else doesn't match any if! The macro SP3 can be written safely as
#define SP3() \
        do { if (b) { int x; av = f (&x); bv += x; }} while (0)
Writing out the enclosing do-while by hand is awkward and some compilers and tools may complain that there is a constant in the ``while'' conditional. A macro for declaring statements may make programming easier.
#ifdef lint
        static int ZERO;
#else
#       define ZERO 0
#endif
#define STMT( stuff )           do { stuff } while (ZERO)
Declare SP3 with
#define SP3() \
	STMT( if (b) { int x; av = f (&x); bv += x; } )
Using STMT will help prevent small typos from silently changing programs.

Except for type casts, sizeof, and hacks such as the above, macros should contain keywords only if the entire macro is surrounded by braces.

Conditional Compilation

Conditional compilation is useful for things like machine-dependencies, debugging, and for setting certain options at compile-time. Beware of conditional compilation. Various controls can easily combine in unforeseen ways. If you #ifdef machine dependencies, make sure that when no machine is specified, the result is an error, not a default machine. (Use ``#error'' and indent it so it works with older compilers.) If you #ifdef optimizations, the default should be the unoptimized code rather than an uncompilable program. Be sure to test the unoptimized code.

Note that the text inside of an #ifdeffed section may be scanned (processed) by the compiler, even if the #ifdef is false. Thus, even if the #ifdeffed part of the file never gets compiled (e.g., ``#ifdef COMMENT''), it cannot be arbitrary text.

Put #ifdefs in header files instead of source files when possible. Use the #ifdefs to define macros that can be used uniformly in the code. For instance, a header file for checking memory allocation might look like (omitting definitions for REALLOC and FREE):

#ifdef DEBUG
        extern void *mm_malloc();
#       define MALLOC(size) (mm_malloc(size))
#else
        extern void *malloc();
#       define MALLOC(size) (malloc(size))
#endif

Conditional compilation should generally be on a feature-by-feature basis. Machine or operating system dependencies should be avoided in most cases.

#ifdef BSD4
        long t = time ((long *)NULL);
#endif
The preceding code is poor for two reasons: there may be 4BSD systems for which there is a better choice, and there may be non-4BSD systems for which the above is the best code. Instead, use define symbols such as TIME_LONG and TIME_STRUCT and define the appropriate one in a configuration file such as config.h.
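
A feature-based version might look like the sketch below. It assumes that TIME_LONG means ``the time can be obtained as a long via time()''; that reading, the function name now_seconds, and the inline definition of TIME_LONG (which would normally live in config.h) are all assumptions made for illustration. A TIME_STRUCT branch would be added in the same way.

```c
#include <time.h>

/*
 * Normally config.h would define exactly one TIME_ symbol;
 * TIME_LONG is defined here only to keep the sketch self-contained.
 */
#define TIME_LONG       1

/*
 *      Return the current time in seconds as a long.
 */
long
now_seconds(void)
{
#if defined(TIME_LONG)
        return ((long)time((time_t *)NULL));
#else
#       error "config.h must define a supported TIME_ symbol"
#endif
}
```

The code tests for the feature it needs rather than for a particular system, and an unconfigured build fails at compile time instead of silently picking a default.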

Program Structure

The following are some excerpts from [15] relevant to program structure and organisation.

Complexity

Most programs are too complicated - that is, more complex than they need to be to solve their problems efficiently. Why? Mostly it's because of bad design, but I will skip that issue here because it's a big one. But programs are often complicated at the microscopic level, and that is something I can address here.

Rule 1. You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you've proven that's where the bottleneck is.

Rule 2. Measure. Don't tune for speed until you've measured, and even then don't unless one part of the code overwhelms the rest.

Rule 3. Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. (Even if n does get big, use Rule 2 first.) For example, binary trees are always faster than splay trees for workaday problems.

Rule 4. Fancy algorithms are buggier than simple ones, and they're much harder to implement. Use simple algorithms as well as simple data structures.

The following data structures are a complete list for almost all practical programs:

array
linked list
hash table
binary tree

Of course, you must also be prepared to collect these into compound data structures. For instance, a symbol table might be implemented as a hash table containing linked lists of arrays of characters.

Rule 5. Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming. (See Brooks p. 102.)

Rule 6. There is no Rule 6.

Programming with data.

Algorithms, or details of algorithms, can often be encoded compactly, efficiently and expressively as data rather than, say, as lots of if statements. The reason is that the complexity of the job at hand, if it is due to a combination of independent details, can be encoded. A classic example of this is parsing tables, which encode the grammar of a programming language in a form interpretable by a fixed, fairly simple piece of code. Finite state machines are particularly amenable to this form of attack, but almost any program that involves the `parsing' of some abstract sort of input into a sequence of some independent `actions' can be constructed profitably as a data-driven algorithm.

Perhaps the most intriguing aspect of this kind of design is that the tables can sometimes be generated by another program - a parser generator, in the classical case. As a more earthy example, if an operating system is driven by a set of tables that connect I/O requests to the appropriate device drivers, the system may be `configured' by a program that reads a description of the particular devices connected to the machine in question and prints the corresponding tables.
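
As a small sketch of a data-driven finite state machine, the following word counter keeps all of its logic in a transition table and leaves the code as a fixed loop that interprets it. The type, table, and function names are hypothetical. The enum constants start at zero here because they are used as array indices.

```c
#include <ctype.h>

enum state { OUT, IN, NSTATES };        /* outside or inside a word */
enum input { BLANK, LETTER, NINPUTS };  /* input character classes */

struct transition {
        enum state      next;           /* state to enter */
        int             newword;        /* 1 if this step starts a word */
};

/* One row per current state, one column per input class. */
static const struct transition table[NSTATES][NINPUTS] = {
        /* OUT */ { { OUT, 0 }, { IN, 1 } },
        /* IN  */ { { OUT, 0 }, { IN, 0 } },
};

/*
 *      Return the number of whitespace-separated words in `s'.
 */
int
count_words(const char *s)
{
        const struct transition *tp;    /* entry for the current step */
        enum state      st = OUT;       /* current state */
        enum input      in;             /* class of the current character */
        int             words = 0;      /* words seen so far */

        for (; *s != '\0'; s++) {
                in = (isspace((unsigned char)*s) != 0) ? BLANK : LETTER;
                tp = &table[st][in];
                words += tp->newword;
                st = tp->next;
        }
        return (words);
}
```

Changing what counts as a word boundary means editing the table or the classification, not the loop.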

One of the reasons data-driven programs are not common, at least among beginners, is the tyranny of Pascal. Pascal, like its creator, believes firmly in the separation of code and data. It therefore (at least in its original form) has no ability to create initialized data. This flies in the face of the theories of Turing and von Neumann, which define the basic principles of the stored-program computer. Code and data are the same, or at least they can be. How else can you explain how a compiler works? (Functional languages have a similar problem with I/O.)

Function pointers

Another result of the tyranny of Pascal is that beginners don't use function pointers. (You can't have function-valued variables in Pascal.) Using function pointers to encode complexity has some interesting properties.

Some of the complexity is passed to the routine pointed to. The routine must obey some standard protocol - it's one of a set of routines invoked identically - but beyond that, what it does is its business alone. The complexity is distributed.

There is this idea of a protocol, in that all functions used similarly must behave similarly. This makes for easy documentation, testing, growth and even making the program run distributed over a network - the protocol can be encoded as remote procedure calls.

I argue that clear use of function pointers is the heart of object-oriented programming. Given a set of operations you want to perform on data, and a set of data types you want to respond to those operations, the easiest way to put the program together is with a group of function pointers for each type. This, in a nutshell, defines class and method. The O-O languages give you more of course - prettier syntax, derived types and so on - but conceptually they provide little extra.

Combining data-driven programs with function pointers leads to an astonishingly expressive way of working, a way that, in my experience, has often led to pleasant surprises. Even without a special O-O language, you can get 90% of the benefit for no extra work and be more in control of the result. I cannot recommend an implementation style more highly. All the programs I have organized this way have survived comfortably after much development - far better than with less disciplined approaches. Maybe that's it: the discipline it forces pays off handsomely in the long run.
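
The idea of a group of function pointers for each type can be sketched as follows. The shape types, the shape_ops structure, and all function names are hypothetical; only one operation (area) is shown.

```c
struct shape;                           /* defined below */

struct shape_ops {
        const char      *name;          /* type name, for messages */
        long            (*area)(const struct shape *sp);
};

struct shape {
        const struct shape_ops *ops;    /* operations for this type */
        long            width;
        long            height;
};

static long
rect_area(const struct shape *sp)
{
        return (sp->width * sp->height);
}

static long
tri_area(const struct shape *sp)
{
        return (sp->width * sp->height / 2);
}

/* One table of function pointers per type. */
const struct shape_ops  rect_ops = { "rectangle", rect_area };
const struct shape_ops  tri_ops = { "triangle", tri_area };

/*
 *      Return the total area of the `n' shapes in `shapes'.
 *      Every shape is invoked identically, whatever its type.
 */
long
total_area(const struct shape *shapes, int n)
{
        long    total = 0L;     /* running sum */
        int     i;

        for (i = 0; i < n; i++)
                total += shapes[i].ops->area(&shapes[i]);
        return (total);
}
```

The routine total_area never asks what kind of shape it has; adding a new type means adding one more ops table, with no change to the callers.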

Debugging

``C Code. C code run. Run, code, run... PLEASE!!!'' - Barbara Tongue

If you use enums, the first enum constant should have a non-zero value, or the first constant should indicate an error.

typedef enum { STATE_ERR, STATE_START, STATE_NORMAL, STATE_END } state_t;
typedef enum { VAL_NEW=1, VAL_NORMAL, VAL_DYING, VAL_DEAD } value_t;
Uninitialized values will then often ``catch themselves''.

Check for error return values, even from functions that ``can't'' fail. Consider that close() and fclose() can and do fail, even when all prior file operations have succeeded. Write your own functions so that they test for errors and return error values or abort the program in a well-defined way. Include a lot of debugging and error-checking code and leave most of it in the finished product. Check even for ``impossible'' errors. [8]

Use the assert facility to insist that each function is being passed well-defined values, and that intermediate results are well-formed.
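
A minimal sketch of the assert facility guarding a function's arguments follows. The function name mean and its contract are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 *      Return the mean of the `n' values in `values', truncated
 *      toward zero.  `values' must not be NULL and `n' must be
 *      greater than zero.
 */
long
mean(const long *values, size_t n)
{
        long    sum = 0L;       /* running total */
        size_t  i;

        assert(values != NULL);
        assert(n > 0);
        for (i = 0; i < n; i++)
                sum += values[i];
        return (sum / (long)n);
}
```

A caller that passes NULL or a zero count is stopped at the assertion rather than at a later division by zero. Defining NDEBUG removes the checks at compile time.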

Build in the debug code using as few #ifdefs as possible. For instance, if ``mm_malloc'' is a debugging memory allocator, then a MALLOC macro will select the appropriate allocator, avoid littering the code with #ifdefs, and make clear the difference between allocation calls being debugged and extra memory that is allocated only during debugging.

#ifdef DEBUG
#       define MALLOC(size)  (mm_malloc(size))
#else
#       define MALLOC(size)  (malloc(size))
#endif

Check bounds even on things that ``can't'' overflow. A function that writes on to variable-sized storage should take an argument maxsize that is the size of the destination. If there are times when the size of the destination is unknown, some `magic' value of maxsize should mean ``no bounds checks''. When bound checks fail, make sure that the function does something useful such as abort or return an error status.

/*
 * INPUT: A null-terminated source string `src' to copy from and
 * a `dest' string to copy to.  `maxsize' is the size of `dest'
 * or UINT_MAX if the size is not known.  `src' and `dest' must
 * both be shorter than UINT_MAX, and `src' must be no longer than
 * `dest'.
 * OUTPUT: The address of `dest' or NULL if the copy fails.
 * `dest' is modified even when the copy fails.
 */
char *
copy (char *dest, size_t maxsize, char *src)
{
        char *dp = dest;

        while (maxsize-- > 0)
                if ((*dp++ = *src++) == '\0')
                        return (dest);

        return (NULL);
}

In all, remember that a program that produces wrong answers twice as fast is infinitely slower. The same is true of programs that crash occasionally or clobber valid data.

Portability

``C combines the power of assembler with the portability of assembler.''
- Anonymous, alluding to Bill Thacker.

The advantages of portable code are well known. This section gives some guidelines for writing portable code. Here, ``portable'' means that a source file can be compiled and executed on different machines with the only change being the inclusion of possibly different header files and the use of different compiler flags. The header files will contain #defines and typedefs that may vary from machine to machine. In general, a new ``machine'' is different hardware, a different operating system, a different compiler, or any combination of these. Reference [1] contains useful information on both style and portability. The following is a list of pitfalls to be avoided and recommendations to be considered when designing portable code:

ANSI C

Modern C compilers support the ANSI standard C [16]. Write code to run under standard C, and use features such as function prototypes, constant storage, and volatile storage. Standard C improves program performance by giving better information to optimizers. Standard C improves portability by ensuring that all compilers accept the same input language and by providing mechanisms that try to hide machine dependencies or emit warnings about code that may be machine-dependent.

Formatting

Note that under ANSI C, the `#' for a preprocessor directive must be the first non-whitespace character on a line. Use this feature to improve the formatting of your files.

An ``#ifdef NAME'' should end with either ``#endif'' or ``#endif /* NAME */'', not with ``#endif NAME''. The comment should not be used on short #ifdefs, as it is clear from the code.
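
The two formatting points above can be sketched together. `BIG_BUFFERS' and `BUFFER_SIZE' are hypothetical names used only for this example.

```c
#include <assert.h>
#include <limits.h>

#ifdef BIG_BUFFERS
        /* Nested directives are indented like ordinary code. */
        #if INT_MAX > 32767
                #define BUFFER_SIZE 8192
        #else
                #define BUFFER_SIZE 2048
        #endif
#else
        #define BUFFER_SIZE 512
#endif /* BIG_BUFFERS */

/* returns the buffer size selected at compile time. */
int
buffer_size (void)
{
        assert (BUFFER_SIZE > 0);
        return (BUFFER_SIZE);
}
```

The short inner #if ends with a bare #endif; only the longer outer block carries the name in a comment.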

ANSI trigraphs may cause programs with strings containing ``??'' to break mysteriously.
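
One way to stay clear of the problem is to keep two question marks from ever being adjacent in the source text. The sketch below shows two spellings of the same string; the names are hypothetical. Both give the same result whether or not the compiler processes trigraphs.

```c
#include <assert.h>
#include <string.h>

/*
 * Intended text: W h a t ? ? !
 * Written as a single literal, the last three characters would form
 * a trigraph, which an ANSI compiler replaces with `|'.
 */

/* Adjacent literals are joined only after trigraphs are replaced. */
static char const split_form[] = "What?" "?!";

/* The ANSI escape `\?' also keeps the question marks apart. */
static char const escaped_form[] = "What?\?!";

/* returns the intended text with both question marks intact. */
char const *
question (void)
{
        assert (strcmp (split_form, escaped_form) == 0);
        return (split_form);
}
```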

The style for ANSI C is the same as for regular C, with two notable exceptions: storage qualifiers and parameter lists.

Because const and volatile have strange binding rules, each const or volatile object should have a separate declaration.

int const *s;           /* YES */
int const *s, *t;       /* NO */
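
The effect of a qualifier depends on which side of the `*' it appears, which is easier to see when each object has its own declaration. The following is a sketch with hypothetical names.

```c
#include <assert.h>

/* returns 12; shows what each qualifier position permits. */
int
binding_demo (void)
{
        int first = 1;
        int second = 2;
        int const *reader = &first;     /* pointer to const int */
        int *const writer = &first;     /* const pointer to int */

        reader = &second;       /* legal: the pointer itself is not const */
        *writer = 10;           /* legal: the int pointed to is not const */

        /*
         * Both `*reader = 0;' and `writer = &second;' would be
         * rejected by the compiler.
         */
        assert (*reader == 2);

        return (first + *reader);
}
```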

Prototyped functions merge parameter declaration and definition into one list. Parameters should be commented in the function comment.

/*
 * `bp': boat trying to get in.
 * `stall': a list of stalls, never NULL.
 * returns stall number, 0 => no room.
 */
int
enter_pier (boat_t const *bp, stall_t *stall)
{
        ...

Pragmas

Pragmas are used to introduce machine-dependent code in a controlled way. Obviously, pragmas should be treated as machine dependencies. Unfortunately, the syntax of ANSI pragmas makes it impossible to isolate them in machine-dependent headers.

Pragmas are of two classes. Optimizations may safely be ignored. Pragmas that change the system behavior (``required pragmas'') may not. Required pragmas should be #ifdeffed so that compilation will abort if no pragma is selected.
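
A sketch of a required pragma guarded this way follows. It assumes `#pragma pack', a compiler extension accepted by GCC, Clang, and Microsoft C rather than anything in the ANSI standard; `struct wire_header' is a hypothetical type. On any other compiler the #error stops the build instead of silently producing a different layout.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(_MSC_VER)
        #pragma pack(push, 1)
#else
        #error "no structure-packing pragma selected for this compiler"
#endif

/* Hypothetical record whose layout must contain no padding. */
struct wire_header {
        unsigned char kind;
        unsigned int length;
};

#if defined(__GNUC__) || defined(_MSC_VER)
        #pragma pack(pop)
#endif

/* returns the size of the packed header in bytes. */
size_t
wire_header_size (void)
{
        /* If the pragma took effect there is no padding after `kind'. */
        assert (offsetof (struct wire_header, length) == 1);
        return (sizeof (struct wire_header));
}
```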

Two compilers may use a given pragma in two very different ways. For instance, one compiler may use ``haggis'' to signal an optimization. Another might use it to indicate that a given statement, if reached, should terminate the program. Thus, when pragmas are used, they must always be enclosed in machine-dependent #ifdefs. Pragmas must always be #ifdefed out for non-ANSI compilers. Be sure to indent the `#' character on the #pragma, as older preprocessors will halt on it otherwise.

#if defined(__STDC__) && defined(USE_HAGGIS_PRAGMA)
        #pragma (HAGGIS)
#endif

``The `#pragma' command is specified in the ANSI standard to have an arbitrary implementation-defined effect. In the GNU C preprocessor, `#pragma' first attempts to run the game `rogue'; if that fails, it tries to run the game `hack'; if that fails, it tries to run GNU Emacs displaying the Tower of Hanoi; if that fails, it reports a fatal error. In any case, preprocessing does not continue.''
- Manual for the GNU C preprocessor for GNU CC 1.34.

Special Considerations

This section contains some miscellaneous do's and don'ts.

Make

One very useful tool is make [7]. During development, make recompiles only those modules that have been changed since the last time make was used. It can be used to automate other tasks, as well. Some common conventions include:

all
always makes all binaries
clean
remove all intermediate files
debug
make a test binary 'a.out' or 'debug'
depend
make transitive dependencies
install
install binaries, libraries, etc.
deinstall
back out of ``install''
print/list
make a hard copy of all source files
shar
make a shar of all source files
spotless
make clean, use revision control to put away sources. Note: doesn't remove Makefile, although it is a source file
source
undo what spotless did
tags
run ctags (using the -t flag is suggested)
rdist
distribute sources to other hosts
zip
create a zip file for distribution
file.c
check out the named file from revision control
In addition, command-line defines can be given to define either Makefile values (such as ``CFLAGS'') or values in the program (such as ``DEBUG'').
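
On the program side, a command-line define such as ``DEBUG'' might be consumed as sketched below. The functions `debug_enabled' and `trace' are hypothetical. The value would typically arrive through something like ``make CFLAGS=-DDEBUG'' or ``cc -DDEBUG file.c''.

```c
#include <assert.h>
#include <stdio.h>

#ifndef DEBUG
        #define DEBUG 0         /* default when the command line says nothing */
#endif

/* returns 1 if the program was built with debugging on, else 0. */
int
debug_enabled (void)
{
        return (DEBUG != 0);
}

/*
 * `msg': text to report, never NULL.
 * Writes `msg' to stderr in debug builds; does nothing otherwise.
 */
void
trace (char const *msg)
{
        assert (msg != NULL);

        if (debug_enabled ())
                fprintf (stderr, "trace: %s\n", msg);
}
```

Giving the macro a default keeps the rest of the code free of #ifdefs: it tests an ordinary value instead.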

Project-Dependent Standards

Individual projects may wish to establish additional standards beyond those given here. The following issues are some of those that should be addressed by each project program administration group.

Conclusion

A set of standards has been presented for C programming style. Among the most important points are:

As with any standard, it must be followed if it is to be useful. If you have trouble following any of these standards, don't just ignore them. Talk with an experienced programmer.

References

  1. B.A. Tague, C Language Portability, Sept 22, 1977. This document issued by department 8234 contains three memos by R.C. Haight, A.L. Glasser, and T.L. Lyon dealing with style and portability.
  2. S.C. Johnson, Lint, a C Program Checker, USENIX Unix Supplementary Documents, November 1986.
  3. R.W. Mitze, The 3B/PDP-11 Swabbing Problem, Memorandum for File, 1273-770907.01MF, September 14, 1977.
  4. R.A. Elliott and D.C. Pfeffer, 3B Processor Common Diagnostic Standards- Version 1, Memorandum for File, 5514-780330.01MF, March 30, 1978.
  5. R.W. Mitze, An Overview of C Compilation of Unix User Processes on the 3B, Memorandum for File, 5521-780329.02MF, March 29, 1978.
  6. B.W. Kernighan and D.M. Ritchie, The C Programming Language, Prentice Hall 1978, Second Ed. 1988, ISBN 0-13-110362-8.
  7. S.I. Feldman, Make - A Program for Maintaining Computer Programs, USENIX Unix Supplementary Documents, November 1986.
  8. Ian Darwin and Geoff Collyer, Can't Happen or /* NOTREACHED */ or Real Programs Dump Core, USENIX Association Winter Conference, Dallas 1985 Proceedings.
  9. Brian W. Kernighan and P. J. Plauger, The Elements of Programming Style, McGraw-Hill, 1974, Second Ed. 1978, ISBN 0-07-034-207-5.
  10. J. E. Lapin, Portable C and UNIX System Programming, Prentice Hall, 1987, ISBN 0-13-686494-5.
  11. Ian F. Darwin, Checking C Programs with lint, O'Reilly & Associates, 1989. ISBN 0-937175-30-7.
  12. Andrew R. Koenig, C Traps and Pitfalls, Addison-Wesley, 1989. ISBN 0-201-17928-8.
  13. Samuel P. Harbison and Guy L. Steele Jr., C: A Reference Manual, 1984, 1987, ISBN 0-13-109802-0.
  14. Mark Horton, Portable C Software, Prentice-Hall, Englewood Cliffs NJ, 1990, ISBN 0-13-868050-7.
  15. Rob Pike, Notes on Programming in C.
  16. American National Standard for Information Systems - Programming Language - C: ANSI X3.159-1989, December, 1989. Published by the American National Standards Institute, 1430 Broadway, New York, New York 10018.

Contents