Working on The Witness, Part 19
Lister Panel Questions and Answers
Last week, I posted the source code to the Witness editor’s Lister Panel and asked you, the internet at large, for questions. I received quite a few, and have endeavored to answer them all below.
Mechanics
How does Panel::Layout.begin_row_template() work?
Row templates are a simple extension to the column-spacing system I already described. As you might expect, you often want a row of buttons to have different widths. Since the Lister Panel UI often has a set of rows that all have the same column layout, I made the UI system understand the concept of a set of column widths that could be repeated on several rows. Usage code “begins” a row template, then adds a bunch of column widths, then “ends” the template. From then on, until the template is reset or cleared, rows will take that shape. Here is the implementation:
void Panel_Layout::begin_row_template(void)
{
    assert(!in_template);
    in_template = true;

    template_column_widths.reset();
}

void Panel_Layout::row_template_variable_column(void)
{
    template_column_widths.add(-1.0f);
}

void Panel_Layout::row_template_fixed_column(float width)
{
    template_column_widths.add(width);
}

void Panel_Layout::end_and_set_row_template(void)
{
    assert(in_template);
    in_template = false;

    int variable_width_count = 0;
    float total_fixed_width = 0.0f;
    {for(int width_index = 0;
         width_index < template_column_widths.items;
         ++width_index)
    {
        float this_width = template_column_widths[width_index];
        if(this_width == -1.0f)
        {
            ++variable_width_count;
        }
        else
        {
            total_fixed_width += this_width;
        }
    }}

    if(variable_width_count)
    {
        float var_width = (width - total_fixed_width) / (float)variable_width_count;
        {for(int width_index = 0;
             width_index < template_column_widths.items;
             ++width_index)
        {
            float &this_width = template_column_widths[width_index];
            if(this_width == -1.0f)
            {
                this_width = var_width;
            }
        }}
    }
}
The code is extremely straightforward, but there is one little trick, which is that the super-secret-don’t-tell-anyone value of negative one is used to mark a column as being variable width. When the row template is completely specified, the code loops through and sees how many of those variable-width columns there are, and divides the remaining space equally among them. This allows the user of the code to specify the columns they care about, and let the rest be determined automatically, which is more convenient than forcing them to always specify every column width manually.
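To make the trick concrete, here is a standalone sketch of the same distribution logic, using std::vector in place of the codebase’s array type (the function name is illustrative, not from the Witness source):

```cpp
#include <vector>

// Sketch of the variable-width distribution from end_and_set_row_template().
// A width of -1.0f marks a column as variable; fixed widths are summed, and
// the leftover space is split evenly among the variable columns.
void resolve_row_template(std::vector<float> &widths, float row_width)
{
    int variable_count = 0;
    float total_fixed = 0.0f;
    for(float w : widths)
    {
        if(w == -1.0f) ++variable_count;
        else total_fixed += w;
    }

    if(variable_count)
    {
        float var_width = (row_width - total_fixed) / (float)variable_count;
        for(float &w : widths)
        {
            if(w == -1.0f) w = var_width;
        }
    }
}
```

So a template of {100, variable, variable} on a 500-unit row resolves to {100, 200, 200}.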
You wrote, “// TODO(casey): Why can’t I just _clear_ the hash table?” If you found the reason, I’m interested to know, because it’s kinda funny!
Re-reading that comment, I realize that it sounds like perhaps there was a clear() function on the hash table, and I called it, and it didn’t work, and so I wrote a comment to remind myself to go see why it didn’t work. But that’s not actually what happened.
What actually happened was that I didn’t see a clear() function on the hash table at all, so I didn’t (and still don’t) think there was any way to clear it short of destroying it completely and creating it anew. The comment I wrote was just there to remind someone, should they ever look through it, that the code was not intending to destroy and recreate the hash table, but rather just to clear it, so if the hash table code was ever extended to support clearing, that’s what the code should call instead.
I will often put “TODO” comments in code that look like this, where it is not really a thing that I need to do, but rather is a note about the general state of the codebase that I would like someone, during final analysis before shipping, to read, consider, and then either remove or act on. In my own codebase, I actually have a few different keywords that I use to differentiate comment types. Every single comment begins with one of them. They are:
NOTE(casey): A comment by me that does not fall into any other category (99% of comments are this kind).
IMPORTANT(casey): A key comment that describes a very important idea about the code that must be read and fully understood by anyone who is going to touch it.
TODO(casey): Something I need to do.
STUDY(casey): Something I need to look at in more detail in the future, because I believe the code contains an opportunity to advance the way I program.
The latter three, which make up the minority of the comments, are set in Emacs to boldly highlight themselves in shades of red or yellow so that they jump right out at you if you are anywhere near them.
Regarding “#if ALLOW_DEVELOPER”, I cannot think of any use of this conditional. I work at Apple and this made me think of all the disclosure/NDA bullshit that we live with every day.
That conditional is there to ensure all editor code is compiled out in builds that are not used by the developers or beta testers of the game. In the development version of the game, ALLOW_DEVELOPER is defined to be 1, and all the editor code and other developer tools are built into it. In the shipping version of the game, ALLOW_DEVELOPER will be defined to be 0, so all the tools are stripped out. That’s really the only reason it’s there. It helps make the shipping executable smaller on platforms that are resource-constrained (like the one you guys ship, the iPad!), and also prevents users from inadvertently invoking developer commands that would ruin their play experience (like popping up a console or launching the editor).
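As a sketch of how this kind of guard works (the function and the default value here are illustrative, not from the Witness codebase; in a real build the value would come from the compiler command line, e.g. -DALLOW_DEVELOPER=0):

```cpp
// Sketch of an ALLOW_DEVELOPER-style guard that strips tool code from
// shipping builds at compile time.
#ifndef ALLOW_DEVELOPER
#define ALLOW_DEVELOPER 1 // development-build default, for this sketch only
#endif

bool can_open_editor(void)
{
#if ALLOW_DEVELOPER
    return true;  // developer build: editor code is compiled in
#else
    return false; // shipping build: the editor code does not exist at all
#endif
}
```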
What’s happening under the hood with “Resource_Ptr”?
My rather disappointing answer here is “I have no idea”. That is part of the rendering system, which to the best of my knowledge is the only system in The Witness that uses these sorts of things. The game and editor portions of the code don’t use wrapped pointers as far as I remember, although I may just be forgetting (it’s been a while since I was editing any code in The Witness that wasn’t my own, and I never use wrapped pointers myself). Ignacio Castano is probably the right person to ask, and he also has a blog, so I will point him at this question and he may write up a more complete answer for you!
Update: Ignacio wrote in with a more complete answer. He said:
“Resource_Ptr is just an auto_ptr that invokes release(owner, resource) at the end of the scope. The resource is usually obtained with acquire_*(owner, name), so it’s just a way of not forgetting to invoke release after the resource is not being used anymore. Similar to the Free_On_Return/Delete_On_Return macros, that create such an object under the hood.
Why is it so verbose? I don’t really know. I think a better solution would be to have Resource_Ptr take the entity owner and name directly, but the acquire_* functions are not parameterized based on the resource type.
I would have just used intrusive reference counting for all resources and a more general acquire/release mechanism, but Jon was opposed to that at the time, so when we really needed to add referenced counting for streamed resources it was done in a somewhat ugly way.“
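Based on Ignacio’s description, the pattern is essentially a scope guard. A minimal illustrative sketch, with hypothetical acquire/release functions standing in for the real acquire_*/release API:

```cpp
// Hypothetical stand-ins for the real owner type and acquire/release calls.
struct Owner { int live_count = 0; };

int *acquire_resource(Owner &owner) { ++owner.live_count; return new int(0); }
void release_resource(Owner &owner, int *r) { --owner.live_count; delete r; }

// The Resource_Ptr idea: a wrapper whose destructor calls release(owner,
// resource) at the end of the scope, so callers cannot forget to release.
struct Scoped_Resource
{
    Owner &owner;
    int *resource;
    Scoped_Resource(Owner &o) : owner(o), resource(acquire_resource(o)) {}
    ~Scoped_Resource() { release_resource(owner, resource); }
};

int live_after_scope(Owner &owner)
{
    {
        Scoped_Resource r(owner); // released automatically when scope ends
    }
    return owner.live_count;
}
```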
Why not just call thick_line in the loop at line 533?
Purely an oversight, I believe. I probably pulled out the thick-line code and forgot to collapse that case.
Why use qsort? std::sort can do inlining and so is usually faster.
I use the sort that is available in the codebase I’m in. In my own code, I have my own sorting solutions that involve neither of these two functions. But in foreign codebases, I must use what is available. The Witness codebase does not use the C++ standard library as far as I know, so I did not want to introduce any new dependencies there. Thus, std::sort was out of the question, and qsort was the only existing option of which I was aware.
I guess you are not using STL in The Witness. Is there a reason for this programming choice?
I’m not sure if there was a reason or not, since it was decided long before I arrived on the project. However, I can tell you from personal experience that I think the STL is total poopsauce and I never use it in any of my projects either. I find that it puts a high level of unnecessary stress on the compiler pipeline and results in a number of unpleasant consequences for projects that use it. These include ugly and inflexible syntax, long compile times, hard-to-understand error messages, flakey debug symbols, bloated symbol files, and even just flat-out bugs (back in the 90s, when I still used the STL, I recall finding more than one bug in Microsoft’s implementation, for example). Super bad-ass compiler teams like the Clang folks have been able to fix a lot of these problems, but flaccid compilers like Microsoft Visual C++ and GNU C++ seem to flail at the STL even today, after two decades of development.
In short, I have nothing positive to say about the STL and think it, combined with C++’s template specification, form one of the worst attempts at generic programming I’ve ever seen. This saddens me greatly because I personally love generic programming and would have loved it if anyone on the C++ committee had understood it well enough to deliver real support for it in the language and standard library. As it is, in my own codebase, I make extensive use of old-school metaprogramming to avoid resorting to depravities like C++ templates and the STL.
What’s the difference between Foreach and Array_Foreach? Why do they even exist?
These are Jon and Ignacio’s special sauces, and I’m not really qualified to comment on them. The impetus is straightforward: to not have to type full for() syntax when you want to iterate over something. Array_Foreach() was the first one defined in The Witness codebase, well before I arrived, and was used in most places. It did not support scoping the variable inside the loop, as you can see in the code. Foreach() came later, while I was on the project, and was a replacement that allows you to scope the variable inside the loop. There’s nothing particularly interesting about either of them.
In my own codebase, my preprocessor generates all possible for loops over compound objects as macros, so my personal code looks similar, but slightly different. It looks like:
{for(ForEachEntityInWorld(Entity, World))
{
    DrawTheEntity(Entity);
}}
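For anyone curious what a scoping Foreach-style macro might look like under the hood, here is one possible definition. This is a guess at the general technique, not the actual Witness implementation, which Jon and Ignacio wrote:

```cpp
#include <cstddef>
#include <vector>

// One way to build a Foreach macro that scopes its variable inside the loop:
// an outer loop over indices, and an inner one-iteration loop that declares
// the element pointer. The inner loop body runs exactly once per index.
#define Foreach(type, var, array) \
    for(size_t var##_index = 0; var##_index < (array).size(); ++var##_index) \
        for(type *var = &(array)[var##_index]; var; var = 0)

int sum_all(std::vector<int> &values)
{
    int total = 0;
    Foreach(int, value, values)
    {
        total += *value;
    }
    return total;
}
```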
Are the EDITOR_VAR/EDITOR_VAR_INDEXED/EDITOR_VAR_SUBNAME macros some kind of serialization system? How do they work?
Yes, they are for serialization. They are a complete hack that I added.
When I came on the project, The Witness had no concept of persistent editor state. Things like the placement of panels and user input preferences were not stored, so artists had to reset things when they started work for the day. I wanted to fix this, so I asked Jon how he would like editor settings to be stored. He said he preferred they be stored via the game’s variable serialization system, so that’s what I did.
The definitions of the EDITOR_VAR macros are:
// NOTE(casey): To use the EDITOR_VAR macros, you need to define EDITOR_PANEL_NAME
// in your file so it can be auto-prepended
#define EDITOR_VAR(member, default_value) member = default_value; \
    editor_variable_service->attach(EDITOR_PANEL_NAME "_" #member, &member);
#define EDITOR_VAR_PTR(ptr, name) editor_variable_service->attach( \
    EDITOR_PANEL_NAME "_" name, ptr);

#define EDITOR_VAR_INDEXED(ptr, name, index) \
{ \
    char *composite_name = mprintf("%s_%d_%s", EDITOR_PANEL_NAME, index, name); \
    editor_variable_service->attach(composite_name, ptr); \
    gamelib_free_string(composite_name); \
}

#define EDITOR_VAR_SUBNAME(ptr, name, index, sub_name) \
{ \
    char *composite_name = mprintf("%s_%d_%s_%s", EDITOR_PANEL_NAME, index, name, sub_name); \
    editor_variable_service->attach(composite_name, ptr); \
    gamelib_free_string(composite_name); \
}
These macros allow any panel that wants to save its state to do so in a relatively simple way. It is not as nice as a fully automated system, but it was something I could get working in a few hours. It seems to have held up well over the course of the project, and to the best of my knowledge it is still being used to store all editor state.
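Here is a sketch of how a panel might use such a macro, with a stand-in service that just records name-to-pointer mappings. The real editor_variable_service does the actual serialization; everything else here is illustrative:

```cpp
#include <map>
#include <string>

// Stand-in for the real service: records name -> pointer so that saved
// values can later be written back by name.
struct Editor_Variable_Service
{
    std::map<std::string, float *> attached;
    void attach(std::string name, float *value) { attached[name] = value; }
};

static Editor_Variable_Service the_service;
static Editor_Variable_Service *editor_variable_service = &the_service;

// Mirrors the real EDITOR_VAR: sets the default, then registers the member
// under a name auto-prepended with the panel name.
#define EDITOR_PANEL_NAME "lister"
#define EDITOR_VAR(member, default_value) member = default_value; \
    editor_variable_service->attach(EDITOR_PANEL_NAME "_" #member, &member);

struct Example_Panel
{
    float scroll_y;
    Example_Panel()
    {
        EDITOR_VAR(scroll_y, 0.0f); // registers as "lister_scroll_y"
    }
};
```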
Why is selected_hash not a member of Lister_Panel? Lister_Panel::update deletes it every frame, so if you have two panels you’ll be in trouble.
As hinted by the TODO above it, the reason for this is that I wanted to (at some point) move the hash out into the selection system itself, since lots of panels do lookups on the selection and they were all getting pushed through a very slow O(n) path. So if there were going to be multiple Lister panels, I would finish that work, since that is the superior option and pays additional dividends. There is no need for each panel to store its own hash, since the selection is universal and it would be exactly the same in each.
As to your suggestion that I would be “in trouble” with multiple panels, I do not see why. Multiple panels would just result in a bunch of redundant work done by the second panel for no reason, but it wouldn’t actually yield erroneous results. The only cause for trouble would be if the two panels were running on separate threads at the same time, but The Witness’s editor is not multithreaded in that way, so that is not a concern here.
There is a magic system to do the layout in a scrollable section, but you have to specify the number of elements in a row manually. Why?
The Witness’s GUI is relatively bare-bones, and does not do deferred hit-testing. Thus the width and height of a button must be known at the time the button is drawn, so that it can be hit-tested immediately and the result returned to the caller. Scrollable sections present no problem for this, because scrolling is simply a y offset that is applied to the buttons as they are drawn. Multi-button rows, on the other hand, do present a problem. If the user didn’t pass in the number of buttons that were going to be in the row, only after completing all the button calls for that row would the UI system know how big each one should be, and at that point it would be too late to return hit-test information to the caller.
Of course, one could work around this limitation by having an EndRow() function that returns some kind of code indicating which of the buttons in the row was hit, but since it is trivial to just pass the button count, this seemed like an unnecessary complication beyond the nicer format of being able to individually if() each button call to handle its operation.
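The arithmetic makes the constraint clear: given the button count up front, each button’s rectangle is computable before any button is submitted, so hit-testing can happen immediately. A sketch, with an illustrative spacing model (the names are not from the Witness code):

```cpp
// Each button's rectangle depends only on the row dimensions, the count, and
// the button's index, so it can be computed the moment the button is drawn.
struct Button_Rect { float x, width; };

Button_Rect row_button_rect(float row_width, float spacing,
                            int button_count, int button_index)
{
    float button_width =
        (row_width - spacing * (float)(button_count - 1)) / (float)button_count;
    Button_Rect result;
    result.x = (float)button_index * (button_width + spacing);
    result.width = button_width;
    return result;
}
```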
How do you handle button/editbox/whatever id persistence from frame to frame? Is it handled at all or is the id just the index of the button in the panel?
Unless I am misremembering, there are no ids in the Witness editor GUI at all. It is a fully immediate-mode GUI. This means that button processing is not delayed to a second pass; it is processed immediately in the pass where the button is defined. IMGUI systems that require ids are usually doing things that are fancier than what the Witness’s editor GUI is doing, and thus need ids to match up UI elements between passes. Since the Witness editor GUI doesn’t do anything fancy, it can get away with not having ids at all.
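A minimal sketch of an id-less immediate-mode button in this spirit (all names here are illustrative):

```cpp
// The hit test happens in the same call that defines the button, so no id is
// needed to correlate UI elements across frames or passes.
struct Mouse_State { float x, y; bool clicked; };

bool do_button(Mouse_State &mouse,
               float x, float y, float width, float height)
{
    // (a real version would also draw the button here)
    bool inside = (mouse.x >= x && mouse.x < x + width &&
                   mouse.y >= y && mouse.y < y + height);
    return inside && mouse.clicked;
}
```

The caller just wraps each call in an if() every frame: if(do_button(...)) { /* handle click */ }.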
Style
Do you have a length/complexity/whatever threshold for breaking out a sub-function not for “semantic compression,” but for making it easier to scan the parent function?
Not really. In general, it is just “to taste”. Typically I try to keep code together in one function if it is executed in order and is not reused. This makes it clear to everyone reading the code that, well, it is executed in order and is not reused! But sometimes, if a function gets excessively large and becomes hard to understand, I will break it up into smaller functions, even though those functions will never be called by anyone else. This is primarily to leverage the cognitive convenience of substituting, say, several hundred lines of code with just a descriptive function call like “ReticulateSplines()”. This lets someone reading the code know that splines are reticulated there, and they can understand that more easily than trying to pore through the several hundred lines and look for comments to that effect.
I noticed that in passes_filter, the LISTER_FILTER_MODE_POSITIVE and LISTER_FILTER_MODE_NEGATIVE cases are almost the same, differing only by “==” versus “!=”. What do you think about collapsing the POSITIVE case into the NEGATIVE case?
For quick reference, the original code looked like this:
case LISTER_FILTER_MODE_POSITIVE:
{
    int handler_result = NOT_APPLICABLE;
    special.handler(special.param, e, 0, false, &handler_result);
    if(handler_result != NOT_APPLICABLE)
    {
        passes_special = (handler_result != 0);
    }
} break;

case LISTER_FILTER_MODE_NEGATIVE:
{
    int handler_result = NOT_APPLICABLE;
    special.handler(special.param, e, 0, false, &handler_result);
    if(handler_result != NOT_APPLICABLE)
    {
        passes_special = (handler_result == 0);
    }
} break;
And the person asking the question is asking about changing it to something like this:
case LISTER_FILTER_MODE_POSITIVE:
{
    // fall through, NEGATIVE case handles both POSITIVE and NEGATIVE
}
case LISTER_FILTER_MODE_NEGATIVE:
{
    int handler_result = NOT_APPLICABLE;
    special.handler(special.param, e, 0, false, &handler_result);
    if(handler_result != NOT_APPLICABLE)
    {
        // check filter mode matches result of filter handler
        passes_special = (handler_result == 0) == (filter_mode == LISTER_FILTER_MODE_NEGATIVE);
    }
} break;
I’m not sure I have strong feelings about either of these. Personally, I prefer the original form, because it is very clear what is going on, and there is no real intricacy or bulk to the code that would make it fragile to future updates. Collapsing it just creates opportunities for mistakes when someone reads or changes the (now more complicated) computation of passes_special. That said, if someone were to make that change and felt strongly about it, I wouldn’t object.
If the cases were longer, and had more shared code that was intricate and likely to change, I would definitely prefer collapsing the two, even at the cost of legibility. So really this is a case of the code being small enough that I prefer the duplication, but anything bigger and I would probably change my preference.
All the for loops in this code are encapsulated within a { } block. Why?
Well... yeah. That’s a “Muratori-style for loop”, so named because I am literally the only person anyone’s ever seen write a for loop this way. Nowadays, there is little or no reason to use this bracketing style. It is a habit I picked up in 1995 when working on a cross-platform codebase that had to compile under both Microsoft Visual C++ and GNU C++ (among others). At the time, GNU C++ had adopted the convention that a variable declaration inside the for’s parentheses was scoped to the loop (as per the spec), but Visual C++ did the opposite, considering the declaration to be in the outer scope. This led to a no-win situation where code written properly under GNU C++ would have multiply-defined symbol errors under Visual C++, and code written in Visual C++ would have undefined-symbol errors under GNU C++. My solution to this problem was to devise the double-bracket style you see in this code, which solved the problem rather elegantly in my opinion, but was apparently too ugly for anyone else to adopt. Later, Visual C++ fixed their compiler to have proper scoping rules, and the bracketing style was no longer necessary, but by that point I had gotten so used to doing it that I still haven’t stopped.
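Here is the style in a self-contained example, showing why it lets the same loop-variable name be reused under either compiler’s scoping rule:

```cpp
// The outer braces guarantee the loop variable is scoped to the loop on
// compilers with either scoping rule, so consecutive loops can reuse the
// same variable name without multiply-defined-symbol errors.
int sum_twice(int count)
{
    int total = 0;
    {for(int index = 0; index < count; ++index)
    {
        total += index;
    }}
    {for(int index = 0; index < count; ++index) // no redefinition error
    {
        total += index;
    }}
    return total;
}
```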
I noticed that you guys were banning all /* */ comment blocks. That’s rather odd. Any reason for this?
They are not banned, they probably just didn’t happen to get used in this file for whatever reason. I definitely do use them sometimes for longer comments, both in my own code and in the code I wrote for The Witness. My editor (GNU Emacs) handles commenting automatically, so it’s “free” for me to make long comments with //’s instead of /* */, so sometimes I just don’t notice that I have made a very long comment entirely in the former.
That said, I do tend to avoid using C-style comments for disabling code, because comments usually don’t nest properly under most compilers’ default settings. For this reason, I tend to prefer #if 0 blocks for code, since I can #if 0 a block of code that already has an #if 0 in it and it will work, whereas I cannot comment out a block of code that already has a commented-out block of code in it without breaking most compilers on their default settings.
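A small example of the nesting property:

```cpp
// #if 0 blocks nest where /* */ comments cannot: the preprocessor tracks
// matching #if/#endif pairs, so disabling code that already contains a
// disabled block just works.
int active_value(void)
{
#if 0
    int unused = 1;
#if 0
    int also_unused = 2; // nested disabled block: still fine
#endif
#endif
    return 3;
}
```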
lister_panel.h begins with a forward declaration of struct Entity_Panel, but you include “entity_panel.h” in lister_panel.cpp which presumably has the definition. Why?
To be honest, I have no idea what those forward declarations are doing there. The .h file doesn’t reference them anymore as far as I can tell. I suspect it was historical, where at one point the .h file was referring to those structs and needed them to be declared, but didn’t want to include the whole .h file with their definition.
I try never to include .h files in a .h file unless I absolutely have to, since each .h file is usually included by multiple .cpp files, so the more .h files you include from an .h file, the more that multiplies the number of files processed by the compiler for each compilation unit. This leads to higher compile times, and I hate higher compile times.
These days, in my own codebase, I actually don’t compile files separately anymore, I just include all the .cpp files into one big .cpp file and compile that. So forward declarations are rarely necessary. But in other codebases where files are compiled separately, I tend to adhere to most of the compile-time-minimizing practices I learned long ago from Lakos’ Large Scale C++ Software Design. The only one I don’t tend to do in foreign codebases is redundant include guards, but that is solely because I rarely know if there are consistently named include guards in the target .h files I’m referencing, and I don’t usually want to spend the time to exhaustively check.
Why do you use macros for constants instead of const variables/enums?
I don’t have a strong opinion about this, and I do use enums sometimes. It tends to be whatever strikes my fancy. There are reasons to use macros instead of enums, namely that there can be situations where a macro works and an enum doesn’t. The example would be assigning the value 5 to a float. If you make the 5 an enum, some compilers may emit a precision warning because you are converting an integer to a float. If you make the 5 a macro, the same compiler may not warn because it can see that 5 can be perfectly represented as a float, unlike other integers which cannot be. This may not be true for today’s compilers, which might do analysis of the enum value just like it would with an expanded macro, but I haven’t played around much with that recently so I can’t say for certain.
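A small example of the dual use described above (the names are illustrative):

```cpp
// A macro constant can initialize both an int and a float naturally: the
// literal's exact value is visible to the compiler at each use site.
#define DEFAULT_COUNT 5

int   default_count_int   = DEFAULT_COUNT;
float default_count_float = DEFAULT_COUNT; // exact: 5 is representable

enum { DEFAULT_COUNT_ENUM = 5 };
// float from_enum = DEFAULT_COUNT_ENUM; // may draw a conversion warning on
                                         // some compilers, as described above
```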
As for why I might want a constant to be assignable to both a float and an int, this is typical of my thinking about programming. It is the opposite of how most C++ programmers think. I usually prefer the option that supports the widest possible code use, whereas most C++ programmers seem to prefer the option that supports the narrowest. They are all about “protected” and “private” and “const” and so on, because those language features restrict the number of ways in which a piece of code can be used. But I prefer to make code that can be used in the widest number of ways, and so I tend to prefer practices that are as permissive as possible. Had I grown up in the sixties, I probably would have been a happy LISP programmer :)
The reason I differ in opinion here is that I tend to find that the primary development cost in codebases lies in working out the interoperation of large amounts of code, not in the debugging of that code. I think people often perceive the cost of debugging to be much higher than it actually is, probably because it is not much fun to do. I believe many modern programmers spend an inordinate amount of time using things like “const” and “private” to try to prevent possible misuses of a piece of code, all the while making the code less usable in a number of circumstances that would have worked just fine, thus costing even more development time than the time already spent “designing in” these restrictions.
Unfortunately, who is right and who is wrong about these sorts of things is not really a question anyone can definitively answer because we don’t have the kind of metrics that we would need to accurately assess it. I can’t say for sure that my way is the right way. But I would definitely encourage people to think deeply about every programming practice they currently do in the name of bug prevention and consider how much debugging time it is actually saving, and to jettison those for which there is no strong evidence for an overall time savings.
Circumstances
You wrote, “this is also editor-only code, which means it is not meant to be used in a released product.” Does that mean that there’s an entire codebase dedicated uniquely to the editor?
Not exactly. There is a directory full of editor-only code, but it is something that adds on to the game’s source code rather than replacing it entirely. The point of my comment was just to indicate that lister_panel.cpp, along with all the other editor-specific add-ons, does not get compiled into the shipping version of the game, and thus does not have to be as robust as code that users will interact with directly.
Do you purposefully avoid writing about the IM-ness of the UI to make it seem like it’s something ordinary?
No, I avoid writing about it because I didn’t write it :) In general, I try to avoid talking about code I didn’t write, because I don’t feel like I will be able to accurately represent the thought process that went into crafting the code.
Jon wrote the GUI for the Witness editor. It is a very old GUI codebase written before Braid that Jon has used on a number of projects. He would be the appropriate person to write about it, since he’s the only one who knows why he made the decisions he made and why it works the way it does.
Do you develop on Linux? I ask because I would love to see an article on the blog about your workstation, the specs of the PC (Mac?), OS, (IDE, editor) whatever.
I am often on Linux and I do use it for programming my own code. That said, though I did get The Witness compiling under Linux, the majority of the Witness development I did was on a Windows machine. At the time when I got things building on Linux, Ignacio and Andy hadn’t yet finished the port of The Witness to OpenGL. Since Linux doesn’t have Direct3D support (apart from Wine), I didn’t try to get things running completely. I just set up a dedicated Windows machine for development while working on The Witness and used that.
For my own code, I try to keep everything running on at least two platforms at all times, so I have Linux on all my laptops and Windows on my (much older) desktop. I use Emacs as my editor, so that is the same on both platforms, as is cmirror, my custom-made sync and backup utility. I do not use a build system, I just have a .bat on Windows and a shell script on Linux. For debugging, I use Microsoft Visual Studio on Windows, which sucks, and QtCreator on Linux, which somehow sucks even more. Sometimes I think they get together and have competitions to see who can make the shittier debugger, and QtCreator comes out on top, but it’s neck-and-neck right to the finish.
Once the RAD/Valve debugger project is functional for CPU debugging, I will switch to using that for debugging on both platforms, and also probably deprecate my Windows machine to a testing box. I don’t have much interest in supporting Windows as a development platform anymore because Windows 8 demonstrated very clearly to me that Microsoft is not a trustworthy shepherd, and that the PC risks long-term collapse as a platform if they are allowed to continue unchecked.
On the whole, I am very interested in trying to make alternatives like Linux viable. I’m not sure how realistic that is, given the fractured nature of Linux and its hardware compatibility difficulties. But I feel like I should at least be trying to help, rather than sitting idly by while Microsoft, through a combination of boundless selfishness and growing incompetence, flushes my home computing platform straight down the drain.
It would be great to read about cases where OOP is the proper way to go. You’ve made it clear where not to use OOP, but where should we use it?
Honestly, I feel like the answer is “in some other language”. My position on object-oriented programming is generally that it does not work well in C++ because C++ does not have any of the features you might want to allow you to get a benefit out of thinking of things in terms of objects.
To expand on that answer a little bit: there are two reasons that I think object-oriented programming usually doesn’t work. The first reason is that I think people try to apply the methodology to a wide variety of circumstances, when really it is only appropriate for a very narrow set of circumstances. But the second reason is that C++ doesn’t have good object-oriented programming support, so even if you did find a circumstance where object-oriented design was going to be optimal, you would still end up with bad results if you tried to use C++ to do the actual implementation.
I would say that if you are programming in C++, you should basically never use object-oriented programming. If you are programming in some other language, perhaps there are times when object-oriented programming might be a good idea. Unfortunately, since I have thirty years of experience programming C and C++, and only a small fraction of that experience in any other language, I feel it would be out of place for me to speculate on anything specific, since I really can’t speak from experience in any of the more traditional object-oriented languages.
And, just to make sure the point isn’t lost, as I said in the previous articles, the phrase “never use object-oriented programming” doesn’t mean that I don’t think you will have data structures in your code that can have multiple types, or which might contain extensions or different data members based on some notion of their type. Instead, what I mean is that you should never be thinking in terms of “objects” and “members” and “encapsulation”. You should be thinking strictly in terms of the actual code and the algorithms that you are creating, and then you should build the backing data to fit those algorithms. If something like an object should then arise, that is totally fine. But at least as far as C++ is concerned, never start by thinking in terms of objects! It is always the wrong approach.
- Casey Muratori
2014 July 23
Site design and technology © Copyright 2005-2014 by Molly Rocket, Inc., All Rights Reserved.
Contents are assumed to be copyright by their individual authors.
Do not duplicate without their express permission.