The 1990s brought three pillars: HTML, CSS and JavaScript (JS).
Whenever we visit a website we receive an HTML response, referencing or embedding CSS and JS.
Our web browser renders the HTML and CSS immediately,
and executes the JS to provide interactivity.
Although HTML, CSS and JS are separate standards,
it is now common to generate both the HTML and the CSS using JS.
This is possible via Node.js Server-Side Rendering and CSS-in-JS, respectively.
They counter the fundamental asymmetry between the initial state of the document (determined by HTML and CSS) and all subsequent states (the orbit of JS executions).
JavaScript can perform arbitrary computation, but its central purpose is DOM mutation.
However, manipulating the DOM directly with ad hoc JS tends to produce spaghetti code.
To surmount this, developers invented the notion of a JavaScript component, instantiated via XML tags.
If the JavaScript component is called MyComponent, the associated tag will be <MyComponent /> or perhaps <my-component />.
Intuitively, HTML is extended with these custom tags which ultimately unwind into plain old HTML.
Competing notions of JavaScript component exist in the wild.
One popular approach is React function components.
They are just JavaScript functions with constraints on their parameters and return value.
They have a single parameter conventionally called props.
It is a JavaScript Object defining the component's named inputs,
and possibly special properties like children, key and ref.
They must return either null or a virtual DOM node.
This returned value ultimately unwinds to an HTML fragment,
and may depend on the component's props and internal state (via hooks).
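As a minimal sketch (the component name, prop and markup are illustrative, not taken from this site):

```jsx
// A React function component: one `props` parameter,
// returning either null or a virtual DOM node.
function Greeting(props) {
  if (!props.name) {
    return null; // nothing to render
  }
  return <div className="greeting">Hello, {props.name}!</div>;
}
```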
React developers use a grammatical extension of JavaScript called JSX.
We already mentioned the idea of extending HTML with custom XML tags.
JSX goes further by reinterpreting XML inside a grammatical extension of JavaScript.
Consider an example, a pan/zoomable grid (also on CodeSandbox).
The file panzoom/PanZoom.jsx (see tab above) defines two React function components, PanZoom and Grid.
Behaviourally:
- PanZoom renders an SVG containing children (an image provided in SvgPanZoomDemo) and <Grid />. Over time it adjusts the SVG viewBox in response to mouse/pointer events.
- Grid renders part of an SVG, i.e. two grid patterns, repeating squares of size 10x10 and 60x60 in abstract SVG user units.
The above JS functions both have a single parameter props.
Moreover, they both return something which looks like HTML but isn't.
Then what does the XML-like value returned by PanZoom actually mean?
What are React function components actually returning?
Here's a whirlwind overview.
React devs use a grammatical extension of JavaScript called JSX, permitting XML syntax.
React applications are built by composing together React function components. A typical React function component will return XML syntax referencing one or more other components.
Dev tools convert JSX into JS by replacing XML tags with invocations of React.createElement (see example below).
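For instance, a JSX fragment like this (an illustrative example, not taken from PanZoom.jsx):

```jsx
const node = (
  <div className="panel">
    <MyComponent label="hello" />
  </div>
);
```

is converted by the dev tools into plain JS:

```jsx
const node = React.createElement(
  'div',
  { className: 'panel' },
  React.createElement(MyComponent, { label: 'hello' }),
);
```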
This website actually uses Preact, a React alternative with the same API.
Then React.createElement resolves to Preact's own createElement, which creates Preact virtual DOM nodes.
The root component of an application is usually called App.
Running a React application means invoking ReactDOM.render
with two arguments: <App/> and a DOM node el. See how we bootstrap examples on CodeSandbox.
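A typical bootstrap looks something like this (the module path and mount element id are illustrative):

```jsx
import ReactDOM from 'react-dom';
import App from './App';

// Mount the application at an existing DOM node.
const el = document.getElementById('root');
ReactDOM.render(<App />, el);
```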
ReactDOM.render initially converts <App/> into a DOM node mounted at el.
Later a subcomponent may "re-render", recursively recreating a virtual DOM node.
The new node is diffed against the previous one, and only the difference is applied to the DOM.
When React renders a component, it invokes the respective function.
The return value of the function is a JavaScript representation of a DOM subtree.
This representation is usually referred to as "Virtual DOM".
React compares this JavaScript value to the previous one, and patches the DOM accordingly.
If many components change in a small amount of time, some renders are automatically avoided: when an ancestor and a descendant both need to re-render, rendering the ancestor already re-renders the descendant, so the descendant's separate render is skipped.
Developers can also avoid recreating a particular virtual DOM subtree using React.memo.
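A sketch of React.memo (the component and prop here are illustrative): the memoised component skips re-rendering while its props are unchanged.

```jsx
const MemoisedGrid = React.memo(function Grid({ size }) {
  // Only re-runs when `size` changes (by default, a shallow props comparison).
  return <pattern id="grid" width={size} height={size} />;
});
```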
But for many websites, the virtual DOM manipulations are neither too large nor too frequent, and React developers may simply ignore their overhead.
However, we are making a realtime video game.
We want to control the rendering as much as possible, to ensure good performance and aid debugging.
If we allowed React (actually, Preact) to render in response to user interaction, we'd lose this control.
Take another look at panzoom/PanZoom.jsx.
PanZoom returns an <svg/> with a viewBox attribute determined by state.viewBox.
When a user zooms via mousewheel, the event handler state.onWheel updates state.viewBox.
But updating this variable does not automatically update the virtual DOM.
Usually one would trigger a re-render, so that PanZoom returns <svg/> with the updated viewBox, and the DOM-diffing algorithm does the update.
But how do we trigger a re-render?
A React function component is rendered whenever an ancestor is (modulo React.memo), or if its internal state changes. Internal state is represented using the React.useState hook e.g.
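```jsx
// `initialValue` is a placeholder; it becomes the state's first value.
const [value, setValue] = React.useState(initialValue);
```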
These declarations cannot be nested, must occur at the "top-level" of the React function component, and must always execute in the same order.
This induces a well-defined association with their enclosing component.
To change state we execute setValue(nextValue), e.g. in response to a click. If nextValue differs from value, the call to setValue causes the component to re-render, where now React.useState(...)[0] provides the new value.
This propagation of internal state is possible because a component's hooks must always execute in the same order.
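For example, a minimal stateful component (purely illustrative):

```jsx
function Counter() {
  const [count, setCount] = React.useState(0);
  // Clicking sets new state, so React re-renders Counter with the new count.
  return <button onClick={() => setCount(count + 1)}>{count}</button>;
}
```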
In panzoom/PanZoom.jsx, the variable state is the value of an internal state variable i.e. deconstructed from a React.useState hook. Observe that we do not deconstruct the setter (setValue in the terminology of the previous section).
Why?
Because we decided to never inform React we've changed state, despite mutating it on mouse and pointer events.
Instead we directly mutate the DOM, along the lines of the following sketch (the exact code lives in panzoom/PanZoom.jsx, so the names here are illustrative):
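```js
// Illustrative only: svgEl is the rendered <svg>, state.viewBox its desired value.
// We set the attribute directly, without telling React/Preact anything changed.
svgEl.setAttribute('viewBox', `${state.viewBox}`);
```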
Then as far as React is concerned, nothing has changed.
Furthermore if React renders the component for another reason, it'll use the mutated state to set the viewBox attribute (producing no change).
But why not just use a setter setState?
Because otherwise we'd recompute children and <Grid /> whenever the player pans or zooms.
Our game may contain many elements, and we'd rather not needlessly recompute their virtual DOM tens of times per second.
Finally, we further justify our somewhat strange usage of React.useState i.e. sans the setter.
Since we only ever mutate the state, React.useRef may seem more suitable.
However, there's something special about React.useState.
Whilst working in a suitably tooled development environment, it is possible to textually edit React components without losing the internal state of their instances (the deconstructed values of React.useState hooks).
See this in action by editing one of our CodeSandboxes.
This important devtool is known as react-refresh (see also preact/prefresh).
It will help us develop sophisticated Game AI.
The Last Redoubt will present a bird's-eye viewpoint of the interior of starships.
The crew will have tasks, such as manning the bridge, patrolling the decks, and monitoring low berths.
These behaviours will be constrained by e.g. sleep patterns, the behaviour of others, and hardware failures.
But how do video games implement these behaviours?
Well, there are three standard systems:
- Navigation: finding and following collision-free paths.
- Animation: depicting what NPCs are doing over time.
- Physics: collision detection and force-based motion.
Navigation is of central importance to us, and will be discussed shortly.
As for animation, we won't obsess over realism,
but we'll need visual cues to indicate NPC actions.
We also want a sense of flow, achieved via interdependent concurrent animations.
As for a physics engine, we mentioned we won't be using one. In fact:
- Collision detection will be handled at the level of navigation.
- Force-based motion will be simulated via the Web Animations API (see the sketch below).
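As a taste of the Web Animations API, here is a minimal sketch (the element and keyframes are illustrative) which eases an element towards a point, loosely imitating decelerating motion:

```js
const npcEl = document.querySelector('.npc'); // illustrative selector
npcEl.animate(
  [
    { transform: 'translate(0px, 0px)' },
    { transform: 'translate(120px, 40px)' },
  ],
  { duration: 600, easing: 'ease-out', fill: 'forwards' },
);
```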
In the rest of this article we'll discuss Navigation and Raycasting in detail.
To move an NPC from A to B, we need a respective path.
This might simply be a straight line e.g. when an item is directly within grasp.
However, usually there are objects to be avoided: static ones like walls, dynamic ones like NPCs.
Sans dynamic objects, a canonical approach exists.
The navigable area is represented by polygons (possibly with holes),
with A and B inside them.
These polygons can be triangulated i.e. partitioned into triangles with disjoint interiors.
Thin or large triangles can be avoided via Steiner points.
The triangulation induces an undirected graph.
A Navgraph is an undirected graph whose
nodes are the triangles of the provided triangulation.
Two nodes are connected if and only if their respective triangles share exactly one edge.
For example, the grey triangles below collectively induce the red navgraph.
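A minimal sketch of navgraph construction, assuming each triangle is given as a triple of vertex ids (the function and variable names are illustrative):

```js
function buildNavGraph(triangles) {
  const edgeKey = (u, v) => (u < v ? `${u}-${v}` : `${v}-${u}`);
  const edgeToTris = new Map(); // shared edge -> indices of triangles containing it

  triangles.forEach(([a, b, c], i) => {
    for (const [u, v] of [[a, b], [b, c], [c, a]]) {
      const key = edgeKey(u, v);
      const tris = edgeToTris.get(key) || [];
      tris.push(i);
      edgeToTris.set(key, tris);
    }
  });

  // Two nodes (triangles) are connected iff they share an edge.
  const succ = triangles.map(() => []);
  for (const tris of edgeToTris.values()) {
    if (tris.length === 2) {
      const [i, j] = tris;
      succ[i].push(j);
      succ[j].push(i);
    }
  }
  return succ; // adjacency lists indexed by triangle
}
```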
Technically, an undirected graph is just a symmetric binary relation.
We have made it concrete by depicting each node as the centroid of its respective triangle.
This is a standard convention, although triangles have more than one notion of center.
It provides a weight for each edge i.e. the distance between the centroids.
Then the length of a path through the undirected graph may be defined as the sum of its edges' weights.
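Concretely, a sketch of this edge weight (assuming `vs` is the vertex array and triangles are triples of vertex ids, as in the earlier sketch):

```js
// Centroid of a triangle, given vertex ids and the vertex array.
const centroid = ([a, b, c], vs) => ({
  x: (vs[a].x + vs[b].x + vs[c].x) / 3,
  y: (vs[a].y + vs[b].y + vs[c].y) / 3,
});

// Edge weight: Euclidean distance between the two triangles' centroids.
const weight = (triA, triB, vs) => {
  const p = centroid(triA, vs), q = centroid(triB, vs);
  return Math.hypot(p.x - q.x, p.y - q.y);
};
```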
So, how to find a path from A to B?
Given A and B we have two triangles (maybe equal), so two nodes, hence may apply A* using our chosen edge weights (distance between centroids).
This quickly provides a solution i.e. a path.
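Here is a minimal A* sketch over the navgraph. It assumes `succ[i]` lists the triangles adjacent to triangle i (see the earlier sketch), `centroids[i]` is the precomputed centroid {x, y} of triangle i, and `src` and `dst` are the triangle indices containing A and B; all names are illustrative.

```js
function findPath(succ, centroids, src, dst) {
  const dist = (i, j) =>
    Math.hypot(centroids[i].x - centroids[j].x, centroids[i].y - centroids[j].y);

  const open = new Set([src]);
  const cameFrom = new Map();
  const g = new Map([[src, 0]]); // best known cost from src
  const f = new Map([[src, dist(src, dst)]]); // g + heuristic

  while (open.size) {
    // Linear scan for the open node with least f (a priority queue would be faster).
    let current = [...open].reduce((a, b) => (f.get(a) <= f.get(b) ? a : b));
    if (current === dst) {
      const path = [current];
      while (cameFrom.has(current)) path.unshift(current = cameFrom.get(current));
      return path; // triangle indices from src to dst
    }
    open.delete(current);
    for (const next of succ[current]) {
      const tentative = g.get(current) + dist(current, next);
      if (tentative < (g.get(next) ?? Infinity)) {
        cameFrom.set(next, current);
        g.set(next, tentative);
        f.set(next, tentative + dist(next, dst));
        open.add(next);
      }
    }
  }
  return null; // no path exists
}
```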
However it is insufficient, because realistic NPCs would not follow centroid-to-centroid paths.
One can solve this by applying the string-pulling algorithm.
It pulls the zig-zag path tight along the navigable polygons' extremal points.
Drag the nodes below to see string-pulling in action.
Navigation around dynamic objects is harder.
What was once a collision-free path may no longer be.
Two officers on the bridge could be swapping shifts,
or perhaps the player needs to rush through a moving crowd.
One common approach is to combine static navigation (previous section) with steering behaviours.
They are usually implemented via a physics engine.
An NPC will be driven by its own force, plus other forces induced by the position and velocity of nearby NPCs.
However, one cannot expect the vector sum of forces to capture complex interactions between multiple characters.
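To make the force-summing idea concrete, here is a minimal steering sketch (all names are illustrative): each frame, a "seek" force towards the target is summed with "separation" forces away from nearby NPCs.

```js
function steer(npc, target, neighbours, maxForce = 1) {
  const sub = (a, b) => ({ x: a.x - b.x, y: a.y - b.y });
  const add = (a, b) => ({ x: a.x + b.x, y: a.y + b.y });
  const scale = (a, k) => ({ x: a.x * k, y: a.y * k });
  const len = (a) => Math.hypot(a.x, a.y) || 1;

  // Seek: steer towards the target position.
  let force = sub(target, npc.position);

  // Separation: steer away from each neighbour, more strongly when closer.
  for (const other of neighbours) {
    const away = sub(npc.position, other.position);
    force = add(force, scale(away, 1 / (len(away) ** 2)));
  }

  // Clamp to a maximum steering force.
  return scale(force, Math.min(1, maxForce / len(force)));
}
```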
Reynolds introduced Steering Behaviours as part of a pipeline:
action selection → steering → animation.
In practice, one must rely heavily on action selection to avoid unrealistic behaviour such as oscillation and deadlock.
There is another well-known approach: Detour, and in particular DetourCrowd, which provides a sophisticated solution to multi-character navigation.
It has been ported to JS in BabylonJS,
and also integrated into the Unreal Engine.
In Detour, a collection of NPCs is conceptualised as a Crowd.
One requests the Crowd to move individual NPCs to particular targets.
An updater function must be executed each frame.
For each NPC, its nearby neighbours are modelled as temporary geometry, influencing the NPC's velocity.
We'll have more to say about this impressive open source library.