Releases: bonsairobo/building-blocks
building-blocks v0.7.0
Development Status Update
v0.7.0 is a modest release, but it does have some major API changes that I wanted to get out before taking a deep dive into any new features.
Release Highlights
Removal of ChunkPyramid
You might remember that a ChunkPyramid
was added in v0.6.0 for supporting chunk downsampling. Well, it's gone now! Instead, you can just use a ChunkMap
for everything that the ChunkPyramid
used to do. This just means that now chunk storages are keyed on ChunkKey
, which contains a level of detail.
As a result, now many of the ChunkMap
methods have changed to require an lod: u8
parameter. And the access trait impls have moved onto a wrapper called the ChunkMapLodView
. This means you can do all of the normal access trait stuff on a single LOD at a time.
map.lod_view(0).for_each(...);
Along with this change, I made it so that the ChunkDownsampler
can support multichannel Src
and Dst
chunk types. There aren't currently any users of this feature, but it should be as simple as combining some existing ChunkDownsampler
s and applying one to each channel.
borrow_channels
After using the new multichannel arrays for some time, it became apparent that we need a more convenient way of interacting with just a subset of channels. Technically, you could use the TransformMap
for this, but it wasn't the most ergonomic option. Now, Array
has two new methods: borrow_channels
and borrow_channels_mut
. They can be used like so:
let mut a = Array3x3::fill(extent, (0.0, 0, 'a'));
// Type annotation not required, just used for educational purpose.
let mut borrow: Array3x2<char, f32, &mut [char], &mut [f32]> = a.borrow_channels_mut(|(f, _, c)| (f, c));
And then borrow
is a single channel array that can be used like any other. It implements all of the access traits.
Lockstep array iteration and downsampling perf improvements
Some improvements have been made to the Array
iteration code. Now, via the ArrayForEach
and LockStepArrayForEach
types, you can iterate over the strides for two arrays in lockstep. They also support a step
parameter which allows for stepping through arrays at different intervals, which is useful when sampling from arrays at different resolutions. These new types have been used to rewrite the PointDownsampler
and SdfMeanDownsampler
, giving a 2x perf improvement in the bechmarks.
lod_terrain
example works with blocky and smooth voxels
Now you can run the lod_terrain
example with either blocky or smooth voxels. You just need to provide a CLI argument of "blocky" or "smooth".
Also, all of the examples have moved into their own crate outside of the root workspace, so you need to cd examples
before running them. This was done to avoid pulling in bevy as a dev-dependency when building tests and benchmarks.
Other Changes
Additions
FillExtent
traitdelete
andpop
methods forChunkWriteStorage
Modifications
- bug fixes in the
clipmap
module; several unit tests added collision
module of thebuilding_blocks_search
crate had some API renamesOctreeSet
now respectsVisitStatus::Stop
in post-order traversalsOctreeSet
method names have a new "fat" vs "thin" leaf concept- renamed
OctreeDBVT
toOctreeDbvt
Benchmark Results
These results come from running cargo bench --all
on my PC with an Intel i5-4590 3.3 GHz CPU, using the stable rust v1.52 toolchain and thin LTO enabled. All benchmarks are single-threaded.
SHOW RESULTS
greedy_quads_terrace/8 time: [9.4202 us 9.4465 us 9.4744 us]
greedy_quads_terrace/16 time: [53.374 us 53.545 us 53.735 us]
greedy_quads_terrace/32 time: [388.80 us 389.58 us 390.43 us]
greedy_quads_terrace/64 time: [3.0510 ms 3.0583 ms 3.0667 ms]
height_map_plane/8 time: [725.52 ns 726.56 ns 727.64 ns]
height_map_plane/16 time: [2.8329 us 2.8407 us 2.8496 us]
height_map_plane/32 time: [11.822 us 11.853 us 11.890 us]
height_map_plane/64 time: [49.684 us 49.964 us 50.270 us]
surface_nets_sine_sdf/8 time: [15.828 us 15.952 us 16.082 us]
surface_nets_sine_sdf/16
time: [166.98 us 167.42 us 167.92 us]
surface_nets_sine_sdf/32
time: [1.1799 ms 1.1831 ms 1.1866 ms]
surface_nets_sine_sdf/64
time: [9.9171 ms 9.9553 ms 9.9949 ms]
sphere_surface/8 time: [1.2471 us 1.2492 us 1.2514 us]
sphere_surface/16 time: [11.169 us 11.185 us 11.202 us]
sphere_surface/32 time: [98.618 us 98.776 us 98.944 us]
flood_fill_sphere/16 time: [32.166 us 32.220 us 32.280 us]
flood_fill_sphere/32 time: [268.58 us 269.05 us 269.52 us]
flood_fill_sphere/64 time: [1.9838 ms 1.9876 ms 1.9916 ms]
array_for_each/16 time: [1.7152 us 1.7187 us 1.7226 us]
array_for_each/32 time: [17.812 us 17.880 us 17.947 us]
array_for_each/64 time: [150.12 us 150.82 us 151.61 us]
array_for_each_point/16 time: [2.4739 us 2.4797 us 2.4858 us]
array_for_each_point/32 time: [22.124 us 22.171 us 22.220 us]
array_for_each_point/64 time: [192.28 us 192.86 us 193.58 us]
array_for_each_point_and_stride/16
time: [3.9797 us 3.9892 us 4.0013 us]
array_for_each_point_and_stride/32
time: [35.803 us 35.887 us 35.981 us]
array_for_each_point_and_stride/64
time: [304.04 us 304.63 us 305.24 us]
array_point_indexing/16 time: [10.691 us 10.707 us 10.724 us]
array_point_indexing/32 time: [100.41 us 100.87 us 101.30 us]
array_point_indexing/64 time: [835.65 us 837.50 us 839.67 us]
array_copy/16 time: [1.6250 us 1.6285 us 1.6321 us]
array_copy/32 time: [16.360 us 16.400 us 16.447 us]
array_copy/64 time: [419.59 us 420.45 us 421.35 us]
chunk_hash_map_for_each_point/16
time: [4.3827 us 4.3922 us 4.4024 us]
chunk_hash_map_for_each_point/32
time: [34.692 us 34.820 us 34.960 us]
chunk_hash_map_for_each_point/64
time: [278.21 us 279.76 us 281.59 us]
chunk_hash_map_point_indexing/16
time: [62.184 us 62.364 us 62.560 us]
chunk_hash_map_point_indexing/32
time: [481.83 us 482.77 us 483.86 us]
chunk_hash_map_point_indexing/64
time: [3.8085 ms 3.8159 ms 3.8236 ms]
chunk_hash_map_visit_chunks_sparse/128
time: [8.4282 us 8.4485 us 8.4701 us]
chunk_hash_map_visit_chunks_sparse/256
time: [179.92 us 183.50 us 187.06 us]
chunk_hash_map_visit_chunks_sparse/512
time: [1.5100 ms 1.5337 ms 1.5576 ms]
chunk_hash_map_copy/16 time: [973.24 ns 975.79 ns 978.56 ns]
chunk_hash_map_copy/32 time: [9.6658 us 9.7510 us 9.8324 us]
chunk_hash_map_copy/64 time: [82.749 us 83.486 us 84.364 us]
compressible_chunk_map_point_indexing/16
time: [72.292 us 72.756 us 73.223 us]
compressible_chunk_map_point_indexing/32
time: [547.20 us 548.71 us 550.51 us]
compressible_chunk_map_point_indexing/64
time: [4.3927 ms 4.4007 ms 4.4087 ms]
decompress_array_with_bincode_lz4/16
time: [13.079 us 13.118 us 13.165 us]
decompress_array_with_bincode_lz4/32
time: [98.359 us 98.610 us 98.864 us]
decompress_array_with_bincode_lz4/64
time: [817.73 us 821.12 us 824.67 us]
decompress_array_with_fast_lz4/16
time: [5.3044 us 5.3199 us 5.3369 us]
decompress_array_with_fast_lz4/32
time: [32.948 us 33.013 us 33.082 us]
decompress_array_with_fast_lz4/64
time: [261.91 us 263.90 us 265.97 us]
decompress_array_with_bincode_snappy/16
time: [20.258 us 20.375 us 20.485 us]
decompress_array_with_bincode_snappy/32
time: [100.17 us 100.45 us 100.80 us]
decompress_array_with_bin...
building-blocks v0.6.0
Development Status Update
Since v0.5.0, my goal has been to provide tools for implementing level of detail (LoD) and using them in building-blocks-editor. This has taken me on an unexpected detour.
For context, the voxel type in building-blocks-editor looks like this:
struct Voxel {
distance: Sd8,
type_id: u8,
}
With LoD, my plan was to downsample chunks into a ChunkPyramid
(mipmap). I had this prototyped for voxels with just a signed distance component, but this left me wondering what I should do with the type_id
. Just ignore it? Certainly it should not be treated the same by the sampler. In fact, I realized that I don't even have a good reason to downsample the type_id
at the moment. Generalizing a bit, I realized that when you have multiple voxel components, you often have workflows that only need to sample a subset of components at a time. But when your voxel is a struct
like this, you end up loading the entire thing into cache and wasting space. Of course, the lessons of the ECS paradigm and Structure of Arrays (SoA) dawned on me, and I realized that I should be treating the layout of voxel data a bit differently.
And so the main focus of this release has been what I am calling "multichannel" support. Read more below.
Release Highlights
Multichannel All the Things
TL;DR, now the Array
type uses an underlying tuple of Channel
s, each with a separate flat layout. This means that arrays with multiple channels, like Array3x2<A, B>
(an array with 3 spatial dimensions and 2 channels), are supported by multiple independent data channels, i.e. (Channel<A>, Channel<B>)
. Array
supports up to 6 channels.
// Manually create channels for educational purpose.
let ch1 = Channel::fill(0, extent.num_points());
let ch2 = Channel::fill('a', extent.num_points());
let array = Array::new(extent, (ch1, ch2));
// Or use a more convenient constructor.
let array = Array3x2::fill(extent, (0, 'a'));
Similarly, ChunkMap
leverages this feature of arrays, and there are new types like ChunkHashMap3x2<A, B>
and CompressibleChunkMap3x2<A, B>
which support the same access patterns as arrays. Here's what it looks like to use a multichannel ChunkMap
:
let ambient_values = (0, 'a');
let builder = ChunkMapBuilder3x2::new(CHUNK_SHAPE, ambient_values);
let mut map = builder.build_with_write_storage(
FastCompressibleChunkStorageNx2::with_bytes_compression(Lz4 { level: 10 }),
);
let iter_extent = Extent3i::from_min_and_shape(Point3i::fill(10), Point3i::fill(80));
assert_eq!(map.get_mut(Point3i::fill(1)), (&mut 0, &mut 'a'));
map.for_each_mut(&iter_extent, |_p, (num, letter)| {
*num = 1;
*letter = 'b';
});
let local_cache = LocalChunkCache::new();
let reader = map.reader(&local_cache);
assert_eq!(reader.get(Point3i::fill(1)), (0, 'a'));
assert_eq!(reader.get_ref(Point3i::fill(1)), (&0, &'a'));
reader.for_each(&iter_extent, |_p, (num, letter)| {
assert_eq!((num, letter), (1, 'b'));
});
As you can see, everything works like it used to, including chunk compression, access traits, and copy_extent
. You can even use TransformMap
to project your multichannel map to a subset of channels!
let projection = TransformMap::new(&reader, |(num, _): (i32, char)| num);
projection.for_each(&iter_extent, |_p, num| assert_eq!(num, 1));
Level of Detail
As I mentioned in the last release notes, level of detail is an important feature for scaling up voxel rendering solutions to large maps without wasting memory on high-resolution render resources that are far from the camera. As such, this release includes several new tools for implementing LOD.
While LOD support is very new, it might seem a little obtuse. As I continue integrating this code into the building-blocks editor, I will learn more about how the interface should be shaped. For now, the best resource for learning about this LOD code is the lod_terrain example.
ChunkPyramid
and ChunkDownsampler
The core of the LOD solution is to support downsampling chunks into lower levels of detail. The ChunkPyramid
is where these downsampled chunks can live. It can be thought of like a sparse mipmap; it is essentially just a vector of ChunkMap
s. All of the levels have the same chunk shape, but at lower levels of detail, each chunk covers a larger area of the map (hence having a lower resolution).
Pyramids can be used with any ChunkDownsampler
implementation; we currently have a PointDownsampler
, which very simply takes one point from each 2x2x2 extent, and the SdfMeanDownsampler
, which takes the mean of the signed distances in each 2x2x2 extent. However, ChunkPyramid
is currently just a single-channel storage, so you can only downsample one channel per pyramid. This may change in the future. There is a provided workaround for downsampling one channel from a multichannel ChunkMap
. You just need to provide a closure that wraps chunks in the proper TransformMap
projection. There is a test of this in the chunk_pyramid
module if you'd like to see how it's done.
OctreeChunkIndex
The recommended way of managing multiresolution data is to have a ChunkPyramid
and a corresponding OctreeChunkIndex
which tracks the set of chunks using an OctreeSet
for every "super chunk." A super chunk is essentially just a chunk of space indexed by a single OctreeSet
.
The reason for the index is for speeding up iteration over large regions of space. Rather than taking a large Extent
and checking every single overlapping chunk key (requiring a hash), you can iterate over an OctreeSet
, requiring only one hash per level of detail. It's very straightforward to construct an index by calling OctreeChunkIndex::index_chunk_map
on a ChunkMap
.
Sd8
and Sd16
Signed Distance Types
Previously most of the examples in the building-blocks repo used f32
for signed distances. However, much of the dynamic range of f32
is wasted in this particular use case, since samples of an SDF only need to represent the range [-1.0, 1.0]
of distances from the isosurface; any samples further away are not used for surface extraction.
So now we have the Sd8
and Sd16
fixed precision data types, capable of representing numbers with precision 2 / 2^8
and 2 / 2^16
respectively. This will save a lot of space over f32
s on large voxel maps.
New Examples
LOD Terrain
Now that we have preliminary support for LOD clipmaps, there is an example of this running in Bevy. And Bevy has a new WireframePlugin
, which makes this example even cooler to look at. View it here.
Array Texture Materials
A common technique for texturing Minecraft-style voxels is to use a texture atlas in the form of an "array texture." This is essentially just a 3D texture where each layer has the texture for one block type. Thanks to a PR from @malmz, we now have an example of this!
Quad Mesh UVs
The "array texture" example also brought to light an issue with how UV coordinates were being generated for a Quad
. Now Quad::simple_tex_coords
has been moved to OrientedCubeFace::simple_tex_coords
, and it supports flipping the U or V texture coordinate axes. This is useful for working with different graphics APIs, since OpenGL, Vulkan, and DirectX do not agree on the UV coordinate space.
A* Pathfinding
Thanks to a PR from @siler, the building_blocks_search
crate now has an astar_path
function for finding the optimal path between two points on a voxel grid.
Other Changes
Additions
ChunkMapBuilder
has become a trait. Feel free to implement your own builder. The providedChunkMapBuilderNxM
works for vanillaArray
chunks.OctreeSet::add_extent
andOctreeSet::subtract_extent
. These are useful for efficiently adding or deleting large regions from a chunk index.
Modifications
- For small-key hash maps, the
fnv
hasher has been replaced withahash
, which is about 30% faster in our benchmarks. This improves random access performance onChunkMap
s and also the overall performance ofOctreeSet
. - The methods on
SerializableChunks
have changed to take an iterator of(key, chunk)
pairs during serialization and fill a chunk storage on deserialization OctreeSet::empty
has becomeOctreeSet::new_empty
and we now also haveOctreeSet::new_full
- Access traits implemented on
Fn
types are now implemented on theFunc
type, which is just a newtype for wrappingFn
. This was necessary to avoid conflicting implementations.
Removals
- The
Chunk
type has been replaced with theChunk
trait. If you need extra metadata on your chunk type, you can implement theChunk
trait. In order to use a custom chunk with theCompressibleChunkStorage
, you also need to implement theCompression<Data=YourChunk>
trait. FastChunkCompression
is no longer needed and has been removed.
Benchmark Results
These results come from running cargo bench --all
on my PC with an Intel i5-4590 3.3 GHz CPU, using the stable rust v1.50 toolchain and LTO enabled. All benchmarks are single-threaded.
SHOW RESULTS
greedy_quads_terrace/8 time: [9.7631 us 9.9729 us 10.384 us]
greedy_quads_terrace/16 time: [65.749 us 66.223 us 66.648 us]
greedy_quads_terrace/32 time: [479.88 us 482.00...
building-blocks v0.5.0
Development Status Update
It's been a while since the last release in December. I hope everyone enjoyed their holidays!
Level of Detail Prototype
TL;DR: Check out this video: https://www.youtube.com/watch?v=6vSz86w3Zjk
My recent focus has been on implementing a level of detail scheme for voxel meshes. If you're not familiar with the problem, generating voxel meshes using the same grid resolution everywhere on a very large map, you will quickly run out of GPU resources, and it's very wasteful, because terrain in the distance will only take up a small number of pixels with many many triangles.
At first, I was looking into using adaptive distance fields and a recursive dual contouring algorithm for meshing. After putting a lot of time into this on the adf-experiment
branch, I wasn't pleased with the performance; it was significantly slower than the array-based surface_nets
meshing. Although I did learn a lot about meshing across varying levels of detail. Perhaps I will come back to this idea later.
My next idea was to use a clipmap of chunks at varying levels of detail, which would allow me to continue meshing arrays. This required a few new features:
- Chunk downsampling
- An octree of chunks to do optimal detection of LOD changes as the camera moves
- Transition meshing on LOD boundaries
(1) and (2) are mostly done, and you can see how they work in the lod-terrain prototype:
https://www.youtube.com/watch?v=6vSz86w3Zjk
The code for this prototype is mostly found here: https://github.com/bonsairobo/lod-terrain
It relies on the experiment-clipmap
branch of the building-blocks repo. The transition meshing is yet to be implemented, and so you can occasionally see some hairline cracks in the LOD terrain. Once I'm happy with the prototype, the plan is to integrate this LOD scheme into the building-blocks-editor
, then merge the experiment-clipmap
branch into main
to provide some code to make managing LOD simple.
Release Highlights
Octree Improvements
In order to facilitate the clipmap, many improvements have been made to the OctreeSet
data structure. It is now possible to visit sub-octants of full octants with an OctreeVisitor
, and the Location
type enables a "nested visitor" pattern, which is used to great effect in the clipmap code. This allows you to use one closure to locate a subtree, then run a separate closure on that subtree, like so:
octree.visit_branches_and_leaves(&mut |location: &Location| {
if some_condition(location) {
// Found a particular subtree, now narrow the search using a different algorithm.
location.visit_all_octants(&octree, &mut |sub_location: &Location| {
do_something(sub_location.octant());
VisitStatus::Continue
});
VisitStatus::ExitEarly
} else {
VisitStatus::Continue
}
});
ArrayN
Generic Storage
Arrays can now be created with any kind of slice storage that implements Deref<[T]>
or DerefMut<[T]>
.
let mut data = [1; 64 * 64 * 64];
let extent = Extent3i::from_min_and_shape(Point3i::ZERO, PointN([64; 3]));
let mut stack_array = Array3::new(extent, &mut data[..]);
More PointN
Ops
Point2i
and Point3i
now support some vectorized logical operations, like Shl
, Shr
, BitAnd
, BitOr
, BitXor
, Not
, etc.
sdfu
Integration
The signed_distances
module has been superceded by an integration with the sdfu
crate. Now you can create complex SDFs and easily sample them:
use building_blocks::core::sdfu;
let sdf = sdfu::Sphere::new(0.45)
.subtract(sdfu::Box::new(PointN([0.25, 0.25, 1.5])))
.union_smooth(
sdfu::Sphere::new(0.3).translate(PointN([0.3, 0.3, 0.0])),
0.1,
)
.union_smooth(
sdfu::Sphere::new(0.3).translate(PointN([-0.3, 0.3, 0.0])),
0.1,
);
Amanatides and Woo Ray Casting
Because it was simple and I thought some people might want to use it, I implemented the algorithm from this paper. Now you can easily traverse a grid along a given ray like so:
let mut traversal = GridRayTraversal3::new(start, velocity);
while some_condition(traversal.current_voxel()) {
traversal.step();
}
Change Log
Additions
- The traits in
sdfu::mathtypes
have been implemented forPoint3f
.sdfu
is now the recommended solution for constructing complex SDFs. See the examples. GetRef
andForEachRef
traits which provide access to&T
in a lattice mapGridRayTraversal2
andGridRayTraversal3
are implementations of the Amanatides and Woo algorithm- New APIs for octrees:
OctreeSet::add_extent
(TODO:subtract_extent
)OctreeSet::visit_all_octants
Octant::visit_self_and_descendants
von_neumann_flood_fill3
is a slightly optimized 3D flood fill using scanline searchgreedy_quads
now supports transparent voxels via theIsOpaque
trait- Several math operations have been implemented for
PointN
, includingMul<Self>
(and made scalar multiplication symmetric)Rem
Shl
,Shr
BitAnd
,BitOr
,BitXor
,Not
min_component
andmax_component
(viaMinMaxComponent
trait)
PointN::fill
lets you construct a point with all dimensions at the same value- Some new APIs for
ChunkMap
:visit_chunks
visit_mut_chunks
visit_occupied_chunks
visit_occupied_mut_chunks
get_mut_chunk_or_insert_ambient
CompressibleChunkMap::reader
allows for easy construction of aCompressibleChunkMapReader
.
Modifications
ArrayN
is now generic over the storage of its values, as long as the storage implementsDeref<[T]>
orDerefMut<[T]>
. This means you can use any kind of slice pointer:&[T]
,&mut [T]
,Box<[T]>
,Rc<[T]>
,Arc<[T]>
,Cow<[T]>
, etc.- The
(Octant, bool)
parameters inOctreeVisitor
have been replaced withLocation
to enable the "nested visitor" pattern OctreeSet::visit
has been renamed toOctreeSet::visit_branches_and_leaves
to differentiate fromOctreeSet::visit_all_octants
ChunkMap
can now only be constructed via theChunkMapBuilder
. This was done to ensure thatChunkMap
isn't accidentally created with a storage type that doesn't satisfy theChunkReadStorage
orChunkWriteStorage
traits.- For simplicity and consistency, wherever possible, we now pass
PointN
as a parameter by move/copy instead of by borrow. - A
MergeStrategy
trait has been extracted fromgreedy_quads
in an effort to provide more generic forms of quad merging. This will eventually be used to support certain kinds of voxel lighting algorithms. - Surface normal estimation in
surface_nets
has been improved by using bilinear interpolation of the distance field's partial derivatives, based on where the surface positions is located within a voxel. - The
SignedDistance
trait is now a supertrait ofInto<f32>
surface_nets
now takes avoxel_size
parameter to scale the mesh positions
Removals
- The
building_blocks_vox
crate has been removed. Functionality now lives inbuilding_blocks_storage
because it allows us to define thedecode_vox
method directly onArray3<VoxColor>
. - The
building_blocks_image
crate has been removed. Functionality now lives inbuilding_blocks_storage
because it allows us to implFrom<Im>
forArray2
. - The
building_blocks_procgen
crate has been removed. It only had some common signed distance functions, and these have been replaced with thesdfu
integration. - Some
ChunkMap
methods were deemed unnecessary or trivial:get_chunk_containing_point
get_mut_chunk_containing_point
get_mut_point_or_insert_chunk_with
Benchmark Results
I'm going to start posting the criterion benchmark results so it's easier to see performance changes between releases.
These results come from running cargo bench --all
on my PC with an Intel i5-4590 3.3 GHz CPU, using the stable rust v1.49 toolchain and LTO enabled. All benchmarks are single-threaded.
greedy_quads_terrace/8 time: [10.854 us 10.879 us 10.908 us]
greedy_quads_terrace/16 time: [71.589 us 71.679 us 71.768 us]
greedy_quads_terrace/32 time: [518.32 us 518.98 us 519.69 us]
greedy_quads_terrace/64 time: [4.1920 ms 4.2008 ms 4.2101 ms]
height_map_plane/8 time: [435.84 ns 436.48 ns 437.12 ns]
height_map_plane/16 time: [1.7318 us 1.7340 us 1.7363 us]
height_map_plane/32 time: [8.4769 us 8.4862 us 8.4960 us]
height_map_plane/64 time: [31.885 us 31.974 us 32.078 us]
surface_nets_sine_sdf/8 time: [12.561 us 12.577 us 12.594 us]
surface_nets_sine_sdf/16
time: [147.24 us 147.44 us 147.65 us]
surface_nets_sine_sdf/32
time: [1.0928 ms 1.0946 ms 1.0966 ms]
surface_nets_sine_sdf/64
time: [9.5181 ms 9.5401 ms 9.5633 ms]
sphere_surface/8 time: [14.841 us 14.856 us 14.874 us]
sphere_surface/16 time: [114.60 us 114.74 us 114.88 us]
sphere_surface/32 time: [863.09 us 864.74 us 866.56 us]
flood_fill_sphere/16 time: [406.85 us 407.23 us 407.62 us]
flood_fill_sphere/32 time: [2.7514 ms 2.7549 ms 2.7586 ms]
flood_fill_sphere/64 time: [20.050 ms 20.076 ms 20.102 ms]
arra...
building-blocks v0.4.3
This is a really small release that just contains some documentation bug fixes and improvements. No actual API changes, other than relaxed trait bounds, have been added.
building-blocks v0.4.1
ChunkMap
Refactor
This release has some exciting changes to the ChunkMap
data structure!
ChunkReadStorage
and ChunkWriteStorage
Before, we had a single kind of ChunkMap
which forced users to adhere to the LRU caching and compression architecture used in the compressible-map
crate. This was potentially cumbersome, especially if you were just getting started with the library and you didn't care about compression (yet).
Now the ChunkMap
is generic over the chunk storage. This means you are free to customize how you would like to store you chunks by implementing the ChunkReadStorage
and ChunkWriteStorage
traits. But we also provide some useful default implementations.
The simplest is ChunkHashMap
, which just uses a FnvHashMap
for the chunk storage. It's quick and simple, but it doesn't provide any kind of compression. You can construct one using the ChunkMapBuilder::build_with_hash_map_storage
method.
For more efficient memory usage, we also have the CompressibleChunkStorage
. This is mostly a rewrite of the old CompressibleMap
data structure to improve random access performance. The compressible-map
dependency was removed and the code lifted into building_blocks_storage::chunk_map::storage
.
The architecture is familiar: a top tier LruChunkCache
stores uncompressed Chunk
s, while a bottom tier Slab
stores the compressed Chunk
s. But now we only update the LRU order on insertion or an explicit touch_if_cached
method. This made a dramatic improvement to random access performance.
In order to use the read-only access traits on a CompressibleChunkMap
(a ChunkMap
with CompressibleChunkStorage
), you must construct a CompressibleChunkMapReader
from a LocalChunkCache
. This should be familiar. But now instead of having the reader contain a ChunkMap
, the ChunkMap
contains the reader as it's storage.
ChunkIndexer
Several methods like extent_for_chunk_at_key
, chunk_keys_for_extent
, chunk_key_containing_point
, etc have moved from the ChunkMap
to the ChunkIndexer
. Every ChunkMap
has a public indexer
field.
ChunkMapBuilder
Now it's easier to construct ChunkMap
s once you've established a chunk shape, ambient value, and default metadata. All of this data is captured by the ChunkMapBuilder
, and you can call ChunkMapBuilder::build
with any chunk storage in order to create a ChunkMap
. This is especially useful for constructing a CompressibleChunkMapReader
from a CompressibleChunkMap
, since they will use the same builder.
Take a look at the updated docs for the chunk_map
module to learn more.
New OctreeSet
Node Traversal
By request, we've added some node-based traversal methods to the OctreeSet
. The main use case is for traversing two octrees at the same time; visitors could not do this. So now you can use the root_node
and get_child
methods to manually traverse the tree.
Default Features
I finally figured out how to make features show up in docs.rs! The rule seems to be: docs.rs will only document default features. However, even if you re-export features from sub-crates in a top-level crate and make them default, that doesn't have any affect on what docs.rs does with the sub-crates (duh?). So I had to make sure every crate had the right default features for what I wanted in docs.rs. However, in order for users of the top-level crate to be able to disable those default features of sub-crates, I need the top-level crate to use default-features = false
on all of the sub-crate dependencies.
Phew! TL;DR, you can still pick and choose only the features you want by using default-features = false
in your Cargo.toml file, but all of the features will show up in docs.rs now.
building-blocks v0.3.0
This is a major version release with some breaking changes, but nothing groundbreaking in terms of new functionality.
If it seems like there is a lull in new features, that's because I'm currently focusing on building new applications on top of building-blocks, which is just one of the fundamental layer of a full stack the comprises a voxel game. Most of these changes were required by a map editor prototype I'm working on. As the applications get more complex, they will inform the design of building-blocks, and that's what drives new features.
Breaking changes:
- The
building_blocks_partition
crate has been removed! I decided that it made more sense to keep theOctreeSet
type in thebuilding_blocks_storage
crate. Everything else, including collision algorithms and theOctreeDBVT
have moved to thebuilding_blocks_search
crate. - Several APIs from the
compressible-map
crate have been exposed as methods directly on theChunkMap
, some have been renamed. - Some
ChunkMap
methods now returnMaybeCompressedChunk
instead of decompressing the chunk for you inline. Just call theas_decompressed
method to get theChunk
. - Renamed
Quad
toUnorientedQuad
.
Additions:
- Added the
Snappy
compression backend forChunkMap
. To use it, enable the "snappy" feature (and probably disable the "lz4" feature as well). Thanks to @indiv0, who wanted an alternative to LZ4 which works with the wasm target. LZ4 is still recommended as the default for best compression ratios. - Turned on LTO for release builds and benchmarks, showing a 2x speedup for meshing algorithms. You need to add
lto = true
to yourCargo.toml
to see the improvements. - Added the
Axis2
,Axis3
,SignedAxis2
, andSignedAxis3
types. These get a lot of use in the quad meshing code, where we want to access specific components ofPoint
s by indexing instead of doing a dot product. - Added the
ExtentN::from_corners
constructor. - Added the
glam
feature for easy type conversions betweenPointN
andglam::Vec2
andglam::Vec3
.
Other changes:
- Made the code in
building_blocks_mesh::quad
more reusable for cases outside of thegreedy_quads
algorithm. Now there is anOrientedCubeFace
type and anUnorientedQuad
that work together to provide generic quad meshing.
building-blocks v0.2.1
This is a minor release with a few improvements to make v0.2 more usable.
- The docs generated for docs.rs will now use all features.
- The
Point2
andPoint3
types now support conversion to/from themint::{Point2, Point3, Vector2, Vector3}
types. - Some parts of the docs around storage have been improved. It should be easier to navigate to the most important pages for getting started.
building-blocks v0.2.0
Building Blocks v0.2.0
This release is all about improving what we already had in v0.1.0, and several breaking changes were necessary. Most notably:
- There is an important bug fix for
Octree::from_array3
. - We've added a
voxel_sphere_cast
function that's similar tovoxel_ray_cast
. - All of the meshing functions now support
TransformMap
. - We've made several API ergonomics improvements.
Most of these changes have been driven by the effort to port the voxel-mapper project from ilattice3 to building-blocks. The completion of that effort confirms that building-blocks is viable for an application which enables interactive voxel terraforming.
What's Next
We will shift focus to adding new features in the next release. The choice and design of the features will be driven by real use cases. We expect to work on:
- procedurally generating large voxel terrains
- procedurally generating large height map terrains
- real-time voxel editing in a Bevy application
Detailed notes
Meta
- started using Github Actions for CI
Core
- add the
Point::at
trait method for indexing scalar components - add the
Point::map_components
method for transforming points component-wise - add
PointN
methods:- round
- floor
- as_2i and as_3i
- in_pixel and in_voxel
Storage
- remove
GetRef
andForEachRef
, instead relying onGet
andForEach
, which copy
values instead of borrowing. This had no noticeable performance impact. - impl
Deref
forChunkMapReader
so it has access to the methods ofChunkMap
- make
ChunkMap::from_serializable
and to_serializable async so chunk compression can happen in parallel - improve
copy_extent
performance when copying large extents between two chunk maps - rename
ChunkMap::key_iter
tochunk_keys_for_extent
- rename
ChunkMap::chunk_key
tochunk_key_containing_point
- cleaned up the
Array
trait by extract anArrayIndexer
trait - rename
Chunk::map
toChunk::array
- export type aliases for chunk map with default empty metadata type
()
Mesh
- make all meshing algorithms compatible with
TransformMap<Array>
- replace
pos_norm_tex_meshes_from_material_quads
function with more focused
methods on theQuadGroupMeta
object - remove the
material_weights
function, this was very application-specific and
relatively trivial to implement
Partition
- fixed a bug with
Octree::from_array3
that produced incorrect octants; regression test was added - add a
voxel_sphere_cast
function, similar tovoxel_ray_cast
but with a sphere
traveling along the ray - add the "partition" feature to the list of default features
- remove "ncollide" features from the list of default features
- rename
nearest_voxel_ray_cast
tovoxel_ray_cast
and adjust the API parameters - add
OctreeDBVT::remove
Octree::from_array3
now takes an extent instead of forcing the input array to
be the correct shape
Introducing building-blocks v0.1.0
I'm excited to announce the first release of Building Blocks, an engine-agnostic Rust library focused on voxel game development.
Here's just a glimpse of what's possible. This is the "bevy_meshing" example, which generates meshes on the fly using various shapes and algorithms:
This is the same core technology behind the voxel-mapper project, which is currently based on the deprecated ilattice3. After a while piling new code into those projects, I decided I was in need of a "new set of bones" with some performance and API improvements.
After a lot of exploration, trials, errors, and lessons learned, I believe I've distilled a useful feature set for developers who need to work with volumetric data in a real-time setting. The most common problems I've encountered when working with billions of voxels are related to performance and memory usage. It has been a fun challenge to craft an ergonomic API that doesn't compromise performance where it really counts. Even with all of the hard work that went into make this library usable, there is still so much room for optimization.
The specific feature set provided by this release is mostly focused on the core data types and containers for voxel data, as well as some meshing, search, and collision algorithms that run on the CPU (GPU acceleration is on the road map). More specifically, you will find:
- the core 2D and 3D point and extent data types used for addressing voxel storage
- no mandatory dependence on a 3rd party math library like nalgebra, but there are conversion trait impls behind a feature flag
- dense and sparse voxel storage
- 2D and 3D arrays
- chunked arrays that support fast compression and LRU caching
- for the simple kinds of voxels you see in Minecraft, LZ4 chunk compression can achieve an efficiency of 60:1
- this could theoretically support 125 billion 2-byte voxels in 4 GB of memory
- LZ4 decompression rates exceed 2 GB/s per core in the worst case
- fast, composable access traits for indexing, iteration, and copying
- compressed bincode serialization to save your voxel maps to files or send them over a network
- meshing algorithms that produce coherent meshes, even on chunk boundaries
- a Naive Surface Nets implementation for smooth voxels
- a Greedy Meshing implementation for cube voxels
- an algorithm for meshing height maps
- all single-threaded, but you can easily run these algorithms on many chunks in parallel using your preferred runtime
- octree acceleration structures for spatial queries
- very low memory usage (< 1 bit per voxel)
- supports raycasting against voxel bounding boxes
- ncollide3d integration behind a feature flag
- pathfinding algorithms
- greedy best-first search
- criterion benchmarks
- it is nontrivial to implement real-time voxel algorithms, so there is a suite of benchmarks written using the
criterion
crate to ensure that we aren't blindly micro-optimizing
- it is nontrivial to implement real-time voxel algorithms, so there is a suite of benchmarks written using the
- miscellaneous
- encoding/decoding 3D arrays in VOX format via the
dot_vox
crate- this allows you to import/export your maps for editing in applications like Magicavoxel
- encoding/decoding 2D arrays into pixel images via the
image
crate- so you can load up height maps from noise images if you like
- encoding/decoding 3D arrays in VOX format via the
One notable disclaimer, making a voxel game at massive scale (hundreds of billions of voxels) requires one very important feature that isn't implemented yet: dynamic level of detail. Without it, your meshes will not only look bad at a distance, but they will take up way too much memory. But this item is on the roadmap!
I hope to see more people building cool stuff with this library. Stay tuned for future releases, and please feel welcome to contribute when you have a problem that isn't solved by Building Blocks.
P.S. thanks to @superdump (Rob Swain) and @dekuraan (Declan Li-Carney) for being early adopters and providing me with feedback along the way. You can see one example of building-blocks being used in the wild here.