By Anna Rettberg
Casey Muratori
The Meow hash is a high-speed hash function named after the character Meow in Meow the Infinite. We developed the hash function at Molly Rocket for use in the asset pipeline of 1935.
Because we have to process hundreds of gigabytes of art assets to build game packages, we wanted a fast, non-cryptographic hash for use in change detection and deduplication. We had been using a cryptographic hash (SHA-1), but it was unnecessarily slowing things down.
To our surprise, we found a lack of published, well-optimized, large-data hash functions. Most hash work seems to focus on small input sizes (for things like dictionary lookup) or on cryptographic quality. We wanted the fastest possible hash that would be collision-free in practice (like SHA-1 was), and we didn’t need any cryptograhic security.
We ended up creating Meow to fill this niche.
How fast is it?
It’s the fastest hash function we know of, and we have benchmarked all the ones we could find. On modern Intel x64 CPUs, it hashes 16 bytes per cycle single-threaded. This means in cache it can hash at a rate of 64 gigabytes per second on a 4.2gHz machine. Out of cache, it hashes at whatever speed your main memory bus can provide to a single core, since that is usually the limiting factor on modern x64 CPUs.
It has also now been tuned to be the fastest hash on small inputs, too. Despite the fact that it is a full 128-bit hash, it still outperforms “fast” 64-bit hashes across all input sizes. So whether you have data buffers of a few bytes or a few gigabytes, Meow is the fastest hash available.
How robust is it?
The Meow hash is not designed for cryptography and therefore we make no claims about its security. Assume it is completely insecure.
That said, it is extremely robust for its designed purpose. It cleanly passes every test in smhasher and has produced no collisions on our large datasets as yet. Also, unlike other “fast hashes”, it is not a small 32-bit or 64-bit hash. It produces a full 128 bits of usable hash every time.
The Meow hash also passes smhasher cleanly at every truncation level down to 32 bits, so you can safely truncate a Meow hash value to the size you want to store. Both 32 bit and 64 bit truncations have been tested themselves and separate hashes, and pass smhasher cleanly.
Source Code and License
We have placed the source code to Meow on GitHub. We have released it under a permissive, open-source license:
zlib License
(C) Copyright 2018 Molly Rocket, Inc.
This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.
Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
It is a very basic license and should make it easy to use.
How to use it
The source code includes a number of files, but only meow_hash.h is actually necessary. The rest of the files may or may not be necessary depending on how you want to use Meow. Currently, the distribution supports x64, ARM, and vanilla C implementations.
To use the Meow hash in a program, just #include the headers then call it:
#include "meow_intrinsics.h"
#include "meow_hash.h"
// ...
// NOTE(casey): Hash away!
meow_hash Hash = MeowHash_Accelerated(0, Size, Buffer);
// NOTE(casey): Extract whatever hash size you want:
__m128i Hash128 = MeowU128From(Hash);
long long unsigned Hash64 = MeowU64From(Hash, 0);
int unsigned Hash32 = MeowU32From(Hash, 0);
// NOTE(casey): Check if two hashes are equal
if(MeowHashesAreEqual(HashA, HashB))
// NOTE(casey): Do something when the hashes match
// NOTE(casey): Do something else when they don't
Note that there are two headers to include: meow_intrinsics.h for the platform support, and meow_hash.h for the function itself. This is intentional  —  you do not need to include meow_intrinsics.h if you don’t want to! Instead, you can choose to make your own set of #define’s before including meow_hash.h, so that you can provide the foundation pieces for Meow to use. This can be very useful if you are trying to integrate Meow into a codebase that already has its own conventions for 128-bit SIMD.
How not to use it
The Meow hash was designed for very specific scenarios. There are two very good reasons not to use it:
Never use the Meow hash for anything involving security unless you are a cryptographer and know exactly what you are doing. The Meow hash was not designed for cryptography and has no cryptographic security proofs or analysis whatsoever.
Don’t base a system on the Meow hash if your primary deployment target uses a non-AES-capable CPU. Modern ARM and x64 cores usually have the AES instructions necessary to implement a Meow hash at high speed, but other CPUs might not. CPUs that don’t have AES instructions will be signficantly slower at running the Meow hash, so you do not want to use it in deployments that expect to use primarily non-AES-enabled CPU cores.
Building the entire Meow suite
If you download the entire Meow repository, you can build it all using simple batch files on both Windows and Linux. From the root of the repository, on Windows:
> build.bat
or on Linux:
> ./
Both will build several executables for testing. On Linux, there will only be a CLANG build, and it will be placed in a build directory. On Windows, it will attempt to build with cl into build_msvc and CLANG into build_clang, depending on which compilers you have installed.
Example code
The executable meow_example.exe (built from meow_example.cpp) is an example of the basic ways you might use a Meow hash. You can run it from the command line with no arguments to have it hash a random buffer, with one argument to print the hash of a file, and with two arguments to have it compare the hashes of two files. meow_example.cpp is designed to be very simple, and should provide an easy way to see how Meow hashes are used in practice.
Built-in testing
The executable meow_test.exe will test the Meow hash build for obvious defects. It’s not a replacement for serious testing (like smhasher), but it will give you a quick sanity check that you haven’t done something horribly wrong with the build.
build> meow_test
It will print out a clear indication of whether or not the C and accelerated versions of the Meow hash are working on your build and CPU.
Built-in benchmarking
The executable meow_bench.exe will run Meow hashes on lots of different sizes and attempt to determine the maximum consistent throughput it can achieve on your system. When run with an argument, it will also output a CSV that can be used to graph the results:
build> meow_bench results.csv
Built-in collision checking
The executable meow_search.exe will run Meow hashes on all the files in an entire directory tree and report any collisions that it finds. It will write a detailed collision report to a filename you supply:
build> meow_search c:/ c:/meow_report.txt
Hash collisions are subject to the birthday paradox, which means that once you process enough hashes, collisions are expected for various hash truncation levels. You should expect to see Meow32 collisions once you reach 16 bits worth of files (32-64 thousand files), Meow64 collisions once you get 32 bits worth of files (2-4 billion files), and Meow128 collisions once… well… you really shouldn’t be able to amass enough files for that! So if you find a collision with Meow128, please report it to us on the GitHub! We may need to look into it in more detail.
Additional APIs
You can also use Meow hash via a streaming API and a vanilla C API for compatibility. These are defined in meow_more.h. To use the streaming API:
#include "meow_intrinsics.h"
#include "meow_hash.h"
#include "more/meow_more.h"
// ...
// NOTE(casey): Begin the hash
meow_hash_state State;
// NOTE(casey): Feed data in however you want
MeowHashAbsorb(&State, Size, Buffer);
// NOTE(casey): Complete the hash
meow_hash Hash = MeowHashEnd(&State, Seed);
and to use the C version of the hash:
#include "meow_intrinsics.h"
#include "meow_hash.h"
#define MEOW_INCLUDE_C 1
#include "more/meow_more.h"
// ...
// NOTE(casey): Hash with the (slower) C version
meow_hash SlowHash = MeowHash_C(0, Size, Buffer);
// NOTE(casey): Hash with the (faster) CPU-specific version
meow_hash FastHash = MeowHash_Accelerated(0, Size, Buffer);
// NOTE(casey): They should alway produce the same result
assert(MeowHashesAreEqual(SlowHash, FastHash));
Generally speaking, if you are trying to use the C version for compatibility purposes, you will need to do some kind of CPU detection appropriate to your platform and then pick whether to call MeowHash_Accelerated or MeowHash_C. The proper way to detect this varies from platform to platform, and is outside the scope of Meow’s implementation, so it is expected that you will implement that part of the code yourself during the startup of your application. An example implementation using try/catch is given in meow_more_example.cpp.
Comments welcome
The Meow hash is still in alpha form. We would love to get feedback on people who can test it on their datasets, especially if you find any collisions or other problems with the hash quality. Please post issues on the GitHub as you find them!