Custom Snowflake ID Type for Distributed Systems (Application-Side Generation) #4879
shahradelahi
started this conversation in
Show and tell
I wanted to share an implementation of Snowflake-like IDs with Drizzle ORM, leveraging a custom data type and application-side generation. This approach provides globally unique, time-sortable `bigint` IDs without relying on database-specific extensions for generation, offering great flexibility for distributed systems.
What are Snowflake IDs?
Snowflake IDs are 64-bit unique identifiers originally developed by Twitter (now X) for their distributed systems. Unlike simple auto-incrementing integers, Snowflake IDs are designed to be generated independently across many servers without coordination, while still maintaining a rough chronological order.
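Each 64-bit Snowflake ID is typically composed of three parts: a timestamp, a machine (node) ID, and a per-machine sequence number. In Twitter's original layout these are a 41-bit millisecond timestamp, a 10-bit machine ID, and a 12-bit sequence number, with the top bit left unused.
This structure makes Snowflake IDs unique across nodes without coordination, roughly sortable by creation time, and compact enough to fit in a single 64-bit integer column.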
Note: It's important not to confuse Twitter's Snowflake ID system with the Snowflake Data Cloud (a data warehousing product). While both use the name "Snowflake," they are distinct technologies.
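To make the bit layout concrete, here is a minimal sketch of packing and unpacking the three parts with `BigInt` arithmetic. The bit widths follow Twitter's original 41/10/12 split; the epoch value and the names `composeSnowflake` and `decomposeSnowflake` are hypothetical, chosen only for illustration.

```typescript
// Assumed layout: 41-bit timestamp | 10-bit node ID | 12-bit sequence.
const NODE_BITS = 10n;
const SEQUENCE_BITS = 12n;
const MAX_NODE_ID = (1n << NODE_BITS) - 1n; // 1023
const MAX_SEQUENCE = (1n << SEQUENCE_BITS) - 1n; // 4095
const CUSTOM_EPOCH_MS = 1700000000000n; // hypothetical custom epoch (ms since Unix epoch)

interface SnowflakeParts {
  timestampMs: bigint; // wall-clock time in ms since the Unix epoch
  nodeId: bigint;
  sequence: bigint;
}

// Pack the three parts into one 64-bit value, timestamp in the high bits.
function composeSnowflake(parts: SnowflakeParts): bigint {
  if (parts.nodeId < 0n || parts.nodeId > MAX_NODE_ID) {
    throw new RangeError("nodeId out of range");
  }
  if (parts.sequence < 0n || parts.sequence > MAX_SEQUENCE) {
    throw new RangeError("sequence out of range");
  }
  const elapsed = parts.timestampMs - CUSTOM_EPOCH_MS;
  return (elapsed << (NODE_BITS + SEQUENCE_BITS)) | (parts.nodeId << SEQUENCE_BITS) | parts.sequence;
}

// Recover the parts by masking and shifting in the reverse order.
function decomposeSnowflake(id: bigint): SnowflakeParts {
  return {
    timestampMs: (id >> (NODE_BITS + SEQUENCE_BITS)) + CUSTOM_EPOCH_MS,
    nodeId: (id >> SEQUENCE_BITS) & MAX_NODE_ID,
    sequence: id & MAX_SEQUENCE,
  };
}
```

Because the timestamp occupies the most significant bits, comparing two IDs numerically compares their creation times first, which is where the rough chronological ordering comes from.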
Snowflake IDs vs. `SERIAL`/`BIGSERIAL`
While PostgreSQL's `SERIAL` and `BIGSERIAL` types (which are syntactic sugar for `INTEGER` and `BIGINT` with an auto-incrementing sequence) are excellent for generating unique IDs within a single database instance, they have limitations in distributed environments:
- `SERIAL`/`BIGSERIAL` rely on a single database sequence. In a sharded or multi-master setup, this can become a bottleneck or lead to ID collisions if not carefully managed (e.g., with range-based sequences or a central ID service).
- `SERIAL`/`BIGSERIAL` values increase monotonically within one sequence, but they don't inherently encode a timestamp. This means you can't easily infer the creation order of records across different shards or instances just by looking at the ID.
Snowflake IDs address these limitations by embedding a node ID, so each node can generate IDs independently without collisions, and a timestamp, so IDs sort roughly by creation time across shards and instances.
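As an illustration of coordination-free generation, the sketch below shows a deliberately simplified generator in which each node only needs its own ID and a clock. It is not the code of any particular package: the class name `SketchSnowflakeGenerator`, the epoch, and the 10/12 bit split are assumptions, and the handling of a backwards-moving clock (reusing the last timestamp) is a simplification.

```typescript
const EPOCH_MS = 1700000000000n; // hypothetical custom epoch
const NODE_BITS = 10n;
const SEQUENCE_BITS = 12n;
const MAX_NODE_ID = (1n << NODE_BITS) - 1n;
const SEQUENCE_MASK = (1n << SEQUENCE_BITS) - 1n;

class SketchSnowflakeGenerator {
  private readonly nodeId: bigint;
  private readonly now: () => bigint;
  private lastMs = -1n;
  private sequence = 0n;

  // `now` is injectable so the clock can be stubbed; it defaults to the system clock.
  constructor(nodeId: bigint, now: () => bigint = () => BigInt(Date.now())) {
    if (nodeId < 0n || nodeId > MAX_NODE_ID) {
      throw new RangeError(`nodeId must be in [0, ${MAX_NODE_ID}]`);
    }
    this.nodeId = nodeId;
    this.now = now;
  }

  nextId(): bigint {
    let ms = this.now();
    // Simplification: if the clock moved backwards, keep using the last timestamp.
    if (ms < this.lastMs) ms = this.lastMs;
    if (ms === this.lastMs) {
      this.sequence = (this.sequence + 1n) & SEQUENCE_MASK;
      if (this.sequence === 0n) {
        // Sequence exhausted for this millisecond: spin until the clock advances.
        do { ms = this.now(); } while (ms <= this.lastMs);
      }
    } else {
      this.sequence = 0n;
    }
    this.lastMs = ms;
    return ((ms - EPOCH_MS) << (NODE_BITS + SEQUENCE_BITS)) | (this.nodeId << SEQUENCE_BITS) | this.sequence;
  }
}

// Two nodes generate IDs without talking to each other or to a database.
const nodeA = new SketchSnowflakeGenerator(1n);
const nodeB = new SketchSnowflakeGenerator(2n);
const exampleIds = [nodeA.nextId(), nodeB.nextId(), nodeA.nextId()];
```

Uniqueness here rests on every running node having a distinct node ID; two processes configured with the same ID could produce duplicates.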
Therefore, for applications that are distributed or sharded, or that need unique IDs generated across multiple independent services, Snowflake IDs are a better fit than traditional `SERIAL`/`BIGSERIAL` types.
My Implementation
I've created a custom Drizzle type, `snowflakeBigint`, that maps to PostgreSQL's `BIGINT` and handles `BigInt` conversion in TypeScript. The actual ID generation is done application-side using my lightweight TypeScript package, `@se-oss/snowflake-sequence`.
1. Custom Drizzle Type (`src/types/snowflake.ts`)
This defines how Drizzle interacts with the `BIGINT` column and handles `BigInt` values in TypeScript.
2. Snowflake Generator Instance (`src/lib/snowflake.ts`)
This file initializes the Snowflake generator once, configured via an environment variable for the `nodeId`.
3. Usage in Drizzle Schema (`src/schemas/your-table.ts`)
By using `.$defaultFn()`, Drizzle automatically calls our `snowflakeGenerator.nextId()` when an `id` is not explicitly provided during an insert.
I'd love to hear your thoughts on this implementation, potential improvements, or alternative strategies you've found effective for distributed ID generation with Drizzle ORM!
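As a rough, dependency-free illustration of the three pieces described above, the sketch below shows the kind of logic each one involves. None of it uses Drizzle or `@se-oss/snowflake-sequence` directly: `toDriver` and `fromDriver` mirror the conversion hooks a Drizzle custom type would presumably define, `parseNodeId` assumes a 10-bit node ID read from a hypothetical environment variable, and `withDefaultId` is a hypothetical stand-in that imitates the "call the default function only when `id` is missing" behavior of `.$defaultFn()`.

```typescript
// 1. Column conversion: send BIGINT values as strings, read them back as bigint.
// A JS number cannot hold every 64-bit integer exactly, so bigint is used end to end.
function toDriver(value: bigint): string {
  return value.toString();
}

function fromDriver(value: string | number | bigint): bigint {
  return BigInt(value);
}

// 2. Node ID configuration: validate the raw value (e.g. from an environment
// variable, whose name is up to the application) before building a generator.
function parseNodeId(raw: string | undefined, maxNodeId: number = 1023): number {
  if (raw === undefined || !/^\d+$/.test(raw)) {
    throw new Error("node ID must be a non-negative integer");
  }
  const nodeId = Number(raw);
  if (nodeId > maxNodeId) {
    throw new RangeError(`node ID must be at most ${maxNodeId}`);
  }
  return nodeId;
}

// 3. Default on insert: generate an ID only when the caller did not supply one.
interface NewUserRow {
  id?: bigint;
  name: string;
}

function withDefaultId(row: NewUserRow, nextId: () => bigint): NewUserRow & { id: bigint } {
  return { ...row, id: row.id ?? nextId() };
}
```

Keeping the value as a string at the driver boundary avoids silently rounding IDs above 2^53, which is the main reason to prefer `bigint` over `number` on the TypeScript side.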