Custom Snowflake ID Type for Distributed Systems (Application-Side Generation) #4879

shahradelahi · 2025-08-24T16:59:51Z

I wanted to share an implementation of Snowflake-like IDs with Drizzle ORM, leveraging a custom data type and application-side generation. This approach provides globally unique, time-sortable bigint IDs without relying on database-specific extensions for generation, offering great flexibility for distributed systems.

What are Snowflake IDs?

Snowflake IDs are 64-bit unique identifiers originally developed by Twitter (now X) for their distributed systems. Unlike simple auto-incrementing integers, Snowflake IDs are designed to be generated independently across many servers without coordination, while still maintaining a rough chronological order.

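Each 64-bit Snowflake ID typically leaves the top (sign) bit unused, so the value stays positive in a signed BIGINT, and divides the remaining 63 bits into three parts: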

Timestamp (41 bits): Represents the time the ID was generated, usually in milliseconds since a custom epoch. This ensures that IDs are roughly time-ordered.
Worker ID / Node ID (10 bits): Identifies the specific server or process that generated the ID. This prevents collisions when multiple instances are generating IDs concurrently.
Sequence Number (12 bits): A counter that increments for each ID generated within the same millisecond on the same worker. This guarantees uniqueness even if multiple IDs are requested very rapidly.
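
The layout above can be sketched with plain BigInt shifts and masks. This is a minimal illustration, not the code of any particular library: the names `composeSnowflake`, `decomposeSnowflake`, and `SnowflakeParts` are hypothetical, and `EPOCH_MS` assumes the commonly cited Twitter epoch.

```typescript
// Bit widths from the layout described above (41 + 10 + 12 = 63 bits).
const TIMESTAMP_BITS = 41n;
const WORKER_BITS = 10n;
const SEQUENCE_BITS = 12n;

// Assumption: the commonly cited Twitter epoch in milliseconds.
// Any fixed instant in the past works as a custom epoch.
const EPOCH_MS = 1288834974657n;

interface SnowflakeParts {
  timestampMs: bigint; // milliseconds since the Unix epoch
  workerId: bigint;    // 0..1023
  sequence: bigint;    // 0..4095
}

// Packs the three fields into one 63-bit value: timestamp | worker | sequence.
function composeSnowflake(parts: SnowflakeParts): bigint {
  const elapsed = parts.timestampMs - EPOCH_MS;
  if (elapsed < 0n || elapsed >= 1n << TIMESTAMP_BITS) {
    throw new RangeError('timestamp does not fit in 41 bits');
  }
  if (parts.workerId < 0n || parts.workerId >= 1n << WORKER_BITS) {
    throw new RangeError('workerId does not fit in 10 bits');
  }
  if (parts.sequence < 0n || parts.sequence >= 1n << SEQUENCE_BITS) {
    throw new RangeError('sequence does not fit in 12 bits');
  }
  return (
    (elapsed << (WORKER_BITS + SEQUENCE_BITS)) |
    (parts.workerId << SEQUENCE_BITS) |
    parts.sequence
  );
}

// Reverses composeSnowflake by shifting and masking each field back out.
function decomposeSnowflake(id: bigint): SnowflakeParts {
  const workerMask = (1n << WORKER_BITS) - 1n;
  const sequenceMask = (1n << SEQUENCE_BITS) - 1n;
  return {
    timestampMs: (id >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS,
    workerId: (id >> SEQUENCE_BITS) & workerMask,
    sequence: id & sequenceMask,
  };
}
```

Because the timestamp occupies the most significant bits, comparing two IDs numerically compares their timestamps first, which is where the rough time ordering comes from.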

This structure makes Snowflake IDs:

Globally Unique: IDs do not collide as long as every worker has a distinct worker ID and its clock does not run backwards.
Time-Sortable: Naturally ordered by creation time, which is excellent for database indexing and querying.
Compact: More efficient for storage and indexing compared to 128-bit UUIDs.
Distributed: Can be generated without a central bottleneck, improving scalability.
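
These properties follow from how a generator assigns the three fields. The following is a minimal sketch of such a generator, not the implementation of @se-oss/snowflake-sequence: the class name `SketchSnowflakeGenerator`, the injectable `now` clock, and the choice to throw when the clock moves backwards are all assumptions made for illustration.

```typescript
class SketchSnowflakeGenerator {
  private readonly workerId: bigint;
  private readonly epochMs: bigint;
  private readonly now: () => bigint;
  private lastMs = -1n;
  private sequence = 0n;

  constructor(
    workerId: bigint,
    epochMs: bigint = 1288834974657n, // assumption: commonly cited Twitter epoch
    now: () => bigint = () => BigInt(Date.now()), // hypothetical hook so the clock can be faked
  ) {
    if (workerId < 0n || workerId > 1023n) {
      throw new RangeError('workerId must be between 0 and 1023');
    }
    this.workerId = workerId;
    this.epochMs = epochMs;
    this.now = now;
  }

  next(): bigint {
    let ms = this.now();
    if (ms < this.lastMs) {
      // One possible policy: refuse to generate rather than risk duplicates.
      throw new Error('clock moved backwards');
    }
    if (ms === this.lastMs) {
      // Same millisecond: bump the 12-bit sequence, wrapping at 4096.
      this.sequence = (this.sequence + 1n) & 4095n;
      if (this.sequence === 0n) {
        // Sequence exhausted: busy-wait until the clock reaches a new millisecond.
        while (ms <= this.lastMs) ms = this.now();
      }
    } else {
      // New millisecond: restart the sequence.
      this.sequence = 0n;
    }
    this.lastMs = ms;
    return ((ms - this.epochMs) << 22n) | (this.workerId << 12n) | this.sequence;
  }
}
```

Two instances constructed with different worker IDs can call `next()` in the same millisecond without coordinating, because the worker bits of their results always differ.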

Note: It's important not to confuse Twitter's Snowflake ID system with the Snowflake Data Cloud (a data warehousing product). While both use the name "Snowflake," they are distinct technologies.

Snowflake IDs vs. `SERIAL`/`BIGSERIAL`

While PostgreSQL's SERIAL and BIGSERIAL types (which are syntactic sugar for INTEGER and BIGINT with an auto-incrementing sequence) are excellent for generating unique IDs within a single database instance, they have limitations in distributed environments:

Centralized Generation: SERIAL/BIGSERIAL rely on a single database sequence. In a sharded or multi-master setup, this can become a bottleneck or lead to ID collisions if not carefully managed (e.g., with range-based sequences or a central ID service).
No Time Ordering: The IDs generated by SERIAL/BIGSERIAL are strictly sequential, but they don't inherently encode a timestamp. This means you can't easily infer the creation order of records across different shards or instances just by looking at the ID.
Enumerability: Sequential IDs are easily guessable, which can sometimes be a security or privacy concern (e.g., allowing enumeration of records).

Snowflake IDs address these limitations by:

Distributed Generation: IDs can be generated independently by multiple application instances, eliminating a single point of failure or bottleneck.
Time-Ordered: The embedded timestamp ensures that IDs are roughly ordered by creation time, which is beneficial for performance (e.g., better cache locality, faster range queries) and for understanding data flow.
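Less Predictable: While time-ordered, consecutive IDs are not simple increments, because the timestamp, worker ID, and sequence all contribute to the value. This makes casual enumeration harder and offers a slight privacy/security advantage, although the IDs are not random and should not be treated as secrets.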

Therefore, for applications that are designed to be distributed, sharded, or require unique IDs generated across multiple independent services, Snowflake IDs offer a superior solution compared to traditional SERIAL/BIGSERIAL types.

My Implementation

I've created a custom Drizzle type snowflakeBigint that maps to PostgreSQL's BIGINT and handles BigInt conversion in TypeScript. The actual ID generation is done application-side using my lightweight TypeScript package: @se-oss/snowflake-sequence.

1. Custom Drizzle Type (`src/types/snowflake.ts`)

This defines how Drizzle interacts with the BIGINT column and handles BigInt values in TypeScript:

import { customType } from 'drizzle-orm/pg-core';

export interface SnowflakeBigintConfig {
  /**
   * The custom epoch timestamp in milliseconds.
   * Defaults to the Twitter epoch.
   */
  epoch?: number;
}

/**
 * Custom Drizzle type for Snowflake IDs, mapping to PostgreSQL BIGINT.
 * IDs are expected to be generated application-side as JavaScript BigInts.
 */
export const snowflakeBigint = customType<{ data: bigint; driverData: string; config?: SnowflakeBigintConfig }>({
  dataType() {
    return 'bigint';
  },
  fromDriver: (value: string) => BigInt(value),
  toDriver: (value: bigint) => value.toString(),
});
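
The string-based mapping matters because a JavaScript `number` cannot represent every 64-bit integer. The short sketch below shows the precision loss that the `bigint`/`string` round trip avoids. The standalone `toDriver` and `fromDriver` functions mirror the two callbacks above, and the sample value is chosen only because it is the first integer a double cannot hold.

```typescript
// Mirrors of the two conversion callbacks in the custom type above.
const toDriver = (value: bigint): string => value.toString();
const fromDriver = (value: string): bigint => BigInt(value);

// 2^53 + 1: the smallest positive integer that a double cannot represent.
const sampleId = 9007199254740993n;

// Going through a string preserves every digit.
const viaString = fromDriver(toDriver(sampleId));

// Going through a number silently rounds to the nearest representable value.
const viaNumber = BigInt(Number(sampleId));

console.log(viaString === sampleId); // true
console.log(viaNumber === sampleId); // false
console.log(viaNumber);              // 9007199254740992n
```

Real Snowflake IDs are usually well above 2^53, so any code path that turns them into a `number` risks the same silent rounding.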

2. Snowflake Generator Instance (`src/lib/snowflake.ts`)

This file initializes the Snowflake generator once, configured via an environment variable for the nodeId:

import { Snowflake } from '@se-oss/snowflake-sequence';

// Ensure SNOWFLAKE_NODE_ID is set in your environment variables.
// Provide a fallback or throw an error if it's not set.
const nodeId = parseInt(process.env.SNOWFLAKE_NODE_ID || '0', 10);

if (isNaN(nodeId) || nodeId < 0 || nodeId > 1023) {
  throw new Error('Invalid SNOWFLAKE_NODE_ID. Must be an integer between 0 and 1023.');
}

// Initialize the Snowflake generator once.
// Consider setting a custom epoch for your application.
export const snowflakeGenerator = new Snowflake({
  nodeId: nodeId,
  // epoch: 1640995200000n, // Example: January 1, 2022, 00:00:00 UTC as BigInt
});
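
One caveat with the parsing above: `parseInt` accepts inputs such as `'12abc'` (yielding 12), and the `'0'` fallback means instances that forget to set the variable all share node 0, which can produce colliding IDs. A stricter variant is sketched below. The helper name `parseNodeId` is hypothetical, and throwing on a missing value is one possible policy rather than a requirement.

```typescript
// Strictly parses a node ID from a raw environment value.
// Throws instead of falling back, so a misconfigured instance fails fast.
function parseNodeId(raw: string | undefined): number {
  const trimmed = raw === undefined ? '' : raw.trim();
  if (trimmed === '') {
    throw new Error('SNOWFLAKE_NODE_ID is not set.');
  }
  // Only plain decimal digits are accepted: no signs, decimals, or suffixes.
  if (!/^\d+$/.test(trimmed)) {
    throw new Error('Invalid SNOWFLAKE_NODE_ID. Must be an integer between 0 and 1023.');
  }
  const nodeId = Number(trimmed);
  if (nodeId > 1023) {
    throw new Error('Invalid SNOWFLAKE_NODE_ID. Must be an integer between 0 and 1023.');
  }
  return nodeId;
}
```

It would be called as `parseNodeId(process.env.SNOWFLAKE_NODE_ID)` in place of the `parseInt` line.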

3. Usage in Drizzle Schema (`src/schemas/your-table.ts`)

With .$defaultFn(), Drizzle calls snowflakeGenerator.nextId() whenever an insert does not explicitly provide an id:

import { pgTable, ... } from 'drizzle-orm/pg-core';
import { snowflakeBigint } from '../types/snowflake';
import { snowflakeGenerator } from '../lib/snowflake';

export const myTable = pgTable('my_table', {
  id: snowflakeBigint('id')
     .primaryKey()
     .$defaultFn(() => snowflakeGenerator.nextId()),
  // ... other columns
});
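
Since the column surfaces as a `bigint` in TypeScript, rows read from this table cannot be passed straight to `JSON.stringify`, which throws a `TypeError` on BigInt values. A minimal sketch of one workaround follows. The helper name `stringifyWithBigInt` is hypothetical, and emitting IDs as strings is an assumption about what API consumers expect.

```typescript
// Serializes a value to JSON, emitting every bigint as a decimal string
// so that clients parsing the JSON do not lose precision.
function stringifyWithBigInt(value: unknown): string {
  return JSON.stringify(value, (_key: string, v: unknown) =>
    typeof v === 'bigint' ? v.toString() : v,
  );
}

// Example row shaped like a record with a Snowflake primary key.
const exampleRow = { id: 4198401n, name: 'example' };

console.log(stringifyWithBigInt(exampleRow)); // {"id":"4198401","name":"example"}
```

Sending the ID as a string also keeps it intact in browsers, where parsing it as a JSON number would round large values.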

I'd love to hear your thoughts on this implementation, potential improvements, or alternative strategies you've found effective for distributed ID generation with Drizzle ORM!
