How we implemented real-time collaboration in Lydie using Hocuspocus
An in-depth view into how we enabled real-time collaboration in Lydie
From the get-go, when I started working on Lydie, an open-source document editor and writing workspace, I knew I’d eventually have to implement real-time collaboration. I wanted multiple team members to be able to work on the same document, on par with tools such as Google Docs and Notion.
Despite never having worked on this before, I had heard good things about Hocuspocus, a WebSocket server that handles merging data into single source-of-truth documents using CRDTs via the Y.js library. CRDTs are one of the most widely used foundations for collaborative editing, alongside operational transformation, the approach Google Docs is known for.
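To make the idea behind CRDTs a bit more concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is a toy and not how Y.js represents documents; the names (`GCounter`, `increment`, `merge`, `value`) are hypothetical. What it shows is the property that matters for collaboration: replicas can merge each other's state in any order and still end up with the same result.

```typescript
// Toy grow-only counter (G-Counter) CRDT. This is NOT how Y.js works
// internally; it only illustrates the convergence property CRDTs rely on.
type GCounter = Record<string, number>;

// Each replica only ever increments its own slot.
function increment(state: GCounter, replicaId: string, by = 1): GCounter {
  return { ...state, [replicaId]: (state[replicaId] ?? 0) + by };
}

// Merging takes the per-replica maximum, which makes the operation
// commutative, associative, and idempotent.
function merge(a: GCounter, b: GCounter): GCounter {
  const merged: GCounter = { ...a };
  for (const [replicaId, count] of Object.entries(b)) {
    merged[replicaId] = Math.max(merged[replicaId] ?? 0, count);
  }
  return merged;
}

// The counter's value is the sum of all replica slots.
function value(state: GCounter): number {
  return Object.values(state).reduce((sum, count) => sum + count, 0);
}

// Two replicas edit independently, then sync in opposite orders.
const alice = increment(increment({}, "alice"), "alice");
const bob = increment({}, "bob", 3);

const mergedOnAlice = merge(alice, bob);
const mergedOnBob = merge(bob, alice);
```

Both merge orders produce the same counter value, and merging a state with itself changes nothing. Text CRDTs are far more involved, but they are built to give the same guarantee.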
Conveniently, Hocuspocus is developed by the same team behind TipTap, the headless WYSIWYG library that powers Lydie's editor. With their first-party extension, it is incredibly easy to set up the client-side communication between the editor and the backend server.
Using Hocuspocus with Hono
Lydie uses Hono, a lightweight web framework for building backend APIs, and I wanted to keep our collaborative features close to our other backend features. Luckily, integrating the Hocuspocus server with Hono wasn't difficult, as Hono provides utilities for setting up WebSocket connections.
import { logger } from "hono/logger";
import { ExternalApi } from "./external";
import { InternalApi } from "./internal";
import { Hono } from "hono";
import { cors } from "hono/cors";
import { hocuspocus } from "../hocuspocus-server";
import { createNodeWebSocket } from "@hono/node-ws";

export const app = new Hono()
  .use(
    cors({
      origin: [
        "https://app.lydie.co",
        "https://lydie.co",
        "http://localhost:3000",
      ],
      credentials: true,
    })
  )
  .use(logger())
  .get("/", async (c) => {
    return c.text("ok");
  })
  .route("/internal", InternalApi)
  .route("/v1/:idOrSlug", ExternalApi);

export const { injectWebSocket, upgradeWebSocket } = createNodeWebSocket({
  app,
});

app.get(
  "/yjs/:documentId",
  upgradeWebSocket((c) => {
    const documentId = c.req.param("documentId");
    return {
      onOpen(_evt, ws) {
        if (!ws.raw) {
          throw new Error("WebSocket not available");
        }
        hocuspocus.handleConnection(ws.raw, c.req.raw as any, documentId);
      },
    };
  })
);

export type AppType = typeof app;

import { app, injectWebSocket } from "./api";
import { serve } from "@hono/node-server";
import { hocuspocus } from "./hocuspocus-server";

const port = 3001;

// Start server
const server = serve(
  {
    fetch: app.fetch,
    port,
  },
  (info) => {
    hocuspocus.hooks("onListen", {
      instance: hocuspocus,
      configuration: hocuspocus.configuration,
      port: info.port,
    });
  }
);

// Setup WebSocket support (Node.js specific)
injectWebSocket(server);

One caveat is that we originally ran our Hono instance using Bun, but since the Hocuspocus interface is built around the Node.js WebSocket instance (which differs from Bun's), I sadly had to move back to serving the Hono instance through the `@hono/node-server` helper. I found an issue tracking this, which will hopefully get addressed.
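On the client side, the provider needs a WebSocket URL that points at this route. As a small sketch of how that URL could be derived from the backend origin, here is a hypothetical helper (`buildCollabUrl` is my name for illustration, not something from the Lydie codebase), assuming the `/yjs/:documentId` route shown above:

```typescript
// Hypothetical client-side helper: builds the WebSocket URL for a document,
// assuming the backend exposes the `/yjs/:documentId` route.
function buildCollabUrl(backendOrigin: string, documentId: string): string {
  const wsOrigin = backendOrigin
    .replace(/^http/, "ws") // http -> ws, https -> wss
    .replace(/\/+$/, ""); // drop trailing slashes
  return `${wsOrigin}/yjs/${encodeURIComponent(documentId)}`;
}

const localUrl = buildCollabUrl("http://localhost:3001", "doc-123");
const secureUrl = buildCollabUrl("https://api.example.com/", "a b");
```

Encoding the document ID keeps the path valid even if an ID ever contains characters that are not URL-safe.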
Server-side: authorization, persistence, and side effects
Once the WebSocket is up, the collaboration backend is basically two jobs:
Only let the right people join the right document.
Load and persist the single source of truth document state.
Hocuspocus covers both: persistence through its storage extensions, of which we use the `Database` extension, and authorization through an `onAuthenticate` hook that holds the server-side logic for validating that a user can access and modify a document.
Authorization
Hocuspocus calls onAuthenticate whenever a client tries to connect to a document room. We validate the user session (we use Better Auth) from the incoming request headers and check if that user has access to the document they're asking for.
In Lydie, document access is organization scoped, so we load the document, read its organizationId, and then confirm that the user is a member of that organization.
async function verifyDocumentAccess(documentId: string, userId: string) {
  const [document] = await db
    .select()
    .from(documentsTable)
    .where(eq(documentsTable.id, documentId))
    .limit(1)

  if (!document) return false

  const membership = await db
    .select()
    .from(membersTable)
    .where(
      and(
        eq(membersTable.organizationId, document.organizationId),
        eq(membersTable.userId, userId),
      ),
    )
    .limit(1)

  return membership.length > 0
}

Then we wire that into onAuthenticate. If anything fails, we throw, and Hocuspocus rejects the connection before any document data is exchanged.
async onAuthenticate({ documentName, request }: onAuthenticatePayload) {
  const session = await authClient.api.getSession({
    headers: request.headers as any,
  })
  if (!session?.user) throw new Error("Invalid authentication")

  const hasAccess = await verifyDocumentAccess(documentName, session.user.id)
  if (!hasAccess) throw new Error("Access denied")

  // Returned user metadata is used for awareness.
  return { id: session.user.id, name: session.user.name }
}

Persisting to the database
The Database extension has two hooks:
fetch: load the stored Y.js state when a client connects.
store: persist the latest state back to your database.
The key detail is the data type. Hocuspocus gives you the document state as a Uint8Array. That is perfect for runtime, but it is not something we want to drop into a typical Postgres text column as-is, so we convert it into a Base64 string before saving it. When loading documents, we do the reverse.
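The conversion itself can be checked in isolation. Here is a minimal round-trip sketch using Node's `Buffer`; the helper names (`encodeState`, `decodeState`) are hypothetical, and the bytes are stand-ins rather than a real Y.js update:

```typescript
import { Buffer } from "node:buffer";

// Uint8Array -> Base64 string, suitable for a text column.
function encodeState(state: Uint8Array): string {
  return Buffer.from(state).toString("base64");
}

// Base64 string -> Uint8Array, the shape expected when loading a document.
function decodeState(encoded: string): Uint8Array {
  return new Uint8Array(Buffer.from(encoded, "base64"));
}

// Stand-in bytes; a real Y.js state would come from the document itself.
const original = new Uint8Array([0, 1, 2, 253, 254, 255]);
const encoded = encodeState(original);
const decoded = decodeState(encoded);

// True when every byte survived the round trip.
const roundTripOk =
  decoded.length === original.length &&
  decoded.every((byte, index) => byte === original[index]);
```

The trade-off is size: Base64 uses four characters for every three bytes, so the stored string is roughly a third larger than the raw binary state.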
new Database({
  fetch: async ({ documentName }) => {
    const result = await db
      .select({ yjsState: documentsTable.yjsState })
      .from(documentsTable)
      .where(eq(documentsTable.id, documentName))
      .limit(1)

    if (!result[0]?.yjsState) return null

    // Base64 string -> Uint8Array
    const buffer = Buffer.from(result[0].yjsState, "base64")
    return new Uint8Array(buffer)
  },
  store: async ({ documentName, state }) => {
    // Uint8Array -> Base64 string
    const base64State = Buffer.from(state).toString("base64")

    await db
      .update(documentsTable)
      .set({ yjsState: base64State, updatedAt: new Date() })
      .where(eq(documentsTable.id, documentName))

    // Side effects live here too (more on that below).
  },
})

It's important to note that this would be highly inefficient if we performed this operation on every keystroke. Luckily, Hocuspocus has a built-in debouncing mechanism that ensures persistence only happens after a certain window without any input. In our case, we only save the document to the database once 25 seconds have passed without a user editing it.
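To illustrate what debouncing means here, this is a toy model with a manually advanced clock. It is a sketch of the general technique, not Hocuspocus's implementation, and the class and method names (`DebouncedStore`, `edit`, `tick`) are hypothetical:

```typescript
// Toy model of trailing-edge debouncing with a manually advanced clock.
// Not Hocuspocus's implementation; all names here are hypothetical.
class DebouncedStore {
  private readonly debounceMs: number;
  private lastEditAt: number | null = null;
  storeCalls = 0;

  constructor(debounceMs: number) {
    this.debounceMs = debounceMs;
  }

  // Called on every edit: (re)starts the quiet window.
  edit(nowMs: number): void {
    this.lastEditAt = nowMs;
  }

  // Called as time passes: persists once the quiet window has elapsed.
  tick(nowMs: number): void {
    if (this.lastEditAt !== null && nowMs - this.lastEditAt >= this.debounceMs) {
      this.storeCalls += 1; // a real store() would write to the database here
      this.lastEditAt = null; // nothing pending until the next edit
    }
  }
}

const saver = new DebouncedStore(25000);
saver.edit(0);
saver.edit(10000); // still typing: the window restarts
saver.tick(25000); // only 15s since the last edit: no save yet
const callsWhileTyping = saver.storeCalls;
saver.tick(35000); // 25s of quiet: saved once
saver.tick(60000); // nothing pending: no extra save
const callsAfterQuiet = saver.storeCalls;
```

However many edits arrive in a burst, the expensive write happens once, after the document has been quiet for the full window.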
export const hocuspocus = new Hocuspocus({
  // ...extensions...
  debounce: 25000,
})

Side effects
Apart from persisting to the database, Lydie also performs a side effect once the debounce's grace period has passed. Because it has built-in semantic search and RAG functionality, Lydie needs to continuously generate vector embeddings of documents as they're being written. Hooking into the same method that handles persistence is an ideal choice for this: we don't want to run this more often than necessary, but we still want embeddings to be as fresh as possible.
store: async ({ documentName, state }) => {
  const base64State = Buffer.from(state).toString("base64")

  await db
    .update(documentsTable)
    .set({ yjsState: base64State, updatedAt: new Date() })
    .where(eq(documentsTable.id, documentName))

  processDocumentEmbedding(
    { documentId: documentName, yjsState: base64State },
    db,
  ).catch((error) => {
    console.error(
      `Failed to generate content embeddings for document ${documentName}:`,
      error,
    )
  })
}

We will go deeper on how we create and update embeddings of our documents in a future post.
Our entire Hocuspocus server setup looks like this:
import { Hocuspocus, onAuthenticatePayload } from "@hocuspocus/server";
import { Database } from "@hocuspocus/extension-database";
import { db } from "@lydie/database";
import { documentsTable, membersTable } from "@lydie/database/schema";
import { eq, and } from "drizzle-orm";
import { authClient } from "@lydie/core/auth";
import { processDocumentEmbedding } from "@lydie/core/embedding/document-processing";

// Verify user has access to document
async function verifyDocumentAccess(
  documentId: string,
  userId: string
): Promise<boolean> {
  try {
    const [document] = await db
      .select()
      .from(documentsTable)
      .where(eq(documentsTable.id, documentId))
      .limit(1);

    if (!document) {
      return false;
    }

    // Check if user is a member of the organization
    const membership = await db
      .select()
      .from(membersTable)
      .where(
        and(
          eq(membersTable.organizationId, document.organizationId),
          eq(membersTable.userId, userId)
        )
      )
      .limit(1);

    return membership.length > 0;
  } catch (error) {
    return false;
  }
}

export const hocuspocus = new Hocuspocus({
  extensions: [
    new Database({
      fetch: async ({ documentName }) => {
        try {
          const result = await db
            .select({ yjsState: documentsTable.yjsState })
            .from(documentsTable)
            .where(eq(documentsTable.id, documentName))
            .limit(1);

          if (!result[0] || !result[0].yjsState) {
            return null;
          }

          // Convert base64 string back to Uint8Array
          const buffer = Buffer.from(result[0].yjsState, "base64");
          return new Uint8Array(buffer);
        } catch (error) {
          return null;
        }
      },
      store: async ({ documentName, state }) => {
        // Convert Uint8Array to base64 string for storage
        const base64State = Buffer.from(state).toString("base64");

        await db
          .update(documentsTable)
          .set({
            yjsState: base64State,
            updatedAt: new Date(),
          })
          .where(eq(documentsTable.id, documentName));

        processDocumentEmbedding(
          {
            documentId: documentName,
            yjsState: base64State,
          },
          db
        ).catch((error) => {
          console.error(
            `Failed to generate content embeddings for document ${documentName}:`,
            error
          );
        });
      },
    }),
  ],

  async onAuthenticate({
    documentName,
    request,
  }: onAuthenticatePayload): Promise<any> {
    if (!request?.headers) {
      throw new Error("Authentication required");
    }

    try {
      // Verify the session using better-auth by passing request headers
      // The headers should contain the cookie with the session token
      const session = await authClient.api.getSession({
        headers: request.headers as any,
      });

      if (!session?.user) {
        throw new Error("Invalid authentication");
      }

      // Verify user has access to the document
      const hasAccess = await verifyDocumentAccess(
        documentName,
        session.user.id
      );

      if (!hasAccess) {
        throw new Error("Access denied");
      }

      // Awareness
      return {
        id: session.user.id,
        name: session.user.name,
      };
    } catch (error) {
      throw new Error("Authentication failed");
    }
  },

  debounce: 25000,
});

Conclusion
Real-time collaboration looks complex from the outside, but in practice it comes down to a few well-defined responsibilities: a shared document model (Y.js), a reliable transport and sync layer (Hocuspocus), and server-side hooks for authorization and persistence.
Hocuspocus provides an incredible abstraction that handles all of these things in a matter of a few lines of code, and it integrates well with most Node-based tech stacks.
If you want to see the exact wiring, our repository is open source. The core pieces mentioned in this post live here:
packages/web/src/lib/editor/document-editor.ts (client-side editor setup and provider integration)
packages/backend/src/hocuspocus-server.ts (Hocuspocus server configuration and hooks)
packages/backend/src/index.ts (backend entry point and server wiring)
For a broader overview of the stack around this, see our blog post on Technologies powering Lydie.