iHRMS Punjab — FILESTREAM → SFTP document‑storage migration

Year: 2025–26
Role: Senior Developer (contract)
Org: NIC Punjab · iHRMS programme
Status: Rolling out

iHRMS modules accumulate attachments — medical bills, scanned forms, supporting documents — and for years those lived inside SQL Server as FILESTREAM blobs. It worked, until the database started carrying tens of thousands of files per module and replication, backup, and disk-pressure windows all began to suffer for the same reason.

Problem

FILESTREAM keeps the row and the file in the same logical unit, which is convenient until it isn’t. The convenience ends roughly the moment the database is large enough that operational work — backups, restores, replication, index maintenance — has to plan around files it doesn’t actually need to touch.

We needed three things, in order: get attachments off the database; keep the existing application contract so consuming modules wouldn’t have to change at all; and do the cutover without a maintenance window long enough to register as one.

Approach

The migration is the boring part. The interesting part is the adapter layer that hides which storage a given attachment actually lives in — old rows still point at FILESTREAM until they’re moved; new rows go straight to SFTP; the application code reads the same interface either way.

C#
// Single read path. Both backends implement IAttachmentStore.
public interface IAttachmentStore {
    Task<Stream> OpenAsync(AttachmentRef r, CancellationToken ct);
    Task<AttachmentRef> PutAsync(Stream s, PutOpts o, CancellationToken ct);
    Task DeleteAsync(AttachmentRef r, CancellationToken ct);
}

// AttachmentRef carries enough to resolve either backend.
public record AttachmentRef(
    long   Id,
    string Backend,   // "filestream" | "sftp"
    string Locator,   // path or filestream pointer
    string Sha256);

The only thing the rest of the system cares about is that OpenAsync hands back a stream. Where that stream comes from is no longer the application’s problem.
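To make the dispatch concrete, here is a minimal sketch of how a routing store could pick a backend from the ref's Backend field. Everything beyond IAttachmentStore and AttachmentRef (RoutingStore, FakeStore, the Demo harness) is my own naming for illustration, not the production code, and the interface is trimmed to the read path:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public interface IAttachmentStore {
    Task<Stream> OpenAsync(AttachmentRef r, CancellationToken ct);
}

public record AttachmentRef(long Id, string Backend, string Locator, string Sha256);

// Hypothetical dispatcher: callers only ever see IAttachmentStore.
public sealed class RoutingStore : IAttachmentStore {
    private readonly IReadOnlyDictionary<string, IAttachmentStore> _backends;
    public RoutingStore(IReadOnlyDictionary<string, IAttachmentStore> backends)
        => _backends = backends;

    public Task<Stream> OpenAsync(AttachmentRef r, CancellationToken ct)
        => _backends.TryGetValue(r.Backend, out var store)
            ? store.OpenAsync(r, ct)
            : throw new InvalidOperationException($"unknown backend: {r.Backend}");
}

// In-memory stand-in so the sketch runs without SQL Server or an SFTP host.
public sealed class FakeStore : IAttachmentStore {
    private readonly string _tag;
    public FakeStore(string tag) => _tag = tag;
    public Task<Stream> OpenAsync(AttachmentRef r, CancellationToken ct)
        => Task.FromResult<Stream>(new MemoryStream(Encoding.UTF8.GetBytes($"{_tag}:{r.Locator}")));
}

public static class Demo {
    public static async Task Main() {
        var store = new RoutingStore(new Dictionary<string, IAttachmentStore> {
            ["filestream"] = new FakeStore("fs"),
            ["sftp"]       = new FakeStore("sftp"),
        });
        var r = new AttachmentRef(1, "sftp", "2025/bill-0001.pdf", "");
        using var s = await store.OpenAsync(r, CancellationToken.None);
        Console.WriteLine(new StreamReader(s).ReadToEnd()); // sftp:2025/bill-0001.pdf
    }
}
```

Swapping a row from one backend to the other is then just an update to Backend and Locator; no caller notices.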

Background workers move files from FILESTREAM to SFTP in batches, verify each copy by SHA-256, swap the row’s Backend and Locator, and only then free the FILESTREAM blob. Reads during the move stay safe because, until the swap commits, both backends hold a valid copy and either can serve the request.
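One worker iteration reduces to the step below: stream the bytes across while hashing, and refuse to report success unless the hash matches. This is a sketch under my own naming (CopyAndVerifyAsync is not the production method); the source and destination are plain streams standing in for the FILESTREAM read path and the SFTP upload:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;

public static class Mover {
    // One migration step: copy while hashing, then verify.
    // Only after this returns does the caller swap Backend/Locator
    // on the row and free the old FILESTREAM blob.
    public static async Task<string> CopyAndVerifyAsync(Stream src, Stream dst, string expectedSha256) {
        using var sha = SHA256.Create();
        var buf = new byte[81920];
        int n;
        while ((n = await src.ReadAsync(buf, 0, buf.Length)) > 0) {
            sha.TransformBlock(buf, 0, n, null, 0);  // hash the bytes we just read
            await dst.WriteAsync(buf, 0, n);          // and forward them to the destination
        }
        sha.TransformFinalBlock(Array.Empty<byte>(), 0, 0);
        var actual = Convert.ToHexString(sha.Hash!).ToLowerInvariant();
        if (actual != expectedSha256)
            throw new InvalidDataException($"hash mismatch: {actual} != {expectedSha256}");
        return actual;
    }
}
```

Because the pointer swap happens strictly after verification, a crash mid-copy just means the batch runs again; nothing references the half-written file.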

The migration is “copy a file” repeated several hundred thousand times. The engineering is what you do when one of those copies half-finishes at 03:14 a.m.

Stack

C# / .NET · SQL Server (FILESTREAM) · SFTP · background worker services

Outcome

Database growth on the affected tables flattened almost immediately as new rows stopped writing into FILESTREAM. Backup windows dropped meaningfully, and module deployment no longer waits on file-page replication. No specific numbers in public — ask if you need them — but the shape of the change is what you’d expect: less DB pressure, more flexibility on the storage tier.

Reflection

The thing I keep underestimating about government IT work is how much of the difficulty is in the cutover, not the new code. Writing the SFTP adapter took a couple of weeks; designing the rollout so that nothing visible changed for end users took longer. If I were starting this project today, the very first thing I’d build is the consistency-checker that compares both backends row-by-row — not the writer, not the reader, the checker. The checker is what lets you sleep.
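If it helps to picture that checker, its core is morally the loop below: re-hash each attachment from whichever backend its row points at and compare with the Sha256 recorded on the row. The naming (Checker, BytesStore) and the trimmed read-only interface are mine; the real thing would batch, log, and queue mismatches for repair:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;

public interface IAttachmentStore {
    Task<Stream> OpenAsync(AttachmentRef r, CancellationToken ct);
}
public record AttachmentRef(long Id, string Backend, string Locator, string Sha256);

public static class Checker {
    // Re-hash every attachment and return the ids whose stored hash disagrees.
    public static async Task<List<long>> CheckAsync(
            IEnumerable<AttachmentRef> rows, IAttachmentStore store, CancellationToken ct) {
        var bad = new List<long>();
        foreach (var r in rows) {
            using var s = await store.OpenAsync(r, ct);
            using var sha = SHA256.Create();
            var actual = Convert.ToHexString(await sha.ComputeHashAsync(s, ct)).ToLowerInvariant();
            if (actual != r.Sha256) bad.Add(r.Id);
        }
        return bad;
    }
}

// Fixed-payload stand-in so the sketch runs without either real backend.
public sealed class BytesStore : IAttachmentStore {
    private readonly byte[] _payload;
    public BytesStore(byte[] payload) => _payload = payload;
    public Task<Stream> OpenAsync(AttachmentRef r, CancellationToken ct)
        => Task.FromResult<Stream>(new MemoryStream(_payload));
}
```

An empty result from a full sweep is the artifact that actually buys confidence during the cutover; everything else is plumbing.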

Aside: Mostly I write up these projects so I can re-read them in two years and remember why I made the call to do the move-then-verify journal instead of a stream-and-verify pass. The why ages worse than the what.