# spicedb
m
Hey all, I've searched the archives here and realize that I'm beating a dead horse, but I still wanted to check if there are any updates or clever tricks. After using pg_dump and pg_restore to migrate from RDS Postgres to Aurora Postgres (which I also realize is not officially supported), we're in a place where we need to run the datastore repair command to bump our transaction IDs. This seems incredibly inefficient, as the delta runs in the billions and the ETA for the repair tool is 35 hours. We've done some experiments with other methods of incrementing the transaction IDs (a DB procedure that runs a loop), but that doesn't seem to be any faster than the datastore repair tool, which makes sense since it's doing the same thing and we're most likely capped on IOPS. The SpiceDB data we have is tiny, somewhere in the megabytes or tens of megabytes. Are there better ways of achieving this?
What I'm looking into now is using the 'zed' CLI tool to take a backup and then restore it, which I guess would require me to wipe some tables in between. I'm assuming this would be much quicker - and would that create new transaction IDs for everything?
We've also explored the tables in the schema, and are wondering whether we could just update the relation_tuple table with a set of lower xids (in order), since we don't rely on them anywhere outside of SpiceDB.
j
> I'm assuming this would be much quicker - and would that create new transaction id's for everything?

yes, and you'd lose access history
but if you're okay with no back-in-time queries, this is much faster
> we would be able to just update the relation_tuple table with a set of lower xid's (in order), as we don't rely on them anywhere outside of spicedb

here be dragons
m
> here be dragons šŸ™‚
But for 'zed' to work for taking a backup, I need a working spicedb first, right? So I'd have to do this from my previous database before starting to migrate
y
i think so, yes
m
This seems like something that needs to be fixed in a different way in SpiceDB. Not being able to migrate to a different database without hammering commits at the database just to increase a number, whose delta can be in the millions or billions, feels very inefficient.
y
if you use zed backup + restore you won't hit this
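roughly something like this - just a sketch, assuming recent zed syntax (double-check `zed backup --help` on your version; the context names, endpoints, and token variable are placeholders):

```shell
# 1. Point zed at the OLD (source) SpiceDB and take a backup.
#    `zed context set <name> <endpoint> <token>` stores connection details.
zed context set old-spicedb old-spicedb.example.com:50051 "$SPICEDB_PSK"
zed backup create spicedb-backup.zedbackup

# 2. Stand up a fresh SpiceDB against the new Postgres with empty tables
#    (i.e. run its migrations instead of pg_restore-ing the old data):
#      spicedb datastore migrate head --datastore-engine postgres ...

# 3. Point zed at the NEW instance and restore. The restore writes the
#    schema and relationships as brand-new transactions, so fresh
#    transaction IDs are assigned and no repair is needed.
zed context set new-spicedb new-spicedb.example.com:50051 "$SPICEDB_PSK"
zed backup restore spicedb-backup.zedbackup
```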
and the reason that pg_dump + pg_restore can't be handled this way is down to implementation details of postgres' MVCC internals, and the fact that SpiceDB needs to be able to make point-in-time DB queries to be able to provide consistent queries together with its caching behavior
agreed that it's a pain in the butt - we ran into this multiple times in multiple ways at my old company - but i don't think there's a good way to rework things to make pg_dump and pg_restore work
j
the repair command is likely the most efficient way we can support it, for now
the reason that number needs to be increased is we make use of PG's internal MVCC to support our own MVCC
when we previously layered our own MVCC on top, it was significantly slower
using PG's built-in MVCC is far more efficient
v
if you change PG instance, the snapshot revisions generated are different. There is no way around this.
and this is not something that can be trivially changed in SpiceDB without basically reengineering the postgres datastore implementation
and this is not a compelling enough argument to do such engineering. There is no reason to optimize for the rare scenario of doing pg_dump
the system is designed and optimized for what it is for
m
Yeah. We will need to learn how to use zed for backup and restore, and then update our runbooks for this kind of database migration to include it. I understand that the underlying implementation assumes transaction IDs won't change, and that changing this would not be trivial - but I still think there are multiple reasons why users might need pg_dump/pg_restore in some cases, and migrating between different databases is one of them. Migrating between providers, losing native backups, etc. would be other versions of this.
v
I'm not saying there is no use for pg_dump/pg_restore, but rather that it's not enough motivation to rewrite the entire datastore for that use case. This is not a small detail; it's the cornerstone of how SpiceDB implements its consistency options. If there were an alternative way to do application-level MVCC in Postgres, we'd be happy to consider it. I at least don't know of one.
Fortunately there are workarounds. You can use application-native backup and restore (there is an API for it, so you can even build your own, or you can use 'zed'). There is also 'repair' for those that want to do pg_dump. It's annoying because it takes long, but it works
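for reference, the repair path after a pg_dump/pg_restore looks roughly like this - a sketch, assuming the `datastore repair` subcommand of recent SpiceDB releases (confirm with `spicedb datastore repair --help`; the connection URI is a placeholder):

```shell
# Replays empty transactions until Postgres's current xid catches up to
# the highest xid recorded in SpiceDB's tables. IO-bound, so it can take
# many hours when the delta is in the billions.
spicedb datastore repair transaction-ids \
  --datastore-engine postgres \
  --datastore-conn-uri "postgres://user:pass@aurora-host:5432/spicedb"
```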