Looking Forward to Postgres 19: It's About Time

(pgedge.com)

94 points | by xngbuilds 3 hours ago

15 comments

pjungwir 31 minutes ago
Hey I worked on this!
Thank you to everyone here saying they are excited about it. I often hear doubts that anyone wants this. Perhaps that's why vendors have been so slow to add it.
We are still missing system time, but if no one else wants to work on it, I hope to tackle that soon.
I have a lot of other ideas for improvement beyond SQL:2011, too. Here is a talk I gave last month about my personal roadmap:
https://illuminatedcomputing.com/pages/pgconf2026-temporal-r...
I've also been vibecoding a lisp REPL to play with the algebra of temporal relational operators (important for the planner): https://github.com/pjungwir/relsim
That overlaps with my attempt to write implementations for temporal semi/anti/outer-join and other relops: https://github.com/pjungwir/temporal_ops
If anyone has comments about what you'd like to see, I'm happy for feedback!
[-]
- bonesmoses 30 minutes ago
  Awesome! I'll look forward to that. :)
IgorPartola 2 hours ago
This is something that is incredibly useful. I built a system like this a while back that also adds versioning to each time period. The use case is this: let’s say you are tracking your state’s sales tax rate. You do not control this and data entry is manual so it is error prone. The rate is updated typically annually but sometimes more frequently.
Let’s say for 2026 you have it at 7.25% and you entered that into the system ahead of time (say December 2025). Today, June 12 you learn that it should have been 7.35%. It would be incorrect to say that the rate changed today: it was 7.35% since January 1. But you also don’t want to lose the fact that all your invoices have been generated using the wrong rate because if you go to recalculate them you will get a different answer.
In this case what you do is create version 2 of the rate in your database with the same time period but the correct rate. This would allow your other database objects to reference either version 1 or 2 and to even recalculate all the objects that reference version 1 to now reference version 2 such that you can get line item corrections and figure out what to do about them.
It is cumbersome to use but for the specific use case of modeling real world laws that are not available as machine-readable info it is the best option I came up with.
[-]
- munk-a 28 minutes ago
  Yeah, I think folks that under appreciate this new functionality look at the approaches that existed and say "That works, we could just sorta, wrap that in a function" but when you start getting into useful and entangled data the overhead of implementing proper bounds checking on ranges when you're more focused on the preservation of the linkages to existing data - it gets complicated quickly. This feature doesn't make anything building a system like you've described trivial but it moves us towards a higher level of expression - when building such a system you'd be focused on the complex logic of preserving the invoice integrity and reconciling it with the actual enforced tax rate rather than needing to step down to the lower level code and fiddling with date range boundaries. Much like moving from assembly to C a higher level of expression allows clearer focus on the actual problems the software is solving rather than getting tripped up in the procedural rules of implementation.
fabian2k 14 minutes ago
Some kind of versioning is extremely important for certain use cases. And having it a core DB feature makes it easier to show that you implement that checkbox.
One thing I'm wondering about is the performance of temporal tables for the common case, when you only query current rows. When you manually version tables, one strategy is to have a second table that contains archived versions. So your main table only has the current rows, avoiding a performance hit for having many versions per entry. Is there a way to do this with temporal tables? For example partitioning between active and old rows?
bonesmoses 9 minutes ago
Postgres 19 is looking to be a solid release. I don't think I've seen this much "new" stuff in a single version since v10 came out.
MBCook 28 minutes ago
Ooh fantastic. I’ve been using date ranges plus GIST indexes for like a decade to do this. It’s really nice. But the lack of foreign keys can be painful. I’ve resorted to stored procedures for crates and updates to ensure everything is done right and enforced.
This is WAY easier.
ris 23 minutes ago
Cool feature, but I'm a little uneasy with UPDATE operations adding new rows to a table. It upsets a lot of a DBA's assumptions.
munk-a 2 hours ago
It can be super ugly to try and hand-manage date time range manipulation in a system with moving parts. It is, of course, possible, but it's a headache to try and pull it off when there is complex business logic focused on those datetimes and interactions between ranges. The period constraint is an excellent tool for trivially guaranteeing range coverage (in a case where, for instance, a customer is known to be active from a to b it helps ensure that there aren't any gaps created during the juggling of different sub-ranges) while the new DELETE FOR PERIOD OF syntax makes manipulating swathes of history (I don't care what was happening here this interval should now be X) much more trivial than before.
oveja 50 minutes ago
Oh man this is really good. I have had to do some gist exclusions for payment data and it was clunky. Hopefully we get the other half of bitemporal support with postgres 20!
larsnystrom 2 hours ago
Why are they storing a time period (start and end date) in the first example? Why not just store the date when the price comes into effect? That would make both overlaps and time travel impossible without using any constraints.
[-]
- yourMadness 2 minutes ago
  If you sell a product where the customer can buy the future version today (for delivery in the future), that doesn't work.
- advisedwang 1 hour ago
  JOINs and other operations become really difficult if you can't evaluate whether a row applies or not based on that row alone.
- throwaway7783 2 hours ago
  Works when there is always an active price. Having an explicit end date allows certain rows to be inactive automatically after validity period. Think of seasonal categories/products etc which dont exist after a specific period
  [-]
  - promano 2 hours ago
    And even if you don't have seasonal products, you still need an end date to mark when you stop selling a product; otherwise you have to do something hacky like defining "<price> = NULL means we stopped selling the product after <valid_from>" and inserting an extra record.
    I think the end date should be nullable though, but valid_to is NOT NULL in the starting example... later in the article, when showing the "new way" using date ranges, it inserts a row with an open-ended range, which is more what I'd expect.
  - larsnystrom 2 hours ago
    You could also make the price column nullable and just insert a row with price null and the date from which there should be no price.
    [-]
    - throwaway7783 59 minutes ago
      Yeah, but now this is implicit information (no price == marker row) and duplicate the entire row. And it is baking in "price" as the special field. This may not be just one field for another temporal entity
- quotemstr 2 hours ago
  It's a trade-off. If you store both endpoints you can continue to think of rows as order-invariant tuples. If you store only one endpoint, you have to impose a meaningful order on the rows in order for them to make sense.
  [-]
  - larsnystrom 2 hours ago
    Sure, from a theoretical perspective, but in practice there’s got to be some sort of order at some point even when storing timespans.
- warmwaffles 1 hour ago
  It's a contrived example.
bhaak 2 hours ago
I remember reading about this feature for Oracle in the 2000s and was always interested to use it in a production environment.
It never came to pass when we used Oracle, maybe now with Postgres I will finally have a chance at it.
cherryteastain 1 hour ago
Very useful feature for trading systems where the exchange might tell you your order got filled but then send a trade correction or bust message
jacques_chester 1 hour ago
Exciting. Honestly I expect this will do more to advance bitemporal design than decades of jawboning has.
And really, ranges are an amazing substrate for this. I've had to do this by hand in a ... less featuresome ... SQL-speaking DB and it was clunky and performed fairly unimpressively.
evdubs 2 hours ago
> Recently, a new type of question has entered the database arena: what did this data look like last Tuesday?
This question has been answerable in Dolt for years now.
[-]
- quotemstr 2 hours ago
  This kind of DB isn't really answering this question. There's a lot more subtlety to time-span analysis than snapshotting. In particular, aligning two series is non-obvious.
  Say you have one time series with CPU-core task switches:
  T=1 task=A T=3, task=B, T=5 task=A, ...
  ... and another of CPU frequency changes ...
  T=2 freq_hz=800, T=5 freq_hz=1200, T=6 freq_hz=900
  How, in SQL, do you express the question "How many CPU cycles did each task use?"? Try to do it with more complex examples. You'll tear your hair out.
  Having worked on this sort of data analysis quite a lot, I'm strongly of the opinion that SQL needs syntax, not just table-valued functions, for expressing questions about timelines.
  [-]
  - evdubs 2 hours ago
    The cool thing about Dolt is that you [eventually] get the features of the databases (MySQL, PostgreSQL, SQLite, MongoDB) they emulate, so you can have your PG 19 temporality features as well as branching and merging.
    [-]
    - quotemstr 2 hours ago
      Yep. I'm just pointing out that the problems Dolt solves are different from the problems a timeline-aware SQL algebra solves.
quotemstr 2 hours ago
Great! I've been wanting native time-based tables for ages. Years and years ago, I even wrote my own DB engine to get them!
See https://dancol.org/dctv/index.xml
I still think there's a lot of value in first-class syntax for time queries. Splitting ranges like Postgres 19 does is a good first step, but there's also a lot of power in broadcasting over these ranges, combining them in various ways, and storing multiple, independent ranges in a single table.
Ignore the bit about active development: these days, it'd take more sense to add the operators I describe to Postgres and DuckDB than to make a numpy-based engine just to host the analysis. This work predates DuckDB, and it's reassuring that DuckDB (and now Postgres) are thinking along similar lines.
I'm also glad that in the intervening years "data lake"-style analysis has become more prominent. My ideal data processing pipeline consists of sourcing from raw data and pipelining views all the way to human-meaningful outputs. Materialization, if it occurs, is just an optimization.
horticulturist 2 hours ago
[dead]