Musings, Rants and Ponderings Of A DB Architect

Monday, December 3, 2018

Using PostgreSQL's Interval to mimic SQL Server's DATEADD function

I wanted to do some date calculations in PostgreSQL and was doing some research on if something like the DATEDIFF function that exists in SQL Server is available in PostgreSQL. These notes are mostly for me so I can refer back to them..but maybe they are useful for someone else as well

In SQL Server to add dates, minutes and other fractions of a date to a date, you can use the DATEADD function

Here are some quick examples, if you want to run this in SQL Server, create this table with one row first

CREATE  TABLE test(SomeDate date);

INSERT INTO test
VALUES('20120101');

And here is a simple DATEADD query, that adds 1 or 2 days to a date by using both the datepart and the abbreviated datepart

SELECT SomeDate,
  DATEADD(dd,1,SomeDate) as Interval1dd,
  DATEADD(dd,2,SomeDate) as Interval2dd,
  DATEADD(day,1,SomeDate) as Interval1Day,
  DATEADD(day,2,SomeDate) as Interval2Day
FROM test

That will give us the following output

SomeDate Interval1dd Interval2dd Interval1Day Interval2Day

2012-01-01 2012-01-02 2012-01-03 2012-01-02 2012-01-03

If you want to go negative, all you have to do is place a minus sign in front of the number

SELECT SomeDate,
  DATEADD(dd,-1,SomeDate) as Interval1dd,
  DATEADD(dd,-2,SomeDate) as Interval2dd,
  DATEADD(day,-1,SomeDate) as Interval1Day,
  DATEADD(day,-2,SomeDate) as Interval2Day
FROM test

Here is the output of that query

SomeDate Interval1dd Interval2dd Interval1Day Interval2Day
2012-01-01 2011-12-31 2011-12-30 2011-12-31 2011-12-30

Here are all the valid datepart arguments in SQL Server

datepart	Abbreviations
year	yy, yyyy
quarter	qq, q
month	mm, m
dayofyear	dy, y
day	dd, d
week	wk, ww
weekday	dw, w
hour	hh
minute	mi, n
second	ss, s
millisecond	ms
microsecond	mcs
nanosecond	ns

Now let's take a look at PostgreSQL

In PostgreSQL, there is no DATEPART function, but you can use interval literals to accomplish something that behaves the same

Let's do something similar like we did with the SQL Server queries, first create this temp table with one row

CREATE Temp TABLE test(SomeDate date);


INSERT INTO test
VALUES( to_date('20120101','YYYYMMDD'));

Now let's run this query

SELECT SomeDate,(SomeDate + 1 * INTERVAL '1 Day' ) as Interval1Day,
  (SomeDate + 1 * INTERVAL '1 D')  as IntervalD,
  (SomeDate + 2 * INTERVAL '1 Day' ) as Interval2Times1Day,
  (SomeDate + 1 * INTERVAL '2 Day' ) as Interval2Days,
  (SomeDate + 2 * INTERVAL '2 Days' ) as Interval2Times2Days,
  (SomeDate +  INTERVAL '1 Day' )    as IntervalD
FROM test

Here is the output

"2012-01-01";"2012-01-02 00:00:00";"2012-01-02 00:00:00";"2012-01-03 00:00:00";"2012-01-03 00:00:00";"2012-01-05 00:00:00";"2012-01-02 00:00:00"

When you copy from pgAdmin , you don't get the column aliases, so below is a screenshot of what it looks like(Click on the image for a bigger sized image)

As you can see there are two parts where you can supply a number

I think I prefer the top one from the query below, since it resembles the DATEPART function more

(SomeDate + 2 * INTERVAL '1 Day' ) as Interval2Times1Day,
(SomeDate + 1 * INTERVAL '2 Day' ) as Interval2Days,

But as you saw,it was possible to add 4 days by using a one in both places like shown below

(SomeDate + 2 * INTERVAL '2 Days' ) as Interval2Times2Days,

And of course if you want, you can just use the number inside the string like in this example below

(SomeDate +  INTERVAL '1 Day' )    as IntervalD

It's up to you, but I don't like changing numbers inside a string

To do negative numbers, you just change the positive number to a negative number, here is the same query from before but now with negative numbers

SELECT SomeDate,(SomeDate + -1 * INTERVAL '1 Day' ) as Interval1Day,
  (SomeDate + -1 * INTERVAL '1 D')    as IntervalD,
  (SomeDate + -2 * INTERVAL '1 Day' ) as Interval2Times1Day,
  (SomeDate + -1 * INTERVAL '2 Day' ) as Interval2Days,
  (SomeDate + -2 * INTERVAL '-2 Days')as Interval2Times2Days,
  (SomeDate +  INTERVAL '-1 Day' )    as IntervalD
FROM test

Here is the output
"2012-01-01";"2011-12-31 00:00:00";"2011-12-31 00:00:00";"2011-12-30 00:00:00";"2011-12-30 00:00:00";"2012-01-05 00:00:00";"2011-12-31 00:00:00"

As you can see that all works as expected, did you notice that the we get +4 when we do -2 * -2?

Here is the output also from pgAdmin so that you can see the column aliases

Besides using days, you can also use these parts of a date

Abbreviation	Meaning
Y	Years
M	Months (in the date part)
W	Weeks
D	Days
H	Hours
M	Minutes (in the time part)
S	Seconds

Let's take a look by using some of these in a query

SELECT SomeDate,(SomeDate + 1 * INTERVAL '1 Day' ) as Interval1Day,
  (SomeDate + 1 * INTERVAL '1 Week' ) as Interval1Week,
  (SomeDate + 1 * INTERVAL '1 Month' ) as Interval1Month,
  (SomeDate + 1 * INTERVAL '1 Year' ) as Interval1Year
  
FROM test
UNION ALL
SELECT SomeDate,(SomeDate + 1 * INTERVAL '1 D' ) as Interval1Day,
  (SomeDate + 1 * INTERVAL '1 W' ) as Interval1Week,
  (SomeDate + 1 * INTERVAL '1 M' ) as Interval1Month,
  (SomeDate + 1 * INTERVAL '1 Y' ) as Interval1Year
  
FROM test

Here is the output

"2012-01-01";"2012-01-02 00:00:00";"2012-01-08 00:00:00";"2012-02-01 00:00:00";"2013-01-01 00:00:00"
"2012-01-01";"2012-01-02 00:00:00";"2012-01-08 00:00:00";"2012-01-01 00:01:00";"2013-01-01 00:00:00"

Here is the output also from pgAdmin so that you can see the column aliases

As you can see when using M it used minute not month. I would recommend to always use the full name and not the abbreviated part so as not to create confusion

That's all for this post

Sunday, November 25, 2018

Easy running totals with windowing functions in PostgreSQL

Back in the pre windowing function days, if you wanted to do a running count, you either had to run a subquery or you could use a variable. This was slow because for each row the query that did the sum would be executed. With windowing functions in PostgreSQL, this is now running much faster.

Let's take a look, first create the following table

CREATE Temp TABLE test(Id int,SomeDate date, Charge decimal(20,10));


insert into test
values( 1,to_date('20120101','YYYYMMDD'),1000);
insert into test
values( 1,to_date('20120401','YYYYMMDD'),200);
insert into test
values( 1,to_date('20120501','YYYYMMDD'),300);
insert into test
values( 1,to_date('20120601','YYYYMMDD'),600);
insert into test
values( 2,to_date('20120101','YYYYMMDD'),100);
insert into test
values( 2,to_date('20120101','YYYYMMDD'),500);
insert into test
values( 2,to_date('20120101','YYYYMMDD'),-800);
insert into test
values( 3,to_date('20120101','YYYYMMDD'),100);

let's check that data we just inserted into the temporary table

SELECT * FROM test

The output looks like this

Id SomeDate Charge
1 2012-01-01 1000.0000000000
1 2012-04-01 200.0000000000
1 2012-05-01 300.0000000000
1 2012-06-01 600.0000000000
2 2012-01-01 100.0000000000
2 2013-01-01 500.0000000000
2 2014-01-01 -800.0000000000
3 2012-01-01 100.0000000000

What we want is the following

id StartDate Enddate         Charge         RunningTotal
1 2012-01-01 2012-03-31 1000.0000000000 1000.0000000000
1 2012-04-01 2012-04-30 200.0000000000 1200.0000000000
1 2012-05-01 2012-05-31 300.0000000000 1500.0000000000
1 2012-06-01 9999-12-31 600.0000000000 2100.0000000000
2 2012-01-01 2012-12-31 100.0000000000 100.0000000000
2 2013-01-01 2013-12-31 500.0000000000 600.0000000000
2 2014-01-01 9999-12-31 -800.0000000000 -200.0000000000
3 2012-01-01 9999-12-31 100.0000000000 100.0000000000

For each row, we want to have the date that the row starts on and also the date when it ends, we also want a running total as well. If there is no row after the current row for that id, we want the end date to be 9999-12-31.

So we will use a couple of functions. The first one is LEAD, LEAD accesses data from a subsequent row in the same result set without the use of a self-join. So the LEAD part looks like this

LEAD((SomeDate + -1 * INTERVAL '1 day' ),1,'99991231') OVER (PARTITION BY id ORDER BY SomeDate) as Enddate,

What we are doing is subtracting 1 from the date in the subsequent row (SomeDate + -1 * INTERVAL '1 day' )
We are using 1 as the offset since we want to apply this to the next row. Finally if there is no subsequent row, we want to use the date 9999-12-31 instead of NULL

To do the running count, we will do the following

SUM(Charge) OVER (PARTITION BY id ORDER BY SomeDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS RunningTotal

What this means in English is for each id ordered by date, sum up the charge values for the rows between the preceding rows and the current row. Here is what all that stuff means.

ROWS BETWEEN
Specifies the rows that make up the range to use as implied by

UNBOUNDED PRECEDING
Specifies that the window starts at the first row of the partition. UNBOUNDED PRECEDING can only be specified as window starting point.

CURRENT ROW
Specifies that the window starts or ends at the current row when used with ROWS or the current value when used with RANGE.
CURRENT ROW can be specified as both a starting and ending point.

And here is the query

SELECT id, someDate as StartDate,
LEAD((SomeDate + -1 * INTERVAL '1 day' ),1,'99991231')
 OVER (PARTITION BY id ORDER BY SomeDate) as Enddate,
  Charge,
  SUM(Charge) OVER (PARTITION BY id ORDER BY SomeDate 
     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
          AS RunningTotal
  FROM test
  ORDER BY id, SomeDate

And running that query, gives us the running count as well as the end dates

id StartDate Enddate         Charge         RunningTotal
1 2012-01-01 2012-03-31 1000.0000000000 1000.0000000000
1 2012-04-01 2012-04-30 200.0000000000 1200.0000000000
1 2012-05-01 2012-05-31 300.0000000000 1500.0000000000
1 2012-06-01 9999-12-31 600.0000000000 2100.0000000000
2 2012-01-01 2011-12-31 100.0000000000 100.0000000000
2 2012-01-01 2011-13-31 500.0000000000 600.0000000000
2 2012-01-01 9999-12-31 -800.0000000000 -200.0000000000
3 2012-01-01 9999-12-31 100.0000000000 100.0000000000

Here is what it looks like if you execute the query in PGAdmin

If you don't want the last row to have the end date filled in, just omit the default value in the LEAD function. Instead of

LEAD((SomeDate + -1 * INTERVAL '1 day' ),1,'99991231')

Make it

LEAD((SomeDate + -1 * INTERVAL '1 day' ),1)

Here is the whole query again


SELECT id, someDate as StartDate,
LEAD((SomeDate + -1 * INTERVAL '1 day' ),1)
 OVER (PARTITION BY id ORDER BY SomeDate) as Enddate,
  Charge,
  SUM(Charge) OVER (PARTITION BY id ORDER BY SomeDate 
     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
          AS RunningTotal
  FROM test
  ORDER BY id, SomeDate





And here is what the output looks like after we made the change







As you can see the rows for an id that doesn't have a row with a date greater than the current date will have a null end date



That is all for this post

Friday, November 23, 2018

Happy Fibonacci day, here is how to generate a Fibonacci sequence in PostgreSQL

Image by Jahobr - Own work, CC0, Link

Since today is Fibonacci day I decided to to a short post about how to do generate a Fibonacci sequence in PostgreSQL. But first let's take a look at what a Fibonacci sequence actually is.

In mathematics, the Fibonacci numbers are the numbers in the following integer sequence, called the Fibonacci sequence, and characterized by the fact that every number after the first two is the sum of the two preceding ones:

1, 1, 2, 3, 5, 8, 13, 21, 34, ...

Often, especially in modern usage, the sequence is extended by one more initial term:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

November 23 is celebrated as Fibonacci day because when the date is written in the mm/dd format (11/23), the digits in the date form a Fibonacci sequence: 1,1,2,3.

So here is how you can generate a Fibonacci sequence in PostgreSQL, you can do it by using s recursive table expression. Here is what it looks like if you wanted to generate the Fibonacci sequence to up to a value of 1 million

;WITH RECURSIVE Fibonacci (Prev, Next) as
(
     SELECT 0, 1
     UNION ALL
     SELECT Next, Prev + Next
     FROM Fibonacci
     WHERE Next < 1000000
)
SELECT Prev as Fibonacci
     FROM Fibonacci
     WHERE Prev < 1000000

That will generate a Fibonacci sequence that starts with 0 if you need the Fibonacci sequence to start at 1, all you have to do is replace the 1 to 0 in the first select statement

;WITH RECURSIVE Fibonacci (Prev, Next) as
(
     SELECT 1, 1
     UNION ALL
     SELECT Next, Prev + Next
     FROM Fibonacci
     WHERE Next < 1000000
)
SELECT Prev as Fibonacci
     FROM Fibonacci
     WHERE Prev < 1000000

Here is what it looks like in PGAdmin when you run the query

Happy Fibonacci day!!

Here is pretty much the same post that I created for SQL Server: Happy Fibonacci day, here is how to generate a Fibonacci sequence in SQL

Tuesday, November 20, 2018

This brought back memories

The other day I walked past the building in the picture above and I felt a little sad. The reason I felt sad is because I have great memories about this building. When I first came to the US from the Netherlands I was amazed at how big this store was and how they had so many things on display. In Amsterdam where I arrived from, you didn’t really have a store like this. The story was named J&R, there was a J&R Music World store and a J&R Computer World store. J&R stands for the founders Joe and Rachelle Friedman.

There were several items I bought at this store

The first camera I ever owned I bought in this place, it was a Minolta but I don’t remember the model. I later sold the camera because a co-worker needed the camera while he went on vacation to Equador. So I only had it for 2 or 3 months.

A couple of years later I replaced that camera with a Canon 50E I think this Canon camera might have also been known as an Elan. The E in the model name stands for eye-control, this was an enhanced version of the 3-zone eye-controlled autofocus system that was first seen on the EOS 5 camera. This was very high tech back then.

Another item that I have fond memories about was the first MP3 player that I purchased; It was the first MP3 player that was available to buy. It was the Rio PMP-300 portable MP3 player. I believe it was around $200 and you could hold about 10 songs or so since it only came with 32 MB of memory. You could store more songs if you inserted a smartmedia card. But even then you could not store hundreds of songs. I remember encoding the songs in WMA 64k format so I could store even more songs…… lol 1^st world problems.

I bought many other things at J&R like camera lenses, filters for the camera, a tripod, RAM to upgrade the PC, I even purchased Plus! lol Plus had themes and utilities for Windows

The nice thing about J&R was that they had prices unlike 42 street photo or whatever those stores were called where you had to negotiate the price. Also unlike at the Nobody Beats The Wiz store, they did not try to sell you insurance with every item you bought.

These days I guess you would have to go to B&H to get a similar experience, just be aware that they are not open on Saturdays.