Monday, February 25, 2019

Cool day trip in February if you live in New Jersey.. tree tapping at Howell Living History Farm

Our family went to Howell Living History Farm for the tree tapping event about 6 years ago, but the kids were little and don't really remember it. We decided to go again this year so that they would remember it. From our home in Princeton it's about a 25 minute drive, not too bad

It was a cloudy day, so the pics look a little gloomy but it was not too cold

Here you can see some benches that are next to the main entrance where the shop is located


At the shop, you can see a board with all the activities and the time when the activities start. Ask someone for directions if you need to know where to go

While we were walking towards the area where they had the sap collecting buckets, these sheep came running towards us. My kids freaked out, but they were very excited to see the sheep

  Sheep Running Towards Us

A little later we saw this big duck. It looked like it weighed at least 10 pounds to me, and it was about the size of one of those Canada geese

Duck Drinking

The farm had this stack of neatly piled wood

Stack of Wood

You could actually get busy and cut the wood with this huge saw

Cut that wood

These orange roots looked really weird and much better in person

Orange roots over a stream

Here you can see the buckets that they use to collect the sap. If you stand still and don't make any noise, you can hear the drops when they hit the bucket. It's a pretty interesting sound

Many Buckets to tap maple sap

A close up shot of the buckets

I tasted the sap, as did my wife and the kids. It doesn't taste like syrup at all; it is like water with a very tiny hint of sweetness. The sap is also clear, it does not have a color

This is the evaporator, it is used to make syrup from the sap

Maple Syrup Evaporator

Open pan evaporation methods have been streamlined since colonial days, but remain basically unchanged. Sap must first be collected and boiled down to obtain pure syrup without chemical agents or preservatives. Maple syrup is made by boiling between 20 and 50 volumes of sap (depending on its concentration) over an open fire until 1 volume of syrup is obtained, usually at a temperature 4.1 °C (7.4 °F) over the boiling point of water

Boiling the sap for too long will create crystals so you have to be on top of the process and check it.  While they were doing the explaining, they also mentioned that the indigenous people would warm up stones and then drop those hot stones in the sap to create syrup.

After you are done with the presentation, you can go to the house where they will make you some old fashioned pancakes.

Farm house

Chickens

These chickens come inside at night because foxes and hawks will snatch them and eat them. They told us to come back in about 4 weeks because that is when they will have the little chicks.

My daughter Catherine petting this horse

After the farm, we drove to Nomad Pizza in Hopewell. If you want to eat there you will probably have to wait since they only have 10 tables or so. In the summer there is an outside area as well. This is why if we go there during the colder months we make sure to get there by 5

I had the chorizo with onion, pepper and mozzarella pizza, it was delicious

TWID Feb 25, 2019: Galaxy Fold, Huawei Mate X, Hololens 2, Juventus, Linux Fsync Issue fix for PostgreSQL, syrup

This is a post detailing some stuff I did, learned, posted and tweeted this week, I call this TWID (This week in Denis). I am doing this mostly for myself... a kind of an online journal so that I can look back on this later on. Will use the label TWID for these

This Week I Learned

Almost finished with the book The Annotated Turing: A Guided Tour Through Alan Turing's Historic Paper on Computability and the Turing Machine by Charles Petzold

It takes a lot of maple sap to create maple syrup.  The higher the sugar content of the sap, the smaller the volume of sap is needed to obtain the same amount of syrup. 57 units of sap with 1.5 percent sugar content will yield 1 unit of syrup, but only 25 units of sap with a 3.5 percent sugar content are needed to obtain one unit of syrup
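The two figures above line up with a sugar makers' rule of thumb usually called the Rule of 86: divide 86 by the sugar percentage of the sap to estimate how many units of sap go into one unit of syrup. That the numbers quoted here were derived from that rule is my assumption, and the function name below is made up for illustration, so treat this as a rough sketch rather than an exact formula.

```python
def sap_units_per_syrup_unit(sugar_percent):
    """Estimate units of sap per unit of syrup using the Rule of 86 (a rule of thumb)."""
    if sugar_percent <= 0:
        raise ValueError("sugar content must be a positive percentage")
    return 86.0 / sugar_percent


# Compare a weak, a typical and a sweet sap
for pct in (1.5, 2.0, 3.5):
    units = sap_units_per_syrup_unit(pct)
    print(f"{pct}% sugar -> about {units:.0f} units of sap per unit of syrup")
```

With 1.5 percent sugar this gives about 57 units and with 3.5 percent about 25, which matches the numbers above.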

Those are my youngest two trying the sap. I tried it as well, I must say there really is no taste to it.

I will make a separate post about our visit to the farm

This Week I Tweeted

Samsung’s foldable phone is the Galaxy Fold, available April 26th starting at $1,980

Samsung first teased its foldable phone back in November, and at the company’s Galaxy Unpacked event today, it’s further detailing its foldable plans. Samsung’s foldable now has a name, the Samsung Galaxy Fold, and the company is revealing more about what this unique smartphone can do. Samsung is planning to launch the Galaxy Fold on April 26th, starting at $1,980, through AT&T and T-Mobile in the US, with a free pair of Samsung’s new wireless earbuds. There will be both an LTE and 5G version of the Galaxy Fold, and Samsung is even planning on launching the device in Europe on May 3rd, starting at 2,000 euros.

Overpriced, and if you damage a screen, how much will it cost to repair? I would rather have a phone like the ones shown in The Expanse, or even better, a phone you can roll up so it is the size and shape of a pen. This thing is just too bulky as well. A better design would have been a screen that you could slide out instead.

If you thought the Galaxy Fold was not expensive enough.. no worries, the Mate X is Huawei’s 5G foldable phone... the price is $2600.  Fitting name Mate X, as in mate you will need eXtra money for this one... 

Pass on both from me

Juventus share price is down 9% after their champions league result

  Ouch, that is not good, but Atletico played a much better 2nd half and converted their chances. Let's see if Juve can advance by scoring at least 2 at home in the return game.

Falsehoods Programmers Believe About Phone Numbers 

Some interesting things you might already know, still good to revisit this list

Some cool stuff you might enjoy

Microsoft’s HoloLens 2: a $3,500 mixed-reality headset for the factory

The Microsoft HoloLens 2 is available for preorder today for $3,500, and it’s expected to ship later this year. However, Microsoft has decided that it is only going to sell to enterprise customers who want to deploy the headset to their workers. As of right now, Microsoft isn’t even announcing a developer kit version of the HoloLens 2.

Compared to the HoloLens we first saw demonstrated four years ago, the second version is better in nearly every important way. It’s more comfortable, it has a much larger field of view, and it’s better able to detect real physical objects in the room. It features new components like the Azure Kinect sensor, an ARM processor, eye-tracking sensors, and an entirely different display system.

It has a couple of speakers, the visor flips up, and it can see what your hands are doing more accurately than before. There’s an 8-megapixel front-facing camera for video conferencing, it’s capable of full 6 degrees of tracking, and it also uses USB-C to charge. It is, in short, chock-full of new technology. But after four years, that should be no surprise.

Linux Fsync Issue for Buffered IO and Its Preliminary Fix for PostgreSQL

One of the common fixes applied to all the supported PostgreSQL versions is on – panic instead of retrying after fsync () failure. This fsync failure has been in discussion for a year or two now, so let’s take a look at the implications.

A fix to the Linux fsync issue for PostgreSQL Buffered IO in all supported versions
PostgreSQL performs two types of IO. Direct IO – though almost never – and the much more commonly performed Buffered IO.

PostgreSQL uses O_DIRECT when it is writing to WALs (Write-Ahead Logs aka Transaction Logs) only when wal_sync_method is set to : open_datasync or to  open_sync with no archiving or streaming enabled. The default  wal_sync_method may be fdatasync that does not use O_DIRECT. This means, almost all the time in your production database server, you’ll see PostgreSQL using O_SYNC / O_DSYNC while writing to WAL’s. Whereas, writing the modified/dirty buffers to datafiles from shared buffers is always through Buffered IO

Starting from kernel 4.13, we can now reliably detect such errors during fsync. So, any open file descriptor to a file includes a pointer to the address_space structure, and a new 32-bit value (errseq_t) has been added that is visible to all the processes accessing that file. With the new minor version for all supported PostgreSQL versions, a PANIC is triggered upon such error. This performs a database crash and initiates recovery from the last CHECKPOINT. There is a patch expected to be released in PostgreSQL 12 that works for newer kernel versions and modifies the way PostgreSQL handles the file descriptors. A long term solution to this issue may be Direct IO, but you might see a different approach to this in PG 12.
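To make the "panic instead of retrying" idea a bit more concrete, here is a minimal Python sketch. It is not PostgreSQL's code: write_durably, DurabilityError and broken_fsync are hypothetical names I made up, and the injectable fsync argument exists only so the failure path can be shown without a failing disk.

```python
import os
import tempfile


class DurabilityError(RuntimeError):
    """fsync failed: the caller should stop and recover, not retry."""


def write_durably(path, data, fsync=os.fsync):
    """Write data to path and fsync it, treating any fsync error as fatal."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)  # fine for a small demo; real code loops on short writes
        try:
            fsync(fd)
        except OSError as exc:
            # Do not call fsync again: on Linux the failed pages may already be
            # marked clean, so a retry can "succeed" without the data being on disk.
            raise DurabilityError(f"fsync failed for {path}") from exc
    finally:
        os.close(fd)


def broken_fsync(fd):
    """Stand-in for a disk that reports an I/O error (demo only)."""
    raise OSError(5, "Input/output error")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "segment.bin")
    write_durably(target, b"committed row")  # normal path
    try:
        write_durably(target, b"lost row", fsync=broken_fsync)
    except DurabilityError as err:
        print("would PANIC and replay from the last checkpoint:", err)
```

The point of the design is that the error is raised once and never swallowed, which is roughly what crashing and recovering from the last checkpoint achieves.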

Some more info that I found in this hackernews comment thread that might interest you:
If you want an overview of the issue, here's a presentation from Tomas Vondra at FOSDEM 2019:
Or an early recap of the "fsyncgate" issue in textual form:

Related (also listed by Tomas Vondra): Linux's IO errors reporting

As always, I will leave you with a pic I took this past week. This one is a pic of some orange roots over a stream

Orange roots over a stream

Monday, February 18, 2019

Calculating Sexy Primes, Prime Triplets and Sexy Prime Triplets in PostgreSQL

The other day I was reading something on Hackernews and someone posted a link to a Sexy Primes wikipedia article.  I looked at that and then decided to do this in SQL Server because.. why not? Then I decided to see how different this would be to do in PostgreSQL.  For the first method to create the prime numbers it's different. For the method with the CTE it is very similar

From the Sexy Primes wikipedia link:

In mathematics, sexy primes are prime numbers that differ from each other by six. For example, the numbers 5 and 11 are both sexy primes, because 11 minus 5 is 6.

The term "sexy prime" is a pun stemming from the Latin word for six: sex.

If p + 2 or p + 4 (where p is the lower prime) is also prime, then the sexy prime is part of a prime triplet.

Ok, I did a couple of versions of this over the weekend.. here is what I ended up with

So first we need a table that will just have the prime numbers

I decided to populate a table with numbers from 2 till 500 and then use the sieve of Eratosthenes method to delete the non primes

This will look like this

Create this table

CREATE  TABLE  PrimeNumbers  (N INT);

In one window run this to create the function/proc

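CREATE OR REPLACE FUNCTION MakePrime() RETURNS void AS $$
DECLARE I integer := 2;
BEGIN
    -- Delete the multiples of each I; going up to SQRT(500) is enough for numbers up to 500
    WHILE I <= SQRT(500) LOOP
        DELETE FROM PrimeNumbers WHERE N % I = 0 AND N > I;
        I := I + 1;
    END LOOP;
END;
$$ LANGUAGE plpgsql;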

In another window populate the table by making the call to the function

INSERT INTO PrimeNumbers(n)
SELECT n
 FROM (SELECT generate_series(2,500) as n) x;

SELECT MakePrime() ; -- Yes that is a proc call

SELECT * FROM PrimeNumbers
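If you want to sanity check the delete loop outside the database, the same idea can be sketched in a few lines of Python. This is only a cross-check, not part of the PostgreSQL solution; the function name is made up, and stopping once i * i passes the limit is my assumption about a sensible loop bound.

```python
def sieve_by_deletion(limit=500):
    """Mimic the DELETE loop: start with 2..limit and remove multiples of each i."""
    numbers = set(range(2, limit + 1))  # same rows as generate_series(2, 500)
    i = 2
    while i * i <= limit:  # assumed bound: no need to go past the square root
        # same filter as: DELETE ... WHERE N % I = 0 AND N > I
        numbers -= {n for n in numbers if n % i == 0 and n > i}
        i += 1
    return sorted(numbers)


primes = sieve_by_deletion(500)
print(len(primes), primes[:10])
```

Like the SQL loop, this also "deletes" multiples of non-primes such as 4, which is harmless because those rows are already gone by then.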

Thinking about it a little more I decided to do it with a CTE instead of a loop with delete statements, if your tables will be big then the delete method is probably better... it's for you to test that out :-)

What we are doing is a NOT EXISTS query against the same CTE: a number is kept only if no smaller number in the CTE divides it evenly, which leaves just the primes


WITH cte AS (
  SELECT * FROM generate_series( 2, 500 )  n
)
INSERT INTO PrimeNumbers
SELECT n
FROM cte
WHERE NOT EXISTS (
  SELECT n FROM  cte as cte2
WHERE cte.n > cte2.n AND cte.n % cte2.n = 0);

SELECT * FROM PrimeNumbers;

If we run that last select statement (starting from an empty PrimeNumbers table), we should have 95 rows
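The NOT EXISTS logic translates almost word for word into Python, which makes for a quick way to double check the row count. This is just a sketch with a made-up function name, and it is deliberately the slow, compare everything with everything version so that it mirrors the query.

```python
def primes_not_exists(limit=500):
    """Keep n only if no smaller candidate divides it, like the NOT EXISTS query."""
    candidates = range(2, limit + 1)  # plays the role of the cte
    return [
        n
        for n in candidates
        # NOT EXISTS (SELECT ... WHERE cte.n > cte2.n AND cte.n % cte2.n = 0)
        if not any(n > d and n % d == 0 for d in candidates)
    ]


result = primes_not_exists(500)
print(len(result))
```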


Now that we have our table filled with prime numbers till 500, it's time to run the queries

Sexy prime pairs
The sexy primes (sequences OEIS: A023201 and OEIS: A046117 in OEIS) below 500 are:

(5,11), (7,13), (11,17), (13,19), (17,23), (23,29), (31,37), (37,43), (41,47), (47,53), (53,59), (61,67), (67,73), (73,79), (83,89), (97,103), (101,107), (103,109), (107,113), (131,137), (151,157), (157,163), (167,173), (173,179), (191,197), (193,199), (223,229), (227,233), (233,239), (251,257), (257,263), (263,269), (271,277), (277,283), (307,313), (311,317), (331,337), (347,353), (353,359), (367,373), (373,379), (383,389), (433,439), (443,449), (457,463), (461,467).

Here is that query for the sexy prime pairs

-- 46 rows.. sexy primes
SELECT t1.N,t2.N 
 FROM PrimeNumbers t1
join PrimeNumbers t2 on t2.N - t1.N = 6 
order by 1

It's very simple.. a self join that returns rows where the number from one table alias and the number from the other table alias differ by 6
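The self join can be cross-checked with a Python set lookup. This is a sketch under my own naming (primes_up_to and sexy_pairs are not from the SQL above), using simple trial division to get the primes.

```python
def primes_up_to(limit):
    """Primes from 2 to limit by trial division (fine for small limits)."""
    return [
        n
        for n in range(2, limit + 1)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    ]


def sexy_pairs(limit=500):
    """Pairs (p, p + 6) where both numbers are prime and at most limit."""
    primes = set(primes_up_to(limit))
    # same condition as the join: t2.N - t1.N = 6
    return sorted((p, p + 6) for p in primes if p + 6 in primes)


pairs = sexy_pairs(500)
print(len(pairs), pairs[:3], pairs[-1])
```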

Prime triplets
The first prime triplets below 500 (sequence A098420 in the OEIS) are

(5, 7, 11), (7, 11, 13), (11, 13, 17), (13, 17, 19), (17, 19, 23), (37, 41, 43), (41, 43, 47), (67, 71, 73), (97, 101, 103), (101, 103, 107), (103, 107, 109), (107, 109, 113), (191, 193, 197), (193, 197, 199), (223, 227, 229), (227, 229, 233), (277, 281, 283), (307, 311, 313), (311, 313, 317), (347, 349, 353), (457, 461, 463), (461, 463, 467)

A prime triplet contains a pair of twin primes (p and p + 2, or p + 4 and p + 6), a pair of cousin primes (p and p + 4, or p + 2 and p + 6), and a pair of sexy primes (p and p + 6).

So we need to check that the 1st and 3rd numbers have a difference of 6, and we also check that the difference between numbers 1 and 2 is 2 or 4.  That query looks like this

-- 22 rows.. Prime Triplets
SELECT t1.N AS N1,t2.N AS N2, t3.N AS N3
 FROM PrimeNumbers t1
join PrimeNumbers t2 on t2.N > t1.N 
join PrimeNumbers t3 on t3.N - t1.N = 6
and t3.N > t2.N
and t2.n - t1.n IN (2,4)
order by 1

Here is what it looks like from pgAdmin
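The same three conditions can be written as a small Python sketch to double check the 22 rows. The function names are mine, and the primes come from plain trial division rather than from the table.

```python
def primes_up_to(limit):
    """Primes from 2 to limit by trial division (fine for small limits)."""
    return [
        n
        for n in range(2, limit + 1)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    ]


def prime_triplets(limit=500):
    """Triplets (p, q, p + 6) with q equal to p + 2 or p + 4, all prime."""
    primes = set(primes_up_to(limit))
    return sorted(
        (p, q, p + 6)
        for p in primes
        if p + 6 in primes          # t3.N - t1.N = 6
        for q in (p + 2, p + 4)     # t2.n - t1.n IN (2,4)
        if q in primes
    )


triplets = prime_triplets(500)
print(len(triplets), triplets[:3])
```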

Sexy prime triplets
Triplets of primes (p, p + 6, p + 12) such that p + 18 is composite are called sexy prime triplets. In other words, p, p+6 and p+12 are all prime, but p+18 is not

Those below 500 (sequence OEIS: A046118) are:

(7,13,19), (17,23,29), (31,37,43), (47,53,59), (67,73,79), (97,103,109), (101,107,113), (151,157,163), (167,173,179), (227,233,239), (257,263,269), (271,277,283), (347,353,359), (367,373,379)

The query looks like this.. instead of a self join, we do a triple self join, we also check that p + 18 is not a prime number in the line before the order by

-- 14 rows.. Sexy prime triplets
SELECT t1.N AS N1,t2.N AS N2, t3.N AS N3
 FROM PrimeNumbers t1
join PrimeNumbers t2 on t2.n - t1.n = 6
join PrimeNumbers t3 on t3.N - t1.N = 12
and t3.N > t2.N
AND NOT EXISTS( SELECT null FROM PrimeNumbers p WHERE p.n = t1.n +18)
order by 1
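Here is the same check as a Python sketch, again with made-up function names. Like the query, it treats p + 18 as "not prime" simply when it is missing from the set of primes up to the limit.

```python
def primes_up_to(limit):
    """Primes from 2 to limit by trial division (fine for small limits)."""
    return [
        n
        for n in range(2, limit + 1)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    ]


def sexy_prime_triplets(limit=500):
    """Triplets (p, p + 6, p + 12) that are all prime while p + 18 is not."""
    primes = set(primes_up_to(limit))
    return sorted(
        (p, p + 6, p + 12)
        for p in primes
        if p + 6 in primes
        and p + 12 in primes
        and p + 18 not in primes  # the NOT EXISTS part of the query
    )


sexy_triplets = sexy_prime_triplets(500)
print(len(sexy_triplets), sexy_triplets[:3])
```

Note that (5, 11, 17) is left out because 23 is prime, so that one belongs to a sexy prime quadruplet instead.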

And that's it for this post.  If you are interested in the SQL Server version, you can find it here: Calculating Sexy Primes, Prime Triplets and Sexy Prime Triplets in SQL Server

More PostgreSQL posts can be found here:  /label/PostgreSQL

TWID Feb 18, 2019: Bruno Ganz, Red hat dropping MongoDB, 500px hacked, VFEmail

This is a post detailing some stuff I did, learned, posted and tweeted this week, I call this TWID (This week in Denis). I am doing this mostly for myself... a kind of an online journal so that I can look back on this later on. Will use the label TWID for these

This Week I Learned

Continued reading the book The Annotated Turing: A Guided Tour Through Alan Turing's Historic Paper on Computability and the Turing Machine by Charles Petzold

I was translating a block of T-SQL code that calculated sexy primes into PostgreSQL and found out that in order to use variables, you need to wrap the code in a function in PostgreSQL. I also found out that PostgreSQL up until version 11 didn't really have stored procedures either, but you could have functions behave like procs

This Week I Tweeted

Hackers wipe US servers of email provider VFEmail

"At this time, the attacker has formatted all the disks on every server," the company said yesterday. "Every VM is lost. Every file server is lost, every backup server is lost."

"This was more than a multi-password via SSH exploit, and there was no ransom. Just attack and destroy," VFEmail said.

Yep, someone got pissed off about something and this was a big F U operation

500px Hacked: Personal Data Exposed for All 14.8 Million Users

The popular photo-sharing service 500px has announced that it was the victim of a hack back in 2018 and that personal data was exposed for all the roughly 14.8 million accounts that existed at the time.

In an email sent out to users and an announcement posted to its website, 500px states that it was only on February 8th, 2019, that its team learned of an unauthorized intrusion to its system that occurred on or around July 5th, 2018.

The personal data that may have been stolen by the intruder includes first and last names, usernames, email addresses, password hashes (i.e. not plaintext passwords), location (i.e. city, state, country), birth date, and gender.

Took over 6 months to find out...  that is a very long time.. as always make sure that your password is unique for each site that you use

Google will spend $13 billion on U.S. real estate in 2019, expanding into Nevada, Ohio  and Texas

CEO Sundar Pichai said in a blog post on Wednesday that the company is building new data centers and offices and expanding several key locations across the U.S., spending $13 billion this year.

Pichai outlined the plans, which include opening new data centers in Nevada, Ohio, Texas and Nebraska, the first time the company will have infrastructure locations in those states. The company is also doubling its workforce in Virginia, providing greater access to Washington, D.C., with a new office and more data center space, and expanding its New York campus at Hudson Square.

Have to spend all that money to catch up to Amazon and Microsoft in the cloud

Red Hat Satellite to standardize on PostgreSQL backend, will be dropping MongoDB

When will MongoDB Community Edition be dropped as an embedded database within Red Hat Satellite?
This database change is a still to come, but the product team wanted to go ahead and communicate this intent to our users so they were not caught by surprise as this is a change to the underlying databases of Satellite.  No specific timing or release is being communicated at this time. At this point we’re simply hoping to raise awareness of the change that is coming to help users of Satellite prepare for the removal of MongoDB.

This is in response to the license changes that MongoDB made recently... Looking at the chart, it looks like this is no worry to investors, MongoDB just hit an all-time high

MongoDB just hit an all-time high

RIP Bruno Ganz, who Gen X remembers as the angel in "Wings of Desire" and millennials remember as Hitler in that bunker scene.

You have seen all the parodies of course, I actually only watched this movie on January 6th 2018. Downfall is an excellent movie, if you have some time, make sure to watch it


Some cool stuff you might enjoy

I wrote this post for a friend so that he has a reference on how to install SonarQube and how to get started. This post explains how you can use SonarQube to run static code analysis against your T-SQL procs and functions

February release of @AzureDataStudio is now available! 

- Admin Pack for SQL Server extension
- Auto-sizing columns in results
- Notebook UI improvements
- Profiler Filtering
- Save Results as XML
- Deploy scripts

I am still using SSMS but maybe I will switch to DataStudio one of these days

Someone took 50,000 images of the night sky to make an 81 Megapixel image of the moon  It's beautiful 
See it here: … 

mirrors of both JPG and PNG in zoomable versions here:  (JPG)  (PNG)

Finding rows where the column starts or ends with a 'bad' character  

Another post I wrote because of a problem that a co-worker had with some data

A nice view while going to the Princeton Junction train station...  had to take a pic

Princeton Junction parking lot path

Saturday, February 9, 2019

Twid Feb 10 2019: Gemini, Stablecoin, BrightBytes, Turing, world war I restored,githistory

This is a post detailing some stuff I did, learned, posted and tweeted this week, I call this TWID (This week in Denis). I am doing this mostly for myself... a kind of an online journal so that I can look back on this later on. Will use the label TWID for these

This Week I Learned

Continued reading the book The Annotated Turing: A Guided Tour Through Alan Turing's Historic Paper on Computability and the Turing Machine by Charles Petzold

Technically this didn't happen this week, but I decided to blog about it this week: After 20+ years in IT .. I finally discovered this...

This Week I Tweeted

Microsoft buys BrightBytes DataSense to bring more data analytics to schools

BrightBytes, based in San Francisco, is an education data-analytics company and has been a Microsoft partner for years. In addition to the DataSense data-integration platform it is selling to Microsoft, BrightBytes will continue to sell its decision-support platform called Clarity which uses machine learning and predictive analytics. 

Microsoft is planning to integrate DataSense into its Microsoft Education product family. According to Microsoft, DataSense is "the leading IPaaS (integration platform as a service) solution for both education solution providers and school districts across the US."

I don't know much about this company.. but it's the data and analytics that is valuable... companies like this one are getting snapped up left and right

Winklevoss Exchange Gemini Shuts Down Accounts Over Stablecoin Redemptions

In one instance, email correspondence obtained by CoinDesk shows an OTC trader based in Latin America had his account closed after he informed Gemini that he planned to redeem several million dollars of GUSD. (A major cryptocurrency exchange, speaking on condition of anonymity, attested to the desk’s professionalism and reported that it was in good standing.)

So you can deposit money but don't you dare take it out........

Python Developers Survey 2018 Results 

Python usage as a main language is up 5 percentage points from 79% in 2017 when Python Software Foundation conducted its previous survey.

Half of all Python users also use JavaScript. The 2018 stats are very similar to the 2017 results. The only significant difference is that Bash/Shell has grown from 36% in 2017 to 45% in 2018. Go and SQL have also grown by 2 percentage points each, while many other languages such as C/C++, Java, and C# have lost their share.

In 2018 we had significantly more respondents specifying they’re involved in DevOps (an increase of 8% compared to 2017). In terms of Python users using Python as their secondary language, DevOps has overtaken web development.

The use of Python 3 continues to grow rapidly. According to the latest research in 2017, 75% were using Python 3 compared with 25% for Python 2. Use of Python 2 is declining as it’s no longer actively developed, doesn’t get new features, and its maintenance is going to be stopped in 2020.

See all the other interesting facts on the JetBrains site

This is pretty cool.. point any public github file here and you can see the last 8  or so commits by scrolling from the top  here is what @BrentOzarULTD  sp_foreachdb changes look like ...

It looks like this

So for any file, just replace in the url with

For example.. the stuff in red needs to be replaced

Just hit this link to see it

Some cool stuff you might enjoy

List of stories set in a future now past

This is a dynamic list and may never be able to satisfy particular standards for completeness. You can help by expanding it with reliably sourced entries.
This is a list of fictional stories that, when written, were set in the future, but the future they predicted is now present or past. The list excludes works that were alternate histories, which were composed after the dates they depict. The list also excludes contemporary or near-future works (e.g. set within a year or two), unless it deals with some notable futuristic event as with the 2012 phenomenon. It also excludes works where the future is passively mentioned and not really depicting anything notable about the society, as with an epilogue that just focuses on the fate of the main characters. Entries referencing the current year may be added if their month and day were not specified or have already occurred.

Some from the year 2019... Akira, Blade Runner, Dark Angel

How Peter Jackson’s team made World War I footage look new

“I thought, ‘Can we actually make this 100-year-old footage look like it was shot now?’” Jackson said on the latest episode of Recode Decode with Kara Swisher. “So it’s sharp, it’s clear, it’s stable, it looks like modern [films].”

And he had the means to do that: Over the past five years, Jackson tasked Park Road Post (a subsidiary of his production company WingNut Films) with adding color and sound to the archive footage, as well as making the frame rate consistent and similar to what we’d expect from footage shot today. The result is the new documentary “They Shall Not Grow Old,” which he said removes the “barrier between us and the actual people that were being filmed.”

This is some amazing looking stuff... here is a video so you can see what it looks like

Build 2019 registration opens on February 27th 

Join us in Seattle for Microsoft’s premier event for developers. Come and experience the latest developer tools and technologies. Imagine new ways to create software by getting industry insights into the future of software development. Connect with your community to understand new development trends and innovative ways to code.

Wondering what new stuff they will announce

The Chemical Brothers: Setting Sun (1996)
Haven't heard Setting Sun for a while, it's an excellent running song, I have added it to my running playlist on my phone

Setting Sun is a song by The Chemical Brothers with vocals by Noel Gallagher from the group Oasis. It was released as a single in 1996 from their second album Dig Your Own Hole and reached number one on the UK Singles Chart.

 Took this on my way to the Princeton Junction train station... beautiful morning colors

  Princeton Winter Wonderland

Monday, February 4, 2019

TWID Feb 3rd 2019, Huawei stealing, billion solar panels, big puzzle, beethoven, kodak

This is a post detailing some stuff I did, learned, posted and tweeted this week, I call this TWID (This week in Denis). I am doing this mostly for myself... a kind of an online journal so that I can look back on this later on. Will use the label TWID for these

This Week I Learned

Continued with the Getting Started with Docker Swarm Mode Pluralsight course

Started reading the book The Annotated Turing: A Guided Tour Through Alan Turing's Historic Paper on Computability and the Turing Machine by Charles Petzold

This Week I Tweeted

Huawei is accused of attempting to copycat a T-Mobile robot, and the charges read like a comical spy movie

The US on Monday charged the Chinese phone giant Huawei with trying to steal trade secrets from T-Mobile, among other crimes.
One Justice Department indictment includes internal emails between Huawei's US and Chinese employees who prosecutors said were trying to copy a T-Mobile device-testing robot.
The emails read like a comical spy movie, with one set of employees trying to avoid wrongdoing and another engineer getting caught putting part of the robot into his bag.
Huawei said that it hasn't violated any US laws and that it already settled with T-Mobile in a civil lawsuit.

What the heck is going on here?

How Do You Count Every Solar Panel in the U.S.? Machine Learning and a Billion Satellite Images

The DeepSolar Project, developed by engineers and computer scientists at Stanford University, is a machine learning framework that analyzes a dataset of satellite images in order to identify the size and location of installed solar panels.

To accurately count the panels, the DeepSolar team used a machine learning algorithm to analyze more than a billion high-resolution satellite images. The algorithm identified what the team believes to be almost every solar power installation across the contiguous 48 states.

The DeepSolar analysis reached a total of 1.47 million solar installations in the U.S., a much higher number than either of the two most commonly cited estimates.

“We can use recent advances in machine learning to know where all these assets are, which has been a huge question, and generate insights about where the grid is going and how we can help get it to a more beneficial place,” said Ram Rajagopal, associate professor of civil and environmental engineering, who supervised the project with Arun Majumdar, professor of mechanical engineering.

This is cool..but do they really need to use a billion images.. can't they use a subset and come pretty close to the real answer as well?

40x faster hash joiner with vectorized execution

For the past four months, I’ve been working with the incredible SQL Execution team at Cockroach Labs as a backend engineering intern to develop the first prototype of a batched, column-at-a-time execution engine. During this time, I implemented a column-at-a-time hash join operator that outperformed CockroachDB’s existing row-at-a-time hash join by 40x. In this blog post, I’ll be going over the philosophy, challenges, and motivation behind implementing a column-at-a-time SQL operator in general, as well as some specifics about hash join itself.

In CockroachDB, we use the term “vectorized execution” as a short hand for the batched, column-at-a-time data processing that is discussed throughout this post.

I love it when you see drastic speed improvements like this, I remember one time when we upgraded some hardware to use SSDs and more RAM.. a reporting query that took a minute now finished in less than a second.. I thought the results were wrong because it finished too fast lol

Some cool stuff you might enjoy

Kodak Premium Puzzle Presents: The World's Largest Puzzle 51,300 Pieces 27 Wonders from Around The World 28.5 Foot x 6.25 Foot Jigsaw Puzzle 

That is 8.69 meters wide for the metric people....aka 1 big white shark length

I don't always listen to classical music.. but when I do.. I make sure it's played on an electric guitar, enjoy this piece of Ludwig van Beethoven - Moonlight Sonata ( 3rd Movement ) by Tina S