TalkingQuickly's Today I Learned

Yellow / Beige Areas in Google Maps

In Google Maps, there are some areas which are coloured yellow (or, I'm told more accurately, beige). These apparently represent areas with high foot traffic. So when exploring a new city, they can be used to understand which bits people often walk around and which bits tend to be more residential / road traffic only. Really useful for working out which route to take through a city you're exploring.

Who has access to which schemas in Postgres

Great little snippet for understanding which users have which privileges on a schema in Postgres / Redshift.

SELECT *
FROM
    (
    SELECT
        schemaname
        ,objectname
        ,usename
        ,HAS_TABLE_PRIVILEGE(usrs.usename, fullobj, 'select') AND has_schema_privilege(usrs.usename, schemaname, 'usage')  AS sel
        ,HAS_TABLE_PRIVILEGE(usrs.usename, fullobj, 'insert') AND has_schema_privilege(usrs.usename, schemaname, 'usage')  AS ins
        ,HAS_TABLE_PRIVILEGE(usrs.usename, fullobj, 'update') AND has_schema_privilege(usrs.usename, schemaname, 'usage')  AS upd
        ,HAS_TABLE_PRIVILEGE(usrs.usename, fullobj, 'delete') AND has_schema_privilege(usrs.usename, schemaname, 'usage')  AS del
        ,HAS_TABLE_PRIVILEGE(usrs.usename, fullobj, 'references') AND has_schema_privilege(usrs.usename, schemaname, 'usage')  AS ref
    FROM
        (
        SELECT schemaname, 't' AS obj_type, tablename AS objectname, schemaname + '.' + tablename AS fullobj FROM pg_tables
        WHERE schemaname not in ('pg_internal')
        UNION
        SELECT schemaname, 'v' AS obj_type, viewname AS objectname, schemaname + '.' + viewname AS fullobj FROM pg_views
        WHERE schemaname not in ('pg_internal')
        ) AS objs
        ,(SELECT * FROM pg_user) AS usrs
    ORDER BY fullobj
    )
WHERE (sel = true or ins = true or upd = true or del = true or ref = true)
and schemaname='SOME_SCHEMA_NNAME';

Note that if you can't track down why a user has a particular permission, it's worth delving into the permissions granted to the public role, more about that here: https://blog.dbrhino.com/locking-down-permissions-in-postgresql-and-redshift.html

A git remote can push to multiple repos

This is especially useful if you want to automatically push to master on github (for example) when doing git push heroku master to prevent you from ending up with the repo and your deployment being out of date.

It can be done like this (assuming you already have a Heroku branch setup):

git remote set-url --add --push heroku [email protected]:USERNAME/PROJECT.git
git remote set-url --add --push origin [email protected]:ccgmurdoch/my-project.git

And you can verify that it worked by taking a look at the output of:

git remote -v

Once this is done, a git push heroku master will automatically push the same branch to your github branch.

Source: https://gist.github.com/rvl/c3f156e117e22a25f242

Why we get high tides on both sides of the earth

We experience high tides on two sides of the earth at once, the side which is closest and the side which is furthest away. The side which is closest is quite intuitive, the gravitational pull of the moon is strongest here, it exceeds the inertial force holding the water in place and so the water is pulled towards the moon.

The side furthest away is less intuitive. Apparently the matching tide here is because the moons gravitational pull is weakest here, inertia exceeds the gravitational force, causing the water to try and keep moving in a straight line, and so bulging out. (https://oceanservice.noaa.gov/education/tutorial_tides/tides03_gravity.html)

So why then are tides smaller at the equator? Turns out this isn't actually true, it's just true in some well known locations. See https://www.co-ops.nos.noaa.gov/faq2.html#27 and https://www.quora.com/Why-are-tides-lower-at-the-equator

Charts, Latitude, Longitude

Many charts will show that they are using data from surveys over 100 years ago. In practice surveys from 100 years ago and built up through triangulation are often very accurate in terms of the relative position of one thing compared to another. So if there was a survey of a remote island, the position of different visual landmarks making up the island with respect to each other, will generally be very accurate.

The challenge is that all of these sets of measurements require some sort of starting point to begin measuring from, e.g. an initial datum. While measurements taken via triangulation may well be accurate to a few metres, the initial datum, e.g. the position of the island itself may have been taken based on star positions and could easily vary by half a mile.

Because of this it's important when comparing charts and electronic devices to ensure they have been explicitly set to use the same datum.

While this is less of a problem in the age of GPS, a problem it still is due to the earth not being a perfectly uniform shape. More on the current standard (WGS84) here: https://en.wikipedia.org/wiki/World_Geodetic_System.

Calculating distance in Redshift

Execute the below in Redshift as an admin to create a function for calculating the distance in miles between two lat lngs, saves having lots of queries full of trig. Can get to km if needed by multiplying by 1.609344.

------- DISTANCE_IN_MILES FUNCTION ---------
CREATE FUNCTION DISTANCE_IN_MILES (orig_lat float, orig_long float, dest_lat float, dest_long float)
  RETURNS float
STABLE
AS $$
  import math
  r = 3963.1676
  phi_orig = math.radians(orig_lat)
  phi_dest = math.radians(dest_lat)
  delta_lat = math.radians(dest_lat - orig_lat)
  delta_long = math.radians(dest_long - orig_long)
  a = math.sin(delta_lat/2) * math.sin(delta_lat/2) + math.cos(phi_orig) \
      * math.cos(phi_dest) * math.sin(delta_long/2) * math.sin(delta_long/2)
  c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
  d = r * c
  return d
$$ LANGUAGE plpythonu;

From: https://stackoverflow.com/questions/25316193/postgres-sql-amazon-redshift-how-to-calculate-distance-between-two-latitude

Really Basic R Concepts

For assignment should always use <- rather than =

Can use class(VAR) to get the type of an object

R functions allow both positional and name based matching. So class(5) and class(x=5) are both valid invocations, class(y=5) is not because class does not have an argument named y. Default arguments for functions are supported.

Use ?method for docs, e.g. ?class. Running this in RStdudio triggers a wonderful bit of magic where docs are displayed in the output panel rather than the console.

Great overview: https://cecilialee.github.io/blog/2017/12/05/intro-to-r-programming.html

R has lots of built in example data sets

The package datasets contains a selection of example datasets which can be used for testing out and playing with R functions.

You can get a list of all available datasets by running data() in a console. Each dataset is available by default in a variable the name of which is outputted in data(), e.g. AirPassengers or cars.

A summary of the data can be generated using the R function summary, e.g. summary(cars).

More: https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html

and: http://www.sthda.com/english/wiki/r-built-in-data-sets

Cloudflare Free SSL on Heroku

It's been bugging me for ages that I could only access this TIL instance via the Heroku URL, not the til.talkingquickly.co.uk subdomain. Turns out that for SSL to be terminated by Cloudflare, rather than using the custom subdomain CNAME from Heroku, e.g. yourdomain.herokudns.com, as a CNAME entry in Cloudflare, instead you should use the Heroku provided URL, e.g. YOURAPP.herokuapp.com.

From: https://thoughtbot.com/blog/set-up-cloudflare-free-ssl-on-heroku

Creating a read only user for Postgres

One of my most common DBA tasks is adding read only users for hooking up analytics systems, this is the most concise approach I've found to this. Since Redshift mirrors Postgres for most of these things, the same approach should also work on Redshift.

-- Create a group
CREATE ROLE readaccess;

-- Grant access to existing tables
GRANT USAGE ON SCHEMA public TO readaccess;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readaccess;

-- Grant access to future tables
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readaccess;

-- Create a final user with password
CREATE USER some_username WITH PASSWORD 'secret';
GRANT readaccess TO some_username;

This came from this gist: https://gist.github.com/oinopion/4a207726edba8b99fd0be31cb28124d0

Mismatched kubeadm versions & kubeadm join

When joining a cluster with versions of kubeadm and/ or kubelet which don't match the master, you get an error like:

[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.11" ConfigMap in the kube-system namespace
configmaps "kubelet-config-1.11" is forbidden: User "system:bootstrap:tn8mhk" cannot get configmaps in the namespace "kube-system"

This can be fixed by installing matching versions (e.g. apt-cache madison kubeadm then apt-get install kubeadm=OUTPUT_FROM_APT-CACHE

Use apt-cache policy PACKAGE_NAME to get the currently installed version on the master.

Automatic Wrapping of Paragraphs in Vim

It's been driving me crazy for a while then when writing markdown in vim, linebreaks are automatically added to keep all lines a manageable length, but if you subsequently edit a line, you have to manually re-jig the whole paragraph.

It turns out there's a really simple solution:

:set formatoptions+=a

Will automatically re-format the whole paragraph every time you edit a line

Finding Large Directories on Ubuntu

A very quick and dirty way to find large directories, e.g. when out of disk space on an ubuntu server is this:

du -h --max-depth=1 /folder1/folder2

And then just drilling down, e.g. if folder3 shows up as being the largest folder in the output of the above command then do

du -h --max-depth=1 /folder1/folder2/folder3

You can also sort it if you remove -h and then pipe to sort -n e.g.

du -h --max-depth=1 /folder1/folder2 | sort -n

But haven't yet got that to work nicely with humanised file sizes (e.g. showing in M/G instead of bytes).

CTags Ruby and Vim

To get jumping to a class definition working in vim when writing ruby, use https://github.com/tmm1/ripper-tags

Out of the box it won't generated nested classes, e.g. Module::Class just Module and Class which will break if you have lots of identically named classes in individual modules. To fix this you need --extra=q so the full command is:

ripper-tags -R --exclude=vendor --exclude=chef --extra=q

You can then use v to enter visual mode after the leading :: of a definition, w to move to the end of it and then ctrl + ] to jump to the definition of a given class.

Installing old versions of therubyracer Gem

If working with a codebase which relies on an old version of therubyracer gem (in my case 0.9.9) there are likely to be lots of errors building native extensions.

This can be worked around by using an old Apple GCC version temporarily as follows:

brew install homebrew/dupes/apple-gcc42

export CC=/usr/local/Cellar/apple-gcc42/4.2.1-5666.3/bin/gcc-4.2
export CXX=/usr/local/Cellar/apple-gcc42/4.2.1-5666.3/bin/g++-4.2
export CPP=/usr/local/Cellar/apple-gcc42/4.2.1-5666.3/bin/cpp-4.2

brew uninstall v8
gem uninstall libv8

And then running bundle as usual. This will also resolve issues with old versions of twitter-bootstrap-rails failing to installing (in my case 2.0.3).

The AWS CLI is Awesome

Install (OSX) with brew install awscli and configure with aws configure. Enter credentials + default region (this should be e.g. eu-west-1 not Ireland).

To get the size of a bucket:

aws s3api list-objects --bucket BUCKET_NAME --output json --query "[sum(Contents[].Size), length(Contents[])]"

To download a local dump of a bucket:

aws s3 sync s3://BUCKET_NAME_OR_PATH .