Someone on the PyDev Slack channel was having trouble getting emails
from the Django admin system and another member mentioned there is a
sendtestemail manage.py command that can help you debug your email
settings. That made me wonder what else is available that I didn’t
know about.
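For example (the address is just whatever inbox you want to check):

    python manage.py sendtestemail you@example.com

If the message never arrives, the traceback you get instead usually points at the offending email setting.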
diffsettings
The Django settings object is a bit odd in that you can’t do
print(settings.__dict__) to figure out what is available - and even
if you know (or guess) the name of a setting, how do you know if the
value is the default or something you have overridden in your app?
There is a manage.py command for that! The most useful version is python manage.py diffsettings --output unified. That shows each overridden setting as a diff - the default value in red, your value in green. Add --all if you also want to see the settings you haven't touched.
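For example, in a project that overrides DEBUG and TIME_ZONE (output illustrative):

    $ python manage.py diffsettings --output unified
    - DEBUG = False
    + DEBUG = True
    - TIME_ZONE = 'America/Chicago'
    + TIME_ZONE = 'America/Los_Angeles'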
showmigrations
I have used the default version to check whether I have applied all the existing migrations in my test and prod environments - that's the python manage.py showmigrations --list version. But there is also a
python manage.py showmigrations --plan version. That will show you
the order in which Django will apply migrations.
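For example (app and migration names are illustrative):

    $ python manage.py showmigrations --list myapp
    myapp
     [X] 0001_initial
     [ ] 0002_add_slug_field

    $ python manage.py showmigrations --plan
    [X]  contenttypes.0001_initial
    [X]  auth.0001_initial
    [ ]  myapp.0001_initial

An [X] means the migration has been applied; the --plan ordering reflects the dependencies between apps.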
inspectdb
If you run python manage.py inspectdb against an existing database,
it will generate the Django models that would have created those
tables (and indexes and constraints). This command is intended for projects that must use a legacy database, so all of the models are created with a Meta class with managed = False.
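A sketch of the kind of output it produces, for a hypothetical legacy table (abridged):

    $ python manage.py inspectdb
    class LegacyCustomer(models.Model):
        name = models.CharField(max_length=100, blank=True, null=True)
        signup_date = models.DateField(blank=True, null=True)

        class Meta:
            managed = False
            db_table = 'legacy_customer'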
ping_google
If your site has had a bunch of changes that you want Google to
recrawl, you can use this command to submit your sitemap: python
manage.py ping_google. If you have the Django sitemaps app installed
and a url configured for your sitemap, this command will figure out
what the url should be.
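Usage is simply (the explicit sitemap path is only needed when Django can't work it out for you):

    python manage.py ping_google
    python manage.py ping_google /sitemap.xml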
Union queries are surprisingly easy to create. I need a list of ids and content type ids from a bunch of different models. I was very surprised at how straightforward it is in Django 3.2 to create the UNION query I want.
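It looks something like this - a sketch using two hypothetical models, Book and Author:

    # Collect (pk, content_type_id) pairs from each model in one query.
    union_query = Book.objects.values('pk', content_type_id=<book content type id>).union(
        Author.objects.values('pk', content_type_id=<author content type id>),
        all=True,
    )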
Note: I used all=True because I will never have duplicates in my
(id, content_type_id) tuples and UNION ALL is faster than UNION in
this case because we can skip the DISTINCT operation on the final
result.
The observant among you will have noticed a bit of pseudocode in the
example above. I want to insert the content_type_id from python into
my query. In SQL this would be something like:
    SELECT id, 99 FROM myapp_model;
In the Django ORM, that turns out to be something I didn't know how to do. I can't leave it as a bare name, and if I quote it the ORM tries to turn it into a column name (or a relation that could be turned into a column name). Turns out I need to use Value:
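With Value, the sketch above becomes (in Django 3.2, Value can infer its output_field from the int):

    from django.contrib.contenttypes.models import ContentType
    from django.db.models import Value

    book_ct_id = ContentType.objects.get_for_model(Book).id
    author_ct_id = ContentType.objects.get_for_model(Author).id

    # values() accepts keyword expressions, so each arm can annotate
    # its literal content type id.
    union_query = Book.objects.values('pk', content_type_id=Value(book_ct_id)).union(
        Author.objects.values('pk', content_type_id=Value(author_ct_id)),
        all=True,
    )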
OK so that now will give me a queryset that produces a list of dicts like: [{'pk': 3, 'content_type_id': 44}, {'pk': 3, 'content_type_id': 48}]
But when I tried to use those results in the filter section of another
query… I had my next problem.
Querying by value - without Foreign Key relationships
So now I need to use those ids and content_type_ids to filter another
model that has rows with content_type_id and object_id columns. I want
all the lines in the table for the ModelLogEntry model where the
(object_id, content_type_id) tuple is in the list of (pk,
content_type_id) tuples created by our UNION query above.
If I only needed to match on a single value, I would probably evaluate
the UNION query, and then do something like .filter(pk__in=<list of
pks>) - as I did to get the list of content types I need. But I need
to match the id and content_type_id fields. In SQL, I would do:
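Something along these lines (table names and literal content type ids are illustrative):

    SELECT *
      FROM myapp_modellogentry
     WHERE (object_id, content_type_id) IN (
           SELECT id, 99 FROM myapp_book
           UNION ALL
           SELECT id, 98 FROM myapp_author);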
This was relatively straightforward to write in SQL, so I tried using
raw SQL, e.g. ModelLogEntry.objects.raw('<query here>'). That
definitely gave me the data I was looking for when I ran it in
shell_plus. But when I tried to use it in my monkey patch, the calling
function wanted to use values(), which is a function only defined on
real ORM QuerySets - and not available when using raw.
At this point I suspect I won’t want to use this in
production. Goodness only knows how big the union query is likely to
get. But it is bothering me that I don’t know how to get Django to let
me do a relatively straightforward join without having defined a ForeignKey relationship in my Python model code.
I still don’t know how to tell Django “do this join damn it!”, but
after some reading and thinking of alternate ways to write the SQL, I
think I have found a way to write this in the ORM using Exists to
create a correlated subquery.
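Here is the shape of it - a sketch with the same hypothetical models as above, and with union_query reduced to a single model (which is how I tested it, as noted below):

    from django.db.models import Exists, OuterRef, Value

    # One arm of the union, correlated with the outer query's columns.
    union_query = Book.objects.values('pk', content_type_id=Value(book_ct_id))

    entries = ModelLogEntry.objects.filter(
        Exists(
            union_query.filter(
                pk=OuterRef('object_id'),
                content_type_id=OuterRef('content_type_id'),
            )
        )
    )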
I tested the Exists query by setting the union_query to be just one
type and it works fine. So I learned something useful about the Django
ORM - even if I can’t apply that knowledge in the context in which I
wanted to use it this time.
I have had a VM for web hosting for ….. about as long as VMs have been a thing. Before that I shared a physical machine with some old work colleagues from ArsDigita. But ever since I hosed the PHP on my current VM, I haven't really been doing anything that needed dynamic hosting. So it's about time I saved some time and effort and moved things to static hosting.
We do some AWS stuff at work so one of the simpler ways to move away
from having my own server would be to host static sites in S3. The AWS
console is a hot mess AND constantly changing AND isn’t version
controlled. But terraform will let me manage my AWS configuration
using code that can be version controlled. And will give me some
additional experience using terraform and AWS. So win, win, win.
I found this blog post that does a pretty good job of explaining how to use terraform to set up a static web site on AWS. The only things I had questions about were:
1. Why use CloudFront for a low traffic site?
2. Why not move DNS registration to AWS?
3. Why redirect everything to www instead of to the bare hostname?
Item 1: CloudFront. I think for the author, part of the answer
might be to get good scores on page response speed. I don’t really do
a lot of speed optimization (other than mostly having just text on my
site) but CloudFront is important for another reason. From the AWS docs
as of August 2021:
Amazon S3 website endpoints do not support HTTPS or access
points. If you want to use HTTPS, you can use Amazon CloudFront to
serve a static website hosted on Amazon S3.
Item 2: DNS registration. I had originally been thinking of moving
my DNS registration to AWS so I had everything in one place. I have
been using DotYou.com for ages and they are fine, but nothing
special. So while I was moving things, why not consolidate everything?
Because AWS doesn’t have a super simple way to do email
forwarding. With DotYou.com, I can navigate to the configuration for
any of my domains and add one or more email addresses and where the
email to those addresses should go. It is a personal life goal never
to run a mail server. So I’ll be keeping my domains registered
elsewhere - at least until setting up email forwarding on AWS is as
simple as with my current provider.
Nota bene: Because I want to continue to use the email forwarding from DotYou.com, I had to keep using their default nameservers rather than setting the name servers to be the Route53 name servers that Terraform set up for me. AND because CloudFront gives me a name rather than an IP to point my domain at, I can no longer have an A record for *.example.com; it needs to be a CNAME. I am not 100% sure there aren't some implications of this that I am missing - but so far having a CNAME seems to work just fine….. or mostly, which brings me to item 3: what should be my canonical url?
Item 3: Canonical url. Google prefers that there be one canonical copy of any page - but it treats http://example.com,
https://example.com, http://www.example.com, and
https://www.example.com as 4 different sites - even if they all
serve the exact same content. The best way to consolidate your
analytics is to set up redirects so that all of those similar urls
actually end up on one canonical url. I usually choose
https://example.com as my real url. But the blog post I was
following chose to redirect everything to
https://www.example.com. It wasn't until I had everything set up and was updating the DNS records that I realized why it might have been better if I had redirected everything to www.example.com as in the example. I can create a wildcard CNAME record and point everything to
some CloudFront distribution, and I have a wildcard SSL
certificate. But if I try accessing http://foo.example.com, I get
“403 Forbidden” and a message about this being a mis-configuration. If
I try https://foo.example.com, I get an error with the error code:
SSL_ERROR_NO_CYPHER_OVERLAP
CloudFront supports wildcard CNAMEs so I might be able to set things up to catch all names - but if that record points to the canonical distribution, then no redirect happens - so no consolidation of requests. So I think what I need to
do is reverse my setup and make https://www.example.com my real bucket
and real url. I did find this post about wildcard CloudFront
distributions not behaving as the author expected - but I am not sure
I understand how this is going to affect my setup.
Nope
After much fooling around with DNS settings and S3 bucket settings, I
finally concluded that having working email addresses AND a web site
served by S3/CloudFront on my bare domain name (e.g. cynthiakiser.com)
are incompatible. So I guess I will be keeping my VM after all.
We have largely been using UTF8 for our databases for some time. However, if we
want to support emojis or other more exotic characters, we need to start using
utf8mb4. In fact, our Rails projects are set up to require that character set.
So while we are upgrading database versions (from the EOL MySQL 5.6 to MySQL
8), we decided to upgrade the character sets of our databases at the same time.
I initially tried to configure our database with collation_server as
utf8mb4_0900_ai_ci to get all the latest unicode goodness, but AWS said I
couldn’t set collation_server to that value. The closest option was
utf8mb4_unicode_ci so I went with that. Once I got into the database, I found
the following settings:
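It looked something like this (values are illustrative of the shape of the output, not an exact capture):

    mysql> show variables like 'character_set%';
    +--------------------------+---------+
    | Variable_name            | Value   |
    +--------------------------+---------+
    | character_set_client     | latin1  |
    | character_set_connection | latin1  |
    | character_set_database   | utf8mb4 |
    | character_set_results    | latin1  |
    | character_set_server     | utf8mb4 |
    +--------------------------+---------+

    mysql> show variables like 'collation%';
    +----------------------+--------------------+
    | Variable_name        | Value              |
    +----------------------+--------------------+
    | collation_connection | latin1_swedish_ci  |
    | collation_database   | utf8mb4_0900_ai_ci |
    | collation_server     | utf8mb4_unicode_ci |
    +----------------------+--------------------+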
So from that we see databases should automatically be created with character set utf8mb4 and collation utf8mb4_0900_ai_ci because that is the default collation for the character set utf8mb4. However, I suspect I may want to set the character set (and collation) for the client and connection to utf8mb4.
Changing the values in my terraform aws_db_parameter_group did not change the
values in the running database.
Authentication plugin
I was expecting the password hashing mechanism for the new database to be caching_sha2_password since I had read that was the new default for MySQL 8. But when I created my first user on the new database, I found their password was stored as mysql_native_password.
We do still have some clients that will need to connect with
mysql_native_password but I want those to be the exceptions, not the rule.
Trying to update that in place gave an error.
Timezone information
The Django docs about using MySQL say you need to load the timezone
tables
when creating a new MySQL instance. Both the docker containers we use in dev and
the AWS RDS MySQL instances take care of loading this data automatically.
Notes from MySQL 8 Docs
From https://dev.mysql.com/doc/refman/8.0/en/charset-server.html:

The server character set and collation are used as default values if the database character set and collation are not specified in CREATE DATABASE statements. They have no other purpose.
The values you see for show variables like '%collation%'; will vary depending
on the session details. The database you are in (e.g. one of the user schemas vs
in mysql or information_schema) will affect what you see for collation_database
and character_set_database. And how you connected will affect the values you see
for collation_connection and character_set_connection, character_set_results,
and character_set_client. It looks to me like the controlling variable there is
character_set_client.
We could probably use a single database with multiple schemas for all our development work. That’s
how we have been doing it for years. But sometimes that leads to us making compromises - like
running a slightly different version of the database in dev vs production. And it leads to being
somewhat conservative when trying out different database options. Now that we are using Docker for
deploying our applications, it makes sense to use docker-compose to create all the services our apps
use: relational databases, ElasticSearch, caches. The docker-compose file also manages mounted
volumes (for preserving the data in our development databases) AND a private network for each set of
containers.
Version 1
Creating a database and user
If we want to take full advantage of docker-compose’s automated workflow, we need to be able to
recreate our databases - including loading dev data - automatically. The official MySQL database
container image supports a really easy way to do this. If you set a
handful of environment variables in your docker-compose.yml file, the first time you start the
database container, it will create the database instance and set the root password to the value from
MYSQL_ROOT_PASSWORD. If you include values for MYSQL_DATABASE, MYSQL_USER and
MYSQL_PASSWORD, the first startup of the database will create that database and grant all
privileges on that database to the specified user. Excellent! That gets us most of the way there.
Now if we could only load some initial data….
Loading initial data
The MySQL image provides for this too. In the section “Initializing a fresh instance”:
When a container is started for the first time, a new database with the specified name will be
created and initialized with the provided configuration variables. Furthermore, it will execute
files with extensions .sh, .sql and .sql.gz that are found in /docker-entrypoint-initdb.d. Files
will be executed in alphabetical order. You can easily populate your mysql services by mounting a
SQL dump into that directory and provide custom images with contributed data. SQL files will be
imported by default to the database specified by the MYSQL_DATABASE variable.
So the database section of our standard docker-compose.yml looks like:
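Something along these lines - a sketch; the service name, image tag, and credentials are illustrative:

    services:
      db:
        image: mysql:8.0
        environment:
          MYSQL_ROOT_PASSWORD: root_password
          MYSQL_DATABASE: myapp
          MYSQL_USER: myapp
          MYSQL_PASSWORD: myapp_password
        ports:
          - "3306:3306"
        volumes:
          - ./mysql-data:/var/lib/mysql
          - ./sql/docker:/docker-entrypoint-initdb.d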
And our project’s sql/docker/ directory has:
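Just the dev data dump for the entrypoint to load (the file name is illustrative):

    sql/docker/
    └── initial_data.sql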
And don’t forget to exclude that mysql-data directory from your image by including it in your
.dockerignore file:
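    mysql-data/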
Version 2a - custom user creation
The stock setup above works great - until you have some super old code that you don’t want to have
to upgrade to recent MySQL libraries. If you want to connect those projects to MySQL 8, you need to
tell the database that this user will be using an older authentication plugin. So you need to issue the user creation yourself. One option is to do only the user creation and privilege setting in your
script, leaving the database creation and loading as above. Put the following in
sql/docker/init.sql:
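A sketch, assuming MYSQL_DATABASE=myapp is still set and using illustrative user credentials:

    -- The entrypoint has already created the database from MYSQL_DATABASE;
    -- we create the user ourselves so we can pick the authentication plugin.
    CREATE USER 'myapp'@'%' IDENTIFIED WITH mysql_native_password BY 'myapp_password';
    GRANT ALL PRIVILEGES ON myapp.* TO 'myapp'@'%';
    FLUSH PRIVILEGES;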
And then remove MYSQL_USER and MYSQL_PASSWORD from the docker-compose.yml environment.
Version 2b - fully custom script
The version above works fine, but if you need additional configuration of the database - or just
want to take full control of the user and database creation and data loading - you can use the
following.
First, move the initial_data.sql file out of the project’s sql/docker/ directory:
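    mv sql/docker/initial_data.sql sql/initial_data.sql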
Then remove MYSQL_DATABASE, MYSQL_USER, and MYSQL_PASSWORD variables from your
docker-compose.yml and add another volume to mount the directory where you put initial_data.sql. I
just moved the file up one level and then mounted /sql as /sql_data in the container. The
docker-compose.yml should now look like this:
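Again a sketch with illustrative values; note the extra mount for the data file:

    services:
      db:
        image: mysql:8.0
        environment:
          MYSQL_ROOT_PASSWORD: root_password
        ports:
          - "3306:3306"
        volumes:
          - ./mysql-data:/var/lib/mysql
          - ./sql/docker:/docker-entrypoint-initdb.d
          - ./sql:/sql_data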
Then, in sql/docker/init.sql, create your user and database; then load your data from the mounted file:
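A sketch matching the names used above. The entrypoint feeds .sql files through the mysql client, so the client-side source command is available for loading the mounted dump:

    CREATE DATABASE myapp CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
    CREATE USER 'myapp'@'%' IDENTIFIED WITH mysql_native_password BY 'myapp_password';
    GRANT ALL PRIVILEGES ON myapp.* TO 'myapp'@'%';

    USE myapp;
    source /sql_data/initial_data.sql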