Set up a PostgreSQL database with production data using Docker
Setting up a production-like database locally, and other useful tips
PostgreSQL is one of the most popular relational databases. It is well suited to recording data with relationships between entities, e.g. purchase transactions linked to customers and inventory.
In software development, we keep a separate local database for local development. To see what the end result looks like in a production setting, we want a development database that mirrors the actual production database. We also want to be able to create as many of these dev databases as we like, ideally by running a single line of code!
How?
Docker!
Assume you already have a production database, and you want to create a local database with the same data. The steps are:
- Dump the production database to a file, which will be used to populate the new database image
- Create an empty local database
- Import the production dump file to the local database
- Clean up the data in the local database, e.g. remove real user accounts and add test users for running automated tests
- Create a dump file from the local database
#!/bin/bash

# 1. Create a dump file from the production db
/usr/local/bin/pg_dump "$PROD_DATABASE_URL" > "$DIR/raw_dump.sql"

# 2. The URL of the newly created local database has the form
#    postgres://{user}:{password}@{server}:{port}/{db_name}, e.g.
LOCAL_DATABASE_URL=postgres://postgres@localhost:5432/postgres

# 3. Import the production dump file into the local db
psql "$LOCAL_DATABASE_URL" < "$DIR/raw_dump.sql"

# 4. Clean up the data in the local db
psql "$LOCAL_DATABASE_URL" < "$DIR/clean.sql"

# 5. Create a dump file from the local db
/usr/local/bin/pg_dump \
  --no-acl \
  --no-owner \
  "$LOCAL_DATABASE_URL" > "$DIR/../seed.sql"
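The contents of the clean.sql script in step 4 depend entirely on your schema. As a hypothetical sketch, assuming a users table and a customers table (both names and columns are examples, not from the original):

```sql
-- Hypothetical cleanup script (clean.sql); table and column names are examples only.

-- Remove real user accounts so no production credentials ship in the image.
TRUNCATE TABLE users CASCADE;

-- Add a test user for running automated tests.
INSERT INTO users (email, password_hash, role)
VALUES ('test@example.com', 'not-a-real-hash', 'admin');

-- Anonymise other personal data, e.g. customer emails.
UPDATE customers SET email = 'customer-' || id || '@example.com';
```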
6. Dockerfile: copy seed.sql to /docker-entrypoint-initdb.d/, as all the sql scripts in that directory are run automatically during container startup
FROM postgres:9.6.6-alpine
COPY *.sql /docker-entrypoint-initdb.d/
RUN chmod a+r /docker-entrypoint-initdb.d/*
7. Create docker image
docker build -t "$IMAGE_NAME:$NEW_TAG" .
docker login -u "$DOCKER_USER" -p "$DOCKER_PASS"
docker push "$IMAGE_NAME:$NEW_TAG"
Voila! Everyone on the team can pull the Docker image and run a production-like database locally for development!
Now, the magic single line of code to spin up a production-like database locally is:

# Start a container from the image, mapping the default port 5432 to 9999.
# Port mapping avoids conflicts with any Postgres already running locally.
docker run -p 9999:5432 "$IMAGE_NAME:$NEW_TAG"

To connect, use the original local db URL, but with port 9999:
postgres://postgres@localhost:9999/postgres
One line to spin up multiple databases
There are times when you want to spin up several databases at once, e.g. a Redis database for caching and a PostgreSQL database for an Express server. Instead of running docker run multiple times, there is a quicker way:
Docker Compose
version: '3'
services:
  pgdb_dev:
    image: myrepo/my-postgres:1049
    environment:
      POSTGRES_USER: db1
      POSTGRES_PASSWORD: password123
    ports:
      - "9999:5432"
    volumes:
      - ./pgdata:/var/lib/postgresql/data
  redis_dev:
    image: redis:4
    ports:
      - "6379:6379"
In the above docker-compose.yml, we define two database services to start: pgdb_dev and redis_dev. pgdb_dev uses the image we created before. We also pass in environment variables to define the user and password of the database. According to the documentation of our base postgres image, in the absence of the POSTGRES_DB environment variable, the value of POSTGRES_USER is used as the database name. Therefore the connection URL to pgdb_dev is: postgres://db1:password123@localhost:9999/db1
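That fallback rule can be sketched as a small helper. This is a minimal illustration; connection_url is a hypothetical function, not part of any library:

```python
def connection_url(user, password=None, host="localhost", port=5432, db=None):
    """Build a Postgres connection URL. When db (POSTGRES_DB) is absent,
    the postgres image falls back to the user name (POSTGRES_USER)."""
    auth = user if password is None else f"{user}:{password}"
    return f"postgres://{auth}@{host}:{port}/{db or user}"

# The pgdb_dev service above sets no POSTGRES_DB, so the db name is "db1":
print(connection_url("db1", "password123", port=9999))
# → postgres://db1:password123@localhost:9999/db1
```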
To persist the state of the Docker Compose setup, we use volumes, which stores the container's Postgres data in the local ./pgdata folder. To avoid port conflicts, we map the default 5432 to a different port, 9999.
This time, the magic one-line command to run is docker-compose up, and both of the defined services, pgdb_dev and redis_dev, will be up and running!
Other Common Tips
- Double quotes "" can be used to wrap column names, and single quotes '' to wrap values.
- jsonb_set is a very useful function for updating the json or jsonb objects inside the database.
- Use like to quickly search for items in the database by matching strings, e.g. name like 'Romeo and %', where % is a wildcard.
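The tips above can be illustrated with a few queries; a sketch assuming a hypothetical books table with a jsonb meta column:

```sql
-- Double quotes around an identifier, single quotes around a value.
select "name" from books where "name" = 'Romeo and Juliet';

-- jsonb_set updates one key inside a jsonb column.
update books
set meta = jsonb_set(meta, '{genre}', '"tragedy"')
where "name" = 'Romeo and Juliet';

-- like with the % wildcard matches any suffix.
select "name" from books where "name" like 'Romeo and %';
```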