Cannot resolve db hostname from web container

Hello,

I am trying to deploy Mastodon using the tootsuite image and configuration, but I ran into issues running the db:migrate command.

I am running Docker-Compose and Podman 3.0 on a Fedora 33 server.

$ docker-compose --version
docker-compose version 1.27.4, build unknown

$ docker --version
podman version 3.0.1

$ cat /etc/fedora-release
Fedora release 33 (Thirty Three)

I had to make some changes in the docker-compose.yml to make it work. Here is my current config file:

version: '3'
services:

  db:
    restart: always
    image: postgres:9.6-alpine
    shm_size: 256mb
    networks:
      - internal_network
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "postgres"]
      timeout: 45s
      interval: 10s
      retries: 10
    volumes:
      - ./postgres:/var/lib/postgresql/data
    environment:
      - POSTGRES_HOST_AUTH_METHOD=trust

  redis:
    restart: always
    image: redis:6.0-alpine
    networks:
      - internal_network
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      timeout: 45s
      interval: 10s
      retries: 10
    volumes:
      - ./redis:/data

#  es:
#    restart: always
#    image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.10
#    environment:
#      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
#      - "cluster.name=es-mastodon"
#      - "discovery.type=single-node"
#      - "bootstrap.memory_lock=true"
#    networks:
#      - internal_network
#    healthcheck:
#      test: ["CMD-SHELL", "curl --silent --fail localhost:9200/_cluster/health || exit 1"]
#    volumes:
#      - ./elasticsearch:/usr/share/elasticsearch/data
#    ulimits:
#      memlock:
#        soft: -1
#        hard: -1

  web:
    #    build: .
    image: tootsuite/mastodon
    restart: always
    env_file: .env.production
    command: bash -c "rm -f /mastodon/tmp/pids/server.pid; bundle exec rails s -p 3000"
    networks:
      - external_network
      - internal_network
    healthcheck:
      test: ["CMD-SHELL", "wget -q --spider --proxy=off localhost:3000/health || exit 1"]
      timeout: 45s
      interval: 10s
      retries: 10
    ports:
      - "127.0.0.1:3000:3000"
    depends_on:
      - db
      - redis
#      - es
    volumes:
      - ./public/system:/mastodon/public/system

  streaming:
    build: .
    image: tootsuite/mastodon
    restart: always
    env_file: .env.production
    command: node ./streaming
    networks:
      - external_network
      - internal_network
    healthcheck:
      test: ["CMD-SHELL", "wget -q --spider --proxy=off localhost:4000/api/v1/streaming/health || exit 1"]
      timeout: 45s
      interval: 10s
      retries: 10
    ports:
      - "127.0.0.1:4000:4000"
    depends_on:
      - db
      - redis

  sidekiq:
    build: .
    image: tootsuite/mastodon
    restart: always
    env_file: .env.production
    command: bundle exec sidekiq
    depends_on:
      - db
      - redis
    networks:
      - external_network
      - internal_network
    volumes:
      - ./public/system:/mastodon/public/system
## Uncomment to enable federation with tor instances along with adding the following ENV variables
## http_proxy=http://privoxy:8118
## ALLOW_ACCESS_TO_HIDDEN_SERVICE=true
#  tor:
#    image: sirboops/tor
#    networks:
#      - external_network
#      - internal_network
#
#  privoxy:
#    image: sirboops/privoxy
#    volumes:
#      - ./priv-config:/opt/config
#    networks:
#      - external_network
#      - internal_network

networks:
  external_network:
  internal_network:
    internal: true

And a diff to help you spot the changes:

TL;DR: I added options to the healthchecks, added an env variable that lets postgres run without a password (DB_PASS is set in .env.production but apparently not used), and commented out the build for web so that the image from the repository is used (the build was failing for some reason; see the logs at the end).

$ git diff docker-compose.yml
diff --git a/docker-compose.yml b/docker-compose.yml
index 52eea7a74..a8e047ec7 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -9,8 +9,13 @@ services:
       - internal_network
     healthcheck:
       test: ["CMD", "pg_isready", "-U", "postgres"]
+      timeout: 45s
+      interval: 10s
+      retries: 10
     volumes:
       - ./postgres:/var/lib/postgresql/data
+    environment:
+      - POSTGRES_HOST_AUTH_METHOD=trust

   redis:
     restart: always
@@ -19,6 +24,9 @@ services:
       - internal_network
     healthcheck:
       test: ["CMD", "redis-cli", "ping"]
+      timeout: 45s
+      interval: 10s
+      retries: 10
     volumes:
       - ./redis:/data

@@ -42,7 +50,7 @@ services:
 #        hard: -1

   web:
-    build: .
+    #    build: .
     image: tootsuite/mastodon
     restart: always
     env_file: .env.production
@@ -52,6 +60,9 @@ services:
       - internal_network
     healthcheck:
       test: ["CMD-SHELL", "wget -q --spider --proxy=off localhost:3000/health || exit 1"]
+      timeout: 45s
+      interval: 10s
+      retries: 10
     ports:
       - "127.0.0.1:3000:3000"
     depends_on:
@@ -72,6 +83,9 @@ services:
       - internal_network
     healthcheck:
       test: ["CMD-SHELL", "wget -q --spider --proxy=off localhost:4000/api/v1/streaming/health || exit 1"]
+      timeout: 45s
+      interval: 10s
+      retries: 10
     ports:
       - "127.0.0.1:4000:4000"
     depends_on:

Generating secrets was fine, but db:migrate failed:

$ sudo docker-compose run --rm web bundle exec rails db:migrate
Creating network "mastodon_internal_network" with the default driver
Creating network "mastodon_external_network" with the default driver
Creating mastodon_db_1    ... done
Creating mastodon_redis_1 ... done
Creating mastodon_web_run ... done
rails aborted!
PG::ConnectionBad: could not translate host name "db" to address: Name or service not known

The db service is running, as I can see from the container logs:

$ sudo docker logs -f mastodon_db_1

PostgreSQL Database directory appears to contain a database; Skipping initialization

LOG:  database system was shut down at 2021-04-01 07:02:04 UTC
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

So the question is: why can't the web service resolve the db hostname to reach the db service?

Additional question: is there a command I can run to test whether db is reachable from the web container?

Bonus 1: stdout of building web image:

STEP 14: RUN cd /opt/mastodon &&   bundle config set deployment 'true' &&   bundle config set without 'development test' &&     bundle install -j$(nproc) &&    yarn install --pure-lockfile
Don't run Bundler as root. Bundler can ask for sudo if it is needed, and
installing your bundle as root will break this application for all non-root
users on this machine.
Fetching gem metadata from https://rubygems.org/............
Your bundle is locked to mimemagic (0.3.5), but that version could not be found
in any of the sources listed in your Gemfile. If you haven't changed sources,
that means the author of mimemagic (0.3.5) has removed it. You'll need to update
your bundle to a version other than mimemagic (0.3.5) that hasn't been removed
in order to install.
subprocess exited with status 7
subprocess exited with status 7
STEP 15: FROM ubuntu:20.04

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/compose/cli/main.py", line 67, in main
    command()
  File "/usr/lib/python3.9/site-packages/compose/cli/main.py", line 126, in perform_command
    handler(command, command_options)
  File "/usr/lib/python3.9/site-packages/compose/cli/main.py", line 290, in build
    self.project.build(
  File "/usr/lib/python3.9/site-packages/compose/project.py", line 468, in build
    build_service(service)
  File "/usr/lib/python3.9/site-packages/compose/project.py", line 450, in build_service
    service.build(no_cache, pull, force_rm, memory, build_args, gzip, rm, silent, cli, progress)
  File "/usr/lib/python3.9/site-packages/compose/service.py", line 1147, in build
    raise BuildError(self, event if all_events else 'Unknown')
compose.service.BuildError: (<Service: web>, {'error': 'error building at STEP "RUN cd /opt/mastodon &&   bundle config set deployment \'true\' &&   bundle config set without \'development test\' && \tbundle install -j$(nproc) && \tyarn install --pure-lockfile": exit status 7\n'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/docker-compose", line 33, in <module>
    sys.exit(load_entry_point('docker-compose==1.27.4', 'console_scripts', 'docker-compose')())
  File "/usr/lib/python3.9/site-packages/compose/cli/main.py", line 78, in main
    reason = " : " + e.reason

The build failure you’re hitting is unfortunately caused by https://discourse.joinmastodon.org/t/mimemagic-0-3-5-no-longer-available/. In short, because existing versions of one of Mastodon's dependencies have been removed from the repositories, no release version of Mastodon is currently installable. Hopefully a new version will be available soon, but in the meantime you have a few options; see "Mimemagic 0.3.5 no longer available - #15 by Claire".

I switched to the main branch and was able to build the images, but I ran into the same issue.

I assume there is another problem preventing the web service from resolving the hostname db.

Ah, I’m sorry, I only reacted to the last part of your message. I’m not knowledgeable enough about docker/docker-compose to help with that issue :confused:

Looks like a podman/compose issue and not Mastodon’s… What happens if you leave only the internal_network definition? What does /etc/resolv.conf look like in a container? Can the containers resolve any other names?

Yes, it seems to be an issue with Podman and/or Docker-Compose. I highly doubt the issue comes from Mastodon.

Without any change (both internal_network and external_network declared):

$ sudo docker-compose run --rm web cat /etc/resolv.conf
Creating mastodon_web_run ... done
search openstacklocal
nameserver 10.89.8.1

With only internal_network:

$ sudo docker-compose run --rm web cat /etc/resolv.conf
Creating mastodon_web_run ... done
search openstacklocal
nameserver 213.186.33.99
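
To compare the two setups without reading the whole file each time, the nameserver lines can be pulled out with a small filter. This is only a sketch: `list_nameservers` is a made-up helper name, and the `-T` flag is used so that docker-compose does not allocate a pseudo-terminal when the output is piped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the nameserver addresses found in
# resolv.conf-style input on stdin, one per line.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }'
}

# Assumed usage from the host:
# sudo docker-compose run --rm -T web cat /etc/resolv.conf | list_nameservers
```

An address inside the container network range (like 10.89.8.1 above) suggests the container is asking a resolver provided by the container network, while a public address suggests it simply inherited the host's resolver.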

How can I test whether any other name can be resolved from the web container? I tried nslookup, ping and curl, but none of them are included in the image.

Maybe “getent hosts name” works?

213.186.33.99 is an OVH nameserver (probably the one you are using externally); it is unlikely to be able to resolve container names.
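
For what it's worth, here is a minimal sketch of how resolution and reachability could be checked without nslookup, ping or curl. It assumes the image ships bash, getent and coreutils `timeout`. The helper names `can_resolve` and `can_connect` are made up for illustration, and port 5432 is just the PostgreSQL default, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of Mastodon, Podman or docker-compose.

# Succeeds if the name resolves through the system resolver
# (/etc/hosts, DNS, ... as configured in nsswitch.conf).
can_resolve() {
  getent hosts "$1" > /dev/null
}

# Succeeds if a TCP connection to host:port can be opened within 5 seconds.
# Relies on bash's built-in /dev/tcp redirection, so no extra tools are needed.
can_connect() {
  local host=$1 port=$2
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2> /dev/null
}

# Assumed usage, inside the web container:
# can_resolve db && can_connect db 5432 && echo "db is reachable"
```

One way to try it is to open a shell with `sudo docker-compose run --rm web bash` and paste the two functions there. If `can_resolve db` fails, the problem is name resolution; if it succeeds but `can_connect` fails, the name resolves and the problem lies elsewhere (network routing or the database not listening).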

First, getent hosts db and getent hosts web returned nothing from the web container.

I tried removing the networks declarations completely and… it worked.

Now I have another error, but I guess I should create a separate topic for it.

ActiveRecord::NoDatabaseError: FATAL:  role "mastodon" does not exist

And from the db container:

$ sudo docker exec --user postgres -it mastodon_db_1 /bin/sh
/ $ psql
psql (9.6.21)
Type "help" for help.

postgres=# \l
                                 List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
-----------+----------+----------+------------+------------+-----------------------
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
(3 rows)

postgres=# \du
                                   List of roles
 Role name |                         Attributes                         | Member of
-----------+------------------------------------------------------------+-----------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

So it seems that the db was not initialized.
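
One possible way forward, sketched under assumptions the thread does not confirm: that `.env.production` sets `DB_USER=mastodon` (the file is not shown here), and that the container is still named `mastodon_db_1`. The idea is to create the missing role by hand and then let Rails create the database. `create_role_sql` is a hypothetical helper that only builds the SQL statement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SQL that creates a login role which is
# allowed to create databases. The role name is interpolated as-is, so
# only pass trusted input.
create_role_sql() {
  printf 'CREATE USER "%s" CREATEDB;' "$1"
}

# Assumed usage from the host (container name taken from the logs above;
# the role name is a guess based on the error message):
# sudo docker exec --user postgres mastodon_db_1 psql -c "$(create_role_sql mastodon)"
# sudo docker-compose run --rm web bundle exec rails db:setup
```

`db:setup` is the standard Rails task that creates the database and loads the schema; whether it is the right entry point for this particular Mastodon version is an assumption worth checking against the Mastodon documentation.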

I think you can slowly work from there. If the db container was not reachable, Postgres is probably still in a blank state, so you can start this part from scratch.

What is the working /etc/resolv.conf configuration set by podman?

/etc/resolv.conf is in the same state as before (when both internal_network and external_network were declared).

But now getent hosts db gives the correct result:

$ sudo docker-compose run --rm web getent hosts db
Creating mastodon_web_run ... done
10.89.7.48      db
