Unable to deploy Elasticsearch indices when upgrading to Mastodon 3.0

I upgraded my DigitalOcean droplet instance to Mastodon v3.0.0rc1. I successfully followed the Upgrade Notes instructions on the GitHub releases page. The instructions end with step 7:

7. If you are using ElasticSearch, there are new indices to be deployed 
( **this step is likely to take a considerable amount of time** , so 
running it through  `screen`  or  `tmux`  is advisable):

* Non-Docker:  `RAILS_ENV=production bin/tootctl search deploy`
* Docker:  `docker-compose run --rm web bin/tootctl search deploy`

My DigitalOcean droplet is non-Docker. As user “mastodon”, I ran `RAILS_ENV=production bin/tootctl search deploy`. Within a few seconds, I got this output:

949:in `rescue in block in connect': Failed to open TCP 
connection to localhost:9200 (Connection refused - connect(2) 
for "localhost" port 9200) (Faraday::ConnectionFailed)

Why does this involve localhost? Why did it need to make a TCP connection? What is Faraday?

After these steps, my instance is running v3, but when I type an at-sign and a username into the Toot textbox, it attempts and fails to autocomplete. This then shows a 500 error.

“Connection refused” means that the server which should answer (probably the Elasticsearch component) is not running. “localhost” means the connection is made not over the Internet but to the same machine that is running Mastodon.

Faraday is a Ruby component that lets us build HTTP connections. It is the component that tried to connect and failed.

Make sure your ElasticSearch programs are running properly.
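
One way to check this from the shell is to query the port directly with `curl` and look at its exit code: curl documents exit code 7 as "failed to connect" and 28 as "operation timed out". The following is a minimal sketch, assuming Elasticsearch is expected at `http://localhost:9200` (the address in the error above). The helper names `explain_curl_exit` and `probe_es` are made up for illustration.

```shell
# Hypothetical helper: translate a curl exit code into a plain-language hint.
# Exit codes 7 and 28 are documented by curl; the hint wording is illustrative.
explain_curl_exit() {
  case "$1" in
    0)  echo "responding" ;;
    7)  echo "could not connect (nothing listening, or connection refused)" ;;
    28) echo "timed out" ;;
    *)  echo "other failure (curl exit code $1)" ;;
  esac
}

# Hypothetical wrapper: probe a URL (assumed default: http://localhost:9200)
# and print the hint for whatever exit code curl returns.
probe_es() {
  code=0
  curl --silent --output /dev/null --max-time 10 \
    "${1:-http://localhost:9200}" || code=$?
  explain_curl_exit "$code"
}
```

Running `probe_es` on the Mastodon machine would then print one of the hints above. A "could not connect" result corresponds to the Faraday::ConnectionFailed error Mastodon reported.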

Thanks. It sounds like I need to learn more about Elasticsearch specifically, and network administration in general, in order to determine what is improper about my Elasticsearch programs. I don’t know where to start, but perhaps the best place to start is to get a book about Elasticsearch.

Not sure you need that. I don’t know what kind of system you are running this on but you should start looking into this:

  1. How does Elasticsearch start on my machine? Does it start automatically, or manually?
  2. Where is the logging information about the progress of Elasticsearch startup (things like journalctl)?
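
On a systemd-based machine, both questions can usually be answered with `systemctl` and `journalctl`. The sketch below assumes the service unit is named `elasticsearch.service`, the name the Debian package uses. The function name `es_service_report` is hypothetical.

```shell
# Hypothetical helper: show how the Elasticsearch unit is started and what
# it last logged. The unit name is an assumption; pass another as $1.
es_service_report() (
  unit="${1:-elasticsearch.service}"
  # "enabled" means systemd starts the unit automatically at boot.
  echo "enabled at boot: $(systemctl is-enabled "$unit")"
  # "active" means the unit is running right now.
  echo "running now:     $(systemctl is-active "$unit")"
  # Print the last 50 journal lines for the unit without opening a pager.
  journalctl -u "$unit" --no-pager -n 50
)
```

Reading the journal may require `sudo`, depending on which groups the current user belongs to.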

This is the DigitalOcean droplet which the Mastodon developers provided as a 1-click install.

After I first got the droplet running, I set up Elasticsearch, which was not turned on in the droplet’s image by default. To get it to automatically start whenever the system boots, I ran `sudo /bin/systemctl enable elasticsearch.service` after installing Elasticsearch.

But in case it was not running, I just now ran `sudo systemctl start elasticsearch.service` to start it. I then ran `RAILS_ENV=production bin/tootctl search deploy` again. This time, the output ends with:

/home/mastodon/.rbenv/versions/2.6.4/lib/ruby/2.6.0/net/protocol.rb:217:in 'rbuf_fill': Net::ReadTimeout with #<TCPSocket:(closed)> (Faraday::TimeoutError)

The config file, `/etc/elasticsearch/elasticsearch.yml`, sets the log location with `path.logs: /var/log/elasticsearch`. Paging through these log files in nano, I have not found much that is meaningful to me for the time periods in which I attempted to run the command.

That’s a different message from the “Connection refused” you had before, which means you are one step further along (not much further, but still). Can you check `journalctl` for the service? Can you run `netstat -l --inet` to see whether port 9200 is open? Do you see any Java processes running?

The output of `sudo journalctl -u elasticsearch`, in its entirety, is:

-- Logs begin at Wed 2019-08-07 11:23:10 UTC, end at Tue 2019-10-01 16:15:38 UTC. --
Oct 01 01:55:02 mastodon-s-1vcpu-1gb-nyc1-01 systemd[1]: Started Elasticsearch.
Oct 01 01:55:02 mastodon-s-1vcpu-1gb-nyc1-01 elasticsearch[1509]: warning: Falling back to java on path. This behavior is deprecated. Specify JAVA_HOME

It is not yet clear to me which file contains that configuration. For that matter, it might not be related to the error.

`netstat -l --inet` shows that nothing is listening on port 9200:

Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 localhost:6379*               LISTEN
tcp        0      0  *               LISTEN
tcp        0      0 localhost:domain*               LISTEN
tcp        0      0   *               LISTEN
tcp        0      0 localhost:3000*               LISTEN
tcp        0      0 localhost:postgresql*               LISTEN
tcp        0      0 *               LISTEN
tcp        0      0 localhost:4000*               LISTEN
udp        0      0 localhost:domain*
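
Scanning a listing like this by eye is error-prone, so the check can be scripted. The sketch below reads `netstat` (or `ss`) output on standard input and succeeds if any field ends in the given port. It assumes the listing is produced with the `-n` flag (numeric ports), which is an addition to the command used above. The function name `port_listening` is hypothetical.

```shell
# Hypothetical helper: read socket-listing output on stdin and exit 0 if
# some whitespace-separated field ends in ":PORT", otherwise exit 1.
port_listening() {
  awk -v port="$1" '
    { for (i = 1; i <= NF; i++) if ($i ~ (":" port "$")) found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# Example usage (not executed here):
#   netstat -ltn | port_listening 9200 && echo "port 9200 is open"
#   ss -ltn      | port_listening 9200 || echo "nothing on port 9200"
```

Matching on the trailing `:PORT` keeps a port such as 19200 from being mistaken for 9200.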

`ps aux | grep java` gives me output which begins with “elastic”, so it might be relevant to something:

elastic+ 1509 0.5 25.1 3235652 512648 ? Ssl 01:55 4:34 /usr/bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch-1491125781595641508 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/elasticsearch -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/var/log/elasticsearch/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=32 -XX:GCLogFileSize=64m -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/etc/elasticsearch -Des.distribution.flavor=default -Des.distribution.type=deb -cp /usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch -p /var/run/elasticsearch/elasticsearch.pid --quiet
root 20886 0.0 0.0 14660 1012 pts/0 S+ 16:18 0:00 grep --color=auto java

This contains `-Des.path.conf=/etc/elasticsearch`, so it’s a clue to the location of the configs. I went to `/etc/elasticsearch` and opened `elasticsearch.yml`. Nothing in it appears to set a path such as “JAVA_HOME”. Among other things, it contains:

# ---------------------------------- Network -----------------------------------
# Set the bind address to a specific IP (IPv4 or IPv6):
# Set a custom port for HTTP:
#http.port: 9200

And yet 9200 is commented out. I have run out of time for now, but will continue to investigate tonight. If un-commenting this line breaks something, I will have time to fix it.
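
A commented-out `#http.port: 9200` line does not turn the port off. With no override in the file, Elasticsearch uses its built-in default, which is 9200 (the first port in its default range), so un-commenting the line should not be needed. The sketch below reports the effective value for a given config file. The function name `effective_http_port` is hypothetical, and it only handles the simple `key: value` form shown above.

```shell
# Hypothetical helper: print the HTTP port an elasticsearch.yml asks for.
# Commented lines are ignored; with no override, print the default 9200.
effective_http_port() (
  port=$(sed -n 's/^[[:space:]]*http\.port:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" | head -n 1)
  if [ -n "$port" ]; then
    echo "$port"
  else
    echo "9200"
  fi
)

# Example usage (not executed here):
#   effective_http_port /etc/elasticsearch/elasticsearch.yml
```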

Wait a minute… wait a minute… I can successfully search for things on my instance. The problem appears to have gone away.

Huh. It seems like if I don’t figure out why I need to make a TCP connection, something bad might happen. But, for now, there are no visible symptoms. OK.

I’m leaving the solution here for those who come along in the future: increase your memory.

The problem came back with the next Mastodon update. My DigitalOcean droplet had 2GB of memory. I suspect Elasticsearch was hitting the memory limit and shutting down, so when I ran `RAILS_ENV=production bin/tootctl search deploy`, port 9200 refused the connection (Faraday::ConnectionFailed).

When I instead ran `systemctl start elasticsearch.service && RAILS_ENV=production bin/tootctl search deploy`, I got Faraday::TimeoutError. I suspect this is because Elasticsearch would start, then run out of memory and shut down mid-process.
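
Separately from the memory suspicion, chaining the two commands with `&&` launches the deploy as soon as `systemctl start` returns, which can be before Elasticsearch is ready to answer HTTP requests. One way to rule that out is to poll the port until it responds and only then run the deploy. This is a minimal sketch. The helper name `retry` and the numbers (30 tries, 2 seconds apart) are arbitrary assumptions.

```shell
# Hypothetical helper: retry TRIES DELAY COMMAND...
# Runs COMMAND up to TRIES times, sleeping DELAY seconds between attempts.
# Exits 0 on the first success, 1 if every attempt fails.
retry() (
  tries="$1"
  delay="$2"
  shift 2
  while [ "$tries" -gt 0 ]; do
    "$@" && exit 0
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] && sleep "$delay"
  done
  exit 1
)

# Example usage (not executed here; URL assumed to be the default):
#   sudo systemctl start elasticsearch.service
#   retry 30 2 curl --silent --fail --output /dev/null http://localhost:9200 \
#     && RAILS_ENV=production bin/tootctl search deploy
```

If the port never comes up within the allowed attempts, the deploy is skipped, which points back at Elasticsearch itself rather than at `tootctl`.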

I upgraded my DigitalOcean droplet to the 3GB-memory service plan. Then I was able to deploy indices successfully.
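
To see why 2GB was tight, it can help to compare the Java heap with the machine’s physical memory. The `ps` output earlier in the thread shows `-Xms1g -Xmx1g`, which is a 1GB heap. Elastic’s general guidance is to give the heap no more than about half of the RAM, and on this droplet Elasticsearch shares the machine with PostgreSQL, Redis and Mastodon itself. The sketch below assumes a Linux system with `/proc/meminfo`. The function name `heap_share` is hypothetical.

```shell
# Hypothetical helper: heap_share HEAP_MB [MEMINFO_FILE]
# Prints the heap size as a whole-number percentage of physical RAM,
# using the MemTotal line (reported in kB) from /proc/meminfo.
heap_share() {
  awk -v heap_mb="$1" '
    /^MemTotal:/ { printf "%d\n", heap_mb * 1024 * 100 / $2 }
  ' "${2:-/proc/meminfo}"
}

# Example usage (not executed here): share taken by a 1 GB heap
#   heap_share 1024
```

A result near or above 50 on a machine that also runs other services suggests memory pressure. Whether the kernel actually killed the process can be checked with something like `journalctl -k | grep -i "out of memory"`, though the exact wording of kernel messages varies between versions.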
