New server coming up

After the recent server failure and the restore problems that followed, I decided to upgrade to a new server with some extra RAM and SSD space.

This took a bit longer to be provisioned due to some CPU shortages, but I now have access to the machine and have started setting it up. I hope to have it ready for testing by tomorrow, and it should be ready to fully take over next weekend.

The key figures: the new machine has 128GB of RAM instead of 64GB, and 7TB of SSD space instead of the previous 2TB SSD plus 3TB HDD.

The extra RAM should help with the “out of memory” errors that happen every once in a while when rendering large map areas.

The extra SSD space makes it possible to keep the full database on fast storage, and leaves enough headroom for adding a few imposm-based styles, which was not possible so far.

Outage and new server

After a recent outage I had to restore the server from a backup. While it is mostly running again, some minor services, such as the translation backend, are still missing.

As I was planning to switch to a new server soon anyway, I have not yet bothered to look into all the minor things after restoring the main services.

The new server should be available any day now, and I will take my time setting it up properly while the current server keeps running, also improving the setup, and especially the backup/restore plan, while I’m at it.

The new server should be ready for prime time in a week or two from now, with more RAM and more NVMe SSD disk space available.

Downtime notice for new DB import

Database bloat has again taken its toll, and I will soon be running out of SSD disk space for the main databases.

So I’m going to take the server down over the next weekend, starting late Friday, Feb 20th, around 20:00 UTC. Assuming everything goes well, everything should be back up and running by Sunday afternoon.


Import lag

The database is currently lagging behind. It is catching up now, but it will take a few more days to be fully up to date again.

The problem was that I had temporarily stopped the minutely diff imports for some maintenance about three weeks ago, and then forgot to turn them back on.

Problem #2 was that my monitoring only checked for failed diff imports, not for imports not running at all.
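A stalled import can be detected independently of failed imports by watching the age of the replication state. The sketch below assumes an osm2pgsql/osmosis-style `state.txt` with a `timestamp=` line; the file path and lag threshold are made-up examples, not the actual setup here.

```python
"""Sketch: alert when diff imports are not just failing, but not
running at all, by checking how old the last imported diff is."""

import re
from datetime import datetime, timezone, timedelta

STATE_FILE = "/var/lib/replicate/state.txt"   # hypothetical path
MAX_LAG = timedelta(minutes=30)               # hypothetical threshold

def last_import_time(path):
    """Read the timestamp= line from an osmosis-style state file."""
    with open(path) as f:
        for line in f:
            m = re.match(r"timestamp=(.*)", line.strip())
            if m:
                # state files escape ':' as '\:'
                ts = m.group(1).replace("\\:", ":")
                return datetime.strptime(
                    ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return None

def check(path=STATE_FILE, max_lag=MAX_LAG, now=None):
    """Return an OK/CRITICAL status string for the monitoring system."""
    now = now or datetime.now(timezone.utc)
    ts = last_import_time(path)
    if ts is None:
        return "CRITICAL: no timestamp found in state file"
    lag = now - ts
    if lag > max_lag:
        return f"CRITICAL: diff imports lagging by {lag}"
    return "OK"
```

Run from cron or any monitoring agent, this catches the “imports silently switched off” case that a check for failed import runs alone would miss.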

Experimental: Larger areas supported

So far this service only supported map areas up to about 40×40 kilometers. For now I have extended this to 300×300 kilometers.

This is experimental for now, and I may end up rolling this back, at least partially.

The main problem with this change, aside from longer render times, is that it may lead to out-of-memory errors while processing render requests. This especially seems to affect the compressed SVG output. The problem is that a failure to render one output format makes the whole render job fail, even if other formats could have been rendered just fine.
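One way to keep a single format from sinking the whole job is to render each requested format independently and collect per-format failures. This is only a sketch; `render_job` and the renderer callables are hypothetical, not the actual MapOSMatic renderer API.

```python
"""Sketch: render each output format on its own, so that e.g. a
compressed SVG failure does not discard a perfectly fine PDF."""

def render_job(job, renderers):
    """renderers: dict mapping format name -> callable(job) -> filename.

    Returns (ok, results, failures) where ok is True if at least one
    format rendered successfully."""
    results, failures = {}, {}
    for fmt, render in renderers.items():
        try:
            results[fmt] = render(job)
        except Exception as e:  # includes MemoryError raised in Python code
            failures[fmt] = str(e)
    return bool(results), results, failures
```

The user would then still get the formats that worked, plus a note about the ones that did not.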

I had experimented with raising the limit to 1000×1000 kilometers and tried to render a map of all of Germany with that setting, but SVG output failed with out-of-memory errors regardless of paper size, even though PDF output, which is also a vector format, could be created just fine.

Unfortunately, an out-of-memory error in the rendering library is not something that can easily be caught and recovered from with Python exception handlers, so I need to come up with a more involved way to deal with this.
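When the native rendering library aborts on OOM, no Python `except` clause ever runs; the usual workaround is to run the risky step in a child process, so only the child dies and the parent just sees a nonzero exit code. A minimal, generic sketch (the real job would launch a rendering helper script rather than arbitrary code):

```python
"""Sketch: isolate a crash-prone rendering step in a subprocess, so an
abort in native code cannot take down the whole render job."""

import subprocess
import sys

def run_isolated(python_code, timeout=3600):
    """Run the given Python code in a child interpreter.

    Returns True on success, False if the child crashed or exited
    nonzero (e.g. killed by SIGABRT after a native OOM abort)."""
    proc = subprocess.run(
        [sys.executable, "-c", python_code],
        capture_output=True,
        timeout=timeout,
    )
    return proc.returncode == 0
```

Combined with per-format rendering, this would let the job report “SVG failed, PDF fine” instead of failing outright.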

Rendering the northern part of Germany within the 300×300 kilometer limit works fine, so I’ll keep that setting for now. I have not tested such large maps with styles other than the default OSM one though, so there may still be error situations I’m not aware of.

I will closely watch the number of out-of-memory failures, the average render time, and the render wait queue size over the next days and weeks, and may return to the smaller 40×40 kilometer limit if any of these values get too high.

How not to test changes …

Yesterday I found out the hard way that the neighborhood POI frontend no longer worked, and must have been broken for quite a while already.

Why didn’t I spot this in testing? Well …

  • The problem was a typo in the OCitysMap renderer backend code, not in the frontend
  • I tested the change locally back then
  • But due to another typo in the test setup, the local config for the POI frontend was copied to the wrong folder, and so was not actually used
  • So the frontend fell back to default settings for the rendering host, which happened to be the public server, not the local one
  • Meaning that when testing the change I actually tested the frontend, which hadn’t really changed, against the not-yet-changed public render backend
  • As the test results looked good I pushed the changes and pulled them on the public server, but didn’t test there once more
  • So the typo in the renderer went unnoticed, as I never actually tested the local instance of it, and after the push/pull the public instance was broken, too 🙁

Renderer code and local test setup are fixed now, so something like this should hopefully not happen again.

Now I need to make the alternative frontend send me email notifications about rendering errors, like the main frontend already does, so that a failure like this can’t go unnoticed for this long again …

130K – Status updates

130000+ maps rendered

Some day in September my MapOSMatic instance crossed the line of 130K maps rendered since it started in May 2016. At this rate it will reach 150K even before its fourth anniversary 🙂

Database reimport

I recently had to do a database re-import, as it had turned out that some data was missing, probably due to recent general problems with minutely changesets on the OSM side. I took the opportunity to also upgrade osm2pgsql and the import style, so the database is now ready for v5 releases of the OpenStreetMap Carto style. Older styles should continue to work as before.

Email setup

So far the MapOSMatic setup used my GMail account for sending out error notifications to me, as well as “your map is ready” notifications.

I have now switched to running my own mail server instead, reducing the number of external dependencies and increasing privacy a little bit.

During this I discovered something embarrassing: it looks as if, for years, ever since I introduced the feature, “your map is ready” notifications were never actually sent to the notification address given by users, but only to the server administrator’s address (so mine). I never really noticed, as I used my own address for testing anyway … this is now fixed, too.

Style updates

I have been a bit sloppy with style upgrades in recent months, but I’m going to catch up on this again. For a start I’m going over the style installation scripts in my MapOSMatic Vagrant test project; once that’s finished and all tests pass I will upgrade the styles on the public instance, too. This will affect:

  • The OpenStreetMap Carto style
  • The Humanitarian style
  • The Belgian style
  • The OpenArdenne style

and probably a few others, too.

I’m also still on the hunt for more open source MapnikXML and CartoCSS styles to support, so if you know of any that I’m missing that are based on the osm2pgsql schema, please let me know.

Styles that use imposm instead of osm2pgsql may also be supported in the future, but right now this is not possible due to disk space constraints on the current server.

What’s next?

Once I’m done bringing all styles up to date, I plan to gradually increase the maximum area size that can be rendered. Currently this is limited to 20×20 km, but there’s no real technical reason for that; the limit is just there to cap rendering time.

As all render requests are so far processed one by one from a single queue, this was meant to keep the total queue waiting time low, together with a hard 60-minute limit after which render jobs are killed.

My plan is to set up at least two separate rendering queues in the future, a “quick” and a “slow” one. Based on the area covered, the base and overlay style choices, and single vs. multi-page layout, the system will try to predict whether a requested render job will be fast or slow, and put it in the quick or slow queue accordingly.

This way the typical quick requests, which only take a few minutes or less to render, will no longer have to wait behind e.g. multi-page requests with many pages, and the one-hour limit can be lifted a bit for the slow queue. At the same time I can lower the time limit for the quick queue, so that in cases where the render time estimate turns out to be wrong, a request can be canceled early and moved to the slow queue for re-rendering.
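The dispatch logic described above can be sketched in a few lines. The cost heuristic, the thresholds, and the time limits below are invented for illustration; the real predictor would be tuned against actual render time statistics.

```python
"""Sketch of the planned quick/slow queue dispatch."""

from dataclasses import dataclass

QUICK_LIMIT_S = 10 * 60   # assumed shorter kill limit for the quick queue
SLOW_LIMIT_S = 2 * 3600   # assumed relaxed kill limit for the slow queue

@dataclass
class RenderJob:
    area_km2: float
    n_overlays: int
    n_pages: int

def estimate_cost(job):
    """Very rough render-time estimate in seconds, purely illustrative:
    grows with area, number of overlay styles, and page count."""
    return job.area_km2 * 0.1 * (1 + job.n_overlays) * max(job.n_pages, 1)

def dispatch(job, quick_queue, slow_queue, threshold_s=300):
    """Put the job, paired with its time limit, on the matching queue."""
    if estimate_cost(job) <= threshold_s:
        quick_queue.append((job, QUICK_LIMIT_S))
    else:
        slow_queue.append((job, SLOW_LIMIT_S))
```

A small single-page town map would land in the quick queue, while a 300×300 km multi-page atlas would go straight to the slow one.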

Once that’s done I will probably focus on the paper size dialog once more, finally making it possible to choose a specific map scale there, so that things like “I want a 1:10,000 map on DIN A2 paper” will finally become possible.
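The arithmetic behind such a scale choice is simple: at scale 1:N, one millimetre of paper covers N millimetres of ground, so the map extent is just the paper size times N. A small sketch (paper sizes hard-coded for illustration):

```python
"""Sketch: ground extent covered by a given paper size at a map scale."""

# DIN A-series paper sizes in millimetres (portrait: width, height)
PAPER_MM = {"A2": (420, 594), "A3": (297, 420), "A4": (210, 297)}

def ground_extent_km(paper, scale):
    """Return (width_km, height_km) of ground covered at scale 1:scale."""
    w_mm, h_mm = PAPER_MM[paper]
    # ground mm = paper mm * scale; 1 km = 1e6 mm
    return (w_mm * scale / 1e6, h_mm * scale / 1e6)
```

So a 1:10,000 map on A2 paper covers about 4.2 km × 5.9 km of ground; the dialog would invert this to pick the bounding box from the requested scale and paper size.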

Another thing I’m working on on the side is creating, or extending one of the existing, Mapnik symbolizers for rendering arcs. This would allow rendering things like the camera viewing angles in the “Surveillance under Surveillance” overlay (which right now uses the Python Cairo bindings directly instead of Mapnik) or the nautical navigational light visibility arcs on OpenSeaMap (which uses its own custom renderer implementation for this).