Hi guys,
recently I have been asked, independently, by several people about the possibility of auto-importing data into the template when the station goes offline and then comes back online. I have been thinking about this a lot and I am currently not 100% sure how it would actually work technically, if it is possible at all, and there are other things that are not black and white either, so I decided to post about it and see if you have any ideas or feedback.
This is actually a very problematic issue for several reasons, which I will try to briefly explain below (though I am 99% sure I will forget some :D):
- the API uses direct HTTP requests instead of FTP. FTP in general is slow and not very reliable. The problem with an HTTP request is that it is limited in length. There is no way you could send, for example, a day's worth of data in a single request, not to mention longer periods. Depending on the sensors you have, you would probably be OK sending 5-10 data sets, but if you send data to the API every minute, that only covers about 10 minutes of outage. To overcome this we have two options: upload a log file with the missing period's data via FTP, or use multiple HTTP requests.
- another major obstacle is the way browsers and webpages work in general. The communication between your software and the template is always one-way: your station sends data to your software and your software sends data to the template, and it never goes the other direction; the template will never send data to your software. This is a problem because the template has no way to tell the software which data/period it is missing; likewise, it cannot tell the software "yes, I received your data, everything was saved correctly". As you can imagine, an import might fail for all sorts of reasons, and that again becomes a serious issue. Right now, when you do the import manually, you see immediately what is going on: the import script tells you about any potential errors, and you can also check right after the import whether the data is in the database. Now imagine this was done automatically. You would not check it; you might not even know that in the background your database was updated with past records, or that there was an attempt to do it, which failed…
- regularly checking the data, even if the import is automatic, does partially solve the problem. However, given that I have implemented the option to save the import settings, a manual import is already very fast, a matter of a few clicks, so making it automatic but still checking it manually after every station outage would save practically no time.
- sending multiple HTTP requests is possible, but you have to be very careful not to overload the server with them. The questions then are what the interval between requests should be and how to tell whether the data was saved (see the sketch after this list). There is no clear answer here either, because from personal experience I can tell you that every server is different.
- and now we finally get to another major problem. Even if I/we figure out a way of doing this, it would still mostly be work for the software developers, not for me. It would require changes in the software: the software would somehow have to log which data it has already sent to the API, which data is missing, how often it has to resend previous records, load some text file that the template saves upon a successful save, create some log file so that the user can check what is going on, and there would actually be more to do. In the end it might well be more work for the other developers than for me. It is never easy to co-ordinate these things with many other people. In the majority of cases the developers were very nice, helpful and co-operative when we were creating the API, but even that was not easy to co-ordinate, and it was trivial compared to this. So the question is whether they would be willing to do it at all, and whether it would be possible to find a mechanism that all the software would be able to handle/support. It would have to be like the API: creating a unique script for each software would be a step back. Standardization was one of the major goals of the API; it makes things so much easier for me and the developers, and you benefit from it as well, because since I only have to maintain one script, there is much less potential for errors. So several scripts for auto-import is not an option for me. With Meteobridge, for example, this would create yet another problem, because my guess is this would not be a bug fix but a new feature, so it would obviously not be available to all users unless you pay extra.
- keep in mind we are only talking about catching up after the station goes offline and then comes back online; it would definitely not be for importing years of data, which you might do at the very beginning after installing the template. That would still have to be done manually.
- this would also obviously only be something to consider for software like Weather Display, Meteobridge, WeatherCat or WeeWx, not Weather Underground, NetAtmo, WeatherLink or similar. For those companies Meteotemplate is of course absolutely insignificant, and they will never make any changes or add the support that would be necessary for this.
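To give a more concrete idea of the multiple-request approach mentioned above, here is a minimal sketch of how uploader software could push a backlog in small batches with a pause between requests. The endpoint URL, the payload layout, the batch size and the pause are all assumptions for illustration, not the actual Meteotemplate API.

```python
# Minimal sketch only: the URL, payload layout, batch size and pause are
# assumptions for illustration, not the actual Meteotemplate API.
import json
import time
import urllib.error
import urllib.request

API_URL = "https://example.com/template/api.php"  # placeholder endpoint
BATCH_SIZE = 10        # e.g. ten archived records per request
PAUSE_SECONDS = 5      # breathing room so the server is not flooded

def send_backlog(records):
    """Send a list of dicts (one per archived interval) in small batches."""
    for start in range(0, len(records), BATCH_SIZE):
        batch = records[start:start + BATCH_SIZE]
        body = json.dumps({"records": batch}).encode("utf-8")
        req = urllib.request.Request(
            API_URL, data=body,
            headers={"Content-Type": "application/json"})
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                resp.read()   # nothing useful comes back; we can only assume success
        except urllib.error.URLError as err:
            # Without a real confirmation mechanism, all the software can do
            # is stop here and retry the whole batch later.
            raise RuntimeError("upload failed at record %d: %s" % (start, err))
        time.sleep(PAUSE_SECONDS)
```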
As you can see, it is quite a complicated thing. I am not saying I don't want to do it; I only wanted to mention some, though not all, of the potential issues and questions, and also the fact that it is questionable whether it would be worth it at all, given the time, work and number of people necessary to accomplish it, versus how much time it would actually save. Because if it did not work correctly, it could do more harm than good.
I am open to suggestions and any feedback. Do you have any ideas? Did I miss something? Do you think it would be worth it? Would the developers be willing to work on this?
Hi.
And what about the time fetch interval?
I use Cumulus. It saves sensor data to its archive every 1 minute, but Meteotemplate updates every 5 minutes. How would it merge correctly?
Thx,
O.
This would again have to be handled by the SW developer and the data sent in 5-minute intervals. In the case of Cumulus there is unfortunately no need to even consider it, since Steve did not implement the API, which would be 10 times easier than this, and based on our previous conversation it is unlikely Cumulus will ever support the API. Not much I can do.
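Just to illustrate what "sent in 5-minute intervals" could mean for any uploader software, here is a tiny sketch of thinning 1-minute archive records down to a 5-minute step. The record format (a dict with a Unix "timestamp" key) is an assumption, not Cumulus's actual log layout.

```python
# Illustration only: assumes each record is a dict with a Unix "timestamp"
# key; this is not Cumulus's actual archive format.
def thin_to_five_minutes(records):
    """Keep only records whose timestamp falls on a 5-minute boundary."""
    return [r for r in records if r["timestamp"] % 300 == 0]

# Example: records logged every 60 s are reduced to every 300 s.
one_minute = [{"timestamp": t, "temp": 20.0} for t in range(0, 1800, 60)]
five_minute = thin_to_five_minutes(one_minute)   # 6 records instead of 30
```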
Luc
Is there any way your utility could export the data between two dates/times rather than the whole year?
That way you could get a CSV file of the data from any outage, which you could then import into MT and fill in the gaps!
Bill
Currently, without an argument (in weewx.conf), the data of the most recent week is exported. A begin date can also be given. The end date is currently always the most recent record.
During the export, a log with dates/times is shown, so one can always stop the export utility with Ctrl-C before it is finished.
Luc
Bill,
The meteotemplate import utility skips existing records by default, so it might be less work to export more data in weewx and import the data from several outages in one go.
Luc
At first I thought this would be a great feature to add, but after reading your post and thinking about it overnight I'm no longer sure it's worth the effort!
Since switching to WeeWx, the longest outage I've had here was about 2 hours, and that one was operator error (a typo in .htaccess). Most power outages here are fairly short, under an hour, so if I really wanted to fill in the blanks I could do it manually within a short period of time.
From my point of view I think the effort involved well exceeds the benefit.
My $.02
For those who use weewx: I have written an export utility for weewx data to import into meteotemplate. The plan is to include this utility in the weewx meteotemplate uploader, but Matthew has not found time yet to do so.
The utility (export_mt.py) has to be called stand-alone while weewx is temporarily stopped. Each year of data is written to a separate file to be imported into meteotemplate.
Luc
Hi Jachym,
Technically it can be done, and in fact I used this method for a while.
The software I used was a modified version of MesoWX.
At the start of weewx, a request was sent to meteotemplate to report the date of the latest record in the meteotemplate database. When weewx had newer records (e.g. records saved in the weather station during a power cut), these were sent to meteotemplate in groups of 200 records at a time via a POST request. Sending two years of data was handled within a couple of hours.
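Roughly, the flow looked like the sketch below. The endpoint names ("latest.php", "import.php"), the payload layout and the plain-integer timestamp response are only placeholders for illustration; the real MesoWX/weewx code differs.

```python
# Rough sketch of the flow described above; endpoint names, payload layout
# and response format are placeholders, not the real MesoWX/weewx code.
import json
import urllib.request

BASE_URL = "https://example.com/template"   # placeholder template URL
GROUP_SIZE = 200                            # records per POST request

def latest_remote_timestamp():
    """Ask the template for the newest record it already holds (Unix time)."""
    with urllib.request.urlopen(BASE_URL + "/latest.php", timeout=30) as resp:
        return int(resp.read())

def push_missing(local_records):
    """Send every local record newer than the template's latest one."""
    newest = latest_remote_timestamp()
    missing = [r for r in local_records if r["timestamp"] > newest]
    for start in range(0, len(missing), GROUP_SIZE):
        body = json.dumps(missing[start:start + GROUP_SIZE]).encode("utf-8")
        req = urllib.request.Request(
            BASE_URL + "/import.php", data=body,
            headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req, timeout=60).close()
```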
The problem is that it can't easily be made generic. The developer of each uploader would have to write a fairly complex program like this.
Luc
Yes, I had already thought of a slightly different but also possible solution. The problem is that it would require major changes, or rather many new things added to each SW, and I don't think all the developers would be willing to implement this. It was not easy to persuade them (in some cases) to do the API, and this would be ten times more difficult in terms of time.
Hi Jachym,
thanks for your very thorough reflection.
I have one more question: why not just write a PHP script, or something similar, and launch it with a cron job?
Couldn't we upload data from Weather Display's logs?
That way we would be free to run it manually or to put it on a cron job.
Many, many thanks for all your great work!!!
Hi Didier,
A cron job does not solve anything. The software can do the periodic job itself, so it would actually be easier to handle it within the software than with cron.
But even if we say that the software uploads the files to some folder on the server:
1. how would the template know which files it should import? We cannot scan everything all the time, and there could be new data the template wouldn't know about
2. it would again use FTP, which is not ideal
3. most importantly, each program produces a different log file format, so this would not work
You’re perfectly right 🙁