Hello @mbatz,
Today I encountered a strange situation with the application.
I have two instances of the application on different servers; the situation occurred on both instances, although it manifested differently.
Both instances had been running since 17/06/2020, and I should add that nothing was written to the application logs, neither warnings nor errors.
Namely:
Instance 1 - the application does not open the following tabs:
Categories, Import Type, Export Type, Import Objects, Export Objects, System Information, Authentication, Database Properties, Rights,
Instance 2 - in addition to the above, also Exportd Job Logs, Users, Groups, Profile
When I clicked these tabs in the application, nothing appeared in the logs, as if the application had stopped working.
However, when I clicked the Types tab or browsed through objects, it worked.
I also tried entering a URL path directly, e.g. /framework/category, and this did not help either; after reloading, the page returned to the empty base URL ip_address:4000 on both instances.
I tried Chrome and Firefox, and the results were the same.
After restarting everything (mongod, rabbitmq and datagerry) on one instance, the application started working correctly again on that instance.
Can you reproduce that behavior, or was that a single case? As I can see port 4000 in the screenshot, do you access the DATAGERRY webserver directly? We recommend running the application behind an Nginx proxy server for performance reasons.
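For reference, a minimal reverse proxy setup could look roughly like the sketch below; the server name is a placeholder, and the backend address and port are assumptions based on the defaults mentioned in this thread:

# minimal Nginx reverse proxy sketch for DATAGERRY (assumed backend: 127.0.0.1:4000)
server {
    listen 80;
    server_name datagerry.example.com;   # hypothetical hostname

    location / {
        proxy_pass http://127.0.0.1:4000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}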
Hi,
So far, this has only happened once.
Yes, one instance works behind Nginx and the second works without Nginx, and it happened on both instances.
I wrote ip_address:4000 because I can't show you the real addresses I use.
I can't reproduce this situation. Maybe if I don't restart the application for 14 days we could reproduce that behavior, or maybe someone else running DATAGERRY has experienced something similar?
Just let me know when this error appears again. I haven't seen this in any of our setups yet, but if we see that error again, we need to pay attention to it.
Hello @mbatz,
As we agreed, now that the problem has reappeared I am returning to the topic.
Today I wanted to check the logs on the Object Logs page,
and I can't get there.
When I click the "Object Logs" button, the menu hides and nothing more happens. I'm still in the same place I was before, and I can't get to the logs. So I checked other tabs, and I also can't open the pages behind these tabs:
System Information
Authentication
Database Properties
Rights
I tested everything in Chrome.
It doesn't work in other web browsers either.
I can also add that when I clicked on working tabs I saw corresponding entries in the log, but when I clicked on the tabs that don't work, no information appeared in the log at all.
Thanks for your detailed report. Did a restart of DATAGERRY solve the problem? It seems that in some rare cases (currently I have not seen this in any of our customer setups), the backend no longer answers HTTP requests. As I could not reproduce it in my testing environment, this is hard to debug. I will discuss the options with the team.
Whenever I restart DATAGERRY, it works fine again.
In my case both instances behave the same: after 14 days the application can no longer work normally with every tab.
The instances run on virtual machines with CentOS 7 and CentOS Stream (now major version 8).
They do not run in Docker; they have 4 GB and 8 GB RAM, with 2 and 4 vCPUs.
One of the instances runs behind Nginx. Only these applications run on those virtual machines.
Hello,
Today it happened again, exactly after 14 days: I could not enter various tabs.
I wanted to check the logs, but there were only 3 files:
exportd.log
webapp.log
webserver.access.log
Only a restart helped.
Yesterday I did some analysis on this, as I saw such a behavior in my development environment. DATAGERRY was started in the foreground and I did some resizing of the SSH terminal window. This caused the signal "SIGWINCH" to be sent. The internal webserver we use for the DATAGERRY backend, gunicorn, will handle that signal by shutting down its worker processes. This was implemented for a specific use case. After that, the DATAGERRY webserver was not responding anymore. This only happened when starting DATAGERRY in the foreground, and I could not reproduce that behavior in any other setting (Docker, running as a daemon, running in the background). So that should not be the reason for the issues in your setup. But what we could see were log entries in the webserver.error.log. Every time a worker was closed or anything else happened there, gunicorn created a log entry. In our example yesterday, we saw the following logs:
That makes me think that the DATAGERRY webserver should not be the problem in your setup, as I cannot find any log entries in your webserver.error.log file. Could it be another issue on your machines? Maybe something like firewalld or anything else. Can you try to access the DATAGERRY backend with curl, both from a remote host and from the local DATAGERRY machine, the next time that happens?
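A quick check could look roughly like this; the address and port are placeholders matching the defaults mentioned in this thread:

# on the DATAGERRY machine itself
curl -v http://127.0.0.1:4000/
# from a remote host (replace ip_address with the real server address)
curl -v http://ip_address:4000/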
Hello,
Today, after entering the URL of my DATAGERRY instance, I got a blank page. I checked in Firefox and Chrome, and the results were the same.
I checked the debugger in Chrome and saw:
DATAGERRY
Please enable JavaScript to continue using this application.
curl from a remote host shows:
<!doctype html>
DATAGERRY
Please enable JavaScript to continue using this application.
The find command shows only 2 files:
find /tmp/_MEIR63D85/ -type f
/tmp/_MEIR63D85/cmdb/interface/net_app/DATAGERRYApp/index.html
/tmp/_MEIR63D85/logs/webserver.access.log
Is it possible that /tmp on your machine is cleared after some time? When you start the DATAGERRY binary, it extracts content (like a Python interpreter and the DATAGERRY code) into a subdirectory of /tmp, which in your case is /tmp/_MEIR63D85. If some of the files are deleted while DATAGERRY is running, a crash of the application is possible.
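One way to check this theory the next time it happens could be to look at the timestamps of a file still left in the extraction directory (the path below is taken from the find output above) and compare them with the clean-up age:

# show atime/mtime/ctime of a remaining extracted file
stat /tmp/_MEIR63D85/cmdb/interface/net_app/DATAGERRYApp/index.html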
Hi @mbatz,
It could be this, because I haven't changed anything and the default configuration for cleaning /tmp is in place.
I have now changed the configuration and will check what happens next.
I also think that, if you are right, some mechanism should be developed to protect against such a situation: for example, refreshing all files once a day or when starting the application, or shipping an entry in the /tmp cleaning configuration that protects against the clean-up, since the application unpacks exactly into /tmp. An unchanging directory name would also make such an entry in the /tmp cleaning configuration easier.
Best of all would be the possibility to set the exact directory into which the application unpacks, so that /tmp can be omitted and the instance created elsewhere.
I did some research on that. It is systemd-tmpfiles which cleans files in some directories (like /tmp) based on a configuration. Configuration files for systemd-tmpfiles are placed in /usr/lib/tmpfiles.d and /etc/tmpfiles.d. On my CentOS development box, the default configuration for /tmp is defined to clean all files that are older than 10d (ctime, atime and mtime). Unfortunately, we cannot manipulate the timestamps of the files placed in /tmp, as we use a library (PyInstaller) to create the binary, which does not provide such a functionality.
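For reference, on a stock CentOS box the relevant defaults in /usr/lib/tmpfiles.d/tmp.conf look roughly like this (the exact contents may differ between releases):

# clean /tmp after 10 days, /var/tmp after 30 days
v /tmp 1777 root root 10d
v /var/tmp 1777 root root 30d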
In future releases, we'll roll out a configuration file in /usr/lib/tmpfiles.d/datagerry.conf, which prevents systemd-tmpfiles from deleting our files in /tmp:
# systemd tmpfiles exclude file for DATAGERRY
# Exclude PyInstaller temporary files
x /tmp/_MEI*
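Until that file ships with a release, the same exclusion can be created by hand; placing it under /etc/tmpfiles.d is one option, since files there take precedence over files of the same name in /usr/lib/tmpfiles.d:

# create the exclusion manually (run as root)
echo 'x /tmp/_MEI*' > /etc/tmpfiles.d/datagerry.conf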
Hello,
Sure, got it.
This will be a great solution, because then nobody who newly installs DATAGERRY will run into crashing functionality in a running application caused by systemd-tmpfiles removing files from the /tmp directory.
So far, I have implemented a similar entry in the main tmp.conf configuration file:
X /tmp/_MEI*
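One note based on the tmpfiles.d man page: lowercase "x" excludes a matching path and, if it is a directory, everything below it from clean-up, while uppercase "X" excludes only the matching path itself, not its contents. If the whole extracted _MEI directory tree should survive, the lowercase variant from the post above may be the safer choice.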