KnowledgeTree Community Edition

Scheduler - Task: Indexing error in log file

Details

  • Type: Bug
  • Status: Closed
  • Priority: Priority One: Immediate fix
  • Resolution: Fixed
  • Affects Version/s: DEV 3.5.3
  • Fix Version/s: 3.6.0
  • Component/s: None
  • Description:
    The following output is from the log file...

    2008-07-17 | 19:57:21 | INFO | 29046 | 678 | n/a | dms | default | main | Scheduler - Task: Indexing
    2008-07-17 | 19:57:21 | INFO | 29046 | 679 | n/a | dms | default | main | Scheduler - Command: "/opt/ktdms/php/bin/php" "/opt/ktdms/knowledgeTree/search2/bin/cronIndexer.php"
    2008-07-17 | 19:57:21 | INFO | 29046 | 679 | n/a | dms | default | main | Scheduler - Output: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html><head>
    <title>404 Not Found</title>
    </head><body>
    <h1>Not Found</h1>
    <p>The requested URL /opt/ktdms/knowledgeTree/search2/indexing/bin/cronIndexer.php was not found on this server.</p>
    <hr>
    <address>Apache/2.0.63 (Unix) mod_ssl/2.0.63 OpenSSL/0.9.8g DAV/2 PHP/5.2.5 Server at 127.0.0.1 Port 80</address>
    </body></html>

    2008-07-17 | 19:57:21 | INFO | 29046 | 679 | n/a | dms | default | main | Scheduler - Background tasks should not produce output. Please review why this is producing output.

Activity

Kevin Fourie added a comment - 17/Jul/08 07:05 PM
/opt/ktdms/knowledgeTree/search2/bin/cronIndexer.php exists and is readable by everyone.
Megan Watson added a comment - 18/Jul/08 09:13 AM
This may be related to KTS-3491.
Megan Watson added a comment - 18/Jul/08 10:57 AM
This seems to be a Linux-specific issue; I haven't reproduced it on Windows. The problem is caused by the way the rootUrl is resolved. The $_SERVER['SCRIPT_NAME'] variable, which contains the path to the script, is used to resolve it. When running through the browser, the path is relative to the server root, so if the server root is /opt/ktdms/knowledgeTree then the path will be search2/indexing/bin/cronIndexer.php. However, when running in the background or via the command line, it is taken relative to the filesystem root, so the path becomes /opt/ktdms/knowledgeTree/search2/indexing/bin/cronIndexer.php.

There was a similar problem with the $_SERVER['SERVER_NAME'] variable. As a result, the server name is saved in a file, serverName.txt, when the login screen is opened. I've added the rootUrl to this file as well. This fixes the problem, but the serverName.txt file needs to be deleted so that it can be regenerated with the new URL.
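
To illustrate the difference described above, here is a minimal sketch of a SCRIPT_NAME-based guess. It is not KnowledgeTree's actual code: the function names guessRootUrl and guessRootUrlForSapi, the "/kt" subfolder, and the use of PHP 7.1 or later are all assumptions made for the example.

```php
<?php
// Illustrative only: these helpers are hypothetical, not KnowledgeTree's code.

// Naive detection: strip the script's application-relative path from SCRIPT_NAME.
function guessRootUrl(array $server, string $relativeScript): string
{
    $scriptName = isset($server['SCRIPT_NAME']) ? $server['SCRIPT_NAME'] : '';
    $suffix = '/' . ltrim($relativeScript, '/');
    $cut = strlen($scriptName) - strlen($suffix);
    if ($cut >= 0 && substr($scriptName, $cut) === $suffix) {
        return (string) substr($scriptName, 0, $cut);
    }
    return '';
}

// Safer variant: refuse to guess on the command line so the caller can
// fall back to a stored value instead.
function guessRootUrlForSapi(array $server, string $relativeScript, string $sapi = PHP_SAPI): ?string
{
    return $sapi === 'cli' ? null : guessRootUrl($server, $relativeScript);
}

$script = 'search2/indexing/bin/cronIndexer.php';
// Web request to a subfolder install: yields "/kt".
echo guessRootUrl(array('SCRIPT_NAME' => '/kt/' . $script), $script), "\n";
// Command line: the filesystem path leaks through, yielding "/opt/ktdms/knowledgeTree".
echo guessRootUrl(array('SCRIPT_NAME' => '/opt/ktdms/knowledgeTree/' . $script), $script), "\n";
```

The second call shows how a filesystem path can end up where a URL prefix is expected, which would match the 404 in the log. Returning null under the CLI is one possible design choice; it forces the caller to read a saved value rather than trust a guess.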
Ricardo dos Santos Trindade added a comment - 13/Nov/08 02:52 PM
I had the exact same problem, running the 3.5.4 stack and source versions in an openSUSE 11.0 environment. The URL formed by the kt_url() function in ktutil.inc doesn't use rootUrl correctly in the final string.

My address should be:
http://***.***.***.***/documentos/...
and it becomes
http://***.***.***.***/...

The same malformed URL appears in the Document Indexer Statistics refresh link, which also lacks "/documentos".

I tried locating serverName.txt but it wasn't where it should be. I solved the first problem with a dirty patch in ktutil.inc, replacing the first line below with the second:

$base_url = str_replace(array("\n","\r"), array('',''), $base_url);
$base_url = 'http://201.6.9.151/documentos';

Any suggestions on a definitive patch for this? As I see it, the fix could be as simple as joining the server name with the rootUrl, but that is only my point of view.

Thanks in advance.
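
The idea of joining the server name with the rootUrl could look roughly like the sketch below. This is not the kt_url() implementation: buildBaseUrl and its parameters are hypothetical names, example.org is a placeholder host, and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: join a stored server name and a rootUrl into a base URL.
function buildBaseUrl(string $serverName, string $rootUrl, string $scheme = 'http'): string
{
    // Strip line breaks (as the str_replace in the patched line does) and stray slashes.
    $host = rtrim(trim(str_replace(array("\n", "\r"), '', $serverName)), '/');
    // Normalise the rootUrl so "", "/" and "/documentos/" are all handled.
    $root = trim(trim($rootUrl), '/');
    $base = $scheme . '://' . $host;
    return $root === '' ? $base : $base . '/' . $root;
}

// Subfolder install: prints http://example.org/documentos
echo buildBaseUrl("example.org\n", '/documentos'), "\n";
// Root install: prints http://example.org
echo buildBaseUrl('example.org', '/'), "\n";
```

Normalising the slashes in one place means an empty rootUrl and "/" behave the same, which avoids doubled or missing separators in the final string.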
Kevin Fourie added a comment - 02/Dec/08 04:47 PM
This is happening again on a fresh install of 3.5.4a on Ubuntu 8.04.

The main symptom was that the External Dependencies Dashlet was complaining about not finding OpenOffice. I eventually tracked this down to the scheduler tasks for indexing and search not running. The logs contain the above error.
Kevin Fourie added a comment - 02/Dec/08 07:01 PM
Added '/' as the default rootUrl in the config_settings table. NULL was seen as empty and the guessing was wrong, which led to the full path being used, which in turn broke the cron scripts.
Kevin Fourie added a comment - 02/Dec/08 07:44 PM
Using '/' as rootUrl breaks other URLs :$
Kevin Fourie added a comment - 03/Dec/08 05:11 AM
This issue is a *real* pain. Seems the whole rootUrl mechanism is brittle.

I have implemented a fix that will work for default installations but NOT for users who install in a subfolder.

e.g. The URL https://mykt.myorg.org/ will work. The URL https://www.myorg.org/mykt/ will NOT work.

This can be solved by simply creating a vhost for KnowledgeTree instead of using a subfolder.

We will need to document this in the release notes while we try to find a way around this.
Fabio Marzocca added a comment - 02/Feb/09 03:16 PM
Any chance to have a workaround for installations into a subfolder, please?
Fabio Marzocca added a comment - 02/Feb/09 03:22 PM
@Kevin: is there any workaround other than installing at a direct link, e.g. http://mykt.myorg.org/?
Istvan Hubay Cebrian added a comment - 20/Feb/09 05:00 PM
I had a very similar problem, the difference being that the URL in the error did not include the rootUrl or the complete path:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL search2/indexing/bin/cronIndexer.php was not found on this server.</p>
<hr>
<address>Apache/2.2.3 (CentOS) Server at 127.0.0.1 Port 80</address>
</body></html>

To fix I edited cronIndex.php and changed it to:

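// Work from this script's directory so the relative include below resolves.
chdir(dirname(__FILE__));
require_once(realpath('../../config/dmsDefaults.php'));

// Prefix the configured rootUrl so the requested URL includes it and a leading slash.
global $default;
KTUtil::call_page($default->rootUrl.'/search2/indexing/bin/cronIndexer.php');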
Istvan Hubay Cebrian added a comment - 20/Feb/09 05:12 PM
Btw Kevin, the rootUrl mechanism is quite strange. Why does a config.ini entry exist when the same rootUrl is stored in the database?

If I migrate my KT install to a different server, it will not work until I manually change the database entry.
Megan Watson added a comment - 09/Apr/09 08:12 PM
The rootUrl mechanism has caused an endless number of headaches. Ideally it would be auto-detected, and it wouldn't matter whether you're working in the server root or in a subfolder. However, with the addition of scheduled tasks that run in the background, this falls flat. As I described in an earlier comment, the server root path is not set in the global variables, so the rootUrl is detected as the full path instead of being relative to the server root.

To overcome this, the recent version adds a server settings section to the configuration settings. Here the internal and external server IP / address and port can be set. These are used by the background tasks to run the indexing through a curl function (for correct permissions) and to create links in alert emails, etc. When running through a browser, the address and port should be auto-detected and the system does not rely on the server settings.

The rootUrl poses an additional problem. Our stack install runs KnowledgeTree within the server root, and therefore the rootUrl is not required. For source installs into a subfolder it is required. When running through a browser the rootUrl is auto-detected; it should then be saved in the config cache (var/cache/configcache), where it can be used by any background tasks. This is how my development machine works (source install on Mac), so it is tested constantly.

I have noticed in a few cases that the configcache does not get created. In those cases, the config setting for the rootUrl is used. The setting in the database overrides both the auto-detected setting and the setting in config.ini. The entry in config.ini is legacy and needs to be removed.
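
The precedence described above can be sketched as a simple fallback chain. This is only an illustration, not KnowledgeTree's code: resolveRootUrl and its parameter names are hypothetical, placing the auto-detected or cached value ahead of config.ini is an assumption (the comment only states that the database setting overrides both), and the nullable types assume PHP 7.1 or later.

```php
<?php
// Hypothetical resolver: pick the first usable rootUrl from the known sources.
// NULL and the empty string are both treated as "not set".
function resolveRootUrl(?string $dbSetting, ?string $detectedOrCached, ?string $iniSetting): string
{
    // The database setting wins; the order of the other two is an assumption.
    foreach (array($dbSetting, $detectedOrCached, $iniSetting) as $candidate) {
        if ($candidate !== null && $candidate !== '') {
            return $candidate;
        }
    }
    // Nothing configured: treat the install as living in the server root.
    return '';
}

// Database value overrides the others: prints /kt
echo resolveRootUrl('/kt', '/auto', '/legacy'), "\n";
// No database value and no config cache: falls back to the config.ini entry, prints /legacy
echo resolveRootUrl(null, '', '/legacy'), "\n";
```

Treating NULL and the empty string alike keeps an unset database row from being mistaken for a deliberate choice, which is the situation an earlier comment describes.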
Jarques added a comment - 14/Apr/09 06:03 AM
14 04 2009. Tested, Passed and Closed
KTDMS 3.6 (2009-04-13-000502) OSS

-Ubuntu 8.04

Result:
The issue seems to have been fixed. Indexing and search are OK.

JC

Dates

  • Created: 17/Jul/08 07:04 PM
  • Updated: 14/Apr/09 06:03 AM
  • Resolved: 02/Dec/08 07:01 PM