twaugh.redhat: 09:54 Aug 22, 2008
If there are very many completed jobs preserved in the history, an IPP Get-Jobs operation may tie up the scheduler in get_jobs() for several minutes at 100% CPU usage, preventing jobs from being serviced.
Perhaps it ought to be possible to configure CUPS to deny requests that do not include the 'limit' attribute whenever the result would exceed some configured number of jobs?
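For comparison, a client can already cap the reply on its own by sending the IPP 'limit' operation attribute. A minimal sketch using the CUPS API (the connection details and the 100-job cap are just illustrative):

    #include <cups/cups.h>
    #include <stdio.h>

    int main(void)
    {
      http_t          *http;      /* Connection to the scheduler */
      ipp_t           *request;   /* Get-Jobs request */
      ipp_t           *response;  /* Response from cupsd */
      ipp_attribute_t *attr;      /* Current attribute */

      /* Connect to the default server */
      http = httpConnectEncrypt(cupsServer(), ippPort(), cupsEncryption());
      if (!http)
        return (1);

      /* Ask for at most 100 completed jobs instead of the whole history */
      request = ippNewRequest(IPP_GET_JOBS);
      ippAddString(request, IPP_TAG_OPERATION, IPP_TAG_URI,
                   "printer-uri", NULL, "ipp://localhost/");
      ippAddString(request, IPP_TAG_OPERATION, IPP_TAG_KEYWORD,
                   "which-jobs", NULL, "completed");
      ippAddInteger(request, IPP_TAG_OPERATION, IPP_TAG_INTEGER,
                    "limit", 100);

      /* cupsDoRequest() sends the request and frees it for us */
      response = cupsDoRequest(http, request, "/");

      for (attr = ippFindAttribute(response, "job-id", IPP_TAG_INTEGER);
           attr;
           attr = ippFindNextAttribute(response, "job-id", IPP_TAG_INTEGER))
        printf("job-id %d\n", attr->values[0].integer);

      ippDelete(response);
      httpClose(http);
      return (0);
    }

Denying or clamping requests that omit the attribute on the server side would protect against clients that do not do this. |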
mike: 14:54 Aug 22, 2008
How many jobs?
Even 500 jobs shouldn't take very long to load... |
twaugh.redhat: 04:12 Aug 26, 2008
Tens of thousands. Even 5,000 jobs takes over 10s for 'lpstat -Wall -o', and that's without PreserveJobFiles being set.
If PreserveJobFiles *is* set, it seems that cupsd wants to auto-type every job file as well... |
mike: 08:03 Aug 26, 2008
OK, I'm pushing this to an RFE for a future release (*not* 1.4), since the default limit is 500 jobs.
It still shouldn't take that long to load the job history, but we'll just need to do some performance tuning for that use case. |
rojon: 12:17 Aug 26, 2008
Where does that limit come from? Looking at the current code, I can only find a hard limit in scheduler/ipp.c:6138, limit = 1000000; (cups-1.3.8-r7864), which is used when the requestor sets no limit. Neither the CGI programs nor lpstat impose any limit of their own (yet); only ipp.c does. Worse, even when a destination is given, the search is not restricted to that destination: an array of all jobs matching the "which-jobs" tag is built before any output is emitted. This heavily degrades the cupsd service, and it also consumes an enormous amount of memory, since an array of up to "MaxJobFiles" elements has to be created even when you are only looking for 10 matching entries or for a specific printer ...
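In rough, standalone C (an illustration of the pattern, not the actual scheduler code), the difference looks like this:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct { int id; char dest[32]; int completed; } job_t;

    /* Pattern described above: collect every match first, emit afterwards */
    static void report_all_then_limit(job_t *jobs, size_t num_jobs,
                                      const char *dest, int limit)
    {
      job_t  **matches = calloc(num_jobs, sizeof(job_t *)); /* O(num_jobs) memory */
      size_t count = 0, i;

      for (i = 0; i < num_jobs; i ++)                       /* O(num_jobs) work... */
        if (jobs[i].completed && !strcmp(jobs[i].dest, dest))
          matches[count ++] = jobs + i;

      for (i = 0; i < count && i < (size_t)limit; i ++)     /* ...to emit "limit" jobs */
        printf("job %d\n", matches[i]->id);

      free(matches);
    }

    /* Early-exit alternative: stop scanning once "limit" matches are emitted */
    static void report_with_limit(job_t *jobs, size_t num_jobs,
                                  const char *dest, int limit)
    {
      size_t i;
      int    emitted = 0;

      for (i = 0; i < num_jobs && emitted < limit; i ++)
        if (jobs[i].completed && !strcmp(jobs[i].dest, dest))
        {
          printf("job %d\n", jobs[i].id);
          emitted ++;
        }
    }

    int main(void)
    {
      job_t jobs[] = { { 1, "laser", 1 }, { 2, "laser", 1 }, { 3, "inkjet", 1 } };

      report_all_then_limit(jobs, 3, "laser", 1);
      report_with_limit(jobs, 3, "laser", 1);
      return (0);
    }

With tens of thousands of preserved jobs the first form is what hurts, no matter how small the requested result actually is. |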
mike: 14:42 Aug 26, 2008
The maximum size of the job history is controlled by the MaxJobs directive in cupsd.conf; the default value is 500.
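For example, in cupsd.conf (the values shown here are the defaults):

    # Keep metadata for at most 500 completed jobs
    MaxJobs 500

    # Keep the job history, but not the spooled document files
    PreserveJobHistory Yes
    PreserveJobFiles No

Sites that raise MaxJobs well beyond the default are the ones seeing the long Get-Jobs times described above. |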
rojon: 13:23 Aug 27, 2008
That effectively limits the maximum number of reported jobs; nonetheless, it does not impose a limit on IPP_GET_JOBS if you want to keep, say, 50,000 jobs in the job history. I think we could at least change lpstat to behave more efficiently and give cupsd a chance to serve jobs even while we retrieve a full list of completed jobs. Attached is a patch to lpstat.c that behaves much more considerately ... |
twaugh.redhat: 04:53 Sep 24, 2008
Attached is a patch to do the same for the web interface. |
mike: 08:46 Feb 06, 2009
Considering for CUPS 1.5, although I may rev the cupsGetJobs API to handle this for all of the current clients. |
mike: 18:02 May 11, 2011
Pushing to future release; we can't use the patches as-is and I'd like to do some different optimizations when the client is asking for already-cached data. |
twaugh.redhat: 08:05 Oct 17, 2011
Any update on this? This is essentially a denial of service using what should be a non-privileged operation. |
mike: 12:03 Oct 17, 2011
The work is queued up for 1.6 and will likely be addressed in the coming weeks. |
mike: 21:10 Feb 15, 2012
Pushing out a bit; I want to add support for the new first-index attribute, and then we'll apply this. Too late for 1.6... |
twaugh.redhat: 09:01 Mar 23, 2012
I'm not sure how first-index will help this. If the client doesn't supply that attribute, Get-Jobs will have the same performance problems it has always had.
Perhaps the timer support (from the Avahi work I did) could be used to break up long operations into fairer portions? |
mike: 09:08 Mar 23, 2012
Tim, the first-index attribute fixes issues with using first-job-id in the "window fetching" changes you have provided (due to priority and state, job IDs may not come back in numerical order...)
I am thinking about adding a default limit value of 500 (configurable, of course) so that clients that simply ask for the job history will not cause the attributes of all jobs to be loaded; this, combined with the latest changes to support time-based history preservation (STR #3143), should mitigate the issue until cupsd can better handle long history reports. Future versions of cupsd will be multi-threaded as well, so a single long-running operation won't impact other clients the way it does today.
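Once that is in place, a client could page through a long history in fixed-size windows. A rough sketch, assuming the scheduler honors the first-index and limit operation attributes (the window size here is arbitrary):

    #include <cups/cups.h>
    #include <stdio.h>

    #define WINDOW 100  /* illustrative window size */

    int main(void)
    {
      http_t *http = httpConnectEncrypt(cupsServer(), ippPort(), cupsEncryption());
      int    first = 1;                   /* 1-based index of the next window */

      if (!http)
        return (1);

      for (;;)
      {
        ipp_t           *request = ippNewRequest(IPP_GET_JOBS);
        ipp_t           *response;
        ipp_attribute_t *attr;
        int             count = 0;

        ippAddString(request, IPP_TAG_OPERATION, IPP_TAG_URI,
                     "printer-uri", NULL, "ipp://localhost/");
        ippAddString(request, IPP_TAG_OPERATION, IPP_TAG_KEYWORD,
                     "which-jobs", NULL, "completed");
        ippAddInteger(request, IPP_TAG_OPERATION, IPP_TAG_INTEGER,
                      "first-index", first);
        ippAddInteger(request, IPP_TAG_OPERATION, IPP_TAG_INTEGER,
                      "limit", WINDOW);

        response = cupsDoRequest(http, request, "/");

        for (attr = ippFindAttribute(response, "job-id", IPP_TAG_INTEGER);
             attr;
             attr = ippFindNextAttribute(response, "job-id", IPP_TAG_INTEGER))
        {
          printf("job-id %d\n", attr->values[0].integer);
          count ++;
        }

        ippDelete(response);

        if (count < WINDOW)               /* short (or empty) window: done */
          break;

        first += WINDOW;                  /* next window */
      }

      httpClose(http);
      return (0);
    }

Each request then touches at most one window's worth of jobs instead of the entire history. |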