Announcement

Collapse
No announcement yet.

QueueUtil::hasScheduledJobOnMinute() optimisation

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • QueueUtil::hasScheduledJobOnMinute() optimisation

    Hi,

    We have faced with an issue when a slow query executed against `job` table:


    Code:
    # Query_time: 12.193867 Lock_time: 0.000081 Rows_sent: 0 Rows_examined: 144349
    
    SELECT job.id AS `id` FROM `job` WHERE job.scheduled_job_id = 'xxxxxyyyyxxxxiiiiixxxx' AND job.execute_time LIKE '2021-05-21 02:14:%' AND job.deleted = 0 LIMIT 0, 1;

    There are 1350000 records in the table and the query takes about 10-14 seconds.

    After a quick research we found a drawback and found a way to optimise the code - see the screenshot attached.

    Click image for larger version

Name:	Screenshot 2021-05-21 at 10.41.18.png
Views:	251
Size:	364.8 KB
ID:	71017

    The proposed patch for version 6.1.7:


    Code:
    $ git diff
    
    diff --git a/application/Espo/Core/Utils/Cron/Job.php b/application/Espo/Core/Utils/Cron/Job.php
    
    index ee5b550d2..4900e3af5 100644
    
    --- a/application/Espo/Core/Utils/Cron/Job.php
    
    +++ b/application/Espo/Core/Utils/Cron/Job.php
    
    @@ -165,13 +165,17 @@ class Job
    
    public function hasScheduledJobOnMinute(string $scheduledJobId, string $time) : bool
    
    {[INDENT]$dateObj = new DateTime($time);[/INDENT]
     
    -         $timeWithoutSeconds = $dateObj->format('Y-m-d H:i:');
    
    +        $minuteBegin = $dateObj->format('Y-m-d H:i:00');
    
    +        $minuteEnd = $dateObj->format('Y-m-d H:i:59');
    
    
    
              $job = $this->getEntityManager()->getRepository('Job')
    
                    ->select(['id'])
    
                    ->where([
    
                        'scheduledJobId' => $scheduledJobId,
    
    -                   'executeTime*' => $timeWithoutSeconds . '%',
    
    +                  'AND' => [
    
    +                          'executeTime>=' => $minuteBegin,
    
    +                          'executeTime<=' => $minuteEnd
    
    +                 ]
    
                 ])
    
                 ->findOne();
    
    
    
    diff --git a/application/Espo/Resources/metadata/entityDefs/Job.json b/application/Espo/Resources/metadata/entityDefs/Job.json
    
    index 71d45b0bf..125b5ecbd 100644
    
    --- a/application/Espo/Resources/metadata/entityDefs/Job.json
    
    +++ b/application/Espo/Resources/metadata/entityDefs/Job.json
    
    @@ -107,6 +107,9 @@
    
         "textFilterFields": ["id", "name", "methodName", "serviceName", "scheduledJob"]
    
    },
    
    "indexes": {
    
    +     "executeTimeExclusive": {
    
    +     "columns": ["executeTime"]
    
    + },
    
    "executeTime": {
    
           "columns": ["status", "executeTime"]
    
    },

    A patch for `master` branch:

    Code:
    $ git diff
    
    diff --git a/application/Espo/Core/Job/QueueUtil.php b/application/Espo/Core/Job/QueueUtil.php
    
    index 49f97ca98..14b5c744a 100644
    
    --- a/application/Espo/Core/Job/QueueUtil.php
    
    +++ b/application/Espo/Core/Job/QueueUtil.php
    
    @@ -154,14 +154,18 @@ class QueueUtil
    
    {
    
         $dateObj = new DateTime($time);
    
    
    
    -    $timeWithoutSeconds = $dateObj->format('Y-m-d H:i:');
    
    +   $minuteBegin = $dateObj->format('Y-m-d H:i:00');
    
    +   $minuteEnd = $dateObj->format('Y-m-d H:i:59');
    
    
    
        $job = $this->entityManager
    
           ->getRDBRepository('Job')
    
           ->select(['id'])
    
           ->where([
    
               'scheduledJobId' => $scheduledJobId,
    
    -          'executeTime*' => $timeWithoutSeconds . '%',
    
    +         'AND' => [
    
    +               'executeTime>=' => $minuteBegin,
    
    +               'executeTime<=' => $minuteEnd
    
    +         ]
    
         ])
    
         ->findOne();
    
    
    
    diff --git a/application/Espo/Resources/metadata/entityDefs/Job.json b/application/Espo/Resources/metadata/entityDefs/Job.json
    
    index 71d45b0bf..125b5ecbd 100644
    
    --- a/application/Espo/Resources/metadata/entityDefs/Job.json
    
    +++ b/application/Espo/Resources/metadata/entityDefs/Job.json
    
    @@ -107,6 +107,9 @@
    
        "textFilterFields": ["id", "name", "methodName", "serviceName", "scheduledJob"]
    
    },
    
    "indexes": {
    
    +     "executeTimeExclusive": {
    
    +     "columns": ["executeTime"]
    
    + },
    
    "executeTime": {
    
           "columns": ["status", "executeTime"]
    
    },

    Hope to see this patch in the upcoming release (6.1.8?).

    Thank you!
    Attached Files

  • #2
    Cool vladimir, not sure if you aware of their Github and you can probably do a Pull Request on it: https://github.com/espocrm/espocrm/pulls

    By any chance are you using MariaDB?

    Comment


    • #3
      Hi,

      What database system do you use? I believe we didn't have performance issues here when we tested.

      Is it necessary to add a new index? It's problematic to add indexes in hotfix version and we always try to do it only in minor/major releases.
      Last edited by yuri; 05-24-2021, 06:03 AM.

      Comment


      • #4
        What if we add an additional condition:

        PHP Code:
        'status' => ['Pending''Ready''Running''Success'

        Then an existing combined index should work out I believe.
        Last edited by yuri; 05-24-2021, 06:16 AM.

        Comment


        • #5
          I tested with an additional condition on status (along with executeTIme change) and it applied a proper index. So we can avoid adding an additional index. Could you test on your database whether the query performs fast?

          Comment


          • #6
            The fix commit: https://github.com/espocrm/espocrm/c...f0e72db151139e


            Please let me know if it executes fast on your environment. Thanks.

            Comment


            • #7
              yuri Thank you, the patch looks good but I cannot test it on our environment as we use EspoCRM Cloudron App and it's on read-only filesystem.

              Could you `build` the query so I could test it?

              Comment


              • #8
                Code:
                SELECT id FROM job
                WHERE
                job.status IN ('Pending', 'Ready', 'Running', 'Success') AND
                job.execute_time >= 'BEFORE' AND
                job.execute_time <= 'AFTER' AND
                job.deleted = 0

                Comment


                • #9
                  Well done, it works quite fast, it takes about 0.1-0.5 seconds in our environment (cloudron app).
                  Hope the patch will appear in the nearest release.
                  Thank you

                  Comment

                  Working...
                  X