Duplicate checking

  • Duplicate checking

    Hi,

    Can you let me know which field the duplicate check uses when uploading leads? Is it possible to change it to a specific field?

  • #2
    Hi livewire,

    Perhaps the following thread will be helpful:


    • #3
      Originally posted by livewire:
      > Hi
      >
      > Can you let me know which field the duplicate check uses when uploading leads? Is it possible to change it to a specific field?

      Lead duplicates are checked by (first name AND last name) OR email address.
      Rabii
      Web Dev | Freelancer


      • #4
        Thank you.

        What I want is to check duplicates only by phone number. If I add a custom duplicate check function, will it replace the current duplicate checking method or add another clause?

        I have around 90k leads in my CRM. When I try to import a file with about 1,500 leads, it takes more than 2 days to complete in idle mode. If I disable the duplicate check, it uploads in a few minutes.

        So if I write a custom duplicate check that checks only by phone number, will there be an improvement?


        • #5
          I don't think this is a problem with the duplicate check class; 1,500 leads shouldn't take that long. Try running the import directly, without idle mode. I have uploaded 70K leads into the system in chunks of 10K per upload, and each chunk took only a few minutes.

          So a custom duplicate checker class wouldn't help in this case, especially if you want to check by phone number, which is a more expensive query than checking by name. If you decide to do it anyway, here is a class I have used in the past that combines checking by name OR email OR phone number. Feel free to customise it however you want.

          PHP Code:
          <?php

          namespace Espo\Custom\Classes\DuplicateWhereBuilders;

          use Espo\Core\ORM\Entity as CoreEntity;

          use Espo\Core\{
              Duplicate\WhereBuilder,
              Field\EmailAddressGroup,
              Field\PhoneNumberGroup,
          };

          use Espo\ORM\{
              Query\Part\Condition as Cond,
              Query\Part\WhereItem,
              Query\Part\Where\OrGroup,
              Entity,
          };

          /**
           * @implements WhereBuilder<CoreEntity>
           */
          class Contact implements WhereBuilder
          {
              public function build(Entity $entity): ?WhereItem
              {
                  assert($entity instanceof CoreEntity);

                  $orBuilder = OrGroup::createBuilder();

                  $toCheck = false;

                  if ($entity->get('firstName') || $entity->get('lastName')) {
                      $orBuilder->add(
                          Cond::and(
                              Cond::equal(
                                  Cond::column('firstName'),
                                  $entity->get('firstName')
                              ),
                              Cond::equal(
                                  Cond::column('lastName'),
                                  $entity->get('lastName')
                              )
                          )
                      );

                      $toCheck = true;
                  }

                  if (
                      ($entity->get('emailAddress') || $entity->get('emailAddressData')) &&
                      (
                          $entity->isNew() ||
                          $entity->isAttributeChanged('emailAddress') ||
                          $entity->isAttributeChanged('emailAddressData')
                      )
                  ) {
                      foreach ($this->getEmailAddressList($entity) as $emailAddress) {
                          $orBuilder->add(
                              Cond::equal(
                                  Cond::column('emailAddress'),
                                  $emailAddress
                              )
                          );

                          $toCheck = true;
                      }
                  }

                  if (
                      ($entity->get('phoneNumber') || $entity->get('phoneNumberData')) &&
                      (
                          $entity->isNew() ||
                          $entity->isAttributeChanged('phoneNumber') ||
                          $entity->isAttributeChanged('phoneNumberData')
                      )
                  ) {
                      foreach ($this->getPhoneNumberList($entity) as $phoneNumber) {
                          $orBuilder->add(
                              Cond::equal(
                                  Cond::column('phoneNumber'),
                                  $phoneNumber
                              )
                          );

                          $toCheck = true;
                      }
                  }

                  if (!$toCheck) {
                      return null;
                  }

                  return $orBuilder->build();
              }

              /**
               * @return string[]
               */
              private function getEmailAddressList(CoreEntity $entity): array
              {
                  if ($entity->get('emailAddressData')) {
                      /** @var EmailAddressGroup $eaGroup */
                      $eaGroup = $entity->getValueObject('emailAddress');

                      return $eaGroup->getAddressList();
                  }

                  if ($entity->get('emailAddress')) {
                      return [
                          $entity->get('emailAddress')
                      ];
                  }

                  return [];
              }

              private function getPhoneNumberList(CoreEntity $entity): array
              {
                  if ($entity->get('phoneNumberData')) {
                      /** @var PhoneNumberGroup $eaGroup */
                      $eaGroup = $entity->getValueObject('phoneNumber');

                      return $eaGroup->getNumberList();
                  }

                  if ($entity->get('phoneNumber')) {
                      return [
                          $entity->get('phoneNumber')
                      ];
                  }

                  return [];
              }
          }

          Hope this helps
          Rabii
          Web Dev | Freelancer
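
          For the phone-only check asked about in #4, the class above could be trimmed down roughly as follows. This is an untested sketch that reuses only the calls already shown in this thread. The class name `Lead` is hypothetical, and how the class gets registered (I would expect a `duplicateWhereBuilderClassName` entry in the custom `recordDefs` metadata for Lead) is an assumption to verify against the documentation for your version. It is also an assumption that a registered custom builder is used in place of the default one rather than adding a clause to it.

          ```php
          <?php

          namespace Espo\Custom\Classes\DuplicateWhereBuilders;

          use Espo\Core\ORM\Entity as CoreEntity;
          use Espo\Core\Duplicate\WhereBuilder;
          use Espo\Core\Field\PhoneNumberGroup;
          use Espo\ORM\Entity;
          use Espo\ORM\Query\Part\Condition as Cond;
          use Espo\ORM\Query\Part\WhereItem;
          use Espo\ORM\Query\Part\Where\OrGroup;

          /**
           * Hypothetical phone-only duplicate where-builder (the class name is an assumption).
           *
           * @implements WhereBuilder<CoreEntity>
           */
          class Lead implements WhereBuilder
          {
              public function build(Entity $entity): ?WhereItem
              {
                  assert($entity instanceof CoreEntity);

                  // Same guard as in the combined class: only look at phone numbers
                  // for new records or when the phone attributes were changed.
                  $isChanged =
                      $entity->isNew() ||
                      $entity->isAttributeChanged('phoneNumber') ||
                      $entity->isAttributeChanged('phoneNumberData');

                  $phoneNumberList = $isChanged ? $this->getPhoneNumberList($entity) : [];

                  // No phone number to compare: return null, as the combined class does.
                  if ($phoneNumberList === []) {
                      return null;
                  }

                  $orBuilder = OrGroup::createBuilder();

                  // One equality condition per phone number, joined with OR.
                  foreach ($phoneNumberList as $phoneNumber) {
                      $orBuilder->add(
                          Cond::equal(Cond::column('phoneNumber'), $phoneNumber)
                      );
                  }

                  return $orBuilder->build();
              }

              /**
               * @return string[]
               */
              private function getPhoneNumberList(CoreEntity $entity): array
              {
                  if ($entity->get('phoneNumberData')) {
                      /** @var PhoneNumberGroup $phoneGroup */
                      $phoneGroup = $entity->getValueObject('phoneNumber');

                      return $phoneGroup->getNumberList();
                  }

                  if ($entity->get('phoneNumber')) {
                      return [$entity->get('phoneNumber')];
                  }

                  return [];
              }
          }
          ```

          Dropping the name and email clauses only removes work from the query; whether it actually speeds up the import depends on where the time is being spent, which is still unclear at this point in the thread.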


          • #6
            Thanks again. When I run it directly, it adds about 30 leads and then says "in progress" forever; it doesn't add any more. It's extremely slow when I enable the duplicate check.

            I just noticed that when I skip the email fields, it's much faster, but I need the emails.


            • #7
              Which version are you using?
              Rabii
              Web Dev | Freelancer


              • #8
                > it takes more than 2 days to complete it in idle mode

                How many leads do you have? It should not take that long.


                • #9
                  Originally posted by rabii:
                  > Which version are you using?

                  7.0.7


                  • #10
                    Originally posted by yuri:
                    > > it takes more than 2 days to complete it in idle mode
                    >
                    > How many leads do you have? It should not take that long.

                    The overall system has around 97,000 leads.


                    • #11
                      I uploaded a file with 2,000 leads. Over 24 hours have passed and only 1,221 have been added. I'm so confused about why it takes so much time with duplicate checking.


                      • #12
                        97000 is not a big number. How many email addresses and phone numbers do you have? You can check at Administration > Email Address / Phone Numbers.


                        • #13
                          Originally posted by yuri:
                          > 97000 is not a big number. How many email addresses and phone numbers do you have? You can check at Administration > Email Address / Phone Numbers.

                          Yeah, that's why I'm confused. It's super slow.

                          Phone numbers: 99,302
                          Email addresses: 78,874

                          Is there any specific setting that I need to check?
                          Last edited by livewire; 05-12-2023, 08:22 AM.


                          • rabii commented:
                            Can you check whether the cache is enabled (Administration > Settings > Use Cache)?

                          • livewire commented:
                            It's enabled.

                        • #14
                          Does it take long if you import leads without email addresses, so that only the name is checked for duplicates?


                          • #15
                            How much time does it take to run this query:

                            Code:
                            SELECT SQL_NO_CACHE `id` FROM `lead` WHERE `first_name` = 'some name' AND `last_name` = 'some name' AND deleted = 0
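
                            If you have no SQL client at hand, one way to time the query above is a small standalone PDO script. This is only a sketch: the DSN, user name, and password are placeholders (assumptions) to replace with the values from your own installation, and the EXPLAIN step is an extra that is not part of the request above.

                            ```php
                            <?php

                            // Placeholder connection settings (assumptions): replace with your own.
                            $dsn = 'mysql:host=localhost;dbname=espocrm;charset=utf8mb4';
                            $user = 'espocrm_user';
                            $password = 'secret';

                            $pdo = new PDO($dsn, $user, $password, [
                                PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
                            ]);

                            // The query from the post above, with the names passed as parameters.
                            $sql = 'SELECT SQL_NO_CACHE `id` FROM `lead` ' .
                                'WHERE `first_name` = :firstName AND `last_name` = :lastName AND deleted = 0';

                            $params = ['firstName' => 'some name', 'lastName' => 'some name'];

                            // Measure prepare + execute + fetch as one unit.
                            $start = microtime(true);
                            $stmt = $pdo->prepare($sql);
                            $stmt->execute($params);
                            $rows = $stmt->fetchAll(PDO::FETCH_COLUMN);
                            $elapsed = microtime(true) - $start;

                            printf("%d matching rows in %.3f s\n", count($rows), $elapsed);

                            // Print the execution plan to see whether an index is used.
                            $explain = $pdo->prepare('EXPLAIN ' . $sql);
                            $explain->execute($params);
                            print_r($explain->fetchAll(PDO::FETCH_ASSOC));
                            ```

                            Use a first and last name that actually exist in your data, so the timing reflects a realistic lookup.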
