Home > Apex, salesforce > Getting Crafty with Queries

Getting Crafty with Queries

May 12th, 2011

I’ve recently found myself running into several pain points when it came to extracting data from salesforce.com using SOQL. Since salesforce.com/force.com runs on a shared infrastructure they don’t allow all the bells and whistles of traditional SQL as they need to protect their resources and prevent some crazy guy like me from bringing down then entire system with some redonkulous query. You may be thinking by now that I am referring to the specific governor limits salesforce.com imposes on your queries but I am actually talking about fairly simple extractions that don’t always have the most straight forward solution.

Let’s say that you have a table of data like this:

Now let’s also say the requirement is to query items from this table but only return items based on a unique Order_Number__c. If there are duplicate Order Numbers in the table return the oldest record based on create date. The records we need to return would be the ones highlighted green.

What some of you are probably thinking is this is a totally unrealistic example because in any well designed system a field like Order Number should be unique. Well guess what? Not everyone gets to live this this happy database fairly land where duplicates don’t exist. If you you, whoop-di-do, hooray for you, click here for your prize. If you don’t live in fairy database land this example could be a totally realistic problem.

So how to solve it? Out of the box SOQL doesn’t have a solution for this problem (if it does I just wasted some time writing this) so we need to get crafty! You could maybe use aggregate queries to group by Order Number and then determine the oldest record. Or maybe query records, add them to unique Lists based on Order Number, sort these, and then return the oldest record. These are all good…well…they would work, not so sure about the “good” part. Below is the way I approached this problem and I’m sure there may be better ways to do this but I thought it was pretty slick.

We know one of the requirement is to use the oldest records if there is a duplicate. In our query we can sort by Created Date ascending so the oldest records a processed first. Before running this query we will also create a Map where the key is a Integer representing the Order Number and the value is a salesforce object, in this example Opportunity. Data returned would be processed in this order top to bottom:

As we process the returned results we will check to see if the Map contains a value for the given Order Number on the record we are currently processing. If it does not, it means this is the oldest record for this Order Number and we add it to the map. If the map does return a value for any given Order Number it means there was already an older record found and this record should not be used. In the end with have a nice clean map with the keys being the Order Number and the values being the oldest record with this Order Number. If we then need to add all these records to a list we can do that with a simple values() method called on the Map. Below is the code for this, it assumes the object being queried is an Opportunity but this logic could be applied to any object/table.

//Create the map Order => Opportunity
Map<Integer,Opportunity> oppMap = new Map<Integer,Opportunity>();
 
//Query the records and sort by CreatedDate ascending so oldest records are returned first
for(Opportunity opp : [select Id, Name, Order_Number__c, CreatedDate from Opportunity order by CreatedDate asc]){
 
    //Check if map contains Opportunity for given Order Number
    if(oppMap.get(opp.Order_Number__c) == null){
	//If null add opp to map, it is the oldest
	oppMap.put(opp.Order_Number__c,opp);
    }
 
    //Else do nothing, ignore the record as we already found an older one.
}
 
//If we need the opps in a list just do this	
List<Opportunity> opps = oppMap.values();

The one down side to this approach is that you are querying duplicate records that will never be used. Ideally you would only query the exact records needed without the duplicates even returned at all. I know of no way to do this but if you do please leave a comment below. Even with this added overhead of returned records that are never used I like this approach over the other described earlier because it is super simple and I am a big fan of simple.

UPDATE:
Based on some great comments below Greg has pointed that if you sort in desc order you don’t need the if() condition because the oldest records will be processed last and when you add them to the map it will replace any value that is currently there for the given key. The for loop could be simplified to this:

//Query the records and sort by CreatedDate descend so oldest records are processed last
for(Opportunity opp : [select Id, Name, Order_Number__c, CreatedDate from Opportunity order by CreatedDate desc]){
 
    //Add opp to map, it will replace any existing value in the map with the same key
    oppMap.put(opp.Order_Number__c,opp);
 
}
Categories: Apex, salesforce
  1. May 12th, 2011 at 09:06 | #1

    Nice dude, I like it. The using maps to track previous existence is a pretty clever idea. Haven’t run across this problem yet, but nice to know I got a solution for if/when I do. I wonder if they will ever get some more keywords like distinct supported in SOQL.

  2. Greg Grinberg
    May 12th, 2011 at 09:13 | #2

    This is an excellent example of a really common sf pattern. Minor nitpick: If you order by CreatedDate desc you can just put the opportunity into the map without any checking as older opportunities will replace newer ones.

  3. May 12th, 2011 at 09:56 | #3

    Ditto to Greg’s comment – there are many ways to use ORDER BY to reduce the need for conditionals inside of your loops.

  4. Kevin Bromer
    May 12th, 2011 at 10:08 | #4

    Nice one! (and thanks for the rickroll)

    There’s a similar pattern I use (and shamelessly stole from people smarter than me) to sort soql results into a map with a key-list pairing instead, which is great for extracting collections of records based on a random key (say, lists of contacts grouped on a common text fieldvalue) It looks like this:

    Map<String,List<Contact>> newMap = new Map<String,List<Contact>>();
    for (Contact c : [select id, fieldvalue from Contact]){
      if (newMap.containskey(c.fieldvalue)){
         List<Contact> clist = newmap.get(c.fieldvalue)
         clist.add(c);
         newmap.put(c.fieldvalue, clist);   
      }else{
         List<Contact> newlist = new list<contact>();
         newlist.add(c);
         newmap.put(c.fieldvalue, newlist);
      }
    }
  5. May 12th, 2011 at 10:19 | #5

    @Greg Grinberg

    You are absolutely right! I completely overlooked this and have updated the main post.

  6. Ralph Callaway
    May 20th, 2011 at 18:03 | #6

    @Kevin Bromer
    Great pattern. I’ve got a variant that is pretty similar:

    for (Contact c : [select id, fieldvalue from Contact]){
      if(!newMap.containskey(c.fieldvalue))
        newMap.put(c.fieldValue, new List();
      newMap.get(c.fieldValue).add(c);
    }
    
  7. March 7th, 2013 at 06:52 | #7

    To go through this treatment you need to open the pores because the root
    cause of acne is clogged pores. This means that
    you can save a lot when you buy Clear Skin Max as the monthly price decreases massively.
    If you do have a chronic problem with acne and blemishes then you
    may want to research each one.