AddThis Feed Button

Recent Items

Tags: Data Loader, SQL

Oh boy, I love debugging!

And today I really had to debug DataLoader!

I do a lot of stuff with DataLoader — see my previous blogs on the topic. But I’ve always uploaded to Salesforce from a PostgreSQL database — never extracted.

DataLoader allows a job to be configured where a SOQL query is run, the data is extracted from Salesforce and the results are inserted into a database. The documentation is woeful on this whole topic, but fortunately DataLoader comes with sample process-conf.xml and database-conf.xml files that contain some examples.

The basic method is:

  • process-conf.xml defines the job and includes the SOQL in a extractionSOQL parameter
  • An SDL file defines the field mapping between Salesforce and the output
  • database-conf.xml defines the SQL statement to use, plus the parameter types

First hint: The Field names in the SDL file are case-sensitive. Use “ID” instead of “Id” and the field won’t come across.

Second hint: All fields come across in quotes. This totally destroys any attempt to load numbers!

Here’s part of the database-conf.xml that I used:

<bean id="extractMegaphoneQuery"
      class="com.salesforce.dataloader.dao.database.SqlConfig" singleton="true">
    <property name="sqlString">
        <value>
            INSERT INTO renewals_megaphone (
               period, megaphone_date, opportunity_name, reason_lost, owner, expiry_date, product, years_owned)
            VALUES (@period@, @megaphone_date@, @opportunity_name@, @reason_lost@, @owner@,
                    @expiry_date@, @product@, @years_owned@)
        </value>
    </property>
    <property name="sqlParams">
        <map>
            <entry key="period"           value="bigint"/>
            <entry key="megaphone_date"   value="java.sql.Date"/>
            <entry key="opportunity_name" value="java.lang.String"/>
            <entry key="reason_lost"      value="java.lang.String"/>
            <entry key="owner"            value="java.lang.String"/>
            <entry key="expiry_date"      value="java.sql.Date"/>
            <entry key="product"          value="java.lang.String"/>
            <entry key="years_owned"      value="integer"/>
        </map>
    </property>
</bean>

You’ll notice I have two fields that are numbers — period (bigint) and years_owned (integer). When trying to load these fields into my database, I kept getting the message:

Sql error: ERROR: column "period" is of type bigint but expression is of type character varying.
org.postgresql.util.PSQLException: ERROR: column "period" is of type bigint but expression is of type character varying

There was no way that I could convince DataLoader that the field was numeric — I tried lots of java.lang.Integer type entries in the sqlParams, but didn’t help.

I then changed the output to csvWrite instead of databaseWrite so that I could see the output, and it came through like this:

"PERIOD","MEGAPHONE_DATE","OPPORTUNITY_NAME","REASON_LOST","OWNER","EXPIRY_DATE","PRODUCT","YEARS_OWNED"
"83823.0","2010-06-08","Joe Smith","Invalid","John","2010-03-14","a0624004000qRtkgA2","3.0"

You’ll notice that the first and last columns are meant to be numbers, but they are coming through in quotation marks. That’s what was causing the database load to fail!

So, I updated my SQL to convert the strings to numbers:

    <property name="sqlString">
        <value>
            INSERT INTO renewals_megaphone (
               period, megaphone_date, opportunity_name, reason_lost, owner, expiry_date, product, years_owned)
            VALUES (@period@::numeric, @megaphone_date@, @opportunity_name@, @reason_lost@, @owner@, @expiry_date@, @product@, @years_owned@::numeric)
        </value>
    </property>

This successfully cast the strings into numbers and they finally got stored in the database (which automatically ignored the decimal places when converting to bigint and integer).

I hope the blog post saves other people similar problems!

The Bottom Line

  • DataLoader can extract data and insert it into a database
  • Numbers are quoted on export and cause database errors
  • I type-cast the string into numeric to enable the database to load it

Okay, this post isn’t quite about Salesforce.com today, but I think it’s a tidbit worth publishing…

I learned something about SQL today!

I love SQL — It’s the only programming language that I learnt back in my University days that is still useful. (Remember Pascal? Ha!) I especially love PostgreSQL. I did some neat procedural stuff with it earlier this year that was especially pleasing.

Anyway, today we were burdened with an overly complex query doing funky stuff with sales data. We were trying to join together two sets of data using the good old UNION statement. Result set A had 20 rows and result set B had 24 rows. When combined with UNION, the result had… not 44 rows, but 24 rows! This was a real head-scratcher until I found this in the PostgreSQL manual:

UNION effectively appends the result of query2 to the result of query1 (although there is no guarantee that this is the order in which the rows are actually returned). Furthermore, it eliminates duplicate rows from its result, in the same way as DISTINCT, unless UNION ALL is used.

The kicker is that last sentence — duplicate rows are removed unless UNION ALL is used. Sure enough, we stuck in that ALL word and everything was fine.

The Bottom Line

  • You learn something every year
  • PostgreSQL rocks!