Working with "Threads"
For performance reasons, processing of bulk data can be split among a maximum of 16 worker threads.
If multiple <fields> blocks are used, they are assigned sequentially to the worker threads: each worker thread that has finished processing its <fields> block obtains the next one until no blocks remain. If only one <fields> block is used, its records are split evenly among the threads, with the last thread also processing the remainder if the record count is not evenly divisible.
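The single-block case can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the actual implementation: the function name `split_records` is hypothetical, and the exact rounding (integer division, with the last thread taking everything that is left) is an assumption.

```python
def split_records(record_count, threads):
    """Return one (start, end) index range per worker thread (end is exclusive)."""
    # The text states a maximum of 16 worker threads; clamping here is an assumption.
    threads = max(1, min(threads, 16))
    base = record_count // threads  # assumed: even share per thread, rounded down
    ranges = []
    start = 0
    for i in range(threads):
        end = start + base
        if i == threads - 1:
            end = record_count  # the last thread also processes the remainder
        ranges.append((start, end))
        start = end
    return ranges


print(split_records(12, 5))
# [(0, 2), (2, 4), (4, 6), (6, 8), (8, 12)]
```

With 12 records and 5 threads, the first four threads receive two records each and the last thread receives the remaining four. The chunk boundaries depend only on the record count and the thread count, not on how records relate to each other.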
It is necessary to group dependent records together in blocks to ensure that parent records (or in general, linked records) are processed before their children (or linked-to records).
The following could occur in a third-party interface where records are always generated individually:
<import threads="5">
  <fields>
    <!-- ... -->
    <Company>
      <!-- ... -->
    </Company>
    <Person>
      <link>
        <!-- to preceding Company -->
      </link>
      <!-- ... -->
    </Person>
    <!-- ... -->
  </fields>
</import>
If this were the only <fields> block, the split between threads could fall between the two records (unless the request is carefully tuned), so the Person record might be processed before the Company record. To ensure proper ordering, either more than one <fields> block has to be used (first example), or the records themselves have to be generated as dependent on each other by nesting them (second example):
<import threads="5">
  <fields>
    <!-- ... -->
    <Company>
      <!-- ... -->
    </Company>
    <Person>
      <link>
        <!-- to preceding Company -->
      </link>
      <!-- ... -->
    </Person>
    <!-- ... -->
  </fields>
  <fields>
    <!-- ... -->
  </fields>
  <!-- ... -->
</import>
<import threads="5">
  <fields>
    <!-- ... -->
    <Company>
      <!-- ... -->
      <Person>
        <!-- implicit link to Company, no explicit <link> needed -->
        <!-- ... -->
      </Person>
    </Company>
    <!-- ... -->
  </fields>
</import>
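The ordering risk for sibling records in a single <fields> block can be checked with a small Python sketch. It assumes the even split with the remainder going to the last thread, as described in this section. The helper names `thread_of` and `may_be_reordered` are hypothetical, and the record positions are made up for illustration.

```python
def thread_of(index, record_count, threads):
    """Index of the worker thread that would handle the record at `index`."""
    base = record_count // threads  # assumed: records per thread, rounded down
    if base == 0:
        return threads - 1  # fewer records than threads: all go to the last thread
    return min(index // base, threads - 1)  # overflow indices belong to the last thread


def may_be_reordered(parent_index, child_index, record_count, threads):
    """True if parent and child land in different threads, so their order is not guaranteed."""
    return (thread_of(parent_index, record_count, threads)
            != thread_of(child_index, record_count, threads))


# 12 sibling records in one <fields> block, threads="5" as in the examples above
print(may_be_reordered(3, 4, 12, 5))  # True: the split falls between Company and Person
print(may_be_reordered(4, 5, 12, 5))  # False: both records are in the same chunk
```

Whether a Company and the Person that follows it stay together depends purely on where they happen to sit in the block, which is why separate <fields> blocks or nested records are the safer choice.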