File endpoint tips

Producer/consumer concurrency

The default locking strategy creates a lock file, which consumes more IO than necessary. To reduce IO usage, do the following:

  • on the producer side (to), add the tempPrefix=.writting/ option
  • on the consumer side (from), add the readLock=none option

With this approach, files are first created in the temporary subdirectory given by tempPrefix (.writting above), and once they are completely written they are moved to the destination directory.

When possible, prefer this approach over using doneFileName.
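
As an illustration, the two options could be combined as in the sketch below. The direct:writeFile and direct:processFile endpoints are placeholders that are not part of the original setup; only tempPrefix and readLock come from the tips above.

route.xml
<route>
    <!-- Producer: direct:writeFile is a hypothetical source of messages to write -->
    <from uri="direct:writeFile"/>
    <!-- The file is written under .writting/ first, then moved into the directory -->
    <to uri="file://directory?tempPrefix=.writting/"/>
</route>

<route>
    <!-- Consumer: no read lock, relying on the producer's temporary directory instead -->
    <from uri="file://directory?readLock=none"/>
    <!-- direct:processFile is a hypothetical endpoint standing in for the real processing -->
    <to uri="direct:processFile"/>
</route>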

Consumer performance tips

Avoid using the recursive option when it is not needed.

Producer performance tips

The forceWrites=false option can be used on a producer to improve performance on a machine with a slow disk system. Be careful: with this option, a written file can be lost if the machine crashes after the file is written.
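
A minimal sketch of such a producer follows. The direct:writeFile endpoint and the directory name are placeholders chosen for illustration; only the forceWrites option comes from the tip above.

route.xml
<route>
    <!-- direct:writeFile is a hypothetical source of messages to write -->
    <from uri="direct:writeFile"/>
    <!-- forceWrites=false skips forcing each write to disk: faster, but the file can be lost on a crash -->
    <to uri="file://directory?forceWrites=false"/>
</route>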

Gotcha with SEDA

To process files in parallel with a given number of threads, one can use a SEDA queue, as in the following routes:

route.xml
<route>
    <from uri="file://directory?delete=true&amp;moveFailed=.error&amp;readLock=none"/>
    <to uri="seda:fileQueue?size=1024&amp;blockWhenFull=true" />
</route>
 
<route>
    <from uri="seda:fileQueue?size=1024&amp;blockWhenFull=true&amp;concurrentConsumers=4" />
    <!-- file processing steps go here -->
</route>

However, in that case a Camel issue causes exceptions to be logged even though no error actually occurs and every file is processed correctly.

To avoid these spurious exceptions, the preMove option must be used:

route.xml
<route>
    <from uri="file://directory?delete=true&amp;moveFailed=.error&amp;readLock=none&amp;preMove=.processing"/>
    <to uri="seda:fileQueue?size=1024&amp;blockWhenFull=true" />
</route>
 
<route>
    <from uri="seda:fileQueue?size=1024&amp;blockWhenFull=true&amp;concurrentConsumers=4" />
    <!-- file processing steps go here -->
</route>

Related Links