--- a
+++ b/website/faqsandhowtos/HandleCustomField.txt
@@ -0,0 +1,69 @@
+== Generating a custom field and using it to sort results
+
+We are going to show how to generate a custom field from a Recoll filter,
+and use it for sorting results. The example chosen comes from an actual
+user request: sorting results by PDF page count.
+
+The details here are obsolete, as the +pdf+ input handler is now a quite
+different Python program, but the general idea is still relevant.
+
+The +pdfinfo+ command (from the xpdf or poppler tools) displays the page
+count of a PDF file.
+
+We first modify a copy of the rclpdf filter
+('/usr/[local/]share/recoll/filters/rclpdf') to compute the PDF page count
+and output the value as an HTML meta field. This is a not very interesting
+bit of shell/awk magic. Another approach would be to just rewrite the
+rclpdf filter in your favorite scripting language (e.g. Perl, Python...), as
+all it does is execute pdftotext and pdfinfo and output HTML, nothing
+complicated. Here follows the rclpdf modification as a pseudo patch:
+
+----
+# compute the page count and zero-pad it so that it sorts correctly as a string
++set -- `pdfinfo "$infile" | grep '^Pages:'`
++pages=`printf "%04d" $2`
+[skip...]
+# Pass the page count value to awk
+-awk 'BEGIN'\
++awk -v Pages="$pages" 'BEGIN'\
+[skip...]
+# Inside the awk program startup section: compute the "meta" field line
++  pagemeta = "<meta name=\"pdfpages\" content=\"" Pages "\">\n"
+[skip...]
+# Then print it as part of the header:
++    $0 =  part1 charsetmeta pagemeta part2
+[skip...]
+----
+
+You can execute your own version of rclpdf by modifying '~/.recoll/mimeconf':
+
+----
+[index]
+application/pdf = exec /path/to/my/own/rclpdf
+----
+
+At this point, recollindex would receive and extract a +pdfpages+ field,
+but it would not know what to do with it. We are going to tell it to store
+the value inside the document data record so that it can be displayed in
+the results, and sorted on. For this we modify the '~/.recoll/fields' file: 
+
+----
+[stored]
+pdfpages=
+----
+
+That's it! After reindexing, you can display +pdfpages+ inside the result
+list (add a +%(pdfpages)+ value to the paragraph format) or inside the
+result table (right-click the table header), and sort the results on page
+count (click the column header).
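+
+As a sketch of the paragraph format change, a result list entry showing the
+page count might look like the following. The markup is only an
+illustration, not the Recoll default: +%T+ (title) and +%A+ (abstract) are
+standard substitutions, and +%(pdfpages)+ is the custom field stored above.
+Because the filter zero-pads the count, the value is displayed in padded
+form (for example +0012+).
+
+----
+<b>%T</b> (%(pdfpages) pages)<br>
+%A
+----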
+
+Note that +pdfpages+ has not been defined as searchable (this would not make
+much sense). To make it searchable, you would have to define a prefix and
+add it to the [prefixes] section of the 'fields' file:
+
+----
+[prefixes]
+pdfpages = XYPDFP
+----
+
+Have a look at the comments inside the 'fields' file for more information.