|
a/src/doc/user/usermanual.xml |
|
b/src/doc/user/usermanual.xml |
|
... |
|
... |
4352 |
|
4352 |
|
4353 |
<sect3 id="RCL.PROGRAM.PYTHON.INTRO">
|
4353 |
<sect3 id="RCL.PROGRAM.PYTHON.INTRO">
|
4354 |
<title>Introduction</title>
|
4354 |
<title>Introduction</title>
|
4355 |
|
4355 |
|
4356 |
<para>&RCL; versions after 1.11 define a Python programming
|
4356 |
<para>&RCL; versions after 1.11 define a Python programming
|
4357 |
interface, both for searching and indexing. The indexing
|
4357 |
interface, both for searching and indexing.</para>
|
4358 |
portion has seen little use, but the searching one is used
|
|
|
4359 |
in the Recoll Ubuntu Unity Lens and Recoll Web UI.</para>
|
|
|
4360 |
|
4358 |
|
|
|
4359 |
<para>The search interface is used in the Recoll Ubuntu Unity Lens
|
|
|
4360 |
and Recoll WebUI.</para>
|
|
|
4361 |
|
|
|
4362 |
<para>The indexing section of the API has seen little use, and is
|
|
|
4363 |
more a proof of concept. In truth it is waiting for its killer
|
|
|
4364 |
app...</para>
|
|
|
4365 |
|
4361 |
<para>The API is inspired by the Python database API
|
4366 |
<para>The search API is modeled on the Python database API
|
4362 |
specification. There were two major changes in recent &RCL;
|
4367 |
specification. There were two major changes across &RCL; versions:
|
4363 |
versions:
|
|
|
4364 |
<itemizedlist>
|
4368 |
<itemizedlist>
|
4365 |
<listitem>The basis for the &RCL; API changed from Python
|
4369 |
<listitem><para>The basis for the &RCL; API changed from Python
|
4366 |
database API version 1.0 (&RCL; versions up to 1.18.1),
|
4370 |
database API version 1.0 (&RCL; versions up to 1.18.1),
|
4367 |
to version 2.0 (&RCL; 1.18.2 and later).</listitem>
|
4371 |
to version 2.0 (&RCL; 1.18.2 and later).</para></listitem>
|
4368 |
<listitem>The <literal>recoll</literal> module became a
|
4372 |
<listitem><para>The <literal>recoll</literal> module became a
|
4369 |
package (with an internal <literal>recoll</literal>
|
4373 |
package (with an internal <literal>recoll</literal>
|
4370 |
module) as of &RCL; version 1.19, in order to add more
|
4374 |
module) as of &RCL; version 1.19, in order to add more
|
4371 |
functions. For existing code, this only changes the way
|
4375 |
functions. For existing code, this only changes the way
|
4372 |
the interface must be imported.</listitem>
|
4376 |
the interface must be imported.</para></listitem>
|
4373 |
</itemizedlist>
|
4377 |
</itemizedlist>
|
4374 |
</para>
|
4378 |
</para>
|
4375 |
|
4379 |
|
4376 |
<para>We will mostly describe the new API and package
|
4380 |
<para>We will mostly describe the new API and package
|
4377 |
structure here. A paragraph at the end of this section will
|
4381 |
structure here. A paragraph at the end of this section will
|
4378 |
explain a few differences and ways to write code
|
4382 |
explain a few differences and ways to write code
|
|
... |
|
... |
4390 |
<userinput>python setup.py build</userinput>
|
4394 |
<userinput>python setup.py build</userinput>
|
4391 |
<userinput>python setup.py install</userinput>
|
4395 |
<userinput>python setup.py install</userinput>
|
4392 |
</screen>
|
4396 |
</screen>
|
4393 |
</para>
|
4397 |
</para>
|
4394 |
|
4398 |
|
|
|
4399 |
<para>As of &RCL; 1.19, the module can be compiled for
|
|
|
4400 |
Python3.</para>
|
|
|
4401 |
|
4395 |
<para>The normal &RCL; installer installs the Python
|
4402 |
<para>The normal &RCL; installer installs the Python2
|
4396 |
API along with the main code.</para>
|
4403 |
API along with the main code. The Python3 version must be
|
|
|
4404 |
explicitly built and installed.</para>
|
4397 |
|
4405 |
|
4398 |
<para>When installing from a repository, and depending on the
|
4406 |
<para>When installing from a repository, and depending on the
|
4399 |
distribution, the Python API can sometimes be found in a
|
4407 |
distribution, the Python API can sometimes be found in a
|
4400 |
separate package.</para>
|
4408 |
separate package.</para>
|
4401 |
|
4409 |
|
|
|
4410 |
<para>The following small sample will run a query and list
|
|
|
4411 |
the title and URL for each of the results. It works with &RCL;
|
|
|
4412 |
1.19 and later. The <filename>python/samples</filename> source directory
|
|
|
4413 |
contains several examples of Python programming with &RCL;,
|
|
|
4414 |
exercising the extension more completely, and especially its data
|
|
|
4415 |
extraction features.</para>
|
|
|
4416 |
<programlisting>
|
|
|
4417 |
from recoll import recoll
|
|
|
4418 |
|
|
|
4419 |
db = recoll.connect()
|
|
|
4420 |
query = db.query()
|
|
|
4421 |
nres = query.execute("some query")
|
|
|
4422 |
results = query.fetchmany(20)
|
|
|
4423 |
for doc in results:
|
|
|
4424 |
    print(doc.url, doc.title)
|
|
|
4425 |
</programlisting>
|
4402 |
</sect3>
|
4426 |
</sect3>
|
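The import change described in the introduction (plain module before 1.19, package with an inner module from 1.19 on) can be handled with a fallback import. The following is a minimal sketch, not part of the patch: `load_recoll` and `list_results` are hypothetical helper names, and the only &RCL; calls used are the ones shown in the sample above (`connect()`, `query()`, `execute()`, `fetchmany()`, `doc.url`, `doc.title`).

```python
def load_recoll():
    """Return the recoll search module, or None if it is not installed.

    Hypothetical helper: tries the package layout first, then the
    older flat-module layout.
    """
    try:
        # Recoll 1.19 and later: 'recoll' is a package with an inner module.
        from recoll import recoll
        return recoll
    except ImportError:
        pass
    try:
        # Older versions: 'recoll' is a plain module.
        import recoll
        return recoll
    except ImportError:
        return None


def list_results(db, query_string, limit=20):
    """Run a query on an open Db object and return (url, title) pairs.

    Hypothetical helper wrapping the calls from the manual's sample.
    """
    query = db.query()
    query.execute(query_string)
    return [(doc.url, doc.title) for doc in query.fetchmany(limit)]


if __name__ == "__main__":
    recoll = load_recoll()
    if recoll is None:
        print("recoll Python module not found")
    else:
        db = recoll.connect()
        for url, title in list_results(db, "some query"):
            print(url, title)
```

Keeping the import logic in one function means the rest of a script does not need to know which layout is installed. Since the Python database API basis also changed between versions, code that must run on releases older than 1.18.2 may need more than the import fallback.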
4403 |
|
4427 |
|
4404 |
<sect3 id="RCL.PROGRAM.PYTHON.PACKAGE">
|
4428 |
<sect3 id="RCL.PROGRAM.PYTHON.PACKAGE">
|
4405 |
<title>Recoll package</title>
|
4429 |
<title>Recoll package</title>
|
4406 |
|
4430 |
|
|
... |
|
... |
4458 |
|
4482 |
|
4459 |
<para>A Db object is created by
|
4483 |
<para>A Db object is created by
|
4460 |
a <literal>connect()</literal> call and holds a
|
4484 |
a <literal>connect()</literal> call and holds a
|
4461 |
connection to a Recoll index.</para>
|
4485 |
connection to a Recoll index.</para>
|
4462 |
<variablelist>
|
4486 |
<variablelist>
|
4463 |
<title>Methods</title>
|
|
|
4464 |
<varlistentry>
|
4487 |
<varlistentry>
|
4465 |
<term>Db.close()</term>
|
4488 |
<term>Db.close()</term>
|
4466 |
<listitem>Closes the connection. You can't do anything
|
4489 |
<listitem>Closes the connection. You can't do anything
|
4467 |
with the <literal>Db</literal> object after
|
4490 |
with the <literal>Db</literal> object after
|
4468 |
this.</listitem>
|
4491 |
this.</listitem>
|
|
... |
|
... |
4509 |
cursor in the Python DB API) is created by
|
4532 |
cursor in the Python DB API) is created by
|
4510 |
a <literal>Db.query()</literal> call. It is used to
|
4533 |
a <literal>Db.query()</literal> call. It is used to
|
4511 |
execute index searches.</para>
|
4534 |
execute index searches.</para>
|
4512 |
|
4535 |
|
4513 |
<variablelist>
|
4536 |
<variablelist>
|
4514 |
<title>Methods</title>
|
|
|
4515 |
|
4537 |
|
4516 |
<varlistentry>
|
4538 |
<varlistentry>
|
4517 |
<term>Query.sortby(fieldname, ascending=True)</term>
|
4539 |
<term>Query.sortby(fieldname, ascending=True)</term>
|
4518 |
<listitem>Sort results
|
4540 |
<listitem>Sort results
|
4519 |
by <replaceable>fieldname</replaceable>, in ascending
|
4541 |
by <replaceable>fieldname</replaceable>, in ascending
|
|
... |
|
... |
4657 |
object. Especially this will not be the case for the
|
4679 |
object. Especially this will not be the case for the
|
4658 |
document text. See the <literal>rclextract</literal>
|
4680 |
document text. See the <literal>rclextract</literal>
|
4659 |
module for accessing document contents.</para>
|
4681 |
module for accessing document contents.</para>
|
4660 |
|
4682 |
|
4661 |
<variablelist>
|
4683 |
<variablelist>
|
4662 |
<title>Methods</title>
|
|
|
4663 |
|
4684 |
|
4664 |
<varlistentry>
|
4685 |
<varlistentry>
|
4665 |
<term>get(key), [] operator</term>
|
4686 |
<term>get(key), [] operator</term>
|
4666 |
<listitem>Retrieve the named doc attribute</listitem>
|
4687 |
<listitem>Retrieve the named doc attribute</listitem>
|
4667 |
</varlistentry>
|
4688 |
</varlistentry>
|
|
... |
|
... |
4692 |
in replacement of the query language approach. The
|
4713 |
in replacement of the query language approach. The
|
4693 |
interface is going to change a little, so no detailed doc
|
4714 |
interface is going to change a little, so no detailed doc
|
4694 |
for now...</para>
|
4715 |
for now...</para>
|
4695 |
|
4716 |
|
4696 |
<variablelist>
|
4717 |
<variablelist>
|
4697 |
<title>Methods</title>
|
|
|
4698 |
|
4718 |
|
4699 |
<varlistentry>
|
4719 |
<varlistentry>
|
4700 |
<term>addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub',
|
4720 |
<term>addclause(type='and'|'or'|'excl'|'phrase'|'near'|'sub',
|
4701 |
qstring=string, slack=0, field='', stemming=1,
|
4721 |
qstring=string, slack=0, field='', stemming=1,
|
4702 |
subSearch=SearchData)</term>
|
4722 |
subSearch=SearchData)</term>
|
|
... |
|
... |
4727 |
|
4747 |
|
4728 |
<sect5 id="RCL.PROGRAM.PYTHON.RECOLL.CLASSES.EXTRACTOR">
|
4748 |
<sect5 id="RCL.PROGRAM.PYTHON.RECOLL.CLASSES.EXTRACTOR">
|
4729 |
<title>The Extractor class</title>
|
4749 |
<title>The Extractor class</title>
|
4730 |
|
4750 |
|
4731 |
<variablelist>
|
4751 |
<variablelist>
|
4732 |
<title>Methods</title>
|
|
|
4733 |
|
4752 |
|
4734 |
<varlistentry>
|
4753 |
<varlistentry>
|
4735 |
<term>Extractor(doc)</term>
|
4754 |
<term>Extractor(doc)</term>
|
4736 |
<listitem>An <literal>Extractor</literal> object is
|
4755 |
<listitem>An <literal>Extractor</literal> object is
|
4737 |
built from a <literal>Doc</literal> object, output
|
4756 |
built from a <literal>Doc</literal> object, output
|