167 changes: 82 additions & 85 deletions base_import_pdf_by_template/README.rst
@@ -1,7 +1,3 @@
.. image:: https://odoo-community.org/readme-banner-image
:target: https://odoo-community.org/get-involved?utm_source=readme
:alt: Odoo Community Association

===========================
Base Import Pdf by Template
===========================
@@ -17,7 +13,7 @@ Base Import Pdf by Template
.. |badge1| image:: https://img.shields.io/badge/maturity-Beta-yellow.png
:target: https://odoo-community.org/page/development-status
:alt: Beta
.. |badge2| image:: https://img.shields.io/badge/license-AGPL--3-blue.png
.. |badge2| image:: https://img.shields.io/badge/licence-AGPL--3-blue.png
:target: http://www.gnu.org/licenses/agpl-3.0-standalone.html
:alt: License: AGPL-3
.. |badge3| image:: https://img.shields.io/badge/github-OCA%2Fedi-lightgray.png?logo=github
@@ -56,58 +52,59 @@ is to have the document defined with a specific structure.

Fields to consider completing on template:

- Main Model: model on which the record will be generated.
Example: purchase.order
- Child field: One2many field that will create records from
selected template. Example: Order Lines (purchase.order)
- Auto detect pattern: Define a characteristic pattern of the
document so that it recognizes that it corresponds to the
template we are creating. Need to use regular expression.
Example: (?<=ESA79935607)[Ss]\*
- Header Items: Complete this field if the template has a header
table to extract information lines. Example:
Reference,Quantity,Price
- Company: Set the company that will use the template. If it is
empty, template will apply for all companies set on the
environment.
- Main Model: model on which the record will be generated.
Example: purchase.order
- Child field: One2many field that will create records from
selected template. Example: Order Lines (purchase.order)
- Auto detect pattern: Define a characteristic pattern of the
document so that it recognizes that it corresponds to the
template we are creating. Need to use regular expression.
Example: (?<=ESA79935607)[Ss]\*
- Header Items: Complete this field if the template has a header
table to extract information lines. Example:
Reference,Quantity,Price
- Company: Set the company that will use the template. If it is
empty, template will apply for all companies set on the
environment.

1. Add new lines.

..

- Related model: When adding new line, the section where to locate
the data; "header" which, as its name indicates, refers to the
header of the document and "lines" refers to the structure of lines
or table of the document.
- Related model: When adding new line, the section where to locate
the data; "header" which, as its name indicates, refers to the
header of the document and "lines" refers to the structure of
lines or table of the document.

- Field: Map the field to be completed. Example: product
- Field: Map the field to be completed. Example: product

- Pattern: Optional field to complete. Define pattern of the document
so that it recognizes the place to get the field selected on PDF
template. Need to use regular expression. Example: ([0-9]{7})
[0-7]{1}
- Pattern: Optional field to complete. Define pattern of the
document so that it recognizes the place to get the field selected
on PDF template. Need to use regular expression. Example:
([0-9]{7}) [0-7]{1}

- Value type:
- Value type:

- Fixed: Select this value, if the field mapped will always have an
specific value and not extract the information from template. In
this case Pattern field must be empty.
- Variable: Select variable to get the information from template.
In this case, Pattern field must be completed.
- Fixed: Select this value, if the field mapped will always have
an specific value and not extract the information from
template. In this case Pattern field must be empty.
- Variable: Select variable to get the information from template.
In this case, Pattern field must be completed.

- For Value type "Variable" will appear extra fields to complete:
- For Value type "Variable" will appear extra fields to complete:

- Search value: Indicates the field by which the value obtained in
the PDF will be searched on the system.
- Search value: Indicates the field by which the value obtained in
the PDF will be searched on the system.

- Default value: If the search result is empty for the search value
option, you can set default value to create a record and not
getting error message.
- Default value: If the search result is empty for the search value
option, you can set default value to create a record and not
getting error message.

- Log distint value?: This option is useful when getting prices in
order to compare prices inside system and prices obtained from PDF.
This will create lines with prices obtained from the system but
create log on chatter to see the differences obtained from PDF.
- Log distint value?: This option is useful when getting prices in
order to compare prices inside system and prices obtained from
PDF. This will create lines with prices obtained from the system
but create log on chatter to see the differences obtained from
PDF.

Check demo data to further information.

@@ -124,47 +121,47 @@ document structure.
Known issues / Roadmap
======================

- Add operator in template lines (= or ilike)
- Simplify auto-detection (defining a text only to search the system
should search the corresponding regular expression).
- Allow compatibility with registration process created from email alias
(for purchase order for example).
- Remove error if some file is not auto-detected template, options:
boolean (default option according to system parameter) to omit error
for not found files or change process to 2 steps, auto-detect and show
lines (each one with respect to a file) with template applied (similar
to dms_auto_classification).
- Display a more readable error if there is an error in Preview process,
example: wrong pattern. Message: "Please check template defined, some
items are not correctly set".
- Add a progress bar (widget=“gauge”) in the import wizard process,
useful if we import for example sales orders with 20 lines and thus
know the progress.
- Add date_format model to define the specific formats.
- Add operator in template lines (= or ilike)
- Simplify auto-detection (defining a text only to search the system
should search the corresponding regular expression).
- Allow compatibility with registration process created from email
alias (for purchase order for example).
- Remove error if some file is not auto-detected template, options:
boolean (default option according to system parameter) to omit error
for not found files or change process to 2 steps, auto-detect and
show lines (each one with respect to a file) with template applied
(similar to dms_auto_classification).
- Display a more readable error if there is an error in Preview
process, example: wrong pattern. Message: "Please check template
defined, some items are not correctly set".
- Add a progress bar (widget=“gauge”) in the import wizard process,
useful if we import for example sales orders with 20 lines and thus
know the progress.
- Add date_format model to define the specific formats.

Compatibility with csv, xls, etc:

- Separate much of the logic to new module base_import_simple that would
contain the logic of templates, type of files (csv, excel, etc) in the
templates and wizard and this module would depend on the other adding
only what relates to PDF.
- The base module should take into account for each template whether
each line is a new record or not, and start line (in case you want to
omit any), only page 1 would be imported.
- The preview smart-btton would serve exactly the same purpose.
- In the case of csv and Excel that each record is a line, the document
will NOT be attached to the record.
- If you indicate that each record is a line the column will be the key,
otherwise you must specify to which line each line of the template
refers.
- In the case of csv it will try to auto-detect the lines and columns
(no need to complicate delimiters configuration).
- The menu "Import PDF" of the favorite menu would become "Import file",
and the allowed file extensions would be those obtained from a method
(it would be extended by other modules that add other formats such as
PDF).
- Add queue_job_base_import_simple module to process everything by
queues (example: Excel with hundreds of lines, each one a record).
- Separate much of the logic to new module base_import_simple that
would contain the logic of templates, type of files (csv, excel, etc)
in the templates and wizard and this module would depend on the other
adding only what relates to PDF.
- The base module should take into account for each template whether
each line is a new record or not, and start line (in case you want to
omit any), only page 1 would be imported.
- The preview smart-btton would serve exactly the same purpose.
- In the case of csv and Excel that each record is a line, the document
will NOT be attached to the record.
- If you indicate that each record is a line the column will be the
key, otherwise you must specify to which line each line of the
template refers.
- In the case of csv it will try to auto-detect the lines and columns
(no need to complicate delimiters configuration).
- The menu "Import PDF" of the favorite menu would become "Import
file", and the allowed file extensions would be those obtained from a
method (it would be extended by other modules that add other formats
such as PDF).
- Add queue_job_base_import_simple module to process everything by
queues (example: Excel with hundreds of lines, each one a record).

Bug Tracker
===========
@@ -187,10 +184,10 @@ Authors
Contributors
------------

- `Tecnativa <https://www.tecnativa.com>`__:
- `Tecnativa <https://www.tecnativa.com>`__:

- Víctor Martínez
- Pedro M. Baeza
- Víctor Martínez
- Pedro M. Baeza

Maintainers
-----------
2 changes: 1 addition & 1 deletion base_import_pdf_by_template/__manifest__.py
@@ -2,7 +2,7 @@
 # License AGPL-3.0 or later (https://www.gnu.org/licenses/agpl).
 {
     "name": "Base Import Pdf by Template",
-    "version": "18.0.1.0.1",
+    "version": "18.0.1.1.0",
     "website": "https://github.com/OCA/edi",
     "author": "Tecnativa, Odoo Community Association (OCA)",
     "license": "AGPL-3",
7 changes: 3 additions & 4 deletions base_import_pdf_by_template/demo/base_import_pdf_template.xml
@@ -7,7 +7,6 @@
         <field name="name">Partner Template</field>
         <field name="model_id" ref="base.model_res_partner" />
         <field name="child_field_id" ref="base.field_res_partner__child_ids" />
-        <field name="header_items">Name,Address,Child Country</field>
         <field name="auto_detect_pattern">Test partner info.*</field>
     </record>
     <record
@@ -66,7 +65,7 @@
         <field name="template_id" ref="demo_base_import_pdf_template_res_partner" />
         <field name="related_model">lines</field>
         <field name="field_id" ref="base.field_res_partner__name" />
-        <field name="column">0</field>
+        <field name="sequence">0</field>
         <field name="pattern">(.*),.*,</field>
     </record>
     <record
@@ -76,7 +75,7 @@
         <field name="template_id" ref="demo_base_import_pdf_template_res_partner" />
         <field name="related_model">lines</field>
         <field name="field_id" ref="base.field_res_partner__street" />
-        <field name="column">1</field>
+        <field name="sequence">1</field>
         <field name="pattern">.*,(.*),</field>
     </record>
     <record
@@ -88,7 +87,7 @@
         <field name="field_id" ref="base.field_res_partner__country_id" />
         <field name="search_field_id" ref="base.field_res_country__code" />
         <field name="pattern">.*,.*, [A-Z].*[(]([A-Z]{1,2})[)]</field>
-        <field name="column">2</field>
+        <field name="sequence">2</field>
         <field name="log_distinct_value" eval="True" />
     </record>
 </odoo>
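
To see what the three demo patterns above capture, here is a minimal Python sketch that runs them over a sample text and turns the per-pattern matches into rows. The sample text and the helper names `extract_columns` and `columns_to_rows` are assumptions made for illustration; the real demo PDF and the module's own methods may differ.

```python
import re

# Patterns copied from the demo template lines, in sequence order.
PATTERNS = [
    r"(.*),.*,",  # name
    r".*,(.*),",  # street
    r".*,.*, [A-Z].*[(]([A-Z]{1,2})[)]",  # country code
]

# Hypothetical extracted PDF text; the real demo document may differ.
SAMPLE_TEXT = (
    "Azure Interior,4557 De Silva St, United States (US)\n"
    "Deco Addict,77 Santa Barbara Rd, Spain (ES)"
)


def extract_columns(text, patterns):
    """Return one list of stripped first-group matches per pattern."""
    columns = []
    for pattern in patterns:
        matches = re.finditer(pattern, text, re.MULTILINE)
        columns.append([match.group(1).strip() for match in matches])
    return columns


def columns_to_rows(columns):
    """Transpose per-pattern columns into per-line rows."""
    return [list(row) for row in zip(*columns)]


rows = columns_to_rows(extract_columns(SAMPLE_TEXT, PATTERNS))
```

Each pattern yields one column of values, so the position of a pattern in the list decides which row index its value ends up at.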
23 changes: 23 additions & 0 deletions base_import_pdf_by_template/migrations/18.0.1.1.0/pre-migration.py
@@ -0,0 +1,23 @@
+# Copyright 2026 Tecnativa - Víctor Martínez
+# License AGPL-3.0 or later (https://www.gnu.org/licenses/agpl).
+
+from openupgradelib import openupgrade
+
+
+@openupgrade.migrate()
+def migrate(env, version):
+    openupgrade.logged_query(
+        env.cr,
+        """
+        ALTER TABLE base_import_pdf_template_line
+        ADD COLUMN IF NOT EXISTS sequence INTEGER
+        """,
+    )
+    openupgrade.logged_query(
+        env.cr,
+        """
+        UPDATE base_import_pdf_template_line
+        SET sequence = "column"::integer
+        WHERE "column" IS NOT NULL
+        """,
+    )
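
The migration copies the old Char `column` value into the new Integer `sequence`. The same conversion can be sketched in plain Python, assuming that empty or non-numeric legacy values should be skipped rather than cast. `column_to_sequence` is a hypothetical helper for illustration, not part of the module or of openupgradelib.

```python
def column_to_sequence(column):
    """Return the integer sequence for a legacy ``column`` value, or None."""
    if column is None:
        return None
    text = column.strip()
    # isdecimal() only accepts characters that int() can parse as digits.
    return int(text) if text.isdecimal() else None


converted = [column_to_sequence(value) for value in ["0", " 2 ", "", None, "a"]]
```

A plain SQL cast has no such guard: a non-numeric string would make the `UPDATE` fail, so it may be worth checking existing data before running the migration.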
27 changes: 15 additions & 12 deletions base_import_pdf_by_template/models/base_import_pdf_template.py
@@ -43,7 +43,6 @@ class BaseImportPdfTemplate(models.Model):
         help="""It will be necessary to set a patter that only finds something
         in the documents for this template.""",
     )
-    header_items = fields.Char(help="Header columns separated by commas")
     company_id = fields.Many2one(comodel_name="res.company", string="Company")
     line_ids = fields.One2many(
         comodel_name="base.import.pdf.template.line",
@@ -89,8 +88,11 @@ def _auto_detect_from_text(self, text):
     def _get_table_info(self, text):
         """Convert table data to a readable dict."""
         res = False
-        if text and self.header_items:
-            res = {"header": self.header_items, "data": []}
+        if text and self.line_ids:
+            lines = self.line_ids.filtered(
+                lambda x: x.related_model == "lines" and x.value_type != "fixed"
+            )
+            res = {"header": lines.mapped("field_name"), "data": []}
             data = self._get_table_info_data(text)
             res["data"].extend(data)
         return res
@@ -104,13 +106,15 @@ def _get_table_info_data(self, text):
             and x.value_type != "fixed"
             and x.pattern
         )
+        sequence = 0
         for child_line in child_lines:
             data_column = []
             matches = re.finditer(child_line.pattern, text, re.MULTILINE)
             for _matchNum, match in enumerate(matches, start=1):
                 match_group = match.groups(0)[0]
                 data_column.append(match_group.strip())
-            data_map_column[int(child_line.column)] = data_column
+            data_map_column[sequence] = data_column
+            sequence += 1
         # Convert data column to lines (table lines "split" in pages not supported)
         data_keys = list(data_map_column.keys())
         data_key_0 = data_keys[0]
@@ -159,18 +163,17 @@ def _get_field_child_values(self, table_info):
     def _get_field_values_from_table_item(self, item):
         res = False
         child_lines = self.line_ids.filtered(
-            lambda x: x.related_model == "lines"
-            and x.value_type != "fixed"
-            and x.column
+            lambda x: x.related_model == "lines" and x.value_type != "fixed"
         )
         if item and child_lines:
             item_lenght = len(item) - 1
             res = {}
+            sequence = 0
             for child_line in child_lines:
-                column = int(child_line.column)
-                if item_lenght >= column and item[column]:
-                    value = child_line._process_value(item[column])
+                if item_lenght >= sequence and item[sequence]:
+                    value = child_line._process_value(item[sequence])
                     res[child_line.field_name] = value
+                sequence += 1
         return res

     def _get_field_values(self, related_model, text):
@@ -187,8 +190,9 @@ def _get_field_values(self, related_model, text):
 class BaseImportPdfTemplateLine(models.Model):
     _name = "base.import.pdf.template.line"
     _description = "Base Import Pdf Template Line"
-    _order = "model asc, id"
+    _order = "sequence, id"

+    sequence = fields.Integer(default=10)
     template_id = fields.Many2one(
         comodel_name="base.import.pdf.template",
         string="Template",
@@ -211,7 +215,6 @@ class BaseImportPdfTemplateLine(models.Model):
     field_name = fields.Char(related="field_id.name")
     field_ttype = fields.Selection(related="field_id.ttype")
     field_relation = fields.Char(related="field_id.relation")
-    column = fields.Char()
     pattern = fields.Char()
     search_field_id = fields.Many2one(
         comodel_name="ir.model.fields",
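
With `column` removed, a template line's position in the `sequence, id` ordering decides which table value it receives. The sketch below illustrates that positional mapping with plain Python objects, assuming the lines are already filtered to variable "lines" entries. `TemplateLine` and `map_row_to_values` are hypothetical stand-ins, and the module's `_process_value` step is left out.

```python
from dataclasses import dataclass


@dataclass
class TemplateLine:
    """Hypothetical stand-in for a template line record."""

    sequence: int
    id: int
    field_name: str


def map_row_to_values(lines, item):
    """Pair each ordered template line with the row value at the same position."""
    ordered = sorted(lines, key=lambda line: (line.sequence, line.id))
    values = {}
    for position, line in enumerate(ordered):
        # Skip positions missing from the row and empty values.
        if position < len(item) and item[position]:
            values[line.field_name] = item[position]
    return values


lines = [
    TemplateLine(sequence=2, id=3, field_name="country_id"),
    TemplateLine(sequence=0, id=1, field_name="name"),
    TemplateLine(sequence=1, id=2, field_name="street"),
]
values = map_row_to_values(lines, ["Azure Interior", "", "US"])
```

Because the mapping is purely positional, the set of lines used to build the table and the set used to read it back need to be filtered the same way, otherwise values shift to the wrong field.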