OpenERP data load using OpenETL

Introduction:

 

OpenETL is a library in python for OpenERP S.A Data Migration. This tool allows us to perform all typical ETL operations (extract, transform and load) with the added value of being very well integrated with OpenERP.

The website of the library is https://launchpad.net/openetl

There is also a graphical interface packaged as a module (etl_interface) for OpenERP. This module is an extra add-ons. Although since the interface is easier to handle, this post will focus on the use of python library.

Installing:

To download the library you must install bazaar.

The specific command:

bzr branch lp:openetl

After downloading the openerp branch, copy the folder openetl/lib/openetl to your system libraries folder. In my case /usr/lib/python2.6/.

Although in my eclipse with PyDev I already have set the route, Eclipse Indigo seems not to “catch” the first library. To refresh the accessible libraries in eclipse, go to PyDevInterpreter – Python, click on Restore Defaults (if we already had configured the system) and then click on Apply.

Performance:

The library is divided into work, components, connectors, and transitions.

  • Jobs: Processes that can run, pause and stop.
  • Components: Inputs, outputs and transformation components. They allow us to get data and store them into external systems.The transformation components will be those that fit the data before the final charge.
  • Connectors: The connectors define connections with the external systems. They are used by the components. The current version of the library has connectors to treat local files, Openobjects, different databases (postgres, mysql, oracle), URLs (http, ftp, https, gopher), xmlrpc web services , SugarCRM, other google services (GDocs , GCalendar, gblog) and facebook.
  • Transitions:  Transitions are the flow the data between component is passing through.

The programmer must define as many input conectors as outputs are needed, at least one component for each connector, and a minimal transition to pass data from one component to another. Connectors are linked” with the components, so for writting data into the component, they are written into the external system.

Instance:

We have a OrangeHRM system installed in the company with the HR tabs we want to migrate to OpenERP. The HR staff has exported the data to a .Csv format and asks us to perform the OpenERP data loading.

What we have to do is create two connectors, one will connect locally against the .csv file and the other type of XMLRPC will connect to OpenERP. Then define the components that are going to store the information, create the transitions (one for writing data to the final component and one for sorting the data) which are going to run and finally we launch the task. In the OpenObject component, we’ll define the csv fields mapping to the erp object fields . Transitions are executed sequentially as you have defined them in the .py .

Diagrama de carga de datos con OpenETL con 2 transiciones y 2 conectores
Diagrama de carga de datos con OpenETL con 2 transiciones y 2 conectores

The associated source code:

import sys
sys.path.append('..')

import openetl

# Conectors
fileconnector_orange=openetl.connector.localfile('/home/carlos/Escritorio/csvRaquel/DatosRRHHOrangeHRM.csv')

ooconnector = openetl.connector.openobject_connector('http://localhost:8069', 'testProject', 'admin', 'admin', con_type='xmlrpc')

# Components
csv_in1= openetl.component.input.csv_in(fileconnector_orange,name='Datos de Orange')

oo_out_employee = openetl.component.output.openobject_out(
     ooconnector,
     'hr.employee',
     {'name':'firstName'}
    )

sort1=openetl.component.transform.sort('firstName')

# Transitions

tran1=openetl.transition(csv_in1,sort1)

tran2=openetl.transition(sort1, oo_out_employee)

# Work and run definition
job1=openetl.job([csv_in1,sort1,oo_out_employee])
job1.run()

To make the example easier, I have simplified the number of fields to load from the .csv, fields with related tables, etc.. But the load can be as much complicated as necessary so that all data is migrated correctly.

In the folder openetlexamples there are examples of all the things we will need in the migrating data process , going from examples with multiple inputs and outputs (csv_in_out.py) to data load with related tables (m2m_into_oo.py). This example is quite interesting, as it reads users and csv files groups and load them directly into OpenERP. Let’s note also the example of migrating data from SugarCRM, facebook, GCalendar, etc..

Processes that can run, pause and stop.

Leave a Reply