
As part of the i2010 vision of a European Digital Library, the EU launched an ambitious plan for large-scale digitization projects to transform Europe’s printed heritage into digitally available resources. The aim of fully integrating intellectual content into the modern information and communication technologies environment can only be achieved by full-text digitization: transforming digital images of scanned books into electronic text.
Over the last two to three years of mass digitization, most libraries have faced the same barriers: a lack of institutional knowledge and expertise, costs for full-featured electronic text that are too high, and results from automated text recognition, or Optical Character Recognition (OCR), that are poor or even useless.
In early 2008, a project called IMPACT (Improving Access to Text) was officially launched to overcome these barriers. It is funded under the Seventh Framework Programme of the European Commission (FP7) with total funding of 12.1 million euros and is planned to run for four years (2008-2012). From 2011 onwards, IMPACT will continue as the IMPACT Centre of Competence for Digitisation. The centre aims to make the digitization of historical printed text in Europe better, faster and cheaper by sharing expertise and providing access to tools for all parts of the digitization workflow, as well as tools, services and facilities for further advancing the state of the art in this field.
The consortium brings together twenty-six national and regional libraries, research institutions and commercial suppliers. They will share their know-how and best practices, develop innovative tools to enhance the capabilities of OCR engines and the accessibility of digitized text, and lay the foundations for the mass-digitization programs that will take place over the next decade.
Since December 2010, DIGI-TEXX, a prestigious digitization services provider in the European market, has officially been the service provider for completing test digital content for the following five members of the IMPACT project:

French National Library (Bibliothèque Nationale de France)

Poznan Supercomputing and Networking Center

The British Library

National Library of the Netherlands (Koninklijke Bibliotheek)

National Library of Bulgaria (The St. St. Cyril and Methodius National Library)
DIGI-TEXX’s task in this series of projects is to correct the text produced automatically by the OCR engine for a set number of scanned pages from the libraries’ datasets, following the “key as such” rule so that the digitized content matches the original printed version. The resulting data is called ground truth.

To become a reliable partner for the members of IMPACT, DIGI-TEXX had to meet very strict requirements. The most important are that the accuracy of the completed data must reach at least 99.95% and that each project, comprising 4 to 10 million characters, must be finished within 10 weeks. In addition, data security requirements under the international standard ISO/IEC 27001 and working-condition requirements following the standards of the International Labour Organization (ILO) were essential conditions for DIGI-TEXX to sign the cooperation agreements.
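To give a sense of what a 99.95% accuracy requirement means in practice, the sketch below works out the error budget for projects of 4 and 10 million characters and checks a keyed text against its reference. It assumes that accuracy is measured per character as one minus the edit distance divided by the length of the reference text; how IMPACT actually measures accuracy is not stated here, and all function names are illustrative.

```python
# Illustrative sketch only: assumes accuracy is counted per character,
# using edit distance against the reference (ground truth) text.

ERRORS_PER_10K = 5  # 99.95% accuracy = at most 5 errors per 10,000 characters


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_accuracy(ground_truth: str, keyed: str) -> float:
    """Share of reference characters keyed correctly, between 0.0 and 1.0."""
    if not ground_truth:
        return 1.0 if not keyed else 0.0
    errors = levenshtein(ground_truth, keyed)
    return max(0.0, 1 - errors / len(ground_truth))


def max_errors_allowed(total_chars: int) -> int:
    """Error budget for a project of the given size (integer arithmetic)."""
    return total_chars * ERRORS_PER_10K // 10_000


def meets_requirement(ground_truth: str, keyed: str) -> bool:
    """True if the keyed text stays within the 99.95% error budget."""
    errors = levenshtein(ground_truth, keyed)
    return errors * 10_000 <= len(ground_truth) * ERRORS_PER_10K


if __name__ == "__main__":
    for size in (4_000_000, 10_000_000):
        print(f"{size:,} characters: at most {max_errors_allowed(size):,} errors")
```

Under this assumption, a 4-million-character project can contain at most 2,000 character errors and a 10-million-character project at most 5,000.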
Working with members of the IMPACT project marks a major step for DIGI-TEXX in strengthening its position as a leading digitization services provider in European markets in general and among European libraries in particular. With the experience gained from implementing these projects, DIGI-TEXX is more confident in its plan to increase market share in Europe, North America and Vietnam in the near future.
To learn more about the IMPACT project, please visit the following link:

http://www.impact-project.eu/